<div dir="ltr"><div>Hello.</div><div><br></div><div>I have not looked at the attachments, but you have limited Transient to 3500 MB. Getting "Could not get storage" is to be expected if a large enough share of your transactions use Transient.</div><div><br></div><div>You can figure out which transactions use Transient storage by filtering on the Storage tag. Both varnishlog and varnishncsa (with a suitable format string and both -b and -c enabled) can be used for this.</div><div><br></div><div>If no other alternative presents itself, you may need to switch to return (pipe) for some of your non-cacheable traffic just to save memory, but this rules out H2 and gives you low connection reuse, so it is not optimal.</div><div><br></div><div>Best,</div><div>Pål<br></div><div><br></div><br><div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, 18 Aug 2021 at 10:04, Marco Dickert - evolver group &lt;<a href="mailto:marco.dickert@evolver.de">marco.dickert@evolver.de</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi all,<br>
<br>
I'm still investigating issues with one of our varnish instances. We<br>
use varnish as a cache and loadbalancer behind nginx and in front of a<br>
docker platform. We experienced an outage of about 20 minutes during<br>
which clients received 503 errors produced by varnish, while the docker<br>
containers responded correctly (according to the containers' logs).<br>
<br>
Setup is:<br>
<br>
[ nginx ==> varnish ] ==> [ docker swarm (4 hosts, lots of containers) ]<br>
<br>
<br>
Sites are distinguished by the exposed ports of the respective swarm<br>
services. Mapping site to service is done with a director containing<br>
the 4 hosts and the respective service port as backends.<br>
<br>
By comparing nginx logs with container logs we could confirm varnish<br>
being the culprit. It seems the backend request succeeded, but<br>
varnish returned a 503 error anyway.<br>
<br>
To investigate further, I activated some logging, which revealed some<br>
concerning information. Apparently varnish sometimes has problems with<br>
the storage, as the "FetchError" says "Could not get storage".<br>
<br>
```<br>
* << BeReq >> 70780723 <br>
- Begin bereq 70780722 pass<br>
[...]<br>
- Storage malloc Transient<br>
- Fetch_Body 2 chunked -<br>
- FetchError Could not get storage<br>
```<br>
<br>
I have attached two complete log examples to this mail.<br>
<br>
I did some extensive searching, including the Varnish Book, but so far<br>
have not come up with an explanation. Can anyone help me understand<br>
why this happens and how to avoid it?<br>
<br>
Here is some additional information about our varnish instance:<br>
- Debian buster<br>
- system: HP DL360p G8, 32G RAM, Intel Xeon E5-2630<br>
- varnish 6.6.0-1~buster (using the varnish repos)<br>
- varnish start options:<br>
<br>
```<br>
ExecStart=/usr/sbin/varnishd -a :6081 \<br>
-T :6082 \<br>
-f /etc/varnish/default.vcl \<br>
-p ping_interval=6 -p cli_timeout=10 -p pipe_timeout=600 \<br>
-p listen_depth=4096 -p thread_pool_min=200 \<br>
-p thread_pool_max=500 -p workspace_client=128k \<br>
-p nuke_limit=1000 -S /etc/varnish/secret \<br>
-s malloc,12G \<br>
-s Transient=malloc,3500M<br>
```<br>
<br>
Thanks in advance!<br>
<br>
-- <br>
Marco Dickert<br>
_______________________________________________<br>
varnish-misc mailing list<br>
<a href="mailto:varnish-misc@varnish-cache.org" target="_blank">varnish-misc@varnish-cache.org</a><br>
<a href="https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc" rel="noreferrer" target="_blank">https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc</a><br>
</blockquote></div>
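<div><br></div><div>The Storage-tag filtering suggested in the reply can be sketched as a small tally script. This is only a rough illustration, not a tested tool: it assumes log records shaped like the excerpt quoted above (a dash prefix, the tag, then the storage type and name), and the function name count_storage is made up for this example.</div><div><br></div>

```python
import sys
from collections import Counter


def count_storage(lines):
    """Tally how many transactions used each storage backend.

    Assumes varnishlog-style records such as "-  Storage  malloc Transient",
    as in the quoted excerpt; other output modes would need different parsing.
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        # A Storage record splits into: prefix, "Storage", type, name
        if len(fields) == 4 and fields[1] == "Storage":
            counts[fields[3]] += 1
    return counts


if __name__ == "__main__":
    # Print one line per storage name, most used first
    for name, n in count_storage(sys.stdin).most_common():
        print(f"{name}\t{n}")
```

<div><br></div><div>One way to use it would be to pipe varnishlog output (with both -b and -c enabled) into the script and compare the Transient count with the counts for the other storage names. A high Transient share would be consistent with the 3500 MB limit being exhausted.</div>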