Varnish returning synthetic 500 error even though it has stale content it should serve. But only seems to happen during/after a burst of traffic
Guillaume Quintard
guillaume.quintard at gmail.com
Thu Dec 16 20:23:15 UTC 2021
Could be a vary issue, can you post the output of `varnishlog -d -g request
-q 'RespStaus eq 500'? Please anonymize the bits you don't want us to see
by replacing them with "XXXXXX", but don't simply remove them.
In the meantime, here's a cheat sheet to get started with varnishlog:
https://docs.varnish-software.com/tutorials/vsl-query/
--
Guillaume Quintard
On Thu, Dec 16, 2021 at 12:15 PM Batanun B <batanun at hotmail.com> wrote:
> Hi,
>
> One of our websites usually has quite a low but steady stream of visitors,
> but occationally we get a sudden surge of requests over a very short time
> period (about 1-2 minutes). Varnish seems to handle the load fine, but the
> backends struggle with this. But I have noticed that Varnish doesn't serve
> the stale cached data, but instead shows our synthetic 500 page. This is
> true even for the start page, that definitely existed in the cache. And we
> have a grace period of 10 days, so I it's quite annoying that we can't
> simply serve the stale cached data during this short period.
>
> I have tried picturing the entire flow, following the logic in the vcl,
> but I can't see what we do wrong. And annoyingly I can't reproduce the
> problem locally by simply shutting down the backends (or setting them to
> unhealthy), because whenever I do that I get the stale content served just
> as intended. Could the sheer volume itself cause this, making it impossible
> to reproduce by simply fetching the page a few times in the browser before
> and after disabling the backends? Or is there some edge case that I haven't
> thought of that is causing this?
>
> A simpliedfied version of our vcl is included below, with only the
> relevant parts. But unless I have some blatent problem with the vcl, I
> think it would be good if I learned how to troubleshoot this using the
> Varnish tools, like varnishlog. So that next time this happens, I can use
> varnishlog etc to see what's happening.
>
> Is it possible using varnishlog to find the very first request for a
> specific path ("/" in our case) where it returned the synthetic 500 and put
> it in the cache? And is it also possible to find the request just before
> that one, for the same path. If I could extract those two requests
> (including all the metadata in those transactions) from the djungle of
> thousands of requests, then maybe I can find some explanation why the
> second request doesn't return the stale data.
>
> -----------------------------
>
> sub vcl_hit {
> if (obj.ttl > 0s) {
> // Regular cache hit
> return (deliver);
> } else if (req.restarts == 0 && std.healthy(req.backend_hint)) {
> // Graced cache hit, first attempt.
> // Force cache miss to trigger fetch in foreground (ie synchronous
> fetch).
> set req.http.X-forced-miss = "true";
> return (miss);
> } else {
> // Graced cache hit, previous attempts failed (or backend
> unhealthy). Let the fetch happen in the background (ie asynchronous fetch),
> and return the cached value.
> return (deliver);
> }
> }
>
> sub vcl_recv {
> if (req.http.X-cache-pass == "true") {
> return(pass);
> }
>
> set req.grace = 240h; // 10 day grace
> }
>
> sub vcl_backend_response {
> if (bereq.http.X-cache-pass != "true") {
> if (beresp.status < 400) {
> set beresp.grace = 240h;
> set beresp.ttl = 30m; // Cache invalidation in the form of xkey
> softpurge can put objects into grace before the TTL is past.
> } else if (beresp.status < 500) {
> set beresp.ttl = 1m;
> return (deliver);
> } else if (beresp.status >= 500) {
> // In some cases we want to abandon the backend request on
> 500-errors, since it otherwise would overwrite the cached object that still
> is usefull for grace.
> // This will make it jump to vcl_synth with a 503 status.
> There it will restart the request.
> if (bereq.is_bgfetch) {
> // In grace period. Abandoning 5xx request, since it
> otherwise would overwrite the cached object that still is usefull for grace
> return (abandon);
> } else if (bereq.http.X-forced-miss == "true") {
> return (abandon);
> }
> // Non background fetch, ie no grace period (and no stale
> content available). Cache the error page for a few seconds.
> set beresp.ttl = 15s;
> return (deliver);
> }
> }
> }
>
> sub vcl_synth {
> if (req.http.X-forced-miss == "true" && resp.status >= 500) {
> return (restart);
> }
> }
>
> sub vcl_backend_error {
> if (bereq.is_bgfetch || bereq.http.X-forced-miss == "true") {
> // By calling abandon, it will jump to vcl_synth with a 503
> status. There it will restart the request.
> return (abandon);
> }
> set beresp.http.Content-Type = "text/html; charset=utf-8";
> set beresp.ttl = 15s;
> synthetic(std.fileread("/etc/varnish/500.html"));
> return (deliver);
> }
>
> -----------------------------
> _______________________________________________
> varnish-misc mailing list
> varnish-misc at varnish-cache.org
> https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://www.varnish-cache.org/lists/pipermail/varnish-misc/attachments/20211216/57851d97/attachment-0001.html>
More information about the varnish-misc
mailing list