backend timeouts/503s vs grace cache

Wed Nov 15 14:48:38 UTC 2017

What do you mean exactly when you say "drain the connections"? :D

On Wed, Nov 15, 2017 at 8:46 AM, Guillaume Quintard <
guillaume at varnish-software.com> wrote:

> Oh, then your life should be easier! Don't forget to drain the
> connections; varnishstat will give you the number of connections open
> to any backend.
>
> --
> Guillaume Quintard
>
> On Wed, Nov 15, 2017 at 3:42 PM, Andrei <lagged at gmail.com> wrote:
>
>> Thanks for the pointers! The tunnel setup is pretty flexible so I'll go
>> ahead and mark the backend sick before restarting the tunnel, then healthy
>> once confirmed up.
>>
>>
>> On Wed, Nov 15, 2017 at 8:34 AM, Guillaume Quintard <
>> guillaume at varnish-software.com> wrote:
>>
>>> You can wait until vcl_deliver and do a restart, possibly adding a
>>> marker saying "don't bother with the backend, serve from cache".
>>>
>>> The actual solution would be to mark the backend as sick before
>>> restarting the ssh tunnel, and to drain the connections, but I guess
>>> that's not an option here, is it?
>>>
>>> --
>>> Guillaume Quintard
>>>
>>> On Wed, Nov 15, 2017 at 3:29 PM, Andrei <lagged at gmail.com> wrote:
>>>
>>>> Hi Guillaume,
>>>>
>>>> Thanks for getting back to me
>>>>
>>>> On Wed, Nov 15, 2017 at 8:11 AM, Guillaume Quintard <
>>>> guillaume at varnish-software.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Why bother with the complex vcl_hit? Since you are saying that the
>>>>> cache is regularly primed, I don't really see the added value.
>>>>>
>>>> I was mainly going by an example from a while back, and not all sites/URLs
>>>> are primed in the same manner. It has just stuck in the conf ever since.
>>>>
>>>>
>>>>>
>>>>> (note, after a quick glance at it, I think it could just be a race
>>>>> condition where the backend appears up in vcl_hit and is down by the time
>>>>> you ask it for the content)
>>>>>
>>>> How would you suggest "restarting" the request to try and force a grace
>>>> cache object to be returned if present in that case?
>>>>
>>>>
>>>>
>>>>>
>>>>> --
>>>>> Guillaume Quintard
>>>>>
>>>>> On Wed, Nov 15, 2017 at 6:02 AM, Andrei <lagged at gmail.com> wrote:
>>>>>
>>>>>> bump
>>>>>>
>>>>>> On Sun, Nov 5, 2017 at 2:12 AM, Andrei <lagged at gmail.com> wrote:
>>>>>>
>>>>>>> Hello everyone,
>>>>>>>
>>>>>>> One of the backends we have configured runs through an SSH tunnel
>>>>>>> which occasionally gets restarted. When the tunnel is restarted, Varnish is
>>>>>>> returning a 503 since it can't reach the backend for pages which would
>>>>>>> normally be cached (we force cache on the front page of the related site).
>>>>>>> I believe our grace implementation might be incorrect, as we would expect a
>>>>>>> grace period cache return instead of 503.
>>>>>>>
>>>>>>> Our grace ttl is set to 21600 seconds based on a global variable:
>>>>>>>
>>>>>>> sub vcl_backend_response {
>>>>>>>   set beresp.grace = std.duration(variable.global_get("ttl_grace") + "s", 6h);
>>>>>>> }
>>>>>>>
>>>>>>> Our grace implementation in sub vcl_hit is:
>>>>>>>
>>>>>>>   sub vcl_hit {
>>>>>>>     # We have no fresh fish. Let's look at the stale ones.
>>>>>>>     if (std.healthy(req.backend_hint)) {
>>>>>>>       # Backend is healthy. Limit age to 10s.
>>>>>>>       if (obj.ttl + 10s > 0s) {
>>>>>>>         #set req.http.grace = "normal(limited)";
>>>>>>>         std.log("OKHITDELIVER: obj.ttl:" + obj.ttl + " obj.keep: " + obj.keep + " obj.grace: " + obj.grace);
>>>>>>>         return (deliver);
>>>>>>>       } else {
>>>>>>>         # No candidate for grace. Fetch a fresh object.
>>>>>>>         std.log("No candidate for grace. Fetch a fresh object. obj.ttl:" + obj.ttl + " obj.keep: " + obj.keep + " obj.grace: " + obj.grace);
>>>>>>>         return (miss);
>>>>>>>       }
>>>>>>>     } else {
>>>>>>>       # Backend is sick. Use full grace.
>>>>>>>       if (obj.ttl + obj.grace > 0s) {
>>>>>>>         #set req.http.grace = "full";
>>>>>>>         std.log("SICK DELIVERY: obj.hits: " + obj.hits + " obj.ttl:" + obj.ttl + " obj.keep: " + obj.keep + " obj.grace: " + obj.grace);
>>>>>>>         return (deliver);
>>>>>>>       } else {
>>>>>>>         # No graced object.
>>>>>>>         std.log("No graced object. obj.ttl:" + obj.ttl + " obj.keep: " + obj.keep + " obj.grace: " + obj.grace);
>>>>>>>         return (miss);
>>>>>>>       }
>>>>>>>     }
>>>>>>>
>>>>>>>     # fetch & deliver once we get the result
>>>>>>>     return (miss); # Dead code, keep as a safeguard
>>>>>>>   }
>>>>>>>
>>>>>>>
>>>>>>> Occasionally we see:
>>>>>>> -   VCL_Log        No candidate for grace. Fetch a fresh object. obj.ttl:-1369.659 obj.keep: 0.000 obj.grace: 21600.000
>>>>>>>
>>>>>>> For the most part, it's:
>>>>>>> -   VCL_Log        OKHITDELIVER: obj.ttl:26.872 obj.keep: 0.000 obj.grace: 21600.000
>>>>>>>
>>>>>>> Are we setting the grace ttl too low perhaps?
>>>>>>>
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> varnish-misc mailing list
>>>>>> varnish-misc at varnish-cache.org
>>>>>> https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
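
For illustration, the "restart from vcl_deliver with a marker" idea discussed in the thread could look roughly like the sketch below. It was not posted by anyone in the thread and is untested. The header name X-Grace-Only and the backend definition are hypothetical, and the vcl_hit check is assumed to be placed at the top of the existing vcl_hit, before its other returns.

```vcl
vcl 4.0;

# Hypothetical backend, only here so the sketch is self-contained.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    # X-Grace-Only is an internal marker (assumed name); drop any copy
    # sent by the client on the first pass.
    if (req.restarts == 0) {
        unset req.http.X-Grace-Only;
    }
}

sub vcl_hit {
    # Second pass after a failed fetch: skip the backend health check and
    # serve whatever is still within ttl + grace.
    if (req.http.X-Grace-Only && obj.ttl + obj.grace > 0s) {
        return (deliver);
    }
    # Otherwise fall through to the regular vcl_hit logic.
}

sub vcl_deliver {
    # A 503 on the first pass may mean the backend went away between
    # vcl_hit and the fetch. Mark the request and retry once.
    if (resp.status == 503 && req.restarts == 0) {
        set req.http.X-Grace-Only = "1";
        return (restart);
    }
}
```

The req.restarts == 0 guard limits this to a single retry, so a request with no graced object still ends in the original 503. Note that a genuine 503 returned by the backend would also trigger one restart under this sketch.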