backend timeouts/503s vs grace cache

Wed Nov 15 14:46:36 UTC 2017

Oh, then your life should be easier! Don't forget to drain the
connections; varnishstat will give you the number of connections open
to any backend.
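
For reference, here is a minimal sketch of the vcl_deliver restart idea
quoted below, assuming Varnish 5.x-style VCL. The backend definition, the
X-Grace-Only header name and the retry-once policy are illustrative
assumptions, not taken from the configuration discussed in this thread.

```vcl
vcl 4.0;

# Hypothetical backend: host and port stand in for the local end of the
# SSH tunnel.
backend tunnel {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_recv {
    # Never trust a marker sent by the client; only a restart may set it.
    if (req.restarts == 0) {
        unset req.http.X-Grace-Only;
    }
}

sub vcl_hit {
    # Second pass after a failed fetch: skip the backend health logic and
    # serve whatever is still within ttl + grace.
    if (req.http.X-Grace-Only && obj.ttl + obj.grace > 0s) {
        return (deliver);
    }
    # No return here: control falls through to any later vcl_hit
    # definition and then to the built-in one.
}

sub vcl_deliver {
    # A 503 on the first pass may just be the tunnel restarting. Retry
    # once, flagged so that vcl_hit prefers a graced object.
    if (resp.status == 503 && req.restarts == 0) {
        set req.http.X-Grace-Only = "1";
        return (restart);
    }
}
```

Subroutines with the same name are concatenated in source order, so this
vcl_hit would have to appear before the existing one to take effect. If no
graced object exists, the restart only costs one extra backend attempt
before the 503 is delivered.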

-- 
Guillaume Quintard

On Wed, Nov 15, 2017 at 3:42 PM, Andrei <lagged at gmail.com> wrote:

> Thanks for the pointers! The tunnel setup is pretty flexible so I'll go
> ahead and mark the backend sick before restarting the tunnel, then healthy
> once confirmed up.
>
>
> On Wed, Nov 15, 2017 at 8:34 AM, Guillaume Quintard <
> guillaume at varnish-software.com> wrote:
>
>> You can wait until vcl_deliver and do a restart, possibly adding a marker
>> saying "don't bother with the backend, serve from cache".
>>
>> The actual solution would be to mark the backend as sick before
>> restarting the SSH tunnel and to drain the connections, but I guess that's
>> not an option here, is it?
>>
>> --
>> Guillaume Quintard
>>
>> On Wed, Nov 15, 2017 at 3:29 PM, Andrei <lagged at gmail.com> wrote:
>>
>>> Hi Guillaume,
>>>
>>> Thanks for getting back to me.
>>>
>>> On Wed, Nov 15, 2017 at 8:11 AM, Guillaume Quintard <
>>> guillaume at varnish-software.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> Why bother with the complex vcl_hit? Since you are saying that the
>>>> cache is regularly primed, I don't really see the added value.
>>>>
>>> I was mainly going by an example from a while back, and not all
>>> sites/URLs are primed in the same manner. It has just stayed in the
>>> config ever since.
>>>
>>>>
>>>> (Note: after a quick glance at it, I think it could just be a race
>>>> condition where the backend appears up in vcl_hit and is down by the time
>>>> you ask it for the content.)
>>>>
>>> How would you suggest "restarting" the request to try and force a grace
>>> cache object to be returned if present in that case?
>>>
>>>
>>>
>>>>
>>>> --
>>>> Guillaume Quintard
>>>>
>>>> On Wed, Nov 15, 2017 at 6:02 AM, Andrei <lagged at gmail.com> wrote:
>>>>
>>>>> bump
>>>>>
>>>>> On Sun, Nov 5, 2017 at 2:12 AM, Andrei <lagged at gmail.com> wrote:
>>>>>
>>>>>> Hello everyone,
>>>>>>
>>>>>> One of the backends we have configured runs through an SSH tunnel
>>>>>> which occasionally gets restarted. When the tunnel is restarted, Varnish
>>>>>> returns a 503 because it can't reach the backend, even for pages which
>>>>>> would normally be cached (we force caching of the front page of the
>>>>>> related site). I believe our grace implementation might be incorrect, as
>>>>>> we would expect a graced object to be returned instead of a 503.
>>>>>>
>>>>>> Our grace ttl is set to 21600 seconds based on a global variable:
>>>>>>
>>>>>> sub vcl_backend_response {
>>>>>>   set beresp.grace = std.duration(
>>>>>>     variable.global_get("ttl_grace") + "s", 6h);
>>>>>> }
>>>>>>
>>>>>> Our grace implementation in sub vcl_hit is:
>>>>>>
>>>>>>   sub vcl_hit {
>>>>>>     # We have no fresh fish. Lets look at the stale ones.
>>>>>>     if (std.healthy(req.backend_hint)) {
>>>>>>       # Backend is healthy. Limit age to 10s.
>>>>>>       if (obj.ttl + 10s > 0s) {
>>>>>>         #set req.http.grace = "normal(limited)";
>>>>>>         std.log("OKHITDELIVER: obj.ttl:" + obj.ttl + " obj.keep: " +
>>>>>>           obj.keep + " obj.grace: " + obj.grace);
>>>>>>         return (deliver);
>>>>>>       } else {
>>>>>>         # No candidate for grace. Fetch a fresh object.
>>>>>>         std.log("No candidate for grace. Fetch a fresh object. obj.ttl:" +
>>>>>>           obj.ttl + " obj.keep: " + obj.keep + " obj.grace: " + obj.grace);
>>>>>>         return (miss);
>>>>>>       }
>>>>>>     } else {
>>>>>>       # backend is sick - use full grace
>>>>>>       if (obj.ttl + obj.grace > 0s) {
>>>>>>         #set req.http.grace = "full";
>>>>>>         std.log("SICK DELIVERY: obj.hits: " + obj.hits + " obj.ttl:" +
>>>>>>           obj.ttl + " obj.keep: " + obj.keep + " obj.grace: " + obj.grace);
>>>>>>         return (deliver);
>>>>>>       } else {
>>>>>>         # no graced object.
>>>>>>         std.log("No graced object. obj.ttl:" + obj.ttl + " obj.keep: " +
>>>>>>           obj.keep + " obj.grace: " + obj.grace);
>>>>>>         return (miss);
>>>>>>       }
>>>>>>     }
>>>>>>
>>>>>>     # fetch & deliver once we get the result
>>>>>>     return (miss); # Dead code, keep as a safeguard
>>>>>>   }
>>>>>>
>>>>>>
>>>>>> Occasionally we see:
>>>>>> -   VCL_Log        No candidate for grace. Fetch a fresh object. obj.ttl:-1369.659 obj.keep: 0.000 obj.grace: 21600.000
>>>>>>
>>>>>> For the most part, it's:
>>>>>> -   VCL_Log        OKHITDELIVER: obj.ttl:26.872 obj.keep: 0.000 obj.grace: 21600.000
>>>>>>
>>>>>> Are we setting the grace ttl too low perhaps?
>>>>>>
>>>>>
>>>>>
>>>>> _______________________________________________
>>>>> varnish-misc mailing list
>>>>> varnish-misc at varnish-cache.org
>>>>> https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc
>>>>>