Varnish 503 on POST requests once in a while

Guillaume Quintard guillaume at varnish-software.com
Tue Jan 7 19:00:27 UTC 2020


Could it be that you have a 5s timeout on your backend?

-- 
Guillaume Quintard


On Sun, Dec 29, 2019 at 10:24 PM Maninder Singh <mandys at gmail.com> wrote:

> Hi Guillaume,
>
> Any idea on what the 5 sec timeout might be ( which happens only 3-4 times
> a day ) ?
>
> Is it the timeout_idle as mentioned by Dridi ?
>
> What can be done to fix it ?
>
> On Fri, 27 Dec 2019 at 14:22, Maninder Singh <mandys at gmail.com> wrote:
>
>> Here are the different timeouts:
>>
>> *timeout_idle* (default: 5 seconds): How long we will wait from we
>> `accept()` the connection, until the client must have sent us a
>> non-whitespace character. (typically this will be the G in "GET /url
>> HTTP/1.1")
>>
>> Isn't idle_timeout for client to varnish requests ??
>>
>> Also, just so that you know, these requests dont make it to apache logs
>> also.
>>
>>
>>
>> backend_idle_timeout       60.000 [seconds] (default)
>> between_bytes_timeout      60.000 [seconds] (default)
>> cli_timeout                60.000 [seconds] (default)
>> connect_timeout            3.500 [seconds] (default)
>> first_byte_timeout         60.000 [seconds] (default)
>> idle_send_timeout          60.000 [seconds] (default)
>> pipe_timeout               60.000 [seconds] (default)
>> send_timeout               600.000 [seconds] (default)
>> thread_pool_timeout        300.000 [seconds] (default)
>> timeout_idle               5.000 [seconds] (default)
>> timeout_linger             0.050 [seconds] (default)
>>
>>
>> On Fri, 27 Dec 2019 at 13:15, Dridi Boukelmoune <dridi at varni.sh> wrote:
>>
>>> On Fri, Dec 27, 2019 at 1:00 AM Maninder Singh <mandys at gmail.com> wrote:
>>> >
>>> > I only have php-fpm slow logging for requests greater than 5secs.
>>> Otherwise it should be all defaults for varnish/apache/fpm.
>>> >
>>> > Does varnish or apache have anything defaulting to 5secs?
>>>
>>> timeout_idle defaults to 5s
>>>
>>> > Otherwise fpm might be closing the connection for logging a slow
>>> request ? Strange?
>>> >
>>> > On Thu, 26 Dec, 2019, 10:44 pm Guillaume Quintard, <
>>> guillaume at varnish-software.com> wrote:
>>> >>
>>> >> The first looks like the backend is just closing the door on you. The
>>> other one (Resource temporarily unavailable), because of the
>>> Timestamp:Beresp,  is most probably a timeout (did you set anything to 5s?)
>>> >>
>>> >> --
>>> >> Guillaume Quintard
>>> >>
>>> >>
>>> >> On Thu, Dec 26, 2019 at 3:12 PM Maninder Singh <mandys at gmail.com>
>>> wrote:
>>> >>>
>>> >>> Another one.
>>> >>> This one is diff error.
>>> >>>
>>> >>> *   << BeReq    >> 19327398
>>> >>> -   Begin          bereq 19327397 pass
>>> >>> -   Timestamp      Start: 1577370022.344818 0.000000 0.000000
>>> >>> -   BereqMethod    POST
>>> >>> -   BereqURL       /LetsCelebrate
>>> >>> -   BereqProtocol  HTTP/1.1
>>> >>> -   BereqHeader    User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS
>>> X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.88
>>> Safari/537.36
>>> >>> -   BereqHeader    Accept:
>>> text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9
>>> >>> -   BereqHeader    Cache-Control: max-age=0
>>> >>> -   BereqHeader    SSLClientCertStatus: NoClientCert
>>> >>> -   BereqHeader    X-Forwarded-Proto: https
>>> >>> -   BereqHeader    Content-Type: multipart/form-data;
>>> boundary=----WebKitFormBoundaryBBVOJciq1v4rbZs9
>>> >>> -   BereqHeader    Accept-Language: en-GB,en-US;q=0.9,en;q=0.8
>>> >>> -   BereqHeader    SSLClientCipher:
>>> SSL_ECDHE_RSA_WITH_AES_256_GCM_SHA384, version=TLSv1.2, bits=256
>>> >>> -   BereqHeader    Host: mydomain.com
>>> >>> -   BereqHeader    Referer: https://mydomain.com/LetsCelebrate
>>> >>> -   BereqHeader    SSLSessionID:
>>> 97258494A3DD008A855B552F701E9F95C25CDC7B1CF5FD63D382ED09391585C7
>>> >>> -   BereqHeader    Origin: https://mydomain.com
>>> >>> -   BereqHeader    X-Cluster-Client-Ip: X.X.X.X
>>> >>> -   BereqHeader    X-Forwarded-Port: 443
>>> >>> -   BereqHeader    Upgrade-Insecure-Requests: 1
>>> >>> -   BereqHeader    Sec-Fetch-User: ?1
>>> >>> -   BereqHeader    Sec-Fetch-Site: same-origin
>>> >>> -   BereqHeader    Sec-Fetch-Mode: nested-navigate
>>> >>> -   BereqHeader    Accept-Encoding: gzip, deflate, br
>>> >>> -   BereqHeader    Content-Length: 2076
>>> >>> -   BereqHeader    X-Forwarded-For: X.X.X.X, 10.187.187.36
>>> >>> -   BereqHeader    browser: other
>>> >>> -   BereqHeader    serverIp: 10.208.225.235
>>> >>> -   BereqHeader    X-Varnish: 19327398
>>> >>> -   VCL_call       BACKEND_FETCH
>>> >>> -   VCL_return     fetch
>>> >>> -   BackendOpen    23 reload_2019-12-18T054238.default 127.0.0.1
>>> 8080 127.0.0.1 39568
>>> >>> -   BackendStart   127.0.0.1 8080
>>> >>> -   FetchError     req.body read error: 11 (Resource temporarily
>>> unavailable)
>>> >>> -   FetchError     backend write error: 11 (Resource temporarily
>>> unavailable)
>>> >>> -   Timestamp      Bereq: 1577370027.344908 5.000090 5.000090
>>> >>> -   BackendClose   23 reload_2019-12-18T054238.default
>>> >>> -   Timestamp      Beresp: 1577370027.345014 5.000196 0.000105
>>> >>> -   Timestamp      Error: 1577370027.345025 5.000206 0.000011
>>> >>> -   BerespProtocol HTTP/1.1
>>> >>> -   BerespStatus   503
>>> >>> -   BerespReason   Service Unavailable
>>> >>> -   BerespReason   Backend fetch failed
>>> >>> -   BerespHeader   Date: Thu, 26 Dec 2019 14:20:27 GMT
>>> >>> -   BerespHeader   Server: Varnish
>>> >>> -   VCL_call       BACKEND_ERROR
>>> >>> -   BerespHeader   Content-Type: text/html; charset=utf-8
>>> >>> -   BerespHeader   Retry-After: 5
>>> >>> -   VCL_return     deliver
>>> >>> -   Storage        malloc Transient
>>> >>> -   ObjProtocol    HTTP/1.1
>>> >>> -   ObjStatus      503
>>> >>> -   ObjReason      Backend fetch failed
>>> >>> -   ObjHeader      Date: Thu, 26 Dec 2019 14:20:27 GMT
>>> >>> -   ObjHeader      Server: Varnish
>>> >>> -   ObjHeader      Content-Type: text/html; charset=utf-8
>>> >>> -   ObjHeader      Retry-After: 5
>>> >>> -   Length         1288
>>> >>> -   BereqAcct      1276 0 1276 0 0 0
>>> >>> -   End
>>> >>>
>>> >>> On Thu, 26 Dec 2019 at 17:27, Maninder Singh <mandys at gmail.com>
>>> wrote:
>>> >>>>
>>> >>>> Got one in the log.
>>> >>>>
>>> >>>> varnishlog -r /var/log/varnish/503.log
>>> >>>> *   << BeReq    >> 21993384
>>> >>>> -   Begin          bereq 21993383 pass
>>> >>>> -   Timestamp      Start: 1577360704.829258 0.000000 0.000000
>>> >>>> -   BereqMethod    POST
>>> >>>> -   BereqURL       /corp/contact.php?cf=exitpopup
>>> >>>> -   BereqProtocol  HTTP/1.1
>>> >>>> -   BereqHeader    User-Agent: Mozilla/5.0 (Windows NT 6.1;
>>> Trident/7.0; rv:11.0) like Gecko
>>> >>>> -   BereqHeader    Accept: image/gif, image/x-xbitmap, image/jpeg,
>>> image/pjpeg, application/x-shockwave-flash, */*
>>> >>>> -   BereqHeader    SSLClientCertStatus: NoClientCert
>>> >>>> -   BereqHeader    X-Forwarded-Proto: https
>>> >>>> -   BereqHeader    Content-Type: application/x-www-form-urlencoded
>>> >>>> -   BereqHeader    SSLClientCipher:
>>> SSL_ECDHE_RSA_WITH_AES_256_CBC_SHA, version=TLSv1, bits=256
>>> >>>> -   BereqHeader    Host: www.mydomain.com
>>> >>>> -   BereqHeader    Referer:
>>> http://mydomain.com/solutions/enterprise.php
>>> >>>> -   BereqHeader    SSLSessionID:
>>> 3C119EF3FE70318789F82E9B583581C876B4DDE520C0182C33CD6AF1AA46E63A
>>> >>>> -   BereqHeader    X-Cluster-Client-Ip: X.X.X.X
>>> >>>> -   BereqHeader    Cookie:
>>> X-Mapping-fjhppofk=E8F494E9A630955F1975EEDF11EBC4D7
>>> >>>> -   BereqHeader    X-Forwarded-Port: 443
>>> >>>> -   BereqHeader    Accept-Encoding: gzip, deflate
>>> >>>> -   BereqHeader    Content-Length: 1366
>>> >>>> -   BereqHeader    X-Forwarded-For: X.X.X.X, 10.187.187.36
>>> >>>> -   BereqHeader    browser: other
>>> >>>> -   BereqHeader    serverIp: 10.208.224.192
>>> >>>> -   BereqHeader    X-Varnish: 21993384
>>> >>>> -   VCL_call       BACKEND_FETCH
>>> >>>> -   VCL_return     fetch
>>> >>>> -   BackendOpen    26 reload_2019-12-18T054253.default 127.0.0.1
>>> 8080 127.0.0.1 59998
>>> >>>> -   BackendStart   127.0.0.1 8080
>>> >>>> -   Timestamp      Bereq: 1577360705.664932 0.835674 0.835674
>>> >>>> -   FetchError     http first read error: EOF
>>> >>>> -   BackendClose   26 reload_2019-12-18T054253.default
>>> >>>> -   Timestamp      Beresp: 1577360705.665020 0.835763 0.000089
>>> >>>> -   Timestamp      Error: 1577360705.665028 0.835770 0.000008
>>> >>>> -   BerespProtocol HTTP/1.1
>>> >>>> -   BerespStatus   503
>>> >>>> -   BerespReason   Service Unavailable
>>> >>>> -   BerespReason   Backend fetch failed
>>> >>>> -   BerespHeader   Date: Thu, 26 Dec 2019 11:45:05 GMT
>>> >>>> -   BerespHeader   Server: Varnish
>>> >>>> -   VCL_call       BACKEND_ERROR
>>> >>>> -   BerespHeader   Content-Type: text/html; charset=utf-8
>>> >>>> -   BerespHeader   Retry-After: 5
>>> >>>> -   VCL_return     deliver
>>> >>>> -   Storage        malloc Transient
>>> >>>> -   ObjProtocol    HTTP/1.1
>>> >>>> -   ObjStatus      503
>>> >>>> -   ObjReason      Backend fetch failed
>>> >>>> -   ObjHeader      Date: Thu, 26 Dec 2019 11:45:05 GMT
>>> >>>> -   ObjHeader      Server: Varnish
>>> >>>> -   ObjHeader      Content-Type: text/html; charset=utf-8
>>> >>>> -   ObjHeader      Retry-After: 5
>>> >>>> -   Length         1288
>>> >>>> -   BereqAcct      849 1366 2215 0 0 0
>>> >>>> -   End
>>> >>>>
>>> >>>> On Thu, 26 Dec 2019 at 15:21, Maninder Singh <mandys at gmail.com>
>>> wrote:
>>> >>>>>
>>> >>>>> Okay. Let me set it up and get back to you once i see something in
>>> that log. It happens only once/twice a day so could be a while before i get
>>> back.
>>> >>>>>
>>> >>>>> On Thu, 26 Dec, 2019, 3:10 pm Guillaume Quintard, <
>>> guillaume at varnish-software.com> wrote:
>>> >>>>>>
>>> >>>>>> Hi,
>>> >>>>>>
>>> >>>>>> Most probably, varnishncsa isn't going to give you the full
>>> picture, so let's go for the exhaustive way:
>>> >>>>>>
>>> >>>>>> varnishlog -q "BerespStatus eq 503" -w /whatever/file
>>> >>>>>>
>>> >>>>>>
>>> >>>>>> you can then read the logs with
>>> >>>>>>
>>> >>>>>> varnishlog -r /whatever/file
>>> >>>>>>
>>> >>>>>>
>>> >>>>>> (more logging pointers here:
>>> https://docs.varnish-software.com/tutorials/vsl-query/)
>>> >>>>>>
>>> >>>>>> If you can get your hands on a failed transaction, we should be
>>> able to tell you what went wrong. Most probably, you hit a timeout of some
>>> sort. Or it could be that the backend was declared sick, if you have probes.
>>> >>>>>>
>>> >>>>>> --
>>> >>>>>> Guillaume Quintard
>>> >>>>>>
>>> >>>>>>
>>> >>>>>> On Thu, Dec 26, 2019 at 6:33 AM Maninder Singh <mandys at gmail.com>
>>> wrote:
>>> >>>>>>>
>>> >>>>>>> Hi,
>>> >>>>>>>
>>> >>>>>>> I am facing a strange problem.
>>> >>>>>>>
>>> >>>>>>> I am running varnish 4.1.11 on port 80 which is behind a LB.
>>> >>>>>>>
>>> >>>>>>> Varnish then connects to apache 2.4 ( on port 8080 ) and also
>>> have php-fpm running.
>>> >>>>>>>
>>> >>>>>>> Now, everything works fine but once in a while I am seeing 503
>>> response from varnish for a POST request. During this time there is no log
>>> in apache also ( neither access log nor error log ).
>>> >>>>>>>
>>> >>>>>>> Where does this request vanish ?
>>> >>>>>>>
>>> >>>>>>> How do I debug this issue ?
>>> >>>>>>>
>>> >>>>>>> Some more details:
>>> >>>>>>>
>>> >>>>>>> * I have varnishncsa turned on with following flags.
>>> >>>>>>>
>>> >>>>>>> ExecStart=/usr/bin/varnishncsa -q 'not (ReqURL ~
>>> "\.(js|css|png|jpeg|jpg|ico|gif|swf|woff)" or ReqHeader:User-Agent ~
>>> "HTTP-Monitor" or ReqHeader:User-Agent ~ "Monitoring/1.1")' -a -w
>>> /var/log/varnish/varnishncsa.log -D -f /etc/sysconfig/varnishncsa
>>> >>>>>>>
>>> >>>>>>> Here is the entry in varnish log during that time.
>>> >>>>>>>
>>> >>>>>>> X.X.X.X, LB.IP.0.0 - - [25/Dec/2019:00:26:40 -0600] 2629363
>>> "POST http://mydomain.com/LetsCelebrate HTTP/1.1" 503 1288 "
>>> https://referringdomain.com/LetsCelebrate?hostingUrl=https%3A%2F%2Fxy.abc.com"
>>> "Mozilla/5.0 (Linux; Android 9; SM-A920F) AppleWebKit/537.36 (KHTML, like
>>> Gecko) Chrome/79.0.3945.93 Mobile Safari/537.36" "mydomain.com" miss
>>> >>>>>>>
>>> >>>>>>> In .vcl file I have following set.
>>> >>>>>>>  .first_byte_timeout = 120s;
>>> >>>>>>>
>>> >>>>>>> Now, I am not logging varnish to apache requests ( which backend
>>> fetch ).
>>> >>>>>>> As you can see above I am only logging client requests.
>>> >>>>>>>
>>> >>>>>>> Is their a way to log client requests like above but then also
>>> log backend fetch failures to another log file ?
>>> >>>>>>> [ Running 2 instances of varnishncsa ? ]
>>> >>>>>>>
>>> >>>>>>> Any pointers ?
>>> >>>>>>>
>>> >>>>>>> Please let me know.
>>> >>>>>>>
>>> >>>>>>> Thanks.
>>> >>>>>>>
>>> >>>>>>> _______________________________________________
>>> >>>>>>> varnish-misc mailing list
>>> >>>>>>> varnish-misc at varnish-cache.org
>>> >>>>>>>
>>> https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc
>>> >
>>> > _______________________________________________
>>> > varnish-misc mailing list
>>> > varnish-misc at varnish-cache.org
>>> > https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc
>>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://www.varnish-cache.org/lists/pipermail/varnish-misc/attachments/20200107/b34d283e/attachment-0001.html>


More information about the varnish-misc mailing list