<div dir="ltr">I am experiencing some strange behavior. Periodically requests will take 2x the time specified in first_byte_timeout before they actually are killed and the server returns a 503. This is quite annoying and it means that our users receive a 504 gateway timeout instead of our error message.<div>
<div><br></div><div>My gut tells me that it has something to do with the way return(restart) works. The response code returned from the backend is a 503.</div><div><br></div><div>Let me describe my vcl config briefly. The backend definitions look like this:</div>
<div><br></div><div><div><font face="courier new, monospace">backend test1 {</font></div><div><font face="courier new, monospace"> .host = "10.0.0.1";</font></div><div><font face="courier new, monospace"> .port = "80";</font></div>
<div><font face="courier new, monospace"> .connect_timeout = 1s;</font></div><div><font face="courier new, monospace"> .first_byte_timeout = 65s;</font></div><div><font face="courier new, monospace"> .between_bytes_timeout = 1s;</font></div>
<div><font face="courier new, monospace"> .saintmode_threshold = 5;</font></div><div><font face="courier new, monospace"> .probe = {</font></div><div><font face="courier new, monospace"> .request =</font></div><div>
<font face="courier new, monospace"> "GET /status HTTP/1.1"</font></div>
<div><font face="courier new, monospace"> "Host: <a href="http://test.com">test.com</a>"</font></div><div><font face="courier new, monospace"> "Connection: close";</font></div><div><font face="courier new, monospace"> .interval = 60s;</font></div>
<div><font face="courier new, monospace"> .timeout = 10s;</font></div><div><font face="courier new, monospace"> .window = 5;</font></div><div><font face="courier new, monospace"> .threshold = 3;</font></div><div>
<font face="courier new, monospace"> }</font></div><div><font face="courier new, monospace">}</font></div></div><div><font face="courier new, monospace"><br></font></div><div><font face="courier new, monospace">(similar for backend test2)</font></div>
<div><br></div><div>I'm using the round-robin directory. Under vcl_fetch I have this:</div><div><br></div><div><div><font face="courier new, monospace">sub vcl_fetch {</font></div><div><font face="courier new, monospace"> if (beresp.status == 503) {</font></div>
<div><font face="courier new, monospace"> set beresp.saintmode = 60s;</font></div><div><font face="courier new, monospace"> if (req.request != "POST") {</font></div><div><font face="courier new, monospace"> return(restart);</font></div>
<div><font face="courier new, monospace"> } else {</font></div><div><font face="courier new, monospace"> error 500 "Failed";</font></div><div><font face="courier new, monospace"> }</font></div><div><font face="courier new, monospace"> }</font></div>
<div><font face="courier new, monospace"><br></font></div><div><font face="courier new, monospace"> set beresp.grace = 10s;</font></div><div><font face="courier new, monospace">}</font></div></div><div><br></div><div>Lastly, here's me catching the varnish server in the act of essentially making two requests and not timing out.</div>
<div><font face="courier new, monospace"><br></font></div><div><div><font face="courier new, monospace"> 169 BackendXID b 1547011740</font></div><div><font face="courier new, monospace"> 169 TxRequest b GET</font></div>
<div><font face="courier new, monospace"> 169 TxURL b test?</font></div><div><font face="courier new, monospace"> 169 TxProtocol b HTTP/1.1</font></div><div><font face="courier new, monospace"> 169 TxHeader b User-Agent: curl/7.30.0</font></div>
<div><font face="courier new, monospace"> 169 TxHeader b Host: xxx.xxx.xxx</font></div><div><font face="courier new, monospace"> 169 TxHeader b Accept: */*</font></div><div><font face="courier new, monospace"> 169 TxHeader b X-Varnish: 1547011740</font></div>
<div><font face="courier new, monospace"> 169 TxHeader b Accept-Encoding: gzip</font></div><div><font face="courier new, monospace"> 169 BackendClose b test1</font></div><div><font face="courier new, monospace"><br>
</font></div><div><font face="courier new, monospace"> 107 BackendXID b 1547011740</font></div><div><font face="courier new, monospace"> 107 TxRequest b GET</font></div><div><font face="courier new, monospace"> 107 TxURL b /test?</font></div>
<div><font face="courier new, monospace"> 107 TxProtocol b HTTP/1.1</font></div><div><font face="courier new, monospace"> 107 TxHeader b User-Agent: curl/7.30.0</font></div><div><font face="courier new, monospace"> 107 TxHeader b Host: xxx.xxx.xxx</font></div>
<div><font face="courier new, monospace"> 107 TxHeader b Accept: */*</font></div><div><font face="courier new, monospace"> 107 TxHeader b X-Varnish: 1547011740</font></div><div><font face="courier new, monospace"> 107 TxHeader b Accept-Encoding: gzip</font></div>
<div><font face="courier new, monospace"> 107 BackendClose b test2</font></div></div><div><font face="courier new, monospace"><br></font></div><div><div><font face="courier new, monospace"><br></font></div><div><font face="courier new, monospace"> 47 SessionOpen c 162.208.42.130 59288 :80</font></div>
<div><font face="courier new, monospace"> 47 ReqStart c 162.208.42.130 59288 1547011740</font></div><div><font face="courier new, monospace"> 47 RxRequest c GET</font></div><div><font face="courier new, monospace"> 47 RxURL c /test?</font></div>
<div><font face="courier new, monospace"> 47 RxProtocol c HTTP/1.1</font></div><div><font face="courier new, monospace"> 47 RxHeader c User-Agent: curl/7.30.0</font></div><div><font face="courier new, monospace"> 47 RxHeader c Accept: */*</font></div>
<div><font face="courier new, monospace"> 47 RxHeader c Connection: keep-alive</font></div><div><font face="courier new, monospace"> 47 VCL_call c recv lookup</font></div><div><font face="courier new, monospace"> 47 VCL_call c hash</font></div>
<div><font face="courier new, monospace"> 47 Hash c</font></div><div><font face="courier new, monospace"> 47 Hash c /test?</font></div><div><font face="courier new, monospace"> 47 Hash c</font></div>
<div><font face="courier new, monospace"> 47 VCL_return c hash</font></div><div><font face="courier new, monospace"> 47 VCL_call c miss fetch</font></div><div><font face="courier new, monospace"> 47 Backend c 169 main_cluster test1</font></div>
<div><font face="courier new, monospace"> 47 FetchError c http first read error: -1 11 (Resource temporarily unavailable)</font></div><div><font face="courier new, monospace"> 47 Backend c 107 main_cluster test2</font></div>
<div><font face="courier new, monospace"> 47 FetchError c http first read error: -1 11 (Resource temporarily unavailable)</font></div><div><font face="courier new, monospace"> 47 VCL_call c error deliver</font></div>
<div><font face="courier new, monospace"> 47 VCL_call c deliver deliver</font></div><div><font face="courier new, monospace"> 47 TxProtocol c HTTP/1.1</font></div><div><font face="courier new, monospace"> 47 TxStatus c 503</font></div>
<div><font face="courier new, monospace"> 47 TxResponse c Service Unavailable</font></div><div><font face="courier new, monospace"> 47 TxHeader c Server: Varnish</font></div><div><font face="courier new, monospace"> 47 TxHeader c Content-Type: application/json; charset=utf-8</font></div>
<div><font face="courier new, monospace"> 47 TxHeader c Content-Length: 65</font></div><div><font face="courier new, monospace"> 47 TxHeader c Accept-Ranges: bytes</font></div><div><font face="courier new, monospace"> 47 TxHeader c Date: Fri, 10 Jan 2014 23:24:52 GMT</font></div>
<div><font face="courier new, monospace"> 47 TxHeader c X-Varnish: 1547011740</font></div><div><font face="courier new, monospace"> 47 TxHeader c Age: 130</font></div><div><font face="courier new, monospace"> 47 TxHeader c Via: 1.1 varnish</font></div>
<div><font face="courier new, monospace"> 47 TxHeader c Connection: close</font></div><div><font face="courier new, monospace"> 47 Length c 65</font></div><div><font face="courier new, monospace"> 47 ReqEnd c 1547011740 1389396162.194909096 1389396292.196509361 0.000040293 130.000771999 0.000828266</font></div>
</div><div><br></div><div>What it looks to me like what's happening is the first server is returning a bad status code after 60 seconds (the first_byte_timeout is 65), which satisfies the requirement of first_byte_timeout. The request triggers a return(restart) and it's tried again on another host, which takes another 60 or so seconds. Finally the request just dies by only after it's been hanging around for twice the time allotted in first_byte_timeout</div>
<div><br></div><div>Does anyone know what I'm doing wrong? I do want to trigger a return(restart) for spurious status codes, but I want first_byte_timeout to serve as the time I'm willing to let a request go through. Is there a better parameter I can use than this one?</div>
<div><br></div><div><div>It looks very similar to <a href="https://www.varnish-cache.org/trac/ticket/1156">this bug</a> but it looks like that's been fixed. I'm using varnish 3.0.5.<br></div><div><div><br></div><div>
Any help is greatly appreciated.</div>-- <br><div dir="ltr">Stephen Wood<div><a href="http://www.heystephenwood.com" target="_blank">www.heystephenwood.com</a></div></div>
</div></div></div></div>