Tracking down sporadic 503 errors

Javier Frias jfrias at gmail.com
Tue Mar 30 21:13:02 CEST 2010


Do you have a back-end test (health probe)? If so, have you checked the
varnish logs for back-end failure messages?
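
As a rough sketch of what I mean by a back-end test, a probe in the backend
definition could look something like the following. The host, port, URL and
numbers are placeholders I made up, not values from your setup:

```vcl
backend default {
    .host = "127.0.0.1";    # placeholder: your Apache host
    .port = "8080";         # placeholder: your Apache port
    .probe = {
        .url = "/";         # placeholder: a cheap page that should always return 200
        .interval = 5s;     # poll the back-end every 5 seconds
        .timeout = 1s;      # a poll that takes longer than this counts as failed
        .window = 5;        # consider the last 5 polls
        .threshold = 3;     # at least 3 of those must succeed to be "healthy"
    }
}
```

With a probe in place, the poll results are written to the log (as
Backend_health records, as far as I remember), so a back-end that flaps around
the time of the 503s should be visible there.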

I'm not sure what the varnish timeout may be, but you could try reading
the times of the 503s in the varnishlog output. If they all look to be
about the same, you may be hitting a timeout issue. (If you log times in
apache, I would check those as well.)
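
If it does turn out to be a timeout, the per-back-end timeouts can be set in
the backend definition. A minimal sketch, assuming your build supports these
attributes; the host, port and values are placeholders, not recommendations:

```vcl
backend default {
    .host = "127.0.0.1";             # placeholder: your Apache host
    .port = "8080";                  # placeholder: your Apache port
    .connect_timeout = 1s;           # max wait for the TCP connection to the back-end
    .first_byte_timeout = 120s;      # max wait for the first byte of the response
    .between_bytes_timeout = 120s;   # max gap between bytes while reading the response
}
```

If the 503s disappear, or start clustering around the new limit instead, that
would point at a timeout rather than at something else.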


-Javier


On Tue, Mar 30, 2010 at 2:32 PM, Vladimir <vlists at veus.hr> wrote:
> This is what I have seen as well. I could not pin it down. What made it
> somewhat better was adding
>
>    if (obj.status == 503 && req.restarts < 4) {
>        restart;
>    }
>
> under the vcl_error subroutine. It re-requests the document. However, even
> with that, the behavior still happens, though much less often :-(. I have
> attached some graphs of 500 responses from varnish and the corresponding
> apache responses. Units are hits per second.
>
> I even looked at the corresponding responses from Apache, and Apache would
> claim that the request succeeded, yet varnish would throw a 500.
>
> Vladimir
>
> On Tue, 30 Mar 2010, Justin Pasher wrote:
>
>> I'm currently running Varnish r4633 (I can upgrade if absolutely
>> necessary), and we've been receiving very sporadic 503 errors listed in the
>> log files generated by varnishncsa. It's a very small percentage, but
>> nonetheless, when it happens on an important page, it's noticeable.
>>
>> Stats from yesterday's log file show 116 "503" errors out of about 4.1
>> million hits. About 80% of the failed requests are POST requests, which are
>> set up in my VCL as a "pass through". If I look in the apache logs (the
>> backend server), I only see one 503 error returned by apache itself, so
>> maybe there's a timeout issue somewhere. I'm trying to figure out the best
>> way to troubleshoot this, since it's too inconsistent to just sit watching
>> the output of varnishlog.
>>
>> Perhaps it's hitting the default value for between_bytes_timeout (60
>> seconds)? If processing the data on the backend takes too long, then varnish
>> would time out after 60 seconds of no data, even if the backend is still
>> churning, right? I guess the question is what situations cause Varnish to
>> return a 503 aside from when the backend itself returns a 503.
>>
>> I can post details of my VCL if needed, but it's pretty simple (mostly
>> taken from examples on the site).
>