Tracking down sporadic 503 errors
justinp at newmediagateway.com
Tue Mar 30 17:02:19 CEST 2010
I'm currently running Varnish r4633 (I can upgrade if absolutely
necessary), and we've been receiving very sporadic 503 errors listed in
the log files generated by varnishncsa. It's a very small percentage,
but nonetheless, when it happens on an important page, it's noticeable.
Stats from yesterday's log file show 116 "503" errors out of about 4.1
million hits. About 80% of the failed requests are POST requests, which
are set up in my VCL as a "pass through". If I look in the Apache logs
(the backend server), I only see one 503 error returned by Apache
itself, so maybe there's a timeout issue somewhere. I'm trying to figure
out the best way to troubleshoot this, since it's too inconsistent to
just sit watching the output of varnishlog.
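
For reference, here is a minimal sketch of how a tally like the one above
could be produced from a varnishncsa log. It assumes the usual NCSA
common/combined line layout, where the quoted request line is followed by
the status code; the function name summarize_503s and the regular
expression are illustrative, not part of Varnish.

```python
import re
from collections import Counter

# Assumed NCSA layout: ... "METHOD url protocol" status bytes ...
# Only the quoted request line and the status code after it are parsed.
LINE_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<url>\S+)[^"]*" (?P<status>\d{3}) '
)


def summarize_503s(lines):
    """Return (parsed_line_count, 503s_by_method, 503s_by_url)."""
    total = 0
    by_method = Counter()
    by_url = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            # Skip lines that do not look like NCSA request entries.
            continue
        total += 1
        if match.group("status") == "503":
            by_method[match.group("method")] += 1
            by_url[match.group("url")] += 1
    return total, by_method, by_url
```

Passing an open log file to summarize_503s and printing
by_url.most_common(10) would show whether the failures cluster on a few
slow POST targets, which would fit the timeout theory better than a
random spread would.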
Perhaps it's hitting the default value for between_bytes_timeout (60
seconds)? If processing the data on the backend takes too long, then
Varnish would time out after 60 seconds of no data, even if the backend
is still churning, right? I guess the question is: what situations cause
Varnish to return a 503, aside from when the backend itself returns a 503?
I can post details of my VCL if needed, but it's pretty simple (mostly
taken from examples on the site).