[Varnish] #1431: varnish 3.0.3 stops working - CLI communication error (hdr)
Varnish
varnish-bugs at varnish-cache.org
Fri Feb 7 12:40:52 CET 2014
#1431: varnish 3.0.3 stops working - CLI communication error (hdr)
-----------------------+----------------------
Reporter: cherouvim | Type: defect
Status: new | Priority: normal
Milestone: | Component: varnishd
Version: 3.0.3 | Severity: blocker
Keywords: |
-----------------------+----------------------
I have a low-to-average traffic site with varnish 3.0.3 in front of a Tomcat.
varnishstat reports around 20 client_req, 10 cache_hit and 6 cache_miss per
second. Uptime has been great (2 months without a problem), but recently it
hung with:
{{{
varnishd[1257]: CLI communication error (hdr)
varnishd[1257]: Pushing vcls failed:
varnishd[1257]: Stopping Child
varnishd[1257]: Child (18715) said Child starts
varnishd[1257]: ...
varnishd[1257]: Child cleanup complete
varnishd[1257]: Child (18715) said Child dies
varnishd[1257]: Child (18715) died status=1
varnishd[1257]: Child cleanup complete
}}}
(full messages at http://pastebin.com/raw.php?i=NtmtWNtv)
A varnish restart was required for the site to work again.
Why does this happen? And why can't varnish recover from this situation?
Note that the same child stop/start also happened once 10 days ago, but
that time the child restarted successfully: http://pastebin.com/raw.php?i=TF6Y0tpr
The VM running this has 10GB ram and the varnish configuration is:
{{{
VARNISHD_PARAMS="-f /etc/varnish/vcl.conf -a :90 \
-p thread_pool_min=300 \
-p thread_pool_max=3000 \
-p thread_pool_add_delay=2 \
-s malloc,5G \
-T:6082 \
-u varnish"
}}}
I've read that cli_timeout (default: 10 seconds) may be a factor here and
that the advice is to increase it. Is this true, and if so, by how much?
Also, I think this issue relates to ticket #1119.
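For reference, this is how I understand cli_timeout would be raised on top of
the configuration above. It is only a sketch: the value 30 is an arbitrary
illustration, not a recommendation or a confirmed fix for this ticket.
{{{
# Same startup parameters as above, plus cli_timeout.
# The value 30 (seconds) is an assumed example value.
VARNISHD_PARAMS="-f /etc/varnish/vcl.conf -a :90 \
-p thread_pool_min=300 \
-p thread_pool_max=3000 \
-p thread_pool_add_delay=2 \
-p cli_timeout=30 \
-s malloc,5G \
-T:6082 \
-u varnish"

# Inspect or change the value on the running instance instead
# (a change made this way is lost when varnishd is restarted):
varnishadm param.show cli_timeout
varnishadm param.set cli_timeout 30
}}}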
--
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1431>
Varnish <https://varnish-cache.org/>
The Varnish HTTP Accelerator