varnish crashes

Angelo Höngens a.hongens at netmatch.nl
Sun Jan 24 19:40:23 CET 2010


On 24-1-2010 18:57, Michael S. Fischer wrote:
>> thread_pool_max is set to 500 threads, but I just increased it to
>> 4000 (as per http://varnish.projects.linpro.no/wiki/Performance),
>> as 'top' shows me it's using around 480~490 threads now.
>> 
>> You suggest lowering it; what would be the effect of that? I would
>> think it would run out of threads or something? Well, we'll see
>> what happens with the increased threads.
> 
> 
> Increasing concurrency is unlikely to solve the problem, although
> setting the number of thread pools to the number of CPUs is probably
> a good idea.
> 
> Assuming a high hit ratio and high CPU utilization (you haven't
> posted either), lowering concurrency (i.e. reducing thread_pool_max)
> can help reduce CPU contention incurred by context switching.
> 
> If maximum concurrency is reached, incoming connections will be
> deferred to the TCP listen(2) backlog (the overflowed_requests
> counter in varnishstat increases when this happens).   When the
> request reaches the head of the queue, it will then be picked up by a
> processing thread.  The net effect is some additional latency, but
> probably not as much as you're experiencing if your CPU is swamped
> with context switches.
> 
> There are a few cases where increasing thread_pool_max can help, in
> particular, where you have a high cache-miss ratio and you have slow
> origin servers.  But if CPU is already high, it will only make the
> problem worse.
> 
> BTW, on FreeBSD you can view the current length of the listen(2)
> backlog via "netstat -aL".  By default, varnishd's listen(2) backlog
> is 512; as long as you don't see the length hit that value you should
> be ok.
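
For reference, the backlog check described above could be scripted roughly
like this. It is only a sketch: it assumes the FreeBSD "netstat -aL" layout,
where each socket line carries a qlen/incqlen/maxqlen triple, and the
function names (parse_listen_queues, full_queues, read_netstat) are made up
for illustration.

```python
import re
import subprocess

# Matches socket lines such as "tcp4  0/0/512  *.80" (assumed FreeBSD layout).
QUEUE_LINE = re.compile(r"^(\S+)\s+(\d+)/(\d+)/(\d+)\s+(\S+)")


def parse_listen_queues(text):
    """Return (proto, qlen, incqlen, maxqlen, local_address) tuples."""
    rows = []
    for line in text.splitlines():
        m = QUEUE_LINE.match(line.strip())
        if m:  # header lines do not match and are skipped
            proto, qlen, incqlen, maxqlen, addr = m.groups()
            rows.append((proto, int(qlen), int(incqlen), int(maxqlen), addr))
    return rows


def full_queues(rows, threshold=1.0):
    """Return rows whose queue length is at or above threshold * maxqlen."""
    return [r for r in rows if r[3] > 0 and r[1] >= threshold * r[3]]


def read_netstat():
    # Assumption: running on FreeBSD, where -L prints listen queue sizes.
    result = subprocess.run(
        ["netstat", "-aL"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling full_queues(parse_listen_queues(read_netstat())) from cron or a
monitoring check would then list any listening socket whose backlog has
reached its maximum (512 for varnishd by default, as noted above).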

According to top, the CPU usage for the varnishd process is 0.0% at 400
req/sec. The 15-minute load average is 0.45, probably mostly because of
haproxy running on the same machine. So I don't think load is a problem.
My problem is that varnish sometimes just crashes or stops responding.

My cache hit ratio is not that high, around 80%, and the backend servers
can be slow at times (quite complex .NET web apps). But I've changed
some settings, and I am waiting for the next time varnish stops
responding. I'm beginning to think it's something that grows over time:
after restarting the varnish process, things tend to run smoothly for a
while. I'll just keep monitoring it.
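
To see whether something really grows over time, the monitoring could be
as simple as comparing two "varnishstat -1" samples. A minimal sketch,
assuming the one-shot output has the counter name in the first column and
its value in the second; the helper names and the counters passed in are
illustrative and should be checked against what the installed varnishstat
actually prints.

```python
import subprocess


def parse_varnishstat(text):
    """Parse `varnishstat -1` style output into {counter_name: int_value}."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            try:
                counters[fields[0]] = int(fields[1])
            except ValueError:
                continue  # skip lines whose second column is not a number
    return counters


def growth(before, after, names):
    """Return how much each named counter changed between two samples."""
    return {n: after[n] - before[n] for n in names if n in before and n in after}


def sample():
    # -1 makes varnishstat print the counters once and exit.
    result = subprocess.run(
        ["varnishstat", "-1"], capture_output=True, text=True, check=True
    )
    return parse_varnishstat(result.stdout)
```

Taking one sample() right after a restart and another every few minutes,
then logging growth(first, latest, [...]) for the thread and object
counters, should show which value keeps climbing before the next hang.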

You did give a lot of extra insight, and some counters I can look into,
thank you very much for that.

-- 


With kind regards,


Angelo Höngens
systems administrator

MCSE on Windows 2003
MCSE on Windows 2000
MS Small Business Specialist
------------------------------------------
NetMatch
tourism internet software solutions

Ringbaan Oost 2b
5013 CA Tilburg
+31 (0)13 5811088
+31 (0)13 5821239

A.Hongens at netmatch.nl
www.netmatch.nl
------------------------------------------





More information about the varnish-misc mailing list