varnish crashes
Angelo Höngens
a.hongens at netmatch.nl
Sat Jan 23 12:08:32 CET 2010
On 23-1-2010 11:27, Poul-Henning Kamp wrote:
> In message <4B5ACD81.3000903 at netmatch.nl>, Angelo Höngens writes:
>
>> We have 4 balancers, each running FreeBSD 7.2 with 'device carp'
>> compiled in. I haven't dared upgrade to 8.0 yet, because I had problems
>> on my testmachine earlier with ipv6 and carp interfaces on 8.0.
>
> It sounds mostly like a resource issue, but I can't say exactly from
> what you have provided.
I get that feeling as well, but I can't seem to find anything wrong.
Thanks for your reply; I hope you can give me some more pointers.
By the way: the balancers handle a total of 2000 req/sec now, but when
stress-testing I can easily get 9000 cache hits per second. So I don't
think it is running into the upper limit of its performance.
Even worse, I just had to reboot one of the balancers because it (almost)
completely locked up. Ping responded, but ssh died, and the local console
on the machine did not respond either (it did not show any messages).
The machines ran a heavy squid load for over a year, but never hung. Grrr..
> You can consider increasing the "cli_timeout" parameter a bit
> and see if it is simply a matter of a busy machine.
ok, will try.
I now have in my /etc/rc.conf:
varnishd_enable="YES"
varnishd_listen=":80"
varnishd_storage="file,/cache,80%"
varnishd_config="/usr/local/etc/varnish/default.vcl"
I just changed this (after reading the tuning page some more) to:
varnishd_enable="YES"
varnishd_flags="-P /var/run/varnishd.pid -a :80 -T localhost:81 -f /usr/local/etc/varnish/default.vcl -s file,/cache,80% -u www -g www -p cli_timeout=30 -p lru_interval=20"
Let's see what happens..
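As a quick sanity check that the long flags string carries the two run-time parameters I intended, here is a minimal sketch that splits it the way a shell would and lists the -p settings. The helper name runtime_params is made up for this illustration; it is not part of varnish or the rc script.

```python
import shlex

# The varnishd_flags value from rc.conf, joined into one string.
flags = ("-P /var/run/varnishd.pid -a :80 -T localhost:81 "
         "-f /usr/local/etc/varnish/default.vcl -s file,/cache,80% "
         "-u www -g www -p cli_timeout=30 -p lru_interval=20")

def runtime_params(flag_string):
    """Return the name=value pairs that follow each -p flag (hypothetical helper)."""
    tokens = shlex.split(flag_string)
    params = {}
    # Walk over consecutive token pairs and keep those introduced by -p.
    for flag, value in zip(tokens, tokens[1:]):
        if flag == "-p":
            name, _, setting = value.partition("=")
            params[name] = setting
    return params

params = runtime_params(flags)
print(params)
```

This only checks the string itself; whether varnishd actually picked the values up is something to confirm on the running instance.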
>
> Are you running on 32 bit or 64 bit machines ?
64-bit..
> Use FreeBSD's gstat to see what your disk-activity is like,
> pay particular attention to the service times (ms/r & ms/w cols)
wow, that's a nice tool, I only knew iostat ;)
The disk system is a gmirror of two SATA disks, and I see on average it
does 33 IOPS, with service times of 8.3 ms/r and 4.2 ms/w. Not really
shocking.
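A rough back-of-the-envelope check of those gstat numbers, as a sketch only: it assumes the requests are serviced one at a time and, since I did not note the read/write mix, it just brackets the busy fraction between "all writes" and "all reads".

```python
# Figures read off gstat above.
iops = 33          # average operations per second
read_ms = 8.3      # service time per read (ms/r)
write_ms = 4.2     # service time per write (ms/w)

# Fraction of each second spent servicing I/O, assuming no overlap
# between requests (an assumption, not something gstat reported here).
busy_if_all_reads = iops * read_ms / 1000.0
busy_if_all_writes = iops * write_ms / 1000.0

print(f"busy fraction between {busy_if_all_writes:.2f} and {busy_if_all_reads:.2f}")
```

Under those assumptions the disks are idle most of the time, which fits the impression that disk I/O is not the bottleneck.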
>
> Also check your varnishlog and varnishstat for signs of trouble...
Did that, everything looks peachy. Varnishlog produces too much output
(and I would not know what to filter), and varnishstat looks ok as well.
Are there specific counters that could indicate trouble?
0+01:16:24
nmt-nlb-04.netmatchcolo1.local
Hitrate ratio: 10 100 199
Hitrate avg: 0.7894 0.8059 0.8054
362508 133.35 79.08 Client connections accepted
1807294 366.95 394.26 Client requests received
1261936 260.68 275.29 Cache hits
130026 30.08 28.37 Cache hits for pass
323451 64.17 70.56 Cache misses
545215 106.28 118.94 Backend conn. success
995 0.00 0.22 Fetch head
543052 98.26 118.47 Fetch with Length
209 0.00 0.05 Fetch chunked
453 0.00 0.10 Fetch wanted close
1215 . . N struct sess_mem
510 . . N struct sess
121156 . . N struct object
120051 . . N struct objecthead
242414 . . N struct smf
809 . . N small free smf
1 . . N large free smf
122 . . N struct vbe_conn
438 . . N struct bereq
467 . . N worker threads
699 0.00 0.15 N worker threads created
0 0.00 0.00 N queued work requests
8000 0.00 1.75 N overflowed work requests
317 . . N backends
202676 . . N expired objects
698992 . . N LRU moved objects
1356134 264.69 295.84 Objects sent with write
47428 38.10 10.35 Total Sessions
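To make the counters a bit easier to reason about, here is a minimal sketch that parses a few of the lines above and derives two figures: the cumulative hit ratio (hits over hits plus misses, leaving out hits for pass) and the share of requests that landed on the overflow queue. The function name parse_counters and the choice of these two ratios are mine; I am not claiming they are the official indicators of trouble.

```python
# A subset of the varnishstat lines pasted above.
SAMPLE = """\
1807294 366.95 394.26 Client requests received
1261936 260.68 275.29 Cache hits
130026 30.08 28.37 Cache hits for pass
323451 64.17 70.56 Cache misses
467 . . N worker threads
0 0.00 0.00 N queued work requests
8000 0.00 1.75 N overflowed work requests
"""

def parse_counters(text):
    """Map each counter description to its cumulative integer value."""
    counters = {}
    for line in text.splitlines():
        # Columns: value, current rate, average rate, description.
        parts = line.split(None, 3)
        if len(parts) == 4 and parts[0].isdigit():
            counters[parts[3].strip()] = int(parts[0])
    return counters

counters = parse_counters(SAMPLE)
hits = counters["Cache hits"]
misses = counters["Cache misses"]
hit_ratio = hits / (hits + misses)
overflow_share = (counters["N overflowed work requests"]
                  / counters["Client requests received"])

print(f"hit ratio: {hit_ratio:.4f}")
print(f"overflowed work requests per client request: {overflow_share:.4%}")
```

The hit ratio computed this way comes out close to the "Hitrate avg" line, so at least the two views agree with each other.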
--
With kind regards,
Angelo Höngens
systems administrator
MCSE on Windows 2003
MCSE on Windows 2000
MS Small Business Specialist
------------------------------------------
NetMatch
tourism internet software solutions
Ringbaan Oost 2b
5013 CA Tilburg
+31 (0)13 5811088
+31 (0)13 5821239
A.Hongens at netmatch.nl
www.netmatch.nl
------------------------------------------