Panics/reboots with Varnish
ltning at anduin.net
Mon May 7 18:50:32 CEST 2007
I'm running a server (fbsd-amd64, 6.2-STABLE) with a bunch of jails
(~10). A couple of these are seeing pretty heavy HTTP traffic, so I
threw in a varnishd in each of the two main offenders, each using the
httpd in the same jail as its back-end. I then use pf on the host to NAT
incoming requests for <jail>:http to <jail>:varnishdport.
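For illustration, a redirect of this kind could look roughly like the pf.conf sketch below. The interface name, jail address and Varnish port are placeholders chosen for the example, not the values actually in use on this box.

```
# Hypothetical values, for illustration only
ext_if       = "em0"          # external interface (assumed name)
jail_ip      = "192.0.2.10"   # address of one jail (documentation range)
varnish_port = "8080"         # port varnishd listens on (assumed)

# Send incoming HTTP for the jail to varnishd in the same jail
rdr pass on $ext_if proto tcp from any to $jail_ip port http -> $jail_ip port $varnish_port
```

With a rule like this, the httpd in the jail keeps listening on port 80 and only sees requests that varnishd passes on to it as its back-end.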
Now this does indeed lessen the load on the server quite
dramatically, and also leads to positive reports from users whenever
this is switched on. I can globally enable/disable this with pfctl -e /
pfctl -d, or by modifying+reloading pf config files. In other words -
when it works, it works great.
Problem: Whenever this is in effect, the box rarely stays up for >48
hours. Without this in effect, it can stay up for >30 days. I've been
playing with this since the day varnish 1.0 was released, and from
what I can tell this is consistent behavior.
I don't have any core dumps - it appears as if the server is just
rebooting. I do have a core dump from another situation (where I used a
mismatched raid controller driver, causing a known panic), so I know
the whole dump stuff works as it should. I have also swapped memory
with no change in behavior.
Does anyone have any idea what can be causing this? Which edge cases
might be touched by varnish that I'm not seeing elsewhere? I can see
the box is swapping a bit from time to time, but nothing dramatic,
and that is as it should be, just the kernel doing its job.