Varnishd stops sending logs to VSM after a while

Cédric Jeanneret varnish at tengu.ch
Thu Mar 13 09:30:44 CET 2014


Hello,

We have a small problem with 5 of our 6 Varnish instances: every
morning, it seems, varnishd stops writing to the shared memory[1],
meaning we don't get any logs.

The only way I have found to get the logs back is to restart varnish,
but of course this isn't the best way to solve the problem…

Here is some information:

** Version: varnishd (varnish-3.0.2 revision 55e70a4)
** OS: Debian Squeeze
** System memory: 7468 MB
** CPU: dual E5645 @ 2.40GHz (Note: for those who know about Amazon AWS
instances, it's an m1.large, instance-store AMI.)
** Daemon options:
DAEMON_OPTS="-n <instance-name> \
             -u varnish -g varnish \
             -a :80 \
             -T localhost:6082 \
             -s malloc,5G \
             -f /etc/varnish/<configuration-file>.vcl \
             -S /path/to/secret \
             -p shm_reclen=65535 \
"

** Tasks around the time the logs stop: logrotate for the varnishncsa
logs, with a varnishncsa restart. This shouldn't break varnishd's
logging, and it worked fine for months…

We didn't detect any problem with memory or disk I/O during the outage.
This morning was the third time in a row we saw this issue.

As far as we know, neither the VCL nor the daemon options were changed
just before the problem appeared (well, the VCL was changed, some backend
"routing" was updated, but nothing beyond the ordinary changes we have
been making for months now).

Symptoms:
- /var/lib/varnish/<instance-name>/_.vsm isn't updated
- running varnishncsa or varnishlog from the shell doesn't show any log
entries
- restarting the varnishd service brings the logs back (we can see them
flowing if we keep varnishncsa running)
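
The second symptom could be turned into a crude automated check. The
sketch below is only an illustration, not something we run: it counts
the lines a command prints within a time window, using timeout(1) from
GNU coreutils. The function name count_lines_within and the
INSTANCE_NAME placeholder are made up for the example, and a quiet
instance or output buffering may make the count lower than expected.

```shell
#!/bin/sh
# Count the lines a command prints to stdout within a time window.
# Assumes timeout(1) from GNU coreutils is installed.
count_lines_within() {
    window=$1   # duration in seconds
    shift       # the remaining arguments are the command to run
    # timeout stops the command once the window has elapsed;
    # tr strips the padding that some wc implementations add.
    timeout "$window" "$@" 2>/dev/null | wc -l | tr -d ' '
}

# Hypothetical usage (INSTANCE_NAME is a placeholder):
#   n=$(count_lines_within 10 varnishlog -n INSTANCE_NAME)
#   [ "$n" -eq 0 ] && echo "no log records in 10s, logging may be stalled"
```

A count of zero over a window in which traffic is known to be flowing
would match what we see every morning.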

An "lsof -p <varnish-pid>" shows this line:
varnishd 17948 varnish  DEL    REG      202,1              264837
/var/lib/varnish/<instance-name>/_.vsm

I'm not very comfortable with the "DEL" FD: when I run the same command
while logs are flowing, I get:
varnishd 22603 varnish  mem       REG      202,1 84934656     264058
/var/lib/varnish/<instance-name>/_.vsm

It seems "something" is degrading the shared memory…
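
According to the lsof documentation, "DEL" marks a memory-mapped file
that has been deleted on Linux, so varnishd would still be writing to a
mapping whose file is no longer reachable under that path. The same
state should be visible in /proc/<pid>/maps, where unlinked mappings
carry a trailing " (deleted)" marker. A minimal sketch of such a check,
assuming a Linux /proc layout (the function name vsm_mapping_state is
made up for the example):

```shell
#!/bin/sh
# Classify how a process maps _.vsm, given a /proc/<pid>/maps-style file.
# Prints "deleted" if the mapped file has been unlinked, "mapped" if it
# is still linked under its path, and "absent" if no _.vsm is mapped.
vsm_mapping_state() {
    maps_file=$1
    if grep -q '/_\.vsm (deleted)$' "$maps_file"; then
        echo deleted
    elif grep -q '/_\.vsm$' "$maps_file"; then
        echo mapped
    else
        echo absent
    fi
}

# Hypothetical usage, replacing PID with the varnishd process id:
#   vsm_mapping_state /proc/PID/maps
```

If this prints "deleted" while a _.vsm file exists on disk, comparing
the inode from "stat -c %i" on that file with the one lsof reports for
the mapping would tell whether the file was removed and recreated.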

Any help would be welcome, I'm a bit stuck with the investigation right
now :(.

Cheers,

C.


[1] https://www.varnish-cache.org/docs/trunk/reference/vsm.html


