Segfault in libvarnishcompat.so.1.0.0, after upgrading to build 4131

Ross Brown ross at trademe.co.nz
Mon Jul 6 23:16:07 CEST 2009


After upgrading to trunk (build 4131) last week, we are seeing varnishd crash when the object cache (using malloc storage) becomes full. We are running on a server with 16GB of RAM, with the following startup options:

        -s malloc,12G 
        -a 0.0.0.0:80 
        -T 0.0.0.0:8021 
        -f /usr/local/etc/current.vcl 
        -t 86400 
        -h classic,42013 
        -P /var/run/varnish.pid 
        -p obj_workspace=4096 
        -p sess_workspace=262144 
        -p lru_interval=60 
        -p sess_timeout=10 
        -p shm_workspace=32768 
        -p ping_interval=1 
        -p thread_pools=4 
        -p thread_pool_min=50 
        -p thread_pool_max=4000 
        -p cli_timeout=20

The VCL is pretty basic: we normalise requests and only accept GET and HEAD.

Plotting usage using Cacti, we see varnishd crash and restart when the object cache is full.

Example of the error occurring:
Jul  3 11:04:50 tmcache2 kernel: [68325.150385] varnishd[15155]: segfault at ff ip 00007f1df03a4d06 sp 00007f1dd44b6120 error 4 in libvarnishcompat.so.1.0.0[7f1df039e000+e000]
Jul  3 11:04:52 tmcache2 varnishd[2594]: Child (15130) not responding to ping, killing it.
Jul  3 11:04:52 tmcache2 varnishd[2594]: Child (15130) not responding to ping, killing it.
Jul  3 11:04:52 tmcache2 varnishd[2594]: Child (15130) died signal=11
Jul  3 11:04:52 tmcache2 varnishd[2594]: Child cleanup complete
Jul  3 11:04:52 tmcache2 varnishd[2594]: child (5066) Started
Jul  3 11:04:52 tmcache2 varnishd[2594]: Child (5066) said Closed fds: 3 4 5 8 9 11 12
Jul  3 11:04:52 tmcache2 varnishd[2594]: Child (5066) said Child starts
Jul  3 11:04:52 tmcache2 varnishd[2594]: Child (5066) said Ready
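
In case it helps with triage, the kernel line above gives enough to work out where inside the library the fault happened: subtract the mapping base (the first number in the square brackets) from the instruction pointer. Below is a minimal Python sketch of that arithmetic. The function name and regular expression are my own for illustration and are not part of any Varnish tooling; it assumes the usual x86-64 Linux "segfault at ... ip ... sp ... error ... in lib[base+size]" layout.

```python
import re

# Kernel segfault line copied from the log above (syslog prefix dropped).
LINE = ("varnishd[15155]: segfault at ff ip 00007f1df03a4d06 "
        "sp 00007f1dd44b6120 error 4 in "
        "libvarnishcompat.so.1.0.0[7f1df039e000+e000]")

# Hypothetical pattern for this log layout; all numeric fields are hex.
PATTERN = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) in "
    r"(?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def fault_offset(line):
    """Return (library name, offset of the faulting instruction in its mapping)."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not a kernel segfault line")
    ip = int(m.group("ip"), 16)
    base = int(m.group("base"), 16)
    size = int(m.group("size"), 16)
    offset = ip - base
    # Sanity check: the instruction pointer must lie inside the mapping.
    if not 0 <= offset < size:
        raise ValueError("ip is outside the reported mapping")
    return m.group("lib"), offset


lib, offset = fault_offset(LINE)
print(f"{lib} +{offset:#x}")  # libvarnishcompat.so.1.0.0 +0x6d06
```

If the reported mapping starts at the beginning of the library file (an assumption; I have not checked this), the offset 0x6d06 could be fed to something like addr2line -e against the installed libvarnishcompat.so.1.0.0 to find the function. The fault address of 0xff together with error code 4 (a user-mode read of an unmapped page) looks like a read through a near-NULL pointer, but that is only a guess from the log line.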

This bug only occurs in build 4131. Prior to this we were running build 4019 and did not have this issue.

Ross Brown
Trade Me Limited



