Angelo Höngens A.Hongens at
Sun Dec 19 17:50:10 CET 2010


I noticed on some of my varnish machines that varnish was continuously restarting. In the logs I saw these errors:

Dec 19 16:14:15 nmt-nlb-05 varnishd[4014]: Child (32358) died signal=6
Dec 19 16:14:15 nmt-nlb-05 varnishd[4014]: Child (32358) Panic message: Missing errorhandling code in sma_alloc(), storage_malloc.c line 81:   Condition((sma->s.ptr) != 0) not true.errno = 12 (Cannot allocate memory) 

Using top, I saw the machine was using a lot of swap. I don't really know how to interpret the memory counters, but it looks like varnishncsa was the culprit, with nearly 10GB of virtual memory (VIRT) and 5.8GB resident (RES, 74.8% of RAM):

Tasks: 149 total,   1 running, 148 sleeping,   0 stopped,   0 zombie
Cpu(s):  3.5%us,  1.3%sy,  0.0%ni, 93.8%id,  0.2%wa,  0.1%hi,  1.0%si,  0.0%st
Mem:   8173416k total,  8081768k used,    91648k free,    32860k buffers
Swap:  4192888k total,  3900592k used,   292296k free,  1523580k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND                                                                                                 
 3304 haproxy   15   0 62724  43m  440 S 21.8  0.5 461:28.79 haproxy                                                                                                  
27159 varnish   18   0 1647m 232m  64m S  9.9  2.9   0:03.91 varnishd                                                                                                 
 3770 root      15   0 9743m 5.8g  80m S  2.0 74.8  35:42.34 varnishncsa                                                                                              
    1 root      15   0 10352  588  552 S  0.0  0.0   0:01.17 init   
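
For reading the counters above: VIRT is the total virtual address space a process has mapped, while RES is the part currently held in physical RAM. A minimal sketch for comparing the rows, assuming top's convention that a bare number is KiB and the suffixes m/g mean MiB/GiB (the helper name `top_mem_to_kib` is made up for illustration):

```python
def top_mem_to_kib(field):
    """Convert a top memory field such as '588', '43m' or '5.8g' to KiB."""
    units = {"k": 1, "m": 1024, "g": 1024 ** 2, "t": 1024 ** 3}
    field = field.strip().lower()
    if field and field[-1] in units:
        return float(field[:-1]) * units[field[-1]]
    return float(field)  # no suffix: assumed to be KiB already


# (VIRT, RES) pairs copied from the top output above.
rows = {
    "haproxy": ("62724", "43m"),
    "varnishd": ("1647m", "232m"),
    "varnishncsa": ("9743m", "5.8g"),
}
total_kib = 8173416  # "Mem: ... total" line from the same output

for name, (virt, res) in rows.items():
    res_share = 100.0 * top_mem_to_kib(res) / total_kib
    print("%-12s VIRT=%.0f KiB  RES=%.0f KiB  (%.1f%% of RAM)"
          % (name, top_mem_to_kib(virt), top_mem_to_kib(res), res_share))
```

The RES share computed this way for varnishncsa comes out close to the %MEM column, not identical, because top has already rounded RES to "5.8g".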

Looks like a memory leak to me? After I restarted varnishncsa and had it running for 15 minutes (it's doing ~750 requests/sec), it was using only 122MB, and the machine was happy (not using swap anymore). As a workaround I created a cron job to regularly restart the varnishncsa service, but perhaps this might need some looking into?
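
A possible variation on the cron workaround is to restart only when the process has actually grown too large, instead of on a fixed schedule. The following is a sketch under stated assumptions, not what is running on these machines: it assumes a Python 3 interpreter is available, that the caller already knows the varnishncsa PID, and that `/sbin/service varnishncsa restart` is the right restart command for the installed init script. The 1 GiB limit and all function names are hypothetical.

```python
import re
import subprocess

RSS_LIMIT_KIB = 1024 * 1024  # hypothetical threshold: 1 GiB resident


def parse_vmrss_kib(status_text):
    """Return VmRSS in KiB from the text of /proc/<pid>/status, or None."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def should_restart(status_text, limit_kib=RSS_LIMIT_KIB):
    """True when the resident size reported in status_text exceeds the limit."""
    rss = parse_vmrss_kib(status_text)
    return rss is not None and rss > limit_kib


def check_and_restart(pid,
                      restart_cmd=("/sbin/service", "varnishncsa", "restart")):
    """Read /proc/<pid>/status and run restart_cmd if the process is too big.

    restart_cmd is an assumption; adjust it to the local init script.
    """
    with open("/proc/%d/status" % pid) as fh:
        status_text = fh.read()
    if should_restart(status_text):
        subprocess.call(list(restart_cmd))
        return True
    return False
```

Run from cron every few minutes, `check_and_restart(pid)` would leave a healthy varnishncsa alone and only bounce it once its resident size passes the limit. This hides the symptom just like the scheduled restart does; it does not address the leak itself.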

About my environment: Running Varnish 2.1.4 on CentOS 5.5 x64. I downloaded the SRPM on a dedicated build box, applied the varnishncsa patch Tollef wrote for our company, built the RPMs, and deployed them to our balancer machines.


