varnish crash: Error in munmap() / Cannot allocate memory

Enno van Amerongen enno at
Tue Jun 25 14:01:41 CEST 2013

Dear Varnish List,

Yesterday I upgraded the memory of one of our Varnish servers (from 32GB to 96GB).
Everything was running perfectly fine (the hit rate increased from 75% to 85%) until today, when I tried to load a new VCL using varnishadm:

    Reloading varnish config: live_20130625_131055
    VCL compiled.dlopen(./ ./ failed to map segment from shared object: Cannot allocate memory

    Command failed with error code 106

Immediately after vcl.load failed, the following errors started popping up in syslog:

    Jun 25 13:11:19 host /var/www/varnish[10425]: Child (10426) said <jemalloc>: (malloc) Error in munmap(): P
    Jun 25 13:11:31 host /var/www/varnish[10425]: Child (10426) said <jemalloc>: (malloc) Error in munmap(): #001
    Jun 25 13:11:31 host /var/www/varnish[10425]: Child (10426) said <jemalloc>: (malloc) Error in munmap(): #020���X#177
    Jun 25 13:11:31 host /var/www/varnish[10425]: Child (10426) said <jemalloc>: (malloc) Error in munmap(): #020���X#177

Shortly after, Varnish crashed completely and lost the whole cache. Before Varnish crashed, there was 16GB of memory free on the server.

I have no idea yet what caused the crash, so hopefully someone on the list can shed some light on it.
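
One check that may help narrow this down, assuming a Linux host: mmap() and munmap() can return "Cannot allocate memory" even with free RAM when a process reaches the kernel's per-process mapping limit (vm.max_map_count). Whether that is what happened here is only an assumption. The helper name map_usage below is hypothetical; it is a minimal sketch that counts the lines of a maps file and compares the count with a limit.

```shell
#!/bin/sh
# Hypothetical helper (not part of Varnish): report how many memory
# mappings a process holds relative to a limit.
#   $1 = maps file, one mapping per line (e.g. /proc/<pid>/maps on Linux)
#   $2 = limit to compare against (e.g. the value of vm.max_map_count)
map_usage() {
    maps_file=$1
    limit=$2
    # Each line of the maps file describes one mapping.
    count=$(wc -l < "$maps_file" | tr -d ' ')
    echo "mappings=$count limit=$limit"
    # Succeed only while the count is still below the limit.
    [ "$count" -lt "$limit" ]
}

# Example on a running instance (substitute the PID of the Varnish child):
#   map_usage /proc/<child pid>/maps "$(sysctl -n vm.max_map_count)"
```

If the count is close to the limit on a running instance, that would be consistent with the errors above, but it would not prove the cause.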

Varnish is started with the following settings:

    echo starting varnish daemon
    ulimit -n 131072
    ulimit -l unlimited
    sysctl -p /path/to/sysctl.conf
    /path/to/varnishd \
        -s malloc,70G \
        -a \
        -T \
        -p thread_pools=2 \
        -p thread_pool_add_delay=2 \
        -p thread_pool_min=500 \
        -p thread_pool_max=3000 \
        -p session_linger=50 \
        -p sess_workspace=65536 \
        -p connect_timeout=1 \
        -p lru_interval=10 \
        -n /var/www/varnish \
        -f /path/to/server.vcl
    echo done

varnishstat showed the following:

    $ varnishstat -1 | grep -i trans
    SMA.Transient.c_req         845311        10.32 Allocator requests
    SMA.Transient.c_fail           104         0.00 Allocator failures
    SMA.Transient.c_bytes  10835594005    132281.74 Bytes allocated
    SMA.Transient.c_freed  10835063135    132275.26 Bytes freed
    SMA.Transient.g_alloc           68          .   Allocations outstanding
    SMA.Transient.g_bytes       530870          .   Bytes outstanding
    SMA.Transient.g_space            0          .   Bytes available

That is 104 allocator failures on this instance, where we normally never see any.
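
As a sanity check on the counters above, c_bytes minus c_freed should equal g_bytes if the transient storage accounting is consistent. The sketch below computes that difference from `varnishstat -1` output; the function name transient_outstanding is hypothetical, and it assumes the field layout shown above (counter name in the first column, value in the second).

```shell
#!/bin/sh
# Hypothetical helper: read `varnishstat -1` output on stdin and print
# SMA.Transient.c_bytes minus SMA.Transient.c_freed.
transient_outstanding() {
    awk '
        $1 == "SMA.Transient.c_bytes" { allocated = $2 }
        $1 == "SMA.Transient.c_freed" { freed = $2 }
        END { printf "%d\n", allocated - freed }
    '
}

# Usage:
#   varnishstat -1 | transient_outstanding
```

For the values above this gives 10835594005 - 10835063135 = 530870, which matches SMA.Transient.g_bytes, so the failures do not appear to come from an accounting leak in the transient storage itself.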

Any ideas what caused this, or how I can fix it?

Kind regards,

Enno van Amerongen
