[Varnish] #1617: Varnish 4 weird memory consumption / calculation
Varnish
varnish-bugs at varnish-cache.org
Fri Oct 24 09:27:29 CEST 2014
#1617: Varnish 4 weird memory consumption / calculation
----------------------+----------------------
Reporter: whocares | Type: defect
Status: new | Priority: normal
Milestone: | Component: varnishd
Version: 4.0.2 | Severity: normal
Keywords: |
----------------------+----------------------
Since switching to Varnish 4 (currently 4.0.2) we're seeing some weirdness
in how Varnish consumes and accounts for memory. In short, memory
requirements seem to have quadrupled compared to Varnish 3.0.5. Or at
least that's what Varnish thinks, because in reality it doesn't even use
the memory it is given. This is best shown with an example:
Here's an excerpt of the output from "top" on one of the machines:
{{{
top - 08:18:52 up 22 days, 1:16, 3 users, load average: 0.06, 0.07, 0.05
Tasks: 80 total, 1 running, 79 sleeping, 0 stopped, 0 zombie
%Cpu(s): 1.5 us, 0.5 sy, 0.0 ni, 97.8 id, 0.0 wa, 0.0 hi, 0.2 si, 0.0 st
KiB Mem: 8716524 total, 3599224 used, 5117300 free, 163632 buffers
KiB Swap: 265212 total, 0 used, 265212 free, 1038644 cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
13716 nobody 20 0 8457m 2.2g 81m S 4.0 26.1 44:26.34 varnishd
}}}
As one can see, Varnish only uses 2.2G of resident RAM (RES) and there's
plenty of free RAM available on the host. However, as shown in the VIRT
column, Varnish somehow manages to allocate roughly 8.3G (8457m) of
virtual memory. This is also reflected by varnishstat:
{{{
~# varnishstat -1 | grep SMA
SMA.s0.c_req                 741182        13.15 Allocator requests
SMA.s0.c_fail                     0         0.00 Allocator failures
SMA.s0.c_bytes          24589352765    436360.54 Bytes allocated
SMA.s0.c_freed          16205023965    287572.96 Bytes freed
SMA.s0.g_alloc               132231          .   Allocations outstanding
SMA.s0.g_bytes           8384328800          .   Bytes outstanding
SMA.s0.g_space            205605792          .   Bytes available
SMA.Transient.c_req          149507         2.65 Allocator requests
SMA.Transient.c_fail              0         0.00 Allocator failures
SMA.Transient.c_bytes    8960312066    159008.93 Bytes allocated
SMA.Transient.c_freed    8960162698    159006.28 Bytes freed
SMA.Transient.g_alloc            90          .   Allocations outstanding
SMA.Transient.g_bytes        149368          .   Bytes outstanding
SMA.Transient.g_space             0          .   Bytes available
}}}
SMA.s0.g_bytes seems to be in line with the virtual memory size reported
in top's VIRT column, and this also seems to be what Varnish thinks it is
using. That can't be the case, since in reality it uses a lot less memory.
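To make the accounting a bit more concrete, here is a small sanity check
on the numbers quoted above. It is only a sketch: it assumes that top's
"m" suffix means MiB and that `-s malloc,8G` means 8 GiB, and the variable
names are mine, not anything Varnish provides.

```python
# Figures copied from the varnishstat and top output quoted in this ticket.
G_BYTES = 8_384_328_800   # SMA.s0.g_bytes (bytes outstanding)
G_SPACE = 205_605_792     # SMA.s0.g_space (bytes available)
VIRT_MIB = 8457           # VIRT as printed by top ("8457m"), assumed MiB
GIB = 1024 ** 3           # assumption: "8G" in "-s malloc,8G" is 8 GiB


def to_gib(n_bytes):
    """Convert a byte count to GiB."""
    return n_bytes / GIB


# Outstanding plus available bytes should add up to the configured size.
storage_total = G_BYTES + G_SPACE

print(f"g_bytes           : {to_gib(G_BYTES):.2f} GiB")
print(f"g_bytes + g_space : {to_gib(storage_total):.2f} GiB")
print(f"VIRT              : {VIRT_MIB / 1024:.2f} GiB")
print("equals 8 GiB      :", storage_total == 8 * GIB)
```

If those assumptions hold, g_bytes and g_space together add up to exactly
the configured 8G, so from Varnish's point of view the storage is almost
full even though RES is only 2.2G.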
At the same time we're also seeing far fewer objects being stored in
Varnish's cache. When still on 3.0.5 we were able to keep ~140k objects in
a 4GB memory cache. With Varnish 4.0.2 we now need to give Varnish 8GB of
cache to be able to hold ~70k objects.
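Put as a per-object figure, the difference looks like this. Again just a
rough sketch: the object counts are the approximate ones given above, the
cache sizes are assumed to be GiB, and `bytes_per_object` is a helper name
made up for this illustration.

```python
GIB = 1024 ** 3  # assumption: the "4GB" and "8GB" cache sizes are GiB


def bytes_per_object(cache_bytes, n_objects):
    """Average storage accounted per cached object."""
    return cache_bytes / n_objects


# Approximate figures from the description above.
v3 = bytes_per_object(4 * GIB, 140_000)   # Varnish 3.0.5
v4 = bytes_per_object(8 * GIB, 70_000)    # Varnish 4.0.2

# The same figure from the actual counters quoted in this ticket:
# SMA.s0.g_bytes divided by MAIN.n_object on the non-jemalloc host.
measured = bytes_per_object(8_384_328_800, 67_877)

print(f"3.0.5 (approx.)  : {v3 / 1024:.1f} KiB per object")
print(f"4.0.2 (approx.)  : {v4 / 1024:.1f} KiB per object")
print(f"4.0.2 (counters) : {measured / 1024:.1f} KiB per object")
print(f"ratio 4.0.2/3.0.5: {v4 / v3:.1f}x")
```

That is where the "quadrupled" above comes from: twice the cache for half
the objects.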
I found that when disabling jemalloc the numbers get slightly better. In
particular, the RAM usage reported in the RES column goes down
dramatically; the VIRT figure not so much. The examples above are from
Varnish compiled with `--disable-jemalloc`. Using an identical twin with
jemalloc enabled I get this:
{{{
top - 08:20:37 up 16 days, 20:34, 2 users, load average: 0.15, 0.14, 0.09
Tasks: 78 total, 1 running, 77 sleeping, 0 stopped, 0 zombie
%Cpu(s): 1.2 us, 0.3 sy, 0.0 ni, 98.2 id, 0.0 wa, 0.0 hi, 0.2 si, 0.2 st
KiB Mem: 8713744 total, 6227532 used, 2486212 free, 155360 buffers
KiB Swap: 265212 total, 0 used, 265212 free, 419784 cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
18182 nobody 20 0 10.0g 5.3g 83m S 4.3 64.0 461:00.62 varnishd
}}}
As one can see, RES usage is more than twice as high (5.3g versus 2.2g).
That's with jemalloc 3.6.0 (backported from Jessie); I also tried 3.0.0
(as shipped with Debian Wheezy) and 3.4.0. And here's the object count:
Varnish with standard malloc:
{{{
root@bus-cw-vrn-02:~# varnishstat -1 | grep obje
MAIN.n_object 67877 . N struct object
MAIN.n_vampireobject 0 . N unresurrected objects
MAIN.n_objectcore 67941 . N struct objectcore
MAIN.n_objecthead 68241 . N struct objecthead
}}}
Varnish with jemalloc:
{{{
root@bus-cw-vrn-01:~# varnishstat -1 | grep obje
MAIN.n_object 68456 . N struct object
MAIN.n_vampireobject 0 . N unresurrected objects
MAIN.n_objectcore 65987 . N struct objectcore
MAIN.n_objecthead 67722 . N struct objecthead
}}}
Both machines are sitting behind the same load balancer and, as would be
expected, they show roughly the same number of objects. Only the memory
usage values differ heavily.
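For anyone who wants to compare the counters of two hosts without eyeballing
them, something along these lines should do. It is a minimal sketch that
only assumes the "name value rate description" layout seen in the
`varnishstat -1` output above; `parse_varnishstat` and `relative_diff` are
names invented for this example, and the sample data is pasted from the two
listings.

```python
HOST_MALLOC = """\
MAIN.n_object 67877 . N struct object
MAIN.n_vampireobject 0 . N unresurrected objects
MAIN.n_objectcore 67941 . N struct objectcore
MAIN.n_objecthead 68241 . N struct objecthead
"""

HOST_JEMALLOC = """\
MAIN.n_object 68456 . N struct object
MAIN.n_vampireobject 0 . N unresurrected objects
MAIN.n_objectcore 65987 . N struct objectcore
MAIN.n_objecthead 67722 . N struct objecthead
"""


def parse_varnishstat(text):
    """Map counter name -> integer value from `varnishstat -1` style lines."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip wrapped description fragments and anything else that is
        # not "<name> <integer> ...".
        if len(fields) < 2 or not fields[1].isdigit():
            continue
        counters[fields[0]] = int(fields[1])
    return counters


def relative_diff(a, b):
    """Difference as a fraction of the larger value (0.0 if both are zero)."""
    largest = max(a, b)
    return abs(a - b) / largest if largest else 0.0


malloc_stats = parse_varnishstat(HOST_MALLOC)
jemalloc_stats = parse_varnishstat(HOST_JEMALLOC)

for name in sorted(malloc_stats):
    a, b = malloc_stats[name], jemalloc_stats[name]
    print(f"{name:<22} {a:>7} {b:>7} {relative_diff(a, b):6.1%}")
```

On the figures above every object counter differs by less than 3% between
the two hosts, while RES differs by more than a factor of two.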
We have other machines where we had to tell Varnish to "use" 19G of RAM to
stop it from LRU nuking objects while at the same time it only uses around
6-7 GB in reality, like this one for example:
{{{
top - 09:25:18 up 7 days, 23:12, 2 users, load average: 0.00, 0.04, 0.05
Tasks: 86 total, 1 running, 85 sleeping, 0 stopped, 0 zombie
%Cpu(s): 1.5 us, 1.2 sy, 0.0 ni, 96.5 id, 0.0 wa, 0.0 hi, 0.8 si, 0.0 st
KiB Mem: 20608476 total, 6223084 used, 14385392 free, 161092 buffers
KiB Swap: 265212 total, 0 used, 265212 free, 298496 cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
18138 nobody 20 0 21.7g 5.4g 83m S 6.6 27.4 734:59.64 varnishd
}}}
Now I generally assume that it's me who is doing something wrong, so I'd
be really thankful to get some pointers as to what the source of our
problems could be.
Here's the system and Varnish config:
OS: Debian Wheezy with all updates included
Varnish: 4.0.2 from the official repository and also manually built to
disable jemalloc
Varnish runtime parameters:
`/usr/sbin/varnishd -P /var/run/varnishd.pid -a 0.0.0.0:80 -f
/etc/varnish/default.vcl -T 0.0.0.0:6082 -t 120 -S /etc/varnish/secret -s
malloc,8G`
If there's anything else you need to know just let me know.
--
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1617>
Varnish <https://varnish-cache.org/>
The Varnish HTTP Accelerator