Varnish restarts when all memory is allocated

Marco Walraven m.walraven at terantula.com
Tue May 26 23:29:08 CEST 2009


Hi,

We are testing a Varnish cache in our production environment with a 500GB storage file and
32GB of RAM. Varnish performs very well as long as the 32GB of RAM has not all been allocated.
We are seeing throughput of around 40-60Mbit/s with roughly 2.2M objects in cache and a hit
ratio of ~0.65, and Varnish handles that easily. However, it is still warming up, since we
have a lot of objects that still need to be cached.

The problem I am facing is that as soon as RAM is exhausted, Varnish restarts itself.
Since this looked like an IO problem, we dropped ext2 in favour of xfs, which gave much
better results when writing to disk. However, varnishd still stops working once it reaches
the 32GB RAM limit. Note that I don't see any IO until just before RAM usage hits 97%.

So we tried combining the file storage type with malloc to limit the amount of memory
Varnish is allowed to allocate, starting with 5GB to see how that would work out.
It turned out that memory usage was not limited, and from reading some posts it seems
that such a limit should not be needed anyway.

I have seen some posts about running large caches with the same kind of problem, but none
with a real approach to a solution. What is the best way to get around this issue?

Below are the init script and output of both varnishstat and top.

Hitrate ratio:        3        3        3
Hitrate avg:     0.6008   0.6008   0.6008

       10871         1.00         1.17 Client connections accepted
     5278218       273.99       566.76 Client requests received
     2864011       172.99       307.53 Cache hits
     2413896       101.00       259.20 Cache misses
     2413920       101.00       259.20 Backend connections success
     2391749        99.00       256.82 Backend connections reuses
     2391795        99.00       256.82 Backend connections recycles
         148          .            .   N struct sess_mem
          29          .            .   N struct sess
     2366595          .            .   N struct object
     2364206          .            .   N struct objecthead
     4733079          .            .   N struct smf
           0          .            .   N small free smf
           1          .            .   N large free smf
          10          .            .   N struct vbe_conn
          96          .            .   N struct bereq
         400          .            .   N worker threads
         400         0.00         0.04 N worker threads created
           2          .            .   N backends
       47353          .            .   N expired objects
     2090535          .            .   N LRU moved objects
     5086915       265.99       546.22 Objects sent with write
       10867         1.00         1.17 Total Sessions
     5278227       273.99       566.76 Total Requests
          12         0.00         0.00 Total pipe
          13         0.00         0.00 Total pass
     2413900       101.00       259.20 Total fetch
  1865669893     97172.71    200329.64 Total header bytes
 22763257823   1297006.09   2444245.44 Total body bytes
        3335         0.00         0.36 Session Closed
     5275957       273.99       566.52 Session herd
   292178030     14367.51     31373.14 SHM records
     7758036       382.99       833.03 SHM writes
        6264         2.00         0.67 SHM flushes due to overflow
         239         0.00         0.03 SHM MTX contention
         125         0.00         0.01 SHM cycles through buffer
     4828098       201.99       518.43 allocator requests
     4733078          .            .   outstanding allocations
 30790995968          .            .   bytes allocated
506079916032          .            .   bytes free
         303         0.00         0.03 SMS allocator requests
      130986          .            .   SMS bytes allocated
      130986          .            .   SMS bytes freed
     2413909       101.00       259.20 Backend requests made
           1         0.00         0.00 N vcl total
           1         0.00         0.00 N vcl available
           1          .            .   N total active purges
           1         0.00         0.00 N new purges added
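
To make the counters above easier to read, here is a minimal Python sketch that derives a
few summary figures from them. The numbers are copied from the varnishstat output; the
function names are hypothetical helpers, not part of Varnish. It assumes that "bytes
allocated" and "bytes free" describe the storage file, so the mean size per object is only
a rough estimate.

```python
# Counters copied from the varnishstat output above.
cache_hits = 2864011
cache_misses = 2413896
n_objects = 2366595             # "N struct object"
bytes_allocated = 30790995968   # "bytes allocated"
bytes_free = 506079916032       # "bytes free"

GIB = 1024 ** 3


def hit_ratio(hits, misses):
    """Lifetime hit ratio: hits / (hits + misses)."""
    total = hits + misses
    return hits / total if total else 0.0


def storage_summary(allocated, free, objects):
    """Return (total GiB, allocated GiB, percent used, mean bytes per object)."""
    total = allocated + free
    mean = allocated / objects if objects else 0.0
    return total / GIB, allocated / GIB, 100.0 * allocated / total, mean


if __name__ == "__main__":
    total_gib, alloc_gib, pct_used, mean_bytes = storage_summary(
        bytes_allocated, bytes_free, n_objects
    )
    print(f"lifetime hit ratio : {hit_ratio(cache_hits, cache_misses):.3f}")
    print(f"storage size       : {total_gib:.1f} GiB")
    print(f"storage allocated  : {alloc_gib:.1f} GiB ({pct_used:.1f}%)")
    print(f"mean bytes/object  : {mean_bytes:.0f}")
```

With these counters the lifetime hit ratio comes out at roughly 0.54, and about 28.7 GiB of
the 500 GiB storage file is allocated. The lifetime ratio need not match the "Hitrate avg"
line, which appears to be averaged over a recent window rather than the whole run.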

top - 15:13:40 up 7 days, 33 min,  2 users,  load average: 0.14, 0.71, 0.75
Tasks: 116 total,   1 running, 115 sleeping,   0 stopped,   0 zombie
Cpu0  :  3.0%us,  1.0%sy,  0.0%ni, 93.0%id,  1.7%wa,  0.0%hi,  1.3%si,  0.0%st
Cpu1  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu2  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu3  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu4  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu5  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu6  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Cpu7  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:  32942712k total, 32777060k used,   165652k free,     2164k buffers
Swap:   506008k total,    25664k used,   480344k free, 29918680k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
26992 nobody    15   0  505g  30g  28g S    4 98.6  17:26.94 varnishd
28537 root      15   0  6628 1208  864 R    0  0.0   0:00.52 top
    1 root      15   0  6120  556  500 S    0  0.0   0:08.61 init
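
As a rough aid for reading the top output above, the following Python sketch splits the
"used" memory into page cache and everything else. The two input lines are copied from the
output; the helper names are hypothetical. It assumes top's usual layout, where "cached"
appears on the Swap line but counts page cache in RAM.

```python
import re

# Lines copied from the top output above.
TOP_MEM = "Mem:  32942712k total, 32777060k used,   165652k free,     2164k buffers"
TOP_SWAP = "Swap:   506008k total,    25664k used,   480344k free, 29918680k cached"


def parse_kib_fields(line):
    """Map each label (total, used, free, ...) to its value in KiB."""
    return {label: int(value) for value, label in re.findall(r"(\d+)k (\w+)", line)}


def memory_breakdown(mem_line, swap_line):
    """Split used RAM into page cache and the non-cache remainder (KiB)."""
    mem = parse_kib_fields(mem_line)
    swap = parse_kib_fields(swap_line)
    cached = swap["cached"]  # page cache size, reported on the Swap line
    return {
        "total_kib": mem["total"],
        "cached_kib": cached,
        "cached_pct": 100.0 * cached / mem["total"],
        "non_cache_kib": mem["used"] - cached - mem["buffers"],
    }


if __name__ == "__main__":
    info = memory_breakdown(TOP_MEM, TOP_SWAP)
    print(f"page cache     : {info['cached_kib']} KiB ({info['cached_pct']:.1f}% of RAM)")
    print(f"used, non-cache: {info['non_cache_kib']} KiB")
```

On these figures, roughly 91% of RAM is page cache and under 3 GiB is used for anything
else. One possible reading is that most of varnishd's 30g resident size is file-backed
pages of the storage file, which would fit the 28g SHR value, but that is an inference
from the numbers, not something the output states directly.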

Device:         rrqm/s   wrqm/s   r/s   w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
sda               0.00     0.00  0.00  0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
sdb               2.00  1589.00 14.00 37.00  1960.00 13008.00   293.49     0.16    3.06   0.71   3.60

We are running Varnish 2.0.4 on 64-bit Linux 2.6.

Regards,

Marco

-- 
 Terantula - Industrial Strength Open Source
 phone:+31 64 3232 400 / www: http://www.terantula.com / pgpkey: E7EE7A46
 pgp fingerprint: F2EE 122D 964C DE68 7380 6F95 3710 7719 E7EE 7A46 


