FW: Varnish virtual memory usage

Henry Paulissen h.paulissen at qbell.nl
Thu Nov 5 01:48:30 CET 2009


See the pmap.txt attachment.
The startup command is in the beginning of the file.


/usr/local/varnish/sbin/varnishd -P /var/run/xxx.pid -a 0.0.0.0:xxx \
  -f /usr/local/varnish/etc/varnish/xxx.xxx.xxx.vcl -T 0.0.0.0:xxx -s malloc,1G \
  -i xxx -n /usr/local/varnish/var/varnish/xxx \
  -p obj_workspace 8192 -p sess_workspace 262144 -p listen_depth 8192 \
  -p lru_interval 60 -p sess_timeout 10 -p shm_workspace 32768 \
  -p ping_interval 2 -p thread_pools 4 -p thread_pool_min 50 \
  -p thread_pool_max 4000 -p esi_syntax 1 -p overflow_max 10000
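
(A rough sizing sketch for this command line, assuming the default ~1MB
per-thread stack on Linux; the actual stack size depends on the ulimit in
effect when varnishd starts:

    # hypothetical back-of-the-envelope, not measured output
    cache_mb=1024        # -s malloc,1G
    threads=4000         # thread_pool_max
    stack_mb=1           # assumed default pthread stack size
    echo "$(( cache_mb + threads * stack_mb )) MB virtual if all threads start"

So the 1G malloc cache plus a few thousand ~1MB thread stacks can already
account for several GB of virtual size before any leak comes into play.)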


P.S. Sorry for the double mail. Forgot to CC.

-----Original Message-----
From: Ken Brownfield [mailto:kb+varnish at slide.com]
Sent: Thursday, November 5, 2009 1:42
To: Henry Paulissen
Subject: Re: Varnish virtual memory usage

Is your -s set at 1.5GB?  What's your varnishd command line?

I'm not sure if you realize that thread_pools does not control the number
of threads, only the number of pools (and mutexes).  I think
thread_pool_max is what you're looking for?
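
(One quick way to double-check what the running child actually has -- a
sketch only; substitute the real -T host:port from your startup command:

    varnishadm -T 0.0.0.0:xxx param.show thread_pools
    varnishadm -T 0.0.0.0:xxx param.show thread_pool_max
    varnishadm -T 0.0.0.0:xxx param.set thread_pool_max 500   # example value

)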
-- 
Ken

On Nov 4, 2009, at 4:37 PM, Henry Paulissen wrote:

> Running varnishd now for about 30 minutes with thread_pools set to 4.
>
> ===========================================================================
> ===========================================================================
> uptime                   2637          .   Child uptime
> client_conn            316759       120.12 Client connections accepted
> client_drop                 0         0.00 Connection dropped, no sess
> client_req             316738       120.11 Client requests received
> cache_hit               32477        12.32 Cache hits
> cache_hitpass               0         0.00 Cache hits for pass
> cache_miss              93703        35.53 Cache misses
> backend_conn           261033        98.99 Backend conn. success
> backend_unhealthy            0         0.00 Backend conn. not attempted
> backend_busy                0         0.00 Backend conn. too many
> backend_fail                0         0.00 Backend conn. failures
> backend_reuse           23305         8.84 Backend conn. reuses
> backend_toolate           528         0.20 Backend conn. was closed
> backend_recycle         23833         9.04 Backend conn. recycles
> backend_unused              0         0.00 Backend conn. unused
> fetch_head                  0         0.00 Fetch head
> fetch_length           280973       106.55 Fetch with Length
> fetch_chunked            1801         0.68 Fetch chunked
> fetch_eof                   0         0.00 Fetch EOF
> fetch_bad                   0         0.00 Fetch had bad headers
> fetch_close              1329         0.50 Fetch wanted close
> fetch_oldhttp               0         0.00 Fetch pre HTTP/1.1 closed
> fetch_zero                  0         0.00 Fetch zero len
> fetch_failed                0         0.00 Fetch failed
> n_sess_mem                284          .   N struct sess_mem
> n_sess                     35          .   N struct sess
> n_object                90560          .   N struct object
> n_vampireobject             0          .   N unresurrected objects
> n_objectcore            90616          .   N struct objectcore
> n_objecthead            25146          .   N struct objecthead
> n_smf                       0          .   N struct smf
> n_smf_frag                  0          .   N small free smf
> n_smf_large                 0          .   N large free smf
> n_vbe_conn                 10          .   N struct vbe_conn
> n_wrk                     200          .   N worker threads
> n_wrk_create              248         0.09 N worker threads created
> n_wrk_failed                0         0.00 N worker threads not created
> n_wrk_max              100988        38.30 N worker threads limited
> n_wrk_queue                 0         0.00 N queued work requests
> n_wrk_overflow            630         0.24 N overflowed work requests
> n_wrk_drop                  0         0.00 N dropped work requests
> n_backend                   5          .   N backends
> n_expired                1027          .   N expired objects
> n_lru_nuked              2108          .   N LRU nuked objects
> n_lru_saved                 0          .   N LRU saved objects
> n_lru_moved             12558          .   N LRU moved objects
> n_deathrow                  0          .   N objects on deathrow
> losthdr                     5         0.00 HTTP header overflows
> n_objsendfile               0         0.00 Objects sent with sendfile
> n_objwrite             315222       119.54 Objects sent with write
> n_objoverflow               0         0.00 Objects overflowing workspace
> s_sess                 316740       120.11 Total Sessions
> s_req                  316738       120.11 Total Requests
> s_pipe                      0         0.00 Total pipe
> s_pass                 190664        72.30 Total pass
> s_fetch                284103       107.74 Total fetch
> s_hdrbytes          114236150     43320.50 Total header bytes
> s_bodybytes         355198316    134697.88 Total body bytes
> sess_closed            316740       120.11 Session Closed
> sess_pipeline               0         0.00 Session Pipeline
> sess_readahead              0         0.00 Session Read Ahead
> sess_linger                 0         0.00 Session Linger
> sess_herd                  33         0.01 Session herd
> shm_records          27534992     10441.79 SHM records
> shm_writes            1555265       589.79 SHM writes
> shm_flushes                 0         0.00 SHM flushes due to overflow
> shm_cont                 1689         0.64 SHM MTX contention
> shm_cycles                 12         0.00 SHM cycles through buffer
> sm_nreq                     0         0.00 allocator requests
> sm_nobj                     0          .   outstanding allocations
> sm_balloc                   0          .   bytes allocated
> sm_bfree                    0          .   bytes free
> sma_nreq               379783       144.02 SMA allocator requests
> sma_nobj               181121          .   SMA outstanding allocations
> sma_nbytes         1073735584          .   SMA outstanding bytes
> sma_balloc         1488895305          .   SMA bytes allocated
> sma_bfree           415159721          .   SMA bytes free
> sms_nreq                  268         0.10 SMS allocator requests
> sms_nobj                    0          .   SMS outstanding allocations
> sms_nbytes                  0          .   SMS outstanding bytes
> sms_balloc             156684          .   SMS bytes allocated
> sms_bfree              156684          .   SMS bytes freed
> backend_req            284202       107.77 Backend requests made
> n_vcl                       1         0.00 N vcl total
> n_vcl_avail                 1         0.00 N vcl available
> n_vcl_discard               0         0.00 N vcl discarded
> n_purge                     1          .   N total active purges
> n_purge_add                 1         0.00 N new purges added
> n_purge_retire              0         0.00 N old purges deleted
> n_purge_obj_test            0         0.00 N objects tested
> n_purge_re_test             0         0.00 N regexps tested against
> n_purge_dups                0         0.00 N duplicate purges removed
> hcb_nolock                  0         0.00 HCB Lookups without lock
> hcb_lock                    0         0.00 HCB Lookups with lock
> hcb_insert                  0         0.00 HCB Inserts
> esi_parse                   0         0.00 Objects ESI parsed (unlock)
> esi_errors                  0         0.00 ESI parse errors (unlock)
> ===========================================================================
> ===========================================================================
>
> As you can see, I now have 200 worker threads.
> Still, it's using 1.8G and is still increasing (~1 to 5 MB/s).
>
>
> -----Original Message-----
> From: Ken Brownfield [mailto:kb+varnish at slide.com]
> Sent: Thursday, November 5, 2009 1:18
> To: Henry Paulissen
> CC: varnish-misc at projects.linpro.no
> Subject: Re: Varnish virtual memory usage
>
> Hmm, well the memory adds up to a 1.5G -s option (can you confirm what
> you use with -s?) plus the memory required to run the number of threads
> you're running.  Unless your -s is drastically smaller than 1.5GB, the
> pmap you sent is of a normal, non-leaking process.
>
> Ken
>
> On Nov 4, 2009, at 3:48 PM, Henry Paulissen wrote:
>
>> Our load balancer transforms all connections from keep-alive to  
>> close.
>> So keep-alive connections aren’t the issue here.
>>
>> Also, if I limit the thread count I still see the same behavior.
>>
>> -----Original Message-----
>> From: Ken Brownfield [mailto:kb at slide.com]
>> Sent: Thursday, November 5, 2009 0:31
>> To: Henry Paulissen
>> CC: varnish-misc at projects.linpro.no
>> Subject: Re: Varnish virtual memory usage
>>
>> Looks like varnish is allocating ~1.5GB of RAM for pure cache (which
>> may roughly match your "-s file" option) but 1,610 threads with your
>> 1MB stack limit will use 1.7GB of RAM.  Pmap is reporting the
>> footprint of this instance as roughly 3.6GB, and I'm assuming top/ps
>> agree with that number.
>>
>> Unless your "-s file" option is significantly less than 1-1.5GB, the
>> sheer thread count explains your memory usage: maybe using a stack size
>> of 512K or 256K could help, and/or disable keepalives on the client side?
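>>
>> (A sketch of one way to do that on Linux -- glibc typically uses the
>> stack ulimit as the default pthread stack size, so lowering it in the
>> start script before launching varnishd shrinks every worker thread's
>> stack; values below are examples only:
>>
>>     # in the varnish init/start script
>>     ulimit -s 256        # 256KB thread stacks instead of the ~1MB default
>>     /usr/local/varnish/sbin/varnishd ...
>>
>> )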
>>
>> Also, if you happen to be using a load balancer, TCP Buffering
>> (NetScaler) or Proxy Buffering? (BigIP) or the like can drastically
>> reduce the thread count (and they can handle the persistent keepalives
>> as well).
>>
>> But IMHO, an event-based (for example) handler for "idle" or "slow"
>> threads is probably the next important feature, just below
>> persistence.  Without something like TCP buffering, the memory
>> available for actual caching is dwarfed by the thread stacksize alloc
>> overhead.
>>
>> Ken
>>
>> On Nov 4, 2009, at 3:18 PM, Henry Paulissen wrote:
>>
>>> I attached the memory dump.
>>>
>>> The child's thread count gives me 1610 (on this instance).
>>> Currently the server isn't so busy (~175 requests / sec).
>>>
>>> Varnishstat -1:
>>> =========================================================================
>>> =========================================================================
>>> uptime                   3090          .   Child uptime
>>> client_conn            435325       140.88 Client connections accepted
>>> client_drop                 0         0.00 Connection dropped, no sess
>>> client_req             435294       140.87 Client requests received
>>> cache_hit               45740        14.80 Cache hits
>>> cache_hitpass               0         0.00 Cache hits for pass
>>> cache_miss             126445        40.92 Cache misses
>>> backend_conn           355277       114.98 Backend conn. success
>>> backend_unhealthy            0         0.00 Backend conn. not attempted
>>> backend_busy                0         0.00 Backend conn. too many
>>> backend_fail                0         0.00 Backend conn. failures
>>> backend_reuse           34331        11.11 Backend conn. reuses
>>> backend_toolate           690         0.22 Backend conn. was closed
>>> backend_recycle         35021        11.33 Backend conn. recycles
>>> backend_unused              0         0.00 Backend conn. unused
>>> fetch_head                  0         0.00 Fetch head
>>> fetch_length           384525       124.44 Fetch with Length
>>> fetch_chunked            2441         0.79 Fetch chunked
>>> fetch_eof                   0         0.00 Fetch EOF
>>> fetch_bad                   0         0.00 Fetch had bad headers
>>> fetch_close              2028         0.66 Fetch wanted close
>>> fetch_oldhttp               0         0.00 Fetch pre HTTP/1.1 closed
>>> fetch_zero                  0         0.00 Fetch zero len
>>> fetch_failed                0         0.00 Fetch failed
>>> n_sess_mem                989          .   N struct sess_mem
>>> n_sess                     94          .   N struct sess
>>> n_object                89296          .   N struct object
>>> n_vampireobject             0          .   N unresurrected objects
>>> n_objectcore            89640          .   N struct objectcore
>>> n_objecthead            25379          .   N struct objecthead
>>> n_smf                       0          .   N struct smf
>>> n_smf_frag                  0          .   N small free smf
>>> n_smf_large                 0          .   N large free smf
>>> n_vbe_conn                 26          .   N struct vbe_conn
>>> n_wrk                    1600          .   N worker threads
>>> n_wrk_create             1600         0.52 N worker threads created
>>> n_wrk_failed                0         0.00 N worker threads not created
>>> n_wrk_max                1274         0.41 N worker threads limited
>>> n_wrk_queue                 0         0.00 N queued work requests
>>> n_wrk_overflow           1342         0.43 N overflowed work requests
>>> n_wrk_drop                  0         0.00 N dropped work requests
>>> n_backend                   5          .   N backends
>>> n_expired                1393          .   N expired objects
>>> n_lru_nuked             35678          .   N LRU nuked objects
>>> n_lru_saved                 0          .   N LRU saved objects
>>> n_lru_moved             20020          .   N LRU moved objects
>>> n_deathrow                  0          .   N objects on deathrow
>>> losthdr                    11         0.00 HTTP header overflows
>>> n_objsendfile               0         0.00 Objects sent with sendfile
>>> n_objwrite             433558       140.31 Objects sent with write
>>> n_objoverflow               0         0.00 Objects overflowing workspace
>>> s_sess                 435298       140.87 Total Sessions
>>> s_req                  435294       140.87 Total Requests
>>> s_pipe                      0         0.00 Total pipe
>>> s_pass                 263190        85.17 Total pass
>>> s_fetch                388994       125.89 Total fetch
>>> s_hdrbytes          157405143     50940.18 Total header bytes
>>> s_bodybytes         533077018    172516.83 Total body bytes
>>> sess_closed            435291       140.87 Session Closed
>>> sess_pipeline               0         0.00 Session Pipeline
>>> sess_readahead              0         0.00 Session Read Ahead
>>> sess_linger                 0         0.00 Session Linger
>>> sess_herd                  69         0.02 Session herd
>>> shm_records          37936743     12277.26 SHM records
>>> shm_writes            2141029       692.89 SHM writes
>>> shm_flushes                 0         0.00 SHM flushes due to overflow
>>> shm_cont                 3956         1.28 SHM MTX contention
>>> shm_cycles                 16         0.01 SHM cycles through buffer
>>> sm_nreq                     0         0.00 allocator requests
>>> sm_nobj                     0          .   outstanding allocations
>>> sm_balloc                   0          .   bytes allocated
>>> sm_bfree                    0          .   bytes free
>>> sma_nreq               550879       178.28 SMA allocator requests
>>> sma_nobj               178590          .   SMA outstanding allocations
>>> sma_nbytes         1073690180          .   SMA outstanding bytes
>>> sma_balloc         2066782844          .   SMA bytes allocated
>>> sma_bfree           993092664          .   SMA bytes free
>>> sms_nreq                  649         0.21 SMS allocator requests
>>> sms_nobj                    0          .   SMS outstanding allocations
>>> sms_nbytes                  0          .   SMS outstanding bytes
>>> sms_balloc             378848          .   SMS bytes allocated
>>> sms_bfree              378848          .   SMS bytes freed
>>> backend_req            389342       126.00 Backend requests made
>>> n_vcl                       1         0.00 N vcl total
>>> n_vcl_avail                 1         0.00 N vcl available
>>> n_vcl_discard               0         0.00 N vcl discarded
>>> n_purge                     1          .   N total active purges
>>> n_purge_add                 1         0.00 N new purges added
>>> n_purge_retire              0         0.00 N old purges deleted
>>> n_purge_obj_test            0         0.00 N objects tested
>>> n_purge_re_test             0         0.00 N regexps tested against
>>> n_purge_dups                0         0.00 N duplicate purges removed
>>> hcb_nolock                  0         0.00 HCB Lookups without lock
>>> hcb_lock                    0         0.00 HCB Lookups with lock
>>> hcb_insert                  0         0.00 HCB Inserts
>>> esi_parse                   0         0.00 Objects ESI parsed (unlock)
>>> esi_errors                  0         0.00 ESI parse errors (unlock)
>>> =========================================================================
>>> =========================================================================
>>>
>>>
>>>
>>> -----Original Message-----
>>> From: Ken Brownfield [mailto:kb at slide.com]
>>> Sent: Thursday, November 5, 2009 0:01
>>> To: Henry Paulissen
>>> CC: Rogério Schneider
>>> Subject: Re: Varnish virtual memory usage
>>>
>>> Curious: For a heavily leaking varnish instance, can you run "pmap -x
>>> PID" on the parent PID and child PID, and record how many threads are
>>> active (something like 'ps -efT | grep varnish | wc -l')?  Might help
>>> isolate the RAM usage.
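>>>
>>> (For example, something along these lines -- a sketch only; <child_pid>
>>> is a placeholder and the grep pattern may need adjusting:
>>>
>>>     pgrep varnishd                       # identify parent and child PIDs
>>>     pmap -x <child_pid> > pmap.txt       # memory map of the child
>>>     ps -efT | grep '[v]arnishd' | wc -l  # thread count, grep excluded
>>>
>>> )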
>>>
>>> Sorry if you have done this already; didn't find it in my email
>>> archive.
>>>
>>> Ken
>>>
>>> On Nov 4, 2009, at 2:53 PM, Henry Paulissen wrote:
>>>
>>>> No, varnishd still uses way more than allowed.
>>>> The only solutions I found at the moment are:
>>>>
>>>> Run on x64 Linux and restart varnish every 4 hours (crontab).
>>>> Run on x32 Linux (everything works as expected, but you can't allocate
>>>> more than 4G per instance).
>>>>
>>>>
>>>> I hope linpro will find this issue and address it.
>>>>
>>>>
>>>>
>>>> Again @ linpro: if you need a machine (with live traffic) to run
>>>> some tests, please contact me.
>>>> We have multiple machines in a high-availability setup, so testing and
>>>> rebooting an instance wouldn't hurt us.
>>>>
>>>>
>>>> Regards.
>>>>
>>>> -----Original Message-----
>>>> From: Rogério Schneider [mailto:stockrt at gmail.com]
>>>> Sent: Wednesday, November 4, 2009 22:04
>>>> To: Henry Paulissen
>>>> CC: Scott Wilson; varnish-misc at projects.linpro.no
>>>> Subject: Re: Varnish virtual memory usage
>>>>
>>>> On Thu, Oct 22, 2009 at 6:04 AM, Henry Paulissen
>>>> <h.paulissen at qbell.nl>
>>>> wrote:
>>>>> I will report back.
>>>>
>>>> Did this solve the problem?
>>>>
>>>> Removing this?
>>>>
>>>>>>    if (req.http.Cache-Control == "no-cache" || req.http.Pragma ==
>>>> "no-cache") {
>>>>>>            purge_url(req.url);
>>>>>>    }
>>>>>>
>>>>
>>>> Cheers
>>>>
>>>> Att,
>>>> -- 
>>>> Rogério Schneider
>>>>
>>>> MSN: stockrt at hotmail.com
>>>> GTalk: stockrt at gmail.com
>>>> Skype: stockrt
>>>> http://stockrt.github.com
>>>>
>>>> _______________________________________________
>>>> varnish-misc mailing list
>>>> varnish-misc at projects.linpro.no
>>>> http://projects.linpro.no/mailman/listinfo/varnish-misc
>>> <pmap.txt>
>>
> <pmap.txt>



