Varnish virtual memory usage

Henry Paulissen h.paulissen at qbell.nl
Thu Nov 5 08:34:46 CET 2009


Good morning :)

What do you propose for sess_workspace and shm_workspace?
At first I didn't set these parameters at all and saw the same issue. Is
the default also set too high?

I have set them both to 8192 now and will report back later today.
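
For reference, the change is just the same -p syntax as in the startup
command quoted below:

    -p sess_workspace 8192 -p shm_workspace 8192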


overflow_max isn't ignored, as I see an overflow of 1000+ when I set the
worker count to 1600.



Regards.

-----Original Message-----
From: Ken Brownfield [mailto:kb+varnish at slide.com]
Sent: Thursday, November 5, 2009 2:16
To: Henry Paulissen
CC: varnish-misc at projects.linpro.no
Subject: Re: Varnish virtual memory usage

Ah, sorry, missed that the command-line was in there.

Given 1G of cache, large sess_workspace and shm_workspace buffers,
and the number of threads, the math adds up correctly.  Do you
definitely need those large buffers?

Your memory footprint will simply increase with thread count; reducing  
active simultaneous connections, reducing the stack size, and reducing  
the large sess_workspace are the only ways I know of for you to  
control the memory.  I'm really not seeing a leak or malfunction,  
IMHO.  The reason behind your high/growing worker count is worth  
investigating (lower send_timeout? slow/disconnecting clients? strace  
the threads to see what they're doing?)
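
A rough sketch of how I'm adding it up (assuming the ~1MB default
per-thread stack and one sess_workspace per active session):

    1610 threads x 1MB stack               ~= 1.6GB
    ~1000 sess_mem x 256KB sess_workspace  ~= 0.25GB
    cache (-s malloc,1G)                    = 1.0GB
                                     total ~= 2.9GB + malloc/shm overhead

which lands in the neighborhood of the ~3.6GB pmap total.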

Minor thing: overflow_max is a percentage, so 10000 is probably ignored?
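
(If it is indeed a percentage of the thread limit, the intended effect
would be something like "-p overflow_max 100" -- allow the overflow
queue to reach 100% of the worker count -- rather than 10000.)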
-- 
Ken

On Nov 4, 2009, at 4:47 PM, Henry Paulissen wrote:

> See the pmap.txt attachment.
> The startup command is in the beginning of the file.
>
>
> /usr/local/varnish/sbin/varnishd -P /var/run/xxx.pid -a 0.0.0.0:xxx \
>     -f /usr/local/varnish/etc/varnish/xxx.xxx.xxx.vcl -T 0.0.0.0:xxx \
>     -s malloc,1G -i xxx -n /usr/local/varnish/var/varnish/xxx \
>     -p obj_workspace 8192 -p sess_workspace 262144 -p listen_depth 8192 \
>     -p lru_interval 60 -p sess_timeout 10 -p shm_workspace 32768 \
>     -p ping_interval 2 -p thread_pools 4 -p thread_pool_min 50 \
>     -p thread_pool_max 4000 -p esi_syntax 1 -p overflow_max 10000
>
>
> -----Original Message-----
> From: Ken Brownfield [mailto:kb+varnish at slide.com]
> Sent: Thursday, November 5, 2009 1:42
> To: Henry Paulissen
> Subject: Re: Varnish virtual memory usage
>
> Is your -s set at 1.5GB?  What's your varnishd command line?
>
> I'm not sure if you realize that thread_pools does not control the
> number of threads, only the number of pools (and mutexes).  I think
> thread_pool_max is what you're looking for?
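>
> For example (a sketch based on my reading of the 2.x parameters,
> where thread_pool_min is per pool):
>
>     -p thread_pools 4 -p thread_pool_min 50 -p thread_pool_max 4000
>
> gives 4 pools with at least 4 x 50 = 200 threads total, growing
> toward the thread_pool_max cap under load.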
> -- 
> Ken
>
> On Nov 4, 2009, at 4:37 PM, Henry Paulissen wrote:
>
>> Running varnishd now for about 30 minutes with thread_pools set to 4.
>>
>> ===========================================================================
>> ===========================================================================
>> uptime                   2637          .   Child uptime
>> client_conn            316759       120.12 Client connections accepted
>> client_drop                 0         0.00 Connection dropped, no sess
>> client_req             316738       120.11 Client requests received
>> cache_hit               32477        12.32 Cache hits
>> cache_hitpass               0         0.00 Cache hits for pass
>> cache_miss              93703        35.53 Cache misses
>> backend_conn           261033        98.99 Backend conn. success
>> backend_unhealthy            0         0.00 Backend conn. not attempted
>> backend_busy                0         0.00 Backend conn. too many
>> backend_fail                0         0.00 Backend conn. failures
>> backend_reuse           23305         8.84 Backend conn. reuses
>> backend_toolate           528         0.20 Backend conn. was closed
>> backend_recycle         23833         9.04 Backend conn. recycles
>> backend_unused              0         0.00 Backend conn. unused
>> fetch_head                  0         0.00 Fetch head
>> fetch_length           280973       106.55 Fetch with Length
>> fetch_chunked            1801         0.68 Fetch chunked
>> fetch_eof                   0         0.00 Fetch EOF
>> fetch_bad                   0         0.00 Fetch had bad headers
>> fetch_close              1329         0.50 Fetch wanted close
>> fetch_oldhttp               0         0.00 Fetch pre HTTP/1.1 closed
>> fetch_zero                  0         0.00 Fetch zero len
>> fetch_failed                0         0.00 Fetch failed
>> n_sess_mem                284          .   N struct sess_mem
>> n_sess                     35          .   N struct sess
>> n_object                90560          .   N struct object
>> n_vampireobject             0          .   N unresurrected objects
>> n_objectcore            90616          .   N struct objectcore
>> n_objecthead            25146          .   N struct objecthead
>> n_smf                       0          .   N struct smf
>> n_smf_frag                  0          .   N small free smf
>> n_smf_large                 0          .   N large free smf
>> n_vbe_conn                 10          .   N struct vbe_conn
>> n_wrk                     200          .   N worker threads
>> n_wrk_create              248         0.09 N worker threads created
>> n_wrk_failed                0         0.00 N worker threads not created
>> n_wrk_max              100988        38.30 N worker threads limited
>> n_wrk_queue                 0         0.00 N queued work requests
>> n_wrk_overflow            630         0.24 N overflowed work requests
>> n_wrk_drop                  0         0.00 N dropped work requests
>> n_backend                   5          .   N backends
>> n_expired                1027          .   N expired objects
>> n_lru_nuked              2108          .   N LRU nuked objects
>> n_lru_saved                 0          .   N LRU saved objects
>> n_lru_moved             12558          .   N LRU moved objects
>> n_deathrow                  0          .   N objects on deathrow
>> losthdr                     5         0.00 HTTP header overflows
>> n_objsendfile               0         0.00 Objects sent with sendfile
>> n_objwrite             315222       119.54 Objects sent with write
>> n_objoverflow               0         0.00 Objects overflowing workspace
>> s_sess                 316740       120.11 Total Sessions
>> s_req                  316738       120.11 Total Requests
>> s_pipe                      0         0.00 Total pipe
>> s_pass                 190664        72.30 Total pass
>> s_fetch                284103       107.74 Total fetch
>> s_hdrbytes          114236150     43320.50 Total header bytes
>> s_bodybytes         355198316    134697.88 Total body bytes
>> sess_closed            316740       120.11 Session Closed
>> sess_pipeline               0         0.00 Session Pipeline
>> sess_readahead              0         0.00 Session Read Ahead
>> sess_linger                 0         0.00 Session Linger
>> sess_herd                  33         0.01 Session herd
>> shm_records          27534992     10441.79 SHM records
>> shm_writes            1555265       589.79 SHM writes
>> shm_flushes                 0         0.00 SHM flushes due to overflow
>> shm_cont                 1689         0.64 SHM MTX contention
>> shm_cycles                 12         0.00 SHM cycles through buffer
>> sm_nreq                     0         0.00 allocator requests
>> sm_nobj                     0          .   outstanding allocations
>> sm_balloc                   0          .   bytes allocated
>> sm_bfree                    0          .   bytes free
>> sma_nreq               379783       144.02 SMA allocator requests
>> sma_nobj               181121          .   SMA outstanding allocations
>> sma_nbytes         1073735584          .   SMA outstanding bytes
>> sma_balloc         1488895305          .   SMA bytes allocated
>> sma_bfree           415159721          .   SMA bytes free
>> sms_nreq                  268         0.10 SMS allocator requests
>> sms_nobj                    0          .   SMS outstanding allocations
>> sms_nbytes                  0          .   SMS outstanding bytes
>> sms_balloc             156684          .   SMS bytes allocated
>> sms_bfree              156684          .   SMS bytes freed
>> backend_req            284202       107.77 Backend requests made
>> n_vcl                       1         0.00 N vcl total
>> n_vcl_avail                 1         0.00 N vcl available
>> n_vcl_discard               0         0.00 N vcl discarded
>> n_purge                     1          .   N total active purges
>> n_purge_add                 1         0.00 N new purges added
>> n_purge_retire              0         0.00 N old purges deleted
>> n_purge_obj_test            0         0.00 N objects tested
>> n_purge_re_test             0         0.00 N regexps tested against
>> n_purge_dups                0         0.00 N duplicate purges removed
>> hcb_nolock                  0         0.00 HCB Lookups without lock
>> hcb_lock                    0         0.00 HCB Lookups with lock
>> hcb_insert                  0         0.00 HCB Inserts
>> esi_parse                   0         0.00 Objects ESI parsed (unlock)
>> esi_errors                  0         0.00 ESI parse errors (unlock)
>> ===========================================================================
>> ===========================================================================
>>
>> As you can see, I now have 200 worker threads.
>> Still, it is using 1.8G and is still increasing (~1 to 5 MB/s).
>>
>>
>> -----Original Message-----
>> From: Ken Brownfield [mailto:kb+varnish at slide.com]
>> Sent: Thursday, November 5, 2009 1:18
>> To: Henry Paulissen
>> CC: varnish-misc at projects.linpro.no
>> Subject: Re: Varnish virtual memory usage
>>
>> Hmm, well the memory adds up to a 1.5G -s option (can you confirm
>> what you use with -s?) plus the memory required to run the number of
>> threads you're running.  Unless your -s is drastically smaller than
>> 1.5GB, the pmap you sent is of a normal, non-leaking process.
>>
>> Ken
>>
>> On Nov 4, 2009, at 3:48 PM, Henry Paulissen wrote:
>>
>>> Our load balancer transforms all connections from keep-alive to
>>> close.
>>> So keep-alive connections aren’t the issue here.
>>>
>>> Also, if I limit the thread count I still see the same behavior.
>>>
>>> -----Original Message-----
>>> From: Ken Brownfield [mailto:kb at slide.com]
>>> Sent: Thursday, November 5, 2009 0:31
>>> To: Henry Paulissen
>>> CC: varnish-misc at projects.linpro.no
>>> Subject: Re: Varnish virtual memory usage
>>>
>>> Looks like varnish is allocating ~1.5GB of RAM for pure cache (which
>>> may roughly match your "-s file" option) but 1,610 threads with your
>>> 1MB stack limit will use 1.7GB of RAM.  Pmap is reporting the
>>> footprint of this instance as roughly 3.6GB, and I'm assuming top/ps
>>> agree with that number.
>>>
>>> Unless your "-s file" option is significantly less than 1-1.5GB, the
>>> sheer thread count explains your memory usage: maybe using a
>>> stacksize
>>> of 512K or 256K could help, and/or disable keepalives on the client
>>> side?
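>>>
>>> A sketch of the stacksize change, assuming varnishd inherits its
>>> per-thread stack limit from the shell or init script that starts it:
>>>
>>>     ulimit -s 256        # 256K per-thread stacks
>>>     /usr/local/varnish/sbin/varnishd ...
>>>
>>> At 1,610 threads that would be ~400MB of stack instead of ~1.7GB.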
>>>
>>> Also, if you happen to be using a load balancer, TCP Buffering
>>> (NetScaler) or Proxy Buffering? (BigIP) or the like can drastically
>>> reduce the thread count (and they can handle the persistent
>>> keepalives
>>> as well).
>>>
>>> But IMHO, an event-based (for example) handler for "idle" or "slow"
>>> threads is probably the next important feature, just below
>>> persistence.  Without something like TCP buffering, the memory
>>> available for actual caching is dwarfed by the thread stacksize  
>>> alloc
>>> overhead.
>>>
>>> Ken
>>>
>>> On Nov 4, 2009, at 3:18 PM, Henry Paulissen wrote:
>>>
>>>> I attached the memory dump.
>>>>
>>>> Counting child processes gives me 1610 (on this instance).
>>>> Currently the server isn't very busy (~175 requests/sec).
>>>>
>>>> Varnishstat -1:
>>>> =========================================================================
>>>> =========================================================================
>>>> uptime                   3090          .   Child uptime
>>>> client_conn            435325       140.88 Client connections accepted
>>>> client_drop                 0         0.00 Connection dropped, no sess
>>>> client_req             435294       140.87 Client requests received
>>>> cache_hit               45740        14.80 Cache hits
>>>> cache_hitpass               0         0.00 Cache hits for pass
>>>> cache_miss             126445        40.92 Cache misses
>>>> backend_conn           355277       114.98 Backend conn. success
>>>> backend_unhealthy            0         0.00 Backend conn. not attempted
>>>> backend_busy                0         0.00 Backend conn. too many
>>>> backend_fail                0         0.00 Backend conn. failures
>>>> backend_reuse           34331        11.11 Backend conn. reuses
>>>> backend_toolate           690         0.22 Backend conn. was closed
>>>> backend_recycle         35021        11.33 Backend conn. recycles
>>>> backend_unused              0         0.00 Backend conn. unused
>>>> fetch_head                  0         0.00 Fetch head
>>>> fetch_length           384525       124.44 Fetch with Length
>>>> fetch_chunked            2441         0.79 Fetch chunked
>>>> fetch_eof                   0         0.00 Fetch EOF
>>>> fetch_bad                   0         0.00 Fetch had bad headers
>>>> fetch_close              2028         0.66 Fetch wanted close
>>>> fetch_oldhttp               0         0.00 Fetch pre HTTP/1.1 closed
>>>> fetch_zero                  0         0.00 Fetch zero len
>>>> fetch_failed                0         0.00 Fetch failed
>>>> n_sess_mem                989          .   N struct sess_mem
>>>> n_sess                     94          .   N struct sess
>>>> n_object                89296          .   N struct object
>>>> n_vampireobject             0          .   N unresurrected objects
>>>> n_objectcore            89640          .   N struct objectcore
>>>> n_objecthead            25379          .   N struct objecthead
>>>> n_smf                       0          .   N struct smf
>>>> n_smf_frag                  0          .   N small free smf
>>>> n_smf_large                 0          .   N large free smf
>>>> n_vbe_conn                 26          .   N struct vbe_conn
>>>> n_wrk                    1600          .   N worker threads
>>>> n_wrk_create             1600         0.52 N worker threads created
>>>> n_wrk_failed                0         0.00 N worker threads not created
>>>> n_wrk_max                1274         0.41 N worker threads limited
>>>> n_wrk_queue                 0         0.00 N queued work requests
>>>> n_wrk_overflow           1342         0.43 N overflowed work requests
>>>> n_wrk_drop                  0         0.00 N dropped work requests
>>>> n_backend                   5          .   N backends
>>>> n_expired                1393          .   N expired objects
>>>> n_lru_nuked             35678          .   N LRU nuked objects
>>>> n_lru_saved                 0          .   N LRU saved objects
>>>> n_lru_moved             20020          .   N LRU moved objects
>>>> n_deathrow                  0          .   N objects on deathrow
>>>> losthdr                    11         0.00 HTTP header overflows
>>>> n_objsendfile               0         0.00 Objects sent with sendfile
>>>> n_objwrite             433558       140.31 Objects sent with write
>>>> n_objoverflow               0         0.00 Objects overflowing workspace
>>>> s_sess                 435298       140.87 Total Sessions
>>>> s_req                  435294       140.87 Total Requests
>>>> s_pipe                      0         0.00 Total pipe
>>>> s_pass                 263190        85.17 Total pass
>>>> s_fetch                388994       125.89 Total fetch
>>>> s_hdrbytes          157405143     50940.18 Total header bytes
>>>> s_bodybytes         533077018    172516.83 Total body bytes
>>>> sess_closed            435291       140.87 Session Closed
>>>> sess_pipeline               0         0.00 Session Pipeline
>>>> sess_readahead              0         0.00 Session Read Ahead
>>>> sess_linger                 0         0.00 Session Linger
>>>> sess_herd                  69         0.02 Session herd
>>>> shm_records          37936743     12277.26 SHM records
>>>> shm_writes            2141029       692.89 SHM writes
>>>> shm_flushes                 0         0.00 SHM flushes due to overflow
>>>> shm_cont                 3956         1.28 SHM MTX contention
>>>> shm_cycles                 16         0.01 SHM cycles through buffer
>>>> sm_nreq                     0         0.00 allocator requests
>>>> sm_nobj                     0          .   outstanding allocations
>>>> sm_balloc                   0          .   bytes allocated
>>>> sm_bfree                    0          .   bytes free
>>>> sma_nreq               550879       178.28 SMA allocator requests
>>>> sma_nobj               178590          .   SMA outstanding allocations
>>>> sma_nbytes         1073690180          .   SMA outstanding bytes
>>>> sma_balloc         2066782844          .   SMA bytes allocated
>>>> sma_bfree           993092664          .   SMA bytes free
>>>> sms_nreq                  649         0.21 SMS allocator requests
>>>> sms_nobj                    0          .   SMS outstanding allocations
>>>> sms_nbytes                  0          .   SMS outstanding bytes
>>>> sms_balloc             378848          .   SMS bytes allocated
>>>> sms_bfree              378848          .   SMS bytes freed
>>>> backend_req            389342       126.00 Backend requests made
>>>> n_vcl                       1         0.00 N vcl total
>>>> n_vcl_avail                 1         0.00 N vcl available
>>>> n_vcl_discard               0         0.00 N vcl discarded
>>>> n_purge                     1          .   N total active purges
>>>> n_purge_add                 1         0.00 N new purges added
>>>> n_purge_retire              0         0.00 N old purges deleted
>>>> n_purge_obj_test            0         0.00 N objects tested
>>>> n_purge_re_test             0         0.00 N regexps tested against
>>>> n_purge_dups                0         0.00 N duplicate purges removed
>>>> hcb_nolock                  0         0.00 HCB Lookups without lock
>>>> hcb_lock                    0         0.00 HCB Lookups with lock
>>>> hcb_insert                  0         0.00 HCB Inserts
>>>> esi_parse                   0         0.00 Objects ESI parsed (unlock)
>>>> esi_errors                  0         0.00 ESI parse errors (unlock)
>>>> =========================================================================
>>>> =========================================================================
>>>>
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Ken Brownfield [mailto:kb at slide.com]
>>>> Sent: Thursday, November 5, 2009 0:01
>>>> To: Henry Paulissen
>>>> CC: Rogério Schneider
>>>> Subject: Re: Varnish virtual memory usage
>>>>
>>>> Curious: for a heavily leaking varnish instance, can you run "pmap -x
>>>> PID" on the parent PID and child PID, and record how many threads are
>>>> active (something like 'ps -efT | grep varnish | wc -l')?  Might help
>>>> isolate the RAM usage.
>>>>
>>>> Sorry if you have done this already; didn't find it in my email
>>>> archive.
>>>>
>>>> Ken
>>>>
>>>> On Nov 4, 2009, at 2:53 PM, Henry Paulissen wrote:
>>>>
>>>>> No, varnishd still uses way more memory than allowed.
>>>>> The only workarounds I have found so far are:
>>>>>
>>>>> Run on 64-bit Linux and restart varnish every 4 hours (via crontab;
>>>>> see the sketch below).
>>>>> Run on 32-bit Linux (everything works as expected, but you can't
>>>>> allocate more than 4G per instance).
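>>>>>
>>>>> The 4-hourly restart is just a crontab entry along these lines
>>>>> (assuming an init script at /etc/init.d/varnish):
>>>>>
>>>>>     0 */4 * * * /etc/init.d/varnish restart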
>>>>>
>>>>>
>>>>> I hope Linpro will find and address this issue.
>>>>>
>>>>>
>>>>>
>>>>> Again, @ Linpro: if you need a machine (with live traffic) to run
>>>>> some tests, please contact me.
>>>>> We have multiple machines in high availability, so testing and
>>>>> rebooting an instance wouldn't hurt us.
>>>>>
>>>>>
>>>>> Regards.
>>>>>
>>>>> -----Original Message-----
>>>>> From: Rogério Schneider [mailto:stockrt at gmail.com]
>>>>> Sent: Wednesday, November 4, 2009 22:04
>>>>> To: Henry Paulissen
>>>>> CC: Scott Wilson; varnish-misc at projects.linpro.no
>>>>> Subject: Re: Varnish virtual memory usage
>>>>>
>>>>> On Thu, Oct 22, 2009 at 6:04 AM, Henry Paulissen
>>>>> <h.paulissen at qbell.nl>
>>>>> wrote:
>>>>>> I will report back.
>>>>>
>>>>> Did this solve the problem?
>>>>>
>>>>> Removing this?
>>>>>
>>>>>>>   if (req.http.Cache-Control == "no-cache" || req.http.Pragma ==
>>>>> "no-cache") {
>>>>>>>           purge_url(req.url);
>>>>>>>   }
>>>>>>>
>>>>>
>>>>> Cheers
>>>>>
>>>>> Att,
>>>>> -- 
>>>>> Rogério Schneider
>>>>>
>>>>> MSN: stockrt at hotmail.com
>>>>> GTalk: stockrt at gmail.com
>>>>> Skype: stockrt
>>>>> http://stockrt.github.com
>>>>>
>>>>> _______________________________________________
>>>>> varnish-misc mailing list
>>>>> varnish-misc at projects.linpro.no
>>>>> http://projects.linpro.no/mailman/listinfo/varnish-misc
>>>> <pmap.txt>
>>>
>> <pmap.txt>
>



