Increases in memory usage and dropped connections
Hugo Cisneiros (Eitch)
hugo.cisneiros at gmail.com
Tue Aug 23 00:27:53 CEST 2011
Hi,
I have been using Varnish for at least a year, and in the past few days it
has started showing really strange behavior (currently on 2.1.3). Sometimes
I get "Connection reset by peer" errors when requesting pages. When this
happens, I look at the stats and see something like this, for example:
client_conn 1565694 187.17 Client connections accepted
client_drop 271 0.03 Connection dropped, no sess/wrk
client_req 2790588 333.60 Client requests received
cache_hit 1813984 216.85 Cache hits
cache_hitpass 0 0.00 Cache hits for pass
cache_miss 546088 65.28 Cache misses
backend_conn 32042 3.83 Backend conn. success
backend_unhealthy 1351 0.16 Backend conn. not attempted
backend_busy 0 0.00 Backend conn. too many
backend_fail 1 0.00 Backend conn. failures
backend_reuse 889261 106.31 Backend conn. reuses
backend_toolate 157 0.02 Backend conn. was closed
backend_recycle 889425 106.33 Backend conn. recycles
backend_unused 0 0.00 Backend conn. unused
fetch_head 759 0.09 Fetch head
fetch_length 920288 110.02 Fetch with Length
fetch_chunked 0 0.00 Fetch chunked
fetch_eof 0 0.00 Fetch EOF
fetch_bad 0 0.00 Fetch had bad headers
fetch_close 0 0.00 Fetch wanted close
fetch_oldhttp 0 0.00 Fetch pre HTTP/1.1 closed
fetch_zero 0 0.00 Fetch zero len
fetch_failed 0 0.00 Fetch failed
n_sess_mem 65783 . N struct sess_mem
n_sess 54012 . N struct sess
n_object 217320 . N struct object
n_vampireobject 0 . N unresurrected objects
n_objectcore 218999 . N struct objectcore
n_objecthead 57686 . N struct objecthead
n_smf 0 . N struct smf
n_smf_frag 0 . N small free smf
n_smf_large 0 . N large free smf
n_vbe_conn 28 . N struct vbe_conn
n_wrk 2283 . N worker threads
n_wrk_create 2326 0.28 N worker threads created
n_wrk_failed 0 0.00 N worker threads not created
n_wrk_max 0 0.00 N worker threads limited
n_wrk_queue 0 0.00 N queued work requests
n_wrk_overflow 25533 3.05 N overflowed work requests
n_wrk_drop 286 0.03 N dropped work requests
n_backend 14 . N backends
n_expired 0 . N expired objects
n_lru_nuked 157040 . N LRU nuked objects
n_lru_saved 0 . N LRU saved objects
n_lru_moved 74969 . N LRU moved objects
n_deathrow 0 . N objects on deathrow
losthdr 0 0.00 HTTP header overflows
n_objsendfile 0 0.00 Objects sent with sendfile
n_objwrite 2494603 298.22 Objects sent with write
n_objoverflow 0 0.00 Objects overflowing workspace
s_sess 1565320 187.13 Total Sessions
s_req 2790588 333.60 Total Requests
s_pipe 0 0.00 Total pipe
s_pass 376511 45.01 Total pass
s_fetch 921039 110.11 Total fetch
s_hdrbytes 844190338 100919.35 Total header bytes
s_bodybytes 40983737008 4899430.60 Total body bytes
sess_closed 125532 15.01 Session Closed
sess_pipeline 788 0.09 Session Pipeline
sess_readahead 810 0.10 Session Read Ahead
sess_linger 2656690 317.60 Session Linger
sess_herd 2874808 343.67 Session herd
shm_records 146433038 17505.44 SHM records
shm_writes 12159673 1453.64 SHM writes
shm_flushes 0 0.00 SHM flushes due to overflow
shm_cont 51273 6.13 SHM MTX contention
shm_cycles 69 0.01 SHM cycles through buffer
sm_nreq 0 0.00 allocator requests
sm_nobj 0 . outstanding allocations
sm_balloc 0 . bytes allocated
sm_bfree 0 . bytes free
sma_nreq 1440966 172.26 SMA allocator requests
sma_nobj 433212 . SMA outstanding allocations
sma_nbytes 5368687547 . SMA outstanding bytes
sma_balloc 10635681699 . SMA bytes allocated
sma_bfree 5266994152 . SMA bytes free
sms_nreq 1206 0.14 SMS allocator requests
sms_nobj 0 . SMS outstanding allocations
sms_nbytes 0 . SMS outstanding bytes
sms_balloc 472992 . SMS bytes allocated
sms_bfree 472992 . SMS bytes freed
backend_req 921115 110.12 Backend requests made
n_vcl 1 0.00 N vcl total
n_vcl_avail 1 0.00 N vcl available
n_vcl_discard 0 0.00 N vcl discarded
n_purge 1 . N total active purges
n_purge_add 1 0.00 N new purges added
n_purge_retire 0 0.00 N old purges deleted
n_purge_obj_test 0 0.00 N objects tested
n_purge_re_test 0 0.00 N regexps tested against
n_purge_dups 0 0.00 N duplicate purges removed
hcb_nolock 0 0.00 HCB Lookups without lock
hcb_lock 0 0.00 HCB Lookups with lock
hcb_insert 0 0.00 HCB Inserts
esi_parse 0 0.00 Objects ESI parsed (unlock)
esi_errors 0 0.00 ESI parse errors (unlock)
accept_fail 0 0.00 Accept failures
client_drop_late 15 0.00 Connection dropped late
uptime 8365 1.00 Client uptime
There are lots of worrisome stats here, like n_wrk (2283), n_wrk_overflow
(25533), n_wrk_drop (286), client_drop (271) ... The number of active
worker threads (2283) is very high for no apparent reason, since normal
usage is around 300 threads.
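To watch how fast these counters move while the problem is happening, I
take snapshots like this (plain varnishstat output filtered with grep; the
counter list is just whichever fields I am worried about at the moment):

  varnishstat -1 | egrep 'n_wrk|client_drop|n_lru_nuked|n_sess'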
The server has 7.5 GB of RAM and uses malloc storage with a 5 GB size,
running only a Varnish instance on Ubuntu Server 10.10. I tried going down
to a 3 GB malloc size, increasing thread_pool_min to 500 with 2 pools
(1000 idle threads total) while raising thread_pool_max to 5000 (from
3000), and setting session_linger to 50, 100 and 150 ms, connect_timeout
to 5 s and between_bytes_timeout to 5 s, but sometimes the workers still
go way up and the entire RAM plus 3 GB of swap end up in use.
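For reference, these parameters can all be changed at runtime through the
management interface, roughly like this (the -T address below is only an
example; use whatever your instance actually listens on):

  varnishadm -T localhost:6082 param.set thread_pool_min 500
  varnishadm -T localhost:6082 param.set thread_pool_max 5000
  varnishadm -T localhost:6082 param.set session_linger 50
  varnishadm -T localhost:6082 param.set connect_timeout 5
  varnishadm -T localhost:6082 param.set between_bytes_timeout 5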
Any ideas? Maybe something is hanging up the sessions (network latency
between Varnish and the backends? Or someone doing this on purpose?)
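To check whether slow clients or backend latency is holding sessions
open, the per-request timing records in the shared memory log can help
(the ReqEnd tag carries the processing timestamps for each transaction):

  varnishlog -c -i ReqEnd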
Here are my settings (param.show output):
acceptor_sleep_decay 0.900000 []
acceptor_sleep_incr 0.001000 [s]
acceptor_sleep_max 0.050000 [s]
auto_restart on [bool]
ban_lurker_sleep 30.000000 [s]
between_bytes_timeout 50.000000 [s]
cache_vbe_conns off [bool]
cc_command "exec cc -fpic -shared -Wl,-x -o %o %s"
cli_buffer 8192 [bytes]
cli_timeout 25 [seconds]
clock_skew 10 [s]
connect_timeout 50.000000 [s]
critbit_cooloff 180.000000 [s]
default_grace 10 [seconds]
default_ttl 120 [seconds]
diag_bitmap 0x0 [bitmap]
err_ttl 0 [seconds]
esi_syntax 0 [bitmap]
fetch_chunksize 128 [kilobytes]
first_byte_timeout 25.000000 [s]
group varnish (112)
http_headers 64 [header lines]
http_range_support off [bool]
listen_address :80
listen_depth 8192 [connections]
log_hashstring off [bool]
log_local_address off [bool]
lru_interval 60 [seconds]
max_esi_includes 5 [includes]
max_restarts 4 [restarts]
overflow_max 100 [%]
ping_interval 3 [seconds]
pipe_timeout 60 [seconds]
prefer_ipv6 off [bool]
purge_dups on [bool]
rush_exponent 3 [requests per request]
saintmode_threshold 10 [objects]
send_timeout 6000 [seconds]
sess_timeout 10 [seconds]
sess_workspace 524288 [bytes]
session_linger 50 [ms]
session_max 100000 [sessions]
shm_reclen 255 [bytes]
shm_workspace 262144 [bytes]
syslog_cli_traffic on [bool]
thread_pool_add_delay 2 [milliseconds]
thread_pool_add_threshold 2 [requests]
thread_pool_fail_delay 200 [milliseconds]
thread_pool_max 5000 [threads]
thread_pool_min 500 [threads]
thread_pool_purge_delay 1000 [milliseconds]
thread_pool_stack unlimited [bytes]
thread_pool_timeout 300 [seconds]
thread_pools 2 [pools]
thread_stats_rate 10 [requests]
user varnish (106)
vcl_trace off [bool]
waiter default (epoll, poll)
--
[]'s
Hugo