Varnish 2.1 + kswapd0 freak out
Augusto Becciu
augusto at jadedpixel.com
Fri May 21 00:45:25 CEST 2010
Hey guys,
I'm running Varnish 2.1 on two m2.xlarge EC2 instances (17 GB of RAM,
Linux kernel 2.6.21.7-2.fc8xen-ec2-v1.0). Those two servers have been
running for 2 months now almost without trouble, but once in a while
I've noticed crazy spikes in CPU usage, mostly in kernel land.
A few days ago I saw kswapd0 consuming 100% of one CPU core and varnishd
consuming 100% of the other. I was able to strace varnish for a few
seconds and everything looked normal, but then it crashed and left a
zombie process eating most of the CPU, so I had to restart the server.
Today exactly the same thing happened on the other server, and this
is starting to scare me.
We're running varnish with the following params:
varnishd -P /var/run/varnishd.pid -a 0.0.0.0:2000 -T 127.0.0.1:6082 -w
200,2000 -s malloc,12G -p lru_interval=20 -f /etc/varnish/varnish.vcl
We don't have swap enabled on these servers.
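A back-of-envelope memory budget for this configuration (the ~1 KB per-object bookkeeping overhead and the worker stack size below are assumptions, not measured values):

```python
# Rough resident-memory floor for varnishd on this box.
# Assumption: ~1 KB of bookkeeping (struct object/objectcore/objecthead,
# hash entries) per cached object beyond what -s malloc accounts for.
GIB = 1024 ** 3

storage_limit = 12 * GIB          # -s malloc,12G (object bodies only)
n_object = 659_083                # from the varnishstat dump below
per_object_overhead = 1024        # assumed, not measured
max_threads = 2000                # -w 200,2000
stack_per_thread = 512 * 1024     # assumed worker stack size

overhead = n_object * per_object_overhead + max_threads * stack_per_thread
total_gib = (storage_limit + overhead) / GIB
print(f"estimated floor: {total_gib:.1f} GiB of 17 GiB")
# → estimated floor: 13.6 GiB of 17 GiB
```

And that is only a floor: malloc fragmentation can push actual RSS well above it, which would leave the kernel with little free memory and reclaiming page cache hard enough to peg kswapd0, with no swap to fall back on.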
Here's varnishstat -1 when varnish was freaking out:
client_conn 9592723 7.82 Client connections accepted
client_drop 0 0.00 Connection dropped, no sess/wrk
client_req 67302765 54.84 Client requests received
cache_hit 50571130 41.20 Cache hits
cache_hitpass 0 0.00 Cache hits for pass
cache_miss 16050808 13.08 Cache misses
backend_conn 16029200 13.06 Backend conn. success
backend_unhealthy 0 0.00 Backend conn. not attempted
backend_busy 0 0.00 Backend conn. too many
backend_fail 20649 0.02 Backend conn. failures
backend_reuse 12352 0.01 Backend conn. reuses
backend_toolate 0 0.00 Backend conn. was closed
backend_recycle 12352 0.01 Backend conn. recycles
backend_unused 0 0.00 Backend conn. unused
fetch_head 0 0.00 Fetch head
fetch_length 12764170 10.40 Fetch with Length
fetch_chunked 3272791 2.67 Fetch chunked
fetch_eof 0 0.00 Fetch EOF
fetch_bad 0 0.00 Fetch had bad headers
fetch_close 49 0.00 Fetch wanted close
fetch_oldhttp 0 0.00 Fetch pre HTTP/1.1 closed
fetch_zero 0 0.00 Fetch zero len
fetch_failed 3272895 2.67 Fetch failed
n_sess_mem 587 . N struct sess_mem
n_sess 465 . N struct sess
n_object 659083 . N struct object
n_vampireobject 0 . N unresurrected objects
n_objectcore 659439 . N struct objectcore
n_objecthead 907405 . N struct objecthead
n_smf 0 . N struct smf
n_smf_frag 0 . N small free smf
n_smf_large 0 . N large free smf
n_vbe_conn 277 . N struct vbe_conn
n_wrk 400 . N worker threads
n_wrk_create 458 0.00 N worker threads created
n_wrk_failed 0 0.00 N worker threads not created
n_wrk_max 0 0.00 N worker threads limited
n_wrk_queue 0 0.00 N queued work requests
n_wrk_overflow 874 0.00 N overflowed work requests
n_wrk_drop 0 0.00 N dropped work requests
n_backend 2 . N backends
n_expired 112662 . N expired objects
n_lru_nuked 11954429 . N LRU nuked objects
n_lru_saved 0 . N LRU saved objects
n_lru_moved 46618517 . N LRU moved objects
n_deathrow 0 . N objects on deathrow
losthdr 2 0.00 HTTP header overflows
n_objsendfile 0 0.00 Objects sent with sendfile
n_objwrite 60192420 49.04 Objects sent with write
n_objoverflow 0 0.00 Objects overflowing workspace
s_sess 9592577 7.82 Total Sessions
s_req 67302765 54.84 Total Requests
s_pipe 110 0.00 Total pipe
s_pass 1691 0.00 Total pass
s_fetch 12764115 10.40 Total fetch
s_hdrbytes 21558035591 17564.97 Total header bytes
s_bodybytes 1162454990977 947140.58 Total body bytes
sess_closed 7687689 6.26 Session Closed
sess_pipeline 0 0.00 Session Pipeline
sess_readahead 0 0.00 Session Read Ahead
sess_linger 61236267 49.89 Session Linger
sess_herd 16659649 13.57 Session herd
shm_records 3395953253 2766.94 SHM records
shm_writes 131371160 107.04 SHM writes
shm_flushes 661 0.00 SHM flushes due to overflow
shm_cont 114836 0.09 SHM MTX contention
shm_cycles 1378 0.00 SHM cycles through buffer
sm_nreq 0 0.00 allocator requests
sm_nobj 0 . outstanding allocations
sm_balloc 0 . bytes allocated
sm_bfree 0 . bytes free
sma_nreq 37442974 30.51 SMA allocator requests
sma_nobj 1318091 . SMA outstanding allocations
sma_nbytes 12884892751 . SMA outstanding bytes
sma_balloc 250925494011 . SMA bytes allocated
sma_bfree 238040601260 . SMA bytes free
sms_nreq 3967048 3.23 SMS allocator requests
sms_nobj 0 . SMS outstanding allocations
sms_nbytes 18446744073709527064 . SMS outstanding bytes
sms_balloc 1895595320 . SMS bytes allocated
sms_bfree 1895619352 . SMS bytes freed
backend_req 16043889 13.07 Backend requests made
n_vcl 1 0.00 N vcl total
n_vcl_avail 1 0.00 N vcl available
n_vcl_discard 0 0.00 N vcl discarded
n_purge 26155 . N total active purges
n_purge_add 678663 0.55 N new purges added
n_purge_retire 652508 0.53 N old purges deleted
n_purge_obj_test 47484518 38.69 N objects tested
n_purge_re_test 41413683761 33742.88 N regexps tested against
n_purge_dups 485656 0.40 N duplicate purges removed
hcb_nolock 50605455 41.23 HCB Lookups without lock
hcb_lock 566 0.00 HCB Lookups with lock
hcb_insert 16016509 13.05 HCB Inserts
esi_parse 0 0.00 Objects ESI parsed (unlock)
esi_errors 0 0.00 ESI parse errors (unlock)
accept_fail 0 0.00 Accept failures
client_drop_late 0 0.00 Connection dropped late
uptime 1227331 1.00 Client uptime
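For what it's worth, a couple of the counters above, taken together, suggest the cache is pinned at its limit and evicting almost constantly (this is just arithmetic on the numbers already shown):

```python
# Arithmetic on counters from the varnishstat dump above.
GIB = 1024 ** 3

sma_nbytes = 12_884_892_751   # SMA outstanding bytes
cache_miss = 16_050_808       # Cache misses
n_lru_nuked = 11_954_429      # N LRU nuked objects

print(f"cache fill: {sma_nbytes / GIB:.2f} GiB of the 12 GiB limit")
print(f"evictions per miss: {n_lru_nuked / cache_miss:.2f}")
# → cache fill: 12.00 GiB of the 12 GiB limit
# → evictions per miss: 0.74
```

(Also, the sms_nbytes value of 18446744073709527064 looks like an unsigned counter that has wrapped below zero; I don't know whether that's related.)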
Has anyone experienced something similar?
Thanks,
Augusto
More information about the varnish-misc
mailing list