Release schedule for saint mode.
Ken Brownfield
kb+varnish at slide.com
Sat Jan 30 22:22:32 CET 2010
I started a test yesterday with trunk at 4478, and after running for about 12 hours under pretty good load, the machine started swapping heavily and we had to restart Varnish. After coming back up, it didn't have the 2.8M objects it held when it was shut down:
uptime 464 . Child uptime
client_conn 1951 4.20 Client connections accepted
client_drop 0 0.00 Connection dropped, no sess
client_req 79366 171.05 Client requests received
cache_hit 19669 42.39 Cache hits
cache_hitpass 0 0.00 Cache hits for pass
cache_miss 59265 127.73 Cache misses
backend_conn 288 0.62 Backend conn. success
backend_unhealthy 0 0.00 Backend conn. not attempted
backend_busy 0 0.00 Backend conn. too many
backend_fail 0 0.00 Backend conn. failures
backend_reuse 59118 127.41 Backend conn. reuses
backend_toolate 0 0.00 Backend conn. was closed
backend_recycle 59128 127.43 Backend conn. recycles
backend_unused 76 0.16 Backend conn. unused
fetch_head 0 0.00 Fetch head
fetch_length 59076 127.32 Fetch with Length
fetch_chunked 0 0.00 Fetch chunked
fetch_eof 0 0.00 Fetch EOF
fetch_bad 0 0.00 Fetch had bad headers
fetch_close 0 0.00 Fetch wanted close
fetch_oldhttp 0 0.00 Fetch pre HTTP/1.1 closed
fetch_zero 0 0.00 Fetch zero len
fetch_failed 0 0.00 Fetch failed
n_sess_mem 249 . N struct sess_mem
n_sess 186 . N struct sess
n_object 58989 . N struct object
n_vampireobject 0 . N unresurrected objects
n_objectcore 59018 . N struct objectcore
n_objecthead 59018 . N struct objecthead
n_smf 0 . N struct smf
n_smf_frag 0 . N small free smf
n_smf_large 0 . N large free smf
n_vbe_conn 119 . N struct vbe_conn
n_wrk 256 . N worker threads
n_wrk_create 256 0.55 N worker threads created
n_wrk_failed 0 0.00 N worker threads not created
n_wrk_max 0 0.00 N worker threads limited
n_wrk_queue 0 0.00 N queued work requests
n_wrk_overflow 0 0.00 N overflowed work requests
n_wrk_drop 0 0.00 N dropped work requests
n_backend 1 . N backends
n_expired 0 . N expired objects
n_lru_nuked 0 . N LRU nuked objects
n_lru_saved 0 . N LRU saved objects
n_lru_moved 0 . N LRU moved objects
n_deathrow 0 . N objects on deathrow
losthdr 0 0.00 HTTP header overflows
n_objsendfile 0 0.00 Objects sent with sendfile
n_objwrite 78656 169.52 Objects sent with write
n_objoverflow 0 0.00 Objects overflowing workspace
s_sess 1945 4.19 Total Sessions
s_req 79366 171.05 Total Requests
s_pipe 0 0.00 Total pipe
s_pass 56 0.12 Total pass
s_fetch 59074 127.31 Total fetch
s_hdrbytes 31974718 68911.03 Total header bytes
s_bodybytes 2381590875 5132738.95 Total body bytes
sess_closed 965 2.08 Session Closed
sess_pipeline 0 0.00 Session Pipeline
sess_readahead 0 0.00 Session Read Ahead
sess_linger 78420 169.01 Session Linger
sess_herd 55547 119.71 Session herd
shm_records 5479085 11808.37 SHM records
shm_writes 201347 433.94 SHM writes
shm_flushes 0 0.00 SHM flushes due to overflow
shm_cont 453 0.98 SHM MTX contention
shm_cycles 2 0.00 SHM cycles through buffer
sm_nreq 0 0.00 allocator requests
sm_nobj 0 . outstanding allocations
sm_balloc 0 . bytes allocated
sm_bfree 0 . bytes free
sma_nreq 0 0.00 SMA allocator requests
sma_nobj 0 . SMA outstanding allocations
sma_nbytes 0 . SMA outstanding bytes
sma_balloc 0 . SMA bytes allocated
sma_bfree 0 . SMA bytes free
sms_nreq 622 1.34 SMS allocator requests
sms_nobj 0 . SMS outstanding allocations
sms_nbytes 0 . SMS outstanding bytes
sms_balloc 327230 . SMS bytes allocated
sms_bfree 327230 . SMS bytes freed
backend_req 59406 128.03 Backend requests made
n_vcl 1 0.00 N vcl total
n_vcl_avail 1 0.00 N vcl available
n_vcl_discard 0 0.00 N vcl discarded
n_purge 1 . N total active purges
n_purge_add 1 0.00 N new purges added
n_purge_retire 0 0.00 N old purges deleted
n_purge_obj_test 0 0.00 N objects tested
n_purge_re_test 0 0.00 N regexps tested against
n_purge_dups 0 0.00 N duplicate purges removed
hcb_nolock 0 0.00 HCB Lookups without lock
hcb_lock 0 0.00 HCB Lookups with lock
hcb_insert 0 0.00 HCB Inserts
esi_parse 0 0.00 Objects ESI parsed (unlock)
esi_errors 0 0.00 ESI parse errors (unlock)
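For what it's worth, the counters above imply roughly a 25% hit ratio since the restart (cache_hit vs. cache_miss), consistent with the cache coming back mostly cold. A rough sketch of extracting that from a varnishstat -1 dump; the heredoc just reuses the two relevant lines from the output above, but pointing awk at a saved dump file works the same way:

```shell
# Compute the hit ratio implied by the counters above. The trailing
# space in /^cache_hit / keeps the pattern from also matching
# cache_hitpass.
awk '/^cache_hit /{h=$2} /^cache_miss /{m=$2}
     END { printf "hit ratio: %.4f\n", h / (h + m) }' <<'EOF'
cache_hit                  19669         42.39 Cache hits
cache_miss                 59265        127.73 Cache misses
EOF
```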
Nothing abnormal in the logs, just normal stopping/starting messages.
Jan 30 05:30:48 v0 varnishd[16001]: Manager got SIGINT
Jan 30 05:30:48 v0 varnishd[16001]: Stopping Child
Jan 30 12:52:28 v0 varnishd[27107]: child (27108) Started
Jan 30 12:52:30 v0 varnishd[27107]: Child (27108) said Closed fds: 15 16 17 20 21 23 24
Jan 30 12:52:30 v0 varnishd[27107]: Child (27108) said Child starts
Jan 30 12:52:30 v0 varnishd[27107]: Child (27108) said Silo completely loaded
Jan 30 12:52:30 v0 varnishd[27107]: Child (27108) said Probe("GET /r/1/0/dl/gAO0cAU8rz-UR3ZNecmCwC1R0_hIR_QR HTTP/1.1
Jan 30 12:52:30 v0 varnishd[27107]: Child (27108) said
Jan 30 12:52:30 v0 varnishd[27107]: Child (27108) said Host: deco.slide.com
Jan 30 12:52:30 v0 varnishd[27107]: Child (27108) said
Jan 30 12:52:30 v0 varnishd[27107]: Child (27108) said Connection: close
Jan 30 12:52:30 v0 varnishd[27107]: Child (27108) said
Jan 30 12:52:30 v0 varnishd[27107]: Child (27108) said ", 0.8, 3)
Cmdline:
varnishd \
-a 0.0.0.0:3228 \
-f config.vcl \
-g nogroup \
-h classic,524287 \
-P varnish.pid \
-p ban_lurker_sleep=60s \
-p between_bytes_timeout=15s \
-p cache_vbe_conns=on \
-p cli_buffer=32768 \
-p client_http11=on \
-p default_grace=0s \
-p default_ttl=0s \
-p err_ttl=0s \
-p first_byte_timeout=15s \
-p listen_depth=8192 \
-p lru_interval=3600 \
-p max_restarts=2 \
-p obj_workspace=2048 \
-p pipe_timeout=15s \
-p purge_dups=on \
-p rush_exponent=32 \
-p sess_timeout=15s \
-p thread_pools=8 \
-p thread_pool_min=32 \
-p thread_pool_max=8192 \
-p thread_pool_stack=262144 \
-s persistent,/cache/sdm/varnish,120G \
-s persistent,/cache/sdn/varnish,120G \
-s persistent,/cache/sdo/varnish,120G \
-s persistent,/cache/sdp/varnish,120G \
-s persistent,/cache/sdq/varnish,120G \
-s persistent,/cache/sdr/varnish,120G \
-s persistent,/cache/sds/varnish,120G \
-s persistent,/cache/sdt/varnish,120G \
-s persistent,/cache/sdu/varnish,120G \
-s persistent,/cache/sdv/varnish,120G \
-s persistent,/cache/sdw/varnish,120G \
-s persistent,/cache/sdx/varnish,120G \
-T 0.0.0.0:6666 \
-u nobody
OT: why does Varnish swap when the working set no longer fits in RAM -- and when dirty pages are only in the 10-100MB range (on a 64GB box)? I suspect that turning off swap entirely would eventually cause the kernel to lock up, so I'll have to see whether there's a way to tune this (swappiness is already at 1).
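A minimal sketch of inspecting the relevant VM knobs on Linux; paths are the standard procfs/sysctl locations, and the commented sysctl line is just one option to try, not a recommendation:

```shell
# Current swappiness (the message above says this is already 1).
cat /proc/sys/vm/swappiness

# Current volume of dirty and writeback pages, to confirm the
# 10-100MB observation.
grep -E '^(Dirty|Writeback):' /proc/meminfo

# To bias the kernel further away from swapping anonymous pages
# without disabling swap outright (requires root):
#   sysctl -w vm.swappiness=0
```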
Thanks,
--
Ken
On Jan 29, 2010, at 11:16 AM, Ken Brownfield wrote:
> On Jan 29, 2010, at 3:54 AM, Tollef Fog Heen wrote:
>> It should be. You'll lose the last storage silo (since that's not
>> closed yet), but older objects should be available.
>
> This might be the source of the confusion. How often are silos closed? My testing was simply "hit the cache for a single object a few times to increase obj.hits, bounce varnish, next request was a miss". If I can get a handle on how often silos are closed, I can more accurately check the behavior under production load.
>
>> Can you see if you can reproduce and post a varnishlog -1 when the child
>> has restarted as well as after having restarted the parent?
>
> Will do, thanks.
> --
> Ken
>
>
>> | Maybe there's a graceful, undocumented method of stopping the parent that I'm not aware of?
>>
>> Shouldn't matter.
>>
>> --
>> Tollef Fog Heen
>> Redpill Linpro -- Changing the game!
>> t: +47 21 54 41 73
>