varnish stopped responding

Matt Schurenko MSchurenko at airg.com
Wed Sep 21 19:15:51 CEST 2011


Hi,

I'm running two varnish servers in production (version 2.1.5). Both use the same hardware and have the same amount of RAM (48GB). Last night one of the varnish servers stopped responding on port 80. Since we use HAproxy in front of both varnish servers for load balancing, this did not have much effect on our end users. The symptoms were that a client (HAproxy, telnet) either could not establish a layer 4 connection to varnish or, if it could establish a connection and issued an HTTP GET, varnish returned nothing at all, not even HTTP headers.

Running "ps -efL | grep varnish | wc -l" revealed that there were ~500 varnish threads. I am using the default thread configuration (max of 500). It seemed to me that when a client tried to connect to varnish there was no thread available, so the client just hung until either it or varnish timed out and disconnected. Unfortunately I didn't have the good sense to capture a "varnishstat -1" after this happened. I was focused on getting the server back to a working state, so I ended up restarting varnishd.
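
One caveat on the thread count above: "ps -efL | grep varnish | wc -l" also counts the grep process itself and the varnishd manager process, so ~500 is only approximate. A minimal Python sketch of a stricter per-process count follows. It assumes the Linux procps "ps -efL" column layout (UID PID PPID LWP C NLWP STIME TTY TIME CMD); the function name, PIDs and sample text are made up for illustration and are not from the affected server.

```python
from collections import Counter


def count_threads(ps_output, command="varnishd"):
    """Count threads (one `ps -efL` line per LWP) per PID for `command`."""
    counts = Counter()
    for line in ps_output.splitlines()[1:]:  # skip the header line
        fields = line.split(None, 9)  # the 10th field is the full command line
        if len(fields) < 10:
            continue
        pid, cmd = fields[1], fields[9]
        # Compare only the executable's base name, so "grep varnish" is ignored.
        executable = cmd.split()[0].rsplit("/", 1)[-1]
        if executable == command:
            counts[pid] += 1
    return counts


# Hypothetical sample: one manager process, one child with three threads.
PS_SAMPLE = """\
UID        PID  PPID   LWP  C NLWP STIME TTY          TIME CMD
root      1200     1  1200  0    1 Sep20 ?        00:00:01 /usr/local/sbin/varnishd -a 0.0.0.0:80
nobody    1201  1200  1201  0    3 Sep20 ?        00:00:10 /usr/local/sbin/varnishd -a 0.0.0.0:80
nobody    1201  1200  1202  0    3 Sep20 ?        00:00:12 /usr/local/sbin/varnishd -a 0.0.0.0:80
nobody    1201  1200  1203  0    3 Sep20 ?        00:00:11 /usr/local/sbin/varnishd -a 0.0.0.0:80
matt      4242  4000  4242  0    1 10:15 pts/0    00:00:00 grep varnish
"""

if __name__ == "__main__":
    for pid, threads in sorted(count_threads(PS_SAMPLE).items()):
        print(pid, threads)
```

On a live host the input could come from subprocess.run(["ps", "-efL"], capture_output=True, text=True).stdout instead of the sample string.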

Here is my varnishd command line followed by a current "varnishstat -1" (I have set the weight for this server lower than that of the other varnish instance so that the cache can "warm up"; there is typically 4x as much traffic):

/usr/local/sbin/varnishd -s file,/tmp/varnish-cache,60G -T 127.0.0.1:2000 -a 0.0.0.0:80 -t 604800 -f /usr/local/etc/varnish/default.vcl -p http_headers 384 -p connect_timeout 4.0

client_conn           4985179       120.45 Client connections accepted
client_drop                 0         0.00 Connection dropped, no sess/wrk
client_req            4907077       118.56 Client requests received
cache_hit             3356368        81.09 Cache hits
cache_hitpass               0         0.00 Cache hits for pass
cache_miss            1550606        37.46 Cache misses
backend_conn          1530014        36.97 Backend conn. success
backend_unhealthy            0         0.00 Backend conn. not attempted
backend_busy                0         0.00 Backend conn. too many
backend_fail                0         0.00 Backend conn. failures
backend_reuse           20690         0.50 Backend conn. reuses
backend_toolate             0         0.00 Backend conn. was closed
backend_recycle         20691         0.50 Backend conn. recycles
backend_unused              0         0.00 Backend conn. unused
fetch_head                  1         0.00 Fetch head
fetch_length            33270         0.80 Fetch with Length
fetch_chunked         1517362        36.66 Fetch chunked
fetch_eof                   0         0.00 Fetch EOF
fetch_bad                   0         0.00 Fetch had bad headers
fetch_close                70         0.00 Fetch wanted close
fetch_oldhttp               0         0.00 Fetch pre HTTP/1.1 closed
fetch_zero                  0         0.00 Fetch zero len
fetch_failed                0         0.00 Fetch failed
n_sess_mem                262          .   N struct sess_mem
n_sess                     68          .   N struct sess
n_object              1550439          .   N struct object
n_vampireobject             0          .   N unresurrected objects
n_objectcore          1550458          .   N struct objectcore
n_objecthead          1550412          .   N struct objecthead
n_smf                 3100879          .   N struct smf
n_smf_frag                  0          .   N small free smf
n_smf_large                 1          .   N large free smf
n_vbe_conn                  1          .   N struct vbe_conn
n_wrk                      29          .   N worker threads
n_wrk_create              870         0.02 N worker threads created
n_wrk_failed                0         0.00 N worker threads not created
n_wrk_max                3128         0.08 N worker threads limited
n_wrk_queue                 0         0.00 N queued work requests
n_wrk_overflow           4696         0.11 N overflowed work requests
n_wrk_drop                  0         0.00 N dropped work requests
n_backend                   2          .   N backends
n_expired                 157          .   N expired objects
n_lru_nuked                 0          .   N LRU nuked objects
n_lru_saved                 0          .   N LRU saved objects
n_lru_moved           3077705          .   N LRU moved objects
n_deathrow                  0          .   N objects on deathrow
losthdr                     0         0.00 HTTP header overflows
n_objsendfile               0         0.00 Objects sent with sendfile
n_objwrite            4817364       116.39 Objects sent with write
n_objoverflow               0         0.00 Objects overflowing workspace
s_sess                4985176       120.45 Total Sessions
s_req                 4907077       118.56 Total Requests
s_pipe                      0         0.00 Total pipe
s_pass                    102         0.00 Total pass
s_fetch               1550703        37.47 Total fetch
s_hdrbytes         1590643697     38431.56 Total header bytes
s_bodybytes       17647134982    426372.59 Total body bytes
sess_closed           4522198       109.26 Session Closed
sess_pipeline               4         0.00 Session Pipeline
sess_readahead              8         0.00 Session Read Ahead
sess_linger            469810        11.35 Session Linger
sess_herd              476189        11.51 Session herd
shm_records         297887487      7197.26 SHM records
shm_writes           23469767       567.05 SHM writes
shm_flushes                 0         0.00 SHM flushes due to overflow
shm_cont                51830         1.25 SHM MTX contention
shm_cycles                137         0.00 SHM cycles through buffer
sm_nreq               3101298        74.93 allocator requests
sm_nobj               3100878          .   outstanding allocations
sm_balloc         13670006784          .   bytes allocated
sm_bfree          50754502656          .   bytes free
sma_nreq                    0         0.00 SMA allocator requests
sma_nobj                    0          .   SMA outstanding allocations
sma_nbytes                  0          .   SMA outstanding bytes
sma_balloc                  0          .   SMA bytes allocated
sma_bfree                   0          .   SMA bytes free
sms_nreq                    5         0.00 SMS allocator requests
sms_nobj                    0          .   SMS outstanding allocations
sms_nbytes                  0          .   SMS outstanding bytes
sms_balloc               2090          .   SMS bytes allocated
sms_bfree                2090          .   SMS bytes freed
backend_req           1550708        37.47 Backend requests made
n_vcl                       1         0.00 N vcl total
n_vcl_avail                 1         0.00 N vcl available
n_vcl_discard               0         0.00 N vcl discarded
n_purge                     1          .   N total active purges
n_purge_add                 1         0.00 N new purges added
n_purge_retire              0         0.00 N old purges deleted
n_purge_obj_test            0         0.00 N objects tested
n_purge_re_test             0         0.00 N regexps tested against
n_purge_dups                0         0.00 N duplicate purges removed
hcb_nolock            4906976       118.56 HCB Lookups without lock
hcb_lock              1550518        37.46 HCB Lookups with lock
hcb_insert            1550517        37.46 HCB Inserts
esi_parse                   0         0.00 Objects ESI parsed (unlock)
esi_errors                  0         0.00 ESI parse errors (unlock)
accept_fail                 0         0.00 Accept failures
client_drop_late            0         0.00 Connection dropped late
uptime                  41389         1.00 Client uptime
backend_retry               0         0.00 Backend conn. retry
dir_dns_lookups             0         0.00 DNS director lookups
dir_dns_failed              0         0.00 DNS director failed lookups
dir_dns_hit                 0         0.00 DNS director cached lookups hit
dir_dns_cache_full            0         0.00 DNS director full dnscache
fetch_1xx                   0         0.00 Fetch no body (1xx)
fetch_204                   0         0.00 Fetch no body (204)
fetch_304                   0         0.00 Fetch no body (304)
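
Judging only by their descriptions, the counters that look most relevant to the thread-exhaustion theory are n_wrk, n_wrk_max, n_wrk_overflow and n_wrk_drop. Here is a rough Python sketch for pulling those out of a saved dump like the one above, along with the hit ratio and the total file storage. The function names are made up for illustration, and the sample contains only a few lines copied from the output above.

```python
def parse_varnishstat(text):
    """Map counter name -> integer value from one-shot varnishstat lines."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        # Expected layout: name, value, rate (or "."), description...
        if len(fields) >= 2 and fields[1].isdigit():
            stats[fields[0]] = int(fields[1])
    return stats


def summarize(stats):
    """Collect a few derived figures that are handy when comparing dumps."""
    lookups = stats["cache_hit"] + stats["cache_miss"]
    return {
        "hit_ratio": stats["cache_hit"] / lookups if lookups else 0.0,
        "storage_bytes": stats["sm_balloc"] + stats["sm_bfree"],
        "worker_threads": stats["n_wrk"],
        "threads_limited": stats["n_wrk_max"],
        "overflowed_requests": stats["n_wrk_overflow"],
        "dropped_requests": stats["n_wrk_drop"],
    }


# A subset of the counters shown above.
STAT_SAMPLE = """\
cache_hit             3356368        81.09 Cache hits
cache_miss            1550606        37.46 Cache misses
n_wrk                      29          .   N worker threads
n_wrk_max                3128         0.08 N worker threads limited
n_wrk_overflow           4696         0.11 N overflowed work requests
n_wrk_drop                  0         0.00 N dropped work requests
sm_balloc         13670006784          .   bytes allocated
sm_bfree          50754502656          .   bytes free
"""

if __name__ == "__main__":
    for name, value in summarize(parse_varnishstat(STAT_SAMPLE)).items():
        print(name, value)
```

With these numbers the hit ratio comes out at roughly 0.68, and sm_balloc plus sm_bfree adds up to exactly 60 GiB, which matches the 60G given to "-s file" on the command line.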

Could there be something wrong with my configuration that caused this problem?

Thanks

Matt Schurenko
Systems Administrator

airG(r) Share Your World
Suite 710, 1133 Melville Street
Vancouver, BC  V6E 4E5
P: +1.604.408.2228
F: +1.866.874.8136
E: MSchurenko at airg.com
W: www.airg.com

