Varnish with many TIME_WAIT sockets and traffic problems
John S.
mailingoijv at gmail.com
Mon Sep 26 15:21:24 CEST 2011
Hi,
We have an existing infrastructure using Squid for reverse proxy,
serving images with nginx as backend on a Debian GNU/Linux
2.6.32-5-amd64 system. We have several hosts behind a pf/relayd load
balancer, each machine is a dual-core Xeon with 12GB of memory.
We are currently evaluating Varnish (varnish-3.0.1 revision cbf1284)
and setup a reverse proxy using it, with the same hardware and
software configuration than our Squids. When we put it in our pool of
servers, it gets half traffic than our Squids, and we have a lot of
sockets stuck in TIME_WAIT state, with few established connexions, cf
output of netstat on the varnish host :
6 CLOSING
259 ESTABLISHED
32 FIN_WAIT1
755 FIN_WAIT2
20 LAST_ACK
33 SYN_RECV
2781 TIME_WAIT
When we disable keep-alive option the sockets stats return back to
normal, but performance is of course very poor and traffic is always
half than a squid.
On the pf/relayd gateway, we had at the same time ~2110 states related
to our Varnish host (Squid hosts were at ~5030), with the following
repartition (varnish host only) :
35 CLOSING:ESTABLISHED
3 CLOSING:FIN_WAIT_2
314 ESTABLISHED:ESTABLISHED
5 ESTABLISHED:FIN_WAIT_2
464 FIN_WAIT_2:ESTABLISHED
1100 FIN_WAIT_2:FIN_WAIT_2
11 PROXY:SRC
177 TIME_WAIT:TIME_WAIT
We used tcpdump on our varnish host to analyse the traffic, and
nothing seems wrong at this level (connections are properly closed).
After some research we found this thread
https://www.varnish-cache.org/lists/pipermail/varnish-misc/2011-September/006805.html
and tried a 10GB malloc cache instead of our previous 60GB disk cache,
to no avail. Nothing wrong in syslog.
Do we have something misconfigured, or is Varnish running into some
kind of resource limitation ?
Following some complementary stats.
# varnishstat -1
client_conn 964002 77.75 Client connections accepted
client_drop 0 0.00 Connection dropped, no sess/wrk
client_req 3371261 271.90 Client requests received
cache_hit 3225106 260.11 Cache hits
cache_hitpass 4 0.00 Cache hits for pass
cache_miss 146003 11.78 Cache misses
backend_conn 15964 1.29 Backend conn. success
backend_unhealthy 0 0.00 Backend conn. not attempted
backend_busy 0 0.00 Backend conn. too many
backend_fail 0 0.00 Backend conn. failures
backend_reuse 130187 10.50 Backend conn. reuses
backend_toolate 0 0.00 Backend conn. was closed
backend_recycle 130189 10.50 Backend conn. recycles
backend_retry 0 0.00 Backend conn. retry
fetch_head 0 0.00 Fetch head
fetch_length 144362 11.64 Fetch with Length
fetch_chunked 1578 0.13 Fetch chunked
fetch_eof 0 0.00 Fetch EOF
fetch_bad 0 0.00 Fetch had bad headers
fetch_close 0 0.00 Fetch wanted close
fetch_oldhttp 0 0.00 Fetch pre HTTP/1.1 closed
fetch_zero 0 0.00 Fetch zero len
fetch_failed 0 0.00 Fetch failed
fetch_1xx 0 0.00 Fetch no body (1xx)
fetch_204 0 0.00 Fetch no body (204)
fetch_304 0 0.00 Fetch no body (304)
n_sess_mem 539 . N struct sess_mem
n_sess 306 . N struct sess
n_object 143971 . N struct object
n_vampireobject 0 . N unresurrected objects
n_objectcore 144056 . N struct objectcore
n_objecthead 144099 . N struct objecthead
n_waitinglist 116 . N struct waitinglist
n_vbc 2 . N struct vbc
n_wrk 100 . N worker threads
n_wrk_create 100 0.01 N worker threads created
n_wrk_failed 0 0.00 N worker threads not created
n_wrk_max 0 0.00 N worker threads limited
n_wrk_lqueue 0 0.00 work request queue length
n_wrk_queued 6 0.00 N queued work requests
n_wrk_drop 0 0.00 N dropped work requests
n_backend 1 . N backends
n_expired 1827 . N expired objects
n_lru_nuked 0 . N LRU nuked objects
n_lru_moved 1383093 . N LRU moved objects
losthdr 5 0.00 HTTP header overflows
n_objsendfile 0 0.00 Objects sent with sendfile
n_objwrite 2652875 213.96 Objects sent with write
n_objoverflow 0 0.00 Objects overflowing workspace
s_sess 963986 77.75 Total Sessions
s_req 3371261 271.90 Total Requests
s_pipe 5 0.00 Total pipe
s_pass 142 0.01 Total pass
s_fetch 145940 11.77 Total fetch
s_hdrbytes 921297466 74304.18 Total header bytes
s_bodybytes 29623671087 2389198.41 Total body bytes
sess_closed 127057 10.25 Session Closed
sess_pipeline 12585 1.02 Session Pipeline
sess_readahead 6810 0.55 Session Read Ahead
sess_linger 3316796 267.51 Session Linger
sess_herd 2846740 229.59 Session herd
shm_records 138220355 11147.70 SHM records
shm_writes 9411107 759.02 SHM writes
shm_flushes 0 0.00 SHM flushes due to overflow
shm_cont 836 0.07 SHM MTX contention
shm_cycles 58 0.00 SHM cycles through buffer
sms_nreq 210 0.02 SMS allocator requests
sms_nobj 0 . SMS outstanding allocations
sms_nbytes 0 . SMS outstanding bytes
sms_balloc 87855 . SMS bytes allocated
sms_bfree 87855 . SMS bytes freed
backend_req 146146 11.79 Backend requests made
n_vcl 1 0.00 N vcl total
n_vcl_avail 1 0.00 N vcl available
n_vcl_discard 0 0.00 N vcl discarded
n_ban 1 . N total active bans
n_ban_add 1 0.00 N new bans added
n_ban_retire 0 0.00 N old bans deleted
n_ban_obj_test 0 0.00 N objects tested
n_ban_re_test 0 0.00 N regexps tested against
n_ban_dups 0 0.00 N duplicate bans removed
hcb_nolock 3371117 271.89 HCB Lookups without lock
hcb_lock 145899 11.77 HCB Lookups with lock
hcb_insert 145899 11.77 HCB Inserts
esi_errors 0 0.00 ESI parse errors (unlock)
esi_warnings 0 0.00 ESI parse warnings (unlock)
accept_fail 0 0.00 Accept failures
client_drop_late 0 0.00 Connection dropped late
uptime 12399 1.00 Client uptime
dir_dns_lookups 0 0.00 DNS director lookups
dir_dns_failed 0 0.00 DNS director failed lookups
dir_dns_hit 0 0.00 DNS director cached lookups hit
dir_dns_cache_full 0 0.00 DNS director full dnscache
vmods 0 . Loaded VMODs
n_gzip 0 0.00 Gzip operations
n_gunzip 1741 0.14 Gunzip operations
LCK.sms.creat 1 0.00 Created locks
LCK.sms.destroy 0 0.00 Destroyed locks
LCK.sms.locks 630 0.05 Lock Operations
LCK.sms.colls 0 0.00 Collisions
LCK.smp.creat 0 0.00 Created locks
LCK.smp.destroy 0 0.00 Destroyed locks
LCK.smp.locks 0 0.00 Lock Operations
LCK.smp.colls 0 0.00 Collisions
LCK.sma.creat 2 0.00 Created locks
LCK.sma.destroy 0 0.00 Destroyed locks
LCK.sma.locks 297818 24.02 Lock Operations
LCK.sma.colls 0 0.00 Collisions
LCK.smf.creat 0 0.00 Created locks
LCK.smf.destroy 0 0.00 Destroyed locks
LCK.smf.locks 0 0.00 Lock Operations
LCK.smf.colls 0 0.00 Collisions
LCK.hsl.creat 0 0.00 Created locks
LCK.hsl.destroy 0 0.00 Destroyed locks
LCK.hsl.locks 0 0.00 Lock Operations
LCK.hsl.colls 0 0.00 Collisions
LCK.hcb.creat 1 0.00 Created locks
LCK.hcb.destroy 0 0.00 Destroyed locks
LCK.hcb.locks 147895 11.93 Lock Operations
LCK.hcb.colls 0 0.00 Collisions
LCK.hcl.creat 0 0.00 Created locks
LCK.hcl.destroy 0 0.00 Destroyed locks
LCK.hcl.locks 0 0.00 Lock Operations
LCK.hcl.colls 0 0.00 Collisions
LCK.vcl.creat 1 0.00 Created locks
LCK.vcl.destroy 0 0.00 Destroyed locks
LCK.vcl.locks 134 0.01 Lock Operations
LCK.vcl.colls 0 0.00 Collisions
LCK.stat.creat 1 0.00 Created locks
LCK.stat.destroy 0 0.00 Destroyed locks
LCK.stat.locks 539 0.04 Lock Operations
LCK.stat.colls 0 0.00 Collisions
LCK.sessmem.creat 1 0.00 Created locks
LCK.sessmem.destroy 0 0.00 Destroyed locks
LCK.sessmem.locks 974191 78.57 Lock Operations
LCK.sessmem.colls 0 0.00 Collisions
LCK.wstat.creat 1 0.00 Created locks
LCK.wstat.destroy 0 0.00 Destroyed locks
LCK.wstat.locks 2485702 200.48 Lock Operations
LCK.wstat.colls 0 0.00 Collisions
LCK.herder.creat 1 0.00 Created locks
LCK.herder.destroy 0 0.00 Destroyed locks
LCK.herder.locks 1 0.00 Lock Operations
LCK.herder.colls 0 0.00 Collisions
LCK.wq.creat 2 0.00 Created locks
LCK.wq.destroy 0 0.00 Destroyed locks
LCK.wq.locks 5972534 481.69 Lock Operations
LCK.wq.colls 0 0.00 Collisions
LCK.objhdr.creat 145983 11.77 Created locks
LCK.objhdr.destroy 1884 0.15 Destroyed locks
LCK.objhdr.locks 13487826 1087.82 Lock Operations
LCK.objhdr.colls 0 0.00 Collisions
LCK.exp.creat 1 0.00 Created locks
LCK.exp.destroy 0 0.00 Destroyed locks
LCK.exp.locks 160025 12.91 Lock Operations
LCK.exp.colls 0 0.00 Collisions
LCK.lru.creat 2 0.00 Created locks
LCK.lru.destroy 0 0.00 Destroyed locks
LCK.lru.locks 145799 11.76 Lock Operations
LCK.lru.colls 0 0.00 Collisions
LCK.cli.creat 1 0.00 Created locks
LCK.cli.destroy 0 0.00 Destroyed locks
LCK.cli.locks 4145 0.33 Lock Operations
LCK.cli.colls 0 0.00 Collisions
LCK.ban.creat 1 0.00 Created locks
LCK.ban.destroy 0 0.00 Destroyed locks
LCK.ban.locks 1382436 111.50 Lock Operations
LCK.ban.colls 0 0.00 Collisions
LCK.vbp.creat 1 0.00 Created locks
LCK.vbp.destroy 0 0.00 Destroyed locks
LCK.vbp.locks 0 0.00 Lock Operations
LCK.vbp.colls 0 0.00 Collisions
LCK.vbe.creat 1 0.00 Created locks
LCK.vbe.destroy 0 0.00 Destroyed locks
LCK.vbe.locks 31926 2.57 Lock Operations
LCK.vbe.colls 0 0.00 Collisions
LCK.backend.creat 1 0.00 Created locks
LCK.backend.destroy 0 0.00 Destroyed locks
LCK.backend.locks 324303 26.16 Lock Operations
LCK.backend.colls 0 0.00 Collisions
SMA.s0.c_req 291808 23.53 Allocator requests
SMA.s0.c_fail 0 0.00 Allocator failures
SMA.s0.c_bytes 5460865273 440427.88 Bytes allocated
SMA.s0.c_freed 208057151 16780.16 Bytes freed
SMA.s0.g_alloc 287944 . Allocations outstanding
SMA.s0.g_bytes 5252808122 . Bytes outstanding
SMA.s0.g_space 5484610118 . Bytes available
SMA.Transient.c_req 284 0.02 Allocator requests
SMA.Transient.c_fail 0 0.00 Allocator failures
SMA.Transient.c_bytes 9188773 741.09 Bytes allocated
SMA.Transient.c_freed 9188773 741.09 Bytes freed
SMA.Transient.g_alloc 0 . Allocations outstanding
SMA.Transient.g_bytes 0 . Bytes outstanding
SMA.Transient.g_space 0 . Bytes available
VBE.default(127.0.0.1,,8080).vcls 1 . VCL references
VBE.default(127.0.0.1,,8080).happy 0 . Happy health probes
# cat /proc/`pgrep varnishd|tail -n1`/limits
Limit Soft Limit Hard Limit Units
Max cpu time unlimited unlimited seconds
Max file size unlimited unlimited bytes
Max data size unlimited unlimited bytes
Max stack size 8388608 unlimited bytes
Max core file size 0 unlimited bytes
Max resident set unlimited unlimited bytes
Max processes unlimited unlimited processes
Max open files 131072 131072 files
Max locked memory 83968000 83968000 bytes
Max address space unlimited unlimited bytes
Max file locks unlimited unlimited locks
Max pending signals 16382 16382 signals
Max msgqueue size 819200 819200 bytes
Max nice priority 0 0
Max realtime priority 0 0
Max realtime timeout unlimited unlimited us
Varnish command line : /usr/sbin/varnishd -P /var/run/varnishd.pid -a
:80 -f /etc/varnish/conf.vcl -T localhost:6082 -u varnish -g varnish
-t 120 -w 50,2000,120 -S /etc/varnish/secret -s malloc,10G -p
thread_pool_add_delay 2 -p thread_pools 2
And our conf.vcl :
backend default {
.host = "127.0.0.1";
.port = "8080";
}
sub vcl_recv {
if(req.http.host == "www.example.com"){
return(lookup);
}
}
sub vcl_deliver {
remove resp.http.Via;
remove resp.http.X-Varnish;
remove resp.http.Server;
remove resp.http.X-Powered-By;
}
Thank you for your help.
More information about the varnish-misc
mailing list