Varnish unresponsive on our server
Uday Kumar
uday.polu at indiamart.com
Thu Jun 20 20:44:22 UTC 2024
Hi Guillaume,
> pz_list and recv_queue are not (to my knowledge) Varnish counters, where
> are you seeing them?
Yes, we could not find them among the Varnish counters.

For pz_list: we checked the process list count (the plist-sz column of
sar -q) during the issue period [9th June].
Ex: sar -q -f /var/log/sa/sa09 -s 21:15:00 -e 21:30:00

For recv_queue: we read the listen socket's receive queue (the Recv-Q column)
with netstat, but we cannot retrieve its old values now.
Ex: sudo netstat -nltp
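In case it helps with the next occurrence, here is a rough sketch for keeping
a Recv-Q history (assuming varnishd listens on port 80 and /var/log is
writable - both are assumptions, adjust to your setup). It samples the
listener's Recv-Q once a minute from cron so the values survive the incident:

#!/bin/sh
# log-recvq.sh -- hypothetical helper: append a timestamped Recv-Q sample
# for the :80 listener (the same column netstat -nlt prints as Recv-Q)
echo "$(date '+%F %T') $(netstat -nlt | awk '$4 ~ /:80$/ {print $2}')" >> /var/log/recvq.log

# crontab entry, once per minute:
# * * * * * /usr/local/bin/log-recvq.sh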
> Could you share a "varnishstat -1" of the impacted machine?
I doubt we can get varnishstat output from the machine for the issue period
itself - please correct me if I'm wrong.
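One idea so that we do have counters from the next incident window (a minimal
sketch, assuming a writable /var/log/varnishstat directory and that keeping
two days of per-minute snapshots is acceptable disk-wise):

#!/bin/sh
# snapshot-varnishstat.sh -- hypothetical helper: keep per-minute
# "varnishstat -1" dumps so the issue window can be inspected afterwards
dir=/var/log/varnishstat
mkdir -p "$dir"
varnishstat -1 > "$dir/varnishstat.$(date '+%Y%m%d-%H%M').txt"
# drop snapshots older than two days
find "$dir" -name 'varnishstat.*.txt' -mtime +2 -delete

# crontab entry, once per minute:
# * * * * * /usr/local/bin/snapshot-varnishstat.sh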
Below are the stats taken now (21st June, 02:00):
MGT.uptime 9960545 1.00 Management process uptime
MGT.child_start 1 0.00 Child process started
MGT.child_exit 0 0.00 Child process normal exit
MGT.child_stop 0 0.00 Child process unexpected exit
MGT.child_died 0 0.00 Child process died (signal)
MGT.child_dump 0 0.00 Child process core dumped
MGT.child_panic 0 0.00 Child process panic
MAIN.summs 999145669 100.31 stat summ operations
MAIN.uptime 9960546 1.00 Child process uptime
MAIN.sess_conn 230308639 23.12 Sessions accepted
MAIN.sess_drop 0 0.00 Sessions dropped
MAIN.sess_fail 0 0.00 Session accept failures
MAIN.client_req_400 8578 0.00 Client requests received,
subject to 400 errors
MAIN.client_req_417 0 0.00 Client requests received,
subject to 417 errors
MAIN.client_req 272378619 27.35 Good client requests received
MAIN.cache_hit 61997297 6.22 Cache hits
MAIN.cache_hitpass 0 0.00 Cache hits for pass.
MAIN.cache_hitmiss 147993 0.01 Cache hits for miss.
MAIN.cache_miss 210380895 21.12 Cache misses
MAIN.backend_conn 5153783 0.52 Backend conn. success
MAIN.backend_unhealthy 0 0.00 Backend conn. not attempted
MAIN.backend_busy 0 0.00 Backend conn. too many
MAIN.backend_fail 42 0.00 Backend conn. failures
MAIN.backend_reuse 206881432 20.77 Backend conn. reuses
MAIN.backend_recycle 208365094 20.92 Backend conn. recycles
MAIN.backend_retry 3347 0.00 Backend conn. retry
MAIN.fetch_head 0 0.00 Fetch no body (HEAD)
MAIN.fetch_length 2073051 0.21 Fetch with Length
MAIN.fetch_chunked 209964885 21.08 Fetch chunked
MAIN.fetch_eof 9 0.00 Fetch EOF
MAIN.fetch_bad 0 0.00 Fetch bad T-E
MAIN.fetch_none 40 0.00 Fetch no body
MAIN.fetch_1xx 0 0.00 Fetch no body (1xx)
MAIN.fetch_204 0 0.00 Fetch no body (204)
MAIN.fetch_304 0 0.00 Fetch no body (304)
MAIN.fetch_failed 415941 0.04 Fetch failed (all causes)
MAIN.fetch_no_thread 0 0.00 Fetch failed (no thread)
MAIN.pools 2 . Number of thread pools
MAIN.threads 31 . Total number of threads
MAIN.threads_limited 75 0.00 Threads hit max
MAIN.threads_created 290240 0.03 Threads created
MAIN.threads_destroyed 290209 0.03 Threads destroyed
MAIN.threads_failed 0 0.00 Thread creation failed
MAIN.thread_queue_len 0 . Length of session queue
MAIN.busy_sleep 1000325 0.10 Number of requests sent to
sleep on busy objhdr
MAIN.busy_wakeup 1000325 0.10 Number of requests woken
after sleep on busy objhdr
MAIN.busy_killed 0 0.00 Number of requests killed
after sleep on busy objhdr
MAIN.sess_queued 324645 0.03 Sessions queued for thread
MAIN.sess_dropped 0 0.00 Sessions dropped for thread
MAIN.req_dropped 0 0.00 Requests dropped
MAIN.n_object 1363649 . object structs made
MAIN.n_vampireobject 0 . unresurrected objects
MAIN.n_objectcore 1363658 . objectcore structs made
MAIN.n_objecthead 1365235 . objecthead structs made
MAIN.n_backend 18 . Number of backends
MAIN.n_expired 18708716 . Number of expired objects
MAIN.n_lru_nuked 190240122 . Number of LRU nuked objects
MAIN.n_lru_moved 48881176 . Number of LRU moved objects
MAIN.losthdr 194 0.00 HTTP header overflows
MAIN.s_sess 230308639 23.12 Total sessions seen
MAIN.s_pipe 0 0.00 Total pipe sessions seen
MAIN.s_pass 221 0.00 Total pass-ed requests seen
MAIN.s_fetch 210381116 21.12 Total backend fetches
initiated
MAIN.s_synth 15291 0.00 Total synthetic responses
made
MAIN.s_req_hdrbytes 152588868591 15319.33 Request header bytes
MAIN.s_req_bodybytes 47945048076 4813.50 Request body bytes
MAIN.s_resp_hdrbytes 61429100826 6167.24 Response header bytes
MAIN.s_resp_bodybytes 31545695399316 3167064.88 Response body bytes
MAIN.s_pipe_hdrbytes 0 0.00 Pipe request header bytes
MAIN.s_pipe_in 0 0.00 Piped bytes from client
MAIN.s_pipe_out 0 0.00 Piped bytes to client
MAIN.sess_closed 118139254 11.86 Session Closed
MAIN.sess_closed_err 11257806 1.13 Session Closed with error
MAIN.sess_readahead 0 0.00 Session Read Ahead
MAIN.sess_herd 65082138 6.53 Session herd
MAIN.sc_rem_close 100948777 10.13 Session OK REM_CLOSE
MAIN.sc_req_close 116459448 11.69 Session OK REQ_CLOSE
MAIN.sc_req_http10 1044 0.00 Session Err REQ_HTTP10
MAIN.sc_rx_bad 0 0.00 Session Err RX_BAD
MAIN.sc_rx_body 0 0.00 Session Err RX_BODY
MAIN.sc_rx_junk 8578 0.00 Session Err RX_JUNK
MAIN.sc_rx_overflow 0 0.00 Session Err RX_OVERFLOW
MAIN.sc_rx_timeout 11248199 1.13 Session Err RX_TIMEOUT
MAIN.sc_tx_pipe 0 0.00 Session OK TX_PIPE
MAIN.sc_tx_error 0 0.00 Session Err TX_ERROR
MAIN.sc_tx_eof 1617225 0.16 Session OK TX_EOF
MAIN.sc_resp_close 0 0.00 Session OK RESP_CLOSE
MAIN.sc_overload 0 0.00 Session Err OVERLOAD
MAIN.sc_pipe_overflow 0 0.00 Session Err PIPE_OVERFLOW
MAIN.sc_range_short 0 0.00 Session Err RANGE_SHORT
MAIN.sc_req_http20 0 0.00 Session Err REQ_HTTP20
MAIN.sc_vcl_failure 4 0.00 Session Err VCL_FAILURE
MAIN.shm_records 34703350941 3484.08 SHM records
MAIN.shm_writes 2492186358 250.21 SHM writes
MAIN.shm_flushes 413619708 41.53 SHM flushes due to overflow
MAIN.shm_cont 16129035 1.62 SHM MTX contention
MAIN.shm_cycles 31683 0.00 SHM cycles through buffer
MAIN.backend_req 212039555 21.29 Backend requests made
MAIN.n_vcl 9 . Number of loaded VCLs in
total
MAIN.n_vcl_avail 9 . Number of VCLs available
MAIN.n_vcl_discard 0 . Number of discarded VCLs
MAIN.vcl_fail 4 0.00 VCL failures
MAIN.bans 1 . Count of bans
MAIN.bans_completed 1 . Number of bans marked
'completed'
MAIN.bans_obj 0 . Number of bans using obj.*
MAIN.bans_req 0 . Number of bans using req.*
MAIN.bans_added 1 0.00 Bans added
MAIN.bans_deleted 0 0.00 Bans deleted
MAIN.bans_tested 0 0.00 Bans tested against
objects (lookup)
MAIN.bans_obj_killed 0 0.00 Objects killed by bans
(lookup)
MAIN.bans_lurker_tested 0 0.00 Bans tested against
objects (lurker)
MAIN.bans_tests_tested 0 0.00 Ban tests tested against
objects (lookup)
MAIN.bans_lurker_tests_tested 0 0.00 Ban tests tested
against objects (lurker)
MAIN.bans_lurker_obj_killed 0 0.00 Objects killed by
bans (lurker)
MAIN.bans_lurker_obj_killed_cutoff 0 0.00 Objects killed
by bans for cutoff (lurker)
MAIN.bans_dups 0 0.00 Bans
superseded by other bans
MAIN.bans_lurker_contention 0 0.00 Lurker gave
way for lookup
MAIN.bans_persisted_bytes 16 . Bytes used by
the persisted ban lists
MAIN.bans_persisted_fragmentation 0 . Extra bytes in
persisted ban lists due to fragmentation
MAIN.n_purges 0 . Number of
purge operations executed
MAIN.n_obj_purged 0 . Number of
purged objects
MAIN.exp_mailed 400657705 40.22 Number of
objects mailed to expiry thread
MAIN.exp_received 400655888 40.22 Number of
objects received by expiry thread
MAIN.hcb_nolock 272378417 27.35 HCB Lookups
without lock
MAIN.hcb_lock 210148272 21.10 HCB Lookups
with lock
MAIN.hcb_insert 210147129 21.10 HCB Inserts
MAIN.esi_errors 0 0.00 ESI parse
errors (unlock)
MAIN.esi_warnings 0 0.00 ESI parse
warnings (unlock)
MAIN.vmods 2 . Loaded VMODs
MAIN.n_gzip 0 0.00 Gzip operations
MAIN.n_gunzip 0 0.00 Gunzip
operations
MAIN.n_test_gunzip 125951749 12.65 Test gunzip
operations
LCK.backend.creat 20 0.00 Created locks
LCK.backend.destroy 0 0.00 Destroyed locks
LCK.backend.locks 436028686 43.78 Lock Operations
LCK.backend_tcp.creat 2 0.00 Created locks
LCK.backend_tcp.destroy 0 0.00 Destroyed locks
LCK.backend_tcp.locks 839324465 84.26 Lock Operations
LCK.ban.creat 1 0.00 Created locks
LCK.ban.destroy 0 0.00 Destroyed locks
LCK.ban.locks 485319875 48.72 Lock Operations
LCK.busyobj.creat 275769226 27.69 Created locks
LCK.busyobj.destroy 275772300 27.69 Destroyed locks
LCK.busyobj.locks 45764668121 4594.59 Lock Operations
LCK.cli.creat 1 0.00 Created locks
LCK.cli.destroy 0 0.00 Destroyed locks
LCK.cli.locks 3319775 0.33 Lock Operations
LCK.exp.creat 1 0.00 Created locks
LCK.exp.destroy 0 0.00 Destroyed locks
LCK.exp.locks 1601937269 160.83 Lock Operations
LCK.hcb.creat 1 0.00 Created locks
LCK.hcb.destroy 0 0.00 Destroyed locks
LCK.hcb.locks 418989369 42.06 Lock Operations
LCK.lru.creat 2 0.00 Created locks
LCK.lru.destroy 0 0.00 Destroyed locks
LCK.lru.locks 658492325 66.11 Lock Operations
LCK.mempool.creat 5 0.00 Created locks
LCK.mempool.destroy 0 0.00 Destroyed locks
LCK.mempool.locks 1539542397 154.56 Lock Operations
LCK.objhdr.creat 210205399 21.10 Created locks
LCK.objhdr.destroy 208843468 20.97 Destroyed locks
LCK.objhdr.locks 4042638826 405.87 Lock Operations
LCK.pipestat.creat 1 0.00 Created locks
LCK.pipestat.destroy 0 0.00 Destroyed locks
LCK.pipestat.locks 0 0.00 Lock Operations
LCK.sess.creat 230305038 23.12 Created locks
LCK.sess.destroy 230306744 23.12 Destroyed locks
LCK.sess.locks 716377977 71.92 Lock Operations
LCK.vbe.creat 1 0.00 Created locks
LCK.vbe.destroy 0 0.00 Destroyed locks
LCK.vbe.locks 7305124 0.73 Lock Operations
LCK.vcapace.creat 1 0.00 Created locks
LCK.vcapace.destroy 0 0.00 Destroyed locks
LCK.vcapace.locks 0 0.00 Lock Operations
LCK.vcl.creat 1 0.00 Created locks
LCK.vcl.destroy 0 0.00 Destroyed locks
LCK.vcl.locks 424850700 42.65 Lock Operations
LCK.vxid.creat 1 0.00 Created locks
LCK.vxid.destroy 0 0.00 Destroyed locks
LCK.vxid.locks 265345 0.03 Lock Operations
LCK.waiter.creat 2 0.00 Created locks
LCK.waiter.destroy 0 0.00 Destroyed locks
LCK.waiter.locks 820192788 82.34 Lock Operations
LCK.wq.creat 3 0.00 Created locks
LCK.wq.destroy 0 0.00 Destroyed locks
LCK.wq.locks 1514678352 152.07 Lock Operations
LCK.wstat.creat 1 0.00 Created locks
LCK.wstat.destroy 0 0.00 Destroyed locks
LCK.wstat.locks 889992298 89.35 Lock Operations
MEMPOOL.busyobj.live 4 . In use
MEMPOOL.busyobj.pool 10 . In Pool
MEMPOOL.busyobj.sz_wanted 65536 . Size requested
MEMPOOL.busyobj.sz_actual 65504 . Size allocated
MEMPOOL.busyobj.allocs 210381121 21.12 Allocations
MEMPOOL.busyobj.frees 210381117 21.12 Frees
MEMPOOL.busyobj.recycle 210069371 21.09 Recycled from
pool
MEMPOOL.busyobj.timeout 6923652 0.70 Timed out from
pool
MEMPOOL.busyobj.toosmall 0 0.00 Too small to
recycle
MEMPOOL.busyobj.surplus 11360 0.00 Too many for
pool
MEMPOOL.busyobj.randry 311750 0.03 Pool ran dry
MEMPOOL.req0.live 3 . In use
MEMPOOL.req0.pool 11 . In Pool
MEMPOOL.req0.sz_wanted 65536 . Size requested
MEMPOOL.req0.sz_actual 65504 . Size allocated
MEMPOOL.req0.allocs 142052697 14.26 Allocations
MEMPOOL.req0.frees 142052694 14.26 Frees
MEMPOOL.req0.recycle 142028490 14.26 Recycled from
pool
MEMPOOL.req0.timeout 6143066 0.62 Timed out from
pool
MEMPOOL.req0.toosmall 0 0.00 Too small to
recycle
MEMPOOL.req0.surplus 1959 0.00 Too many for
pool
MEMPOOL.req0.randry 24207 0.00 Pool ran dry
MEMPOOL.sess0.live 8 . In use
MEMPOOL.sess0.pool 10 . In Pool
MEMPOOL.sess0.sz_wanted 512 . Size requested
MEMPOOL.sess0.sz_actual 480 . Size allocated
MEMPOOL.sess0.allocs 115153369 11.56 Allocations
MEMPOOL.sess0.frees 115153361 11.56 Frees
MEMPOOL.sess0.recycle 115143094 11.56 Recycled from
pool
MEMPOOL.sess0.timeout 6113104 0.61 Timed out from
pool
MEMPOOL.sess0.toosmall 0 0.00 Too small to
recycle
MEMPOOL.sess0.surplus 2787 0.00 Too many for
pool
MEMPOOL.sess0.randry 10275 0.00 Pool ran dry
LCK.smf.creat 1 0.00 Created locks
LCK.smf.destroy 0 0.00 Destroyed locks
LCK.smf.locks 3920420601 393.59 Lock Operations
SMF.s0.c_req 2064438640 207.26 Allocator
requests
SMF.s0.c_fail 195848371 19.66 Allocator
failures
SMF.s0.c_bytes 26626288971776 2673175.64 Bytes
allocated
SMF.s0.c_freed 26445825654784 2655057.83 Bytes freed
SMF.s0.g_alloc 12608309 . Allocations
outstanding
SMF.s0.g_bytes 180463316992 . Bytes
outstanding
SMF.s0.g_space 2072793088 . Bytes available
SMF.s0.g_smf 12977499 . N struct smf
SMF.s0.g_smf_frag 369190 . N small free
smf
SMF.s0.g_smf_large 0 . N large free
smf
LCK.sma.creat 1 0.00 Created locks
LCK.sma.destroy 0 0.00 Destroyed locks
LCK.sma.locks 263980168 26.50 Lock Operations
SMA.Transient.c_req 131990084 13.25 Allocator
requests
SMA.Transient.c_fail 0 0.00 Allocator
failures
SMA.Transient.c_bytes 59316019413 5955.10 Bytes allocated
SMA.Transient.c_freed 59316019413 5955.10 Bytes freed
SMA.Transient.g_alloc 0 . Allocations
outstanding
SMA.Transient.g_bytes 0 . Bytes
outstanding
SMA.Transient.g_space 0 . Bytes available
MEMPOOL.req1.live 1 . In use
MEMPOOL.req1.pool 10 . In Pool
MEMPOOL.req1.sz_wanted 65536 . Size requested
MEMPOOL.req1.sz_actual 65504 . Size allocated
MEMPOOL.req1.allocs 142089847 14.27 Allocations
MEMPOOL.req1.frees 142089846 14.27 Frees
MEMPOOL.req1.recycle 142065460 14.26 Recycled from
pool
MEMPOOL.req1.timeout 6145611 0.62 Timed out from
pool
MEMPOOL.req1.toosmall 0 0.00 Too small to
recycle
MEMPOOL.req1.surplus 2350 0.00 Too many for
pool
MEMPOOL.req1.randry 24387 0.00 Pool ran dry
MEMPOOL.sess1.live 7 . In use
MEMPOOL.sess1.pool 11 . In Pool
MEMPOOL.sess1.sz_wanted 512 . Size requested
MEMPOOL.sess1.sz_actual 480 . Size allocated
MEMPOOL.sess1.allocs 115155274 11.56 Allocations
MEMPOOL.sess1.frees 115155267 11.56 Frees
MEMPOOL.sess1.recycle 115144847 11.56 Recycled from
pool
MEMPOOL.sess1.timeout 6114536 0.61 Timed out from
pool
MEMPOOL.sess1.toosmall 0 0.00 Too small to
recycle
MEMPOOL.sess1.surplus 2953 0.00 Too many for
pool
MEMPOOL.sess1.randry 10427 0.00 Pool ran dry
VBE.reload_2024-06-20T181903.node66.happy 18446744073709551615 .
Happy health probes
VBE.reload_2024-06-20T181903.node66.bereq_hdrbytes 193871874
19.46 Request header bytes
VBE.reload_2024-06-20T181903.node66.bereq_bodybytes 0
0.00 Request body bytes
VBE.reload_2024-06-20T181903.node66.beresp_hdrbytes 65332553
6.56 Response header bytes
VBE.reload_2024-06-20T181903.node66.beresp_bodybytes 40260910590
4042.04 Response body bytes
VBE.reload_2024-06-20T181903.node66.pipe_hdrbytes 0
0.00 Pipe request header bytes
VBE.reload_2024-06-20T181903.node66.pipe_out 0
0.00 Piped bytes to backend
VBE.reload_2024-06-20T181903.node66.pipe_in 0
0.00 Piped bytes from backend
VBE.reload_2024-06-20T181903.node66.conn 1
. Concurrent connections to backend
VBE.reload_2024-06-20T181903.node66.req 247959
0.02 Backend requests sent
VBE.reload_2024-06-20T181903.node67.happy 18446744073709551615
. Happy health probes
VBE.reload_2024-06-20T181903.node67.bereq_hdrbytes 193960668
19.47 Request header bytes
VBE.reload_2024-06-20T181903.node67.bereq_bodybytes 0
0.00 Request body bytes
VBE.reload_2024-06-20T181903.node67.beresp_hdrbytes 65315238
6.56 Response header bytes
VBE.reload_2024-06-20T181903.node67.beresp_bodybytes 40142940116
4030.19 Response body bytes
VBE.reload_2024-06-20T181903.node67.pipe_hdrbytes 0
0.00 Pipe request header bytes
VBE.reload_2024-06-20T181903.node67.pipe_out 0
0.00 Piped bytes to backend
VBE.reload_2024-06-20T181903.node67.pipe_in 0
0.00 Piped bytes from backend
VBE.reload_2024-06-20T181903.node67.conn 3
. Concurrent connections to backend
VBE.reload_2024-06-20T181903.node67.req 247956
0.02 Backend requests sent
*Thanks & Regards,*
*Uday Kumar*
On Thu, Jun 20, 2024 at 11:36 PM Guillaume Quintard <
guillaume.quintard at gmail.com> wrote:
> Hi Uday,
>
> pz_list and recv_queue are not (to my knowledge) Varnish counters, where
> are you seeing them?
>
> I doubt Varnish is actually replying with 0, so that probably is your
> client faking a response code to have something to show. But that's a
> detail, as the unresponsiveness is real.
>
> Could you share a "varnishstat -1" of the impacted machine?
>
> --
> Guillaume Quintard
>
>
> On Thu, Jun 20, 2024 at 9:30 AM Uday Kumar <uday.polu at indiamart.com>
> wrote:
>
>> Hello all,
>>
>> We are facing frequent issues of Varnish unresponsiveness for some time on
>> our production server.
>>
>> During this time we have seen that pz_list increases to ~3000 and
>> recv_queue increases to ~130.
>> Also, Varnish responds with response code '0' for some time, which we take
>> to mean it is unresponsive.
>>
>> This is causing multiple 5xx on front ends.
>>
>> FYR:
>> User request count during this time is normal.
>>
>> Note:
>> During this time, we have confirmed that our backend servers are healthy
>> without any issues.
>>
>>
>> May I know what could be the reason for this behaviour in Varnish?
>>
>> Please give me some direction on how to debug this issue.
>> _______________________________________________
>> varnish-misc mailing list
>> varnish-misc at varnish-cache.org
>> https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc
>>
>