[Varnish] #1054: Child not responding to CLI, killing it

Varnish varnish-bugs at varnish-cache.org
Mon May 14 10:50:56 CEST 2012


#1054: Child not responding to CLI, killing it
-----------------------+----------------------------------------------------
  Reporter:  scorillo  |        Type:  defect  
    Status:  reopened  |    Priority:  normal  
 Milestone:            |   Component:  varnishd
   Version:  3.0.2     |    Severity:  normal  
Resolution:            |    Keywords:          
-----------------------+----------------------------------------------------
Changes (by etherael):

  * status:  closed => reopened
  * resolution:  worksforme =>


Comment:

 I am seeing the exact same issue on a varnish cache here, dies every 3-4
 days. Details follow;
 {{{

 Messagelog content;
 incident 1;
 May  9 15:13:57 elk2 varnishd[25859]: Child (2551) not responding to CLI,
 killing it.
 May  9 15:14:07 elk2 varnishd[25859]: Child (2551) not responding to CLI,
 killing it.
 May  9 15:14:17 elk2 varnishd[25859]: Child (2551) not responding to CLI,
 killing it.
 May  9 15:14:27 elk2 varnishd[25859]: Child (2551) not responding to CLI,
 killing it.
 May  9 15:14:37 elk2 varnishd[25859]: Child (2551) not responding to CLI,
 killing it.
 May  9 15:14:47 elk2 varnishd[25859]: Child (2551) not responding to CLI,
 killing it.
 May  9 15:14:52 elk2 varnishd[25859]: Child (2551) not responding to CLI,
 killing it.
 May  9 15:14:52 elk2 varnishd[25859]: Child (2551) not responding to CLI,
 killing it.
 May  9 15:14:52 elk2 varnishd[25859]: Child (2551) died signal=3 (core
 dumped)
 May  9 15:14:52 elk2 varnishd[25859]: child (15127) Started
 May  9 15:14:52 elk2 varnishd[25859]: Child (15127) said Child starts
 May  9 15:14:52 elk2 varnishd[25859]: Child (15127) said SMF.s0 mmap'ed
 311385128960 bytes of 311385128960


 incident 2;

 May 14 00:53:33 elk2 varnishd[25859]: Child (15127) not responding to CLI,
 killing it.
 May 14 00:53:43 elk2 varnishd[25859]: Child (15127) not responding to CLI,
 killing it.
 May 14 00:53:53 elk2 varnishd[25859]: Child (15127) not responding to CLI,
 killing it.
 May 14 00:54:03 elk2 varnishd[25859]: Child (15127) not responding to CLI,
 killing it.
 May 14 00:54:13 elk2 varnishd[25859]: Child (15127) not responding to CLI,
 killing it.
 May 14 00:54:20 elk2 abrt[30696]: file /usr/sbin/varnishd seems to be
 deleted
 May 14 00:54:21 elk2 varnishd[25859]: Child (15127) not responding to CLI,
 killing it.
 May 14 00:54:21 elk2 varnishd[25859]: Child (15127) not responding to CLI,
 killing it.
 May 14 00:54:21 elk2 varnishd[25859]: Child (15127) died signal=3 (core
 dumped)
 May 14 00:54:21 elk2 varnishd[25859]: child (30697) Started
 May 14 00:54:21 elk2 varnishd[25859]: Child (30697) said Child starts
 May 14 00:54:21 elk2 varnishd[25859]: Child (30697) said SMF.s0 mmap'ed
 311385128960 bytes of 311385128960

 version info;


 root at parrot:~$ rpm -qa |grep varnish
 varnish-libs-3.0.2-1.el5.x86_64
 varnish-3.0.2-1.el5.x86_64
 varnish-release-3.0-1.noarch
 root at parrot:~$ uname -a
 Linux parrot 2.6.32-220.7.1.el6.x86_64 #1 SMP Wed Mar 7 00:52:02 GMT 2012
 x86_64 x86_64 x86_64 GNU/Linux
 root at parrot:~$ cat /etc/redhat-release
 CentOS release 6.2 (Final)

 startup cmd;

 varnish  30697 25859 10 00:54 ?        00:17:52 /usr/sbin/varnishd -P
 /var/run/varnish.pid -a :80 -T localhost:6082 -f /etc/varnish/default.vcl
 -u varnish
  -g varnish -S /etc/varnish/secret -s
 file,/var/lib/varnish/varnish_storage.bin,290G

 varnishstat;

 root at parrot:~$ varnishstat -1
 client_conn             253164        25.32 Client connections accepted
 client_drop                  0         0.00 Connection dropped, no
 sess/wrk
 client_req              338951        33.91 Client requests received
 cache_hit               156382        15.64 Cache hits
 cache_hitpass                0         0.00 Cache hits for pass
 cache_miss              153794        15.38 Cache misses
 backend_conn             69798         6.98 Backend conn. success
 backend_unhealthy            0         0.00 Backend conn. not attempted
 backend_busy                 0         0.00 Backend conn. too many
 backend_fail                 2         0.00 Backend conn. failures
 backend_reuse           112847        11.29 Backend conn. reuses
 backend_toolate           1116         0.11 Backend conn. was closed
 backend_recycle         113971        11.40 Backend conn. recycles
 backend_retry               38         0.00 Backend conn. retry
 fetch_head                   6         0.00 Fetch head
 fetch_length            178831        17.89 Fetch with Length
 fetch_chunked             3509         0.35 Fetch chunked
 fetch_eof                    0         0.00 Fetch EOF
 fetch_bad                    0         0.00 Fetch had bad headers
 fetch_close                204         0.02 Fetch wanted close
 fetch_oldhttp                0         0.00 Fetch pre HTTP/1.1 closed
 fetch_zero                   0         0.00 Fetch zero len
 fetch_failed                 0         0.00 Fetch failed
 fetch_1xx                    0         0.00 Fetch no body (1xx)
 fetch_204                    0         0.00 Fetch no body (204)
 fetch_304                    1         0.00 Fetch no body (304)
 n_sess_mem                 565          .   N struct sess_mem
 n_sess                     145          .   N struct sess
 n_object                153780          .   N struct object
 n_vampireobject              0          .   N unresurrected objects
 n_objectcore            153835          .   N struct objectcore
 n_objecthead            153835          .   N struct objecthead
 n_waitinglist             1139          .   N struct waitinglist
 n_vbc                       58          .   N struct vbc
 n_wrk                       90          .   N worker threads
 n_wrk_create               360         0.04 N worker threads created
 n_wrk_failed                 0         0.00 N worker threads not created
 n_wrk_max                    0         0.00 N worker threads limited
 n_wrk_lqueue                 0         0.00 work request queue length
 n_wrk_queued              4119         0.41 N queued work requests
 n_wrk_drop                   0         0.00 N dropped work requests
 n_backend                    3          .   N backends
 n_expired                    0          .   N expired objects
 n_lru_nuked                  0          .   N LRU nuked objects
 n_lru_moved             142671          .   N LRU moved objects
 losthdr                      0         0.00 HTTP header overflows
 n_objsendfile                0         0.00 Objects sent with sendfile
 n_objwrite              329530        32.96 Objects sent with write
 n_objoverflow                0         0.00 Objects overflowing workspace
 s_sess                  253107        25.32 Total Sessions
 s_req                   338951        33.91 Total Requests
 s_pipe                       0         0.00 Total pipe
 s_pass                   28775         2.88 Total pass
 s_fetch                 182551        18.26 Total fetch
 s_hdrbytes           138408349     13844.99 Total header bytes
 s_bodybytes         6200495767    620235.65 Total body bytes
 sess_closed             106492        10.65 Session Closed
 sess_pipeline              289         0.03 Session Pipeline
 sess_readahead              72         0.01 Session Read Ahead
 sess_linger             239202        23.93 Session Linger
 sess_herd               246213        24.63 Session herd
 shm_records           23151895      2315.88 SHM records
 shm_writes             1718951       171.95 SHM writes
 shm_flushes                  0         0.00 SHM flushes due to overflow
 shm_cont                  6496         0.65 SHM MTX contention
 shm_cycles                   9         0.00 SHM cycles through buffer
 sms_nreq                    18         0.00 SMS allocator requests
 sms_nobj                     0          .   SMS outstanding allocations
 sms_nbytes                   0          .   SMS outstanding bytes
 sms_balloc                7524          .   SMS bytes allocated
 sms_bfree                 7524          .   SMS bytes freed
 backend_req             182657        18.27 Backend requests made
 n_vcl                        1         0.00 N vcl total
 n_vcl_avail                  1         0.00 N vcl available
 n_vcl_discard                0         0.00 N vcl discarded
 n_ban                        1          .   N total active bans
 n_ban_add                    1         0.00 N new bans added
 n_ban_retire                 0         0.00 N old bans deleted
 n_ban_obj_test               0         0.00 N objects tested
 n_ban_re_test                0         0.00 N regexps tested against
 n_ban_dups                   0         0.00 N duplicate bans removed
 hcb_nolock              310164        31.03 HCB Lookups without lock
 hcb_lock                153842        15.39 HCB Lookups with lock
 hcb_insert              153842        15.39 HCB Inserts
 esi_errors                   0         0.00 ESI parse errors (unlock)
 esi_warnings                 0         0.00 ESI parse warnings (unlock)
 accept_fail                  0         0.00 Accept failures
 client_drop_late             0         0.00 Connection dropped late
 uptime                    9997         1.00 Client uptime
 dir_dns_lookups              0         0.00 DNS director lookups
 dir_dns_failed               0         0.00 DNS director failed lookups
 dir_dns_hit                  0         0.00 DNS director cached lookups
 hit
 dir_dns_cache_full           0         0.00 DNS director full dnscache
 vmods                        0          .   Loaded VMODs
 n_gzip                       0         0.00 Gzip operations
 n_gunzip                235860        23.59 Gunzip operations
 LCK.sms.creat                2         0.00 Created locks
 LCK.sms.destroy              0         0.00 Destroyed locks
 LCK.sms.locks           141948        14.20 Lock Operations
 LCK.sms.colls                0         0.00 Collisions
 LCK.smp.creat                0         0.00 Created locks
 LCK.smp.destroy              0         0.00 Destroyed locks
 LCK.smp.locks                0         0.00 Lock Operations
 LCK.smp.colls                0         0.00 Collisions
 LCK.sma.creat                2         0.00 Created locks
 LCK.sma.destroy              0         0.00 Destroyed locks
 LCK.sma.locks          4334608       433.59 Lock Operations
 LCK.sma.colls                0         0.00 Collisions
 LCK.smf.creat                2         0.00 Created locks
 LCK.smf.destroy              0         0.00 Destroyed locks
 LCK.smf.locks         20270014      2027.61 Lock Operations
 LCK.smf.colls                0         0.00 Collisions
 LCK.hsl.creat                0         0.00 Created locks
 LCK.hsl.destroy              0         0.00 Destroyed locks
 LCK.hsl.locks                0         0.00 Lock Operations
 LCK.hsl.colls                0         0.00 Collisions
 LCK.hcb.creat                2         0.00 Created locks
 LCK.hcb.destroy              0         0.00 Destroyed locks
 LCK.hcb.locks          6813196       681.52 Lock Operations
 LCK.hcb.colls                0         0.00 Collisions
 LCK.hcl.creat                0         0.00 Created locks
 LCK.hcl.destroy              0         0.00 Destroyed locks
 LCK.hcl.locks                0         0.00 Lock Operations
 LCK.hcl.colls                0         0.00 Collisions
 LCK.vcl.creat                2         0.00 Created locks
 LCK.vcl.destroy              0         0.00 Destroyed locks
 LCK.vcl.locks            27017         2.70 Lock Operations
 LCK.vcl.colls                0         0.00 Collisions
 LCK.stat.creat               2         0.00 Created locks
 LCK.stat.destroy             0         0.00 Destroyed locks
 LCK.stat.locks            3353         0.34 Lock Operations
 LCK.stat.colls               0         0.00 Collisions
 LCK.sessmem.creat            2         0.00 Created locks
 LCK.sessmem.destroy            0         0.00 Destroyed locks
 LCK.sessmem.locks       10434926      1043.81 Lock Operations
 LCK.sessmem.colls              0         0.00 Collisions
 LCK.wstat.creat                2         0.00 Created locks
 LCK.wstat.destroy              0         0.00 Destroyed locks
 LCK.wstat.locks          1062489       106.28 Lock Operations
 LCK.wstat.colls                0         0.00 Collisions
 LCK.herder.creat               2         0.00 Created locks
 LCK.herder.destroy             0         0.00 Destroyed locks
 LCK.herder.locks           28077         2.81 Lock Operations
 LCK.herder.colls               0         0.00 Collisions
 LCK.wq.creat                   4         0.00 Created locks
 LCK.wq.destroy                 0         0.00 Destroyed locks
 LCK.wq.locks            32085426      3209.51 Lock Operations
 LCK.wq.colls                   0         0.00 Collisions
 LCK.objhdr.creat         6777786       677.98 Created locks
 LCK.objhdr.destroy         38400         3.84 Destroyed locks
 LCK.objhdr.locks        58265283      5828.28 Lock Operations
 LCK.objhdr.colls               0         0.00 Collisions
 LCK.exp.creat                  2         0.00 Created locks
 LCK.exp.destroy                0         0.00 Destroyed locks
 LCK.exp.locks            7078276       708.04 Lock Operations
 LCK.exp.colls                  0         0.00 Collisions
 LCK.lru.creat                  4         0.00 Created locks
 LCK.lru.destroy                0         0.00 Destroyed locks
 LCK.lru.locks            6740732       674.28 Lock Operations
 LCK.lru.colls                  0         0.00 Collisions
 LCK.cli.creat                  2         0.00 Created locks
 LCK.cli.destroy                0         0.00 Destroyed locks
 LCK.cli.locks             112482        11.25 Lock Operations
 LCK.cli.colls                  0         0.00 Collisions
 LCK.ban.creat                  2         0.00 Created locks
 LCK.ban.destroy                0         0.00 Destroyed locks
 LCK.ban.locks            7078306       708.04 Lock Operations
 LCK.ban.colls                  0         0.00 Collisions
 LCK.vbp.creat                  2         0.00 Created locks
 LCK.vbp.destroy                0         0.00 Destroyed locks
 LCK.vbp.locks                  0         0.00 Lock Operations
 LCK.vbp.colls                  0         0.00 Collisions
 LCK.vbe.creat                  2         0.00 Created locks
 LCK.vbe.destroy                0         0.00 Destroyed locks
 LCK.vbe.locks            7031988       703.41 Lock Operations
 LCK.vbe.colls                  0         0.00 Collisions
 LCK.backend.creat              6         0.00 Created locks
 LCK.backend.destroy            0         0.00 Destroyed locks
 LCK.backend.locks       21975254      2198.18 Lock Operations
 LCK.backend.colls              0         0.00 Collisions
 SMF.s0.c_req            13531967      1353.60 Allocator requests
 SMF.s0.c_fail                  0         0.00 Allocator failures
 SMF.s0.c_bytes      907854680064  90812711.82 Bytes allocated
 SMF.s0.c_freed      795545841664  79578457.70 Bytes freed
 SMF.s0.g_alloc          13484647          .   Allocations outstanding
 SMF.s0.g_bytes      112308838400          .   Bytes outstanding
 SMF.s0.g_space      510461419520          .   Bytes available
 SMF.s0.g_smf            14758490          .   N struct smf
 SMF.s0.g_smf_frag        1273251          .   N small free smf
 SMF.s0.g_smf_large           592          .   N large free smf
 SMA.Transient.c_req        57850         5.79 Allocator requests
 SMA.Transient.c_fail           0         0.00 Allocator failures
 SMA.Transient.c_bytes   3809803264    381094.65 Bytes allocated
 SMA.Transient.c_freed   3809803264    381094.65 Bytes freed
 SMA.Transient.g_alloc            0          .   Allocations outstanding
 SMA.Transient.g_bytes            0          .   Bytes outstanding
 SMA.Transient.g_space            0          .   Bytes available
 VBE.default(10.52.5.194,,8080).vcls            2          .   VCL
 references
 VBE.default(10.52.5.194,,8080).happy           0          .   Happy health
 probes
 VBE.thumbs(127.0.0.1,,8080).vcls               2          .   VCL
 references
 VBE.thumbs(127.0.0.1,,8080).happy              0          .   Happy health
 probes
 VBE.nm_elk(50.23.148.202,,80).vcls             2          .   VCL
 references
 VBE.nm_elk(50.23.148.202,,80).happy            0          .   Happy health
 probes


 }}}

 Other strangeness; the directory that contains the actual varnish storage
 bin behaves erratically, the file storage.bin is a sparse file of the size
 requested on the startup (290gb) but sparse file listing shows that it
 never climbs above the 205gb mark in terms of actual space used.

 When varnish died on the 9th and auto restarted, this figure (actual size
 used in sparse file) was registered as 153gb despite the cache being
 completely empty.

 client_req average hovers around 30 but peaks at 70, nothing at all huge
 compared to what I've seen previously. The only thing that is out of the
 ordinary I suppose is the size of the storage file (very long tail of
 infrequently accessed objects on the backends with very high load on those
 backend servers that we want to ease off on, hence the immense cache size)

 The average io load is about 200kbytes reads per second and negligible
 writes, load average rarely exceeds 3 (16 core Intel Xeon E5620 @ 2.40Ghz,
 12gb memory.

 That's all I can think of in terms of detail, I note the version of the
 cache is up to date with upstream despite being a binary distributed from
 the vendor, could it help to compile from source and see if the issue
 persists?

-- 
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1054#comment:5>
Varnish <https://varnish-cache.org/>
The Varnish HTTP Accelerator




More information about the varnish-bugs mailing list