[Varnish] #275: Varnish child  stops responding
    Varnish 
    varnish-bugs at projects.linpro.no
       
    Wed Sep 24 05:14:01 CEST 2008
    
    
  
#275: Varnish child  stops responding
---------------------------------+------------------------------------------
 Reporter:  chenxy               |        Owner:  phk                      
     Type:  defect               |       Status:  new                      
 Priority:  high                 |    Milestone:  Varnish 2.0 code complete
Component:  varnishd             |      Version:  trunk                    
 Severity:  normal               |   Resolution:                           
 Keywords:  Varnish child stuck  |  
---------------------------------+------------------------------------------
Comment (by chenxy):
 I try do a test again . This is a report
 Environment: FreeBSD 7.0-RELEASE amd64 ,SCHED_ULE ,filesystem Turn off
 soft-updates and noatime
 Varnish version : trunk revision: 3213
 Test tool: http_load  ,version 12mar2006
 urls count : 155573
 http_load command:
 {{{
 [root at localhost http_load-12mar2006]# ./http_load -proxy
 varnish_host_ip:80 -parallel 500 -seconds 600 siege_url.txt
 ......
 http://image2.XXX.com.cn/igame/new/new015/shop_index_67.gif: byte count
 wrong
 http://i2.XXXimg.cn/gm/http/blog.XXX.com.cn/s/U2217P115T146D16346F4314DT20080918102123_w80.jpg:
 byte count wrong
 http://image2.XXX.com.cn/music/artist/0/1200_200150.jpg: byte count wrong
 http://image2.XXX.com.cn/music/artist/0/1200_200150.jpg: byte count wrong
 http://i3.XXXimg.cn/qc/buy/U121P33T133D11263F1620DT20080917103216.jpg:
 byte count wrong
 http://i2.XXXimg.cn/IT/mobile/models/U1069P2T136D2764F3244DT20050526165505_150h.jpg:
 byte count wrong
 http://i2.XXXimg.cn/IT/cr/2008/0229/1730605940.jpg: byte count wrong
 http://i0.XXXimg.cn/lx/fa/p/2008/0805/U1562P8T1D750071F915DT20080805114645.jpg:
 byte count wrong
 http://image2.XXX.com.cn/download/ebook/2004-11-22/U74P176T13D5119F197DT20041122113641.gif:
 byte count wrong
 ......
 290320 fetches, 500 max parallel, 4.36283e+09 bytes, in 600 seconds
 15027.6 mean bytes/connection
 483.866 fetches/sec, 7.27137e+06 bytes/sec
 msecs/connect: 46.4545 mean, 6003.24 max, 0.219 min
 msecs/first-response: 788.459 mean, 59985.5 max, 0.349 min
 574 timeouts
 2398 bad byte counts
 HTTP response codes:
   code 200 -- 285218
   code 301 -- 7
   code 400 -- 2
   code 403 -- 939
   code 404 -- 1968
   code 503 -- 112
 }}}
 log info:
 {{{
 Sep 23 06:28:38 HP08020020 varnishd[25386]: child (25387) Started
 Sep 23 06:28:38 HP08020020 varnishd[25386]: Child (25387) said Closed fds:
 7 8 11 12 14 15
 Sep 23 06:28:38 HP08020020 varnishd[25386]: Child (25387) said Child
 starts
 Sep 23 06:28:38 HP08020020 varnishd[25386]: Child (25387) said managed to
 mmap 32212254720 bytes of 32212254720
 Sep 23 06:28:38 HP08020020 last message repeated 3 times
 Sep 23 06:28:38 HP08020020 varnishd[25386]: Child (25387) said Ready
 Sep 23 08:42:34 HP08020020 varnishd[25386]: Child (25387) not responding
 to ping, killing it.
 Sep 23 08:42:34 HP08020020 kernel: pid 25387 (varnishd), uid 65534: exited
 on signal 3
 Sep 23 08:42:34 HP08020020 varnishd[25386]: child (31998) Started
 Sep 23 08:42:35 HP08020020 varnishd[25386]: Child (31998) said Closed fds:
 7 8 11 12 14 15
 Sep 23 08:42:35 HP08020020 varnishd[25386]: Child (31998) said Child
 starts
 Sep 23 08:42:35 HP08020020 varnishd[25386]: Child (31998) said managed to
 mmap 32212254720 bytes of 32212254720
 Sep 23 08:42:35 HP08020020 last message repeated 3 times
 Sep 23 08:42:35 HP08020020 varnishd[25386]: Child (31998) said Ready
 Sep 23 08:43:19 HP08020020 varnishd[25386]: Child (31998) not responding
 to ping, killing it.
 Sep 23 08:43:49 HP08020020 last message repeated 3 times
 Sep 23 08:43:52 HP08020020 kernel: pid 31998 (varnishd), uid 65534: exited
 on signal 3
 Sep 23 08:43:52 HP08020020 varnishd[25386]: child (32040) Started
 Sep 23 08:43:53 HP08020020 varnishd[25386]: Child (32040) said Closed fds:
 7 8 11 12 14 15
 Sep 23 08:43:53 HP08020020 varnishd[25386]: Child (32040) said Child
 starts
 Sep 23 08:43:53 HP08020020 varnishd[25386]: Child (32040) said managed to
 mmap 32212254720 bytes of 32212254720
 Sep 23 08:43:53 HP08020020 last message repeated 3 times
 Sep 23 08:43:53 HP08020020 varnishd[25386]: Child (32040) said Ready
 Sep 23 08:44:21 HP08020020 cfexecd[667]:  Cannot mail to
 root at sys.XXXX.com.cn.
 Sep 23 08:44:37 HP08020020 varnishd[25386]: Child (32040) not responding
 to ping, killing it.
 Sep 23 08:45:17 HP08020020 last message repeated 4 times
 Sep 23 08:45:19 HP08020020 kernel: pid 32040 (varnishd), uid 65534: exited
 on signal 3
 Sep 23 08:45:19 HP08020020 varnishd[25386]: child (32100) Started
 Sep 23 08:45:19 HP08020020 varnishd[25386]: Child (32100) said Closed fds:
 7 8 11 12 14 15
 Sep 23 08:45:19 HP08020020 varnishd[25386]: Child (32100) said Child
 starts
 Sep 23 08:45:19 HP08020020 varnishd[25386]: Child (32100) said managed to
 mmap 32212254720 bytes of 32212254720
 Sep 23 08:45:19 HP08020020 last message repeated 3 times
 Sep 23 08:45:19 HP08020020 varnishd[25386]: Child (32100) said Ready
 Sep 23 08:47:04 HP08020020 varnishd[25386]: Child (32100) not responding
 to ping, killing it.
 Sep 23 08:47:54 HP08020020 last message repeated 5 times
 Sep 23 08:47:58 HP08020020 kernel: pid 32100 (varnishd), uid 65534: exited
 on signal 3
 Sep 23 08:47:58 HP08020020 varnishd[25386]: child (32210) Started
 Sep 23 08:47:58 HP08020020 varnishd[25386]: Child (32210) said Closed fds:
 7 8 11 12 14 15
 Sep 23 08:47:58 HP08020020 varnishd[25386]: Child (32210) said Child
 starts
 Sep 23 08:47:58 HP08020020 varnishd[25386]: Child (32210) said managed to
 mmap 32212254720 bytes of 32212254720
 Sep 23 08:47:58 HP08020020 last message repeated 3 times
 Sep 23 08:47:58 HP08020020 varnishd[25386]: Child (32210) said Ready
 }}}
 system run status:
 {{{
         system block,not response for a while and child killed and
 restarted again, threads reached about 550. After child restarted again,
 varnishstat show a little 'Client connections accepted'
         top command status:
           PID USERNAME    THR PRI NICE   SIZE    RES STATE  C   TIME
 WCPU COMMAND
         32100 nobody      555  44    0   120G  1425M STOP   2   0:27
 0.00% varnishd
         gstat command status:
             0    129     78   3325    8.8     51    815    6.5   42.7|
 ufs/data0
             0    143     77   2941    8.8     66   1055   10.1   49.7|
 ufs/data1
             5    112     57   2110    8.7     55    879    7.9   37.7|
 ufs/data2
             4    141     78   2861   11.0     63   1007    8.5   46.6|
 ufs/data3
 }}}
 varnish run command:
 {{{
                 varnishd -a 0.0.0.0:80 -f /usr/local/etc/vcl.conf \
                 -u nobody -g nobody \
                 -T 127.0.0.1:8080 \
                 -w 10,1000,600 \
                 -s file,/data0/varnish/v,30G \
                 -s file,/data1/varnish/v,30G \
                 -s file,/data2/varnish/v,30G \
                 -s file,/data3/varnish/v,30G \
                 -h classic,500009 \
                 -p thread_pools=4 \
                 -p listen_depth=4096 \
                 -p lru_interval=3600 \
                 -p ping_interval=10
 }}}
-- 
Ticket URL: <http://varnish.projects.linpro.no/ticket/275#comment:11>
Varnish <http://varnish.projects.linpro.no/>
The Varnish HTTP Accelerator
    
    
More information about the varnish-bugs
mailing list