[Varnish] #678: varnishd stops accepting requests

Varnish varnish-bugs at varnish-cache.org
Mon Apr 26 13:44:04 CEST 2010


#678: varnishd stops accepting requests
----------------------+-----------------------------------------------------
 Reporter:  ahongens  |       Owner:  phk                 
     Type:  defect    |      Status:  new                 
 Priority:  high      |   Milestone:                      
Component:  varnishd  |     Version:  trunk               
 Severity:  major     |    Keywords:  hang stop responding
----------------------+-----------------------------------------------------
Description changed by phk:

Old description:

> I have four balancers that ran 2.0.5 fine for months, and now I've
> upgraded them to 2.1.0, and sometimes one (at random which one) seems to
> hang.
>
> Varnishstat shows no requests coming in, and when I run varnishlog I only
> see a lot of lines like this:
>
>  8045 SessionClose - dropped
>  8045 StatSess     - (null) (null) 1271244853 0 0 0 0 0 0 0
>  8045 SessionClose - dropped
>  8045 StatSess     - (null) (null) 1271244853 0 0 0 0 0 0 0
>  8045 SessionClose - dropped
>  8045 StatSess     - (null) (null) 1271244853 0 0 0 0 0 0 0
>  8045 SessionClose - dropped
>  8045 StatSess     - (null) (null) 1271244853 0 0 0 0 0 0 0
>  8045 SessionClose - dropped
>  8045 StatSess     - (null) (null) 1271244853 0 0 0 0 0 0 0
>  8045 SessionClose - dropped
>  8045 StatSess     - (null) (null) 1271244853 0 0 0 0 0 0 0
>  8045 SessionClose - dropped
>  8045 StatSess     - (null) (null) 1271244853 0 0 0 0 0 0 0
>  8045 SessionClose - dropped
>  8045 StatSess     - (null) (null) 1271244853 0 0 0 0 0 0 0
>
> I don't see anything else that is strange.. The only thing I see in my
> cacti monitoring is that responses go down, and active tcp connections go
> up (probably as a result). Load of the balancers prior to the problem is
> a normal 0.2-0.5.
>
> After restaring, in my syslog I see (strange the time is off by 2 hours
> though, time was 13:34, all other daemons log ok)
>
> Apr 14 11:34:23 nmt-nlb-04 varnishd[54351]: Manager got SIGINT
> Apr 14 11:34:23 nmt-nlb-04 varnishd[54351]: Stopping Child
> Apr 14 11:34:37 nmt-nlb-04 varnishd[65642]: child (65643) Started
> Apr 14 11:34:37 nmt-nlb-04 varnishd[65642]: Child (65643) said Closed
> fds: 4 5 6 7 11 12 14 15
> Apr 14 11:34:37 nmt-nlb-04 varnishd[65642]: Child (65643) said Child
> starts
> Apr 14 11:34:37 nmt-nlb-04 varnishd[65642]: Child (65643) said managed to
> mmap 8589934592 bytes of 8589934592
>
> I cannot reproduce it.

New description:

 I have four balancers that ran 2.0.5 fine for months, and now I've
 upgraded them to 2.1.0, and sometimes one (at random which one) seems to
 hang.

 Varnishstat shows no requests coming in, and when I run varnishlog I only
 see a lot of lines like this:
 {{{
  8045 SessionClose - dropped
  8045 StatSess     - (null) (null) 1271244853 0 0 0 0 0 0 0
  8045 SessionClose - dropped
  8045 StatSess     - (null) (null) 1271244853 0 0 0 0 0 0 0
  8045 SessionClose - dropped
  8045 StatSess     - (null) (null) 1271244853 0 0 0 0 0 0 0
  8045 SessionClose - dropped
  8045 StatSess     - (null) (null) 1271244853 0 0 0 0 0 0 0
  8045 SessionClose - dropped
  8045 StatSess     - (null) (null) 1271244853 0 0 0 0 0 0 0
  8045 SessionClose - dropped
  8045 StatSess     - (null) (null) 1271244853 0 0 0 0 0 0 0
  8045 SessionClose - dropped
  8045 StatSess     - (null) (null) 1271244853 0 0 0 0 0 0 0
  8045 SessionClose - dropped
  8045 StatSess     - (null) (null) 1271244853 0 0 0 0 0 0 0
 }}}
 I don't see anything else that is strange.. The only thing I see in my
 cacti monitoring is that responses go down, and active tcp connections go
 up (probably as a result). Load of the balancers prior to the problem is a
 normal 0.2-0.5.

 After restaring, in my syslog I see (strange the time is off by 2 hours
 though, time was 13:34, all other daemons log ok)
 {{{
 Apr 14 11:34:23 nmt-nlb-04 varnishd[54351]: Manager got SIGINT
 Apr 14 11:34:23 nmt-nlb-04 varnishd[54351]: Stopping Child
 Apr 14 11:34:37 nmt-nlb-04 varnishd[65642]: child (65643) Started
 Apr 14 11:34:37 nmt-nlb-04 varnishd[65642]: Child (65643) said Closed fds:
 4 5 6 7 11 12 14 15
 Apr 14 11:34:37 nmt-nlb-04 varnishd[65642]: Child (65643) said Child
 starts
 Apr 14 11:34:37 nmt-nlb-04 varnishd[65642]: Child (65643) said managed to
 mmap 8589934592 bytes of 8589934592
 }}}
 I cannot reproduce it.

--

-- 
Ticket URL: <http://www.varnish-cache.org/ticket/678#comment:1>
Varnish <http://varnish-cache.org/>
The Varnish HTTP Accelerator




More information about the varnish-bugs mailing list