[Varnish] #489: Assert error in BAN_CheckObject()

Varnish varnish-bugs at projects.linpro.no
Sat May 9 12:07:50 CEST 2009


#489: Assert error in BAN_CheckObject()
----------------------+-----------------------------------------------------
 Reporter:  waschtl   |        Owner:  phk  
     Type:  defect    |       Status:  new  
 Priority:  normal    |    Milestone:       
Component:  varnishd  |      Version:  trunk
 Severity:  major     |   Resolution:       
 Keywords:            |  
----------------------+-----------------------------------------------------
Old description:

> The following problem has been observed in r3965, as well as r4033:
>
> Periodically, at irregular intervals, varnish on a production server will
> issue the following error, and restart. As you know, this causes varnish
> to lose the cache, and subsequently the website is not very usable for
> several minutes afterwards.
>
> {{{
> Apr 15 22:41:49 <hidden-hostname> varnishd[11662]: Child (11663) died
> signal=6
> Apr 15 22:41:49 <hidden-hostname> varnishd[11662]: Child (11663) Panic
> message: Assert error in BAN_CheckObject(), cache_ban.c line 423:
> Condition((o->ban) != NULL) not true.  thread = (cache-worker)sp =
> 0x2aaaf504b008 {   fd = 9, id = 9, xid = 1986591368,   client =
> 127.0.0.1:44680,   step = STP_LOOKUP,   handling = hash,   ws =
> 0x2aaaf504b070 {      id = "sess",     {s,f,r,e} =
> {0x2aaaf504b800,,+297,(nil),+16384},   },     worker = 0x4600a430 {
> },     vcl = {       srcname = {         "input",         "Default",
> },     }, },
> Apr 15 22:41:49 <hidden-hostname> varnishd[11662]: Child cleanup complete
> Apr 15 22:41:49 <hidden-hostname> varnishd[11662]: child (20389) Started
> Apr 15 22:41:49 <hidden-hostname> varnishd[11662]: Child (20389) said
> Closed fds: 4 5 6 9 10 12 13
> Apr 15 22:41:49 <hidden-hostname> varnishd[11662]: Child (20389) said
> Child starts
> Apr 15 22:41:49 <hidden-hostname> varnishd[11662]: Child (20389) said
> managed to mmap 1073741824 bytes of 1073741824
> Apr 15 22:41:49 <hidden-hostname> varnishd[11662]: Child (20389) said
> Ready
> }}}
>
> This was previously also observed at the following times:
>
> {{{
> * r3965
> Mar 10 06:21:27
> Mar 18 10:37:45
> Mar 28 07:59:44
> Mar 28 11:52:40
> Mar 30 16:41:45
> Apr  1 11:41:25
> Apr  6 09:47:34
> Apr  6 15:29:26
> Apr  6 21:10:02
> Apr  7 10:50:43
> Apr  7 22:05:46
> Apr  9 06:56:42
> Apr  9 13:16:57
> Apr  9 19:53:26
> * update to r4033
> Apr 15 22:41:49
> }}}
>
> Varnish is serving a plone website using pound for loadbalancing. The
> website serves up to 80,000 GET-requests per hour.
>
> Varnish is running on a dual-quad-core machine using Intel E5420
> processors @2.5 GHz with 16 GB RAM, where varnish has a 1 GB cache.
>
> If you need further specifics to debug the problem, I will gladly provide
> them.

New description:

 The following problem has been observed in r3965, as well as r4033:

 Periodically, at irregular intervals, varnish on a production server will
 issue the following error, and restart. As you know, this causes varnish
 to lose the cache, and subsequently the website is not very usable for
 several minutes afterwards.

 {{{
 Apr 15 22:41:49 <hidden-hostname> varnishd[11662]: Child (11663) died
 signal=6
 Apr 15 22:41:49 <hidden-hostname> varnishd[11662]: Child (11663)
  Panic message: Assert error in BAN_CheckObject(), cache_ban.c line 423:
    Condition((o->ban) != NULL) not true.
   thread = (cache-worker)sp = 0x2aaaf504b008 {
    fd = 9, id = 9, xid = 1986591368,
    client = 127.0.0.1:44680,
    step = STP_LOOKUP,
    handling = hash,
    ws = 0x2aaaf504b070 {
       id = "sess",
      {s,f,r,e} = {0x2aaaf504b800,,+297,(nil),+16384},
    },
      worker = 0x4600a430 {
      },
      vcl = {
        srcname = {
          "input",
          "Default",
        },
      },
  },
 Apr 15 22:41:49 <hidden-hostname> varnishd[11662]: Child cleanup complete
 Apr 15 22:41:49 <hidden-hostname> varnishd[11662]: child (20389) Started
 Apr 15 22:41:49 <hidden-hostname> varnishd[11662]: Child (20389) said
 Closed fds: 4 5 6 9 10 12 13
 Apr 15 22:41:49 <hidden-hostname> varnishd[11662]: Child (20389) said
 Child starts
 Apr 15 22:41:49 <hidden-hostname> varnishd[11662]: Child (20389) said
 managed to mmap 1073741824 bytes of 1073741824
 Apr 15 22:41:49 <hidden-hostname> varnishd[11662]: Child (20389) said
 Ready
 }}}

 This was previously also observed at the following times:

 {{{
 * r3965
 Mar 10 06:21:27
 Mar 18 10:37:45
 Mar 28 07:59:44
 Mar 28 11:52:40
 Mar 30 16:41:45
 Apr  1 11:41:25
 Apr  6 09:47:34
 Apr  6 15:29:26
 Apr  6 21:10:02
 Apr  7 10:50:43
 Apr  7 22:05:46
 Apr  9 06:56:42
 Apr  9 13:16:57
 Apr  9 19:53:26
 * update to r4033
 Apr 15 22:41:49
 }}}

 Varnish is serving a plone website using pound for loadbalancing. The
 website serves up to 80,000 GET-requests per hour.

 Varnish is running on a dual-quad-core machine using Intel E5420
 processors @2.5 GHz with 16 GB RAM, where varnish has a 1 GB cache.

 If you need further specifics to debug the problem, I will gladly provide
 them.

-- 
Ticket URL: <http://varnish.projects.linpro.no/ticket/489#comment:3>
Varnish <http://varnish.projects.linpro.no/>
The Varnish HTTP Accelerator


More information about the varnish-bugs mailing list