[Varnish] #1083: Persistent Varnish crashes since using bans and lurker

Varnish varnish-bugs at varnish-cache.org
Mon Oct 29 12:18:02 CET 2012


#1083: Persistent Varnish crashes since using bans and lurker
-------------------------+---------------------
 Reporter:  rmohrbacher  |       Owner:  martin
     Type:  defect       |      Status:  new
 Priority:  high         |   Milestone:
Component:  varnishd     |     Version:  3.0.2
 Severity:  major        |  Resolution:
 Keywords:               |
-------------------------+---------------------
Description changed by tfheen:

Old description:

> We run a farm of three Varnish instances with persistent storage (-s
> persistent,/cms/varnish_cache/persistent/varnish_storage.bin,204800M).
>
> These instances have been running for 3 months without any crashes (they
> are not yet in production, but have been put through several stress tests).
>
> A few days ago we started using bans and the ban lurker (lurker-friendly
> bans via: ban("obj.http.x-url ~ " + req.url);).
> We issue about 250 bans/hour.
>
> Now we have a serious problem: the Varnish instances crash after a few
> hours. Curiously, all three instances crash at the same moment, even
> though they run on three different servers!
>
> The following excerpt from syslog suggests that there is a problem with
> an invalid ban:
>

> Jan  9 19:40:32 ece-fe1 /var/lib/varnish/persistent[19622]: Child (19623)
> said CHK(0x7f91ffd261a0 BAN 2 0x7f34724f4000 <invalid>) = 1
> Jan  9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (19623)
> died signal=6
> Jan  9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (19623)
> Panic message: Missing errorhandling code in smp_append_sign(),
> storage_persistent_subr.c line 128:#012  Condition((smp_chk_sign(ctx)) ==
> 0) not true.thread = (cache-worker)#012ident =
> Linux,2.6.32-131.2.1.el6.x86_64,x86_64,-spersistent,-smalloc,-hcritbit,epoll#012Backtrace:#012
> 0x42c7a6: /usr/sbin/varnishd() [0x42c7a6]#012  0x44a346:
> /usr/sbin/varnishd(smp_append_sign+0x126) [0x44a346]#012  0x447b6d:
> /usr/sbin/varnishd(SMP_NewBan+0x3d) [0x447b6d]#012  0x4125c7:
> /usr/sbin/varnishd(BAN_Insert+0x1a7) [0x4125c7]#012  0x433bd5:
> /usr/sbin/varnishd(VRT_ban_string+0xc5) [0x433bd5]#012  0x7f91f39fa4be:
> ./vcl.PNU3fGhs.so(+0x24be) [0x7f91f39fa4be]#012  0x433863:
> /usr/sbin/varnishd(VCL_recv_method+0x43) [0x433863]#012  0x417c22:
> /usr/sbin/varnishd(CNT_Session+0xb62) [0x417c22]#012  0x42efb8:
> /usr/sbin/varnishd() [0x42efb8]#012  0x42e19b: /usr/sbin/varnishd()
> [0x42e19b]#012sp = 0x7f91ed4ab008 {#012  fd = 15, id = 15, xid =
> 683670119,#012  client = 172.27.70.103 36115,#012  step = STP_RECV,#012
> handling = deliver,#012  restarts = 0, esi_level = 0#012  flags = #012
> bodystatus = 4#012  ws = 0x7f91ed4ab080 { #012    id = "sess",#012
> {s,f,r,e} = {0x7f91ed4abc90,+56,(nil),+65536},#012  },#012  http[req] =
> {#012    ws = 0x7f91ed4ab080[sess]#012      "PURGE",#012
> "105867846",#012      "HTTP/1.0",#012  },#012  worker = 0x7f91ef1faa80
> {#012    ws = 0x7f91ef1facc0 { #012      id = "wrk",#012      {s,f,r,e} =
> {0x7f91ef1e8a30,+32,(nil),+65536},#012    },#012    },#012    vcl = {#012
> srcname = {#012        "input",#012        "Default",#012      },#012
> },#012},#012
> Jan  9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: child (6907)
> Started
> Jan  9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Pushing vcls
> failed:#012CLI communication error (hdr)
> Jan  9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (6907)
> died signal=6
> Jan  9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (6907)
> Panic message: Assert error in smp_open(), storage_persistent.c line
> 320:#012  Condition((smp_valid_silo(sc)) == 0) not true.#012thread =
> (cache-main)#012ident =
> Linux,2.6.32-131.2.1.el6.x86_64,x86_64,-spersistent,-smalloc,-hcritbit,no_waiter#012Backtrace:#012
> 0x42c7a6: /usr/sbin/varnishd() [0x42c7a6]#012  0x44756a:
> /usr/sbin/varnishd() [0x44756a]#012  0x444d57:
> /usr/sbin/varnishd(STV_open+0x27) [0x444d57]#012  0x42b525:
> /usr/sbin/varnishd(child_main+0xc5) [0x42b525]#012  0x43d5ec:
> /usr/sbin/varnishd() [0x43d5ec]#012  0x43de7c: /usr/sbin/varnishd()
> [0x43de7c]#012  0x7f92015684c7: /usr/lib64/varnish/libvarnish.so(+0x94c7)
> [0x7f92015684c7]#012  0x7f9201568b58:
> /usr/lib64/varnish/libvarnish.so(vev_schedule+0x88) [0x7f9201568b58]#012
> 0x43d7c2: /usr/sbin/varnishd(MGT_Run+0x132) [0x43d7c2]#012  0x44cacb:
> /usr/sbin/varnishd(main+0xd1b) [0x44cacb]#012
> Jan  9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (-1)
> said Child starts
> Jan  9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (-1)
> said CHK(0x7f91ffd26120 BAN 1 0x7f34723f4000 BAN 1) = 4
> Jan  9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (-1)
> said CHK(0x7f91ffd261a0 BAN 2 0x7f34724f4000 <invalid>) = 1

New description:

 We run a farm of three Varnish instances with persistent storage (-s
 persistent,/cms/varnish_cache/persistent/varnish_storage.bin,204800M).

 These instances have been running for 3 months without any crashes (they
 are not yet in production, but have been put through several stress tests).

 A few days ago we started using bans and the ban lurker (lurker-friendly
 bans via: ban("obj.http.x-url ~ " + req.url);).
 We issue about 250 bans/hour.

 Now we have a serious problem: the Varnish instances crash after a few
 hours. Curiously, all three instances crash at the same moment, even
 though they run on three different servers!

 The following excerpt from syslog suggests that there is a problem with
 an invalid ban:

 {{{
 Jan  9 19:40:32 ece-fe1 /var/lib/varnish/persistent[19622]: Child (19623)
 said CHK(0x7f91ffd261a0 BAN 2 0x7f34724f4000 <invalid>) = 1
 Jan  9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (19623)
 died signal=6
 Jan  9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (19623)
 Panic message: Missing errorhandling code in smp_append_sign(),
 storage_persistent_subr.c line 128:#012  Condition((smp_chk_sign(ctx)) ==
 0) not true.thread = (cache-worker)#012ident =
 Linux,2.6.32-131.2.1.el6.x86_64,x86_64,-spersistent,-smalloc,-hcritbit,epoll#012Backtrace:#012
 0x42c7a6: /usr/sbin/varnishd() [0x42c7a6]#012  0x44a346:
 /usr/sbin/varnishd(smp_append_sign+0x126) [0x44a346]#012  0x447b6d:
 /usr/sbin/varnishd(SMP_NewBan+0x3d) [0x447b6d]#012  0x4125c7:
 /usr/sbin/varnishd(BAN_Insert+0x1a7) [0x4125c7]#012  0x433bd5:
 /usr/sbin/varnishd(VRT_ban_string+0xc5) [0x433bd5]#012  0x7f91f39fa4be:
 ./vcl.PNU3fGhs.so(+0x24be) [0x7f91f39fa4be]#012  0x433863:
 /usr/sbin/varnishd(VCL_recv_method+0x43) [0x433863]#012  0x417c22:
 /usr/sbin/varnishd(CNT_Session+0xb62) [0x417c22]#012  0x42efb8:
 /usr/sbin/varnishd() [0x42efb8]#012  0x42e19b: /usr/sbin/varnishd()
 [0x42e19b]#012sp = 0x7f91ed4ab008 {#012  fd = 15, id = 15, xid =
 683670119,#012  client = 172.27.70.103 36115,#012  step = STP_RECV,#012
 handling = deliver,#012  restarts = 0, esi_level = 0#012  flags = #012
 bodystatus = 4#012  ws = 0x7f91ed4ab080 { #012    id = "sess",#012
 {s,f,r,e} = {0x7f91ed4abc90,+56,(nil),+65536},#012  },#012  http[req] =
 {#012    ws = 0x7f91ed4ab080[sess]#012      "PURGE",#012
 "105867846",#012      "HTTP/1.0",#012  },#012  worker = 0x7f91ef1faa80
 {#012    ws = 0x7f91ef1facc0 { #012      id = "wrk",#012      {s,f,r,e} =
 {0x7f91ef1e8a30,+32,(nil),+65536},#012    },#012    },#012    vcl = {#012
 srcname = {#012        "input",#012        "Default",#012      },#012
 },#012},#012
 Jan  9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: child (6907)
 Started
 Jan  9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Pushing vcls
 failed:#012CLI communication error (hdr)
 Jan  9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (6907)
 died signal=6
 Jan  9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (6907)
 Panic message: Assert error in smp_open(), storage_persistent.c line
 320:#012  Condition((smp_valid_silo(sc)) == 0) not true.#012thread =
 (cache-main)#012ident =
 Linux,2.6.32-131.2.1.el6.x86_64,x86_64,-spersistent,-smalloc,-hcritbit,no_waiter#012Backtrace:#012
 0x42c7a6: /usr/sbin/varnishd() [0x42c7a6]#012  0x44756a:
 /usr/sbin/varnishd() [0x44756a]#012  0x444d57:
 /usr/sbin/varnishd(STV_open+0x27) [0x444d57]#012  0x42b525:
 /usr/sbin/varnishd(child_main+0xc5) [0x42b525]#012  0x43d5ec:
 /usr/sbin/varnishd() [0x43d5ec]#012  0x43de7c: /usr/sbin/varnishd()
 [0x43de7c]#012  0x7f92015684c7: /usr/lib64/varnish/libvarnish.so(+0x94c7)
 [0x7f92015684c7]#012  0x7f9201568b58:
 /usr/lib64/varnish/libvarnish.so(vev_schedule+0x88) [0x7f9201568b58]#012
 0x43d7c2: /usr/sbin/varnishd(MGT_Run+0x132) [0x43d7c2]#012  0x44cacb:
 /usr/sbin/varnishd(main+0xd1b) [0x44cacb]#012
 Jan  9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (-1)
 said Child starts
 Jan  9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (-1)
 said CHK(0x7f91ffd26120 BAN 1 0x7f34723f4000 BAN 1) = 4
 Jan  9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (-1)
 said CHK(0x7f91ffd261a0 BAN 2 0x7f34724f4000 <invalid>) = 1
 }}}
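
 Note for readers: the ban statement quoted in the description is the
 lurker-friendly form. The following is a minimal sketch of how such a
 setup is commonly written in Varnish 3.0 VCL; it is not the reporter's
 actual configuration. The PURGE trigger is inferred from the request shown
 in the panic output, and the x-url header name from the ban expression.
 The vcl_fetch and vcl_deliver parts and the "error 200" response are
 assumptions added for illustration.

```vcl
sub vcl_fetch {
    # Copy the request URL onto the cached object so that a ban can
    # match on obj.* alone (assumed; not shown in the ticket).
    set beresp.http.x-url = req.url;
}

sub vcl_recv {
    if (req.request == "PURGE") {
        # Ban statement as quoted in the ticket. req.url is used as a
        # regular expression here, so special characters are not escaped.
        ban("obj.http.x-url ~ " + req.url);
        error 200 "Banned";
    }
}

sub vcl_deliver {
    # Hide the helper header from clients (optional, assumed).
    unset resp.http.x-url;
}
```

 Because the ban expression refers only to obj.* fields, the ban lurker can
 test it against cached objects in the background without needing a client
 request.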

-- 
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1083#comment:2>
Varnish <https://varnish-cache.org/>
The Varnish HTTP Accelerator



