Persistent Varnish crashes since using bans and lurker
Roland Mohrbacher
roland at mohrbacher.eu
Tue Jan 10 11:11:27 CET 2012
Hello all,
we run a farm of three Varnish instances with persistent storage (-s
persistent,/cms/varnish_cache/persistent/varnish_storage.bin,204800M).
These instances have been running for 3 months without any crashes
(currently not in production, but exercised with several stress tests).
A few days ago we started using bans together with the ban lurker
(lurker-friendly bans via: ban("obj.http.x-url ~ " + req.url);).
We issue about 250 bans/hour.
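
For readers unfamiliar with the pattern, here is a minimal sketch of how
lurker-friendly bans are typically wired up in Varnish 3.0 VCL. Only the
ban() expression and the PURGE request method come from this report (the
method appears in the panic output); the vcl_fetch and vcl_deliver parts
and the error response are assumptions for illustration, not the VCL
actually in use.

```vcl
# Sketch only, Varnish 3.0 syntax. Everything except the ban() call and
# the PURGE method is assumed, not taken from the real configuration.

sub vcl_recv {
    if (req.request == "PURGE") {
        # The ban refers only to obj.*, so the ban lurker can evaluate
        # it in the background without a client request.
        ban("obj.http.x-url ~ " + req.url);
        error 200 "Banned";
    }
}

sub vcl_fetch {
    # Assumed: store the request URL on the object so that
    # obj.http.x-url exists for the ban expression to match against.
    set beresp.http.x-url = req.url;
}

sub vcl_deliver {
    # Assumed: hide the helper header from clients.
    unset resp.http.x-url;
}
```

A real setup would normally also restrict PURGE requests with an ACL;
that is left out here to keep the sketch short.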
The big problem now is that the Varnish instances crash after a few hours.
Curiously, all three crash at the same moment, even though they run on
three different servers!
The following excerpt from syslog suggests that there is a problem with an
invalid ban:
Jan 9 19:40:32 ece-fe1 /var/lib/varnish/persistent[19622]: Child
(19623) said CHK(0x7f91ffd261a0 BAN 2 0x7f34724f4000 <invalid>) = 1
Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child
(19623) died signal=6
Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child
(19623) Panic message: Missing errorhandling code in smp_append_sign(),
storage_persistent_subr.c line 128:#012 Condition((smp_chk_sign(ctx))
== 0) not true.thread = (cache-worker)#012ident =
Linux,2.6.32-131.2.1.el6.x86_64,x86_64,-spersistent,-smalloc,-hcritbit,epoll#012Backtrace:#012
0x42c7a6: /usr/sbin/varnishd() [0x42c7a6]#012 0x44a346:
/usr/sbin/varnishd(smp_append_sign+0x126) [0x44a346]#012 0x447b6d:
/usr/sbin/varnishd(SMP_NewBan+0x3d) [0x447b6d]#012 0x4125c7:
/usr/sbin/varnishd(BAN_Insert+0x1a7) [0x4125c7]#012 0x433bd5:
/usr/sbin/varnishd(VRT_ban_string+0xc5) [0x433bd5]#012 0x7f91f39fa4be:
./vcl.PNU3fGhs.so(+0x24be) [0x7f91f39fa4be]#012 0x433863:
/usr/sbin/varnishd(VCL_recv_method+0x43) [0x433863]#012 0x417c22:
/usr/sbin/varnishd(CNT_Session+0xb62) [0x417c22]#012 0x42efb8:
/usr/sbin/varnishd() [0x42efb8]#012 0x42e19b: /usr/sbin/varnishd()
[0x42e19b]#012sp = 0x7f91ed4ab008 {#012 fd = 15, id = 15, xid =
683670119,#012 client = 172.27.70.103 36115,#012 step = STP_RECV,#012
handling = deliver,#012 restarts = 0, esi_level = 0#012 flags = #012
bodystatus = 4#012 ws = 0x7f91ed4ab080 { #012 id = "sess",#012
{s,f,r,e} = {0x7f91ed4abc90,+56,(nil),+65536},#012 },#012 http[req] =
{#012 ws = 0x7f91ed4ab080[sess]#012 "PURGE",#012
"105867846",#012 "HTTP/1.0",#012 },#012 worker = 0x7f91ef1faa80
{#012 ws = 0x7f91ef1facc0 { #012 id = "wrk",#012 {s,f,r,e}
= {0x7f91ef1e8a30,+32,(nil),+65536},#012 },#012 },#012 vcl =
{#012 srcname = {#012 "input",#012
"Default",#012 },#012 },#012},#012
Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: child (6907)
Started
Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Pushing vcls
failed:#012CLI communication error (hdr)
Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (6907)
died signal=6
Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (6907)
Panic message: Assert error in smp_open(), storage_persistent.c line
320:#012 Condition((smp_valid_silo(sc)) == 0) not true.#012thread =
(cache-main)#012ident =
Linux,2.6.32-131.2.1.el6.x86_64,x86_64,-spersistent,-smalloc,-hcritbit,no_waiter#012Backtrace:#012
0x42c7a6: /usr/sbin/varnishd() [0x42c7a6]#012 0x44756a:
/usr/sbin/varnishd() [0x44756a]#012 0x444d57:
/usr/sbin/varnishd(STV_open+0x27) [0x444d57]#012 0x42b525:
/usr/sbin/varnishd(child_main+0xc5) [0x42b525]#012 0x43d5ec:
/usr/sbin/varnishd() [0x43d5ec]#012 0x43de7c: /usr/sbin/varnishd()
[0x43de7c]#012 0x7f92015684c7:
/usr/lib64/varnish/libvarnish.so(+0x94c7) [0x7f92015684c7]#012
0x7f9201568b58: /usr/lib64/varnish/libvarnish.so(vev_schedule+0x88)
[0x7f9201568b58]#012 0x43d7c2: /usr/sbin/varnishd(MGT_Run+0x132)
[0x43d7c2]#012 0x44cacb: /usr/sbin/varnishd(main+0xd1b) [0x44cacb]#012
Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (-1)
said Child starts
Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (-1)
said CHK(0x7f91ffd26120 BAN 1 0x7f34723f4000 BAN 1) = 4
Jan 9 19:40:33 ece-fe1 /var/lib/varnish/persistent[19622]: Child (-1)
said CHK(0x7f91ffd261a0 BAN 2 0x7f34724f4000 <invalid>) = 1
Is this a known problem?
Are there any workarounds for using persistent Varnish together with the
ban lurker?
Best regards
Roland