[Varnish] #796: ban lurker deadlock in varnish 2.1.3

Varnish varnish-bugs at varnish-cache.org
Sat Oct 16 02:02:27 CEST 2010


#796: ban lurker deadlock in varnish 2.1.3
------------------------+---------------------------------------------------
 Reporter:  ryan.krebs  |       Owner:  phk                
     Type:  defect      |      Status:  new                
 Priority:  normal      |   Milestone:                     
Component:  varnishd    |     Version:  2.1.3              
 Severity:  normal      |    Keywords:  ban lurker deadlock
------------------------+---------------------------------------------------
 On CentOS 5.5 (Linux 2.6.18), varnishd occasionally hangs (about once a
 week on any one of three production servers) in what looks like a deadlock
 between the ban lurker and a worker thread doing a cache lookup.  If the
 ban lurker starts processing an object at the same time that a request is
 looking up that object in the cache, each thread can end up holding one of
 ban_mtx and oh->mtx while waiting for the other.  A full backtrace is
 attached, but the relevant threads appear to be:

 {{{
 the ban lurker thread, which would already have ban_mtx locked at this
 point:

 #0  0x000000390a80d4c4 in __lll_lock_wait () from /lib64/libpthread.so.0
 #1  0x000000390a808e1a in _L_lock_1034 () from /lib64/libpthread.so.0
 #2  0x000000390a808cdc in pthread_mutex_lock () from /lib64/libpthread.so.0
 #3  0x0000000000421a69 in Lck__Lock ()
 #4  0x000000000041afea in HSH_FindBan ()
 #5  0x0000000000410b43 in ban_lurker ()
 #6  0x0000000000424429 in wrk_bgthread ()
 #7  0x000000390a80673d in start_thread () from /lib64/libpthread.so.0
 #8  0x000000390a0d3d1d in clone () from /lib64/libc.so.6
 }}}
 and

 {{{
 a large number of other threads, each of which has locked its own
 oh->mtx and is trying to lock ban_mtx:

 #0  0x000000390a80d4c4 in __lll_lock_wait () from /lib64/libpthread.so.0
 #1  0x000000390a808e1a in _L_lock_1034 () from /lib64/libpthread.so.0
 #2  0x000000390a808cdc in pthread_mutex_lock () from /lib64/libpthread.so.0
 #3  0x0000000000421a69 in Lck__Lock ()
 #4  0x000000000040f926 in ban_check_object ()
 #5  0x000000000041c42f in HSH_Lookup ()
 #6  0x0000000000411810 in cnt_lookup ()
 #7  0x0000000000413ce4 in CNT_Session ()
 #8  0x0000000000424668 in wrk_do_cnt_sess ()
 #9  0x000000000042396e in wrk_thread_real ()
 #10 0x000000390a80673d in start_thread () from /lib64/libpthread.so.0
 #11 0x000000390a0d3d1d in clone () from /lib64/libc.so.6
 }}}

 ban_lurker_sleep is currently set to 0.0005.
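
 The two backtraces suggest a lock-order inversion: the lurker takes
 ban_mtx and then wants oh->mtx, while the lookup path takes oh->mtx and
 then wants ban_mtx.  The following is only a minimal sketch of that
 pattern and of one common way around it (try-lock and back off on the
 lurker side).  It is not Varnish code and not necessarily how this ticket
 should be fixed: the two mutexes merely borrow the names from the report,
 and worker(), lurker(), run_demo() and the counters are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/* Hypothetical stand-ins for the two locks named in the report. */
static pthread_mutex_t ban_mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t oh_mtx  = PTHREAD_MUTEX_INITIALIZER;

enum { ROUNDS = 10000 };
static long lookups_done;     /* written by worker only */
static long lurker_passes;    /* written by lurker only */
static long lurker_backoffs;  /* written by lurker only */

/* Lookup path: oh_mtx first, then ban_mtx (blocking on both). */
static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < ROUNDS; i++) {
        pthread_mutex_lock(&oh_mtx);
        pthread_mutex_lock(&ban_mtx);
        lookups_done++;
        pthread_mutex_unlock(&ban_mtx);
        pthread_mutex_unlock(&oh_mtx);
    }
    return NULL;
}

/*
 * Lurker path: ban_mtx first, then oh_mtx, i.e. the opposite order.
 * With a blocking lock on oh_mtx this could deadlock against worker().
 * Here the second lock is only tried; on contention the lurker drops
 * ban_mtx, yields, and retries later.
 */
static void *lurker(void *arg)
{
    (void)arg;
    int done = 0;
    while (done < ROUNDS) {
        pthread_mutex_lock(&ban_mtx);
        if (pthread_mutex_trylock(&oh_mtx) == 0) {
            lurker_passes++;
            done++;
            pthread_mutex_unlock(&oh_mtx);
            pthread_mutex_unlock(&ban_mtx);
        } else {
            lurker_backoffs++;
            pthread_mutex_unlock(&ban_mtx);
            sched_yield();
        }
    }
    return NULL;
}

/* Runs both threads to completion; returns 0 if every round finished. */
int run_demo(void)
{
    pthread_t w, l;
    if (pthread_create(&w, NULL, worker, NULL) != 0)
        return -1;
    if (pthread_create(&l, NULL, lurker, NULL) != 0)
        return -1;
    pthread_join(w, NULL);
    pthread_join(l, NULL);
    return (lookups_done == ROUNDS && lurker_passes == ROUNDS) ? 0 : 1;
}
```

 Calling run_demo() from a small main() and building with -pthread should
 finish without hanging.  Replacing the pthread_mutex_trylock() call with a
 blocking pthread_mutex_lock() gives the same opposite-order pattern as in
 the traces above and can hang in the same way.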

-- 
Ticket URL: <http://varnish-cache.org/trac/ticket/796>
Varnish <http://varnish-cache.org/>
The Varnish HTTP Accelerator



