Another Varnish 1.1.2 crash, in cache_backend.c

Dag-Erling Smørgrav des at linpro.no
Tue Jan 8 17:15:04 CET 2008


Anders Nordby <anders at fupp.net> writes:
> Poul-Henning Kamp <phk at phk.freebsd.dk> writes:
> > Was there any errno information available?  I believe the assert
> > message would have printed it?
> Nope, no assert.

Yes, there was.  Here is the backtrace, which phk missed and which you
edited out of your reply:

#0  0x28129f37 in thr_kill () from /lib/libc.so.6
#1  0x280cf1a5 in pthread_mutex_unlock () from /usr/lib/libthr.so.2
#2  0x280c72ae in raise () from /usr/lib/libthr.so.2
#3  0x281a4b78 in abort () from /lib/libc.so.6
#4  0x280aa0ed in lbv_assert (func=0x806ed70 "VBE_ClosedFd",
    file=0x806ebfb "cache_backend.c", line=357,
    cond=0x806edba "(close(vc->fd)) == 0", err=22) at assert.c:58
#5  0x0804fbdf in VBE_ClosedFd (w=0xb3b3ecf0, vc=0xb1ce040)
    at cache_backend.c:357

errno 22 is EINVAL.
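
For anyone decoding the err= argument in frame #4 by hand, here is a
minimal sketch of a lookup.  describe_errno() is a hypothetical helper,
not something in the Varnish tree, and the numeric value of EINVAL is
platform-specific (it happens to be 22 on FreeBSD and Linux).

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of Varnish): format a numeric errno
 * value, such as the err=22 argument in frame #4, as "errno N: message".
 * Returns the number of characters snprintf() would have written.
 */
static int
describe_errno(int err, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);
	return (snprintf(buf, len, "errno %d: %s", err, strerror(err)));
}
```

Calling describe_errno(22, buf, sizeof buf) fills buf with the number
followed by the platform's message text for that value.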

Similar issues in trunk were addressed by a number of commits
(including r2264 and r2285) which unfortunately can't be merged back
to 1.1, because the code has been completely reorganized in trunk.
They need to be reimplemented for 1.1.

> Got two more of them here:

Actually, those are two almost identical instances of a completely
different bug, a segfault in vbe_sock_conn() due to vbe_conn_try()
passing a NULL pointer:

> (gdb) bt
> #0  0x0000000000408a90 in vbe_sock_conn (ai=0x0) at cache_backend.c:162
> #1  0x0000000000408b98 in vbe_conn_try (bp=0xaf2d00, pai=0x7ffffadd5838)
>     at cache_backend.c:190
> #2  0x0000000000408d14 in vbe_connect (sp=0xb4d008, bp=0xaf2d00)
>     at cache_backend.c:228

The fault lies in the following loop in bin/varnishd/cache_backend.c,
lines 188-196:

        /* Then try the list until the cached last good address */
        for (ai = bp->addr; ai != bp->last_addr; ai = ai->ai_next) {
                s = vbe_sock_conn(ai);
                if (s >= 0) {
                        bp->last_addr = ai;
                        *pai = ai;
                        return (s);
                }
        }

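The loop condition should also check that ai != NULL.  As written, if
bp->last_addr is not reachable from bp->addr, the loop runs off the end
of the list and hands NULL to vbe_sock_conn().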
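
A minimal, self-contained sketch of the guarded loop follows.  It is
not a patch against 1.1: struct addr, struct backend, sock_conn() and
conn_try_head() are simplified stand-ins I made up for struct addrinfo,
the backend struct, vbe_sock_conn() and the first half of
vbe_conn_try().  Only the loop shape is taken from the code quoted
above.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for struct addrinfo (assumption, not Varnish code). */
struct addr {
	int		 ok;		/* 1 if a connect would succeed */
	struct addr	*ai_next;
};

/* Simplified stand-in for the backend struct that bp points to. */
struct backend {
	struct addr	*addr;		/* head of the address list */
	struct addr	*last_addr;	/* cached last good address */
};

/* Stand-in for vbe_sock_conn(): fake fd >= 0 on success, -1 on failure. */
static int
sock_conn(const struct addr *ai)
{
	assert(ai != NULL);
	return (ai->ok ? 3 : -1);
}

/* The quoted loop, with the extra ai != NULL test in the condition. */
static int
conn_try_head(struct backend *bp, struct addr **pai)
{
	struct addr *ai;
	int s;

	assert(bp != NULL && pai != NULL);
	for (ai = bp->addr; ai != NULL && ai != bp->last_addr;
	    ai = ai->ai_next) {
		s = sock_conn(ai);
		if (s >= 0) {
			bp->last_addr = ai;
			*pai = ai;
			return (s);
		}
	}
	/* List exhausted (or last_addr reached) without a connection. */
	return (-1);
}
```

With a last_addr that no longer points into the list, this version
stops at the end of the list and returns -1 instead of calling the
connect function with a NULL pointer.  It does nothing about why
last_addr went stale in the first place.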

There may or may not be a race condition at the bottom of this
(between vbe_conn_try() and something else modifying the backend
struct that bp points to).

DES
-- 
Dag-Erling Smørgrav
Senior Software Developer
Linpro AS - www.linpro.no


