varnish (varnish-2.1.5 SVN 0843d7a) crashes on FreeBSD (8.3-RELEASE-p3)

hsc hsc at nostromo.joeh.org
Fri Jan 25 11:04:09 CET 2013


Hello list,


I experience regular, crashes with my varnish setup. Unfortunately I 
cannot reproduce or trigger them. I tried to get clues on what causes the
crashes by debugging the core files, but I am not an expert on gdb and
debugging. I followed the guidelines on "https://www.varnish-cache.org\
/trac/wiki/DebuggingVarnish". Here's the information on the setup:

1) Include the exact version of Varnish you are using, and on which OS/OS version.

    varnishd (varnish-2.1.5 SVN 0843d7a) running on "FreeBSD 8.3-RELEASE-p3" with
    a "GENERIC amd64" kernel.
    
    $ ldd `which varnishd`
    /usr/local/sbin/varnishd:
        libvarnish.so.1 => /usr/local/lib/libvarnish.so.1 (0x80069c000)
        librt.so.1 => /usr/lib//librt.so.1 (0x8007ae000)
        libpcre.so.0 => /usr/local/lib/compat/pkg/libpcre.so.0 (0x8008b3000)
        libvarnishcompat.so.1 => /usr/local/lib/libvarnishcompat.so.1 (0x8009e5000)
        libvcl.so.1 => /usr/local/lib/libvcl.so.1 (0x800af3000)
        libthr.so.3 => /lib/libthr.so.3 (0x800c0d000)
        libm.so.5 => /lib/libm.so.5 (0x800d26000)
        libc.so.7 => /lib/libc.so.7 (0x800e46000)

    $ ulimit -c
    unlimited
 

2) Include the VCL used.

    The VCL config is quite huge, around 1200 lines of configuration.
    Which sections of the config files are of specific interest so I
    can provide them here?
    I think narrowing it down makes more sense than pasting it all.


3) Include the assert message.

    No assert message was given.
    
    But in syslog i noticed this every time varnishd crashed:
    Some usual system messages logged to the syslog, with a timestamp of 14:00. Then
    the timestamps jump back one hour (CEST vs. CET maybe?), followed by varnish log-
    entries, resulting in a crash with a core-dump. As soon as varnishd has crashed,
    log-entries show the correct timestamps of 14:00 again, which is the "normal" time.

    13:59:02 cache [some usual and generic system log-entries]
    13:59:56 cache last message repeated 2 times
    14:00:04 cache kernel: pid 5421 (varnishd), uid 80: exited on signal 11
    13:00:05 cache varnishd[1455]: child (19494) Started
    13:00:05 cache varnishd[1455]: Child (19494) said
    13:00:05 cache varnishd[1455]: Child (19494) said Child starts
    13:00:05 cache varnishd[1455]: Child (19494) said Probe("GET /ping HTTP/1.1
    13:00:05 cache varnishd[1455]: Child (19494) said
    13:00:05 cache varnishd[1455]: Child (19494) said Host: mysite.com
    13:00:05 cache varnishd[1455]: Child (19494) said
    13:00:05 cache varnishd[1455]: Child (19494) said Connection: close
    13:00:05 cache varnishd[1455]: Child (19494) said
    13:00:05 cache varnishd[1455]: Child (19494) said Accept-Encoding: foo/bar
    13:00:05 cache varnishd[1455]: Child (19494) said
    13:00:05 cache last message repeated 2 times
    13:00:05 cache varnishd[1455]: Child (19494) said ", 15, 1)
    13:00:05 cache varnishd[1455]: Child (19494) said Probe("GET /ping HTTP/1.1
    13:00:05 cache varnishd[1455]: Child (19494) said
    13:00:05 cache varnishd[1455]: Child (19494) said Host: mysite.com
    13:00:05 cache varnishd[1455]: Child (19494) said
    13:00:05 cache varnishd[1455]: Child (19494) said Connection: close
    13:00:05 cache varnishd[1455]: Child (19494) said
    13:00:05 cache varnishd[1455]: Child (19494) said Accept-Encoding: foo/bar
    13:00:05 cache varnishd[1455]: Child (19494) said
    13:00:05 cache last message repeated 2 times
    13:00:05 cache varnishd[1455]: Child (19494) said ", 15, 1)
    13:00:05 cache varnishd[1455]: Child (19494) said Probe("GET /ping HTTP/1.1
    13:00:05 cache varnishd[1455]: Child (19494) said
    13:00:05 cache varnishd[1455]: Child (19494) said Host: mysite.com
    13:00:05 cache varnishd[1455]: Child (19494) said
    13:00:05 cache varnishd[1455]: Child (19494) said Connection: close
    13:00:05 cache varnishd[1455]: Child (19494) said
    13:00:05 cache varnishd[1455]: Child (19494) said Accept-Encoding: foo/bar
    13:00:05 cache varnishd[1455]: Child (19494) said
    13:00:05 cache last message repeated 2 times

    [ ...the ping stuff syslog-entries show up here repeatedly... ]

    13:00:30 cache last message repeated 2 times
    13:00:30 cache varnishd[1455]: Child (19925) said ", 65, 60)
    13:00:30 cache varnishd[1455]: Child (19925) said Probe("GET / HTTP/1.1
    13:00:30 cache varnishd[1455]: Child (19925) said
    13:00:30 cache varnishd[1455]: Child (19925) said Host: mysite.com
    13:00:30 cache varnishd[1455]: Child (19925) said
    13:00:30 cache varnishd[1455]: Child (19925) said Connection: close
    13:00:30 cache varnishd[1455]: Child (19925) said
    13:00:30 cache varnishd[1455]: Child (19925) said Accept-Encoding: foo/bar
    13:00:30 cache varnishd[1455]: Child (19925) said
    13:00:30 cache last message repeated 2 times
    13:00:30 cache varnishd[1455]: Child (19925) said ", 65, 60)
    14:00:46 cache last message repeated 3 times
    14:01:19 cache last message repeated 2 times

    The "/ping" URL is for varnishd's happiness (via "varnishadm -T :81 debug.health").


4) Include a backtrace of the core-dump.

    Steps I have done so far: compile an unstripped varnishd, collect core-dumps.
    So currently there are five core-dumps I can examine. Looking at them I do
    see that "cache_ws.c" pops up every time a crash occurs. Thus I tried having
    a closer look at it via "frame <frame>" and "print *ws", but I have no idea
    how to read the output. Also 'shm_workspace=Variable "shm_workspace" is not
    available.' is another message that pops up. A search for these things though
    did not give me clues on what might be wrong so far. Maybe you could give me
    some hints on what to take a closer look reading those backtraces.

    - core::1 ------------------------------------------------------------
    Core was generated by `varnishd'.
    Program terminated with signal 11, Segmentation fault.
    #0  0x0000000800a1a39b in ?? ()
    (gdb) bt
    #0  0x0000000800a1a39b in ?? ()
    #1  0x00000000004451c3 in __func__.5737 ()
    #2  0x0000000000000003 in ?? ()
    #3  0x0000000000000000 in ?? ()
    #4  0x00007fff84223e58 in ?? ()
    #5  0x00000008012144c0 in ?? ()
    #6  0x00000008153182e0 in ?? ()
    #7  0x0000000000425c66 in pan_ic (func=Variable "func" is not available.
    ) at cache_panic.c:273
    #8  0x00000000004214f8 in http_SetH (to=0x8153182e0, n=3, fm=0x0) at cache_http.c:656
    #9  0x000000000042f49a in VRT_l_obj_status (sp=0x818248008, num=302) at cache_vrt.c:282
    #10 0x000000080720d39e in ?? ()
    #11 0x0000000815318220 in ?? ()
    #12 0x00000000000001c8 in ?? ()
    #13 0x00000008182493d9 in ?? ()
    #14 0x0000000000000004 in ?? ()
    #15 0x0000000000000270 in ?? ()
    #16 0x00000000004323b5 in WS_Alloc (ws=0x7fff84210570, bytes=355566304) at cache_ws.c:130
    #17 0x00000008153182e0 in ?? ()
    #18 0x00007fff84223d00 in ?? ()
    #19 0x00007fff84223e58 in ?? ()
    #20 0x0000000818248008 in ?? ()
    #21 0x000000000042aafb in VCL_error_method (sp=0x818248008) at vcl_returns.h:61
    #22 0x000000000041425a in cnt_error (sp=0x818248008) at cache_center.c:370
    #23 0x0000000000414d79 in CNT_Session (sp=0x818248008) at steps.h:43
    #24 0x00000000004282c1 in wrk_do_cnt_sess (w=0x7fff84223d00, priv=Variable "priv" is not available.
    ) at cache_pool.c:294
    #25 0x0000000000427547 in wrk_thread_real (qp=0x8012144c0, shm_workspace=Variable "shm_workspace" is not available.
    ) at cache_pool.c:183
    #26 0x0000000800c3d511 in ?? ()
    #27 0x0000000000000000 in ?? ()
    #28 0x0000000000000000 in ?? ()
    #29 0x0000000000000000 in ?? ()
    #30 0x0000000000000000 in ?? ()
    #31 0x0000000000000000 in ?? ()
    #32 0x00007fffff9fdf90 in ?? ()
    #33 0x0000000000000000 in ?? ()
    #34 0x0000000000000000 in ?? ()
    Cannot access memory at address 0x7fff84224000
    (gdb) frame 16
    #16 0x00000000004323b5 in WS_Alloc (ws=0x7fff84210570, bytes=355566304) at cache_ws.c:130
    130             WS_Assert(ws);
    (gdb) print *ws
    $1 = {magic = 405045256, id = 0x42aafb "\213½@\001", s = 0x7fff84223d00 "ϭ\221c", f = 0x7fff84223e58 "T5", r = 0x818248008 "Z\234/,p\002", e = 0x41425a "\213\205@\001", overflow = 745434957}
    ----------------------------------------------------------------------

    - core::2 ------------------------------------------------------------
    Core was generated by `varnishd'.
    Program terminated with signal 11, Segmentation fault.
    #0  0x0000000800a1a39b in ?? ()
    (gdb) bt
    #0  0x0000000800a1a39b in ?? ()
    #1  0x00000000004451c3 in __func__.5737 ()
    #2  0x00000000000054a8 in ?? ()
    #3  0x00000000000054a8 in ?? ()
    #4  0x0000000811872cc8 in ?? ()
    #5  0x0000000000000000 in ?? ()
    #6  0x0000000811862080 in ?? ()
    #7  0x0000000000425c66 in pan_ic (func=Variable "func" is not available.
    ) at cache_panic.c:273
    #8  0x000000000043208c in WS_Release (ws=0x811862080, bytes=21672) at cache_ws.c:193
    #9  0x000000000042b84e in vrt_assemble_string (hp=0x8118623b8, h=0x0, p=Variable "p" is not available.
    ) at cache_vrt.c:182
    #10 0x000000000042f575 in vrt_do_string (w=0x7ffe86a3bd00, fd=395, hp=0x8118623b8, fld=1, err=0x44a7b0 "resp.response", 
        p=0x811868348 "/newsfeed/detail/nachrichten_nachrichten_rss.xml?0.8671897399282649?0.7050130921254134?0.5526925072390788?0.04943594061651574?0.2816845032576461?0.2989058246531162?0.5712539436218447?0.798410399952849"..., ap=0x7ffe86a28490) at cache_vrt.c:224
    #11 0x0000000000430200 in VRT_l_req_url (sp=0x811862008, 
        p=0x811868348 "/newsfeed/detail/nachrichten_nachrichten_rss.xml?0.8671897399282649?0.7050130921254134?0.5526925072390788?0.04943594061651574?0.2816845032576461?0.2989058246531162?0.5712539436218447?0.798410399952849"...) at cache_vrt.c:255
    #12 0x0000000806c626c4 in ?? ()
    #13 0x00000000000000d0 in ?? ()
    #14 0x0000000811862008 in ?? ()
    #15 0x00007ffe86a3be58 in ?? ()
    #16 0x0000000000000000 in ?? ()
    #17 0x00007ffe86a285d0 in ?? ()
    #18 0x0000000806c627a6 in ?? ()
    #19 0x0000000811862008 in ?? ()
    #20 0x0000000811862008 in ?? ()
    #21 0x00007ffe86a3be58 in ?? ()
    #22 0x0000000000000000 in ?? ()
    #23 0x0000000811862008 in ?? ()
    #24 0x000000000042b378 in VCL_recv_method (sp=0x7ffe86a285a0) at vcl_returns.h:21
    #25 0x0000000000415143 in CNT_Session (sp=0x811862008) at cache_center.c:1070
    #26 0x00000000004282c1 in wrk_do_cnt_sess (w=0x7ffe86a3bd00, priv=Variable "priv" is not available.
    ) at cache_pool.c:294
    #27 0x0000000000427547 in wrk_thread_real (qp=0x801214740, shm_workspace=Variable "shm_workspace" is not available.
    ) at cache_pool.c:183
    #28 0x0000000800c3d511 in ?? ()
    #29 0x0000000000000000 in ?? ()
    #30 0x0000000000000000 in ?? ()
    #31 0x0000000000000000 in ?? ()
    #32 0x0000000000000000 in ?? ()
    #33 0x0000000000000000 in ?? ()
    #34 0x00007fffff9fdf90 in ?? ()
    #35 0x0000000000000000 in ?? ()
    #36 0x0000000000000000 in ?? ()
    Cannot access memory at address 0x7ffe86a3c000
    ----------------------------------------------------------------------

    - core::3 ------------------------------------------------------------
    Core was generated by `varnishd'.
    Program terminated with signal 11, Segmentation fault.
    #0  0x0000000800a1a39b in ?? ()
    (gdb) bt
    #0  0x0000000800a1a39b in ?? ()
    #1  0x00000000004451c3 in __func__.5737 ()
    #2  0x0000000000000003 in ?? ()
    #3  0x0000000000000000 in ?? ()
    #4  0x00007fffca855e58 in ?? ()
    #5  0x00000008012e1380 in ?? ()
    #6  0x00000008216e2ce0 in ?? ()
    #7  0x0000000000425c66 in pan_ic (func=Variable "func" is not available.
    ) at cache_panic.c:273
    #8  0x00000000004214f8 in http_SetH (to=0x8216e2ce0, n=3, fm=0x0) at cache_http.c:656
    #9  0x000000000042f49a in VRT_l_obj_status (sp=0x81b8bd008, num=302) at cache_vrt.c:282
    #10 0x00000008010bd35e in ?? ()
    #11 0x00000008216e2c20 in ?? ()
    #12 0x00000000000001c8 in ?? ()
    #13 0x000000081b8be541 in ?? ()
    #14 0x0000000000000004 in ?? ()
    #15 0x00000000000000bf in ?? ()
    #16 0x00000000004323b5 in WS_Alloc (ws=0x7fffca840570, bytes=560868576) at cache_ws.c:130
    #17 0x00000008216e2ce0 in ?? ()
    #18 0x00007fffca855d00 in ?? ()
    #19 0x00007fffca855e58 in ?? ()
    #20 0x000000081b8bd008 in ?? ()
    #21 0x000000000042aafb in VCL_error_method (sp=0x81b8bd008) at vcl_returns.h:61
    #22 0x000000000041425a in cnt_error (sp=0x81b8bd008) at cache_center.c:370
    #23 0x0000000000414d79 in CNT_Session (sp=0x81b8bd008) at steps.h:43
    #24 0x00000000004282c1 in wrk_do_cnt_sess (w=0x7fffca855d00, priv=Variable "priv" is not available.
    ) at cache_pool.c:294
    #25 0x0000000000427547 in wrk_thread_real (qp=0x8012e1380, shm_workspace=Variable "shm_workspace" is not available.
    ) at cache_pool.c:183
    #26 0x0000000800c3d511 in ?? ()
    #27 0x0000000000000000 in ?? ()
    #28 0x0000000000000000 in ?? ()
    #29 0x0000000000000000 in ?? ()
    #30 0x0000000000000000 in ?? ()
    #31 0x0000000000000000 in ?? ()
    #32 0x00007fffff9fdf90 in ?? ()
    #33 0x0000000000000000 in ?? ()
    #34 0x0000000000000000 in ?? ()
    Cannot access memory at address 0x7fffca856000
    ----------------------------------------------------------------------

    - core::4 ------------------------------------------------------------
    Core was generated by `varnishd'.
    Program terminated with signal 11, Segmentation fault.
    #0  0x0000000800a13d5f in ?? ()
    (gdb) bt
    #0  0x0000000800a13d5f in ?? ()
    #1  0x00007fff43408430 in ?? ()
    #2  0x0000000800a13d9c in ?? ()
    #3  0x00000000004451c3 in __func__.5737 ()
    #4  0x0000000000000000 in ?? ()
    #5  0x00000000000001f7 in ?? ()
    #6  0x00007fff4341de58 in ?? ()
    #7  0x00000008012e14c0 in ?? ()
    #8  0x0000000000000004 in ?? ()
    #9  0x0000000000425c66 in pan_ic (func=Variable "func" is not available.
    ) at cache_panic.c:273
    #10 0x000000000042e3df in VRT_int_string (sp=Variable "sp" is not available.
    ) at cache_vrt.c:900
    #11 0x00000008010bd4de in ?? ()
    #12 0x0000000823264420 in ?? ()
    #13 0x0000000000000018 in ?? ()
    #14 0x0000000000448a88 in __func__.6447 ()
    #15 0x0000000000000004 in ?? ()
    #16 0x000000000000001d in ?? ()
    #17 0x00000000004323b5 in WS_Alloc (ws=0x7fff43408570, bytes=589711816) at cache_ws.c:130
    #18 0x00000008232644e0 in ?? ()
    #19 0x00007fff4341dd00 in ?? ()
    #20 0x00007fff4341de58 in ?? ()
    #21 0x000000080871b008 in ?? ()
    #22 0x000000000042aafb in VCL_error_method (sp=0x80871b008) at vcl_returns.h:61
    #23 0x000000000041425a in cnt_error (sp=0x80871b008) at cache_center.c:370
    #24 0x0000000000414d79 in CNT_Session (sp=0x80871b008) at steps.h:43
    #25 0x00000000004282c1 in wrk_do_cnt_sess (w=0x7fff4341dd00, priv=Variable "priv" is not available.
    ) at cache_pool.c:294
    #26 0x0000000000427547 in wrk_thread_real (qp=0x8012e14c0, shm_workspace=Variable "shm_workspace" is not available.
    ) at cache_pool.c:183
    #27 0x0000000800c3d511 in ?? ()
    #28 0x0000000000000000 in ?? ()
    #29 0x0000000000000000 in ?? ()
    #30 0x0000000000000000 in ?? ()
    #31 0x0000000000000000 in ?? ()
    #32 0x0000000000000000 in ?? ()
    #33 0x00007fffff9fdf90 in ?? ()
    #34 0x0000000000000000 in ?? ()
    #35 0x0000000000000000 in ?? ()
    Cannot access memory at address 0x7fff4341e000
    ----------------------------------------------------------------------

    - core::5 ------------------------------------------------------------
    #0  0x0000000800a1a39b in ?? ()
    #1  0x00000000004451c3 in __func__.5737 ()
    #2  0x0000000000000003 in ?? ()
    #3  0x0000000000000000 in ?? ()
    #4  0x00007fff03e23e58 in ?? ()
    #5  0x00000008012e1560 in ?? ()
    #6  0x00000008264deae0 in ?? ()
    #7  0x0000000000425c66 in pan_ic (func=Variable "func" is not available.
    ) at cache_panic.c:273
    #8  0x00000000004214f8 in http_SetH (to=0x8264deae0, n=3, fm=0x0) at cache_http.c:656
    #9  0x000000000042f49a in VRT_l_obj_status (sp=0x80b925008, num=302) at cache_vrt.c:282
    #10 0x00000008010bd47e in ?? ()
    #11 0x00000008264dea20 in ?? ()
    #12 0x00000000000001c8 in ?? ()
    #13 0x000000080b9262e1 in ?? ()
    #14 0x0000000000000004 in ?? ()
    #15 0x0000000000000054 in ?? ()
    #16 0x00000000004323b5 in WS_Alloc (ws=0x7fff03e0e570, bytes=642640608) at cache_ws.c:130
    #17 0x00000008264deae0 in ?? ()
    #18 0x00007fff03e23d00 in ?? ()
    #19 0x00007fff03e23e58 in ?? ()
    #20 0x000000080b925008 in ?? ()
    #21 0x000000000042aafb in VCL_error_method (sp=0x80b925008) at vcl_returns.h:61
    #22 0x000000000041425a in cnt_error (sp=0x80b925008) at cache_center.c:370
    #23 0x0000000000414d79 in CNT_Session (sp=0x80b925008) at steps.h:43
    #24 0x00000000004282c1 in wrk_do_cnt_sess (w=0x7fff03e23d00, priv=Variable "priv" is not available.
    ) at cache_pool.c:294
    #25 0x0000000000427547 in wrk_thread_real (qp=0x8012e1560, shm_workspace=Variable "shm_workspace" is not available.
    ) at cache_pool.c:183
    #26 0x0000000800c3d511 in ?? ()
    #27 0x0000000000000000 in ?? ()
    #28 0x0000000000000000 in ?? ()
    #29 0x0000000000000000 in ?? ()
    #30 0x0000000000000000 in ?? ()
    #31 0x0000000000000000 in ?? ()
    #32 0x00007fffff9fdf90 in ?? ()
    #33 0x0000000000000000 in ?? ()
    #34 0x0000000000000000 in ?? ()
    Cannot access memory at address 0x7fff03e24000
    ----------------------------------------------------------------------

I am happy to provide more information if needed. Please let me know. 
Thanks.

br, hsc


More information about the varnish-misc mailing list