Ticket #310 (closed defect: invalid)

Opened 18 months ago

Last modified 5 weeks ago

WS_Reserve panic + error

Reported by: sky Owned by: phk
Priority: lowest Milestone: Varnish 2.1 release
Component: varnishd Version: trunk
Severity: normal Keywords:
Cc:

Description

Sep 3 00:20:20 varnish2 varnishd[25355]: Child (25376) died signal=6 Sep 3 00:20:20 varnish2 varnishd[25355]: Child (25376) Panic message: Assert error in WS_Reserve(), cache_ws.c line 156: Condition(ws->r == NULL) not true. thread = (cache-worker)sp = 0x14a64fc8 { fd = 1507, id = 1507, xid = 684175780, client = 72.81.232.3:1299, step = 0x8, handling = 0x0, err_code = 301, err_reason = (null), ws = 0x14a65038 { id = "sess", {s,f,r,e} = {0x14a65760+947,(nil),+32768}, }, worker = 0x2aaee1db0c10 { }, vcl = { srcname = { "/home/artur/wikia-beta.vcl", "Default", }, }, obj = 0x2aabad2f3000 { refcnt = 3, xid = 676411472, ws = 0x2aabad2f3028 { id = "obj", {s,f,r,e} = {0x2aabad2f3348+530,(nil),+31928}, }, http = { ws = 0x2aabad2f3028 { id = "obj", {s,f,r,e} = {0x2aabad2f3348+530,(nil),+31928}, }, hd = { "Date: Tue, 02 Sep 2008 20:44:22 GMT", "Server: Apache", "X-Powered-By: PHP/5.2.6", "Content-language: en", Sep 3 00:20:20 varnish2 varnishd[25355]: mgt_child_inherit(7, sock) Sep 3 00:20:20 varnish2 varnishd[25355]: mgt_child_inherit(8, sock) Sep 3 00:20:20 varnish2 varnishd[25355]: mgt_child_inherit(9, cli_in) Sep 3 00:20:20 varnish2 varnishd[25355]: mgt_child_inherit(12, cli_out) Sep 3 00:20:20 varnish2 varnishd[25355]: child (28791) Started Sep 3 00:20:20 varnish2 varnishd[25355]: mgt_child_inherit(9, (null)) Sep 3 00:20:20 varnish2 varnishd[25355]: mgt_child_inherit(12, (null)) Sep 3 00:20:20 varnish2 varnishd[25355]: mgt_child_inherit(7, (null)) Sep 3 00:20:20 varnish2 varnishd[25355]: mgt_child_inherit(8, (null)) Sep 3 00:20:20 varnish2 varnishd[25355]: Pushing vcls failed: dlopen(./vcl.1P9zoqAU.so): ./vcl.1P9zoqAU.so: cannot open shared object file: No such file or directory Sep 3 00:20:20 varnish2 varnishd[25355]: Child (28791) said Closed fds: 5 6 10 11 13 14 Sep 3 00:20:20 varnish2 varnishd[25355]: Child (28791) said Child starts Sep 3 00:20:20 varnish2 varnishd[25355]: Child (28791) said managed to mmap 15032385536 bytes of 15032385536 Sep 3 00:20:20 varnish2 varnishd[25355]: Child (28791) said Ready

No core dump seem to have survived.

Any suggestion for what to look for?

Change History

Changed 18 months ago by sky

Sep  3 00:20:20 varnish2 varnishd[25355]: Child (25376) died signal=6
Sep  3 00:20:20 varnish2 varnishd[25355]: Child (25376) Panic message: Assert error in WS_Reserve(), cache_ws.c line 156:   Condition(ws->r == NULL) not true.  
thread = (cache-worker)sp = 0x14a64fc8 {   fd = 1507,
id = 1507, 
xid = 684175780,   
client = 72.81.232.3:1299,
step = 0x8,   
handling = 0x0,   
err_code = 301, 
err_reason = (null),   
ws = 0x14a65038 {
     id = "sess",
     {s,f,r,e} = {0x14a65760,,+947,(nil),+32768},
   },    
 worker = 0x2aaee1db0c10 {     },    
 vcl = {       srcname = {         "/home/artur/wikia-beta.vcl",         "Default",       },     },   
 
obj = 0x2aabad2f3000 {
    refcnt = 3, xid = 676411472,
    ws = 0x2aabad2f3028 {
      id = "obj",
      {s,f,r,e} = {0x2aabad2f3348,,+530,(nil),+31928},
    },
    http = {
      ws = 0x2aabad2f3028 {
        id = "obj",
        {s,f,r,e} = {0x2aabad2f3348,,+530,(nil),+31928},
      },
      hd = {
        "Date: Tue, 02 Sep 2008 20:44:22 GMT",
        "Server: Apache",
        "X-Powered-By: PHP/5.2.6",
        "Content-language: en",
        "X-Time-CPU-Time: 0.951 0.601",
        "Vary: Accept-Encoding,Cookie",
        "X-Vary-Options: Cookie;string-contains=wikicitiesUserID;string-contains=wikicities_session,Accept-Encoding;list-contains=gzip",   
        "Cache-Control: s-maxage=18000, must-revalidate, max-age=0",
        "Last-modified: Wed, 27 Aug 2008 09:04:40 GMT",
        "Content-Encoding: gzip",
        "Content-Type: text/html; charset=utf-8",
        "Content-Length: 22225",
        "X-Cacheable: YES",
        "X-Cache: HIT",
      },
    },
    len = 22225,
    store = {
      22225 {
        1f 8b 08 00 00 00 00 00 00 03 ed bd fb 77 db 36 |.............w.6|
        16 27 fe 73 72 4e ff 07 44 dd 36 c9 34 b2 5e 7e |.'.srN..D.6.4.^~|
        bf ba 8e ed 24 9e c9 c3 63 bb cd cc b6 3d 3a 94 |....$...c....=:.|
        44 4b ac 29 51 43 51 76 dc d9 fe ef fb b9 78 10 |DK.)QCQv......x.|
        [22161 more]
      },
    },
  },
},
Sep  3 00:20:20 varnish2 varnishd[25355]: mgt_child_inherit(7, sock)
Sep  3 00:20:20 varnish2 varnishd[25355]: mgt_child_inherit(8, sock)
Sep  3 00:20:20 varnish2 varnishd[25355]: mgt_child_inherit(9, cli_in)
Sep  3 00:20:20 varnish2 varnishd[25355]: mgt_child_inherit(12, cli_out)
Sep  3 00:20:20 varnish2 varnishd[25355]: child (28791) Started
Sep  3 00:20:20 varnish2 varnishd[25355]: mgt_child_inherit(9, (null))
Sep  3 00:20:20 varnish2 varnishd[25355]: mgt_child_inherit(12, (null))
Sep  3 00:20:20 varnish2 varnishd[25355]: mgt_child_inherit(7, (null))
Sep  3 00:20:20 varnish2 varnishd[25355]: mgt_child_inherit(8, (null))
Sep  3 00:20:20 varnish2 varnishd[25355]: Pushing vcls failed: dlopen(./vcl.1P9zoqAU.so): ./vcl.1P9zoqAU.so: cannot open shared object file: No such file or directory 
Sep  3 00:20:20 varnish2 varnishd[25355]: Child (28791) said Closed fds: 5 6 10 11 13 14
Sep  3 00:20:20 varnish2 varnishd[25355]: Child (28791) said Child starts
Sep  3 00:20:20 varnish2 varnishd[25355]: Child (28791) said managed to mmap 15032385536 bytes of 15032385536
Sep  3 00:20:20 varnish2 varnishd[25355]: Child (28791) said Ready


Changed 18 months ago by sky

backend default {
        .host = "127.0.0.1";
        .port = "80";
}

backend wikia {
        .host = "xxxx"; 
        .port = "80";
}

backend armchair {
        .host = "xxxx";
        .port = "80";
}

backend gamewikis {
        .host = "xxxx";
        .port = "80";
}

backend images { 
        .host = "xxxx";
        .port = "80";
}

# Below is a commented-out copy of the default VCL logic.  If you
# redefine any of these subroutines, the built-in logic will be
# appended to your code.

## Called when a client request is received
#
sub vcl_recv {


        if (req.http.Accept-Encoding) {
                if (req.http.Accept-Encoding ~ "gzip") {
                    set req.http.Accept-Encoding = "gzip";
                } elsif (req.http.Accept-Encoding ~ "deflate") {
                    set req.http.Accept-Encoding = "deflate";
                } else {
                    # unkown algorithm
                    remove req.http.Accept-Encoding;
                }
        }

        # clean out requests sent via curls -X mode and LWP
        if (req.url ~ "http://") {
                set req.url = regsub(req.url, "http://[^/]*","");
        }


        if (req.request == "PURGE") {
          if (req.http.purgeurl) {
            purge_hash(req.http.purgeurl); 
            error 200 "purged";
          } else {
            error 503 "empty purgeurl";
          }
        }

        if (req.url == "/lvscheck.html") {
                error 200 "varnish is okay";
        }

        if(req.http.host ~ "^(nwn|oblivion|meta|war|gw)$") {
                set req.backend = wikia;
        } elsif(req.http.host ~ "(gw.wikia.com|gamewikis.org)$") {
                set req.backend = gamewikis;
        } elsif(req.http.host ~ "^siwiki.sportsillustrated.cnn.com$") {
                set req.backend = armchair;
        } elsif(req.http.host ~ "^thirdturn.armchairgm.com$") {
                set req.backend = wikia;
        } elsif(req.http.host ~ "armchairgm.com$") {
                set req.backend = armchair;
        } elsif(req.http.host ~ "images.wikia.com") {
                set req.backend = images;
        } else {
                set req.backend = wikia;
        }

        if (req.url ~ "/__utm.gif") {
                set req.url = "/__utm.gif";
        }

        if (req.url ~ "/rx") {
              error 200 "not serving this";
        }


        if (req.http.host == "216.151.156.11") {
                set req.http.origurl = req.url;
        }


        if (req.request != "GET" && req.request != "HEAD" && req.request != "PURGE") {
                pipe;
        }
        if (req.http.Expect) {
                pipe;
        }

        if (req.http.User-Agent ~ "Opera") {
                pipe;
        }

        if (req.http.Cookie ~ "UserID") {
                set req.http.tmp_userid  = regsuball(req.http.Cookie, "(.*?)(^|;|\s)(.*UserID=[^;]*).*", "\3; ");
        } else { 
                set req.http.tmp_userid = " ";
        }
        if (req.http.Cookie ~ "UserName") {
                set req.http.tmp_username  = regsuball(req.http.Cookie, "(.*?)(^|;|\s)(.*UserName=[^;]*).*", "\3; ");
        } else { 
                set req.http.tmp_username = " ";
        }
        if (req.http.Cookie ~ "session") {
                set req.http.tmp_session  = regsub(req.http.Cookie, "(.*)(^|;|\s)(.*session=[^;]*).*", "\3; ");
        } else { 
                set req.http.tmp_session = " ";
        }
        if (req.http.Cookie ~ "Token") {
                set req.http.tmp_token  = regsub(req.http.Cookie, "(.*)(^|;|\s)(.*Token=[^;]*).*", "\3; ");
        } else { 
                set req.http.tmp_token = " ";
        }
        if (req.http.Cookie ~ "LoggedOut") {
                set req.http.tmp_loggedout  = regsub(req.http.Cookie, "(.*)(^|;|\s)(.*LoggedOut=[^;]*).*", "\3; ");
        } else { 
                set req.http.tmp_loggedout = " ";
        }
        set req.http.Cookie = "";
        set req.http.Cookie = req.http.tmp_userid " ; " req.http.tmp_session " ; " req.http.tmp_token " ; " req.http.tmp_username " ; " req.http.tmp_loggedout;
        remove req.http.tmp_userid;
        remove req.http.tmp_session;
        remove req.http.tmp_token;
        remove req.http.tmp_loggedout;
        remove req.http.tmp_username;

        if (req.http.Authenticate) {
                pass;
        }
        lookup;
}


sub vcl_hash {
        set req.hash += req.url;
        set req.hash += req.http.host;
        hash;
}

#
## Called when entering pipe mode
#
#sub vcl_pipe {
#       pipe;
#}
#
## Called when entering pass mode
#
#sub vcl_pass {
#        set resp.http.foo = "x";
#       pass;
#}
#
## Called when entering an object into the cache
#


#
## Called when the requested object was found in the cache
#
sub vcl_hit {
       if (req.request == "PURGE") {
                error 200 "purged";
        }

        if (!obj.cacheable) {
                pass;
        }
        if (obj.http.X-Cache == "MISS") {
                set obj.http.X-Cache = "HIT";
        }

        deliver;
}
#
## Called when the requested object was not found in the cache
#
sub vcl_miss {
       if (req.request == "PURGE") {
#               nuke;
                error 200 "purged";
        }

        fetch;
}
#
## Called when the requested object has been retrieved from the
## backend, or the request to the backend has failed
#
sub vcl_fetch {
        set obj.http.X-Cache = "MISS";
        # this is the old wow ip, so issue redirect
        if (req.http.host == "216.151.156.11") {
                set obj.http.origurl = req.http.origurl;
        }

        if (!obj.cacheable) {
                set obj.http.X-Cacheable = "NO:Not-Cacheable";
                pass;
        }
        if (obj.http.Cache-Control ~ "private") {
                if(req.http.Cookie ~"(UserID|_session)") {
                        set obj.http.X-Cacheable = "NO:Got Session";
                } else {
                        set obj.http.X-Cacheable = "NO:Cache-Control=private";
                }
                pass;
        }
        if (obj.http.Set-Cookie ~ "(UserID|_session)") {
                set obj.http.X-Cacheable = "NO:Set-Cookie";
                pass;
        }

        set obj.http.X-Cacheable = "YES";
        set obj.grace = 10s;
        deliver;
}

sub vcl_prefetch {
        pass;
}

#
#
## Called before a cached object is delivered to the client
#
sub vcl_deliver {

        set resp.http.X-Served-By = "varnish2";

        # this is the old wow ip, so issue redirect
        if (resp.http.origurl) {
                set resp.http.Location = regsub(resp.http.origurl, "^" , "http://www.wowwiki.com");
                set resp.status = 301;
                set resp.response = "Moved Permanently";
                deliver;
        }



    set resp.http.Cache-Control = "private, s-maxage=0, max-age=0, must-revalidate";
    set resp.http.Expires = "Thu, 01 Jan 1970 00:00:00 GMT";
    if(!resp.http.Vary) {
        set resp.http.Vary = "Accept-Encoding,Cookie";
        }
    deliver;
}

Changed 18 months ago by sky

started as such

bin/varnishd/varnishd -w 500,2000 -a :80 -T localhost:6082 -f /home/artur/wikia-beta.vcl -u varnish -g varnish -s file,/var/lib/varnish/varnish_storage.bin,14G -p obj_workspace=32768  -p sess_workspace=32768 -p thread_pools=2 -p listen_depth=8192   -p ping_interval=0 -p log_hashstring=on -F

Changed 18 months ago by phk

  • status changed from new to closed
  • resolution set to worksforme

It looks like you ran out of workspace for http modifications, as far as I can tell you need to increase sess_workspace param.

Changed 18 months ago by sky

  • status changed from closed to reopened
  • resolution worksforme deleted

The path it took was via recv/hash/hit/deliver -- the sess_workspace is already 32kb, there is nothing very different from all the other requests. I am rather surprised how it could eat up 32768 bytes of work space for a hit

The second bug would be failing to reload the vcl after the crash.

Changed 18 months ago by sky

(gdb) bt
#0  0x0000003482e30055 in raise () from /lib64/libc.so.6
#1  0x0000003482e31af0 in abort () from /lib64/libc.so.6
#2  0x0000000000419bc2 in pan_ic (func=<value optimized out>, file=<value optimized out>, line=<value optimized out>,     cond=<value optimized out>, err=64, xxx=<value optimized out>) at cache_panic.c:325
#3  0x0000000000426054 in WS_Reserve (ws=0x2aab8d8b9028, bytes=0) at cache_ws.c:156
#4  0x0000000000420ce6 in vrt_assemble_string (hp=dwarf2_read_address: Corrupted DWARF expression.) at cache_vrt.c:131
#5  0x0000000000422af5 in VRT_SetHdr (sp=0xb122b68, where=<value optimized out>, hdr=0x2aae30291417 "\bX-Cache:",     p=0x2aae30291426 "HIT") at cache_vrt.c:179
#6  0x00002aae302903b1 in ?? ()
#7  0x00002aaeee5c4c10 in ?? ()
#8  0x000000000b122b68 in ?? ()
#9  0x000000000b122b68 in ?? ()
#10 0x000000000041f466 in VCL_hit_method (sp=0xb122b68) at ../../include/vcl_returns.h:42
#11 0x000000000040dfd8 in cnt_hit (sp=0xb122b68) at cache_center.c:518
#12 0x000000000040f8d4 in CNT_Session (sp=0x2aaeee5c2b90) at steps.h:40
#13 0x000000000041b761 in wrk_do_cnt_sess (w=0x2aaeee5c4c10, priv=<value optimized out>) at cache_pool.c:363
#14 0x000000000041a8a7 in wrk_thread (priv=0x3744e20) at cache_pool.c:276
#15 0x0000003483e062f7 in start_thread () from /lib64/libpthread.so.0#16 0x0000003482ece85d in clone () from /lib64/libc.so.6

Changed 18 months ago by phk

  • owner changed from des to phk
  • status changed from reopened to new
  • component changed from build to varnishd
  • milestone set to After 2.0

I guess this is a "Ah-ha" bug.

We cannot safely assign a string variable that is built on the object workspace in vcl_hit{} because the object is not locked.

If two worker threads tries to do it at the same time, they will collide like we see here.

I don't think this will be fixed for 2.0, it will have to be an errata.

Changed 18 months ago by phk

  • priority changed from normal to lowest

Changed 9 months ago by kb

Just as a note to others, this issue also occurs if you set obj headers in vcl_fetch(). That's a good place to mark the cached object with new headers with useful information (say, the original URL if you're rewriting), so IMHO it would be nice to have this functionality at some point.

Changed 5 weeks ago by phk

  • status changed from new to closed
  • resolution set to invalid

The new allocation policy for objects, forced on us by the persistent stevedore, makes it impossible to change objects headers once they are created.

In response to this, we have made a number of changes to the VCL, including the move in vcl_fetch from obj.* to beresp.*.

In this light, this ticket is now invalid, in the sense that it has been overtaken by events.

Note: See TracTickets for help on using tickets.