Varnish dying in stevedore.c with lots of free space

Mark Moseley moseleymark at gmail.com
Mon Oct 4 21:52:45 CEST 2010


I get "Panic message: Assert error in STV_alloc(), stevedore.c line
192" a lot. The culprit is almost always huge files (and this is web
hosting so can be almost anything). The example that this came from
was a 920meg Flash video, but I've seen the same thing triggered by
zip/wmv/mpg/rar, i.e. anything that can get huge. Looking in the list
archives, this is usually diagnosed as being out of storage space. In
this case, there was almost certainly space available. This is
sm_bfree in the minutes surrounding this log entry:

Mon Oct  4 11:59:00 2010: 7190153625
Mon Oct  4 12:00:00 2010: 7127413555
Mon Oct  4 12:01:00 2010: 15418954547
Mon Oct  4 12:02:00 2010: 22966786458

The probability that there was also another 6.1gb dropped into the
cache in the same minute is really really low. This is 64-bit Lenny on
Dell Poweredge 1950s, all with 4gb. I've seen this error on disk-based
caches running on regular drives as well as SSDs. I got the same
stevedore.c error on another box over the weekend and that's got a 47
gig varnish cachefile, and I've never even seen that dip less than
10gb free; when this box's varnishd hit that error, it had around 27gb
free in sm_bfree.

It happens regardless of the file type, though it's always very very
large (at least every example I've looked at). I've seen this error a
few times a day whenever I'm using disk-based cache -- with malloc, I
get the stevedore.c error and "Panic message: Assert error in
fetch_straight(), cache_fetch.c line 64", but less often. Is there
anything besides out-of-space that can trigger this?

For the (poorly sanitized) logs below, this is the invocation:

/srv/varnish/varnish-amd64/sbin/varnishd -a :8099 -T :8100 -f
/srv/varnish/etc/varnish/customer.vcl-amd64-ssd -t 0 -l 80m -s
file,/var/cache/varnish/cache,23000M -u nobody -P /var/run/varnishd -p
listen_depth 4096 -p thread_pools 6 -p thread_pool_max 800 -p
thread_pool_min 200 -p lru_interval 60 -p cli_timeout 30 -p
ping_interval 5 -p default_grace 120 -p thread_pool_stack 1048576 -p
cache_vbe_conns on

Anything I can do to further debug this? I'm also entirely receptive
to any recommendations on varnishd command line :)

Thanks!


Oct 4 16:01:25 myserver varnishd[1864]: Child (17243) died signal=6
Oct 4 16:01:25 myserver varnishd[1864]: Child (17243) Panic message:
Assert error in STV_alloc(), stevedore.c line 192:
Oct 4 12:01:25 myserver Condition((st): != NULL) not true.
Oct 4 12:01:25 myserver thread: = (cache-worker)
Oct 4 12:01:25 myserver ident: = Linux,2.6.35.6,x86_64,-sfile,-hcritbit,epoll
Oct 4 12:01:25 myserver Backtrace:
Oct 4 12:01:25 myserver 0x4248f3: pan_ic+b3
Oct 4 12:01:25 myserver 0x43a655: STV_alloc+125
Oct 4 12:01:25 myserver 0x41c5f6: FetchBody+4e6
Oct 4 12:01:25 myserver 0x413d00: cnt_fetch+680
Oct 4 12:01:25 myserver 0x4152fd: CNT_Session+35d
Oct 4 12:01:25 myserver 0x426d93: wrk_do_cnt_sess+93
Oct 4 12:01:25 myserver 0x42608e: wrk_thread_real+34e
Oct 4 12:01:25 myserver 0x3b034f5ffc7: _end+3b0348eaa1f
Oct 4 12:01:25 myserver 0x3b03483a64d: _end+3b0341c50a5
Oct 4 12:01:25 myserver sp: = 0x3aa9072b008 {
Oct 4 12:01:25 myserver fd: = 12, id = 12, xid = 1657464786,
Oct 4 12:01:25 myserver client: = 192.168.0.40:43834,
Oct 4 12:01:25 myserver step: = STP_FETCH,
Oct 4 12:01:25 myserver handling: = deliver,
Oct 4 12:01:25 myserver err_code: = 200, err_reason = (null),
Oct 4 12:01:25 myserver restarts: = 0, esis = 0
Oct 4 12:01:25 myserver ws: = 0x3aa9072b078 {
Oct 4 12:01:25 myserver id: = "sess",
Oct 4 12:01:25 myserver {s,f,r,e}: = {0x3aa9072bcd0,+576,(nil),+65536},
Oct 4 12:01:25 myserver },:
Oct 4 12:01:25 myserver http[req]: = {
Oct 4 12:01:25 myserver ws: = 0x3aa9072b078[sess]
Oct 4 12:01:25 myserver "GET",:
Oct 4 12:01:25 myserver "/my/url",:
Oct 4 12:01:25 myserver "HTTP/1.0",:
Oct 4 12:01:25 myserver "Host: example.com
Oct 4 12:01:25 myserver "X-EN-FWD-IP: 1.2.3.4",
Oct 4 12:01:25 myserver "Connection: close",
Oct 4 12:01:25 myserver "Accept: */*",
Oct 4 12:01:25 myserver "Accept-Language: en-US",
Oct 4 12:01:25 myserver "Referer: http://example.com/some/url",
Oct 4 12:01:25 myserver "x-flash-version: 10,0,32,18",
Oct 4 12:01:25 myserver "User-Agent: Mozilla/4.0 (compatible; MSIE
8.0; Windows NT 6.1; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR
3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; InfoPath.2;
OfficeLiveConnector.1.4; OfficeLivePatch.1.3)",
Oct 4 12:01:25 myserver "Accept-Encoding: gzip",
Oct 4 12:01:25 myserver },:
Oct 4 12:01:25 myserver worker: = 0x70424e70 {
Oct 4 12:01:25 myserver ws: = 0x70424fe0 {
Oct 4 12:01:25 myserver id: = "wrk",
Oct 4 12:01:25 myserver {s,f,r,e}: = {0x70412e00,+32792,(nil),+65536},
Oct 4 12:01:25 myserver },:
Oct 4 12:01:25 myserver http[bereq]: = {
Oct 4 12:01:25 myserver ws: = 0x70424fe0[wrk]
Oct 4 12:01:25 myserver "GET",:
Oct 4 12:01:25 myserver "/my/url",:
Oct 4 12:01:25 myserver "HTTP/1.1",:
Oct 4 12:01:25 myserver "Host: example.com
Oct 4 12:01:25 myserver "X-EN-FWD-IP: 1.2.3.4",
Oct 4 12:01:25 myserver "Accept: */*",
Oct 4 12:01:25 myserver "Accept-Language: en-US",
Oct 4 12:01:25 myserver "Referer: http://example.com/some/url",
Oct 4 12:01:25 myserver "x-flash-version: 10,0,32,18",
Oct 4 12:01:25 myserver "User-Agent: Mozilla/4.0 (compatible; MSIE
8.0; Windows NT 6.1; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR
3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; InfoPath.2;
OfficeLiveConnector.1.4; OfficeLivePatch.1.3)",
Oct 4 12:01:25 myserver "Accept-Encoding: gzip",
Oct 4 12:01:25 myserver "X-Varnish: 1657464786",
Oct 4 12:01:25 myserver },:
Oct 4 12:01:25 myserver http[beresp]: = {
Oct 4 12:01:25 myserver ws: = 0x70424fe0[wrk]
Oct 4 12:01:25 myserver "HTTP/1.1",:
Oct 4 12:01:25 myserver "200",:
Oct 4 12:01:25 myserver "OK",:
Oct 4 12:01:25 myserver "Date: Mon, 04 Oct 2010 15:59:45 GMT",
Oct 4 12:01:25 myserver "Server: Apache",
Oct 4 12:01:25 myserver "Last-Modified: Mon, 30 Aug 2010 23:48:54 GMT",
Oct 4 12:01:25 myserver "ETag: "b40ecc95-36e43bb5-48f131b7121b1"",
Oct 4 12:01:25 myserver "Accept-Ranges: bytes",
Oct 4 12:01:25 myserver "Content-Length: 920927157",
Oct 4 12:01:25 myserver "Cache-Control: max-age=14400, public",
Oct 4 12:01:25 myserver "Expires: Mon, 04 Oct 2010 19:59:45 GMT",
Oct 4 12:01:25 myserver "Connection: close",
Oct 4 12:01:25 myserver "Content-Type: text/html",
Oct 4 12:01:25 myserver },:
Oct 4 12:01:25 myserver },:
Oct 4 12:01:25 myserver vcl: = {
Oct 4 12:01:25 myserver srcname: = {
Oct 4 12:01:25 myserver "input",:
Oct 4 12:01:25 myserver "Default",:
Oct 4 12:01:25 myserver "/srv/varnish/etc/varnish/backends_def.vcl",:
Oct 4 12:01:25 myserver "/srv/varnish/etc/varnish/backends_set.vcl",:
Oct 4 12:01:25 myserver },:
Oct 4 12:01:25 myserver },:
Oct 4 12:01:25 myserver obj: = 0x3ac3f979000 {
Oct 4 12:01:25 myserver xid: = 1657464786,
Oct 4 12:01:25 myserver ws: = 0x3ac3f979020 {
Oct 4 12:01:25 myserver id: = "obj",
Oct 4 12:01:25 myserver {s,f,r,e}: = {0x3ac3f979228,+280,(nil),+3544},
Oct 4 12:01:25 myserver },:
Oct 4 12:01:25 myserver http[obj]: = {
Oct 4 12:01:25 myserver ws: = 0x3ac3f979020[obj]
Oct 4 12:01:25 myserver "HTTP/1.1",:
Oct 4 12:01:25 myserver "200",:
Oct 4 12:01:25 myserver "OK",:
Oct 4 12:01:25 myserver "Date: Mon, 04 Oct 2010 15:59:45 GMT",
Oct 4 12:01:25 myserver "Server: Apache",
Oct 4 12:01:25 myserver "Last-Modified: Mon, 30 Aug 2010 23:48:54 GMT",
Oct 4 12:01:25 myserver "ETag: "b40ecc95-36e43bb5-48f131b7121b1"",
Oct 4 12:01:25 myserver "Cache-Control: max-age=14400, public",
Oct 4 12:01:25 myserver "Expires: Mon, 04 Oct 2010 19:59:45 GMT",
Oct 4 12:01:25 myserver "Content-Type: text/html",
Oct 4 12:01:25 myserver },:
Oct 4 12:01:25 myserver len: = 0,
Oct 4 12:01:25 myserver store: = {
Oct 4 12:01:25 myserver },:
Oct 4 12:01:25 myserver },:
Oct 4 12:01:25 myserver },:
Oct 4 16:01:26 myserver varnishd[1864]: Child cleanup complete
Oct 4 16:01:26 myserver varnishd[1864]: child (25617) Started
Oct 4 16:01:26 myserver varnishd[1864]: Child (25617) said
Oct 4 16:01:26 myserver varnishd[1864]: Child (25617) said Child starts
Oct 4 16:01:26 myserver varnishd[1864]: Child (25617) said managed to
mmap 24117248000 bytes of 24117248000
Oct 4 16:01:30 myserver varnishd[1864]: Child (25617) died signal=6
Oct 4 16:01:30 myserver varnishd[1864]: Child (25617) Panic message:
Assert error in STV_alloc(), stevedore.c line 192:
Oct 4 12:01:30 myserver Condition((st): != NULL) not true.
Oct 4 12:01:30 myserver thread: = (cache-worker)
Oct 4 12:01:30 myserver ident: = Linux,2.6.35.6,x86_64,-sfile,-hcritbit,epoll
Oct 4 12:01:30 myserver Backtrace:
Oct 4 12:01:30 myserver 0x4248f3: pan_ic+b3
Oct 4 12:01:30 myserver 0x43a655: STV_alloc+125
Oct 4 12:01:30 myserver 0x41c5f6: FetchBody+4e6
Oct 4 12:01:30 myserver 0x413d00: cnt_fetch+680
Oct 4 12:01:30 myserver 0x4152fd: CNT_Session+35d
Oct 4 12:01:30 myserver 0x426d93: wrk_do_cnt_sess+93
Oct 4 12:01:30 myserver 0x42608e: wrk_thread_real+34e
Oct 4 12:01:30 myserver 0x3b034f5ffc7: _end+3b0348eaa1f
Oct 4 12:01:30 myserver 0x3b03483a64d: _end+3b0341c50a5
Oct 4 12:01:30 myserver sp: = 0x3aa90946008 {
Oct 4 12:01:30 myserver fd: = 33, id = 33, xid = 733630841,
Oct 4 12:01:30 myserver client: = 192.168.0.40:44005,
Oct 4 12:01:30 myserver step: = STP_FETCH,
Oct 4 12:01:30 myserver handling: = deliver,
Oct 4 12:01:30 myserver err_code: = 200, err_reason = (null),
Oct 4 12:01:30 myserver restarts: = 0, esis = 0
Oct 4 12:01:30 myserver ws: = 0x3aa90946078 {
Oct 4 12:01:30 myserver id: = "sess",
Oct 4 12:01:30 myserver {s,f,r,e}: = {0x3aa90946cd0,+576,(nil),+65536},
Oct 4 12:01:30 myserver },:
Oct 4 12:01:30 myserver http[req]: = {
Oct 4 12:01:30 myserver ws: = 0x3aa90946078[sess]
Oct 4 12:01:30 myserver "GET",:
Oct 4 12:01:30 myserver "/my/url",:
Oct 4 12:01:30 myserver "HTTP/1.0",:
Oct 4 12:01:30 myserver "Host: example.com
Oct 4 12:01:30 myserver "X-EN-FWD-IP: 1.2.3.4",
Oct 4 12:01:30 myserver "Connection: close",
Oct 4 12:01:30 myserver "Accept: */*",
Oct 4 12:01:30 myserver "Accept-Language: en-US",
Oct 4 12:01:30 myserver "Referer: http://example.com/some/url",
Oct 4 12:01:30 myserver "x-flash-version: 10,0,32,18",
Oct 4 12:01:30 myserver "User-Agent: Mozilla/4.0 (compatible; MSIE
8.0; Windows NT 6.1; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR
3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; InfoPath.2;
OfficeLiveConnector.1.4; OfficeLivePatch.1.3)",
Oct 4 12:01:30 myserver "Accept-Encoding: gzip",
Oct 4 12:01:30 myserver },:
Oct 4 12:01:30 myserver worker: = 0x4d9b6e70 {
Oct 4 12:01:30 myserver ws: = 0x4d9b6fe0 {
Oct 4 12:01:30 myserver id: = "wrk",
Oct 4 12:01:30 myserver {s,f,r,e}: = {0x4d9a4e00,+32792,(nil),+65536},
Oct 4 12:01:30 myserver },:
Oct 4 12:01:30 myserver http[bereq]: = {
Oct 4 12:01:30 myserver ws: = 0x4d9b6fe0[wrk]
Oct 4 12:01:30 myserver "GET",:
Oct 4 12:01:30 myserver "/my/url",:
Oct 4 12:01:30 myserver "HTTP/1.1",:
Oct 4 12:01:30 myserver "Host: example.com
Oct 4 12:01:30 myserver "X-EN-FWD-IP: 1.2.3.4",
Oct 4 12:01:30 myserver "Accept: */*",
Oct 4 12:01:30 myserver "Accept-Language: en-US",
Oct 4 12:01:30 myserver "Referer: http://example.com/some/url",
Oct 4 12:01:30 myserver "x-flash-version: 10,0,32,18",
Oct 4 12:01:30 myserver "User-Agent: Mozilla/4.0 (compatible; MSIE
8.0; Windows NT 6.1; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR
3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; InfoPath.2;
OfficeLiveConnector.1.4; OfficeLivePatch.1.3)",
Oct 4 12:01:30 myserver "Accept-Encoding: gzip",
Oct 4 12:01:30 myserver "X-Varnish: 733630841",
Oct 4 12:01:30 myserver },:
Oct 4 12:01:30 myserver http[beresp]: = {
Oct 4 12:01:30 myserver ws: = 0x4d9b6fe0[wrk]
Oct 4 12:01:30 myserver "HTTP/1.1",:
Oct 4 12:01:30 myserver "200",:
Oct 4 12:01:30 myserver "OK",:
Oct 4 12:01:30 myserver "Date: Mon, 04 Oct 2010 16:01:30 GMT",
Oct 4 12:01:30 myserver "Server: Apache",
Oct 4 12:01:30 myserver "Last-Modified: Mon, 30 Aug 2010 23:48:54 GMT",
Oct 4 12:01:30 myserver "ETag: "b40ecc95-36e43bb5-48f131b7121b1"",
Oct 4 12:01:30 myserver "Accept-Ranges: bytes",
Oct 4 12:01:30 myserver "Content-Length: 920927157",
Oct 4 12:01:30 myserver "Cache-Control: max-age=14400, public",
Oct 4 12:01:30 myserver "Expires: Mon, 04 Oct 2010 20:01:30 GMT",
Oct 4 12:01:30 myserver "Connection: close",
Oct 4 12:01:30 myserver "Content-Type: text/html",
Oct 4 12:01:30 myserver },:
Oct 4 12:01:30 myserver },:
Oct 4 12:01:30 myserver vcl: = {
Oct 4 12:01:30 myserver srcname: = {
Oct 4 12:01:30 myserver "input",:
Oct 4 12:01:30 myserver "Default",:
Oct 4 12:01:30 myserver "/srv/varnish/etc/varnish/backends_def.vcl",:
Oct 4 12:01:30 myserver "/srv/varnish/etc/varnish/backends_set.vcl",:
Oct 4 12:01:30 myserver },:
Oct 4 12:01:30 myserver },:
Oct 4 12:01:30 myserver obj: = 0x3b001f66000 {
Oct 4 12:01:30 myserver xid: = 733630841,
Oct 4 12:01:30 myserver ws: = 0x3b001f66020 {
Oct 4 12:01:30 myserver id: = "obj",
Oct 4 12:01:30 myserver {s,f,r,e}: = {0x3b001f66228,+280,(nil),+3544},
Oct 4 12:01:30 myserver },:
Oct 4 12:01:30 myserver http[obj]: = {
Oct 4 12:01:30 myserver ws: = 0x3b001f66020[obj]
Oct 4 12:01:30 myserver "HTTP/1.1",:
Oct 4 12:01:30 myserver "200",:
Oct 4 12:01:30 myserver "OK",:
Oct 4 12:01:30 myserver "Date: Mon, 04 Oct 2010 16:01:30 GMT",
Oct 4 12:01:30 myserver "Server: Apache",
Oct 4 12:01:30 myserver "Last-Modified: Mon, 30 Aug 2010 23:48:54 GMT",
Oct 4 12:01:30 myserver "ETag: "b40ecc95-36e43bb5-48f131b7121b1"",
Oct 4 12:01:30 myserver "Cache-Control: max-age=14400, public",
Oct 4 12:01:30 myserver "Expires: Mon, 04 Oct 2010 20:01:30 GMT",
Oct 4 12:01:30 myserver "Content-Type: text/html",
Oct 4 12:01:30 myserver },:
Oct 4 12:01:30 myserver len: = 0,
Oct 4 12:01:30 myserver store: = {
Oct 4 12:01:30 myserver },:
Oct 4 12:01:30 myserver },:
Oct 4 12:01:30 myserver },:
Oct 4 16:01:30 myserver varnishd[1864]: Child cleanup complete
Oct 4 16:01:30 myserver varnishd[1864]: child (25832) Started
Oct 4 16:01:30 myserver varnishd[1864]: Child (25832) said
Oct 4 16:01:30 myserver varnishd[1864]: Child (25832) said Child starts
Oct 4 16:01:30 myserver varnishd[1864]: Child (25832) said managed to
mmap 24117248000 bytes of 24117248000




More information about the varnish-misc mailing list