Child panics on OpenSolaris
Paul Wright
wrighty+varnishmisc at gmail.com
Thu Mar 4 12:53:10 CET 2010
On 22 February 2010 18:02, Paul Wright <wrighty+varnishmisc at gmail.com> wrote:
...
> For anyone else following along I've now had varnish running for over
> 5 hours without issue, here are the things I found out:
>
> * add the Range unsetting code to ensure that such requests don't make
> it through to the back end
> * remove the TCP_Assert() that wraps the setsockopt() call on line 184
> of bin/varnishd/cache_acceptor.c
> * compile with gcc, not Sun Studio (there's still some sort of
> funniness with TCP_(non)blocking() )
>
> CC=/usr/bin/gcc CFLAGS="-O3 -L/lib/amd64 -pthreads -m64
> -fomit-frame-pointer" LDFLAGS="-lumem -pthreads" ./configure
> --prefix=/opt
>
> * pass the right flags through to gcc when launching varnishd
>
> newtask -p highfile /opt/sbin/varnishd -f /opt/etc/varnish/firebox.vcl -F \
> -p 'cc_command=/usr/bin/gcc -fpic -shared -m64 -o %o %s' \
> -T 127.0.0.1:9001 \
> -s malloc,2G \
> -p sess_timeout=5s \
> -p max_restarts=12 \
> -p waiter=poll \
> -p connect_timeout=0s \
> -p sess_workspace=65536
>
> * keep checking http://letsgetdugg.com/2009/12/04/varnish-on-solaris/
> for hints and suggestions
Latest update: we're now seeing "Connection refused" panics like the following:
Child (1955) died signal=6
Child (1955) Panic message: Assert error in TCP_blocking(), tcp.c line 164:
Condition(TCP_Check(j)) not true.
errno = 146 (Connection refused)
thread = (cache-worker)
ident = -smalloc,-hcritbit,poll
Backtrace:
42fb51: /opt/sbin/varnishd'pan_ic+0xb1 [0x42fb51]
2f: [0x2f]
sp = 13570018 {
fd = 47, id = 47, xid = 0,
client = ?.?.?.?:?,
step = STP_FIRST,
handling = deliver,
restarts = 0, esis = 0
ws = 13570088 {
id = "sess",
{s,f,r,e} = {13570d90,13570d90,0,+65536},
},
http[req] = {
ws = 13570088[sess]
"",
"/i/nl/all/0.gif",
"HTTP/1.1",
"Host: media.firebox.com",
"User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_2;
en-us) AppleWebKit/531.21.8 (KHTML, like Gecko)",
"Accept: */*",
"Accept-Language: en-us",
"If-Modified-Since: Tue, 22 Sep 2009 15:34:33 GMT",
"If-None-Match: "16a8070-2b-4742c55c1f440"",
"Connection: keep-alive",
"X-Forwarded-For: 85.189.102.193",
},
worker = fffffd7ff7bf1d80 {
ws = fffffd7ff7bf1ec8 {
id = "wrk",
{s,f,r,e} = {fffffd7ff7bdfcb0,fffffd7ff7bdfcb0,0,+65536},
},
},
},
Two things stand out. First, we're confident that this request is a
cache hit (handling = deliver), which rules out the backend as the
source of the refused connection. Second, the client address appears
to have been mangled:
client = ?.?.?.?:?,
Would this cause varnish to attempt opening a connection which is then refused?
As a workaround, would it be advisable to add a clause to TCP_Check (in
include/libvarnish.h) to skip over errno 146 (Connection refused, i.e.
ECONNREFUSED on Solaris) along with the existing ECONNRESET and
ENOTCONN clauses?
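To make the question concrete, here is a minimal sketch of the kind of
change I mean. It is not the actual macro from include/libvarnish.h: the
function name tcp_check_sketch is hypothetical, and I'm assuming the check
amounts to "the call returned 0, or errno is one of a small set of
tolerated values".

```c
#include <errno.h>

/*
 * Hypothetical stand-in for TCP_Check; not copied from the Varnish source.
 * 'ret' is the return value of the socket call being checked.
 * Returns 1 if the result is acceptable, 0 if it should trip the assert.
 */
static int
tcp_check_sketch(int ret)
{
	if (ret == 0)
		return (1);		/* the call succeeded */

	/* Tolerate errors that only mean the peer has already gone away. */
	return (errno == ECONNRESET ||
	    errno == ENOTCONN ||
	    errno == ECONNREFUSED);	/* proposed addition (146 on Solaris) */
}
```

With something like this, a failed ioctl()/setsockopt() on a socket the
client has already torn down would be ignored rather than killing the
child, while any other errno would still trigger the assert.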
Cheers,
Paul.