varnishd child panic
Gao Yongwei
itxx00 at gmail.com
Thu Apr 11 03:25:33 CEST 2013
Hello, list.
I am using Varnish for testing on a CentOS 6.3 box. Some basic info:
[root at cdn001 ~]# uname -r
2.6.32-358.2.1.el6.x86_64
[root at cdn001 ~]# rpm -q varnish
varnish-3.0.3-3.el6.art.x86_64
[root at cdn001 ~]# free -m
             total       used       free     shared    buffers     cached
Mem:         24023       1054      22968          0        137        240
-/+ buffers/cache:        676      23347
Swap:        12079          0      12079
uptime:
09:05:24 up 6 days, 11:29, 1 user, load average: 0.00, 0.00, 0.00
My /etc/sysconfig/varnish looks like this:
WORKER_STACK_SIZE=512
... ...
VARNISH_STORAGE_SIZE=16G
VARNISH_STORAGE="malloc,${VARNISH_STORAGE_SIZE}"
VARNISH_TTL=900
thread_pools=2
thread_pool_min=500
thread_pool_max=4000
thread_pool_timeout=120
thread_pool_add_delay=2
thread_pool_fail_delay=100
sess_workspace=32768
session_max=500000
thread_pool_stack=16384
connect_timeout=10
first_byte_timeout=60
between_bytes_timeout=60
DAEMON_OPTS="-a ${VARNISH_LISTEN_ADDRESS}:${VARNISH_LISTEN_PORT} \
-f ${VARNISH_VCL_CONF} \
-T ${VARNISH_ADMIN_LISTEN_ADDRESS}:${VARNISH_ADMIN_LISTEN_PORT} \
-t ${VARNISH_TTL} \
-u varnish -g varnish \
-S ${VARNISH_SECRET_FILE} \
-p thread_pools=${thread_pools} \
-p thread_pool_min=${thread_pool_min} \
-p thread_pool_max=${thread_pool_max} \
-p thread_pool_timeout=${thread_pool_timeout} \
-p thread_pool_add_delay=${thread_pool_add_delay} \
-p thread_pool_fail_delay=${thread_pool_fail_delay} \
-p sess_workspace=${sess_workspace} \
-p session_max=${session_max} \
-p connect_timeout=${connect_timeout} \
-p first_byte_timeout=${first_byte_timeout} \
-p between_bytes_timeout=${between_bytes_timeout} \
-s ${VARNISH_STORAGE}"
As my backend web servers run in many VMs, I use a DNS director:
director dnsdomain dns {
.list = {
.port = "80";
"10.0.0.0"/24;
}
.ttl = 12h;
}
The connect_timeout in /etc/sysconfig/varnish has been set to a large
value (10 s).
Everything worked fine until last night, when I got a panic message in the
system log. By default I have 1000 workers on startup, but after this panic
occurred I could see only 500+ workers with the varnishstat command. Below
is the panic message from syslog:
Apr 10 22:12:58 cdn001 varnishd[25730]: Child (25731) Panic message: Assert
error in VRT_IP_string(), cache_vrt.c line 312:#012 Condition((p =
WS_Alloc(sp->http->ws, len)) != 0) not true.#012thread =
(cache-worker)#012ident =
Linux,2.6.32-358.2.1.el6.x86_64,x86_64,-smalloc,-smalloc,-hcritbit,epoll#012Backtrace:#012
0x42ee88: /usr/sbin/varnishd() [0x42ee88]#012 0x436dc5:
/usr/sbin/varnishd(VRT_IP_string+0x135) [0x436dc5]#012 0x7f56bc4b922f: ./
vcl.xlWKkvTA.so(+0xb622f) [0x7f56bc4b922f]#012 0x436203:
/usr/sbin/varnishd(VCL_recv_method+0x43) [0x436203]#012 0x418eaf:
/usr/sbin/varnishd(CNT_Session+0xb7f) [0x418eaf]#012 0x430bd1:
/usr/sbin/varnishd() [0x430bd1]#012 0x7f56c388f851:
/lib64/libpthread.so.0(+0x7851) [0x7f56c388f851]#012 0x7f56c35dd90d:
/lib64/libc.so.6(clone+0x6d) [0x7f56c35dd90d]#012sp = 0x7f569b089008 {#012
fd = 320, id = 320, xid = 1298044854,#012 client = 10.0.0.170 33991,#012
step = STP_RECV,#012 handling = deliver,#012 err_code = 404, err_reason
= (null),#012 restarts = 0, esi_level = 0#012 flags = #012 bodystatus =
3#012 ws = 0x7f569b089080 { overflow#012 id = "sess",#012 {s,f,r,e}
= {0x7f569b089c78,+32768,(nil),+32768},#012 },#012 http[req] = {#012
ws = 0x7f569b089080[sess]#012 "GET",#012
"/images/ico_hots2.gif",#012 "HTTP/1.1",#012 "User-Agent:
Opera/9.80 (Android; Opera Mini/6.7.30171/29.3222; U; zh) Presto/2.8.119
Version/11.10",#012 "Host: www.example.com",#012 "Accept:
text/html, application/xml;q=0.9, application/xhtml+xml, image/png,
image/webp, image/jpeg, image/gif, image/x-xbitmap, */*;q=0.1",#012
"Accept-Language: zh-cn,en;q=0.9",#012 "Accept-Encoding: gzip,
deflate",#012 "Referer: http://www.example.com/",#012
"Connection: Keep-Alive",#012 "clientip: 117.149.35.78",#012
"X-OperaMini-Features: advanced, file_system, camera, touch, folding,
viewport",#012 "Device-Stock-UA: Mozilla/5.0 (Linux; U; Android 2.3.5;
zh-cn; BOWAY I5 Build/MocorDroid2.3.5) AppleWebKit/533.1
Apr 10 22:12:58 cdn001 varnishd[25730]: child (7270) Started
Apr 10 22:12:59 cdn001 kernel: varnishd[7270]: segfault at 0 ip
000000000041067d sp 00007fff4f027a40 error 6 in varnishd[400000+70000]
Apr 10 22:12:59 cdn001 varnishd[25730]: Pushing vcls failed:#012CLI
communication error (hdr)
Apr 10 22:12:59 cdn001 varnishd[25730]: Child (7270) died signal=11
Apr 10 22:12:59 cdn001 varnishd[25730]: Child (-1) said Child starts
I have restarted the varnish daemon and things look fine now, but I am
wondering how this error could happen. Is there something wrong in my
Varnish configuration, or is it something else?
Thanks.