[Varnish] #51: crash in cache_acceptor_epoll.c

Varnish varnish-bugs at projects.linpro.no
Thu Oct 5 22:24:50 CEST 2006


#51: crash in cache_acceptor_epoll.c
----------------------+-----------------------------------------------------
 Reporter:  airmax    |        Owner:  des     
     Type:  defect    |       Status:  assigned
 Priority:  high      |    Milestone:          
Component:  varnishd  |      Version:  1.0     
 Severity:  blocker   |   Resolution:          
 Keywords:            |  
----------------------+-----------------------------------------------------
Comment (by airmax):

 I tried strace. Got some errors messages, then it hangs, I had to kill it
 with a SIGKILL. Retried, same thing. I suppose that it's because strace
 and threads aren't friends.

 But I think I got our murderer : Foundry Iron load balancer and its
 probes. The crash occurs while varnishd is serving its probe.
 The iron LB check connect to port 80, and sends "HEAD
 http://www.diabloteen.com/ HTTP/1.1" to the server. And then, euhm,
 foundry knows. But I suspect them of closing the connection in an ugly way
 at some point .

 Interesting part of the strace is :
 {{{
 15440 read(12, "HEAD http://www.diabloteen.com/ "..., 4095) = 44
 15440 clock_gettime(CLOCK_REALTIME, {1160078181, 3230000}) = 0
 15440 socket(PF_INET, SOCK_STREAM, IPPROTO_TCP) = 15
 15440 connect(15, {sa_family=AF_INET, sin_port=htons(80),
 sin_addr=inet_addr("192.168.4.230")}, 16) = 0
 15440 getsockname(15, {sa_family=AF_INET, sin_port=htons(36177),
 sin_addr=inet_addr("192.168.4.46")}, [16]) = 0
 15440 writev(15, [{"GET", 3}, {" ", 1}, {"http://www.diabloteen.com/",
 26}, {" ", 1}, {"HTTP/1.1", 8}, {"\r\n", 2}, {"X-Varnish: 974115635", 20},
 {"\r\n", 2}, {"Host: 192.168.4.230", 19}, {"\r\n", 2}, {"\r\n", 2}], 1
 1) = 86
 15440 read(15, "HTTP/1.1 200 OK\r\nDate: Thu, 05 O"..., 4095) = 1445
 15440 time(NULL)                        = 1160078181
 15440 open("/etc/localtime", O_RDONLY)  = 16
 15440 fstat64(16, {st_mode=S_IFREG|0644, st_size=1082, ...}) = 0
 15440 mmap2(NULL, 131072, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS,
 -1, 0) = 0x83a1b000
 15440 read(16,
 "TZif\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\f\0\0\0\f\0"..., 131072) =
 1082
 15440 close(16)                         = 0
 15440 munmap(0x83a1b000, 131072)        = 0
 15440 fcntl64(15, F_GETFL)              = 0x2 (flags O_RDWR)
 15440 fcntl64(15, F_SETFL, O_RDWR)      = 0
 15440 close(15)                         = 0
 15440 clock_gettime(CLOCK_REALTIME, {1160078181, 13725000}) = 0
 15440 writev(12, [{"HTTP/1.1", 8}, {" ", 1}, {"200", 3}, {" ", 1}, {"OK",
 2}, {"\r\n", 2}, {"Date: Thu, 05 Oct 2006 19:56:19 "..., 35}, {"\r\n", 2},
 {"Server: Apache/1.3.28 (Unix) Aut"..., 62}, {"\r\n", 2}, {"pics-la
 bel: (pics-1.1 \"http://ww"..., 220}, {"\r\n", 2}, {"Last-Modified: Fri,
 03 Mar 2006 "..., 44}, {"\r\n", 2}, {"ETag: \"26eb95d-3ac-44085a19\"",
 28}, {"\r\n", 2}, {"Content-Type: text/html", 23}, {"\r\n", 2}, {"Conten
 t-Length: 940", 19}, {"\r\n", 2}, {"X-Varnish: 974115635", 20}, {"\r\n",
 2}, {"Age: 0", 6}, {"\r\n", 2}, {"Via: 1.1 varnish", 16}, {"\r\n", 2},
 {"\r\n", 2}], 27) = 512
 15440 clock_gettime(CLOCK_REALTIME, {1160078181, 13924000}) = 0
 15440 write(10, "\304\34\10\10", 4 <unfinished ...>
 15438 <... epoll_wait resumed> {{EPOLLIN, {u32=134680252,
 u64=134680252}}}, 1, 100) = 1
 15440 <... write resumed> )             = 4
 15438 read(7, "\304\34\10\10", 4)       = 4
 15438 epoll_ctl(11, EPOLL_CTL_ADD, 12, {EPOLLIN|EPOLLPRI, {u32=134749380,
 u64=134749380}}) = 0
 15438 clock_gettime(CLOCK_REALTIME, {1160078181, 14120000}) = 0
 15438 epoll_wait(11,  <unfinished ...>
 15440 read(13,  <unfinished ...>
 15438 <... epoll_wait resumed> {{EPOLLIN|EPOLLERR|EPOLLHUP,
 {u32=134749380, u64=134749380}}}, 1, 100) = 1
 15438 read(12, 0x8081f44, 4095)         = -1 ECONNRESET (Connection reset
 by peer)
 15438 close(12)                         = 0
 15438 epoll_ctl(11, EPOLL_CTL_DEL, -1, NULL) = -1 EFAULT (Bad address)
 15438 write(2, "Assert error in vca_del(), cache"..., 134 <unfinished ...>
 15426 <... poll resumed> [{fd=0, events=POLLIN}, {fd=5, events=POLLIN},
 {fd=11, events=POLLIN, revents=POLLIN}], 3, 2225) = 1
 15438 <... write resumed> )             = 134
 15426 read(11,  <unfinished ...>
 15438 rt_sigprocmask(SIG_UNBLOCK, [ABRT],  <unfinished ...>
 15426 <... read resumed> "Assert error in vca_del(), cache"..., 8191) =
 134
 15438 <... rt_sigprocmask resumed> NULL, 8) = 0
 15426 write(2, "Child said (2, 15434): <<Assert "..., 162 <unfinished ...>
 15438 tgkill(15434, 15438, SIGABRT <unfinished ...>
 15426 <... write resumed> )             = 162
 15438 <... tgkill resumed> )            = 0
 15426 clock_gettime(CLOCK_MONOTONIC,  <unfinished ...>
 15438 --- SIGABRT (Aborted) @ 0 (0) ---
 }}}

 Kernel on this test machine 2.6.8.
 {{{
 Linux tdell4 2.6.8-2-686 #1 Thu May 19 17:53:30 JST 2005 i686 GNU/Linux
 }}}

 This kernel is a basic Debian Sarge kernel (kernel-image-2.6.8-2-686
 package, version 2.6.8-16).

-- 
Ticket URL: <http://varnish.projects.linpro.no/ticket/51>
Varnish <http://varnish.projects.linpro.no/>
The Varnish HTTP Accelerator


More information about the varnish-bugs mailing list