[Varnish] #38: Fault tree for performance diagnosis
Varnish
varnish-bugs at projects.linpro.no
Sun Aug 20 10:08:52 CEST 2006
#38: Fault tree for performance diagnosis
----------------------+-----------------------------------------------------
Reporter: phk | Owner: phk
Type: defect | Status: new
Priority: normal | Milestone:
Component: varnishd | Version:
Severity: normal | Resolution:
Keywords: |
----------------------+-----------------------------------------------------
Old description:
> This is a fault tree we can work through to eliminate theories:
>
> {{{
> Lower than expected traffic handling
>     Alteon not allocating traffic
>     ''Packet loss''
>     Packet delay
>     TCP/session setup failures or rejections
>     Bad Varnish response time
>
> Alteon not allocating traffic
>     Bad health-check response time
>     Health-check failures
>     TCP/session setup failures or rejections
>     TCP/session count confusion
>     Bad Varnish response time
>
> Bad health-check response time
>     ''Packet loss''
>     Packet delay
>     TCP/session setup failures or rejections
>     Bad Varnish response time
>
> Health-check failures
>     ''Packet loss''
>     Packet delay
>     TCP/session setup failures or rejections
>
> Packet loss
>     Alteon interface
>     GigE switch
>     bge1 interface
>     FreeBSD network stack bugs
>     FreeBSD resource starvation
>
> Packet delay
>     Alteon bugs
>     Alteon interface
>     GigE switch
>     bge1 interface
>     FreeBSD network stack bugs
>     FreeBSD rate limiting
>     FreeBSD resource starvation
>
> TCP/session setup failures or rejections
>     FreeBSD network stack bugs
>     FreeBSD rate limiting
>     FreeBSD firewalling
>     FreeBSD routing
>     FreeBSD resource starvation
>     Varnish acceptor bugs
>     Varnish acceptor resource starvation
>
> Bad Varnish response time
>     Varnish acceptor bugs
>     Varnish response bugs
>     Varnish lock contention
>     Varnish resource starvation
>     Thread library scheduling bugs
>     FreeBSD rate limiting
>     FreeBSD resource starvation (sendfile?)
>
> FreeBSD resource starvation
>     Is sysctl kern.ipc.somaxconn: 128 enough?
>
> Packet loss
>     An aggressive ping-test does not show any losses.
>     This is not conclusive, but at least we can defer further
>     investigation until later.
> }}}
New description:
This is a fault tree we can work through to eliminate theories:
{{{
Lower than expected traffic handling
    Alteon not allocating traffic
    -Packet loss
    Packet delay
    TCP/session setup failures or rejections
    Bad Varnish response time

Alteon not allocating traffic
    Bad health-check response time
    Health-check failures
    TCP/session setup failures or rejections
    TCP/session count confusion
    Bad Varnish response time

Bad health-check response time
    -Packet loss
    Packet delay
    TCP/session setup failures or rejections
    Bad Varnish response time

Health-check failures
    -Packet loss
    Packet delay
    TCP/session setup failures or rejections

Packet loss
    Alteon interface
    GigE switch
    bge1 interface
    FreeBSD network stack bugs
    FreeBSD resource starvation

Packet delay
    Alteon bugs
    Alteon interface
    GigE switch
    bge1 interface
    FreeBSD network stack bugs
    FreeBSD rate limiting
    FreeBSD resource starvation

TCP/session setup failures or rejections
    FreeBSD network stack bugs
    FreeBSD rate limiting
    FreeBSD firewalling
    FreeBSD routing
    FreeBSD resource starvation
    Varnish acceptor bugs
    Varnish acceptor resource starvation

Bad Varnish response time
    Varnish acceptor bugs
    Varnish response bugs
    Varnish lock contention
    Varnish resource starvation
    Thread library scheduling bugs
    FreeBSD rate limiting
    FreeBSD resource starvation (sendfile?)

FreeBSD resource starvation
    Is sysctl kern.ipc.somaxconn: 128 enough?

Packet loss
    An aggressive ping-test does not show any losses.
    This is not conclusive, but at least we can defer further
    investigation until later.

Packet delay
    This one is mighty suspect.
    The 200 msec delays we see against the Alteon are present not
    only with the Alteon's health-check but also on a ping:

    c21# ping -i .001 -c 1000 -q 10.0.2.1
    round-trip min/avg/max/stddev = 0.171/0.354/1.038/0.109 ms
    c21# ping -i .001 -c 1000 -q 10.0.0.2
    round-trip min/avg/max/stddev = 0.193/18.655/220.481/46.626 ms

    But the Squids also see the 200 msec delay in a ping test:

    c1# ping -i .001 -c 1000 -q 10.0.0.2
    round-trip min/avg/max/stddev = 0.219/22.747/222.899/50.803 ms

    So this is not unique to us.
}}}
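
The ping comparison above is easy to repeat against other hosts if the
summary lines are checked by a script. What follows is a minimal Python
sketch, not part of the ticket itself: it parses the
"round-trip min/avg/max/stddev" lines quoted above and flags every
source/target pair whose maximum round-trip time reaches a threshold.
The function names and the 100 ms threshold are assumptions chosen for
illustration; the three summary lines are copied from the runs above.

```python
import re

# Matches the summary line printed by ping, for example:
# "round-trip min/avg/max/stddev = 0.171/0.354/1.038/0.109 ms"
SUMMARY_RE = re.compile(
    r"round-trip min/avg/max/stddev = "
    r"([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms"
)


def parse_summary(line):
    """Return the four round-trip statistics (in ms) from a summary line."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError("not a ping summary line: %r" % line)
    rtt_min, rtt_avg, rtt_max, rtt_stddev = (float(v) for v in match.groups())
    return {"min": rtt_min, "avg": rtt_avg, "max": rtt_max, "stddev": rtt_stddev}


def suspects(results, max_ms=100.0):
    """Return the (source, target) pairs whose max RTT is at least max_ms.

    max_ms is an assumed cut-off for illustration, not a value from the ticket.
    """
    return sorted(
        pair
        for pair, line in results.items()
        if parse_summary(line)["max"] >= max_ms
    )


# (source host, target address) -> summary line, copied from the runs above.
RESULTS = {
    ("c21", "10.0.2.1"):
        "round-trip min/avg/max/stddev = 0.171/0.354/1.038/0.109 ms",
    ("c21", "10.0.0.2"):
        "round-trip min/avg/max/stddev = 0.193/18.655/220.481/46.626 ms",
    ("c1", "10.0.0.2"):
        "round-trip min/avg/max/stddev = 0.219/22.747/222.899/50.803 ms",
}

if __name__ == "__main__":
    for source, target in suspects(RESULTS):
        print("%s -> %s exceeds the threshold" % (source, target))
```

With these three runs, only the two pings against 10.0.0.2 are flagged,
from both c21 and c1, which is consistent with the observation that the
delay is not unique to one sending host.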
--
Ticket URL: <http://varnish.projects.linpro.no/ticket/38>
Varnish <http://varnish.projects.linpro.no/>
The Varnish HTTP Accelerator