Varnish 2.0.6 thread pile-up

Dennis Hendriksen dennis.hendriksen at
Mon Dec 20 12:05:33 CET 2010


We are running Varnish 2.0.6 on two machines (dual quad-core CPUs, kernel
2.6.18-194.3.1.el5) behind a load balancer. Both Varnish instances are
connected to load-balanced (fast-responding) back-ends, and each handles
on average 500 requests/s (97+% hits).

Every once in a while (ranging from once an hour to once a week) we see
an instant worker-thread pile-up with overflowed and sometimes dropped
work requests. After these short periods (about 15 min) everything
returns to normal. The pile-ups occur at irregular intervals, at both
quiet and busy times, and on both machines at the same time. During a
pile-up Varnish responds extremely slowly and transfer rates drop,
disrupting our services.
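For reference, the pile-ups show up in the worker-thread counters. A
minimal watch loop (counter names as reported by Varnish 2.0's
varnishstat; the one-second interval is just our choice) looks like:

```shell
# Print the worker-thread counters once a second. During a pile-up,
# n_wrk_overflow (queued work requests) and n_wrk_drop (dropped work
# requests) climb while n_wrk sits at thread_pool_max per pool.
while true; do
    varnishstat -1 | grep '^n_wrk'
    sleep 1
done
```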

Furthermore, we sometimes see the following in /var/log/messages:
Dec 19 22:40:04 cache1 kernel: possible SYN flooding on port 80. Sending
Dec 19 22:41:05 cache1 kernel: possible SYN flooding on port 80. Sending
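The SYN-flood warnings suggest the listen queue is filling up. As a
sanity check we compare the standard Linux kernel backlog sysctls
against the listen_depth of 1024 in the config below (a sketch; names
are stock Linux sysctls, not Varnish settings):

```shell
# If somaxconn is lower than Varnish's listen_depth (1024 here), the
# kernel silently caps the listen backlog at somaxconn.
sysctl -n net.core.somaxconn
sysctl -n net.ipv4.tcp_max_syn_backlog
sysctl -n net.ipv4.tcp_syncookies   # non-zero means SYN cookies enabled
```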

We've been looking closely at internal (back-ends) and external
(network) factors but so far haven't found a cause of the problems
described above.

Does anyone have a clue what could be causing this problem, or recognize
this issue?
Thanks for your time,



Varnish config (storage: malloc 10GB):
accept_fd_holdoff          50 [ms]
acceptor                   default (epoll, poll)
auto_restart               on [bool]
backend_http11             on [bool]
between_bytes_timeout      60.000000 [s]
cache_vbe_conns            off [bool]
cc_command                 "exec cc -fpic -shared -Wl,-x -o %o %s"
cli_buffer                 8192 [bytes]
cli_timeout                5 [seconds]
client_http11              off [bool]
clock_skew                 10 [s]
connect_timeout            0.400000 [s]
default_grace              10
default_ttl                120 [seconds]
diag_bitmap                0x0 [bitmap]
err_ttl                    0 [seconds]
esi_syntax                 0 [bitmap]
fetch_chunksize            128 [kilobytes]
first_byte_timeout         60.000000 [s]
group                      varnish (103)
listen_address             :80
listen_depth               1024 [connections]
log_hashstring             off [bool]
log_local_address          off [bool]
lru_interval               2 [seconds]
max_esi_includes           5 [includes]
max_restarts               4 [restarts]
obj_workspace              8192 [bytes]
overflow_max               100 [%]
ping_interval              3 [seconds]
pipe_timeout               60 [seconds]
prefer_ipv6                off [bool]
purge_dups                 on [bool]
purge_hash                 on [bool]
rush_exponent              3 [requests per request]
send_timeout               600 [seconds]
sess_timeout               5 [seconds]
sess_workspace             16384 [bytes]
session_linger             50 [ms]
session_max                100000 [sessions]
shm_reclen                 255 [bytes]
shm_workspace              8192 [bytes]
srcaddr_hash               1049 [buckets]
srcaddr_ttl                0 [seconds]
thread_pool_add_delay      20 [milliseconds]
thread_pool_add_threshold  2 [requests]
thread_pool_fail_delay     200 [milliseconds]
thread_pool_max            1000 [threads]
thread_pool_min            15 [threads]
thread_pool_purge_delay    1000 [milliseconds]
thread_pool_stack          unlimited [bytes]
thread_pool_timeout        300 [seconds]
thread_pools               8 [pools]
user                       varnish (101)
vcl_trace                  off [bool]
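For what it's worth: with these settings the floor is thread_pool_min *
thread_pools = 15 * 8 = 120 threads, and a thread_pool_add_delay of
20 ms caps thread creation at roughly 50 new threads per second per
pool, so ramping toward thread_pool_max takes a while after a burst.
These parameters can be changed on a running instance via the
management CLI (a sketch; the :6082 management address is an assumption,
substitute the real one, and the values are illustrative, not
recommendations):

```shell
# Raise each pool's minimum and speed up thread creation at runtime.
# -T points at varnishd's management interface (address is assumed here).
varnishadm -T localhost:6082 param.set thread_pool_min 100
varnishadm -T localhost:6082 param.set thread_pool_add_delay 2
```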

More information about the varnish-misc mailing list