Varnish hangs / requests time out

Paras Fadte plfgoa at gmail.com
Thu Mar 5 11:44:59 CET 2009


I too have encountered Varnish hanging many times.

On Wed, Mar 4, 2009 at 10:46 AM, Ross Brown <ross at trademe.co.nz> wrote:
> Hi all
>
> We are hoping to use Varnish for serving image content on our reasonably busy auction site here in New Zealand, but are having an interesting problem during testing.
>
> We are using latest Varnish (2.0.3) on Ubuntu 8.10 server (64-bit) and have built two servers for testing - both are located in the same datacentre and situated behind an F5 hardware load balancer. We want to keep all images cached in RAM and are using Varnish with jemalloc to achieve this. For the most part, Varnish is working well for us and performance is great.
>
> However, we have seen both our Varnish servers lock up at precisely the same time and stop processing incoming HTTP requests until varnishd is manually restarted. This has happened twice and seems to occur at random - the last time was after 5 days of uptime and a significant amount of processed traffic (>1TB).
>
> When this problem happens, the backend is still reachable and happily serving images. It is not a particularly busy period for us (600 requests/sec/Varnish server - approx 350Mbps outbound each - we got up to nearly 3 times that level without incident previously) but for some reason unknown to us, the servers just suddenly stop processing requests and worker processes increase dramatically.
>
> After the lockup happened last time, I tried firing up varnishlog and hitting the server directly - my requests were not showing up at all. The *only* entries in the varnish log were related to worker processes being killed over time - no PINGs, PONGs, load balancer healthchecks or anything related to 'normal' varnish activity. It's as if varnishd has completely locked up, but we can't understand what causes both our varnish servers to exhibit this behaviour at exactly the same time, nor why varnish does not detect it and attempt a restart. After a restart, varnish is fine and behaves itself.
>
> There is nothing to indicate an error with the backend, nor anything in syslog to indicate a Varnish problem. Pointers of any kind would be appreciated :)
>
> Best regards
>
> Ross Brown
> Trade Me
> www.trademe.co.nz
>
> *** Startup Options (as per hints in wiki for caching millions of objects):
> -a 0.0.0.0:80 -f /usr/local/etc/default.net.vcl -T 0.0.0.0:8021 -t 86400 -h classic,1200007 -p thread_pool_max=4000 -p thread_pools=4 -p listen_depth=4096 -p lru_interval=3600 -p obj_workspace=4096 -s malloc,10G
>
> *** Running VCL:
> backend default {
>        .host = "10.10.10.10";
>        .port = "80";
> }
>
> sub vcl_recv {
>        # Don't cache objects requested with query string in URI.
>        # Needed for newsletter headers (openrate) and health checks.
>        if (req.url ~ "\?.*") {
>                pass;
>        }
>
>        # Force lookup if the request is a no-cache request from the client.
>        if (req.http.Cache-Control ~ "no-cache") {
>                unset req.http.Cache-Control;
>                lookup;
>        }
>
>        # By default, Varnish will not serve requests that come with a cookie from its cache.
>        unset req.http.cookie;
>        unset req.http.authenticate;
>
>        # No action here, continue into default vcl_recv{}
> }
>
>
> *** Stats:
>      458887  Client connections accepted
>   170714631  Client requests received
>   133012763  Cache hits
>        3715  Cache hits for pass
>    27646213  Cache misses
>    37700868  Backend connections success
>           0  Backend connections not attempted
>           0  Backend connections too many
>          40  Backend connections failures
>    37512808  Backend connections reuses
>    37514682  Backend connections recycles
>           0  Backend connections unused
>        1339  N struct srcaddr
>          16  N active struct srcaddr
>         756  N struct sess_mem
>          12  N struct sess
>      761152  N struct object
>      761243  N struct objecthead
>           0  N struct smf
>           0  N small free smf
>           0  N large free smf
>         322  N struct vbe_conn
>         345  N struct bereq
>          20  N worker threads
>        2331  N worker threads created
>           0  N worker threads not created
>           0  N worker threads limited
>           0  N queued work requests
>       35249  N overflowed work requests
>           0  N dropped work requests
>           1  N backends
>          44  N expired objects
>    26886639  N LRU nuked objects
>           0  N LRU saved objects
>    15847787  N LRU moved objects
>           0  N objects on deathrow
>           3  HTTP header overflows
>           0  Objects sent with sendfile
>   164595318  Objects sent with write
>           0  Objects overflowing workspace
>      458886  Total Sessions
>   170715215  Total Requests
>         306  Total pipe
>    10054413  Total pass
>    37700586  Total fetch
>  49458782160  Total header bytes
> 1151144727614  Total body bytes
>       89464  Session Closed
>           0  Session Pipeline
>           0  Session Read Ahead
>           0  Session Linger
>   170622902  Session herd
>  7875546129  SHM records
>   380705819  SHM writes
>         138  SHM flushes due to overflow
>      763205  SHM MTX contention
>        2889  SHM cycles through buffer
>           0  allocator requests
>           0  outstanding allocations
>           0  bytes allocated
>           0  bytes free
>   101839895  SMA allocator requests
>     1519005  SMA outstanding allocations
>  10736616112  SMA outstanding bytes
> 562900737623  SMA bytes allocated
> 552164121511  SMA bytes free
>          56  SMS allocator requests
>           0  SMS outstanding allocations
>           0  SMS outstanding bytes
>       25712  SMS bytes allocated
>       25712  SMS bytes freed
>    37700490  Backend requests made
>           3  N vcl total
>           3  N vcl available
>           0  N vcl discarded
>           1  N total active purges
>           1  N new purges added
>           0  N old purges deleted
>           0  N objects tested
>           0  N regexps tested against
>           0  N duplicate purges removed
>           0  HCB Lookups without lock
>           0  HCB Lookups with lock
>           0  HCB Inserts
>           0  Objects ESI parsed (unlock)
>           0  ESI parse errors (unlock)
>
>
>
> _______________________________________________
> varnish-misc mailing list
> varnish-misc at projects.linpro.no
> http://projects.linpro.no/mailman/listinfo/varnish-misc
>
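
For anyone reading the quoted counters, here is a rough sketch of how a few of them can be turned into ratios. It is not part of the original report. The function names (parse_stats, summarize) are made up for illustration, and the cache size assumes that "-s malloc,10G" means 10 GiB. The four values are copied from the stats above.

```python
import re

# A few counters copied from the quoted stats (value, then description).
STATS_TEXT = """
>   133012763  Cache hits
>    27646213  Cache misses
>    26886639  N LRU nuked objects
> 10736616112  SMA outstanding bytes
"""


def parse_stats(text):
    """Map each counter description to its integer value.

    Accepts lines with or without a leading "> " quote prefix.
    """
    stats = {}
    for line in text.splitlines():
        match = re.match(r"[>\s]*(\d+)\s+(.+?)\s*$", line)
        if match:
            stats[match.group(2)] = int(match.group(1))
    return stats


def summarize(stats, cache_bytes):
    """Derive a few ratios from the raw counters (hypothetical helper)."""
    hits = stats["Cache hits"]
    misses = stats["Cache misses"]
    return {
        # Share of cacheable lookups answered from the cache.
        "hit_ratio": hits / (hits + misses),
        # How many objects were evicted by LRU per cache miss.
        "nukes_per_miss": stats["N LRU nuked objects"] / misses,
        # How much of the malloc storage is currently in use.
        "cache_fill": stats["SMA outstanding bytes"] / cache_bytes,
    }


CACHE_BYTES = 10 * 1024**3  # assumption: "10G" is 10 GiB
summary = summarize(parse_stats(STATS_TEXT), CACHE_BYTES)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

With these numbers the hit ratio comes out at roughly 83%, there is close to one LRU eviction per miss, and the malloc storage is essentially full. That suggests the cache is under constant eviction pressure, but it does not by itself show what caused the lockup.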


