Under Load: Server Unavailable/Connection Dropped/Delayed Response

Tejaswi Nadahalli nadahalli at gmail.com
Fri Mar 4 21:19:34 CET 2011


Under load (three machines each running httperf), I ran a separate wget on
the side and am attaching the tcpdump of that request. As you can see, there
is a gap of about 20 seconds between Varnish ACKing the request
(20:15:46.933) and sending the first bytes of the response (20:16:07.129).
I expected this delay to be minimal when thread availability and hit rate
are both good.

Any help would be appreciated.

-T

On Fri, Mar 4, 2011 at 2:30 PM, Tejaswi Nadahalli <nadahalli at gmail.com> wrote:

> On Fri, Mar 4, 2011 at 2:25 PM, Caunter, Stefan <scaunter at topscms.com> wrote:
>
>> There’s no health check in the backend. Not sure what that does with a one
>> hour grace. I set a short grace with
>>
>>   if (req.backend.healthy) {
>>     set req.grace = 60s;
>>   } else {
>>     set req.grace = 4h;
>>   }
>>
>
> I have yet to add health checks, directors, etc., and will add them soon.
> But those only make sense once performance with a primed cache is good. In
> my test, I am requesting URLs that I know are already in the cache.
> varnishstat confirms this: there are no cache misses at all.
>
>
>>
>>
>> You also don’t appear to select a backend in recv.
>>
>
> The default backend seems to be getting picked up automatically.
>
> -T
>
>
>>
>>
>> Stefan Caunter
>>
>> Operations
>>
>> Torstar Digital
>>
>> m: (416) 561-4871
>>
>>
>>
>>
>>
>> *From:* varnish-misc-bounces at varnish-cache.org [mailto:
>> varnish-misc-bounces at varnish-cache.org] *On Behalf Of *Tejaswi Nadahalli
>> *Sent:* March-04-11 1:23 PM
>>
>> *To:* varnish-misc at varnish-cache.org
>> *Subject:* Re: Under Load: Server Unavailable/Connection Dropped/Delayed
>> Response
>>
>>
>>
>> On Fri, Mar 4, 2011 at 9:43 AM, Caunter, Stefan <scaunter at topscms.com>
>> wrote:
>>
>>
>>
>> What does something like Firebug show when you request during the load
>> test? The delay may be anything from DNS to the EC2 network.
>>
>>
>> DNS requests resolve very quickly, and I cannot see any other network
>> issues with EC2. A similar machine in the same data center runs nginx
>> under a similar load, but with no caching requirement, and it is running
>> fine.
>>
>> In my first post I forgot to attach my VCL, which is quite minimal.
>> Am I missing something obvious?
>>
>> ------
>> backend default0 {
>>     .host = "10.202.30.39";
>>     .port = "8000";
>> }
>>
>> sub vcl_recv {
>>     unset req.http.Cookie;
>>     set req.grace = 3600s;
>>     set req.url = regsub(req.url, "&refurl=.*&t=.*&c=.*&r=.*", "");
>> }
>>
>> sub vcl_deliver {
>>   if (obj.hits > 0) {
>>     set resp.http.X-Cache = "HIT";
>>   } else {
>>     set resp.http.X-Cache = "MISS";
>>   }
>> }
>> -------------------------
>>
>> Could there be some kind of TCP packet pileup that I am missing?
>>
>> -T
>>
>>
>>
>>
>> Stefan Caunter
>>
>> Operations
>>
>> Torstar Digital
>>
>> m: (416) 561-4871
>>
>>
>>
>>
>>
>> *From:* varnish-misc-bounces at varnish-cache.org [mailto:
>> varnish-misc-bounces at varnish-cache.org] *On Behalf Of *Tejaswi Nadahalli
>> *Sent:* March-04-11 1:09 AM
>> *To:* varnish-misc at varnish-cache.org
>> *Subject:* Under Load: Server Unavailable/Connection Dropped/Delayed
>> Response
>>
>>
>>
>> Hi Everyone,
>>
>> I am seeing a situation similar to:
>>
>> http://www.varnish-cache.org/lists/pipermail/varnish-misc/2011-January/005351.html
>> (Connections Dropped Under Load)
>>
>> http://www.varnish-cache.org/lists/pipermail/varnish-misc/2010-December/005258.html
>> (Hanging Connections)
>>
>> I have httperf loading a Varnish cache with never-expiring content. While
>> the load is on, other browser/wget requests to the Varnish server are
>> delayed by 10+ seconds. Any ideas what could be happening? ssh doesn't
>> seem to be affected. So, is it some kind of thread problem?
>>
>> In production, I see a similar situation at around 1000 requests/second.
>>
>>
>> I am running varnishd with the following command line options (as per
>> http://kristianlyng.wordpress.com/2009/10/19/high-end-varnish-tuning/):
>>
>> sudo varnishd -f /etc/varnish/default.vcl -s malloc,5G -T 127.0.0.1:2000 -a
>> 0.0.0.0:80 -p thread_pools=8 -p thread_pool_min=100 -p
>> thread_pool_max=5000 -p thread_pool_add_delay=2 -p cli_timeout=25 -p
>> session_linger=100 -p lru_interval=20 -t 31536000
>>
>> I am running 64-bit Ubuntu Lucid on an Amazon EC2 c1.xlarge instance with
>> 8 processing units.
>>
>> My network sysctl parameters are tuned according to:
>> http://varnish-cache.org/trac/wiki/Performance
>> fs.file-max = 360000
>> net.ipv4.ip_local_port_range = 1024 65536
>> net.core.rmem_max = 16777216
>> net.core.wmem_max = 16777216
>> net.ipv4.tcp_rmem = 4096 87380 16777216
>> net.ipv4.tcp_wmem = 4096 65536 16777216
>> net.ipv4.tcp_fin_timeout = 3
>> net.core.netdev_max_backlog = 30000
>> net.ipv4.tcp_no_metrics_save = 1
>> net.core.somaxconn = 262144
>> net.ipv4.tcp_syncookies = 0
>> net.ipv4.tcp_max_orphans = 262144
>> net.ipv4.tcp_max_syn_backlog = 262144
>> net.ipv4.tcp_synack_retries = 2
>> net.ipv4.tcp_syn_retries = 2
>>
>>
>> Any help would be greatly appreciated.
>>
>>
>>
>
>
-------------- next part --------------
20:15:46.896200 IP 208.64.111.126.7544 > 10.202.30.39.80: Flags [S], seq 975218147, win 5840, options [mss 1460,sackOK,TS val 239507633 ecr 0,nop,wscale 6], length 0
20:15:46.896220 IP 10.202.30.39.80 > 208.64.111.126.7544: Flags [S.], seq 2642556500, ack 975218148, win 5792, options [mss 1460,sackOK,TS val 267323553 ecr 239507633,nop,wscale 9], length 0
20:15:46.932874 IP 208.64.111.126.7544 > 10.202.30.39.80: Flags [.], ack 1, win 92, options [nop,nop,TS val 239507639 ecr 267323553], length 0
20:15:46.932900 IP 208.64.111.126.7544 > 10.202.30.39.80: Flags [P.], seq 1:341, ack 1, win 92, options [nop,nop,TS val 239507639 ecr 267323553], length 340
20:15:46.933404 IP 10.202.30.39.80 > 208.64.111.126.7544: Flags [.], ack 341, win 14, options [nop,nop,TS val 267323556 ecr 239507639], length 0
20:16:07.129730 IP 10.202.30.39.80 > 208.64.111.126.7544: Flags [.], seq 1:2897, ack 341, win 14, options [nop,nop,TS val 267325576 ecr 239507639], length 2896
20:16:07.129752 IP 10.202.30.39.80 > 208.64.111.126.7544: Flags [.], seq 2897:4345, ack 341, win 14, options [nop,nop,TS val 267325576 ecr 239507639], length 1448
20:16:07.138422 IP 208.64.111.126.7544 > 10.202.30.39.80: Flags [.], ack 1449, win 137, options [nop,nop,TS val 239512697 ecr 267325576], length 0
20:16:07.138439 IP 10.202.30.39.80 > 208.64.111.126.7544: Flags [.], seq 4345:5793, ack 341, win 14, options [nop,nop,TS val 267325577 ecr 239512697], length 1448
20:16:07.138446 IP 10.202.30.39.80 > 208.64.111.126.7544: Flags [P.], seq 5793:5998, ack 341, win 14, options [nop,nop,TS val 267325577 ecr 239512697], length 205
20:16:07.138450 IP 208.64.111.126.7544 > 10.202.30.39.80: Flags [.], ack 2897, win 182, options [nop,nop,TS val 239512697 ecr 267325576], length 0
20:16:07.138456 IP 208.64.111.126.7544 > 10.202.30.39.80: Flags [.], ack 4345, win 227, options [nop,nop,TS val 239512697 ecr 267325576], length 0
20:16:07.148340 IP 208.64.111.126.7544 > 10.202.30.39.80: Flags [.], ack 5793, win 273, options [nop,nop,TS val 239512699 ecr 267325577], length 0
20:16:07.148350 IP 208.64.111.126.7544 > 10.202.30.39.80: Flags [.], ack 5998, win 318, options [nop,nop,TS val 239512699 ecr 267325577], length 0
20:16:07.148353 IP 208.64.111.126.7544 > 10.202.30.39.80: Flags [F.], seq 341, ack 5998, win 318, options [nop,nop,TS val 239512699 ecr 267325577], length 0
20:16:07.148441 IP 10.202.30.39.80 > 208.64.111.126.7544: Flags [F.], seq 5998, ack 342, win 14, options [nop,nop,TS val 267325578 ecr 239512699], length 0
20:16:07.156951 IP 208.64.111.126.7544 > 10.202.30.39.80: Flags [.], ack 5999, win 318, options [nop,nop,TS val 239512702 ecr 267325578], length 0
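
To measure the stall in the capture above rather than eyeball it, here is a
minimal Python sketch. It is only an illustration: TRACE and largest_gap are
made-up names, the sample lines are copied from the capture with the TCP
options dropped for brevity, and it assumes every line starts with a
tcpdump-style HH:MM:SS.ffffff timestamp from the same day.

```python
from datetime import datetime

# Abridged lines from the capture above (TCP options dropped for brevity).
TRACE = """\
20:15:46.932900 IP 208.64.111.126.7544 > 10.202.30.39.80: Flags [P.], seq 1:341, ack 1, length 340
20:15:46.933404 IP 10.202.30.39.80 > 208.64.111.126.7544: Flags [.], ack 341, length 0
20:16:07.129730 IP 10.202.30.39.80 > 208.64.111.126.7544: Flags [.], seq 1:2897, ack 341, length 2896
20:16:07.129752 IP 10.202.30.39.80 > 208.64.111.126.7544: Flags [.], seq 2897:4345, ack 341, length 1448
20:16:07.138422 IP 208.64.111.126.7544 > 10.202.30.39.80: Flags [.], ack 1449, length 0
"""


def largest_gap(trace):
    """Return (gap_seconds, line_before, line_after) for the longest pause.

    Assumes at least two lines and no wrap past midnight, since the
    timestamps carry no date.
    """
    lines = [ln for ln in trace.splitlines() if ln.strip()]
    stamps = [datetime.strptime(ln.split()[0], "%H:%M:%S.%f") for ln in lines]
    # Index of the packet that follows the longest silence.
    worst = max(range(1, len(lines)), key=lambda i: stamps[i] - stamps[i - 1])
    gap = (stamps[worst] - stamps[worst - 1]).total_seconds()
    return gap, lines[worst - 1], lines[worst]


gap, before, after = largest_gap(TRACE)
print(f"longest pause: {gap:.3f} s")
print("  after :", before)
print("  before:", after)
```

On these lines the longest pause falls between the server's ACK of the
request and the first data segment of the response, i.e. the connection was
accepted and the request received promptly, and the wait happened before any
response bytes were sent. The server's TCP timestamp values on those two
packets in the full capture differ by 2020 ticks, which would agree with
roughly 20 seconds if that clock ticks at 100 Hz (an assumption, not
something the capture states).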