Varnish and TCP Incast Throughput Collapse

Guillaume Quintard guillaume at varnish-software.com
Thu Jul 6 09:08:20 CEST 2017


Two things: do you get the same results when the client is directly on the
Varnish server (i.e., not going through the switch)? And is each new
request opening a new connection?
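
For the second check, a quick client-side sketch (the URL and object path
here are hypothetical placeholders for your setup):

    # Compare a reused connection against fresh connections per request.
    import requests  # third-party: pip install requests

    URL = "http://varnish.example.internal/object-128k"  # hypothetical

    # Reused connection: one TCP session carries all requests (keep-alive).
    with requests.Session() as s:
        for _ in range(10):
            s.get(URL)

    # Fresh connection per request: a new TCP handshake every time.
    for _ in range(10):
        requests.get(URL, headers={"Connection": "close"})

Watching either variant with "ss -t" or a packet capture should make it
obvious whether the real client is reusing connections or not.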

-- 
Guillaume Quintard

On Thu, Jul 6, 2017 at 6:45 AM, Andrei <lagged at gmail.com> wrote:

> Out of curiosity, what does ethtool show for the related NICs on both
> servers? I also have Varnish on a 10G server, and can reach around
> 7.7 Gbit/s serving anywhere between 6k and 28k requests/second; however,
> it did take some sysctl tuning and the Westwood TCP congestion control
> algorithm.
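>
> For reference, that kind of tuning looks roughly like the following
> (illustrative values only, not a recommendation; adjust for your setup):
>
>     # /etc/sysctl.d/10g-tuning.conf
>     net.ipv4.tcp_congestion_control = westwood  # needs the tcp_westwood module
>     net.core.rmem_max = 16777216                # max socket receive buffer (bytes)
>     net.core.wmem_max = 16777216                # max socket send buffer (bytes)
>     net.ipv4.tcp_rmem = 4096 87380 16777216     # min / default / max per socket
>     net.ipv4.tcp_wmem = 4096 65536 16777216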
>
> On Wed, Jul 5, 2017 at 3:09 PM, John Salmon
> <John.Salmon at deshawresearch.com> wrote:
>
>> I've been using Varnish in an "intranet" application.  The picture is
>> roughly:
>>
>>   origin <--> Varnish <-- 10G channel --> switch <-- 1G channel --> client
>>
>> The machine running Varnish is a high-performance server.  It can
>> easily saturate a 10Gbit channel.  The machine running the client is a
>> more modest desktop workstation, but it's fully capable of saturating
>> a 1Gbit channel.
>>
>> The client makes HTTP requests for objects of size 128kB.
>>
>> When the client makes those requests serially, "useful" data is
>> transferred at about 80% of the channel bandwidth of the Gigabit
>> link, which seems perfectly reasonable.
>>
>> But when the client makes the requests in parallel (typically
>> 4-at-a-time, but it can vary), *total* throughput drops to about 25%
>> of the channel bandwidth, i.e., about 30Mbyte/sec.
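>>
>> (For reproduction, the parallel case is essentially the sketch below;
>> the URLs are placeholders for the real 128 kB objects:)
>>
>>     # Fetch 128 kB objects 4-at-a-time and report aggregate throughput.
>>     import time
>>     from concurrent.futures import ThreadPoolExecutor
>>
>>     import requests  # third-party: pip install requests
>>
>>     URLS = ["http://varnish.example/obj%d" % i for i in range(100)]
>>
>>     start = time.time()
>>     with ThreadPoolExecutor(max_workers=4) as pool:
>>         total = sum(pool.map(lambda u: len(requests.get(u).content), URLS))
>>     print("throughput: %.1f Mbyte/s" % (total / (time.time() - start) / 1e6))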
>>
>> After looking at traces and doing a fair amount of experimentation, we
>> have reached the tentative conclusion that we're seeing "TCP Incast
>> Throughput Collapse" (see references below).
>>
>> The literature on "TCP Incast Throughput Collapse" typically describes
>> scenarios where a large number of servers overwhelm a single inbound
>> port.  I haven't found any discussion of incast collapse with only one
>> server, but it seems like a natural consequence of a 10-Gigabit-capable
>> server feeding a 1-Gigabit downlink.
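>>
>> (A back-of-envelope sketch of the mismatch, with an assumed switch
>> buffer size since we haven't measured ours:)
>>
>>     # Burst of 4 parallel 128 kB responses hitting a 1G port at 10G speed.
>>     burst = 4 * 128e3            # bytes arriving nearly simultaneously
>>     rate_in = 10e9 / 8           # arrival rate, bytes/s
>>     rate_out = 1e9 / 8           # drain rate of the 1G port, bytes/s
>>     buffer = 128e3               # assumed per-port switch buffer, bytes
>>
>>     # The burst arrives ~10x faster than it drains, so the port must
>>     # queue ~90% of it; the excess is dropped, and the retransmission
>>     # timeouts that follow stall all four flows at once.
>>     backlog = burst * (1 - rate_out / rate_in)
>>     print("peak backlog ~%.0f kB vs %.0f kB buffer" % (backlog / 1e3, buffer / 1e3))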
>>
>> Has anybody else seen anything similar, with Varnish or other single
>> servers on 10 Gbit to 1 Gbit links?
>>
>> The literature offers a variety of mitigation strategies, but there are
>> non-trivial tradeoffs and none appears to be a silver bullet.
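>>
>> (One example from that literature: lowering the minimum retransmission
>> timeout, which Linux exposes per route via iproute2. The route below is
>> a placeholder, and the tradeoff is spurious retransmits on lossy paths:)
>>
>>     # lower the minimum RTO toward the client subnet (placeholder route)
>>     ip route change 192.168.1.0/24 dev eth0 rto_min 10ms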
>>
>> If anyone has seen TCP Incast Collapse with Varnish, were you able to work
>> around it, and if so, how?
>>
>> Thanks,
>> John Salmon
>>
>> References:
>>
>> http://www.pdl.cmu.edu/Incast/
>>
>> Annotated Bibliography in:
>>    https://lists.freebsd.org/pipermail/freebsd-net/2015-November/043926.html
>>
>> --
>> *.*