suggesting to increase timeout_req default to 7 seconds

Nils Goroll slink at schokola.de
Mon Mar 16 13:43:00 CET 2015


Here's the consensus from the bugwash on IRC:

- we'll add stats counters for SESS_CLOSE
- then we'll check on larger production systems what impact increasing
timeout_req to 7 seconds has, by
  - comparing the RX_TIMEOUT to sess_conn
  - comparing tcp counters
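
The first comparison could be sketched roughly like this in Python. It is only an
illustration: the SESS_CLOSE counters do not exist yet, so the dictionary keys
"rx_timeout" and "sess_conn" and the function name are hypothetical placeholders,
and the snapshots are assumed to be cumulative counters taken at two points in time.

```python
def timeout_close_ratio(before, after,
                        timeout_key="rx_timeout", conn_key="sess_conn"):
    """Fraction of sessions accepted between two counter snapshots
    that were closed with RX_TIMEOUT.

    `before` and `after` are dicts of cumulative counters. The key
    names are hypothetical placeholders, not real varnishstat fields.
    """
    closed = after[timeout_key] - before[timeout_key]
    accepted = after[conn_key] - before[conn_key]
    if accepted <= 0:
        return None  # no new sessions in the interval, ratio is undefined
    return closed / accepted
```

Taking this ratio over comparable traffic periods, once with timeout_req at 2
seconds and once at 7 seconds, would show whether the share of timed-out
sessions actually drops.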

scn wants to provide advice on which TCP stats to watch

Nils

On 15/03/15 18:36, Nils Goroll wrote:
> Hi,
> 
> some time has passed since my initial email regarding this suggestion and it
> still holds.
> 
> Unless there is a strong argument against it, I think we really should increase
> the default timeout_req to 7 seconds. I think the argumentation for this value
> is sound and I haven't found any reasons against it.
> 
> Please keep this suggestion separate from the suggestion to re-introduce
> SO_LINGER. I still need to do production system tests with it.
> 
> Nils
> 
> On 26/02/15 11:27, Nils Goroll wrote:
>> This tcpdump output illustrates an issue we seem to have with default Linux tcp
>> timeouts and the default timeout_req of 2 seconds:
>>
>> 16:47:44.542049 IP client.49550 > varnish.80: Flags [S], seq 29295818, win 4380, options [mss 1460,sackOK,eol], length 0
>> 16:47:44.542080 IP varnish.80 > client.49550: Flags [S.], seq 3652568857, ack 29295819, win 29200, options [mss 1460,nop,nop,sackOK], length 0
>> 16:47:44.542250 IP client.49550 > varnish.80: Flags [.], ack 1, win 4380, length 0
>> 16:47:46.080501 IP client.49550 > varnish.80: Flags [P.], seq 1:1453, ack 1, win 4380, length 1452
>> 16:47:46.080528 IP varnish.80 > client.49550: Flags [.], ack 1453, win 31944, length 0
>> 16:47:48.082783 IP varnish.80 > client.49550: Flags [F.], seq 1, ack 1453, win 31944, length 0
>> 16:47:48.083070 IP client.49550 > varnish.80: Flags [.], ack 2, win 4380, length 0
>> 16:47:48.350763 IP client.49550 > varnish.80: Flags [P.], seq 1453:2905, ack 2, win 4380, length 1452
>> 16:47:48.350792 IP varnish.80 > client.49550: Flags [R], seq 3652568859, win 0, length 0
>>
>> The packet at 16:47:46.080501 contains the first part of a request up to the
>> start of a very long cookie line.
>>
>> At 16:47:48 varnish closes after reaching timeout_req of 2s. Then, the client
>> immediately acks.
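
To make the timing in the trace explicit, here is a small Python sketch that
computes the gaps between the relevant tcpdump timestamps. The timestamps are
copied from the capture above; the helper name gap_seconds and the variable
names are mine and only illustrative.

```python
from datetime import datetime

def gap_seconds(earlier, later):
    """Seconds between two tcpdump timestamps taken on the same day."""
    fmt = "%H:%M:%S.%f"
    delta = datetime.strptime(later, fmt) - datetime.strptime(earlier, fmt)
    return delta.total_seconds()

first_segment = "16:47:46.080501"   # client sends seq 1:1453
fin_sent = "16:47:48.082783"        # varnish sends FIN
second_segment = "16:47:48.350763"  # client sends seq 1453:2905

# Time from the first request bytes until varnish closes the connection.
print(gap_seconds(first_segment, fin_sent))
# Time from the first request segment until the second one arrives.
print(gap_seconds(first_segment, second_segment))
```

The first gap is about 2.002 seconds, which matches the 2s timeout_req, and the
second is about 2.270 seconds, so the rest of the request arrived roughly 0.27
seconds after the close.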
>>
>> My understanding is that the varnish->client ack 1453 got lost and the client
>> did not get around to retransmitting seq 1:1453 before we timed out.
>>
>>
>> The most helpful online reference regarding recommended initial TCP
>> retransmission timeouts I have found so far is
>> http://tools.ietf.org/html/rfc6298#ref-PA00
>>
>> In summary, an initial retransmission timeout (RTO) of 1s is now recommended,
>> but the former 3s RTO remains valid. So, for any client following the former
>> 3s recommendation, we currently don't even tolerate a single packet
>> retransmission after the three-way handshake is complete. For clients
>> following the new 1s recommended RTO, timing is also really tight: it seems
>> unlikely that we tolerate retransmission of two packets.
>>
>> Based on this, I'd suggest raising the default timeout_req to 7 seconds to
>> allow for two retransmissions at RTO=3.
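
The arithmetic can be checked with a simplified Python model. It assumes the
retransmission timer and timeout_req start at the same instant and ignores
network latency; the function name and the backoff parameter are illustrative
only. With backoff=1.0 the retransmission interval stays constant, while
backoff=2.0 models the timer doubling described in RFC 6298, section 5.5.

```python
def retransmit_times(initial_rto, deadline, backoff=2.0):
    """Offsets (seconds after the first send) of the retransmissions
    that fire strictly before `deadline`.

    Simplified model: no network latency, and the timer starts at the
    same moment as the deadline clock.
    """
    times = []
    rto = initial_rto
    elapsed = rto
    while elapsed < deadline:
        times.append(elapsed)
        rto *= backoff   # back off the timer after each expiry
        elapsed += rto
    return times

for rto in (1.0, 3.0):
    for timeout_req in (2.0, 7.0):
        print(rto, timeout_req,
              retransmit_times(rto, timeout_req, backoff=1.0),
              retransmit_times(rto, timeout_req, backoff=2.0))
```

In this model a 2 second deadline leaves room for no retransmission at RTO=3 and
for only one at RTO=1. With a constant interval, 7 seconds fits two
retransmissions at RTO=3 (at 3s and 6s). With doubling, the second one at RTO=3
would only come at 9s, while clients using RTO=1 get retransmissions at 1s and 3s.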
>>
>> This seems to be particularly relevant with the growing popularity of mobile
>> clients.
>>
>> The risk is increased resource usage for malicious requests. To address it, I'd
>> suggest documenting that lowering timeout_req can be an option to mitigate
>> certain DoS (slowloris) attacks.
>>
>>
>> Nils
>>
>>
>> _______________________________________________
>> varnish-dev mailing list
>> varnish-dev at varnish-cache.org
>> https://www.varnish-cache.org/lists/mailman/listinfo/varnish-dev
>>


