suggesting to increase timeout_req

Nils Goroll slink at schokola.de
Thu Feb 26 11:27:05 CET 2015


This tcpdump output illustrates an issue we seem to have with default Linux TCP
timeouts and the default timeout_req of 2 seconds:

16:47:44.542049 IP client.49550 > varnish.80: Flags [S], seq 29295818, win 4380, options [mss 1460,sackOK,eol], length 0
16:47:44.542080 IP varnish.80 > client.49550: Flags [S.], seq 3652568857, ack 29295819, win 29200, options [mss 1460,nop,nop,sackOK], length 0
16:47:44.542250 IP client.49550 > varnish.80: Flags [.], ack 1, win 4380, length 0
16:47:46.080501 IP client.49550 > varnish.80: Flags [P.], seq 1:1453, ack 1, win 4380, length 1452
16:47:46.080528 IP varnish.80 > client.49550: Flags [.], ack 1453, win 31944, length 0
16:47:48.082783 IP varnish.80 > client.49550: Flags [F.], seq 1, ack 1453, win 31944, length 0
16:47:48.083070 IP client.49550 > varnish.80: Flags [.], ack 2, win 4380, length 0
16:47:48.350763 IP client.49550 > varnish.80: Flags [P.], seq 1453:2905, ack 2, win 4380, length 1452
16:47:48.350792 IP varnish.80 > client.49550: Flags [R], seq 3652568859, win 0, length 0

The packet at 16:47:46.080501 contains the first part of a request up to the
start of a very long cookie line.

At 16:47:48.082783 varnish closes the connection (FIN) after reaching
timeout_req of 2s. The client acks the FIN immediately, but its next data
segment at 16:47:48.350763 is answered with a RST.

My understanding is that the varnish->client ack 1453 got lost and the client
did not get around to retransmitting seq 1:1453 before we timed out.
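
As a quick check of the timing, the intervals can be computed from the
timestamps in the capture. This is only a sketch: the helper name
seconds_between and the variable names are made up for illustration, and the
timestamps are copied from the tcpdump output above.

```python
from datetime import datetime

FMT = "%H:%M:%S.%f"  # tcpdump's default time-of-day format


def seconds_between(start: str, end: str) -> float:
    """Return the number of seconds from start to end (same day assumed)."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds()


# Timestamps copied from the capture above.
handshake_done = "16:47:44.542250"  # client completes the three-way handshake
first_data = "16:47:46.080501"      # client sends seq 1:1453
varnish_fin = "16:47:48.082783"     # varnish sends FIN
second_data = "16:47:48.350763"     # client sends seq 1453:2905

idle_before_request = seconds_between(handshake_done, first_data)
fin_after_first_data = seconds_between(first_data, varnish_fin)
data_after_fin = seconds_between(varnish_fin, second_data)

print(f"handshake -> first data:  {idle_before_request:.6f}s")   # 1.538251s
print(f"first data -> FIN:        {fin_after_first_data:.6f}s")  # 2.002282s
print(f"FIN -> second data:       {data_after_fin:.6f}s")        # 0.267980s
```

The FIN follows the first data segment by almost exactly 2s, and the second
data segment arrives roughly 0.27s too late.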


The most helpful online reference regarding recommended initial TCP
retransmission timeouts I have found so far is
http://tools.ietf.org/html/rfc6298#ref-PA00

In summary, an initial timeout (RTO) of 1s is now recommended, but the former 3s
RTO remains valid. So, for any client following the former 3s recommendation,
we currently don't even tolerate a single packet retransmission after the
three-way handshake is complete. For clients following the new 1s
recommendation, timing is also really tight: it seems unlikely that we tolerate
retransmission of two packets.

Based on this, I'd suggest raising the default timeout_req to 7 seconds to
allow for two retransmissions at RTO=3.
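
To make the arithmetic explicit, here is a rough sketch of which
retransmissions fit inside a given timeout_req. It rests on simplifying
assumptions that are not from the trace: the deadline is counted from the
moment the original segment was sent, network delay is ignored, and the
retransmission timer is multiplied by a backoff factor after every
retransmission (2.0 corresponds to the doubling in RFC 6298 section 5.5, 1.0 to
a timer that is not backed off). The function name retransmit_times is made up
for illustration.

```python
def retransmit_times(initial_rto: float, deadline: float,
                     backoff: float = 2.0) -> list[float]:
    """Return the times (seconds after the original send) at which
    retransmissions happen on or before the deadline.

    Simplified model: no network delay, and the RTO is multiplied by
    `backoff` after each retransmission.
    """
    times = []
    elapsed, rto = 0.0, initial_rto
    while elapsed + rto <= deadline:
        elapsed += rto
        times.append(elapsed)
        rto *= backoff
    return times


# Current default timeout_req of 2s
print(retransmit_times(3.0, 2.0))  # [] : no retransmission fits at RTO=3
print(retransmit_times(1.0, 2.0))  # [1.0] : one fits at RTO=1, the next is at 3s

# Proposed timeout_req of 7s
print(retransmit_times(1.0, 7.0))               # [1.0, 3.0, 7.0]
print(retransmit_times(3.0, 7.0, backoff=1.0))  # [3.0, 6.0]
print(retransmit_times(3.0, 7.0))               # [3.0]
```

Under these assumptions, 7 seconds covers two retransmissions from RTO=3 when
the timer is not backed off (3s and 6s). With the doubling from RFC 6298, the
second retransmission from RTO=3 would only come at 9s, while a client starting
at RTO=1 gets retransmissions at 1s, 3s and 7s.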

This seems to be particularly relevant with the growing popularity of mobile
clients.

The risk is increased resource usage for malicious requests. To address it, I'd
suggest documenting that lowering timeout_req can be an option to mitigate
certain DoS (slowloris) attacks.


Nils



