Client reflected response back to varnish, and got into a nasty loop

Rob S rtshilston at gmail.com
Wed Nov 30 20:17:26 CET 2011


Hi,

We had a strange incident happy on a varnish node today, which
resulted in a sustained extra 60 reqs/sec for around 4.5 hrs of
identical sequential requests with the *request* being "HTTP/1.1 302
Found".  The particular IP address had been browsing normally, until
they got the following response:


<snip>
  491 TxProtocol   c HTTP/1.1
  491 TxStatus     c 200
  491 TxResponse   c OK
  491 TxHeader     c Content-Type: text/javascript;charset=UTF-8
  491 TxHeader     c Cache-Control: no-cache, no-store, max-age=0,
must-revalidate, private, no-transform
  491 TxHeader     c Expires: Fri, 29 Apr 2011 05:00:00 GMT
  491 TxHeader     c Content-Length: 2715
  491 TxHeader     c Date: Wed, 30 Nov 2011 12:04:53 GMT
  491 TxHeader     c X-Varnish: 747013197
  491 TxHeader     c Age: 0
  491 TxHeader     c Via: 1.1 varnish
  491 TxHeader     c Connection: keep-alive
  491 Length       c 2715
  491 ReqEnd       c 747013197 1322654693.630128860
1322654693.636650801 0.030966997 0.006455421 0.000066519


This looks absolutely fine - we sent them a short javascript snippet
that shouldn't be cached.  This is a request which happens thousands
of times an hour with no issue.  What happened next is strange:  The
response we'd just sent them (above) came back to varnish as a fresh
request:

  491 ReqStart     c 86.166.aaa.bbb 56853 747013203
  491 RxRequest    c HTTP/1.1
  491 RxURL        c 200
  491 RxProtocol   c OK
  491 RxHeader     c Content-Type: text/javascript;charset=UTF-8
  491 RxHeader     c Cache-Control: no-cache, no-store, max-age=0,
must-revalidate, private, no-transform
  491 RxHeader     c Expires: Fri, 29 Apr 2011 05:00:00 GMT
  491 RxHeader     c Content-Length: 2715
  491 RxHeader     c Date: Wed, 30 Nov 2011 12:04:53 GMT
  491 RxHeader     c X-Varnish: 747013197
  491 RxHeader     c Age: 0
  491 RxHeader     c Via: 1.1 varnish
  491 RxHeader     c Connection: keep-alive
  491 RxHeader     c X-Antivirus: avast! 4
  491 RxHeader     c X-Antivirus-Status: Clean


Look how we *received* an X-Varnish header, with the same ID as the
previous response.  But, it must have gone via the client, as an
additional header was added - the X-Antivirus.  This then was a miss
in Varnish, and fell through to our backends, which responds with a
redirect if requested with an invalid hostname:

  491 VCL_call     c recv pass
  491 VCL_call     c hash hash
  491 VCL_call     c pass pass
  491 Backend      c 823 primarypool web01
  491 TTL          c 747013203 RFC 120 1322654693 0 0 0 0
  491 VCL_call     c fetch
  491 TTL          c 747013203 VCL 0 1322654694
  491 VCL_return   c deliver
  491 ObjProtocol  c HTTP/1.1
  491 ObjStatus    c 302
  491 ObjResponse  c Found
  491 ObjHeader    c Date: Wed, 30 Nov 2011 11:04:54 GMT
  491 ObjHeader    c Location: http://www.mainsite.com
  491 ObjHeader    c Content-Length: 201
  491 ObjHeader    c Content-Type: text/html; charset=iso-8859-1
  491 ObjHeader    c Cache-Control: max-age=0, public
  491 ObjHeader    c X-Varnish-Status: u
  491 VCL_call     c deliver deliver
  491 TxProtocol   c HTTP/1.1
  491 TxStatus     c 302
  491 TxResponse   c Found
  491 TxHeader     c Location: http://www.mainsite.com
  491 TxHeader     c Content-Type: text/html; charset=iso-8859-1
  491 TxHeader     c Cache-Control: max-age=0, public
  491 TxHeader     c X-Varnish-Status: u
  491 TxHeader     c Content-Length: 201
  491 TxHeader     c Date: Wed, 30 Nov 2011 12:04:53 GMT
  491 TxHeader     c X-Varnish: 747013203
  491 TxHeader     c Age: 0
  491 TxHeader     c Via: 1.1 varnish
  491 TxHeader     c Connection: keep-alive
  491 Length       c 201
  491 ReqEnd       c 747013203 1322654693.676038265
1322654693.701058388 0.039387465 0.024901628 0.000118494

And then this pattern continued for over 4 hrs.


  491 ReqStart     c 86.166.aaa.bbb 56853 747013208
  491 RxRequest    c HTTP/1.1
  491 RxURL        c 302
  491 RxProtocol   c Found
  491 RxHeader     c Location: http://www.mainsite.com
  491 RxHeader     c Content-Type: text/html; charset=iso-8859-1
  491 RxHeader     c Cache-Control: max-age=0, public
  491 RxHeader     c X-Varnish-Status: u
  491 RxHeader     c Content-Length: 201
  491 RxHeader     c Date: Wed, 30 Nov 2011 12:04:53 GMT
  491 RxHeader     c X-Varnish: 747013203
  491 RxHeader     c Age: 0
  491 RxHeader     c Via: 1.1 varnish
  491 RxHeader     c Connection: keep-alive
  491 VCL_call     c recv pass
  491 VCL_call     c hash hash
  491 VCL_call     c pass pass
  491 Backend      c 115 primarypool web02
  491 TTL          c 747013208 RFC 120 1322654693 0 0 0 0
  491 VCL_call     c fetch
  491 TTL          c 747013208 VCL 0 1322654694
  491 VCL_return   c deliver
  491 ObjProtocol  c HTTP/1.1
  491 ObjStatus    c 302
  491 ObjResponse  c Found
  491 ObjHeader    c Date: Wed, 30 Nov 2011 11:04:54 GMT
  491 ObjHeader    c Location: http://www.mainsite.com
  491 ObjHeader    c Content-Length: 201
  491 ObjHeader    c Content-Type: text/html; charset=iso-8859-1
  491 ObjHeader    c Cache-Control: max-age=0, public
  491 ObjHeader    c X-Varnish-Status: u
  491 VCL_call     c deliver deliver
  491 TxProtocol   c HTTP/1.1
  491 TxStatus     c 302
  491 TxResponse   c Found
  491 TxHeader     c Location: http://www.mainsite.com
  491 TxHeader     c Content-Type: text/html; charset=iso-8859-1
  491 TxHeader     c Cache-Control: max-age=0, public
  491 TxHeader     c X-Varnish-Status: u
  491 TxHeader     c Content-Length: 201
  491 TxHeader     c Date: Wed, 30 Nov 2011 12:04:53 GMT
  491 TxHeader     c X-Varnish: 747013208
  491 TxHeader     c Age: 0
  491 TxHeader     c Via: 1.1 varnish
  491 TxHeader     c Connection: keep-alive
  491 Length       c 201
  491 ReqEnd       c 747013208 1322654693.774824381
1322654693.775967836 0.073765993 0.001085758 0.000057697

  491 ReqStart     c 86.166.aaa.bbb 56853 747013209
  491 RxRequest    c HTTP/1.1
  491 RxURL        c 302
  491 RxProtocol   c Found
  491 RxHeader     c Location: http://www.mainsite.com
  491 RxHeader     c Content-Type: text/html; charset=iso-8859-1
  491 RxHeader     c Cache-Control: max-age=0, public
  491 RxHeader     c X-Varnish-Status: u
  491 RxHeader     c Content-Length: 201
  491 RxHeader     c Date: Wed, 30 Nov 2011 12:04:53 GMT
  491 RxHeader     c X-Varnish: 747013208
  491 RxHeader     c Age: 0
  491 RxHeader     c Via: 1.1 varnish
  491 RxHeader     c Connection: keep-alive
  491 VCL_call     c recv pass
  491 VCL_call     c hash hash
  491 VCL_call     c pass pass
  491 Backend      c 115 primarypool web03
  491 TTL          c 747013209 RFC 120 1322654693 0 0 0 0
  491 VCL_call     c fetch
  491 TTL          c 747013209 VCL 0 1322654694
  491 VCL_return   c deliver
  491 ObjProtocol  c HTTP/1.1
  491 ObjStatus    c 302
  491 ObjResponse  c Found
  491 ObjHeader    c Date: Wed, 30 Nov 2011 11:04:55 GMT
  491 ObjHeader    c Location: http://www.mainsite.com
  491 ObjHeader    c Content-Length: 201
  491 ObjHeader    c Content-Type: text/html; charset=iso-8859-1
  491 ObjHeader    c Cache-Control: max-age=0, public
  491 ObjHeader    c X-Varnish-Status: u
  491 VCL_call     c deliver deliver
  491 TxProtocol   c HTTP/1.1
  491 TxStatus     c 302
  491 TxResponse   c Found
  491 TxHeader     c Location: http://www.mainsite.com
  491 TxHeader     c Content-Type: text/html; charset=iso-8859-1
  491 TxHeader     c Cache-Control: max-age=0, public
  491 TxHeader     c X-Varnish-Status: u
  491 TxHeader     c Content-Length: 201
  491 TxHeader     c Date: Wed, 30 Nov 2011 12:04:53 GMT
  491 TxHeader     c X-Varnish: 747013209
  491 TxHeader     c Age: 0
  491 TxHeader     c Via: 1.1 varnish
  491 TxHeader     c Connection: keep-alive
  491 Length       c 201
  491 ReqEnd       c 747013209 1322654693.791796923
1322654693.793078899 0.015829086 0.001143694 0.000138283




Has anyone seen anything like this before?  How would you reduce the
likelihood of this circular (almost) DoS attach in Varnish?  Should we
drop inbound requests if they contain an RxHeader of X-Varnish?  Our
vcl_recv is fairly complex, but ultimately boils down to
return(lookup) if it doesn't match any earlier criteria, so that's the
path being hit.

We didn't see a significant change in network bandwidth usage, and at
60 requests per second, I wonder if this tight loop happened
internally within Varnish, rather than going all the way to the end IP
address.  60 sequential requests per second is one every 16ms.  The IP
address is an ADSL connection in the UK, so it *could* be 16ms away,
but that's at the very low end of what's the likely ping time.  But,
on the other hand, 16ms is a long time for Varnish to be in a loop in
the datacentre, given that the backend web servers are on the same
gigabit LAN as Varnish and our backends are fast for this request.

We're running varnish-2.1.5-1.el6.x86_64 on CentOS 6.0 (2.6.32-71.el6.x86_64)


Thanks,


Rob




More information about the varnish-misc mailing list