Varnish returning 503s for Googlebot requests (Bug #813?)

Ronan Mullally ronan at iol.ie
Mon Mar 7 20:45:43 CET 2011


Hi Mattias,

On Sun, 6 Mar 2011, Mattias Geniar wrote:

> Not sure if you've managed to test this yet, but Google seem to run with
> "Accept-Encoding: gzip". Perhaps there's a problem serving the
> compressed version, whereas your manual wget's don't use this
> accept-encoding?

You're spot on.  Adding an Accept-Encoding header to my wget requests
resulted in failures: the reported Content-Length was longer than the
body actually retrieved.  I tracked the fault down to PHP doing
compression via zlib.output_compression.
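
For anyone wanting to run the same check without wget, here is a minimal
Python sketch of the comparison described above.  It is an illustration,
not the exact commands I ran: the function names fetch_lengths and
length_mismatch are made up for this example, and the host and path are
the placeholders used elsewhere in this thread.  It sends a request with
an Accept-Encoding header and compares the advertised Content-Length
with the number of raw bytes that actually arrive.

```python
import http.client


def length_mismatch(declared, received):
    """Return True when a declared Content-Length differs from the bytes read.

    `declared` is the raw header value (a string, or None if absent);
    `received` is the number of body bytes actually read.
    """
    return declared is not None and int(declared) != received


def fetch_lengths(host, path, accept_encoding="gzip,deflate", port=80):
    """Fetch `path` from `host`; return (declared Content-Length, bytes read).

    http.client does not decompress the body, so the byte count is the
    on-the-wire size, which is what Content-Length describes.
    """
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        conn.request("GET", path, headers={"Accept-Encoding": accept_encoding})
        resp = conn.getresponse()
        declared = resp.getheader("Content-Length")
        try:
            body = resp.read()
        except http.client.IncompleteRead as exc:
            # The connection ended before the promised number of bytes
            # arrived; keep whatever was received.
            body = exc.partial
        return declared, len(body)
    finally:
        conn.close()


if __name__ == "__main__":
    # Placeholder host and path; point these at the backend directly.
    declared, received = fetch_lengths("www.sitename.net", "/sitemap_362.xml.gz")
    print("declared:", declared, "received:", received)
    print("mismatch:", length_mismatch(declared, received))
```

Running it against the backend rather than Varnish should show whether
the short body originates there.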

Thanks for your help.


-Ronan

> -----Original Message-----
> From: varnish-misc-bounces at varnish-cache.org
> [mailto:varnish-misc-bounces at varnish-cache.org] On Behalf Of Ronan
> Mullally
> Sent: Saturday 5 March 2011 10:48
> To: varnish-misc at varnish-cache.org
> Subject: Varnish returning 503s for Googlebot requests (Bug #813?)
>
> Hi,
>
> I'm a varnish noob.  I've only just started rolling out a cache in front
> of a VBulletin site running Apache that is currently using pound for
> load balancing.
>
> I'm running 2.1.5 on a debian lenny box.  Testing is going well, apart
> from one problem.  The site runs VBSEO to generate sitemap files.
> Without exception, every time Googlebot tries to request these files
> Varnish returns a 503:
>
>  66.249.66.246 - - [05/Mar/2011:09:33:53 +0000] "GET http://www.sitename.net/sitemap_151.xml.gz HTTP/1.1" 503 419 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
>
> I can request these files via wget direct from the backend as well as
> direct from varnish without a problem:
>
>  --2011-03-05 09:23:39--  http://www.sitename.net/sitemap_362.xml.gz
>
>  HTTP request sent, awaiting response...
>    HTTP/1.1 200 OK
>    Server: Apache
>    Content-Type: application/x-gzip
>    Content-Length: 130283
>    Date: Sat, 05 Mar 2011 09:23:38 GMT
>    X-Varnish: 1282440127
>    Age: 0
>    Via: 1.1 varnish
>    Connection: keep-alive
>  Length: 130283 (127K) [application/x-gzip]
>  Saving to: `/dev/null'
>
>  2011-03-05 09:23:39 (417 KB/s) - `/dev/null' saved [130283/130283]
>
> I've reverted to default.vcl, the only changes being to define my own
> backends.  Varnishlog output is below.  Having googled a bit, the only
> thing I've found is bug #813, but that was apparently fixed prior to
> 2.1.5.  Am I missing something obvious?
>
>
> -Ronan
>
>
> Varnishlog output
>
>    18 ReqStart     c 66.249.66.246 63009 1282436348
>    18 RxRequest    c GET
>    18 RxURL        c /sitemap_362.xml.gz
>    18 RxProtocol   c HTTP/1.1
>    18 RxHeader     c Host: www.sitename.net
>    18 RxHeader     c Connection: Keep-alive
>    18 RxHeader     c Accept: */*
>    18 RxHeader     c From: googlebot(at)googlebot.com
>    18 RxHeader     c User-Agent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
>    18 RxHeader     c Accept-Encoding: gzip,deflate
>    18 RxHeader     c If-Modified-Since: Sat, 05 Mar 2011 08:40:46 GMT
>    18 VCL_call     c recv
>    18 VCL_return   c lookup
>    18 VCL_call     c hash
>    18 VCL_return   c hash
>    18 VCL_call     c miss
>    18 VCL_return   c fetch
>    18 Backend      c 40 sitename sitename1
>    40 TxRequest    b GET
>    40 TxURL        b /sitemap_362.xml.gz
>    40 TxProtocol   b HTTP/1.1
>    40 TxHeader     b Host: www.sitename.net
>    40 TxHeader     b Accept: */*
>    40 TxHeader     b From: googlebot(at)googlebot.com
>    40 TxHeader     b User-Agent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
>    40 TxHeader     b Accept-Encoding: gzip,deflate
>    40 TxHeader     b X-Forwarded-For: 66.249.66.246
>    40 TxHeader     b X-Varnish: 1282436348
>    40 RxProtocol   b HTTP/1.1
>    40 RxStatus     b 200
>    40 RxResponse   b OK
>    40 RxHeader     b Date: Sat, 05 Mar 2011 09:17:37 GMT
>    40 RxHeader     b Server: Apache
>    40 RxHeader     b Content-Length: 130327
>    40 RxHeader     b Content-Encoding: gzip
>    40 RxHeader     b Vary: Accept-Encoding
>    40 RxHeader     b Content-Type: application/x-gzip
>    18 TTL          c 1282436348 RFC 10 1299316657 0 0 0 0
>    18 VCL_call     c fetch
>    18 VCL_return   c deliver
>    18 ObjProtocol  c HTTP/1.1
>    18 ObjStatus    c 200
>    18 ObjResponse  c OK
>    18 ObjHeader    c Date: Sat, 05 Mar 2011 09:17:37 GMT
>    18 ObjHeader    c Server: Apache
>    18 ObjHeader    c Content-Encoding: gzip
>    18 ObjHeader    c Vary: Accept-Encoding
>    18 ObjHeader    c Content-Type: application/x-gzip
>    18 FetchError   c straight read_error: 0
>    40 Fetch_Body   b 4 4294967295 1
>    40 BackendClose b sitename1
>    18 VCL_call     c error
>    18 VCL_return   c deliver
>    18 VCL_call     c deliver
>    18 VCL_return   c deliver
>    18 TxProtocol   c HTTP/1.1
>    18 TxStatus     c 503
>    18 TxResponse   c Service Unavailable
>    18 TxHeader     c Server: Varnish
>    18 TxHeader     c Retry-After: 0
>    18 TxHeader     c Content-Type: text/html; charset=utf-8
>    18 TxHeader     c Content-Length: 419
>    18 TxHeader     c Date: Sat, 05 Mar 2011 09:17:38 GMT
>    18 TxHeader     c X-Varnish: 1282436348
>    18 TxHeader     c Age: 1
>    18 TxHeader     c Via: 1.1 varnish
>    18 TxHeader     c Connection: close
>    18 Length       c 419
>    18 ReqEnd       c 1282436348 1299316657.660784483 1299316658.684726000 0.478523970 1.023897409 0.000044107
>    18 SessionClose c error
>    18 StatSess     c 66.249.66.246 63009 6 1 5 0 0 4 2984 32012
>
>



