ESI and search engine spiders
Rob S
rtshilston at gmail.com
Tue Aug 10 22:05:48 CEST 2010
Hi,
On one site we run behind Varnish, we've got a "most popular" widget
displayed on every page (much like http://www.bbc.co.uk/news/).
However, this pollutes search engine results: searches for a specific
popular headline tend to link not to the article itself, but to one of
the index pages with a high Google PageRank or similar.
What I'd like to know is how other Varnish users have served different
ESI content depending on whether the client is a bot or not.
My initial idea was to set an "X-Not-For-Bots: 1" header on the response
from the URL that generates the most-popular fragment, then do something
like this (untested):
sub vcl_deliver {
    # Serve bots a placeholder instead of the flagged fragment
    if (req.http.User-Agent ~ "bot" && resp.http.X-Not-For-Bots == "1") {
        error 752 "Not for bots";
    } else {
        deliver;
    }
}
...
sub vcl_error {
    if (obj.status == 752) {
        # Turn the synthetic error into an empty, successful response
        set obj.status = 200;
        synthetic {"<!-- not for bots -->"};
        deliver;
    }
}
However, such an approach doesn't work, as the req object isn't
available in vcl_deliver. We'd prefer to drive this with a header such
as X-Not-For-Bots or similar, rather than hard-coding into Varnish a
list of ESI fragments to be suppressed from bots.
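One alternative I've been sketching (equally untested; it assumes the
most-popular app can be changed to send "Vary: X-Is-Bot" on the
fragment, that ESI subrequests honour Vary, and X-Is-Bot is just a
header name I've invented) is to normalise the bot check into a request
header in vcl_recv, and keep all knowledge of which fragments to
suppress in the backend rather than in VCL:

sub vcl_recv {
    # Classify the client once; this header travels with the request,
    # so the backend can vary the fragment response on it.
    if (req.http.User-Agent ~ "bot") {
        set req.http.X-Is-Bot = "1";
    } else {
        set req.http.X-Is-Bot = "0";
    }
}

The fragment URL would then return its normal body plus
"Vary: X-Is-Bot" for ordinary clients, and an empty body (or a comment
like the one above) when X-Is-Bot is 1, leaving Varnish to cache the
two variants separately.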
Has anyone any good suggestions?
Rob