Let GoogleBot Crawl full content, reverse DNS lookup
rlane at ahbelo.com
Mon Mar 7 17:51:49 CET 2011
I am aware of Google's policy about serving different content to search
users, which is why I have to implement their "First Click Free" program.
I will use the User-Agent but need to go a step further and verify via
DNS that the crawler is who it claims to be.
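
To make the DNS check concrete, here is a minimal sketch in Python of a
forward-confirmed reverse DNS lookup: reverse-resolve the client IP, check
that the hostname is in a Google domain, then forward-resolve that hostname
and confirm it maps back to the same IP. This is not VCL and not something
I have running; the function name, the injectable resolver parameters, and
the domain suffix list are assumptions for illustration, and the suffixes
should be checked against Google's own documentation.

```python
import socket

# Assumed hostname suffixes for Google's crawlers; verify against Google's docs.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Return True if ip reverse-resolves to a Google host that resolves back to ip.

    reverse_lookup and forward_lookup are hypothetical hooks so the logic
    can be exercised without real DNS; by default the socket module is used.
    """
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        # Step 1: reverse lookup (PTR) of the requesting IP.
        host = reverse_lookup(ip).rstrip(".").lower()
        # Step 2: the hostname must sit under one of the expected domains.
        if not host.endswith(GOOGLE_SUFFIXES):
            return False
        # Step 3: forward lookup must include the original IP,
        # otherwise the PTR record could simply be spoofed.
        return ip in forward_lookup(host)
    except OSError:
        # Covers socket.herror / socket.gaierror: treat lookup failure as unverified.
        return False
```

Since each lookup can block on DNS, the result would probably need to be
cached per IP rather than computed on every request.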
On 3/7/11 9:05 AM, "Mattias Geniar" <mattias at nucleus.be> wrote:
> I would look at the user agent to verify if it's a GoogleBot or not, as
> that's more easily checked via VCL. All GoogleBots also adhere to the
> correct User-Agent.
> There really aren't that many users that spoof their User-Agent to gain
> extra access.
> Also keep in mind that serving GoogleBot different content than actual
> users will get you penalties in SEO, eventually dropping your Google
> ranking. Just, FYI.
> From: varnish-misc-bounces at varnish-cache.org
> [mailto:varnish-misc-bounces at varnish-cache.org] On Behalf Of Lane,
> Sent: maandag 7 maart 2011 15:58
> To: varnish-misc at varnish-cache.org
> Subject: Let GoogleBot Crawl full content, reverse DNS lookup
> I am looking into supporting Google's "First Click Free for Web Search".
> I need to allow the GoogleBots to index the full content of my sites but
> still maintain the Registration wall for everyone else. Google suggests
> that you detect their GoogleBots by reverse DNS lookup of the requester's IP address.
> Google Desc:
> Has anyone done DNS lookups via VCL to verify access to content or to
> cache content?
> System Desc:
> Varnish 2.1.4
> RHEL 5-4
> Apache 2.2x
> - Richard