<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
On 07/03/2011 14:58, Lane, Richard wrote:
<blockquote cite="mid:C99A4EA0.3C67B%25rlane@ahbelo.com" type="cite">
<meta http-equiv="Content-Type" content="text/html;
charset=windows-1252">
<title>Let GoogleBot Crawl full content, reverse DNS lookup</title>
<font face="Arial"><span style="font-size: 11pt;"><br>
I am looking into supporting Google’s “First Click Free for
Web Search”. I need to allow the Googlebots to index the full
content of my sites but still maintain the registration wall
for everyone else. Google suggests that you detect their
Googlebots by a reverse DNS lookup of the requester’s IP. <br>
<br>
Google Desc: <a moz-do-not-send="true"
href="http://www.google.com/support/webmasters/bin/answer.py?answer=80553">http://www.google.com/support/webmasters/bin/answer.py?answer=80553</a><br>
<br>
Has anyone done DNS lookups via VCL to verify access to
content or to cache content?<br>
</span></font></blockquote>
<br>
I believe this /could/ be done with an inline C function in VCL,
but it's not something I've tried before.<br>
<br>
What you could do is detect the Googlebot user-agent in Varnish,
and then pass the client IP and the original URL through to a
backend script, such as:<br>
/* Varnish 2.0.6 pseudo-code - may need updating */<br>
if (req.http.user-agent ~ "Googlebot") {<br>
set req.http.X-Varnish-OriginalUrl = req.url;<br>
set req.url = "/googlecheck?ip=" client.ip "&originalurl=" req.url;<br>
lookup;<br>
}<br>
and the googlecheck script then performs the reverse DNS lookup
and, if it matches, returns the contents of the requested URL.<br>
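As a rough sketch of what that googlecheck backend might do (Python; the
function name and IP below are just illustrative - the reverse-then-forward
lookup itself is what Google's page describes):<br>

```python
import socket

def is_real_googlebot(ip):
    """Verify a claimed Googlebot IP with a double DNS lookup:
    reverse-resolve the IP, check the hostname is under
    googlebot.com or google.com, then forward-resolve that
    hostname and confirm it maps back to the same IP."""
    try:
        host, _, _ = socket.gethostbyaddr(ip)       # reverse DNS
    except OSError:
        return False
    if not host.endswith((".googlebot.com", ".google.com")):
        return False
    try:
        _, _, addrs = socket.gethostbyname_ex(host)  # forward DNS
    except OSError:
        return False
    return ip in addrs                               # must round-trip

if __name__ == "__main__":
    # A spoofed user-agent from an arbitrary IP fails the check.
    print(is_real_googlebot("127.0.0.1"))
```

The script would then either fetch and return the full article (for a
verified bot) or fall back to the registration wall.<br>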
<br>
Richard Chiswell<br>
<a class="moz-txt-link-freetext" href="http://www.mangahigh.com">http://www.mangahigh.com</a><br>
(Speaking personally yadda yadda)<br>
</body>
</html>