[Varnish] #541: Suggested VCL for cross domain XMLHttpRequest using Varnish

Varnish varnish-bugs at projects.linpro.no
Sat Aug 15 13:13:01 CEST 2009


#541: Suggested VCL for cross domain XMLHttpRequest using Varnish
---------------------------------+------------------------------------------
 Reporter:  ned14                |        Type:  enhancement  
   Status:  new                  |    Priority:  low          
Milestone:  Varnish 2.1 release  |   Component:  documentation
  Version:  trunk                |    Severity:  minor        
 Keywords:                       |  
---------------------------------+------------------------------------------
 Following on from [http://varnish.projects.linpro.no/ticket/536], here is
 a suggested way to have Varnish proxy and cache another website so that
 AJAX code can make cross-domain XMLHttpRequests without being blocked by
 the browser's same-origin policy. In other words, this makes a third-party
 website appear to be part of your own website by means of URL rewriting.

 Normally one configures Apache, or whatever the front-end web server is,
 to do the URL rewriting and proxying. Having Varnish do it instead has one
 major benefit: Varnish can cache the results, which greatly reduces the
 load on the third-party server.

 Firstly, add a backend:
 {{{
 backend repec {
         .host = "ideas.repec.org";
         .port = "80";
 }
 }}}
 This is ideas.repec.org, an index of Economics publications. One can
 retrieve the list of all academic Economics publications for a given
 author by fetching a URL such as
 [http://ideas.repec.org/cgi-bin/authorref.cgi?handle=pdo206&output=0].

 In sub vcl_recv you want something like this at the start:
 {{{
 sub vcl_recv {
         /*set req.grace = 20s;*/ /* Only enable if you don't mind slightly stale content */

         /* Rewrite all requests to /repec/cgi-bin/authorref.cgi to
            http://ideas.repec.org/cgi-bin/authorref.cgi */
         if (req.url ~ "^/repec/cgi-bin/authorref\.cgi") {
                 set req.http.host = "ideas.repec.org";
                 set req.url = regsub(req.url, "^/repec", "");
                 set req.backend = repec;
                 remove req.http.Cookie;
                 lookup;
         } else {
                 set req.backend = default;
                 /* ... do normal processing ... */
         }
 }
 }}}

 And finally in sub vcl_fetch:

 {{{
 sub vcl_fetch {
         /*set req.grace = 20s;*/ /* Only enable if you don't mind slightly stale content */
         if (req.http.host == "ideas.repec.org") {
                 /* Correct the wrong response */
                 set obj.http.Content-Type = "text/html; charset=utf-8";
                 set obj.ttl = 86400s;
                 set obj.http.Cache-Control = "max-age=3600";
                 deliver;
         }
 }}}

 This first corrects the wrong MIME type returned by the RePEc server,
 which declares text/plain with charset iso-8859-1. It then keeps the
 response in the Varnish cache for one day (obj.ttl = 86400s), so the RePEc
 server is asked at most once per day per author. Finally, the
 Cache-Control header tells the web browser and any intermediate caches
 not to come back to Varnish for one hour after a fetch.

 Ideally I would have set an Expires: header, but I am not entirely sure
 how to compute one in VCL. I suppose one could overwrite the max-age in
 vcl_hit with 86400 minus the Age header that Varnish reports for a cached
 object; for example, an object that has been in the cache for 7200
 seconds would get max-age=79200. Anyway, a one-hour browser cache expiry
 is good enough for most cases where someone is casually browsing a
 website.
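
 For anyone on Varnish 4.0 or later, where several of the names used above
 were changed, the same setup might look roughly like the sketch below. It
 is an untested translation of the VCL above, not something that has been
 run against RePEc, and the default backend's address (127.0.0.1:8080) is a
 made-up placeholder for your own web server.
 {{{
 vcl 4.0;

 /* Hypothetical local web server; replace with your own backend. */
 backend default {
         .host = "127.0.0.1";
         .port = "8080";
 }

 backend repec {
         .host = "ideas.repec.org";
         .port = "80";
 }

 sub vcl_recv {
         /* Rewrite /repec/cgi-bin/authorref.cgi to the RePEc server */
         if (req.url ~ "^/repec/cgi-bin/authorref\.cgi") {
                 set req.http.host = "ideas.repec.org";
                 set req.url = regsub(req.url, "^/repec", "");
                 set req.backend_hint = repec;   /* was req.backend */
                 unset req.http.Cookie;          /* was remove */
                 return (hash);                  /* was lookup */
         }
         /* Everything else falls through to normal processing */
 }

 sub vcl_backend_response {   /* was vcl_fetch */
         if (bereq.http.host == "ideas.repec.org") {
                 /* Correct the wrong Content-Type; obj.* became beresp.* */
                 set beresp.http.Content-Type = "text/html; charset=utf-8";
                 /* Keep in the Varnish cache for one day */
                 set beresp.ttl = 86400s;
                 /* Browsers and intermediate caches: one hour */
                 set beresp.http.Cache-Control = "max-age=3600";
                 return (deliver);
         }
 }
 }}}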

 I hope that someone finds this useful - I certainly have.

 Cheers,[[BR]]
 Niall

-- 
Ticket URL: <http://varnish.projects.linpro.no/ticket/541>
Varnish <http://varnish.projects.linpro.no/>
The Varnish HTTP Accelerator

