[Varnish] #682: hash maps

Varnish varnish-bugs at varnish-cache.org
Mon Apr 19 22:30:59 CEST 2010


#682: hash maps
-------------------------+--------------------------------------------------
 Reporter:  rosenfield   |       Owner:  phk                  
     Type:  enhancement  |      Status:  new                  
 Priority:  normal       |   Milestone:                       
Component:  varnishd     |     Version:  trunk                
 Severity:  normal       |    Keywords:  hash, hashtable, maps
-------------------------+--------------------------------------------------
 Hi

 I have a web site where every URL either:
  1) returns an immutable (cacheable) content page or[[BR]]
  2) results in an (uncacheable) HTTP redirect to a more-specific URL.

 I'd like to cache everything under bullet 1).

 {{{
   +-------http-------+                          +------302------+
   | lots of contents |                          | go here       |
   | ...              |                          | +------302------+
   | +-------https------+                        +-| update acc'd  |
   | | blah blah        |                          | go there      |
   | | blah             |                          +---------------+
   | | ....             |
   +-|                  |
     |                  |
     +------------------+
 }}}

 Many pages are for authorized users only.  Caching them requires a bit of
 work.  Specifics follow.

 I want an elegant design, meaning:
  1) No hitting the backend servers to do auth,[[BR]]
  2) No superfluous per-user backend requests,[[BR]]
  3) No superfluous per-user copies of the same page.

 {{{
   +-----------+           +-----------+         +-----------+
   |           | ---> ---> |           |         |           |
   |  CLIENT   | blah blah |   CACHE   |  SSSH!  |  BACKEND  |
   |           | <--- <--- |           |         |           |
   +-----------+           +-----------+         +-----------+
 }}}


 So far so good.

 At a minimum, I need to keep track of a few details via VCL:
  1) which security clearance a cached page requires, and[[BR]]
  2) given the user's authentication cookie, which clearances it grants.

 Bullet 1) is easy to implement.  There are advanced and very flexible
 solutions where the backend server emits a page's security context via an
 HTTP header.  But for starters let's go with something simple: all pages
 which require elevated privileges go into /admin/, which we'll call
 security context 1.  Even more dangerous stuff goes into /superuser/,
 aka context 2.  Everything else is context 0.  A snippet of VCL then
 simply deduces the security context (aka required clearance) by matching
 the URL against a regex.

 {{{
   ^            ^       ^           ^       ^              ^
   |            |       |           |       |              |
   | /*         |       | /admin/*  |       | /superuser/* |
   | /public/*  |       |           |       |              |
   | /images/*  |       |           |       |              |
   | /foo/*     |       |           |       |              |
    \----------/         \---------/         \------------/
     Context 0            Context 1            Context 2
 }}}

 Easy peasy.
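
 To make the URL-to-context rule concrete, here is a minimal sketch in
 Python rather than VCL.  The names CONTEXT_RULES and required_context are
 made up for illustration; only the /admin/ and /superuser/ prefixes and
 the context numbers come from the description above.

```python
import re

# Hypothetical rule table mirroring the layout described in the ticket.
# The first matching pattern wins.
CONTEXT_RULES = [
    (re.compile(r"^/superuser/"), 2),
    (re.compile(r"^/admin/"), 1),
]


def required_context(url):
    """Return the security context (required clearance) for a URL path."""
    for pattern, context in CONTEXT_RULES:
        if pattern.match(url):
            return context
    return 0  # everything else is context 0


print(required_context("/admin/users"))
print(required_context("/images/logo.png"))
```

 The patterns are anchored with a trailing slash, so a path such as
 /administrator falls through to context 0.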


 Bullet 2) comes in two parts.  First, we need to shovel the user's
 security clearances (or "allowed contexts") to the VCL.  Second, we need
 to check that against the page's required clearance (or "security
 context") on subsequent requests.

 The first part is no problem: when the user logs in, the backend adds a
 header to the login response, call it "X-Granted-Clearances: 1,2".  The
 VCL can now pick out both the user's cookie value from the
 "Set-Cookie: SESSAUTH=blah" header, and which security contexts this user
 has access to from the X-Granted-Clearances header.
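
 As a rough illustration of that extraction step, the following Python
 sketch pulls both values out of a login response.  It is not VCL; the
 function name parse_login_response and the (name, value) list used to
 represent headers are assumptions made for the example.

```python
def parse_login_response(headers):
    """Return (SESSAUTH cookie value, set of granted contexts).

    `headers` is a list of (name, value) pairs; this representation is an
    assumption of the sketch, not something Varnish provides.
    """
    cookie = None
    granted = set()
    for name, value in headers:
        lname = name.lower()
        if lname == "set-cookie":
            # The cookie pair comes first; attributes follow after ';'.
            pair = value.split(";", 1)[0].strip()
            key, sep, val = pair.partition("=")
            if sep and key.strip() == "SESSAUTH":
                cookie = val.strip()
        elif lname == "x-granted-clearances":
            # Comma-separated context numbers; ignore anything malformed.
            for token in value.split(","):
                token = token.strip()
                if token.isdigit():
                    granted.add(int(token))
    return cookie, granted
```

 The resulting pair is exactly what needs to be remembered between
 requests: the cookie value as the key, the granted contexts as the value.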

 For the second part, we hit a brick wall.  When the next request from the
 user comes in, VCL has no memory of the earlier login response: nothing
 set while handling one request carries over to the next.

 So therein lies my problem.  I need a hash map which is accessible from
 VCL, in which I can store authentication cookie values plus which security
 contexts they map to.

 Additionally, the hash map must clean itself up.  The simplest mechanism
 will do, such as a 5-minute timeout on all entries.

 The required API functions as seen from VCL code are thus:
 {{{
    map-define(map-id, timeout)   [ creates the map if it doesn't exist. ]
    map-set(map-id, key, value)   [ adds or updates entry. NULL value
                                    destroys entry. internally timestamps
                                    new entries. ]
    map-touch(map-id, key)        [ updates internal timestamp. ]
    map-get(map-id, key)          [ retrieves an entry. ]
 }}}
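
 To pin down the intended semantics, here is a small Python model of the
 four calls.  It is only a sketch: TimedMap, map_define and the injectable
 clock are invented for illustration, every set (not only the first) is
 assumed to refresh the timestamp, and expired entries are dropped lazily
 on lookup instead of by a background thread.

```python
import time


class TimedMap:
    """Key/value map whose entries expire `timeout` seconds after last use."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout
        self._clock = clock          # injectable so expiry can be tested
        self._entries = {}           # key -> (value, timestamp)

    def set(self, key, value):
        """map-set: add or update an entry; a None value destroys it."""
        if value is None:
            self._entries.pop(key, None)
        else:
            self._entries[key] = (value, self._clock())

    def get(self, key):
        """map-get: return the value, or None if missing or expired."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, stamp = entry
        if self._clock() - stamp >= self.timeout:
            del self._entries[key]   # lazily drop the expired entry
            return None
        return value

    def touch(self, key):
        """map-touch: refresh the timestamp of a live entry."""
        value = self.get(key)
        if value is not None:
            self._entries[key] = (value, self._clock())


_maps = {}


def map_define(map_id, timeout):
    """map-define: create the map if it doesn't exist, then return it."""
    if map_id not in _maps:
        _maps[map_id] = TimedMap(timeout)
    return _maps[map_id]
```

 With this model, a login would call set(cookie, clearances) on the map and
 each later request would call get(cookie), followed by touch(cookie) to
 keep an active session alive.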

 And the cleanup thread must at least do:

 {{{
    while (true) {
       if (no-entries) {
          wait-for-first-entry;
       } else {
          find-entry-with-lowest-timeout;
          wait-remaining;
          purge-if-still-exists-and-not-touched;
       }
    }
 }}}
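
 The loop above could be modelled roughly as follows.  This is a Python
 sketch built on threading.Condition, not a proposal for how varnishd
 should implement it; the class name ExpiringMap is made up, and the
 "still exists and not touched" check is done by recomputing the oldest
 entry after every wait.

```python
import threading
import time


class ExpiringMap:
    """Map with a background thread that purges entries after `timeout`."""

    def __init__(self, timeout):
        self.timeout = timeout
        self._entries = {}                      # key -> (value, last touched)
        self._cond = threading.Condition()
        threading.Thread(target=self._cleanup, daemon=True).start()

    def set(self, key, value):
        with self._cond:
            self._entries[key] = (value, time.monotonic())
            self._cond.notify()                 # wake a cleaner waiting for an entry

    def touch(self, key):
        with self._cond:
            if key in self._entries:
                self._entries[key] = (self._entries[key][0], time.monotonic())

    def get(self, key):
        with self._cond:
            entry = self._entries.get(key)
            return entry[0] if entry else None

    def _cleanup(self):
        with self._cond:
            while True:
                if not self._entries:
                    self._cond.wait()           # wait-for-first-entry
                    continue
                # find-entry-with-lowest-timeout: the least recently touched
                key, (_, stamp) = min(self._entries.items(),
                                      key=lambda item: item[1][1])
                remaining = stamp + self.timeout - time.monotonic()
                if remaining > 0:
                    self._cond.wait(remaining)  # wait-remaining (lock is released)
                    continue                    # re-check: touched or removed meanwhile?
                del self._entries[key]          # purge the expired entry
```

 Because the cleaner re-evaluates the oldest entry after each wait, a touch
 that arrives while it sleeps simply postpones the purge.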

 I think hash maps with timeouts have other use cases as well!

 (For example, the security context could be stored in a {url,context} map
 rather than being deduced directly from the URL, resulting in a more
 flexible solution.)

-- 
Ticket URL: <http://varnish-cache.org/ticket/682>
Varnish <http://varnish-cache.org/>
The Varnish HTTP Accelerator



