is 2.0.2 not as efficient as 1.1.2 was?

Tue Nov 25 23:37:14 CET 2008

Hello,

We run Gravatar.com and use varnish to cache avatar responses.  There
are a ton of very small objects and lots of requests per second.  Last
week we were running 1.1.2 compiled against tcmalloc (-t 600
-w 1,4000,5 -h classic,500009 -p thread_pools 10 -p listen_depth 4096
-s malloc,16G).  Its back end was an nginx load balancer on a separate
host, which distributed varnish's requests to our pool of web servers.
All was well.

This week we upgraded to 2.0.2 and are using varnish's own back end and
director configuration for the same work.  What we are seeing is that
2.0.2 holds only about 60% as many objects in the same amount of cache
space as 1.1.2 did (we tried tcmalloc, jemalloc, and mmap).  This caused
us quite a few problems after the upgrade, as varnish would start
spiking the load on the boxes into the hundreds.  We tried tuning
lru_interval (up) and obj_workspace (down), but we couldn't get varnish
to hold the same amount of data that it used to on the same machines.
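
As a rough illustration of what the 60% figure implies, here is a
minimal Python sketch.  It assumes capacity scales linearly with cache
size and that the object mix is unchanged between versions; the
function names are hypothetical and not part of varnish.

```python
from fractions import Fraction


def footprint_growth(retained_percent: int) -> Fraction:
    """Factor by which the average per-object footprint grew, given the
    percentage of objects that still fit in the same cache space."""
    if not 0 < retained_percent <= 100:
        raise ValueError("retained_percent must be in (0, 100]")
    return Fraction(100, retained_percent)


def equivalent_cache_gib(cache_gib: int, retained_percent: int) -> Fraction:
    """Cache size needed to hold the old object count again
    (assumption: capacity scales linearly with cache size)."""
    return cache_gib * footprint_growth(retained_percent)


if __name__ == "__main__":
    growth = footprint_growth(60)
    needed = equivalent_cache_gib(16, 60)
    print(f"per-object footprint grew about {float(growth):.2f}x")
    print(f"a 16G cache would need about {float(needed):.1f}G to match")
```

Under those assumptions each object takes roughly 1.67 times the space
it did before, so matching the old object count would take about 26.7G
rather than 16G.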

For now we've drastically reduced how long we keep cached objects,
which brought our cache hit rate down from 96% to 92% and roughly
doubled the requests (and load) on the web servers.  It is, however,
stable at this point.  Obviously we don't want to fall behind the
latest versions of varnish, but effectively doubling the requirements
for scaling the service is just as unappealing.
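
A four-point drop in hit rate doubling the back-end traffic can look
surprising, so here is a minimal Python sketch of the arithmetic.  It
assumes the total request volume is unchanged and that every cache miss
becomes exactly one back-end request; the function name is hypothetical.

```python
def backend_load_factor(old_hit_percent: int, new_hit_percent: int) -> float:
    """Ratio of new to old back-end request volume, assuming constant
    client traffic and one back-end request per cache miss."""
    old_miss = 100 - old_hit_percent
    new_miss = 100 - new_hit_percent
    if old_miss <= 0:
        raise ValueError("old hit rate must be below 100%")
    return new_miss / old_miss


if __name__ == "__main__":
    # Hit rate fell from 96% to 92%, so misses went from 4% to 8%.
    print(backend_load_factor(96, 92))
```

The miss rate goes from 4% to 8% of requests, so the web servers see
twice as many requests even though the hit rate moved by only four
points.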

So, what we're asking is... how do we get varnish 2 to be as efficient
as varnish 1 was?  We're glad to try things.  It takes a while to fill
the cache to the point where problems appear, so testing and reporting
back will take some time, but we'd like this fixed and will put in some
work.  We're currently running with the following command-line options:

-a 0.0.0.0:80 -f ... -P ... -T 10.1.94.43:6969 -t 600 -w 1,4000,5
-h classic,500009 -p thread_pools 10 -p listen_depth 4096 -s malloc,16G

And our VCL looks like this (with most of the web back ends omitted for
brevity, since they are repeated verbatim with only the numbers changed):

backend web11 {
        .host = "xxx"; .port = "8088";
        .probe = { .url = "xxx"; .timeout = 50 ms; .interval = 5s;
                   .window = 2; .threshold = 1; }
}
backend web12 {
        .host = "xxx"; .port = "8088";
        .probe = { .url = "xxx"; .timeout = 50 ms; .interval = 5s;
                   .window = 2; .threshold = 1; }
}

director default random {
        .retries = 3;
        { .backend = web11; .weight = 1; }
        { .backend = web12; .weight = 1; }
}

sub vcl_recv {
  set req.backend = default;
  set req.grace = 30s;
  if ( req.url ~ "^/(avatar|userimage)" && req.http.cookie )  {
    lookup;
  }
}

sub vcl_fetch {
  if (obj.ttl < 600s) {
    set obj.ttl = 600s;
  }
  if (obj.status == 404) {
    set obj.ttl = 30s;
  }
  if (obj.status == 500 || obj.status == 503 ) {
    pass;
  }
  set obj.grace = 30s;
  deliver;
}

sub vcl_deliver {
  remove resp.http.Expires;
  remove resp.http.Cache-Control;
  set resp.http.Cache-Control = "public, max-age=600, proxy-revalidate";
  deliver;
}