Handling of cache-control

Rob S rtshilston at gmail.com
Tue Jan 19 21:31:14 CET 2010


Michael Fischer wrote:
> On Mon, Jan 18, 2010 at 4:37 PM, Poul-Henning Kamp
> <phk at phk.freebsd.dk> wrote:
>
>     In message <DE028C9E-4618-4EBC-8477-6E308753CBCE at dynamine.net>,
>     "Michael S. Fischer" writes:
>     >On Jan 18, 2010, at 5:20 AM, Tollef Fog Heen wrote:
>
>     >> My suggestion is to also look at Cache-control: no-cache, possibly also
>     >> private and no-store and obey those.
>     >
>     >Why wasn't it doing it all along?
>
>     Because we wanted to give the backend a chance to tell Varnish one
>     thing with respect to caching, and the client another.
>
>     I'm not saying we hit the right decision, and welcome any consistent,
>     easily explainable policy you guys can agree on.
>
>
> Well, the problem is that application engineers who understand what 
> that header does have a reasonable expectation that the caches will 
> obey them, and so I think Varnish should honor them as Squid does. 
>  Otherwise surprising results will occur when the caching platform is 
> changed.
>
> Cache-Control: private certainly meets the goal you stated, at least 
> insofar as making Varnish behave differently than the client -- it 
> states that the client can cache, but Varnish (as an intermediate 
> cache) cannot.  
>
> I assume, however, that some engineers want a way to do the opposite - 
> to inform Varnish that it can cache, but inform the client that it 
> cannot.  Ordinarily I'd think this is not a very good idea, since you 
> almost always want to keep the cached copy as close to the user as 
> possible.  But I guess there are some circumstances where an engineer 
> would want to preload a cache with prerendered data that is expensive 
> to generate, and, also asynchronously force updates by flushing stale 
> objects with a PURGE or equivalent.  In that case the cache TTL would 
> be very high, but not necessarily meaningful. 
>
> I'm not sure it makes sense to extend the Cache-Control: header here, 
> because there could be secondary intermediate caches downstream that 
> are not under the engineer's control; so we need a way to inform only 
> authorized intermediate caches that they should cache the response 
> with the specified TTL.  
>
> One way I've seen to accomplish this goal is to inject a custom header 
> in the response, but we need to ensure it is either encrypted (so that 
> non-authorized caches can't see it -- but this could be costly in 
> terms of CPU) or removed by the last authorized intermediate cache as 
> the response is passed back downstream.
>
> --Michael

Michael,

You've obviously got some strong views about varnish, as we've all seen 
from the mailing list over the past few days!

When we deployed varnish, we did so in front of applications that 
weren't prepared to have a cache in front of them.  Accordingly, we 
disabled all caching on HTML and RSS type content in Varnish, and 
instead just cached CSS / JS / images.  This was a good outcome because 
we could stop using round robin DNS (which is a bit questionable, imho, 
if it includes more than two or three hosts) to the web servers, and 
instead just point 2 A records at Varnish.  We elected to use 
X-External-Cache-Control and X-Internal-TTL as headers that we'd set 
in Varnish-aware applications.  So, old apps that emit Cache-Control 
headers are completely uncached by Varnish, and new apps can benefit from 
a certain degree of caching by Varnish.

PHK's plans for 2010 will enable us to fully exploit our X-Internal-TTL 
headers because Varnish will be able to parse TTL values out of headers.  In 
the meantime, these are hard-set in Varnish to a value that's 
appropriate for our apps.

The X-External-Cache-Control is then presented as Cache-Control to 
public HTTP requests.
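
To make the header handling above concrete, here is a rough Python sketch of the decision the edge makes for each backend response.  It is only an illustration, not our actual Varnish configuration: the header names come from this post, but the function name, the 300 second default, and the rule that the mere presence of X-Internal-TTL switches caching on are all assumptions.

```python
# Hypothetical sketch of the edge policy described above.
# Header names are from the post; everything else is an assumption.
DEFAULT_INTERNAL_TTL = 300  # seconds; stand-in for the value hard-set in Varnish


def edge_policy(backend_headers):
    """Return (internal_ttl_seconds, headers_sent_to_the_client)."""
    headers = dict(backend_headers)
    external = headers.pop("X-External-Cache-Control", None)
    internal = headers.pop("X-Internal-TTL", None)

    if external is None and internal is None:
        # Old app that is not Varnish-aware: do not cache it, and pass
        # its own Cache-Control through to the client unchanged.
        return 0, headers

    # Varnish-aware app: the TTL value itself is not parsed yet, so the
    # presence of X-Internal-TTL simply selects the hard-set default.
    ttl = DEFAULT_INTERNAL_TTL if internal is not None else 0
    if external is not None:
        # The external policy is what public clients see as Cache-Control.
        headers["Cache-Control"] = external
    return ttl, headers
```

Note that both X- headers are removed before the response leaves the edge, so public clients only ever see a plain Cache-Control header.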

This describes how we've chosen to deploy varnish, without causing our 
application developers huge headaches.  In parallel, we've changed many 
of our sites to use local cookies+javascript to add personalisation to 
the most popular pages.  Overall, deploying Varnish has seen a big 
reduction in back end requests, PLUS the ability to load balance over a 
large pool whilst still implementing sticky-sessions where our apps 
still need them.  Varnish is, as the name suggests, a lovely layer in 
front of our platform which makes it perform better.

Now, to answer your points: 

1) Application developers needing to be aware of caching headers:  I'd disagree 
here.  Our approach is to use code libraries to deliver functionality to 
the developers which the sysadmins can maintain.  There's always some 
overlap here, but we're comfortable with our position.  We're a PHP 
company, and so we've a class that's used statically, with methods such 
as Cacheability::noCache(), Cacheability::setExternalExpiryTime($secs), 
and Cacheability::setInternalExpiryTime($secs), as well as 
Cacheability::purgeCache($path).  Just as, I'm sure, your developers are 
using abstraction layers for database access, then they could use a 
similar approach for cacheability.
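
For readers who want a feel for what such a library looks like, here is a minimal sketch.  Our real class is PHP; this is a hypothetical Python analogue, and the exact header values it emits ("no-cache", "max-age=N", a bare number of seconds) are assumptions chosen for illustration.  The purge method is left out to keep the sketch free of network calls.

```python
class Cacheability:
    """Hypothetical Python analogue of the static PHP class named above."""

    # Shared, class-level state, mimicking a class that is used statically.
    _headers = {}

    @classmethod
    def no_cache(cls):
        # Forbid caching both at the public edge and inside Varnish.
        cls._headers["X-External-Cache-Control"] = "no-cache"
        cls._headers["X-Internal-TTL"] = "0"

    @classmethod
    def set_external_expiry_time(cls, secs):
        # What browsers and third-party caches will eventually be told.
        cls._headers["X-External-Cache-Control"] = f"max-age={int(secs)}"

    @classmethod
    def set_internal_expiry_time(cls, secs):
        # How long our own caching layer may keep the object.
        cls._headers["X-Internal-TTL"] = str(int(secs))

    @classmethod
    def headers(cls):
        # Copy of the headers the application should emit for this response.
        return dict(cls._headers)
```

A developer then writes something like Cacheability.set_external_expiry_time(60) and never touches a raw header, which leaves the sysadmins free to change the header scheme later.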

2) Preloading the cache:  This is something we do.  We set 
InternalExpiryTime to be high, and ExternalExpiryTime to be very low.  
Then, when there's a change, the app calls a purge.
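
The effect of a high internal expiry plus an explicit purge can be shown with a toy in-memory cache.  This is a sketch of the idea only and says nothing about how Varnish stores objects; the class name, method names and the explicit "now" argument are all made up for the example.

```python
class TinyCache:
    """Toy stand-in for an intermediate cache, for illustration only."""

    def __init__(self):
        self._store = {}  # path -> (expires_at, body)

    def get(self, path, now, fetch, internal_ttl):
        entry = self._store.get(path)
        if entry is not None and entry[0] > now:
            return entry[1]  # hit: the backend is not contacted
        body = fetch(path)  # miss or expired: render the page again
        self._store[path] = (now + internal_ttl, body)
        return body

    def purge(self, path):
        # Forget the object so the next request re-renders it.
        self._store.pop(path, None)
```

With internal_ttl set to a day, the backend renders a page once and is not asked again until either the day is up or the app calls purge() for that path.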

3) Downstream caches:  You have to decide whether the caches are under 
your control or are public.  You should make the edge of your estate 
behave as you want, and let third parties worry about themselves.  Get 
your outermost caches to strip all headers other than those you want 
retained. 
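
Stripping at the outermost cache amounts to an allowlist filter over the response headers.  A minimal sketch, assuming a particular set of public headers (the list below is an example, not a recommendation) and a hypothetical function name:

```python
# Assumed allowlist; choose whatever your public clients actually need.
PUBLIC_HEADERS = {
    "cache-control",
    "content-type",
    "content-length",
    "etag",
    "last-modified",
}


def strip_for_public(headers, allowed=PUBLIC_HEADERS):
    """Keep only allowlisted headers; names are compared case-insensitively."""
    return {k: v for k, v in headers.items() if k.lower() in allowed}
```

Anything internal, such as X-Internal-TTL, never makes the list and so never reaches a third-party cache.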

In summary, I think you need to partition what's done by your sysadmins 
and what's the job of your developers.  I also think it'd help me (and 
probably the mailing list) if you could give a little more detail about 
the site(s) you're running behind Varnish, what your main troubles are 
with your architecture, and why you thought Varnish would help.  (For the 
information of others, we're predominantly using Varnish to balance 
traffic between a pool of servers that deliver a news website.  Combined 
with memcache and gluster, Varnish works well as a frontend to the estate.)



Rob


