Do I need to unset cookie in vcl_fetch if there is remove req.http.Cookie in vcl_recv?
Michael Alger
varnish at mm.quex.org
Fri Feb 11 05:01:25 CET 2011
On Fri, Feb 11, 2011 at 01:48:47AM +0100, David Murphy wrote:
> Hello
>
> I've been testing removing cookies from images/css/js and am a
> little unclear on the difference between:
>
>
> //start ==========
> sub vcl_recv {
>     if (req.url ~ "\.(png|gif|jpg|swf|css|js)$") {
>         remove req.http.Cookie;
>     }
> }
> //end==========
>
>
> and...
>
>
> //start =========
> sub vcl_recv {
>     if (req.url ~ "\.(png|gif|jpg|swf|css|js)$") {
>         unset req.http.cookie;
>     }
> }
>
> sub vcl_fetch {
>     if (req.url ~ "\.(png|gif|jpg|swf|css|js)$") {
>         unset beresp.http.set-cookie;
>     }
> }
> //end =========
>
>
> The first example removes the cookie from the incoming requests
> for these files so that these objects are cacheable.

Kind of. I think it's more accurate to say that removing the cookie
from the request allows Varnish to look up the requested URI in its
cache. The default VCL will pass any request that includes a cookie
to the backend, on the assumption that the cookie may matter to the
website (for example, the cookie may indicate the user is logged in,
which allows the website to decline to serve resources to users who
aren't logged in).
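
For reference, the check in question looks roughly like this in the
2.1-era default vcl_recv. This is a trimmed sketch reproduced from
memory, so compare it against the default.vcl shipped with your
version before relying on it:

```vcl
sub vcl_recv {
    # ... earlier default checks (request method and so on) omitted ...
    if (req.http.Authorization || req.http.Cookie) {
        /* Not cacheable by default */
        return (pass);
    }
    return (lookup);
}
```

Removing req.http.Cookie in your own vcl_recv means this condition is
no longer true by the time the default code runs, so the request
falls through to the cache lookup.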

The above example assumes that any request matching that pattern is
not dependent on cookies, and therefore the cookie can be discarded
from the request altogether.

If you didn't remove the cookie from the client's request, but also
didn't consider the cookie as part of the hash value when trying to
fetch the object from cache, you'd achieve much the same thing. The
difference is that, if the object isn't in cache, when Varnish goes
to fetch it from the backend, that request will include any cookies
that particular client had sent.

Probably that won't matter, but if the server is actually sending a
different response to different cookies, the cached object will end
up being served to other clients who should have received something
different because they had a different cookie.
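
As a sketch of that alternative: assuming the 2.1-style default
vcl_hash, which hashes only the URL and the Host header (so cookies
are already left out of the hash), it should be enough to return
lookup before the default cookie check gets a chance to pass the
request:

```vcl
sub vcl_recv {
    if (req.url ~ "\.(png|gif|jpg|swf|css|js)$") {
        # The Cookie header is left in place, so it is forwarded to
        # the backend whenever the lookup results in a cache miss.
        return (lookup);
    }
}
```

Note that returning here also skips the rest of the default vcl_recv
for these URLs, including its check on the request method.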

If the server's response doesn't vary based on the cookies, then it
doesn't matter if you send an object that was fetched with cookie A
to a client who has cookie B; but it doesn't make any sense to send
the server cookie values it's going to ignore. Stripping the cookie
early in the request gives you a simpler configuration and achieves
the same result.

> What I'm a bit unclear on is why the 2nd example uses 'unset' in
> both vcl_recv and also when the object has been retrieved from the
> backend (vcl_fetch).

The line in vcl_recv removes any cookie that the client may have in
its /request/. In vcl_fetch you're removing the "Set-Cookie" header
from the /response/ from the server (which assumes a cache miss and
therefore a backend request to fetch the object).

Responses that set a cookie are usually not cacheable, since a
Set-Cookie header often indicates that some client-specific activity
is taking place, which is why the default VCL will refuse to cache
such responses.
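
The corresponding check in the 2.1-era default vcl_fetch is roughly
the following (again a sketch from memory; the variable names differ
between Varnish versions, so check your own default.vcl):

```vcl
sub vcl_fetch {
    if (!beresp.cacheable) {
        return (pass);
    }
    # Any response carrying a Set-Cookie header is passed to the
    # client without being stored in the cache.
    if (beresp.http.Set-Cookie) {
        return (pass);
    }
    return (deliver);
}
```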

Some session-tracking mechanisms may try to set a cookie with every
response, without regard for whether doing so actually makes sense.
Removing the Set-Cookie header in responses to certain object types
provides a workaround for that situation.

Without the line in vcl_fetch, if your server is actually sending a
Set-Cookie header with responses for those resources, you would end
up either never caching them in the first place, or you'd cache the
response along with the Set-Cookie. Subsequent cache hits would see
every client receiving the same cookie. If it actually is a session
tracking cookie, that could be pretty disastrous.

> I need to be 100% sure that for all image/css objects, there are
> no cookies stored anywhere. Is this what the first example does,
> whereas the second example 'ignores' the cookie?

Varnish doesn't really store cookies, though it can be set up so as
to add them to the hash used to look up cached entities. Both of the
examples are ignoring the cookies, but that first example will only
ignore the cookies sent by the client when requesting resources. In
the second example, you also ignore any cookies that are set by the
server when responding to the requests.
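
For illustration only, adding the cookie to the hash might look like
the sketch below in 2.1-style syntax. Later versions use hash_data()
instead of assigning to req.hash, so treat the exact syntax as an
assumption and check it against your version:

```vcl
sub vcl_hash {
    # Make the Cookie header part of the cache key, so clients with
    # different cookies get separate cache entries.
    if (req.http.Cookie) {
        set req.hash += req.http.Cookie;
    }
    # No return here: the default vcl_hash runs afterwards and adds
    # the URL and the host.
}
```

This is the opposite of what you want for images and CSS, and it only
has any effect for requests that actually reach the cache lookup
rather than being passed in vcl_recv.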

You could consider this to be preventing the browser from receiving
cookies; however, that's not quite correct, since other components of
the site may set cookies, and it's unlikely their path restrictions
will prevent them from being "valid" for images/css/etc. That's why
you still want to strip them from the incoming request, even though
you're also stripping them from certain responses.