Avoiding big objects

Tue May 17 02:04:34 CEST 2011

On Tue, Apr 26, 2011 at 5:25 PM, Mark Moseley <moseleymark at gmail.com> wrote:
> I was working on something in my quest to keep big (eventually
> uncacheable) objects from wreaking havoc on my cache. Even if I employ
> a scheme to call "restart" from vcl_fetch, after adding a header that
> tells vcl_recv to call 'pipe', the object still gets fetched from the
> origin server. And if it's 1.5 gig, it can be pretty painful.
>
> So I was hoping to throw this by you guys, esp the Varnish devs.
> Mainly I wanted to hear if anyone thought this was a tremendously bad
> idea. I wrote this about 45 minutes ago, so it's not particularly
> well-tested out, but if you guys said this was the worst idea ever,
> then I might reconsider putting a lot more time into perfecting it.
> Thus there are likely to be big corner cases here. There was another
> recent thread about this subject, so I know there are some other
> people looking for a similar solution, so I thought I'd throw this out
> there too. This doesn't protect me from 1.5 gig JPEG files but it does
> most of the job. And yes, I'm OK with all the extra backend reqs,
> provided they're HEADs.
>
> Mainly what it's doing is this:
>
> 1. Huge files won't ever be HITs in my environment, since I'll have piped them.
> 2. If a MISS (as it should be), rewrite backend method from GET (I
> don't do POSTs on varnish) to HEAD in vcl_miss if it's a file
> extension likely to be a biggish file and matches other conditions.
> 3. In vcl_fetch, if it's a rewritten HEAD, do size check. If it's too
> big, add the header that tells vcl_recv to drop immediately to
> 'pipe'.
> 4. In either case, in vcl_fetch, rewrite the method back to GET and
> call 'restart'.
>
>
> Here's the essence of the VCL (imagine regularly-working VCL alongside
> it). I typed this out so ignore dumb typos:
>
> sub vcl_recv {
>   ....
>   # If we've got the header that says to pipe this request, pipe it (thanks Tollef)
>   if ( req.http.X-PIPEME && req.restarts > 0 ) {
>                return( pipe );
>   }
>   ....
> }
>
>
> # The extensions in this regex are some sample ones that are often huge in
> # size; the eventual list would be bigger and have others like 'mpg'
> # etc. Note that I don't send POSTs over varnish, so ignore the lack of POST handling.
> sub vcl_miss {
>        # If no headcheck header and GET and type is on big list, rewrite to HEAD
>        if ( ! req.http.X-HEADCHECK && bereq.request == "GET" &&
>             req.url ~ "\.(gz|wmv|zip|flv|avi)$" && req.restarts == 0 ) {
>                set req.http.X-HEADCHECK = "1";
>                set bereq.request = "HEAD";
>                set bereq.http.User-Agent = "HEAD Check";
>                log "DEBUG: Rewriting to HEAD";
>        }
> }
>
>
>
> sub vcl_fetch {
>        # If this used to be a GET request that we changed to HEAD, do
>        # length check. But try to avoid restart loops.
>        if ( req.http.X-HEADCHECK && req.request == "GET" &&
>             bereq.request == "HEAD" && req.url ~ "\.(gz|wmv|zip|flv|avi)$" &&
>             req.restarts < 1 ) {
>                unset req.http.X-HEADCHECK;
>                set bereq.request = "GET";
>                log "DEBUG: [fetch] Rewriting back to GET";
>
>                # If Content-Length has 8+ digits (10,000,000 bytes or more), pipe it
>                if ( beresp.http.Content-Length ~ "[0-9]{8,}" ) {
>                        set req.http.X-PIPEME = "1";
>                }
>
>                restart;
>        }
>       ....
> }
>
>
>
> Mainly I'm just looking for whether the Varnish devs think that this
> would cause something to completely explode and/or melt down or this
> is the worst security hole ever. It seems to work ok so far. For reqs
> that match 'beresp.http.Content-Length ~ "[0-9]{8,}"', the "SMA bytes
> allocated" counter never budges, where it normally does for anything
> fetched (memory backend).
>
> Thanks! Hope someone else can benefit from this too. If someone else
> uses this (after thorough testing), be sure to remove the 'log' calls
> in production.
>
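
On the security question above: the scheme keys off request headers, so a
client could send X-PIPEME or X-HEADCHECK itself. A client-supplied
X-HEADCHECK would skip the HEAD check in vcl_miss, and a client-supplied
X-PIPEME would survive the restart and force a pipe for any URL on the
big-file list. A minimal sketch of one way to guard against that, assuming
the Varnish 2.1-style VCL used above and that this goes at the top of the
existing vcl_recv, before any other logic (untested against a live setup):

```vcl
sub vcl_recv {
    # On the first pass (no restarts yet), drop any marker headers the
    # client may have sent, so only vcl_miss/vcl_fetch can set them.
    if (req.restarts == 0) {
        unset req.http.X-PIPEME;
        unset req.http.X-HEADCHECK;
    }
}
```

After a restart, req.restarts is at least 1, so the markers set by the
HEAD check itself are left alone.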

Just to update: Works great so far. Prior to this, I was hitting that
stevedore.c error on lots of my boxes after a few days of uptime
(thanks to customers with gigantic files). Since I rolled this out
across the board about 2 weeks ago, the varnishd on most of my boxes
has stayed up the whole time. If you try it yourself, watch for
restart loops.
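
A minimal sketch of one way to watch for restart loops, again assuming
Varnish 2.1-style VCL. The HEAD-check scheme needs exactly one restart per
request, so this treats anything beyond that as a fault. The threshold and
the error text are illustrative choices, not part of the setup described
above, and the guard would be too strict if other parts of the VCL also
call restart.

```vcl
sub vcl_recv {
    # The HEAD-check scheme restarts once; more than that suggests a loop.
    if (req.restarts > 1) {
        error 503 "Too many restarts";
    }
}
```

Independently of any VCL guard, varnishd caps the number of restarts per
request through its max_restarts parameter.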