Mass redirects/backend selection with Varnish?
Michael Alger
varnish at mm.quex.org
Mon Oct 24 16:05:43 CEST 2011
On Sun, Oct 23, 2011 at 01:40:57PM -0400, Jason W. wrote:
> We're looking to move from squid as a reverse proxy to using varnish.
> However, I'm not able to come up with a drop-in replacement for
> backend selection and 301 redirects.
>
> Currently, we have squid using squirm[1] as a redirector. For every
> request coming to squid, a list of squirm patterns (regexes) is
> consulted and a rewritten URL is constructed. This URL can be either a
> 301 redirect (URL prefaced with "301:") or a backend URL. If it's a
> 301, squid removes the 301: and serves up the redirect. If it's a
> backend URL, squid rewrites the URL internally (the new URL is what
> gets stored in the cache but the client never sees it) and fetches it
> from the backend. We don't configure squid itself to use a single
> origin. If the URL isn't matched by a squirm pattern, it's not
> successfully served by squid.
>
> Given a single squid instance (we have 7 currently), there can be
> anywhere from 50 to 1000 patterns. They are stored in a
> space-delimited text file with a regex that is matched on the URL and
> a rewritten URL, some of which use backreferences from the matched
> regex.
>
> Obviously, we could do this with a giant list of if statements using
> req.url and/or req.host. This doesn't strike me as ideal.

I had the same thought when we migrated from squid + jesred to Varnish;
we had several thousand patterns across a few sites. I did simplify
things a little by implementing virtual-hosting-style behaviour within
Varnish, so it only has to process the redirects for the site the
request is actually for. If you can make a similar optimisation, you
might find the amount of processing per request drops considerably.
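
To illustrate the per-site optimisation, here is a minimal Python model of
the lookup (a sketch of the idea, not Varnish code). Rules are bucketed by
Host header and only the matching bucket is scanned, first match wins. The
hostnames, patterns, and the `lookup` helper are all made up for the
example; the "301:" prefix follows the squirm convention described in the
question.

```python
import re

# Hypothetical rules, bucketed by site. Each entry is (regex, replacement);
# a replacement starting with "301:" means "send a redirect".
RULES_BY_HOST = {
    "www.example.com": [
        (re.compile(r"^/old/(.*)$"), r"301:http://www.example.com/new/\1"),
        (re.compile(r"^/img/(.*)$"), r"http://static.internal/images/\1"),
    ],
    "blog.example.com": [
        (re.compile(r"^/feed$"), r"301:http://blog.example.com/rss.xml"),
    ],
}


def lookup(host, url):
    """Return ("redirect" | "backend", new_url), or None if nothing matches."""
    # Only the rules for the requested site are scanned; first match wins.
    for pattern, replacement in RULES_BY_HOST.get(host, []):
        match = pattern.match(url)
        if match:
            target = match.expand(replacement)
            if target.startswith("301:"):
                return ("redirect", target[len("301:"):])
            return ("backend", target)
    return None
```

A request for blog.example.com never touches the www.example.com rules, so
the cost per request depends on the size of one site's list rather than on
the total number of patterns.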

While the if/elsif ladder looks a bit ugly and like a lot of work, it's
pretty much exactly what squirm is already doing. So I think you'll find
the performance to be about the same; possibly a bit faster, since
implementing it within VCL avoids the overhead of communicating over a
pipe.

The only issue you'd have with doing it in Varnish is if your backend
hosts are pretty much arbitrary: Varnish needs each origin to be
explicitly defined. This requires a slight change to the logic, in that
you need to set req.backend appropriately, in addition to req.http.host
and/or req.url. But it's not really complex.
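
As a rough sketch of what that change looks like, the Python helpers below
emit a backend declaration and one rewrite rule that sets req.backend,
req.http.host and req.url together. The function names, the backend name
and the addresses are hypothetical, and the emitted VCL assumes Varnish 3.x
syntax (later releases use req.backend_hint instead of req.backend). The
sketch does no escaping, so it assumes patterns contain no double quotes.

```python
def backend_decl(name, host, port=80):
    """Emit a VCL backend declaration; every origin must be declared up front."""
    return (
        "backend %s {\n"
        '    .host = "%s";\n'
        '    .port = "%d";\n'
        "}\n" % (name, host, port)
    )


def backend_rewrite(pattern, backend, host, replacement):
    """Emit one rule that routes matching requests to a declared backend.

    pattern     -- regex matched against req.url
    backend     -- name of a backend declared with backend_decl()
    host        -- Host header to send to that origin
    replacement -- regsub replacement for the URL (may use \\1 etc.)
    """
    return (
        'if (req.url ~ "%s") {\n'
        "    set req.backend = %s;\n"
        '    set req.http.host = "%s";\n'
        '    set req.url = regsub(req.url, "%s", "%s");\n'
        "}\n" % (pattern, backend, host, pattern, replacement)
    )


if __name__ == "__main__":
    print(backend_decl("static", "192.0.2.10", 8080))
    print(backend_rewrite(r"^/img/(.*)$", "static", "static.internal", r"/images/\1"))
```

Because the set of origins has to be known when the VCL is compiled, a
generator like this would need the list of distinct backend hosts before it
writes any rules.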
> Our thinking is to abstract the selection of backend URL and/or
> whether to 301 redirect out of VCL. We've considered writing a custom
> VMOD to handle this (either implement the squirm functionality or use
> squirm in the same way that squid does), but I wanted to get the
> community's take before we reinvent the wheel or do something crazy.
>
> Is this something that is feasible with varnish or should I move this
> functionality elsewhere in the stack? Any ideas are welcome.

I do think that abstracting it out of the VCL, so you don't have to
manage the if/elsif ladder directly, is probably a good idea. Maintaining
the ladder by hand would certainly be workable, but you've probably got
better things to do with your time. I guess it depends how frequently
you make changes or additions to your redirections.

When I moved to Varnish, I took the opportunity to place all the
redirects and rewrites into our "DNS management system", which is just
an in-house hodge-podge of Python and Perl. The redirects are specified
in a format similar to squirm/jesred, and I have a script that parses
them and spits out the appropriate VCL. That file is then rsynced to each
of the proxies and included from the appropriate site's configuration.
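
A generator of that kind might look roughly like the Python sketch below.
It is not the script described above. It assumes a simplified two-field
"regex target" line format (not real squirm or jesred syntax), patterns
already written against the path in req.url, and "#" for comments. The
emitted VCL uses the common Varnish 3.x idiom of raising a private status
(750 here, an arbitrary choice) and converting it to a 301 in vcl_error.

```python
# Assumed vcl_error companion (Varnish 3.x idiom): turns the private
# status 750 raised by the ladder into a real 301 response.
ERROR_HANDLER = """\
sub vcl_error {
    if (obj.status == 750) {
        set obj.http.Location = obj.response;
        set obj.status = 301;
        set obj.response = "Moved Permanently";
        return (deliver);
    }
}
"""


def parse_patterns(text):
    """Parse 'regex target' lines; blank lines and # comments are skipped."""
    rules = []
    for lineno, line in enumerate(text.splitlines(), 1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != 2:
            raise ValueError("line %d: expected 2 fields, got %d"
                             % (lineno, len(fields)))
        rules.append((fields[0], fields[1]))
    return rules


def redirect_ladder(rules):
    """Emit a VCL if/elsif ladder for the rules whose target starts with '301:'."""
    branches = []
    for regex, target in rules:
        if not target.startswith("301:"):
            continue  # backend rewrites are out of scope for this sketch
        keyword = "if" if not branches else "elsif"
        branches.append(
            '%s (req.url ~ "%s") {\n'
            '    set req.http.X-Redirect = regsub(req.url, "%s", "%s");\n'
            "    error 750 req.http.X-Redirect;\n"
            "}" % (keyword, regex, regex, target[len("301:"):])
        )
    return " ".join(branches) + "\n" if branches else ""
```

The output of redirect_ladder() is meant to be written to a file and
included from inside vcl_recv for the relevant site; ERROR_HANDLER would be
defined once in the main configuration.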

If you are happy editing it directly, then there's no real issue. One
nice thing about Varnish is that you can tell it to load a new config
while it's running, and if it can't compile it, it'll just tell you to
rack off and keep running the existing one. So even if you break the
config, the site keeps running without a hiccup. jesred liked to just
stop doing any redirects if I broke its pattern file, and the comment
about "Dodo mode" makes me think squirm may well do the same thing.
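
The reload step could be scripted along these lines. This is a sketch that
assumes varnishadm is on the PATH and can reach the management interface
without extra options (many setups need -T and -S); the function name and
the config name are made up. vcl.load compiles the file under a new name
and vcl.use switches to it.

```python
import subprocess


def reload_vcl(name, path, run=subprocess.check_call):
    """Compile `path` as config `name`, then make it the active config."""
    # If compilation fails, varnishadm exits non-zero, check_call raises,
    # and vcl.use is never reached: the running config stays active.
    run(["varnishadm", "vcl.load", name, path])
    run(["varnishadm", "vcl.use", name])


if __name__ == "__main__":
    reload_vcl("redirects_v2", "/etc/varnish/default.vcl")
```

Each loaded config needs a name that is not already in use, so a script
would typically derive it from a timestamp or a revision number.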

So in summary: try not to fret about the ugliness of a massive if/elsif
ladder. That's what squirm is doing anyway. It might be a good time to
decide whether directly editing the pattern file is how you want to be
managing all those redirects and, if that's what's really bugging you,
to implement a better solution for that. The VCL itself isn't really a
problem; Varnish seems quite happy to load massive configurations.

I don't think I directly answered your question, so in case you didn't
infer an answer: I personally don't think you'd benefit from writing
your own custom squirm-like (or other) handler if performance is your
concern. I think you'd be better off doing a quick, hackish
mass-conversion of as many patterns as you can and seeing how Varnish
performs. My hunch is that'll alleviate any concerns you've got.

More information about the varnish-misc mailing list