[PATCH] Random director tries all backends before giving up
Jack Lindamood
jack at facebook.com
Tue Apr 13 00:11:54 CEST 2010
Thanks for the feedback. While not an issue in my case, a configuration parameter that limits the number of backends to try could be useful for others. I don't know how most people use varnish, but potentially triggering vcl_error when a single backend shuts down is probably undesirable behavior for most users.
From: varnish-dev-bounces at varnish-cache.org [mailto:varnish-dev-bounces at varnish-cache.org] On Behalf Of Adrian Otto
Sent: Sunday, April 11, 2010 7:51 PM
To: varnish-dev at varnish-cache.org
Subject: Re: [PATCH] Random director tries all backends before giving up
Jack,
This approach is probably not a good idea if (a) you have a large cluster, (b) your cluster is heavily loaded, and/or (c) your backends are sensitive to overload. You are likely to trigger a cascading failure. It might be smarter to have a configurable number of backends to try... perhaps 2 or 3. Imagine if you have 50 backends. There is no point in trying 50 times to find a healthy backend. Chances are that if 25% of your backends are down, trying more is just going to exacerbate the problem.
Adrian
On Apr 11, 2010, at 4:35 PM, Jack Lindamood wrote:
The following is a patch I've made to varnish that I hope improves the random director, and which anyone is welcome to use (even varnish trunk?). My motivation was to reduce the number of vcl_error calls when a director is mostly good. You can get the entire patch at this link.
http://github.com/cep21/Varnish/commit/6f5e98143ac2636504d9febf574b14c3c1a072fc
Here's the commit message:
Random director tries all backends before giving up
Summary:
The current random director gives up when it fails to get a FD to its chosen backend "retries" times in a row. Rather than give up and return NULL, which is guaranteed to cause a vcl_error, as a last ditch effort we try all other healthy backends until we get one that works. This is mostly useful in the window after a backend server dies but before the health check has failed enough times to mark that backend unhealthy.
Backwards Compatibility = Not strictly backwards compatible. In cases where the old code would have fallen through to vcl_error, this will give a shot at getting a good result.
Performance = In the worst case, this will add extra calls for getting a FD, but only in situations that would otherwise have ended in vcl_error.
Test Plan: New varnish unittest. It fails in the old code and works in this new code.
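For readers who want to see the control flow described in the summary, here is a minimal sketch in Python. It is not the patch itself (Varnish is written in C and the real change is at the commit link above); every name in it (Backend, pick_backend, try_connect, max_fallback) is made up for illustration. It assumes the fallback sweep skips backends already attempted during the random phase, and it adds an optional cap on the sweep along the lines Adrian suggested, neither of which is confirmed by the commit message.

```python
import random
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Backend:
    """Hypothetical stand-in for a director backend entry."""
    name: str
    weight: float = 1.0
    healthy: bool = True  # last health-probe verdict; may be stale


def pick_backend(
    backends: Sequence[Backend],
    try_connect: Callable[[Backend], bool],
    retries: int = 3,
    max_fallback: Optional[int] = None,
    rng: Optional[random.Random] = None,
) -> Optional[Backend]:
    """Return a backend we could connect to, or None (the vcl_error case)."""
    rng = rng or random.Random()
    # Only backends the health check still believes in are candidates.
    candidates = [b for b in backends if b.healthy]
    if not candidates:
        return None

    tried = set()
    # Phase 1: weighted random picks, at most `retries` attempts.
    for _ in range(retries):
        weights = [b.weight for b in candidates]
        choice = rng.choices(candidates, weights=weights)[0]
        tried.add(choice.name)
        if try_connect(choice):
            return choice

    # Phase 2: last-ditch sweep over healthy backends not yet attempted
    # (assumption: the real patch may order or filter these differently).
    remaining = [b for b in candidates if b.name not in tried]
    if max_fallback is not None:
        # Hypothetical cap, to limit extra load on a struggling cluster.
        remaining = remaining[:max_fallback]
    for backend in remaining:
        if try_connect(backend):
            return backend

    # Nothing worked; the caller would fall through to an error.
    return None
```

With max_fallback left as None the sweep covers every remaining healthy backend, which matches the behaviour in the summary. Setting it to 2 or 3 bounds the extra connection attempts per request, at the cost of sometimes returning None while a working backend still exists.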
_______________________________________________
varnish-dev mailing list
varnish-dev at varnish-cache.org
http://lists.varnish-cache.org/mailman/listinfo/varnish-dev