Strategies for splitting load across varnish instances? And avoiding single-point-of-failure?

Ken Brownfield kb+varnish at
Sat Jan 16 01:35:15 CET 2010

On Jan 15, 2010, at 3:39 PM, pub crawler wrote:

> Have we considered adding pooling functionality to Varnish much like
> what they have in memcached?
> Run multiple Varnish(es) and load distributed amongst the identified
> Varnish server pool....  So an element in Varnish gets hashed and the
> hash identifies the server in the pool it's on.  If the server isn't
> there or the element isn't there, do a cold lookup to the backend
> server and then store the object where it belongs.
> Seems like an obvious feature - unsure of the performance implications though.

At first glance, this is doing something that you can more cheaply and efficiently do at a higher level, with software dedicated to that purpose.  It's interesting, but I'm not sure it's more than a restatement of the same solution with its own problems.

> The recommendation of load balancers in front of Varnish to facilitate
> this feature seems costly when talking about F5 gear.   The open
> source solutions require at least two servers dedicated to this load
> balancing function for sanity's sake (which is costly).

F5/NetScaler is quite expensive, but they have significant functionality, too.

The hardware required to run LVS/haproxy (for example) can be very cheap -- Small RAM, 1-2 CPU cores per ethernet interface.  When you're already talking about scaling out to lots of big-RAM/disk Varnish boxes, the cost of a second load balancer is tiny, and the benefit of redundancy is huge.
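As a sketch of what that higher-level balancer might look like, here is a minimal haproxy configuration that spreads requests across a pool of Varnish instances.  The addresses, ports, and backend name are hypothetical; `balance uri` with `hash-type consistent` keeps each URL pinned to one cache so the pool's memory isn't wasted on duplicates:

```
frontend http-in
    bind *:80
    default_backend varnish_pool

backend varnish_pool
    balance uri          # hash the URL so each object lives on one cache
    hash-type consistent # minimal reshuffling when a node disappears
    server v1 10.0.0.1:6081 check
    server v2 10.0.0.2:6081 check
    server v3 10.0.0.3:6081 check
```

A second haproxy box with a failover mechanism such as keepalived/VRRP covers the redundancy side.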

VMs don't solve the redundancy problem, and add significant overhead to network-intensive loads like high-traffic LVS or iptables configs.

> Finally, Varnish
> already offers load balancing (although limited) to the back end
> servers - so let's do the same within Varnish to make sure Varnish
> scales horizontally and doesn't require these other aids to be deemed
> totally reliable.

Squid has a peering feature; I think if you had ever tried it you would know why it's not a fabulous idea. :)  It scales terribly.  Also, the Memcache pooling deployments I've seen scale keep the pooling logic in the application (a higher level).
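To illustrate "logic in the app": the usual memcached-style approach is a consistent-hash ring built client-side, so each key maps to one server in the pool with no coordination between caches.  This is a hedged sketch, not anything Varnish ships; the node addresses and replica count are made up for illustration:

```python
import hashlib
from bisect import bisect

# Hypothetical cache pool; in practice these come from your config.
NODES = ["cache1:6081", "cache2:6081", "cache3:6081"]
REPLICAS = 100  # virtual nodes per server, to smooth the key distribution

def _hash(key: str) -> int:
    # Any stable hash works; md5 is a common, portable choice.
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

# Build the ring once at startup: (hash point, node) pairs, sorted.
ring = sorted((_hash(f"{node}:{i}"), node)
              for node in NODES for i in range(REPLICAS))
points = [p for p, _ in ring]

def node_for(key: str) -> str:
    """Walk clockwise on the ring to the first point >= hash(key)."""
    idx = bisect(points, _hash(key)) % len(ring)
    return ring[idx][1]
```

The payoff is that removing one node only remaps the keys that lived on it, rather than reshuffling the whole pool as naive `hash(key) % N` would.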

Varnish as a pool/cluster also doesn't provide redundancy to the client interface.

A distributed Varnish cache (or perhaps a memcache storage option in Varnish?) is really interesting; it might be scalable, but not obviously.  It also doesn't eliminate the need for a higher-level balancer.

All just IMHO!

More information about the varnish-misc mailing list