About saintmode_threshold behavior

Kristian Lyngstol kristian at varnish-software.com
Mon Jul 12 10:44:43 CEST 2010


On Fri, Jul 09, 2010 at 01:39:10PM -0300, Rodrigo K. Ferreira wrote:
> About the error counter that is compared with saintmode_threshold: when is
> that counter reset to zero? Only when the backend server is penalized? Or
> always after a backend probe?
> I ask because it is fairly normal for dynamic backend servers to return a
> few 5XX errors, for malformed client requests or other reasons. If the
> counter is never reset to zero, backend servers will eventually be labeled
> sick.

Ok, I'm not entirely sure I understand what you're asking, but I'll explain
saintmode_threshold anyway.

Every time you use the "saintmode" command/directive in VCL, you add an
entry to a list of bad objects, hooked up to the backend. So one list for
each backend.

When Varnish is trying to find a healthy backend, it will check if the
objecthead it's looking for is represented on the list. While checking, it
will count how many valid entries are present on the list. The only
condition required for an entry to be valid is that it has not timed out.
If it either finds the objecthead on the list OR finds saintmode_threshold
items on the list, the backend is considered sick. This is not affected by
health check polling at all. The only way to re-enable a backend that is
considered sick because of too many saintmode items is to wait for the
entries to time out.
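
To make the check above concrete, here is a minimal sketch in Python. It is
a toy model of the behaviour as described, not Varnish's actual C code: the
class name, method names and the use of a plain string as the object key are
all made up for illustration, and "threshold" stands in for
saintmode_threshold.

```python
class SaintmodeList:
    """Toy model of one backend's saintmode list (hypothetical, not Varnish code)."""

    def __init__(self, threshold):
        self.threshold = threshold   # stands in for saintmode_threshold
        self.entries = []            # (object_key, expiry_time) pairs

    def add(self, object_key, ttl, now):
        # Models marking one object as bad on this backend for `ttl` seconds.
        self.entries.append((object_key, now + ttl))

    def is_sick_for(self, object_key, now):
        valid = 0
        for key, expires in self.entries:
            if expires <= now:
                continue             # timed-out entries do not count
            if key == object_key:
                return True          # this object is blacklisted on this backend
            valid += 1
            if valid >= self.threshold:
                return True          # too many valid entries: whole backend is sick
        return False
```

Note that nothing in this model looks at health probes; only the passage of
time (the `now` argument) makes entries stop counting.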

Do keep in mind, though, that new entries are not added to the list after
saintmode_threshold is reached. You might get a couple of extra entries on
account of parallel requests going to the backend, but once the list is
large enough, the backend won't be used, and thus can't get new items added
to the blacklist. So if you use a 20s timer on saintmode, the maximum time
until Varnish retries the backend is 20 seconds.

Consider saintmode a combination of a buffer until the real health checks
detect the problem, and a way to blacklist just one item on one backend.

You will need _different_ items on the saintmode blacklist to mark the
backend as completely down. Even if a single page returns 500 constantly,
that will not bring down the entire backend - it will just make Varnish not
ask that backend for that specific page.
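
A small simulation may help show why. Again this is only a Python sketch
under the assumptions in this mail, not Varnish code: the function names,
the (time, url) request format and the default values (20 second timer,
threshold of 3) are invented for the example, and parallel requests are
ignored.

```python
def backend_sick(entries, key, now, threshold):
    # Only entries that have not timed out are considered.
    valid = [(k, exp) for k, exp in entries if exp > now]
    # Sick if this object is listed OR the list holds `threshold` valid entries.
    return any(k == key for k, _ in valid) or len(valid) >= threshold


def simulate(requests, failing, ttl=20, threshold=3):
    """Feed (time, url) requests to one backend; return its saintmode list."""
    entries = []
    for now, url in requests:
        if backend_sick(entries, url, now, threshold):
            continue                          # backend is skipped, nothing is added
        if url in failing:
            entries.append((url, now + ttl))  # models one saintmode entry


    return entries


# One page failing constantly only ever produces one entry.
one = simulate([(t, "/bad") for t in range(10)], failing={"/bad"})

# Different failing pages fill the list until the threshold is reached.
many = simulate([(0, "/a"), (1, "/b"), (2, "/c"), (3, "/d")],
                failing={"/a", "/b", "/c", "/d"})
```

In this model `one` ends up with a single entry, so other pages are still
fetched from the backend, while `many` stops growing at the threshold and
the backend stays unused until the entries time out.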

Hope this cleared up some questions, though it might add a few new ones I
suppose.

- Kristian



