Varnish Lurker is getting slower / Ban lists keeps increasing

Nils Goroll slink at
Wed Aug 30 14:37:09 CEST 2017

Hi Olivier,

I'm responding to your last two emails in one go.

On 30/08/17 08:47, Olivier Hanesse wrote:
> What will happen when the ban list hits the size of bans defined in ban_cutoff
> value ?

The ban lurker still works through the list of bans as before, but once it has
reached the <ban_cutoff>th ban, it kills all objects hanging off the bans from
that point on without testing the ban condition.

This way, actively used objects (which get tested against the ban list at
request time and will end up hanging off some ban near the top of the ban list)
will not get killed, but rather only those which were least recently accessed
(in other words, the long tail).
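
To illustrate the cutoff behaviour described above, here is a toy model in
Python. It is only a sketch, not how varnishd implements it: the Ban class,
lurker_pass() and the matches() callback are hypothetical names, the list is
ordered newest ban first, and I assume "reached the <ban_cutoff>th ban" means
position >= ban_cutoff when counting from 1.

```python
from dataclasses import dataclass, field


@dataclass
class Ban:
    label: str
    # Objects currently hanging off this ban (last ban they were tested against).
    objects: list = field(default_factory=list)


def lurker_pass(bans, ban_cutoff, matches):
    """Walk the ban list top (newest) to bottom; return (kept, killed).

    matches(ban, obj) stands in for evaluating a ban condition.
    ban_cutoff == 0 disables the cutoff.
    """
    kept, killed = [], []
    for position, ban in enumerate(bans, start=1):
        beyond_cutoff = ban_cutoff > 0 and position >= ban_cutoff
        newer_bans = bans[:position - 1]
        for obj in ban.objects:
            if beyond_cutoff:
                # Past the cutoff: killed without testing any ban condition.
                killed.append(obj)
            elif any(matches(newer, obj) for newer in newer_bans):
                # Normal lurker work: a newer ban matches the object.
                killed.append(obj)
            else:
                kept.append(obj)
    return kept, killed


def never_matches(ban, obj):
    return False


bans = [
    Ban("newest", ["/hot"]),
    Ban("middle", ["/warm"]),
    Ban("oldest", ["/cold"]),
]
print(lurker_pass(bans, 3, never_matches))
print(lurker_pass(bans, 0, never_matches))
```

With a cutoff of 3, only the object hanging off the third (oldest) ban is
killed, even though no ban condition matches it. With the cutoff disabled,
everything is kept.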

On 30/08/17 11:44, Olivier Hanesse wrote:
> Last night after your reply, I put a ban_cutoff value of 18500 according to
> the definition (50ms of latency, 370K/s ban.lurker.tested) (I've restarted
> varnish, "varnishadm ban_cutoff" shows the right value)
> This morning, nothing has changed :  ban lists is increasing (well over
> 18500).
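
For reference, the arithmetic behind that value, as I read it: the cutoff is
the lurker test rate multiplied by the latency you are willing to tolerate.
A minimal sketch in Python (ban_cutoff_for() is a hypothetical helper name,
not something Varnish provides):

```python
def ban_cutoff_for(tests_per_second, latency_ms):
    """Number of bans that can be tested within the tolerated latency."""
    # Integer arithmetic keeps the result exact.
    return tests_per_second * latency_ms // 1000


# 370K ban.lurker.tested per second, 50ms of tolerated latency
print(ban_cutoff_for(370_000, 50))  # 18500
```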

One obvious explanation would be that the lurker has not yet reached the cutoff
position in the ban list.

But I wonder what exactly you are measuring here. In your first email you wrote

On 29/08/17 18:19, Olivier Hanesse wrote:
> our ban list keeps increasing to reach 100K objects (and sometimes more).

This makes me suspect that you are graphing the number of objects hanging off
the bans. Quick reminder:

* the second column in the varnishadm ban.list output is the number of
  objects associated with this ban (the objects for which this ban was
  the last one tested)

* what the ban_cutoff parameter limits is the number of bans
  (that would be varnishadm ban.list | wc -l minus 2)

So can you please double check that you are graphing the latter and not the
former for ban.list?

If you are actually graphing the former, then we don't have a problem, as this
will just be the total number of objects in your cache.
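
To make the difference between the two numbers concrete, here is a small
Python sketch that computes both from ban.list output. It assumes that each
ban line starts with a numeric timestamp followed by the object count, and
skips everything else; the sample text is made up for illustration, and
summarize_ban_list() is a hypothetical helper name. Please check it against
the output of your version.

```python
def summarize_ban_list(text):
    """Return (number of bans, total objects hanging off all bans)."""
    n_bans = 0
    n_objects = 0
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        try:
            float(fields[0])          # first column: ban timestamp
            refs = int(fields[1])     # second column: objects on this ban
        except ValueError:
            continue                  # header or other non-ban line
        n_bans += 1
        n_objects += refs
    return n_bans, n_objects


# Made-up sample in the assumed layout, not real output.
sample = """Present bans:
1504100229.853613      5 -  obj.http.x-url ~ ^/a
1504100100.000000 100000 -  obj.http.x-url ~ ^/b
"""
n_bans, n_objects = summarize_ban_list(sample)
print(n_bans, n_objects)
```

In this sample there are only 2 bans, which is what ban_cutoff looks at, while
the object total is 100005, which is roughly the size of the cache.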

Thanks, Nils

