mass purge causes high load?

Sascha Ottolski ottolski at web.de
Mon Apr 14 15:26:08 CEST 2008


On Monday 14 April 2008 14:19:11, Dag-Erling Smørgrav wrote:
> Sascha Ottolski <ottolski at web.de> writes:
> > Dag-Erling Smørgrav <des at linpro.no> writes:
> > > No, the semantics are completely different.  With HTTP PURGE, you
> > > do a direct cache lookup, and set the object's TTL to 0 if it
> > > exists. With url.purge, you add an entry to a ban list, and every
> > > time an object is looked up in the cache, it is checked against
> > > all ban list entries that have arrived since the last time.  This
> > > is the only way to implement regexp purging efficiently.
> >
> > Thanks very much for the clarification. I guess the ban list gets
> > smaller every time an object has been purged?
>
> Each ban list entry has a sequence number, and each object has a
> generation number.  When a new object is inserted into the cache, its
> generation number is set to the sequence number of the newest ban
> list entry.
>
> For every cache hit, the object's generation number is compared to
> the sequence number of the last ban list entry.  If they don't match,
> the object is checked against every ban list entry that has a
> sequence number higher than the object's generation number.
>
> If the object matches one of these entries, it is discarded, and
> processing continues as if the object had never been in cache.
>
> If it doesn't, its generation number is set to the sequence number of
> the last entry it was matched against.
>
> The only alternative to this algorithm would be to lock the cache and
> inspect every item, which would stop all request processing for
> several seconds or minutes, depending on the size of your cache and
> how much of it is resident; and even then, it would only work for
> hash.purge, not url.purge, as only the hash string is actually stored
> in the cache.
>
> DES

Dag,

Thanks again. If I understand it correctly, the ban list never shrinks,
so I probably have 17,000 ban list entries hanging around. Can I clear
this list somehow, other than by restarting the proxy? I suppose that
even if the old entries are no longer matched against anything, just
comparing the generation and sequence numbers on each request adds a
bit of overhead, doesn't it?
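
To check my understanding, here is a minimal Python sketch of the lookup
algorithm as described above. It is only an illustration, not Varnish's
actual code: the names BanList, Cache, insert and lookup are made up, and
I assume the ban patterns are regexps matched against the request URL.

```python
import re


class BanList:
    """Hypothetical ban list: entries only ever get appended."""

    def __init__(self):
        self.entries = []   # (sequence number, compiled regexp), ascending
        self.last_seq = 0   # sequence number of the newest entry

    def add(self, pattern):
        self.last_seq += 1
        self.entries.append((self.last_seq, re.compile(pattern)))


class Cache:
    """Hypothetical cache keyed by URL; each object carries a generation."""

    def __init__(self):
        self.bans = BanList()
        self.objects = {}   # url -> [generation, body]

    def insert(self, url, body):
        # A new object starts at the sequence number of the newest ban.
        self.objects[url] = [self.bans.last_seq, body]

    def lookup(self, url):
        entry = self.objects.get(url)
        if entry is None:
            return None
        generation = entry[0]
        # Cheap path: one integer comparison if no newer bans exist.
        if generation != self.bans.last_seq:
            for seq, regexp in self.bans.entries:
                if seq > generation and regexp.search(url):
                    # Banned: discard and behave like a cache miss.
                    del self.objects[url]
                    return None
            # Survived every newer ban: remember how far we checked.
            entry[0] = self.bans.last_seq
        return entry[1]
```

If that is right, an object that has already been checked costs one
integer comparison per hit, and the old entries only cost memory, plus
regexp matching for objects that have not been looked up since they
arrived.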


Cheers,

Sascha


