mass purge causes high load?

Mon Apr 14 14:19:11 CEST 2008

Sascha Ottolski <ottolski at web.de> writes:
> Dag-Erling Smørgrav <des at linpro.no> writes:
> > No, the semantics are completely different.  With HTTP PURGE, you do
> > a direct cache lookup, and set the object's TTL to 0 if it exists. 
> > With url.purge, you add an entry to a ban list, and every time an
> > object is looked up in the cache, it is checked against all ban list
> > entries that have arrived since the last time.  This is the only way
> > to implement regexp purging efficiently.
> Thanks very much for the clarification. I guess the ban list gets
> smaller every time an object has been purged?

Each ban list entry has a sequence number, and each object has a
generation number.  When a new object is inserted into the cache, its
generation number is set to the sequence number of the newest ban list
entry.

For every cache hit, the object's generation number is compared to the
sequence number of the newest ban list entry.  If they don't match, the
object is checked against every ban list entry that has a sequence
number higher than the object's generation number.

If the object matches one of these entries, it is discarded, and
processing continues as if the object had never been in cache.

If it matches none of them, its generation number is set to the
sequence number of the last entry it was checked against, so those
entries are not checked again on the next hit.
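
The steps above can be sketched roughly as follows.  This is only an
illustration in Python, not Varnish's actual code: the names BanCache,
CachedObject, ban, insert and lookup are made up for the example, the
cache is a plain dict keyed by URL, and the sketch never prunes the ban
list.  The regexp is matched against the URL of the incoming request,
since the object itself does not carry it.

```python
import re
from dataclasses import dataclass, field


@dataclass
class CachedObject:
    body: str
    generation: int  # sequence number of the newest ban already checked


@dataclass
class BanCache:
    # Hypothetical structure: (sequence number, compiled regex), oldest first.
    bans: list = field(default_factory=list)
    objects: dict = field(default_factory=dict)  # url -> CachedObject
    last_seq: int = 0  # sequence number of the newest ban list entry

    def ban(self, pattern):
        """Append a ban entry; no cached object is touched here."""
        self.last_seq += 1
        self.bans.append((self.last_seq, re.compile(pattern)))

    def insert(self, url, body):
        """A new object starts at the newest sequence number."""
        self.objects[url] = CachedObject(body, self.last_seq)

    def lookup(self, url):
        """Return the cached object, or None on a miss or a ban match."""
        obj = self.objects.get(url)
        if obj is None:
            return None
        if obj.generation != self.last_seq:
            for seq, regex in self.bans:
                if seq <= obj.generation:
                    continue  # checked on an earlier hit, or older than the object
                if regex.search(url):
                    # Banned: discard and behave as if it was never cached.
                    del self.objects[url]
                    return None
                obj.generation = seq  # survived this entry
        return obj


cache = BanCache()
cache.insert("/img/a.png", "A")
cache.insert("/css/site.css", "C")
cache.ban(r"^/img/")

banned = cache.lookup("/img/a.png")       # matches the ban, so it is discarded
survivor = cache.lookup("/css/site.css")  # checked once, generation moves up
cache.insert("/img/b.png", "B")           # inserted after the ban
fresh = cache.lookup("/img/b.png")        # generations match, no regexp runs
```

Adding a ban costs one list append regardless of cache size; the regexp
work is paid per object, on its next hit, and at most once per ban
entry.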

The only alternative to this algorithm would be to lock the cache and
inspect every item.  That would stop all request processing for several
seconds or minutes, depending on the size of your cache and how much of
it is resident.  Even then, it would only work for hash.purge, not
url.purge, as only the hash string is actually stored in the cache.

DES
-- 
Dag-Erling Smørgrav
Senior Software Developer
Linpro AS - www.linpro.no