Proposal/specs for backend conditional requests / aka "GET If-Modified-Since" (GET IMS)
Artur Bergman
sky at crucially.net
Mon Sep 27 21:27:14 CEST 2010
For persistent storage, just ignore the TTL and throw away the segment
with the oldest object, refreshed or not.
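A minimal sketch of that eviction rule, to make it concrete. The struct
and function names here (struct seg, seg_pick_victim) are hypothetical
and are not taken from the Varnish source. It also assumes "oldest"
means the original insertion time, which a refresh does not reset.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical, simplified view of one persistent-storage segment. */
struct seg {
	time_t   oldest_obj;  /* insertion time of the oldest object (assumed
	                       * not to be reset when the object is refreshed) */
	unsigned nobj;        /* number of live objects in the segment */
};

/*
 * Return the index of the segment that holds the oldest object,
 * or -1 if there is no segment with live objects. TTLs are not
 * consulted at all.
 */
int
seg_pick_victim(const struct seg *segs, size_t nseg)
{
	int victim = -1;

	assert(segs != NULL || nseg == 0);
	for (size_t i = 0; i < nseg; i++) {
		if (segs[i].nobj == 0)
			continue;	/* already empty, nothing to throw away */
		if (victim < 0 || segs[i].oldest_obj < segs[victim].oldest_obj)
			victim = (int)i;
	}
	return (victim);
}
```

The whole victim segment would then be dropped in one go, so no
per-object bookkeeping is needed to free it.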
I am of the opinion that if a method exists to verify the object,
Last-Modified (LM) or ETag, we shouldn't ever expire it. The TTL is just
a setting for when we should refresh it. Of course, standard LRU should
still apply.
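Roughly, the decision I have in mind looks like the sketch below. All
names (struct obj_meta, obj_decide, the OBJ_* actions) are made up for
illustration and do not exist in Varnish. LRU eviction is a separate
mechanism and is not modelled here.

```c
#include <assert.h>
#include <stddef.h>

/* What to do with a cached object on lookup (hypothetical names). */
enum obj_action { OBJ_SERVE, OBJ_REVALIDATE, OBJ_EXPIRE };

/* Hypothetical, minimal per-object metadata. */
struct obj_meta {
	double      entered;        /* time stored or last refreshed */
	double      ttl;            /* seconds until a refresh is due */
	const char *last_modified;  /* NULL if the header is absent */
	const char *etag;           /* NULL if the header is absent */
};

enum obj_action
obj_decide(const struct obj_meta *o, double now)
{
	assert(o != NULL);
	if (now < o->entered + o->ttl)
		return (OBJ_SERVE);		/* still fresh */
	if (o->last_modified != NULL || o->etag != NULL)
		return (OBJ_REVALIDATE);	/* TTL only triggers a refresh */
	return (OBJ_EXPIRE);			/* no validator: expire as today */
}
```

An object with a validator never reaches OBJ_EXPIRE in this sketch; it
only leaves the cache through LRU.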
I am also less worried about the reader/writer scenario for the
headers, since by spec you shouldn't update any headers that aren't
Expires/Cache-Control (and, weirdly enough, Vary).
Artur
On Sep 27, 2010, at 6:50 AM, Nils Goroll wrote:
> Hi,
>
> I'd like to add a brief update to the following section summarizing my
> understanding after talking to phk today, who seems to be really busy and
> probably will not find time to respond before the weekend:
>
>> To allow multiple cache objects to share body data, we want to add
>> reference counters to struct storage following the example of the
>> existing implementation for objects (HSH_Ref(), HSH_Unref() etc).
>
> Though I still believe this should be pretty straightforward for all other
> storages, it won't be for -spersistent. After studying the code for an hour or
> so, my understanding is the following:
>
> Persistent storage segments the cache (see
> http://www.varnish-cache.org/trac/wiki/ArchitecturePersistentStorage) and won't
> re-use segments for new objects unless they are completely empty (no live
> objects). Right now, this relies on the LRU and TTL based expiry to eventually
> clean out segments before running out of space. Having multiple refs to the same
> obj in persistent storage (and updating it again and again) would effectively
> lead to more and more segments being kept from becoming empty.
>
> I believe what is really needed is additional space management for the
> persistent storage. In a first step, when running short of storage, objects
> could get nuked from the smallest segment. In a second step, the mechanics to
> copy live objects from one segment to another could be implemented. Ideally,
> this could be vcl controlled ("should we rather nuke the object or bother
> copying it?"). But I see some complications for both, mainly that storage would
> need to know which objects are referencing it in order to update those (sounds
> wrong).
>
> As long as we don't have any of this, I suggest two alternative
> temporary solutions:
>
> a) If an object getting refreshed lives in persistent storage, we'll simply copy
> it. Actually, the existing Rackspace implementation does this. This is far from
> optimal, but won't make much of a difference for small objects and is still much
> more efficient than re-fetching the object from backend like today, so we
> shouldn't see any performance regression.
>
> For other stevedores, we'll use the reference counter.
>
> b) Add reference counters to persistent storage, too, and simply live with the
> cache fragmentation issue. Those using persistent storage would be advised not
> to use cache refresh.
>
> At this point, I'd favor a).
>
>
> Please note that all of this is my personal understanding. I am posting these
> thoughts in the hope that my understanding is correct and I'd really appreciate
> corrections if it's not.
>
> Thank you, Nils
>
> _______________________________________________
> varnish-dev mailing list
> varnish-dev at varnish-cache.org
> http://lists.varnish-cache.org/mailman/listinfo/varnish-dev
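For reference, option a) from the quoted message could look roughly like
the sketch below: body storage is shared through a reference counter,
except when it lives in persistent storage, in which case the refreshed
object gets its own copy. The type and functions (struct stor, stor_new,
stor_for_refresh, stor_unref) are hypothetical stand-ins and not
Varnish's struct storage or HSH_Ref()/HSH_Unref(). The sketch is
single-threaded; a real version would need locking or atomic counters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified body storage chunk. */
struct stor {
	unsigned       refcnt;      /* number of objects using this body */
	bool           persistent;  /* lives in a persistent segment */
	size_t         len;
	unsigned char *data;
};

/* Allocate a chunk holding a copy of body; starts with one reference. */
struct stor *
stor_new(const void *body, size_t len, bool persistent)
{
	struct stor *st = malloc(sizeof *st);

	if (st == NULL)
		return (NULL);
	st->data = malloc(len > 0 ? len : 1);
	if (st->data == NULL) {
		free(st);
		return (NULL);
	}
	if (len > 0)
		memcpy(st->data, body, len);
	st->len = len;
	st->persistent = persistent;
	st->refcnt = 1;
	return (st);
}

/* Drop one reference; free the chunk when the last one goes away. */
void
stor_unref(struct stor *st)
{
	assert(st != NULL && st->refcnt > 0);
	if (--st->refcnt == 0) {
		free(st->data);
		free(st);
	}
}

/*
 * Body storage for a refreshed object: copy if the old body is in
 * persistent storage (so the old segment can still drain), otherwise
 * share it by taking another reference. Returns NULL if a copy fails.
 */
struct stor *
stor_for_refresh(struct stor *old)
{
	assert(old != NULL && old->refcnt > 0);
	if (old->persistent)
		return (stor_new(old->data, old->len, true));
	old->refcnt++;
	return (old);
}
```

With this shape the caller does not care which path was taken: it always
ends up holding exactly one reference that it releases with stor_unref().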