multi-terabyte caching

David Birdsong david.birdsong at gmail.com
Sat Nov 21 10:43:39 CET 2009


On Sat, Nov 21, 2009 at 1:31 AM, Eric Bowman <ebowman at boboco.ie> wrote:
> Thanks -- very useful and helpful.
>
> cheers,
> Eric
>
of course my equation was backwards though, it should be:
working_set / optimal_size = N
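
to put rough numbers on that, here is a small sketch in python. the 2TB
working set is your figure; the 64GB per-instance capacity is purely a
made-up placeholder (you only get the real number by testing), and the
function names instances_needed and pick_instance are mine, not anything
from varnish or from a particular layer 7 switch. the second function just
illustrates the idea of hashing on the request so the same URL always lands
on the same instance.

```python
import hashlib


def instances_needed(working_set_bytes: int, optimal_size_bytes: int) -> int:
    """N = working_set / optimal_size, rounded up to whole instances."""
    if working_set_bytes < 0 or optimal_size_bytes <= 0:
        raise ValueError("sizes must be positive")
    # Integer ceiling division, avoids float rounding on large byte counts.
    return -(-working_set_bytes // optimal_size_bytes)


def pick_instance(url: str, n_instances: int) -> int:
    """Map a request URL to an instance index in [0, n_instances).

    Plain hash-mod-N, shown only to illustrate "hashing on the requests".
    """
    if n_instances <= 0:
        raise ValueError("n_instances must be positive")
    digest = hashlib.sha256(url.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_instances


WORKING_SET = 2 * 10**12    # ~2TB, from the original question
OPTIMAL_SIZE = 64 * 10**9   # ASSUMPTION: 64GB per instance, find yours by testing

n = instances_needed(WORKING_SET, OPTIMAL_SIZE)
print(n)  # 32
print(pick_instance("/tiles/12/2048/1361.png", n))  # hypothetical URL
```

one caveat with hash-mod-N: if N changes, most URLs move to a different
instance and those caches go cold, so a consistent-hashing scheme in the
switch may be worth a look if you expect to add or remove instances.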

> David Birdsong wrote:
>> On Fri, Nov 20, 2009 at 2:19 PM, Eric Bowman <ebowman at boboco.ie> wrote:
>>
>>> Hi,
>>>
>>> Apologies if this has been hashed out before.  I did some googling, and
>>> read the faq, but I could have been more thorough... ;)
>>>
>>> I'm considering using Varnish to handle caching for a mapping
>>> application.  After reading
>>> http://varnish.projects.linpro.no/wiki/ArchitectNotes, it seems like
>>> Varnish is maybe not a good choice for this.  In short I need to cache
>>> something like 500,000,000 files that take up about 2TB of storage.
>>>
>>> Using more 1975 technologies, one of the challenges has been how to
>>> distribute these across the file system without putting too many files
>>> per directory.  We have a solution we kind of like, and there are others
>>> out there.
>>>
>>> My impression is that we would start to put a big strain on Varnish and
>>> the OS using it in the standard way.  But maybe I'm wrong.  Or, is there
>>> a way to plug in a backend to manage this storage, without getting into
>>> the vm-thrash from which Squid suffers?
>>>
>>> Thanks for any advice -- Varnish gets such good press I'd really love if
>>> it were straightforward to use it in this case.
>>>
>>> -Eric
>>>
>> a straightforward way to store an unlimited amount of data is to find
>> the optimal cache storage capacity per varnish instance, then:
>>
>> optimal_size  / working_set = N
>>
>> where N is the number of varnish instances you need to run.
>>
>> then put a layer 7 switch in front of the pool of varnish instances,
>> hashing on the requests.
>>
>> works like a charm.
>>
>> finding optimal storage amount per varnish requires turning the knobs:
>>  - tuning VM
>>  - tuning kernel for high network traffic
>>  - balancing between big and fast storage media:
>>     random reads will skyrocket, so minimize writing to storage while
>>     serving if possible (pregenerate your working set, don't let
>>     anything expire between generating)
>>  ..and test
>>
>>
>>> Eric Bowman
>>>
>>>
>>> _______________________________________________
>>> varnish-misc mailing list
>>> varnish-misc at projects.linpro.no
>>> http://projects.linpro.no/mailman/listinfo/varnish-misc
>>>
>>>
>
>
> --
> Eric Bowman
> Boboco Ltd
> ebowman at boboco.ie
> http://www.boboco.ie/ebowman/pubkey.pgp
> +35318394189/+353872801532
>
>


