how to...accelerate random access to millions of images?

Michael S. Fischer michael at dynamine.net
Sun Mar 16 18:00:42 CET 2008


On Fri, Mar 14, 2008 at 1:37 PM, Sascha Ottolski <ottolski at web.de> wrote:
>  The challenge is to serve 20+ million image files, I guess with up to
>  1500 req/sec at peak.

A modern disk drive can service 100 random IOPS (@ 10ms/seek, that's
reasonable).  Without any caching, you'd need 15 disks to service your
peak load, with a bit over 10ms I/O latency (seek + read).
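
The arithmetic behind that estimate, as a minimal Python sketch. The
function name and the 10 ms default are illustrative assumptions; it
also assumes every request costs exactly one random disk I/O and that
load spreads evenly across the disks.

```python
import math

def disks_needed(peak_req_per_sec, service_time_ms=10.0):
    """Disks required if each request costs one random I/O.

    A disk that needs service_time_ms per random I/O can do
    1000 / service_time_ms IOPS, so 10 ms gives 100 IOPS.
    """
    # Total disk-milliseconds of work per second, divided by the
    # 1000 ms each disk has available per second.
    return math.ceil(peak_req_per_sec * service_time_ms / 1000)

print(disks_needed(1500))  # 15
```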

> The files tend to be small, most of them in a
>  range of 5-50 k. Currently the image store is about 400 GB in size (and
>  growing every day). The access pattern is very random, so it will be
>  very unlikely that any size of RAM will be big enough...

Are you saying that the hit ratio is likely to be zero?  If so,
consider whether you want to have caching turned on in the first place.
There's little sense in buying extra RAM if it's useless to you.
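
To put a rough number on the hit ratio, here is a sketch that assumes
access is uniformly random over the whole store (the worst case for a
cache).  The 32 GB cache size is a made-up figure for illustration;
the 400 GB and 1500 req/sec come from the question.

```python
def uniform_hit_ratio(cache_gb, store_gb):
    """Expected hit ratio if every object is equally likely to be requested."""
    return min(1.0, cache_gb / store_gb)

def miss_req_per_sec(peak_req_per_sec, hit_ratio):
    """Requests per second that still have to go to disk."""
    return peak_req_per_sec * (1.0 - hit_ratio)

ratio = uniform_hit_ratio(cache_gb=32, store_gb=400)   # hypothetical 32 GB cache
print(f"hit ratio ~ {ratio:.2f}")
print(f"disk load ~ {miss_req_per_sec(1500, ratio):.0f} req/sec")
```

A real access pattern is rarely perfectly uniform, so measuring the
hit ratio on live traffic would be more reliable than this estimate.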

>  Now my question is: what kind of hardware would I need? Lots of RAM
>  seems to be obvious, whatever "a lot" may be...What about the disk
>  subsystem? Should I look into something like RAID-0 with many disks to
>  push the IO-performance?

You didn't say what your failure tolerance requirements were.  Do you
care if you lose data?   Do you care if you're unable to serve some
requests while a machine is down?

Consider dividing up your image store onto multiple machines.  Not
only would you get better performance, but you would be able to
survive hardware failures with fewer catastrophic effects (i.e., you'd
lose only 1/n of service).

If I were designing such a service, my choices would be:

(1) 4 machines, each with 4-disk RAID 0 (fast, but dangerous)
(2) 4 machines, each with 5-disk RAID 5 (safe, fast reads, but slow
writes for your file size - also, RAID 5 should be battery backed,
which adds cost)
(3) 4 machines, each with 4-disk RAID 10 (will meet workload
requirement, but won't handle peak load in degraded mode)
(4) 5 machines, each with 4-disk RAID 10
(5) 9 machines, each with 2-disk RAID 0

Multiply each of these machine counts by 2 if you want to be resilient
to failures other than disk failures.
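
A rough per-machine check of options (1)-(5), as a Python sketch.  It
assumes 100 random-read IOPS per disk, requests spread evenly across
machines, and read capacity that scales with the number of working
disks.  That model is a simplification: it ignores RAID 5
reconstruction reads and write penalties, and the function name is
illustrative.

```python
PEAK_REQ_PER_SEC = 1500   # peak load from the original question
IOPS_PER_DISK = 100       # per-disk estimate used earlier in this message

# (option label, machines, disks per machine), as listed above
OPTIONS = [
    ("(1)", 4, 4),
    ("(2)", 4, 5),
    ("(3)", 4, 4),
    ("(4)", 5, 4),
    ("(5)", 9, 2),
]

def per_machine(machines, disks, failed_disks=0):
    """Return (required, available) random-read IOPS for one machine."""
    required = PEAK_REQ_PER_SEC / machines
    available = (disks - failed_disks) * IOPS_PER_DISK
    return required, available

for label, machines, disks in OPTIONS:
    need, have = per_machine(machines, disks)
    print(f"{label}: needs {need:.0f} IOPS per machine, has {have}")

# One failed disk in a 4-disk RAID 10 leaves three disks serving reads.
for label, machines in (("(3)", 4), ("(4)", 5)):
    need, have = per_machine(machines, 4, failed_disks=1)
    verdict = "ok" if have >= need else "short"
    print(f"{label} degraded: needs {need:.0f}, has {have} -> {verdict}")
```

Under these assumptions, option (3) needs 375 IOPS per machine but has
only 300 with one disk down, while option (4) needs 300 and has 300.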

You can then put a Varnish proxy layer in front of your image storage
servers, and direct incoming requests to the appropriate backend
server.
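
One way to pick "the appropriate backend" is to hash the request path,
so that a given image always maps to the same storage server.  The
sketch below shows only that mapping logic in Python.  It is not
Varnish configuration, and the hostnames and function name are
hypothetical.

```python
import hashlib

# Hypothetical storage servers; replace with the real backend names.
BACKENDS = [
    "img1.example.com",
    "img2.example.com",
    "img3.example.com",
    "img4.example.com",
]

def backend_for(path, backends=BACKENDS):
    """Map a request path to one backend, deterministically."""
    digest = hashlib.sha256(path.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then reduce
    # it to an index into the backend list.
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]

print(backend_for("/images/12345.jpg"))
```

Note that plain modulo hashing remaps most paths whenever the number
of backends changes, so the images would have to be redistributed to
match.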

--Michael



More information about the varnish-misc mailing list