Performance options for trunk

Fri May 30 15:54:36 CEST 2008

On Friday 30 May 2008 14:01:35, Audun Ytterdal wrote:
> I run trunk in front of a site. I have 3 Varnish servers, all with
> 32 GB of RAM, serving only small pictures, thumbnails and profile
> pictures. The cache set is pretty large (1.5 TB) and changes a lot
> over time. And before you all ask why I don't just serve partitioned
> data from several apache/nginx/lighttpd servers: it's because we're
> not there yet. The Varnish servers all fetch their content from one
> lighttpd server.
>
> I run into the thread pile-up problem.
>
> I've set the thread limit to 1500, and the thread count usually lies
> between 80 and 700. On restart it hits the 1500 limit and stays there
> for a few minutes. Then it gradually manages to control the traffic
> and ends up around 80 threads. It usually grows a bit, but not beyond
> 700-1000 or so. But suddenly, under high traffic, it goes up to the
> limit, be that 1500 or 4000 or whatever I set it to. Then it stays
> there and usually never recovers without a restart.
> I guess it's because the backend at some point answers slowly. But is
> there an easier way to get out of this situation?
>
> Running varnish like this:
>
> (redhat 4.6 32 GB RAM)
>
> /usr/sbin/varnishd -a :80 -f /etc/varnish/nettby.vcl -T 127.0.0.1:82
> -t 120 -w 2,2000,30 -u varnish -g varnish -p client_http11 on -p
> thread_pools 4 -p thread_pool_max 4000 -p listen_depth 4096 -p
> lru_interval 3600 -h classic,500009 -s
> file,/var/varnish/varnish_storage.bin,30G -P /var/run/varnish.pid
>
> and for testing purposes
>
> (redhat 5.1 32 GB RAM)
>
> varnish  22959 17.1 45.2 20187068 14938024 ?   Sl   May29 160:40
> /usr/sbin/varnishd -a :80 -f /etc/varnish/nettby.vcl -T 127.0.0.1:82
> -t 120 -u varnish -g varnish -p thread_pools 4 -p thread_pool_max
> 2000 -p client_http11 on -p listen_depth 4096 -p lru_interval 3600 -h
> classic,500009 -s malloc,60G -P /var/run/varnish.pid
>
> Each varnish handles about 3000 req/s before it caves in.
>
> Any suggestions?

I'm not really sure if I can help, but at least I can tell you that we
run a similar setup. However, our images are never changed, only
deleted, and when that happens, a PURGE request clears them off the
proxies.

Therefore, we take a slightly different route: a huge file-based cache
(over 500 GB) and a huge default_ttl (one year; only 404 errors have a
shorter TTL of 3 hours). We also have 32 GB of RAM installed, and set
thread_pool_max to 8000 and -h classic,2500009, following a hint in the
wiki.
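
As a rough sketch only, the settings described above could translate into
a command line like the one printed below. The storage path is a
placeholder, the 500G size stands in for "over 500 GB", and the listen
address is borrowed from the quoted command; the VCL file, user and other
options are left out. The shorter 404 TTL would presumably be set in VCL
and is not shown. The snippet prints the command instead of running it.

```shell
# One year expressed in seconds, for the -t (default TTL) option.
ttl_one_year=$((365 * 24 * 60 * 60))

# Print (do not execute) a varnishd command line with the values
# mentioned above. /path/to/varnish_storage.bin is a placeholder.
varnishd_cmd() {
    printf '%s\n' "/usr/sbin/varnishd -a :80 -t $ttl_one_year -p thread_pool_max 8000 -h classic,2500009 -s file,/path/to/varnish_storage.bin,500G"
}

varnishd_cmd
```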

I have not looked at the thread count for a long time, and don't really
know whether 8000 is good or bad. Since everything works nicely, I just
keep it as it is.

Having said all this, our request pattern is not really like yours. Only
~60 % of the cache file is in use after 50 days, and peak traffic for
each of the three proxies is only about 300 req/s. A single proxy would
probably cope well with our traffic; we run three as a simple HA measure.
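
To put the two traffic levels side by side, here is a back-of-the-envelope
estimate using Little's law (busy threads are roughly request rate times
time per request). This is a general rule of thumb, not something measured
on either setup, and the 20 ms and 500 ms response times are made-up
illustrative values.

```shell
# threads_needed RATE_PER_SEC LATENCY_MS
# Little's law estimate: busy threads ~= requests/s * seconds per request.
threads_needed() {
    rate=$1
    latency_ms=$2
    # Integer arithmetic, rounded up to a whole thread.
    echo $(( (rate * latency_ms + 999) / 1000 ))
}

threads_needed 300 20     # light load, fast responses (hypothetical 20 ms)
threads_needed 3000 20    # ten times the rate at the same response time
threads_needed 3000 500   # slow backend (hypothetical 500 ms): 1500 threads
```

Under these assumptions, a backend slowdown alone is enough to move the
thread count from well under a hundred to a 1500-thread limit at
3000 req/s, which would be consistent with the pile-up described in the
quoted message.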

Cheers,

Sascha

>
> Are the parameters sane?
>
> --
> Audun
>
> _______________________________________________
> varnish-misc mailing list
> varnish-misc at projects.linpro.no
> http://projects.linpro.no/mailman/listinfo/varnish-misc