Is anyone using ESI with a lot of traffic?

Mon Mar 2 22:10:16 CET 2009

HAProxy doesn't do keep-alive, so it makes everything slower.

Artur

On Feb 27, 2009, at 9:02 PM, Cloude Porteus wrote:

> John,
> Thanks so much for the info, that's a huge help for us!!!
>
> I love HAProxy and Willy has been awesome to us. We run everything
> through it, since it's really easy to monitor and also easy to debug
> where the lag is when something in the chain is not responding fast
> enough. It's been rock solid for us.
>
> The nice part for us is that we can use it as a content switcher to
> send all /xxx traffic or certain user-agent traffic to different
> backends.
>
> best,
> cloude
>
> On Fri, Feb 27, 2009 at 2:24 PM, John Adams <jna at twitter.com> wrote:
>> cc'ing the varnish dev list for comments...
>> On Feb 27, 2009, at 1:33 PM, Cloude Porteus wrote:
>>
>> John,
>> Good to hear from you. You must be slammed at Twitter. I'm happy to
>> hear that ESI is holding up for you. It's been in my backlog since you
>> mentioned it to me pre-Twitter.
>>
>> Any performance info would be great.
>>
>>
>> Any comments on our setup are welcome. You may also choose to call us
>> crazypants. Many, many thanks to Artur Bergman of Wikia for helping us
>> get this configuration straightened out.
>>
>> Right now, we're running varnish (on search) in a bit of a
>> non-standard way. We plan to use it in the normal fashion (varnish to
>> Internet, nothing in between) on our API at some point. We're running
>> version 2.0.2, no patches. Cache hit rates range from 10% to 30%, or
>> higher when a real-time event is flooding search.
>>
>> 2.0.2 is quite stable for us, with the occasional child death here and
>> there when we get massive headers coming in that flood sess_workspace.
>> I hear this is fixed in 2.0.3, but haven't had time to try it yet.
>>
>> We have a number of search boxes, and each search box has an apache
>> instance and a varnish instance on it. We plan to merge the varnish
>> instances at some point, but we use very low TTLs (Twitter is the real
>> time web!) and don't see much of a savings by running fewer of them.
>>
>> We do:
>> Apache --> Varnish --> Apache --> Mongrels
>> Apaches are using mod_proxy_balancer. The front end apache is there
>> because we've long had a fear that Varnish would crash on us, which it
>> did many times prior to our figuring out the proper parameters for
>> startup. We have two entries in that balancer. Either the request goes
>> to varnish, or, if varnish bombs out, it goes directly to the mongrel.
>>
>> We do this because we need a load balancing algorithm that varnish
>> doesn't support, called bybusyness. Without bybusyness, varnish tries
>> to direct requests to Mongrels that are busy, and requests end up in
>> the listen queue. That adds ~100-150 ms to load times, and that's no
>> good for our desired service times of 200-250 ms (or less).
>>
>> We'd be so happy if someone put bybusyness into Varnish's backend load
>> balancing, but it's not there yet.
>>
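
For anyone unfamiliar with the idea, here is a minimal sketch of least-busy selection, which is what mod_proxy_balancer's bybusyness method is named for. It is an illustration only, not Apache's or Varnish's code. The Backend class, the pick_least_busy function, and the backend names are all made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """Hypothetical backend record; not a Varnish or Apache structure."""
    name: str
    in_flight: int = 0  # requests currently being served by this backend


def pick_least_busy(backends):
    """Return the backend with the fewest in-flight requests.

    Ties go to the earliest backend in the list, because min() returns
    the first minimal element it sees.
    """
    if not backends:
        raise ValueError("no backends configured")
    return min(backends, key=lambda b: b.in_flight)


# Hypothetical pool: two Mongrels are busy, so the new request avoids them.
pool = [
    Backend("mongrel-8000", in_flight=1),
    Backend("mongrel-8001", in_flight=0),
    Backend("mongrel-8002", in_flight=1),
]
chosen = pick_least_busy(pool)
chosen.in_flight += 1  # the caller would decrement this when the response ends
```

A director that ignores in-flight counts can hand a request to a Mongrel that is already working, which is how requests end up waiting in the listen queue as described above.
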
>> We also know that taking the extra hop through localhost costs us next
>> to nothing in service time, so it's good to have Apache there in case
>> we need to yank out Varnish. In the future, we might get rid of Apache
>> and use HAProxy (its load balancing and backend monitoring are much
>> richer than Apache's, and it has a beautiful HTTP interface to look
>> at).
>>
>> Some variables and our decisions:
>>      -p obj_workspace=4096 \
>>      -p sess_workspace=262144 \
>> Absolutely vital! Varnish does not allocate enough space by default for
>> headers, regexps on cookies, and so on. It was increased in 2.0.3, but
>> really, not increased enough. Without this we were panicking every
>> 20-30 requests and overflowing the sess hash.
>>      -p listen_depth=8192 \
>> 8192 is probably excessive for now. If we're queuing 8k conns,
>> something is really broken!
>>      -p log_hashstring=off \
>> Who cares about this - we don't need it.
>>      -p lru_interval=60 \
>> We have many small objects in the search cache. Run LRU more often.
>>      -p sess_timeout=10 \
>> If you keep session data around for too long, you waste memory.
>>      -p shm_workspace=32768 \
>> Give us a bit more room in shm
>>      -p ping_interval=1 \
>> Frequent pings in case the child dies on us.
>>      -p thread_pools=4 \
>>      -p thread_pool_min=100 \
>> This must match up with VARNISH_MIN_THREADS. We use four pools
>> (thread_pools * thread_pool_min == VARNISH_MIN_THREADS).
>>      -p srcaddr_ttl=0 \
>> Disable the (effectively unused) per source-IP statistics
>>      -p esi_syntax=1
>> Disable ESI syntax verification so we can use it to process JSON
>> requests.
>> If you have more than 2.1M objects, you should also add:
>> # -h classic,250007 = recommended value for 2.1M objects
>> #     number should be 1/10 expected working set.
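
The thread settings quoted above have to agree with each other, and that is easy to break when editing one of them. As a rough sketch, the relationship can be checked like this. The function name min_threads_consistent is made up; the numbers are the ones quoted in this message.

```python
def min_threads_consistent(thread_pools, thread_pool_min, configured_min):
    """True when pool count times per-pool minimum equals the -w minimum."""
    return thread_pools * thread_pool_min == configured_min


# Values copied from the settings quoted in this message:
# -p thread_pools=4, -p thread_pool_min=100, VARNISH_MIN_THREADS=400.
ok = min_threads_consistent(thread_pools=4, thread_pool_min=100,
                            configured_min=400)
print("thread settings consistent:", ok)
```
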
>>
>> In our VCL, we have a few fancy tricks that we use. We label the cache
>> server and cache hit/miss rate in vcl_deliver with this code:
>> Top of VCL:
>> C{
>> #include <stdio.h>
>> #include <unistd.h>
>> char myhostname[255] = "";
>>
>> }C
>> vcl_deliver:
>> C{
>>     VRT_SetHdr(sp, HDR_RESP, "\014X-Cache-Svr:", myhostname,
>>         vrt_magic_string_end);
>> }C
>>     /* mark hit/miss on the request */
>>     if (obj.hits > 0) {
>>       set resp.http.X-Cache = "HIT";
>>       set resp.http.X-Cache-Hits = obj.hits;
>>     } else {
>>       set resp.http.X-Cache = "MISS";
>>     }
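
A note on the "\014" in the inline C above: it is an octal length byte. Octal 014 is 12, the length of "X-Cache-Svr:" including the colon, and Varnish 2.x expects header names passed to VRT_SetHdr in that length-prefixed form. Getting the count wrong by hand is easy, so here is a small sketch that generates the literal. The helper name vrt_header_literal is made up; it is not part of Varnish.

```python
def vrt_header_literal(name):
    """Build the length-prefixed C string literal for a header name.

    The first byte is the length of the name including the trailing
    colon, written as a three-digit octal escape. Three digits are used
    so that a header name starting with a digit cannot be absorbed into
    the escape sequence by the C compiler.
    """
    if not name.endswith(":"):
        name += ":"
    if len(name) > 255:
        raise ValueError("header name too long for a one-byte length prefix")
    return '"\\%03o%s"' % (len(name), name)


print(vrt_header_literal("X-Cache-Svr"))  # prints "\014X-Cache-Svr:"
```

The printed text, quotes included, can be pasted straight into the VRT_SetHdr call.
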
>>
>> vcl_recv:
>> C{
>>    if (myhostname[0] == '\0') {
>>      /* only get hostname once - restart required if hostname changes */
>>      gethostname(myhostname, 255);
>>    }
>> }C
>>
>> Portions of /etc/sysconfig/varnish follow...
>> # The minimum number of worker threads to start
>> VARNISH_MIN_THREADS=400
>> # The Maximum number of worker threads to start
>> VARNISH_MAX_THREADS=1000
>> # Idle timeout for worker threads
>> VARNISH_THREAD_TIMEOUT=60
>> # Cache file location
>> VARNISH_STORAGE_FILE=/var/lib/varnish/varnish_storage.bin
>> # Cache file size: in bytes, optionally using k / M / G / T suffix,
>> # or in percentage of available disk space using the % suffix.
>> VARNISH_STORAGE_SIZE="8G"
>> #
>> # Backend storage specification
>> VARNISH_STORAGE="malloc,${VARNISH_STORAGE_SIZE}"
>> # Default TTL used when the backend does not specify one
>> VARNISH_TTL=5
>> # the working directory
>> DAEMON_OPTS="-a ${VARNISH_LISTEN_ADDRESS}:${VARNISH_LISTEN_PORT} \
>>              -f ${VARNISH_VCL_CONF} \
>>              -T ${VARNISH_ADMIN_LISTEN_ADDRESS}:${VARNISH_ADMIN_LISTEN_PORT} \
>>              -t ${VARNISH_TTL} \
>>              -n ${VARNISH_WORKDIR} \
>>              -w ${VARNISH_MIN_THREADS},${VARNISH_MAX_THREADS},${VARNISH_THREAD_TIMEOUT} \
>>              -u varnish -g varnish \
>>              -p obj_workspace=4096 \
>>              -p sess_workspace=262144 \
>>              -p listen_depth=8192 \
>>              -p log_hashstring=off \
>>              -p lru_interval=60 \
>>              -p sess_timeout=10 \
>>              -p shm_workspace=32768 \
>>              -p ping_interval=1 \
>>              -p thread_pools=4 \
>>              -p thread_pool_min=100 \
>>              -p srcaddr_ttl=0 \
>>              -p esi_syntax=1 \
>>              -s ${VARNISH_STORAGE}"
>>
>> ---
>> John Adams
>> Twitter Operations
>> jna at twitter.com
>> http://twitter.com/netik
>>
>
>
>
> -- 
> VP of Product Development
> Instructables.com
>
> http://www.instructables.com/member/lebowski
> _______________________________________________
> varnish-dev mailing list
> varnish-dev at projects.linpro.no
> http://projects.linpro.no/mailman/listinfo/varnish-dev