Is anyone using ESI with a lot of traffic?

Sat Feb 28 06:02:55 CET 2009

John,
Thanks so much for the info, that's a huge help for us!!!

I love HAProxy and Willy has been awesome to us. We run everything
through it, since it's really easy to monitor and also easy to debug
where the lag is when something in the chain is not responding fast
enough. It's been rock solid for us.

The nice part for us is that we can use it as a content switcher to
send all /xxx traffic or certain user-agent traffic to different
backends.
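
Roughly, the content-switching rule looks like the following. This is a Python sketch of the decision only, not HAProxy configuration; the backend names and the "bot" user-agent match are hypothetical placeholders, and "/xxx" stands in for whatever path prefix you switch on.

```python
def pick_backend(path, user_agent):
    """Pick a backend name for a request (all names are hypothetical)."""
    # Path-based switching: everything under /xxx goes to its own backend.
    if path.startswith("/xxx"):
        return "xxx_backend"
    # User-agent-based switching: crawlers get a separate backend.
    if "bot" in user_agent.lower():
        return "crawler_backend"
    # Everything else goes to the default pool.
    return "default_backend"
```

The path rule is checked first, so a crawler asking for /xxx still lands on the /xxx backend.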

best,
cloude

On Fri, Feb 27, 2009 at 2:24 PM, John Adams <jna at twitter.com> wrote:
> cc'ing the varnish dev list for comments...
> On Feb 27, 2009, at 1:33 PM, Cloude Porteus wrote:
>
> John,
> Good to hear from you. You must be slammed at Twitter. I'm happy to
> hear that ESI is holding up for you. It's been in my backlog since you
> mentioned it to me pre-Twitter.
>
> Any performance info would be great.
>
>
> Any comments on our setup are welcome. You may also choose to call us
> crazypants. Many, many thanks to Artur Bergman of Wikia for helping us get
> this configuration straightened out.
> Right now, we're running varnish (on search) in a bit of a non-standard way.
> We plan to use it in the normal fashion (varnish to Internet, nothing
> in between) on our API at some point. We're running version 2.0.2, no
> patches. Cache hit rates range from 10% to 30%, or higher when a real-time
> event is flooding search.
> 2.0.2 is quite stable for us, with the occasional child death here and there
> when we get massive headers coming in that flood sess_workspace. I hear this
> is fixed in 2.0.3, but haven't had time to try it yet.
> We have a number of search boxes, and each search box has an Apache instance
> and a Varnish instance on it. We plan to merge the Varnish instances at some
> point, but we use very low TTLs (Twitter is the real time web!) and don't
> see much of a savings by running fewer of them.
> We do:
> Apache --> Varnish --> Apache --> Mongrels
> Apaches are using mod_proxy_balancer. The front end apache is there because
> we've long had a fear that Varnish would crash on us, which it did many
> times prior to our figuring out the proper parameters for startup. We have
> two entries in that balancer. Either the request goes to varnish, or, if
> varnish bombs out, it goes directly to the mongrel.
> We do this because we need a load balancing algorithm that Varnish doesn't
> support, called bybusyness. Without bybusyness, Varnish tries to direct
> requests to Mongrels that are busy, and requests end up in the listen queue.
> That adds ~100-150 ms to load times, and that's no good for our desired
> service times of 200-250 ms (or less).
> We'd be so happy if someone put bybusyness into Varnish's backend load
> balancing, but it's not there yet.
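
For anyone following along: as I understand it, this method (Apache spells it lbmethod=bybusyness) sends each request to the worker with the fewest requests in flight, so a busy Mongrel gets skipped instead of having requests queue up behind it. A minimal Python sketch of that selection rule, not mod_proxy_balancer's actual code; the Backend class, the busy counter, and the backend names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    name: str       # hypothetical label, e.g. "mongrel-8000"
    busy: int = 0   # requests currently in flight on this backend


def pick_least_busy(backends):
    """Return the backend with the fewest in-flight requests.

    Ties go to the earliest entry in the list.
    """
    return min(backends, key=lambda b: b.busy)


def dispatch(backends):
    """Choose a backend and count the new request against it."""
    chosen = pick_least_busy(backends)
    chosen.busy += 1
    return chosen


def finish(backend):
    """Call when the response is done so the backend frees up again."""
    backend.busy -= 1


pool = [Backend("mongrel-8000"), Backend("mongrel-8001"), Backend("mongrel-8002")]
first = dispatch(pool)    # all idle, so the first entry wins the tie
second = dispatch(pool)   # mongrel-8000 is busy now, so this goes elsewhere
```

A plain round-robin would hand the second request to whichever backend is next in line, busy or not, which is where the listen-queue delay comes from.
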
> We also know that taking the extra hop through localhost costs us next to
> nothing in service time, so it's good to have Apache there in case we need to
> yank out Varnish. In the future, we might get rid of Apache and use HAProxy
> (its load balancing and backend monitoring are much richer than Apache's, and
> it has a beautiful HTTP interface to look at).
> Some variables and our decisions:
>     -p obj_workspace=4096 \
>     -p sess_workspace=262144 \
> Absolutely vital! Varnish does not allocate enough space by default for
> headers, regexps on cookies, and so on. It was increased in 2.0.3, but
> really, not increased enough. Without this we were panicking every 20-30
> requests and overflowing the sess hash.
>     -p listen_depth=8192 \
> 8192 is probably excessive for now. If we're queuing 8k conns, something is
> really broken!
>     -p log_hashstring=off \
> Who cares about this - we don't need it.
>     -p lru_interval=60 \
> We have many small objects in the search cache. Run LRU more often.
>     -p sess_timeout=10 \
> If you keep session data around for too long, you waste memory.
>     -p shm_workspace=32768 \
> Give us a bit more room in shm.
>     -p ping_interval=1 \
> Frequent pings in case the child dies on us.
>     -p thread_pools=4 \
>     -p thread_pool_min=100 \
> This must match up with VARNISH_MIN_THREADS. We use four pools
> (thread_pools * thread_pool_min == VARNISH_MIN_THREADS).
>     -p srcaddr_ttl=0 \
> Disable the (effectively unused) per source-IP statistics.
>     -p esi_syntax=1
> Disable ESI syntax verification so we can use it to process JSON requests.
> If you have more than 2.1M objects, you should also add:
> # -h classic,250007 = recommended value for 2.1M objects
> #     number should be 1/10 expected working set.
>
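
The two sizing rules above (thread pools times thread_pool_min should equal VARNISH_MIN_THREADS, and hash buckets should be about 1/10 of the expected working set) are easy to sanity-check before a restart. A rough Python sketch of the arithmetic only; the function names are mine and nothing here talks to varnishd.

```python
def min_threads(thread_pools, thread_pool_min):
    """Total minimum worker threads implied by the -p settings."""
    return thread_pools * thread_pool_min


def threads_consistent(thread_pools, thread_pool_min, varnish_min_threads):
    """True when the -p settings agree with VARNISH_MIN_THREADS."""
    return min_threads(thread_pools, thread_pool_min) == varnish_min_threads


def suggested_hash_buckets(expected_objects):
    """Apply the '1/10 of the expected working set' rule of thumb."""
    return expected_objects // 10


# Values taken from the settings quoted in this thread.
ok = threads_consistent(thread_pools=4, thread_pool_min=100,
                        varnish_min_threads=400)
buckets = suggested_hash_buckets(2_100_000)
```

For 2.1M objects the rule of thumb gives 210000, so the 250007 in the mail sits a bit above the 1/10 mark.
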
> In our VCL, we have a few fancy tricks that we use. We label the cache
> server and cache hit/miss rate in vcl_deliver with this code:
> Top of VCL:
> C{
> #include <stdio.h>
> #include <unistd.h>
> char myhostname[255] = "";
>
> }C
> vcl_deliver:
> C{
>      VRT_SetHdr(sp, HDR_RESP, "\014X-Cache-Svr:", myhostname,
>                 vrt_magic_string_end);
> }C
>      /* mark hit/miss on the request */
>      if (obj.hits > 0) {
>        set resp.http.X-Cache = "HIT";
>        set resp.http.X-Cache-Hits = obj.hits;
>      } else {
>        set resp.http.X-Cache = "MISS";
>      }
>
> vcl_recv:
> C{
>     if (myhostname[0] == '\0') {
>       /* only get hostname once - restart required if hostname changes */
>       gethostname(myhostname, 255);
>     }
> }C
>
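
Since vcl_deliver stamps every response with X-Cache and X-Cache-Svr, the per-server hit rate can be tallied from captured response headers. A small Python sketch of that bookkeeping; the function name and the sample headers (including the search01/search02 hostnames) are invented for illustration and are not real data.

```python
from collections import defaultdict


def hit_rates(header_dicts):
    """Return {server: fraction of responses marked HIT}."""
    hits = defaultdict(int)
    total = defaultdict(int)
    for headers in header_dicts:
        # Responses without the header are grouped under "unknown".
        server = headers.get("X-Cache-Svr", "unknown")
        total[server] += 1
        if headers.get("X-Cache") == "HIT":
            hits[server] += 1
    return {server: hits[server] / total[server] for server in total}


# Made-up sample responses, only to show the shape of the input.
sample = [
    {"X-Cache-Svr": "search01", "X-Cache": "HIT", "X-Cache-Hits": "3"},
    {"X-Cache-Svr": "search01", "X-Cache": "MISS"},
    {"X-Cache-Svr": "search02", "X-Cache": "MISS"},
    {"X-Cache-Svr": "search02", "X-Cache": "MISS"},
]
rates = hit_rates(sample)
```

Grouping by X-Cache-Svr matters when each box runs its own Varnish, since every instance has a separate cache.
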
> Portions of /etc/sysconfig/varnish follow...
> # The minimum number of worker threads to start
> VARNISH_MIN_THREADS=400
> # The Maximum number of worker threads to start
> VARNISH_MAX_THREADS=1000
> # Idle timeout for worker threads
> VARNISH_THREAD_TIMEOUT=60
> # Cache file location
> VARNISH_STORAGE_FILE=/var/lib/varnish/varnish_storage.bin
> # Cache file size: in bytes, optionally using k / M / G / T suffix,
> # or in percentage of available disk space using the % suffix.
> VARNISH_STORAGE_SIZE="8G"
> #
> # Backend storage specification
> VARNISH_STORAGE="malloc,${VARNISH_STORAGE_SIZE}"
> # Default TTL used when the backend does not specify one
> VARNISH_TTL=5
> # the working directory
> DAEMON_OPTS="-a ${VARNISH_LISTEN_ADDRESS}:${VARNISH_LISTEN_PORT} \
>              -f ${VARNISH_VCL_CONF} \
>              -T ${VARNISH_ADMIN_LISTEN_ADDRESS}:${VARNISH_ADMIN_LISTEN_PORT} \
>              -t ${VARNISH_TTL} \
>              -n ${VARNISH_WORKDIR} \
>              -w ${VARNISH_MIN_THREADS},${VARNISH_MAX_THREADS},${VARNISH_THREAD_TIMEOUT} \
>              -u varnish -g varnish \
>              -p obj_workspace=4096 \
>              -p sess_workspace=262144 \
>              -p listen_depth=8192 \
>              -p log_hashstring=off \
>              -p lru_interval=60 \
>              -p sess_timeout=10 \
>              -p shm_workspace=32768 \
>              -p ping_interval=1 \
>              -p thread_pools=4 \
>              -p thread_pool_min=100 \
>              -p srcaddr_ttl=0 \
>              -p esi_syntax=1 \
>              -s ${VARNISH_STORAGE}"
>
> ---
> John Adams
> Twitter Operations
> jna at twitter.com
> http://twitter.com/netik

-- 
VP of Product Development
Instructables.com

http://www.instructables.com/member/lebowski