Bug? Barrage of hits leads to failure creating worker threads / stats tracking
John Adams
jna at twitter.com
Sat Apr 11 01:30:27 CEST 2009
Something's very wrong here - we've never experienced this before.
Are you starting the server as root or as another user? Any ulimit or
restrictions on # of file descriptors?
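
A minimal sketch of how those limits could be checked, assuming bash on
Linux. `show_limits` is a hypothetical helper name, not part of Varnish.
On Linux, threads count against the per-user process limit, so the
numbers that matter are those of the user varnishd runs as (apache in
the command quoted below), not necessarily those of the root shell that
launched it. Whether any of these limits is the actual cause here is not
established.

```shell
# Hypothetical helper: print the resource limits most likely to cap
# thread creation for the current user.
show_limits() {
    echo "open files (ulimit -n): $(ulimit -n)"
    # -u is not supported by every shell, so fall back to "n/a".
    echo "max user processes (ulimit -u): $(ulimit -u 2>/dev/null || echo n/a)"
    # Each thread reserves a stack of this size in address space.
    echo "stack size KB (ulimit -s): $(ulimit -s)"
    # System-wide thread ceiling; Linux-specific path.
    if [ -r /proc/sys/kernel/threads-max ]; then
        echo "kernel threads-max: $(cat /proc/sys/kernel/threads-max)"
    fi
}

show_limits
```

Running it once as root and once as the unprivileged user makes any
difference between the two sets of limits visible.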
-j
On Apr 10, 2009, at 3:58 PM, Ray Barnes wrote:
> John,
>
> Thanks for the reply; as you can see my config is largely based on
> the one you posted to this list in February (thanks!).
>
> I went back as you suggested and waited 90 seconds, while starting
> it the same way. Before running any tests, I went into the CLI and
> viewed stats on the threads:
>
> 364 N worker threads
> 364 N worker threads created
> 782 N worker threads not created
>
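
To put the counters above next to the configured minimum, here is a
rough sketch. It assumes thread_pool_min applies per pool, which matches
the "1400ish" figure mentioned further down (8 pools x 180 threads), and
`check_workers` is a hypothetical helper, not a Varnish tool.

```shell
# Hypothetical helper: read Varnish "stats" output on stdin and compare
# the live worker-thread count against an expected minimum.
check_workers() {
    expected=$1
    awk -v expected="$expected" '
        # Match only the bare "N worker threads" line, not "... created".
        $2 == "N" && $3 == "worker" && $4 == "threads" && NF == 4 { running = $1 }
        /N worker threads not created/ { failed = $1 }
        END {
            printf "running=%d expected=%d shortfall=%d failed_creates=%d\n",
                   running, expected, expected - running, failed
        }'
}

# Figures taken from the message: 8 pools x 180 threads per pool.
check_workers $((8 * 180)) <<'EOF'
364 N worker threads
364 N worker threads created
782 N worker threads not created
EOF
# prints: running=364 expected=1440 shortfall=1076 failed_creates=782
```

A non-zero failed_creates value means thread creation attempts were
refused, rather than the threads simply not having been started yet.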
> When this happens (started threads do not match the number
> specified), varnish does really unpredictable things, i.e. it won't
> take 300 connections from 'ab' and times out with the following
> message:
>
> Benchmarking 98.124.141.3 (be patient)
> apr_poll: The timeout specified has expired (70007)
> Total of 52 requests completed
>
> I think the crux of my problem is figuring out why it won't start
> more threads. Being not-so-familiar with the internals of varnish,
> I can't tell whether that's an OS problem or a varnish problem.
> Hope that helps.
>
> -Ray
>
>
>
> On Fri, Apr 10, 2009 at 6:35 PM, John Adams <jna at twitter.com> wrote:
> It takes time to spawn threads. If you start the server with
> hundreds of threads, they won't be ready for ~30-90 seconds.
>
> Maybe that's causing this issue?
>
> -j
>
> On Apr 10, 2009, at 3:12 PM, Ray Barnes wrote:
>
>> Hi all. Note that everything herein is based only on a very lay
>> knowledge of varnish, without being familiar with the internals of
>> the code.
>>
>> In my quest to eke more performance out of Varnish, I've been
>> testing under 2.0.4. I have not seen much improvement over 2.0.3
>> in the way it acts after receiving a bunch of hits all at one
>> time. I am invoking varnish like this:
>>
>> ulimit -n 131072
>> ulimit -l 82000
>> /usr/local/sbin/varnishd -a 98.124.141.3:80 -b 67.212.179.98:80 \
>>   -T 98.124.141.3:6083 \
>>   -t 60 -w1440,3000,60 -u apache -g apache \
>>   -p obj_workspace=16000 -p sess_workspace=262144 -p listen_depth=4096 \
>>   -p shm_workspace=64000 -p thread_pools=8 -p thread_pool_min=180 \
>>   -p ping_interval=1 -p srcaddr_ttl=0 -s malloc,80M
>>
>> As best I can tell, the problem I'm seeing is that it will not
>> create the number of worker threads that I'm telling it to, as
>> evidenced by the 'status' output within the CLI immediately after
>> launch:
>>
>> 270 N worker threads
>> 285 N worker threads created
>>
>> So if I launch 'ab' with 700 connections against varnish, it will
>> not work right from the beginning, like so:
>>
>> [root at mia ~]# ab -n 20000 -c 700 http://98.124.141.3/
>> This is ApacheBench, Version 2.0.40-dev <$Revision: 1.146 $>
>> apache-2.0
>> Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
>> Copyright 2006 The Apache Software Foundation, http://www.apache.org/
>> Benchmarking 98.124.141.3 (be patient)
>> apr_socket_recv: Connection refused (111)
>> [root at mia ~]# ab -n 20000 -c 700 http://98.124.141.3/
>> This is ApacheBench, Version 2.0.40-dev <$Revision: 1.146 $>
>> apache-2.0
>> Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
>> Copyright 2006 The Apache Software Foundation, http://www.apache.org/
>> Benchmarking 98.124.141.3 (be patient)
>> apr_poll: The timeout specified has expired (70007)
>> Total of 147 requests completed
>> [root at mia ~]# telnet 98.124.141.3 80
>> Trying 98.124.141.3...
>> Connected to 98.124.141.3 (98.124.141.3).
>> Escape character is '^]'.
>> GET / HTTP/1.0
>> ^]
>> telnet> quit
>> Connection closed.
>>
>> The above telnet command simply hung, presumably because there are
>> still 700 sessions in CLOSE_WAIT state within the kernel, although
>> that should not matter if varnish opened the number of worker
>> threads it was supposed to. Based on what I've seen, it would seem
>> that varnish has some problem when you launch it with "too many"
>> initial worker threads (although I'm having a hard time
>> understanding why 1400ish is too many). It seems to go crazy if
>> you specify too many threads initially. Again, that number should
>> not be a problem for the machine in theory, as it's a multicore
>> Xeon. Platform is Linux 2.6 RHEL. Any idea what's happening here?
>>
>> -Ray
>>
>
> ---
> John Adams
> Twitter Operations
> jna at twitter.com
> http://twitter.com/netik
>
---
John Adams
Twitter Operations
jna at twitter.com
http://twitter.com/netik