keeping varnishstat open will bring down server
Poul-Henning Kamp
phk at phk.freebsd.dk
Tue Apr 13 15:13:52 CEST 2010
Please open a ticket.
In message <2903443B3710364B814B820238DDEF2CA761B759 at TIL-EXCH-01.netmatch.local>, Angelo Höngens writes:
>Hey guys,
>
>I've seen something I'd like to share with you, perhaps it could be seen as
>a bug in varnishstat.
>
>Yesterday I opened ssh sessions to my 4 balancers, to run some scripts, and
>then I opened varnishstat to monitor them. A while later I had to leave in
>a rush and closed my laptop's lid, and in that process killed my vpn tunnel
>and ssh sessions. However, the varnishstat process (apparently) keeps
>running. (FreeBSD 7.2 x64)
>
>Just a few hours ago (so around 16 hours later), I had one balancer die on
>me (become completely unresponsive, refuse connections to port 80). I
>immediately restarted varnishd, and I also saw a varnishstat instance eat
>100% cpu, which I killed.
>
>Now when I just looked on the other balancers, I see the varnishstat
>instance using up a lot of CPU (only one out of 4 cores though):
>
>
>last pid: 77863;  load averages: 1.40, 1.48, 1.47  up 105+00:24:26  14:56:40
>166 processes: 2 running, 164 sleeping
>CPU: 27.1% user, 0.0% nice, 4.2% system, 1.9% interrupt, 66.8% idle
>Mem: 6430M Active, 550M Inact, 709M Wired, 189M Cache, 399M Buf, 32M Free
>Swap: 4096M Total, 228M Used, 3868M Free, 5% Inuse
>
>  PID USERNAME THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
>69587 root       1 112    0 95640K  1044K CPU3   3  19.1H 77.20% varnishstat
>76211 haproxy    1   4    0 48928K 18944K kqread 1  16:34  3.17% haproxy
>68762 www      116  44    0  8756M  6412M select 0   0:01  0.39% varnishd
>31203 root       1  44    0   176M  5476K select 2 439:16  0.00% snmpd
>69527 root       1   8    0 94312K 83384K nanslp 0  11:59  0.00% varnishncsa
>37934 root       1   4    0 66244K  3164K kqread 0   8:46  0.00% squid
> 1912 root       1  44    0 10484K   724K select 0   7:50  0.00% ntpd
> 2036 root       1  44    0 85732K  3528K select 1   4:12  0.00% httpd
>56664 root       1  44    0  5692K   616K select 2   0:51  0.00% syslogd
> 2056 root       1   8    0  6748K   392K nanslp 2   0:33  0.00% cron
> 2023 root       1   4    0  5808K   428K kqread 0   0:23  0.00% master
> 2031 postfix    1   4    0  5808K   408K kqread 0   0:22  0.00% qmgr
>76181 www        1   4    0 85732K  3732K kqread 3   0:01  0.00% httpd
>76182 www        1  20    0 85732K  3716K lockf  3   0:01  0.00% httpd
>76185 www        1  20    0 85732K  3696K lockf  2   0:01  0.00% httpd
>76298 www        1  20    0 85732K  3868K lockf  3   0:01  0.00% httpd
>
>
>So it seems that when varnishstat runs for a long time, it uses more and
>more resources and, in my case, even causes varnishd to fail somehow (it
>could be a coincidence, but I don't think so).
>
>After killing varnishstat, load went back from 1.5 to 0.2, around the usual.
>
>--
>With kind regards,
>
>Angelo Höngens
>Systems Administrator
>
>------------------------------------------
>NetMatch
>tourism internet software solutions
>
>Ringbaan Oost 2b
>5013 CA Tilburg
>T: +31 (0)13 5811088
>F: +31 (0)13 5821239
>
>mailto:A.Hongens at netmatch.nl
>http://www.netmatch.nl
>------------------------------------------
--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.