Child process recurrently being restarted
stefanobaldo at gmail.com
Mon Jun 26 16:51:40 CEST 2017
Thanks for answering.
I'm using an SSD. I've changed from ext4 to ext2 to increase
performance, but it is still restarting.
Also, I checked the I/O performance for the disk and there is no sign of
I've moved /var/lib/varnish to a tmpfs and increased its 80m default
size by passing "-l 200m,20m" to varnishd and using
"nodev,nosuid,noatime,size=256M 0 0" for the tmpfs mount. There was a
problem here: after a couple of hours Varnish died and I received a "no
space left on device" message. Deleting /var/lib/varnish solved the
problem and Varnish came back up, but it's weird because there was free
memory on the host to be used by the tmpfs directory, so I don't know
what could have happened. I will try to stop increasing the
Anyway, I am worried about the bans. You asked me if the bans are lurker
friendly. Well, I don't think so. My bans are created this way:
ban("req.http.host == " + req.http.host + " && req.url ~ " + req.url + " &&
req.http.User-Agent !~ Googlebot");
Are they lurker friendly? I took a quick look at the documentation
and it looks like they're not.
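
For comparison, a lurker-friendly version of this ban might look like the sketch below. It is only an illustration under assumptions: the ban lurker can only evaluate bans that reference obj.* fields, so the host and URL are first copied onto the object as headers. The header names x-host and x-url, the BAN request method used as a trigger, and the backend address are all hypothetical and not taken from my setup. The User-Agent condition is left out because it refers to the request, which the lurker has no access to, so it would need to be handled some other way.

```vcl
vcl 4.0;

# Hypothetical backend, only here so the sketch compiles on its own.
backend default {
    .host = "127.0.0.1";
    .port = "8080";
}

sub vcl_backend_response {
    # Copy the request properties onto the object so that bans can be
    # expressed purely in terms of obj.* (hypothetical header names).
    set beresp.http.x-host = bereq.http.host;
    set beresp.http.x-url = bereq.url;
}

sub vcl_recv {
    # Hypothetical trigger: a BAN request. Access control is omitted.
    if (req.method == "BAN") {
        # Only obj.* is referenced, so the ban lurker can test cached
        # objects against this ban in the background.
        ban("obj.http.x-host == " + req.http.host +
            " && obj.http.x-url ~ " + req.url);
        return (synth(200, "Ban added"));
    }
}

sub vcl_deliver {
    # Do not expose the helper headers to clients.
    unset resp.http.x-host;
    unset resp.http.x-url;
}
```

With bans written this way, the lurker should be able to work through the ban list and discard bans that no longer match anything, instead of leaving every ban to be checked at request time.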
On Fri, Jun 23, 2017 at 11:30 AM, Guillaume Quintard <
guillaume at varnish-software.com> wrote:
> Hi Stefano,
> Let's cover the usual suspects: I/Os. I think here Varnish gets stuck
> trying to push/pull data and can't make time to reply to the CLI. I'd
> recommend monitoring the disk activity (bandwidth and iops) to confirm.
> After some time, the file storage is terrible on a hard drive (SSDs take a
> bit more time to degrade) because of fragmentation. One solution to help
> the disks cope is to overprovision them if they're SSDs, and you can try
> different advice settings in the file storage definition on the command line (last
> parameter, after granularity).
> Is your /var/lib/varnish mount on tmpfs? That could help too.
> 40K bans is a lot, are they ban-lurker friendly?
> Guillaume Quintard
> On Fri, Jun 23, 2017 at 4:01 PM, Stefano Baldo <stefanobaldo at gmail.com>
>> I am having a critical problem with Varnish Cache in production for over
>> a month and any help will be appreciated.
>> The problem is that Varnish child process is recurrently being restarted
>> after 10~20h of use, with the following message:
>> Jun 23 09:15:13 b858e4a8bd72 varnishd: Child (11824) not
>> responding to CLI, killed it.
>> Jun 23 09:15:13 b858e4a8bd72 varnishd: Unexpected reply from ping:
>> 400 CLI communication error
>> Jun 23 09:15:13 b858e4a8bd72 varnishd: Child (11824) died signal=9
>> Jun 23 09:15:14 b858e4a8bd72 varnishd: Child cleanup complete
>> Jun 23 09:15:14 b858e4a8bd72 varnishd: Child (24038) Started
>> Jun 23 09:15:14 b858e4a8bd72 varnishd: Child (24038) said Child
>> Jun 23 09:15:14 b858e4a8bd72 varnishd: Child (24038) said SMF.s0
>> mmap'ed 483183820800 bytes of 483183820800
>> The following link is the varnishstat output just 1 minute before a
>> varnish-5.1.2 revision 6ece695
>> Debian 8.7 - Debian GNU/Linux 8 (3.16.0)
>> Installed using pre-built package from official repo at packagecloud.io
>> CPU 2x2.9 GHz
>> Mem 3.69 GiB
>> Running inside a Docker container
>> Additional info:
>> - I need to cache a large number of objects and the cache should last for
>> almost a week, so I have set up a 450G storage space, I don't know if this
>> is a problem;
>> - I use ban a lot. There were about 40k bans in the system just before the
>> last crash. I really don't know if this is too much or may have anything to
>> do with it;
>> - No CPU spikes registered (usage is almost always around 30%);
>> - No panic is reported, the only info I can retrieve is from syslog;
>> - The whole time, even moments before the crashes, everything is
>> okay and requests are being answered very fast.
>> Stefano Baldo