[Varnish] #1235: Frequent Varnish crashes - Size of Varnish cache never grows
Varnish
varnish-bugs at varnish-cache.org
Mon Dec 10 12:17:06 CET 2012
#1235: Frequent Varnish crashes - Size of Varnish cache never grows
------------------------+--------------------
Reporter: msallen333 | Owner:
Type: defect | Status: new
Priority: normal | Milestone:
Component: varnishd | Version: 3.0.2
Severity: normal | Resolution:
Keywords: crash |
------------------------+--------------------
Description changed by kristian:
Old description:
> *** PROBLEM DESCRIPTION ***
>
> Varnish 3.0.2-1daemon crashes and restarts at least once per week with
> below output in /var/log/messages. In addition, the
> usage of /varnish filesystem never climbs above 404G.
>
> I have already disabled "Transparent Hugepages".
>
> Has anyone else experienced this same problem, and possibly have a
> solution?
>
>
> =============================================
>
>
> # /var/log/messages
> Dec 2 12:45:37 lx11 varnishd[7568]: Child (7569) not responding to CLI,
> killing it.
> Dec 2 12:45:47 lx11 varnishd[7568]: Child (7569) not responding to CLI,
> killing it.
> Dec 2 12:45:57 lx11 varnishd[7568]: Child (7569) not responding to CLI,
> killing it.
> Dec 2 12:46:06 lx11 varnishd[7568]: Child (7569) not responding to CLI,
> killing it.
> Dec 2 12:46:06 lx11 varnishd[7568]: Child (7569) not responding to CLI,
> killing it.
> Dec 2 12:46:06 lx11 varnishd[7568]: Child (7569) died signal=3 (core
> dumped)
> Dec 2 12:46:06 lx11 varnishd[7568]: child (9172) Started
> Dec 2 12:46:06 lx11 varnishd[7568]: Child (9172) said Child starts
> Dec 2 12:46:06 lx11 varnishd[7568]: Child (9172) said SMF.s0 mmap'ed
> 1589334294528 bytes of 1589334294528
>
> # df -h | grep varnish
> /dev/mapper/emcvg1-varnish 2.0T 404G 1.5T 22% /varnish
>
> # ps -ef | grep arnish
> root 4130 1 0 Nov30 ? 00:00:10 /usr/bin/varnishlog -a -w
> /var/log/varnish/varnish.log -D -P /var/run/varnishlog.pid
> root 4137 1 0 Nov30 ? 00:07:08 /usr/bin/varnishncsa -a
> -w /var/log/varnish/varnishncsa.log -D -P /var/run/varnishncsa.pid
> root 7568 1 0 Nov30 ? 00:00:02 /usr/sbin/varnishd -P
> /var/run/varnish.pid -a :80 -f /etc/varnish/default.vcl -T 127.0.0.1:6082
> -t 120 -w 1,1000,120 -u varnish -g varnish -S /etc/varnish/secret -s
> file,/varnish/varnish_storage.bin,98%
> varnish 9172 7568 13 Dec02 ? 02:48:45 /usr/sbin/varnishd -P
> /var/run/varnish.pid -a :80 -f /etc/varnish/default.vcl -T 127.0.0.1:6082
> -t 120 -w 1,1000,120 -u varnish -g varnish -S /etc/varnish/secret -s
> file,/varnish/varnish_storage.bin,98%
>
> # cat /sys/kernel/mm/redhat_transparent_hugepage/enabled
> always [never]
>
>
> Which varnish version ?
>
> # rpm -qa | grep varnish
> varnish-libs-3.0.2-1.el5.x86_64
> varnish-3.0.2-1.el5.x86_64
> varnish-release-3.0-1.noarch
>
>
> Which type of CPU ?
>
> # more /proc/cpuinfo
> processor : 0
> vendor_id : GenuineIntel
> cpu family : 6
> model : 29
> model name : Intel(R) Xeon(R) CPU E7440 @ 2.40GHz
> stepping : 1
> cpu MHz : 2400.080
> cache size : 16384 KB
> physical id : 0
> siblings : 4
> core id : 0
> cpu cores : 4
> apicid : 0
> initial apicid : 0
> fpu : yes
> fpu_exception : yes
> cpuid level : 11
> wp : yes
> flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca
> cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall lm
> constant_tsc arch_perfmon pebs bts rep_good xtopology aperfmperf pni
> dtes64 monitor ds_cpl vm
> x est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 lahf_lm dts tpr_shadow vnmi
> flexpriority
> bogomips : 4800.16
> clflush size : 64
> cache_alignment : 64
> address sizes : 40 bits physical, 48 bits virtual
> power management:
>
>
> 32 or 64 bit mode ?
>
> 64bit
>
>
> how much RAM ?
>
> 128GB
>
> # more meminfo
> MemTotal: 132153720 kB
> MemFree: 662828 kB
> Buffers: 334308 kB
> Cached: 124553164 kB
> SwapCached: 28 kB
> Active: 15416940 kB
> Inactive: 114045600 kB
> Active(anon): 4406140 kB
> Inactive(anon): 174144 kB
> Active(file): 11010800 kB
> Inactive(file): 113871456 kB
> Unevictable: 5244 kB
> Mlocked: 5244 kB
> SwapTotal: 16777208 kB
> SwapFree: 16777080 kB
> Dirty: 33320 kB
> Writeback: 0 kB
> AnonPages: 4580508 kB
> Mapped: 68785120 kB
> Shmem: 1784 kB
> Slab: 548132 kB
> SReclaimable: 429880 kB
> SUnreclaim: 118252 kB
> KernelStack: 5504 kB
> PageTables: 158976 kB
> NFS_Unstable: 0 kB
> Bounce: 0 kB
> WritebackTmp: 0 kB
> CommitLimit: 82854068 kB
> Committed_AS: 45740984 kB
> VmallocTotal: 34359738367 kB
> VmallocUsed: 493684 kB
> VmallocChunk: 34359215588 kB
> HardwareCorrupted: 0 kB
> AnonHugePages: 1718272 kB
> HugePages_Total: 0
> HugePages_Free: 0
> HugePages_Rsvd: 0
> HugePages_Surp: 0
> Hugepagesize: 2048 kB
> DirectMap4k: 9456 kB
> DirectMap2M: 134205440 kB
>
>
> Which OS/kernel version ?
>
> # cat /etc/redhat-release
> Red Hat Enterprise Linux Server release 6.2 (Santiago)
> # cat /proc/version
> Linux version 2.6.32-220.4.2.el6.x86_64
> (mockbuild at x86-003.build.bos.redhat.com) (gcc version 4.4.6 20110731 (Red
> Hat 4.4.6-3) (GCC) ) #1 SMP Mon Feb 6 16:39:28 EST 2012
>
>
> default VCL or do you have your own ?
>
> # cat /etc/varnish/default.vcl
> # This is a basic VCL configuration file for varnish. See the vcl(7)
> # man page for details on VCL syntax and semantics.
> #
> # Default backend definition. Set this to point to your content
> # server.
> #
> backend default {
> .host = "x.y.z";
> .port = "8080";
> .connect_timeout = 15s;
> .first_byte_timeout = 120s;
> .between_bytes_timeout = 120s;
> }
> #
> # Below is a commented-out copy of the default VCL logic. If you
> # redefine any of these subroutines, the built-in logic will be
> # appended to your code.
> # sub vcl_recv {
> # if (req.restarts == 0) {
> # if (req.http.x-forwarded-for) {
> # set req.http.X-Forwarded-For =
> # req.http.X-Forwarded-For + ", " + client.ip;
> # } else {
> # set req.http.X-Forwarded-For = client.ip;
> # }
> # }
> # if (req.request != "GET" &&
> # req.request != "HEAD" &&
> # req.request != "PUT" &&
> # req.request != "POST" &&
> # req.request != "TRACE" &&
> # req.request != "OPTIONS" &&
> # req.request != "DELETE") {
> # /* Non-RFC2616 or CONNECT which is weird. */
> # return (pipe);
> # }
> # if (req.request != "GET" && req.request != "HEAD") {
> # /* We only deal with GET and HEAD by default */
> # return (pass);
> # }
> # if (req.http.Authorization || req.http.Cookie) {
> # /* Not cacheable by default */
> # return (pass);
> # }
> # return (lookup);
> # }
> #
>
> sub vcl_fetch {
> set beresp.grace = 1h;
>
> if (beresp.http.content-type ~ "(text|application)") {
> set beresp.do_gzip = true;
> }
>
> }
>
> sub vcl_recv {
> # unset cookies since we don't want to bypass caching normally
> if (req.http.cookie) {
> unset req.http.cookie;
> }
>
> set req.grace = 1h;
> }
>
> sub vcl_deliver {
> if (!resp.http.Vary) {
> set resp.http.Vary = "Accept-Encoding";
> } else if (resp.http.Vary !~ "(?i)Accept-Encoding") {
> set resp.http.Vary = resp.http.Vary + ",Accept-Encoding";
> }
> }
>
> # sub vcl_pipe {
> # # Note that only the first request to the backend will have
> # # X-Forwarded-For set. If you use X-Forwarded-For and want to
> # # have it set for all requests, make sure to have:
> # # set bereq.http.connection = "close";
> # # here. It is not set by default as it might break some broken web
> # # applications, like IIS with NTLM authentication.
> # return (pipe);
> # }
> #
> # sub vcl_pass {
> # return (pass);
> # }
> #
> # sub vcl_hash {
> # hash_data(req.url);
> # if (req.http.host) {
> # hash_data(req.http.host);
> # } else {
> # hash_data(server.ip);
> # }
> # return (hash);
> # }
> #
> # sub vcl_hit {
> # return (deliver);
> # }
> #
> # sub vcl_miss {
> # return (fetch);
> # }
> #
> # sub vcl_fetch {
> # if (beresp.ttl <= 0s ||
> # beresp.http.Set-Cookie ||
> # beresp.http.Vary == "*") {
> # /*
> # * Mark as "Hit-For-Pass" for the next 2 minutes
> # */
> # set beresp.ttl = 120 s;
> # return (hit_for_pass);
> # }
> # return (deliver);
> # }
> #
> # sub vcl_deliver {
> # return (deliver);
> # }
> #
> sub vcl_error {
> set obj.http.Content-Type = "text/html; charset=utf-8";
> set obj.http.Retry-After = "5";
> synthetic {"
> <?xml version="1.0" encoding="utf-8"?>
> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
> "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
> <html>
> <head>
> <title>The page is temporarily unavailable</title>
> </head>
> <body>
> <h1>Chronicling America is currently unavailable</h1>
> <p>The Chronicling America website is currently offline, undergoing
> maintenance. We regret the inconvenience, and invite you to visit other
> collections available on the Library of Congress website at <a
> href="http://www.loc.gov">www.loc.gov</a> while we are working to restore
> service.</p>
> </body>
> </html>
> "};
> return (deliver);
> }
> #
> # sub vcl_init {
> # return (ok);
> # }
> #
> # sub vcl_fini {
> # return (ok);
> # }
New description:
*** PROBLEM DESCRIPTION ***
Varnish 3.0.2-1daemon crashes and restarts at least once per week with
below output in /var/log/messages. In addition, the
usage of /varnish filesystem never climbs above 404G.
I have already disabled "Transparent Hugepages".
Has anyone else experienced this same problem, and possibly have a
solution?
=============================================
{{{
# /var/log/messages
Dec 2 12:45:37 lx11 varnishd[7568]: Child (7569) not responding to CLI,
killing it.
Dec 2 12:45:47 lx11 varnishd[7568]: Child (7569) not responding to CLI,
killing it.
Dec 2 12:45:57 lx11 varnishd[7568]: Child (7569) not responding to CLI,
killing it.
Dec 2 12:46:06 lx11 varnishd[7568]: Child (7569) not responding to CLI,
killing it.
Dec 2 12:46:06 lx11 varnishd[7568]: Child (7569) not responding to CLI,
killing it.
Dec 2 12:46:06 lx11 varnishd[7568]: Child (7569) died signal=3 (core
dumped)
Dec 2 12:46:06 lx11 varnishd[7568]: child (9172) Started
Dec 2 12:46:06 lx11 varnishd[7568]: Child (9172) said Child starts
Dec 2 12:46:06 lx11 varnishd[7568]: Child (9172) said SMF.s0 mmap'ed
1589334294528 bytes of 1589334294528
# df -h | grep varnish
/dev/mapper/emcvg1-varnish 2.0T 404G 1.5T 22% /varnish
# ps -ef | grep arnish
root 4130 1 0 Nov30 ? 00:00:10 /usr/bin/varnishlog -a -w
/var/log/varnish/varnish.log -D -P /var/run/varnishlog.pid
root 4137 1 0 Nov30 ? 00:07:08 /usr/bin/varnishncsa -a -w
/var/log/varnish/varnishncsa.log -D -P /var/run/varnishncsa.pid
root 7568 1 0 Nov30 ? 00:00:02 /usr/sbin/varnishd -P
/var/run/varnish.pid -a :80 -f /etc/varnish/default.vcl -T 127.0.0.1:6082
-t 120 -w 1,1000,120 -u varnish -g varnish -S /etc/varnish/secret -s
file,/varnish/varnish_storage.bin,98%
varnish 9172 7568 13 Dec02 ? 02:48:45 /usr/sbin/varnishd -P
/var/run/varnish.pid -a :80 -f /etc/varnish/default.vcl -T 127.0.0.1:6082
-t 120 -w 1,1000,120 -u varnish -g varnish -S /etc/varnish/secret -s
file,/varnish/varnish_storage.bin,98%
# cat /sys/kernel/mm/redhat_transparent_hugepage/enabled
always [never]
}}}
Which varnish version ?
{{{
# rpm -qa | grep varnish
varnish-libs-3.0.2-1.el5.x86_64
varnish-3.0.2-1.el5.x86_64
varnish-release-3.0-1.noarch
}}}
Which type of CPU ?
{{{
# more /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 29
model name : Intel(R) Xeon(R) CPU E7440 @ 2.40GHz
stepping : 1
cpu MHz : 2400.080
cache size : 16384 KB
physical id : 0
siblings : 4
core id : 0
cpu cores : 4
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic mtrr pge mca
cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall lm
constant_tsc arch_perfmon pebs bts rep_good xtopology aperfmperf pni
dtes64 monitor ds_cpl vm
x est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 lahf_lm dts tpr_shadow vnmi
flexpriority
bogomips : 4800.16
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management:
32 or 64 bit mode ?
64bit
}}}
how much RAM ?
128GB
{{{
# more meminfo
MemTotal: 132153720 kB
MemFree: 662828 kB
Buffers: 334308 kB
Cached: 124553164 kB
SwapCached: 28 kB
Active: 15416940 kB
Inactive: 114045600 kB
Active(anon): 4406140 kB
Inactive(anon): 174144 kB
Active(file): 11010800 kB
Inactive(file): 113871456 kB
Unevictable: 5244 kB
Mlocked: 5244 kB
SwapTotal: 16777208 kB
SwapFree: 16777080 kB
Dirty: 33320 kB
Writeback: 0 kB
AnonPages: 4580508 kB
Mapped: 68785120 kB
Shmem: 1784 kB
Slab: 548132 kB
SReclaimable: 429880 kB
SUnreclaim: 118252 kB
KernelStack: 5504 kB
PageTables: 158976 kB
NFS_Unstable: 0 kB
Bounce: 0 kB
WritebackTmp: 0 kB
CommitLimit: 82854068 kB
Committed_AS: 45740984 kB
VmallocTotal: 34359738367 kB
VmallocUsed: 493684 kB
VmallocChunk: 34359215588 kB
HardwareCorrupted: 0 kB
AnonHugePages: 1718272 kB
HugePages_Total: 0
HugePages_Free: 0
HugePages_Rsvd: 0
HugePages_Surp: 0
Hugepagesize: 2048 kB
DirectMap4k: 9456 kB
DirectMap2M: 134205440 kB
}}}
Which OS/kernel version ?
{{{
# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.2 (Santiago)
# cat /proc/version
Linux version 2.6.32-220.4.2.el6.x86_64
(mockbuild at x86-003.build.bos.redhat.com) (gcc version 4.4.6 20110731 (Red
Hat 4.4.6-3) (GCC) ) #1 SMP Mon Feb 6 16:39:28 EST 2012
}}}
default VCL or do you have your own ?
{{{
# cat /etc/varnish/default.vcl
# This is a basic VCL configuration file for varnish. See the vcl(7)
# man page for details on VCL syntax and semantics.
#
# Default backend definition. Set this to point to your content
# server.
#
backend default {
.host = "x.y.z";
.port = "8080";
.connect_timeout = 15s;
.first_byte_timeout = 120s;
.between_bytes_timeout = 120s;
}
#
# Below is a commented-out copy of the default VCL logic. If you
# redefine any of these subroutines, the built-in logic will be
# appended to your code.
# sub vcl_recv {
# if (req.restarts == 0) {
# if (req.http.x-forwarded-for) {
# set req.http.X-Forwarded-For =
# req.http.X-Forwarded-For + ", " + client.ip;
# } else {
# set req.http.X-Forwarded-For = client.ip;
# }
# }
# if (req.request != "GET" &&
# req.request != "HEAD" &&
# req.request != "PUT" &&
# req.request != "POST" &&
# req.request != "TRACE" &&
# req.request != "OPTIONS" &&
# req.request != "DELETE") {
# /* Non-RFC2616 or CONNECT which is weird. */
# return (pipe);
# }
# if (req.request != "GET" && req.request != "HEAD") {
# /* We only deal with GET and HEAD by default */
# return (pass);
# }
# if (req.http.Authorization || req.http.Cookie) {
# /* Not cacheable by default */
# return (pass);
# }
# return (lookup);
# }
#
sub vcl_fetch {
set beresp.grace = 1h;
if (beresp.http.content-type ~ "(text|application)") {
set beresp.do_gzip = true;
}
}
sub vcl_recv {
# unset cookies since we don't want to bypass caching normally
if (req.http.cookie) {
unset req.http.cookie;
}
set req.grace = 1h;
}
sub vcl_deliver {
if (!resp.http.Vary) {
set resp.http.Vary = "Accept-Encoding";
} else if (resp.http.Vary !~ "(?i)Accept-Encoding") {
set resp.http.Vary = resp.http.Vary + ",Accept-Encoding";
}
}
# sub vcl_pipe {
# # Note that only the first request to the backend will have
# # X-Forwarded-For set. If you use X-Forwarded-For and want to
# # have it set for all requests, make sure to have:
# # set bereq.http.connection = "close";
# # here. It is not set by default as it might break some broken web
# # applications, like IIS with NTLM authentication.
# return (pipe);
# }
#
# sub vcl_pass {
# return (pass);
# }
#
# sub vcl_hash {
# hash_data(req.url);
# if (req.http.host) {
# hash_data(req.http.host);
# } else {
# hash_data(server.ip);
# }
# return (hash);
# }
#
# sub vcl_hit {
# return (deliver);
# }
#
# sub vcl_miss {
# return (fetch);
# }
#
# sub vcl_fetch {
# if (beresp.ttl <= 0s ||
# beresp.http.Set-Cookie ||
# beresp.http.Vary == "*") {
# /*
# * Mark as "Hit-For-Pass" for the next 2 minutes
# */
# set beresp.ttl = 120 s;
# return (hit_for_pass);
# }
# return (deliver);
# }
#
# sub vcl_deliver {
# return (deliver);
# }
#
sub vcl_error {
set obj.http.Content-Type = "text/html; charset=utf-8";
set obj.http.Retry-After = "5";
synthetic {"
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<head>
<title>The page is temporarily unavailable</title>
</head>
<body>
<h1>Chronicling America is currently unavailable</h1>
<p>The Chronicling America website is currently offline, undergoing
maintenance. We regret the inconvenience, and invite you to visit other
collections available on the Library of Congress website at <a
href="http://www.loc.gov">www.loc.gov</a> while we are working to restore
service.</p>
</body>
</html>
"};
return (deliver);
}
#
# sub vcl_init {
# return (ok);
# }
#
# sub vcl_fini {
# return (ok);
# }
}}}
--
--
Ticket URL: <https://www.varnish-cache.org/trac/ticket/1235#comment:1>
Varnish <https://varnish-cache.org/>
The Varnish HTTP Accelerator
More information about the varnish-bugs
mailing list