Strange Broken Pipe error from Varnish health checks

Wed Apr 19 09:26:36 UTC 2023

Couldn't a HEAD request solve this? Then nginx wouldn't bother with the body at all, right?

This is what we do with our health checks. For example:

backend someBackend {
    .host = "[redacted]";
    .port = "80";
    .probe = {
        .interval = 9s;
        .request =
            "HEAD /healthcheck HTTP/1.1"
            "Host: [redacted]"
            "User-Agent: varnish-health-probe"
            "Connection: Close"
            "Accept: */*";
    }
}
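
Adapted to the setup in the original post, the same idea could look roughly like the sketch below. It is untested. Only the socket path, the first_byte_timeout, the probe name "health", and the /varnish_check URL come from the original message; the interval, timeout, window, threshold, and Host values are placeholder assumptions, since the actual probe definition was not posted.

```vcl
vcl 4.1;

# Hypothetical probe definition: the original "health" probe was not shown,
# so every value here except the URL is an assumption.
probe health {
    .interval = 5s;     # assumed polling interval
    .timeout = 2s;      # assumed per-probe timeout
    .window = 5;        # assumed: look at the last 5 probes
    .threshold = 3;     # assumed: 3 of those must succeed
    # HEAD instead of GET, so nginx has no response body to write.
    .request =
        "HEAD /varnish_check HTTP/1.1"
        "Host: localhost"
        "User-Agent: varnish-health-probe"
        "Connection: close";
}

# Backend as posted in the original message.
backend nginx_varnish {
    .path = "/run/nginx/nginx.sock";
    .first_byte_timeout = 600s;
    .probe = health;
}
```

If the broken pipe messages stop after switching the method, that would suggest the body write was what failed when the probe connection was closed.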
________________________________
From: varnish-misc <varnish-misc-bounces+batanun=hotmail.com at varnish-cache.org> on behalf of George <izghitu at gmail.com>
Sent: Monday, April 17, 2023 10:21 AM
To: varnish-misc at varnish-cache.org <varnish-misc at varnish-cache.org>
Subject: Strange Broken Pipe error from Varnish health checks

Hi,

I have a Varnish/nginx cluster running with varnish-7.1.2-1.el7.x86_64 on CentOS 7.

The issue I am having comes from the Varnish health checks. At random times I get a "broken pipe" message in the nginx error log, like the one below:
Apr 10 17:32:46 VARNISH-MASTER nginx_varnish_error: 2023/04/10 17:32:46 [info] 17808#17808: *67626636 writev() failed (32: Broken pipe), client: unix:, server: _, request: "GET /varnish_check HTTP/1.1", host: "0.0.0.0"

The strange thing is that this error appears only when Varnish performs the health checks. Other clients request the same check (nagios, curl, wget, AWS ELB), but those do not produce any errors. In addition, Varnish and nginx run on the same server, and it makes no difference whether the health check uses a TCP connection or a Unix socket.

Below are the Varnish VCL backend and the nginx location used for the health checks:
backend nginx_varnish {
    .path = "/run/nginx/nginx.sock";
    .first_byte_timeout = 600s;
    .probe = health;
}

location = /varnish_check {
    keepalive_timeout 305;
    return 200 'Varnish Check';
    access_log /var/log/nginx/varnish_check.log main;
    error_log /var/log/nginx/varnish_check_errors.log debug;
    error_log syslog:server=unix:/run/nginx_log.in.sock,facility=local1,tag=nginx_varnish_error,nohostname info;
}

Are there any docs I can read about how exactly Varnish performs the health checks and which internal processes are involved?
Has anyone had similar issues? This is not causing any operational problems for the cluster, but I want to find out why it happens, because it should not be happening at all.

Please help.
Thanks in advance.
