[master] 537b74f55 cache: It's time for the big quit

Dridi Boukelmoune dridi.boukelmoune at gmail.com
Tue Apr 18 09:11:08 UTC 2023


commit 537b74f559cb7cc5672305383d739e7310511277
Author: Dridi Boukelmoune <dridi.boukelmoune at gmail.com>
Date:   Wed Apr 12 20:55:34 2023 +0200

    cache: It's time for the big quit
    
    When mgt sends a command to the cache process, whether it is a period
    ping or an actual operation, it must complete within cli_timeout. When
    the cache fails to meet this requirement, mgt sends a SIGQUIT signal
    to the cache process. As a result the cache process MAY dump a core
    file for post-mortem analysis.
    
    When the core file is missing we are left to our own devices.
    
    To mitigate this, a new signal handler is added for SIGQUIT, but since
    we can't (or don't even try to) guarantee delivery on the CLI thread,
    we make a last-ditch effort to forward SIGQUIT signals to properly
    panic from the CLI thread. With a regular panic we may get both a panic
    report and a core dump.
    
    I didn't add test coverage for this, since we try to avoid intentional
    core dumps in test cases with the `no_coredump` feature flag that turns
    SIGQUIT into a SIGKILL signal.

diff --git a/bin/varnishd/cache/cache.h b/bin/varnishd/cache/cache.h
index 21b44fbf2..6e73158f8 100644
--- a/bin/varnishd/cache/cache.h
+++ b/bin/varnishd/cache/cache.h
@@ -605,7 +605,8 @@ void BAN_Abandon(struct ban_proto *b);
 
 /* cache_cli.c [CLI] */
 extern pthread_t cli_thread;
-#define ASSERT_CLI() do {assert(pthread_equal(pthread_self(), cli_thread));} while (0)
+#define IS_CLI() (pthread_equal(pthread_self(), cli_thread))
+#define ASSERT_CLI() do {assert(IS_CLI());} while (0)
 
 /* cache_http.c */
 unsigned HTTP_estimate(unsigned nhttp);
diff --git a/bin/varnishd/cache/cache_main.c b/bin/varnishd/cache/cache_main.c
index 2d8bd510d..2cf82fd95 100644
--- a/bin/varnishd/cache/cache_main.c
+++ b/bin/varnishd/cache/cache_main.c
@@ -330,7 +330,7 @@ child_signal_handler(int s, siginfo_t *si, void *c)
 }
 
 /*=====================================================================
- * Magic for panicing properly on signals
+ * Magic for panicking properly on signals
  */
 
 static void
@@ -363,6 +363,17 @@ child_sigmagic(size_t altstksz)
 	(void)sigaction(SIGSEGV, &sa, NULL);
 }
 
+static void
+cli_quit(int sig)
+{
+
+	if (!IS_CLI()) {
+		AZ(pthread_kill(cli_thread, sig));
+		return;
+	}
+
+	WRONG("It's time for the big quit");
+}
 
 /*=====================================================================
  * Run the child process
@@ -376,6 +387,7 @@ child_main(int sigmagic, size_t altstksz)
 		child_sigmagic(altstksz);
 	(void)signal(SIGINT, SIG_DFL);
 	(void)signal(SIGTERM, SIG_DFL);
+	(void)signal(SIGQUIT, cli_quit);
 
 #if defined(__FreeBSD__) && __FreeBSD_version >= 1000000
 	malloc_message = child_malloc_fail;


More information about the varnish-commit mailing list