varnish lock-up

Thu Jun 6 03:16:39 CEST 2013

Hi Varnish Folk,

We had an OOM condition today on a dev box that I thought was worth passing along, as it could've happened in production or may happen to others.

This is on Varnish 3.0.3 / Ubuntu 10.04.

It looks like the parent varnish process is not handling unexpected child exits correctly while inside a signal, leading to a situation where the parent process locks up because of a race condition where the signal handlers stomp on the global values of child_state and child_pid.

Here's a snippet of log -- the OOM killer kills the child processes each time that varnish attempts to run them (28004, 15403, 15640/-1).

[7270495.691960] Out of memory: kill process 27086 (varnishd) score 9908734 or a child
[7270495.691971] Killed process 28004 (varnishd)

1370466543.3  Child (28004) died signal=9
1370466543.3  Child cleanup complete
1370466543.4  child (15403) Started
1370466545.4  Child (15403) said Not running as root, no priv-sep
1370466545.4  Child (15403) said Child starts
1370466545.4  Child (15403) said SMF.s0 mmap'ed 268435456 bytes of 268435456

[7270511.916123] Out of memory: kill process 27086 (varnishd) score 9724109 or a child
[7270511.916134] Killed process 15403 (varnishd)

1370466558.8  Child (15403) not responding to CLI, killing it.
1370466559.6  Child (15403) died signal=9
1370466559.6  Child cleanup complete
1370466559.6  child (15640) Started

[7270520.798539] Out of memory: kill process 27086 (varnishd) score 6705406 or a child
[7270520.798551] Killed process 15640 (varnishd)

1370466568.1  Pushing vcls failed:
1370466568.1  CLI communication error (hdr)
1370466568.1  Stopping Child
1370466568.1  Child (15640) died signal=9

# (The -1 value is "interesting" given the check for >0; child_pid is a global and it's being set to -1 elsewhere -- so presumably start_child is being called from somewhere else; i.e. start_child is running, mgt_sigchild gets triggered, then flow returns to start_child right after and assigns -1 to child_pid)

1370466568.1  Child (-1) said Not running as root, no priv-sep
1370466568.1  Child (-1) said Child starts
1370466568.1  Child (-1) said SMF.s0 mmap'ed 268435456 bytes of 268435456
1370466568.1  Child cleanup complete

Reading through mgt_child.c, here's what I think is going on:

1) The signal handler is getting invoked when the child process is killed by the OOM killer
2) Then the parent process, inside the sig handler (mgt_sigchld), calls fork (inside mgt_sigchld's call to start_child), which causes a new child process to start
3) The OOM killer kills that second child process right away, before the sig handler has a chance to return

Does it maybe make sense to set the signal to be blocked for sigchild so that the events can be processed synchronously without losing them?

-Jeff