Varnish constantly running into an OOM condition.

Chris Lee chris.lee at cern.ch
Tue Jul 8 11:18:00 UTC 2025


Hi all,

I am trying to install a Varnish service for some of our developers, and I’ll admit I know next to nothing about Varnish itself.

While things are up and running, Varnish is being killed every 2-3 hours by the OOM killer because the cache-main process uses up all of the system memory [1].
We are using varnish-6.6.2-6.el9_6.1.x86_64 on a VM with 4 cores and 14Gi of RAM, running AlmaLinux release 9.6.
We could try to install V7, but would prefer to stay with the releases available from the default repositories which are mirrored locally.

I have tried changing the malloc memory setting, going down in 2G increments from 10G to 2G, where it is now.
This has increased the number of evictions and extended the uptime a bit.
Adjusting the workspace_client and workspace_backend settings has increased the OOM interval to about 6-8 hours.
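If I understand the workspace parameters correctly, they are allocated per worker thread, so with the stock thread_pools=2 and thread_pool_max=5000 the worst case for our 512k settings would look like this (just my back-of-envelope guess, please correct me if the model is wrong):

```shell
# Worst-case workspace footprint, ASSUMING workspace_client and
# workspace_backend are each held per worker thread, and the defaults
# thread_pools=2 and thread_pool_max=5000 still apply on this box.
awk 'BEGIN { printf "%.1f GiB\n", 2 * 5000 * (512 + 512) / 1024 / 1024 }'
```

At ~9.8 GiB on top of the 2G + 1G malloc stores, that alone would explain the OOM if the assumption holds.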

The service is currently run via systemd as per the command line in [2].

The hit rates shown in [3] are fairly high from what I can tell, and the default.vcl is shown in [4].

In the mailing list archives I found a link pointing to https://info.varnish-software.com/blog/understanding-varnish-cache-memory-usage and I haven’t tried to tune the malloc settings mentioned there yet.

But I’m running out of ideas and thought I would ask the experts here first for some guidance and assistance.

Thanks in Advance
Chris

[1]:
```
[root at frontier-varnish02 ~]# top -p $(pgrep -d, -f varnishd) |egrep "PID|$(pgrep -d"|" -f varnishd)"
    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
740517 varnish   20   0   12.6g  11.1g  86528 S  20.0  78.8  27:42.25 cache-main 
740496 varnish   20   0   23748   6404   5632 S   0.0   0.0   0:00.52 varnishd  
```
[2]:
```
/usr/sbin/varnishd -f /etc/varnish/default.vcl -a http=:6082,HTTP -a proxy=:8443,PROXY -p feature=+http2 -p max_restarts=8 -p workspace_client=512k -p workspace_backend=512k -s malloc,2G -s Transient=malloc,1G
```
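As a stopgap I have also been thinking about capping the service with a systemd drop-in so it gets contained by its own cgroup rather than taking the whole VM down. Something like the following (untested on my side; MemoryHigh/MemoryMax are the cgroup v2 knobs, and the numbers are just placeholders for this 14Gi box):

```ini
# /etc/systemd/system/varnish.service.d/memory.conf (hypothetical drop-in)
[Service]
# Start reclaiming before the hard limit is reached.
MemoryHigh=10G
# Hard cap: the unit is throttled/killed within its own cgroup
# instead of triggering the system-wide OOM killer.
MemoryMax=12G
```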
[3]:
```
Uptime mgt:   0+03:23:57                        Hitrate n:       10      100      171
Uptime child: 0+03:23:58                           avg(n):   0.9977   0.9940   0.9931
Press <h> to toggle help screen
    NAME                    CURRENT       CHANGE      AVERAGE       AVG_10      AVG_100     AVG_1000
MGT.uptime               0+03:23:57
MAIN.uptime              0+03:23:58
MAIN.sess_conn              2308418       123.91       188.63       139.32       133.20       132.53
MAIN.client_req            21919893      2019.56      1791.13      1356.64      1257.93      1228.81
MAIN.cache_hit             21866126      2016.56      1786.74      1354.38      1256.70      1227.64
MAIN.cache_miss               47442         3.00         3.88         2.26         1.22         1.17
```
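For what it’s worth, the hit rate in [3] lines up with cache_hit / client_req from the same counters:

```shell
# Hit rate derived from the MAIN.cache_hit and MAIN.client_req
# counters in the varnishstat output above.
awk 'BEGIN { printf "%.4f\n", 21866126 / 21919893 }'
```

So the cache itself seems to be behaving, which makes me suspect the growth is overhead rather than the cache store itself.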

[4]:
```
vcl 4.1;
import std;
import directors;

backend frontier_1 {
  .host = "atlasfrontier1-ai.cern.ch";
  .port = "8000";
}
backend frontier_2 {
  .host = "atlasfrontier2-ai.cern.ch";
  .port = "8000";
}
backend frontier_3 {
  .host = "atlasfrontier3-ai.cern.ch";
  .port = "8000";
}
backend frontier_4 {
  .host = "atlasfrontier4-ai.cern.ch";
  .port = "8000";
}

sub vcl_init {
  new vdir = directors.round_robin();
  vdir.add_backend(frontier_1);
  vdir.add_backend(frontier_2);
  vdir.add_backend(frontier_3);
  vdir.add_backend(frontier_4);
}

sub vcl_recv {
  set req.backend_hint = vdir.backend();
  set req.http.X-frontier-id = "varnish";
  if (req.method != "GET" && req.method != "HEAD") {
    return (pipe);
  }
}
```



More information about the varnish-misc mailing list