Varnish constantly running into an OOM condition.
Chris Lee
chris.lee at cern.ch
Tue Jul 8 11:18:00 UTC 2025
Hi all,
I am trying to install a varnish service for some of our developers, and I’ll admit I know next to nothing about varnish itself.
While things are up and running, Varnish was being killed every 2-3 hours by the OOM killer because the cache-main process uses up all the system memory [1].
We are using varnish-6.6.2-6.el9_6.1.x86_64 on a VM with 4 cores and 14 GiB of RAM, running AlmaLinux release 9.6.
We could try to install V7, but would prefer to stay with the releases available from the default repositories which are mirrored locally.
I have tried changing the malloc storage size, going down in 2G steps from 10G to 2G, where it is now.
This has increased the number of evictions but extended the uptime a bit.
Adjusting the workspace_client and workspace_backend settings has increased the OOM interval to about 6-8 hours.
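To put these settings in perspective, here is a rough budget, sketched in Python, that compares the storage limits configured in [2] with the resident size reported in [1]. The concurrency figures passed to `workspace_bytes` are hypothetical, and treating each in-flight request or backend fetch as holding exactly one workspace is a simplification. Per-object overhead and allocator fragmentation are not modelled, so this only shows how much of the resident size the `-s` limits do not account for.

```python
KIB = 1024
GIB = 1024 ** 3

# Values taken from the command line in [2] and the top output in [1].
malloc_storage = 2 * GIB      # -s malloc,2G
transient_storage = 1 * GIB   # -s Transient=malloc,1G
observed_rss = 11.1 * GIB     # RES of cache-main

def workspace_bytes(concurrent_requests, concurrent_fetches,
                    ws_client=512 * KIB, ws_backend=512 * KIB):
    """Workspace memory, assuming one workspace per in-flight request or fetch."""
    return concurrent_requests * ws_client + concurrent_fetches * ws_backend

configured = malloc_storage + transient_storage
unaccounted = observed_rss - configured

# Hypothetical load: 1000 in-flight client requests, 10 in-flight backend fetches.
ws = workspace_bytes(1000, 10)

print(f"configured storage limits: {configured / GIB:.1f} GiB")
print(f"resident size not covered: {unaccounted / GIB:.1f} GiB")
print(f"workspaces (hypothetical): {ws / GIB:.2f} GiB")
```

Under these assumptions the storage limits cover only about 3 GiB of the 11.1 GiB resident size, so the `-s` arguments on their own do not bound the process.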
The service is currently run via systemd as per the command line in [2].
The hit rates shown in [3] are fairly high from what I can tell, and the default.vcl is shown in [4].
In the mailing list archives I found a link to https://info.varnish-software.com/blog/understanding-varnish-cache-memory-usage, but I haven’t yet tried tuning the malloc settings mentioned there.
I’m running out of ideas, though, and thought I would ask the experts here first for some guidance and assistance.
Thanks in advance,
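For reference, the cumulative hit ratio can be recomputed from the MAIN.cache_hit and MAIN.cache_miss counters in [3]. A minimal Python sketch, with the values copied from that output (the helper name `hit_ratio` is only illustrative):

```python
# Counter values copied from the varnishstat output in [3].
cache_hit = 21_866_126
cache_miss = 47_442

def hit_ratio(hits, misses):
    """Fraction of cache lookups that were answered from cache."""
    lookups = hits + misses
    return hits / lookups if lookups else 0.0

ratio = hit_ratio(cache_hit, cache_miss)
print(f"cumulative hit ratio: {ratio:.4f}")
```

This comes out at roughly 99.8%, in line with the avg(n) values varnishstat reports.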
Chris
[1]:
```
[root@frontier-varnish02 ~]# top -p $(pgrep -d, -f varnishd) |egrep "PID|$(pgrep -d"|" -f varnishd)"
    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 740517 varnish   20   0   12.6g  11.1g  86528 S  20.0  78.8  27:42.25 cache-main
 740496 varnish   20   0   23748   6404   5632 S   0.0   0.0   0:00.52 varnishd
```
[2]:
```
/usr/sbin/varnishd -f /etc/varnish/default.vcl -a http=:6082,HTTP -a proxy=:8443,PROXY -p feature=+http2 -p max_restarts=8 -p workspace_client=512k -p workspace_backend=512k -s malloc,2G -s Transient=malloc,1G
```
[3]:
```
Uptime mgt:     0+03:23:57                    Hitrate n:       10      100      171
Uptime child:   0+03:23:58                       avg(n):   0.9977   0.9940   0.9931
Press <h> to toggle help screen
NAME                 CURRENT     CHANGE    AVERAGE     AVG_10    AVG_100   AVG_1000
MGT.uptime        0+03:23:57
MAIN.uptime       0+03:23:58
MAIN.sess_conn       2308418     123.91     188.63     139.32     133.20     132.53
MAIN.client_req     21919893    2019.56    1791.13    1356.64    1257.93    1228.81
MAIN.cache_hit      21866126    2016.56    1786.74    1354.38    1256.70    1227.64
MAIN.cache_miss        47442       3.00       3.88       2.26       1.22       1.17
```
[4]:
```
vcl 4.1;

import std;
import directors;

backend frontier_1 {
    .host = "atlasfrontier1-ai.cern.ch";
    .port = "8000";
}
backend frontier_2 {
    .host = "atlasfrontier2-ai.cern.ch";
    .port = "8000";
}
backend frontier_3 {
    .host = "atlasfrontier3-ai.cern.ch";
    .port = "8000";
}
backend frontier_4 {
    .host = "atlasfrontier4-ai.cern.ch";
    .port = "8000";
}

sub vcl_init {
    new vdir = directors.round_robin();
    vdir.add_backend(frontier_1);
    vdir.add_backend(frontier_2);
    vdir.add_backend(frontier_3);
    vdir.add_backend(frontier_4);
}

sub vcl_recv {
    set req.backend_hint = vdir.backend();
    set req.http.X-frontier-id = "varnish";
    if (req.method != "GET" && req.method != "HEAD") {
        return (pipe);
    }
}
```