Varnish constantly running into an OOM condition.
Chris Lee
chris.lee at cern.ch
Thu Jul 10 08:25:01 UTC 2025
Hi Guillaume
Thanks for the response. Our Developers were worried that nobody would reply since we are using V6 :-)
Transient storage should be capped at 1G (-s malloc,2G -s Transient=malloc,1G).
Below are all the SMA values [1], but the process's actual memory use is currently much higher than that [2].
From https://varnish-cache.org/docs/trunk/users-guide/storage-backends.html my understanding is that Transient storage is used when an object's TTL is below the shortlived setting.
The default_ttl and shortlived parameters are at their defaults, i.e. 120s and 10s.
Checking the headers in the Varnish log, all of these responses come in with "Cache-Control: max-age=3000",
so I was actually thinking of setting Transient much lower; as you can see in [3], on a new server I started 3 days ago, Transient isn't being used at all.
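For reference, the relevant parameters and the TTLs Varnish actually assigns can be checked at runtime; a minimal sketch, assuming varnishadm and varnishlog can reach the running instance:
```
# Current thresholds (defaults: shortlived=10s, default_ttl=120s)
varnishadm param.show shortlived
varnishadm param.show default_ttl

# Watch the TTL records Varnish assigns to objects as they are cached
varnishlog -g request -i TTL
```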
As I mentioned, I could try V7, but with my sysadmin hat on I really want to find out what is going on in the stable release before jumping to the latest one.
Thanks in Advance
Chris
[1]:
[atlasfrontiergpn02 ~]# varnishstat -1 -f 'SMA*'
SMA.s0.c_req 4797047 174.74 Allocator requests
SMA.s0.c_fail 24725 0.90 Allocator failures
SMA.s0.c_bytes 73606101515 2681265.54 Bytes allocated
SMA.s0.c_freed 72001644367 2622819.63 Bytes freed
SMA.s0.g_alloc 113811 . Allocations outstanding
SMA.s0.g_bytes 1604457148 . Bytes outstanding
SMA.s0.g_space 543026500 . Bytes available
SMA.Transient.c_req 0 0.00 Allocator requests
SMA.Transient.c_fail 0 0.00 Allocator failures
SMA.Transient.c_bytes 0 0.00 Bytes allocated
SMA.Transient.c_freed 0 0.00 Bytes freed
SMA.Transient.g_alloc 0 . Allocations outstanding
SMA.Transient.g_bytes 0 . Bytes outstanding
SMA.Transient.g_space 1073741824 . Bytes available
[2]:
[atlasfrontiergpn02 ~]# top -b -n 1 -p $(pgrep -d, -f varnishd) |egrep "PID|$(pgrep -d"|" -f varnishd)"
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2685175 varnish 20 0 14.1g 12.5g 86272 S 6.7 88.8 30:40.92 cache-main
2685154 varnish 20 0 23748 6148 5376 S 0.0 0.0 0:01.20 varnishd
[3]:
[PastedGraphic-1.png]
On 10 Jul 2025, at 07:17, Guillaume Quintard <guillaume.quintard at gmail.com> wrote:
Hi Chris,
What are the g_bytes counters in varnishstat saying? If you haven't bounded your Transient storage, it could be the reason.
The other suspect is the newer jemalloc version in the repository. If the problem isn't the Transient storage, I would encourage you to try the packagecloud repository to get a newer version and see if this helps.
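Something like this, purely as a sketch (the 512M cap is an arbitrary example, not a recommendation):
```
# How much memory each stevedore is holding right now
varnishstat -1 -f 'SMA.*.g_bytes'

# Transient is unbounded unless capped explicitly on the command line, e.g.
varnishd ... -s malloc,2G -s Transient=malloc,512M
```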
--
Guillaume Quintard
On Tue, Jul 8, 2025, 04:19 Chris Lee <chris.lee at cern.ch> wrote:
Hi all,
I am trying to install a varnish service for some of our developers, and I’ll admit I know next to nothing about varnish itself.
While the service itself is up and working, Varnish was being killed every 2-3 hours by the OOM killer because the cache-main process uses up all the system memory [1].
We are using varnish-6.6.2-6.el9_6.1.x86_64 on a VM with 4 cores and 14 GiB of RAM, running AlmaLinux release 9.6.
We could try to install V7, but we would prefer to stay with the releases available from the default repositories, which are mirrored locally.
I have tried changing the malloc storage size, going down in 2G increments from 10G to the current 2G.
This has increased the number of evictions, but only extended the uptime a bit.
Adjusting the workspace_client and workspace_backend settings has increased the OOM interval to about 6-8 hours.
The service is currently run via systemd as per the command line in [2].
The hit rates shown in [3] are fairly high from what I can tell, and the default.vcl is shown in [4].
In the mailing list archives I found a link pointing to https://info.varnish-software.com/blog/understanding-varnish-cache-memory-usage and I haven’t tried to tune the malloc settings mentioned in there yet.
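If I do try them, I assume it would be something like the sketch below, i.e. a systemd drop-in exporting MALLOC_CONF; this assumes the packaged varnishd really links against jemalloc 5.x, and the option values are placeholders rather than anything taken from that post:
```
# systemctl edit varnish  (creates a drop-in under /etc/systemd/system/varnish.service.d/)
[Service]
# Illustrative jemalloc tunables only; adjust after reading the jemalloc docs
Environment=MALLOC_CONF=dirty_decay_ms:1000,muzzy_decay_ms:0,background_thread:true
# then: systemctl daemon-reload && systemctl restart varnish
```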
But I'm running out of ideas and thought I would ask the experts here first for some guidance and assistance.
Thanks in Advance
Chris
[1]:
```
[root at frontier-varnish02 ~]# top -p $(pgrep -d, -f varnishd) |egrep "PID|$(pgrep -d"|" -f varnishd)"
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
740517 varnish 20 0 12.6g 11.1g 86528 S 20.0 78.8 27:42.25 cache-main
740496 varnish 20 0 23748 6404 5632 S 0.0 0.0 0:00.52 varnishd
```
[2]:
```
/usr/sbin/varnishd -f /etc/varnish/default.vcl -a http=:6082,HTTP -a proxy=:8443,PROXY -p feature=+http2 -p max_restarts=8 -p workspace_client=512k -p workspace_backend=512k -s malloc,2G -s Transient=malloc,1G
```
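For completeness, on EL9 changes to these flags would normally go into a drop-in rather than the packaged unit file; a sketch using the same drop-in mechanism as above, assuming the stock unit name is varnish.service:
```
# systemctl edit varnish
[Service]
ExecStart=
ExecStart=/usr/sbin/varnishd -f /etc/varnish/default.vcl \
    -a http=:6082,HTTP -a proxy=:8443,PROXY \
    -p feature=+http2 -p max_restarts=8 \
    -p workspace_client=512k -p workspace_backend=512k \
    -s malloc,2G -s Transient=malloc,1G
# then: systemctl daemon-reload && systemctl restart varnish
```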
[3]:
```
Uptime mgt: 0+03:23:57 Hitrate n: 10 100 171
Uptime child: 0+03:23:58 avg(n): 0.9977 0.9940 0.9931
Press <h> to toggle help screen
NAME CURRENT CHANGE AVERAGE AVG_10 AVG_100 AVG_1000
MGT.uptime 0+03:23:57
MAIN.uptime 0+03:23:58
MAIN.sess_conn 2308418 123.91 188.63 139.32 133.20 132.53
MAIN.client_req 21919893 2019.56 1791.13 1356.64 1257.93 1228.81
MAIN.cache_hit 21866126 2016.56 1786.74 1354.38 1256.70 1227.64
MAIN.cache_miss 47442 3.00 3.88 2.26 1.22 1.17
```
[4]:
```
vcl 4.1;
import std;
import directors;
backend frontier_1 {
    .host = "atlasfrontier1-ai.cern.ch";
    .port = "8000";
}
backend frontier_2 {
    .host = "atlasfrontier2-ai.cern.ch";
    .port = "8000";
}
backend frontier_3 {
    .host = "atlasfrontier3-ai.cern.ch";
    .port = "8000";
}
backend frontier_4 {
    .host = "atlasfrontier4-ai.cern.ch";
    .port = "8000";
}

sub vcl_init {
    new vdir = directors.round_robin();
    vdir.add_backend(frontier_1);
    vdir.add_backend(frontier_2);
    vdir.add_backend(frontier_3);
    vdir.add_backend(frontier_4);
}

sub vcl_recv {
    set req.backend_hint = vdir.backend();
    set req.http.X-frontier-id = "varnish";

    if (req.method != "GET" && req.method != "HEAD") {
        return (pipe);
    }
}
```