[Varnish] #897: sess_mem "leak" on hyper-threaded cpu
Varnish
varnish-bugs at varnish-cache.org
Sat Apr 9 02:36:29 CEST 2011
#897: sess_mem "leak" on hyper-threaded cpu
-------------------------------------------------+--------------------------
Reporter: askalski | Type: defect
Status: new | Priority: normal
Milestone: | Component: build
Version: trunk | Severity: major
Keywords: sess_mem leak n_sess race condition |
-------------------------------------------------+--------------------------
There is a race condition on the n_sess statistic, which causes the
counter to drift upward, far beyond the real number of sessions:
{{{
100000 . . N struct sess_mem
867438 . . N struct sess
}}}
Because SES_Delete() uses the n_sess counter to decide whether to
preallocate additional workspaces (sess_mem), this eventually leads
Varnish to allocate session_max of them (100000 by default), which
consumes an excessive amount of memory.
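
The drift can be illustrated without threads by replaying one unlucky
interleaving by hand. This is only a sketch, not Varnish code: struct
stats and replay_lost_decrement() are hypothetical names, and it assumes
that an unlocked n_sess++ or n_sess-- behaves as a separate load, modify
and store.

```c
/* Hypothetical stand-in for the shared statistics block. */
struct stats {
	unsigned n_sess;
};

/*
 * Replay one interleaving of "n_sess++" (thread A) and "n_sess--"
 * (thread B), each split into load / modify / store.  Without a race
 * the counter would end at its starting value; here B's store is
 * overwritten, so the result is one too high.
 */
static unsigned
replay_lost_decrement(unsigned start)
{
	struct stats st = { start };
	unsigned a, b;

	a = st.n_sess;		/* A loads the counter */
	b = st.n_sess;		/* B loads the same value */
	b = b - 1;		/* B computes the decrement */
	st.n_sess = b;		/* B stores start - 1 */
	a = a + 1;		/* A computes the increment */
	st.n_sess = a;		/* A stores start + 1, losing B's update */
	return (st.n_sess);
}
```

Each such lost decrement leaves the counter one higher than the number
of live sessions, and nothing ever corrects it afterwards.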
{{{
97a1d998 (Poul-Henning Kamp 2010-06-17 08:47:19 +0000 220) VSC_main->n_sess++; /* XXX: locking ? */
...
97a1d998 (Poul-Henning Kamp 2010-06-17 08:47:19 +0000 261) VSC_main->n_sess--; /* XXX: locking ? */
...
28e7319e (Poul-Henning Kamp 2010-01-26 21:58:30 +0000 285) /* Try to precreate some ses-mem so the acceptor will not have to */
97a1d998 (Poul-Henning Kamp 2010-06-17 08:47:19 +0000 286) if (VSC_main->n_sess_mem < VSC_main->n_sess + 10) {
28e7319e (Poul-Henning Kamp 2010-01-26 21:58:30 +0000 287)         sm = ses_sm_alloc();
28e7319e (Poul-Henning Kamp 2010-01-26 21:58:30 +0000 288)         if (sm != NULL) {
28e7319e (Poul-Henning Kamp 2010-01-26 21:58:30 +0000 289)                 ses_setup(sm);
28e7319e (Poul-Henning Kamp 2010-01-26 21:58:30 +0000 290)                 Lck_Lock(&ses_mem_mtx);
28e7319e (Poul-Henning Kamp 2010-01-26 21:58:30 +0000 291)                 VTAILQ_INSERT_HEAD(&ses_free_mem[1 - ses_qp], sm, li
28e7319e (Poul-Henning Kamp 2010-01-26 21:58:30 +0000 292)                 Lck_Unlock(&ses_mem_mtx);
28e7319e (Poul-Henning Kamp 2010-01-26 21:58:30 +0000 293)         }
28e7319e (Poul-Henning Kamp 2010-01-26 21:58:30 +0000 294) }
}}}
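
To see why an inflated n_sess ends at session_max, the comparison from
line 286 can be modelled on plain integers. This is a rough sketch, not
the Varnish source: precreate_until_stable() is a hypothetical helper,
the loop stands in for repeated SES_Delete() calls, and it assumes that
allocation simply stops once session_max workspaces exist.

```c
/*
 * Model repeated precreation: keep adding one sess_mem while the
 * check "n_sess_mem < n_sess + 10" holds and the session_max cap
 * (assumed here) has not been reached.  Returns the final n_sess_mem.
 */
static unsigned
precreate_until_stable(unsigned n_sess_mem, unsigned n_sess,
    unsigned session_max)
{
	while (n_sess_mem < session_max && n_sess_mem < n_sess + 10)
		n_sess_mem++;
	return (n_sess_mem);
}
```

With an accurate counter, for example n_sess = 15, the pool settles at
25 workspaces. With the drifted value from this report, n_sess = 867438,
the check never becomes false and the pool grows to the full 100000.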
The bug only seems to manifest itself on machines with hyper-threaded
CPUs. I was able to reproduce the issue on my laptop (Core i7, 2-core +
HT = 4 virtual cores) by hitting Varnish with heavy concurrency
(ab -c128).
{{{
# Test 1: All virtual cores active - Bug exists
$ egrep 'core id' /proc/cpuinfo
core id : 0
core id : 2
core id : 0
core id : 2
# Test 2: Two virtual cores disabled, HT disabled - No bug
$ egrep 'core id' /proc/cpuinfo
core id : 0
core id : 2
# Test 3: Two virtual cores disabled, HT enabled - Bug exists
$ egrep 'core id' /proc/cpuinfo
core id : 0
core id : 0
}}}
Locking stat_mtx solves the problem.
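
As a rough illustration of that fix, the sketch below guards both
updates with a mutex. It is an assumption-laden stand-in, not a patch:
it uses plain pthreads rather than Varnish's Lck_Lock()/Lck_Unlock()
wrappers, and n_sess, stat_mtx, worker() and run_locked_counter() are
local names chosen to mirror the ticket.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define N_THREADS	4
#define N_ITER		100000

/* Local stand-ins for VSC_main->n_sess and stat_mtx. */
static long n_sess;
static pthread_mutex_t stat_mtx = PTHREAD_MUTEX_INITIALIZER;

/* Each worker "creates" and "deletes" a session N_ITER times. */
static void *
worker(void *arg)
{
	int i;

	(void)arg;
	for (i = 0; i < N_ITER; i++) {
		pthread_mutex_lock(&stat_mtx);
		n_sess++;		/* session created */
		pthread_mutex_unlock(&stat_mtx);

		pthread_mutex_lock(&stat_mtx);
		n_sess--;		/* session deleted */
		pthread_mutex_unlock(&stat_mtx);
	}
	return (NULL);
}

/* Run the workers and return the final counter; 0 means no drift. */
static long
run_locked_counter(void)
{
	pthread_t thr[N_THREADS];
	int i, rc;

	n_sess = 0;
	for (i = 0; i < N_THREADS; i++) {
		rc = pthread_create(&thr[i], NULL, worker, NULL);
		assert(rc == 0);
	}
	for (i = 0; i < N_THREADS; i++) {
		rc = pthread_join(thr[i], NULL);
		assert(rc == 0);
	}
	(void)rc;
	return (n_sess);
}
```

Because every increment and decrement happens under the same lock, the
counter returns to zero once all workers have finished.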
--
Ticket URL: <http://www.varnish-cache.org/trac/ticket/897>
Varnish <http://varnish-cache.org/>
The Varnish HTTP Accelerator