ban lurker questions

Poul-Henning Kamp phk at phk.freebsd.dk
Mon Apr 18 10:20:32 CEST 2016


--------
In message <57110C4E.8010209 at schokola.de>, Nils Goroll writes:
>Hi,
>
>I am working on a ban (lurker) performance issue, which, at this point, is
>simply caused by too frequent and too inefficient bans - but anyway, I'd like to
>understand the code to the best of my abilities.
>
>1) ban_cleantail
>
>why do we acquire the mtx for every ban we look at rather than collecting all
>the bans to be freed in a loop while holding the mtx?

Probably just an accident of how the code has developed.
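
For illustration, here is a minimal sketch of the batched variant you
describe: unlink all unreferenced bans from the tail under a single lock
acquisition, then free them after the lock is dropped. This is not the
Varnish code. The struct, ban_insert() and ban_cleantail_batched() are
hypothetical stand-ins, a plain pthread mutex replaces Lck_Lock(), and
the sketch assumes that the newest ban is always kept.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical stand-in for a ban; not Varnish's struct ban. */
struct ban {
	int			refcount;	/* objects still pointing here */
	TAILQ_ENTRY(ban)	list;
};
TAILQ_HEAD(banhead, ban);

static pthread_mutex_t ban_mtx = PTHREAD_MUTEX_INITIALIZER;
static struct banhead ban_head = TAILQ_HEAD_INITIALIZER(ban_head);

/* Helper for the example: add a new ban at the head (newest first). */
static void
ban_insert(int refcount)
{
	struct ban *b = calloc(1, sizeof *b);

	assert(b != NULL);
	b->refcount = refcount;
	pthread_mutex_lock(&ban_mtx);
	TAILQ_INSERT_HEAD(&ban_head, b, list);
	pthread_mutex_unlock(&ban_mtx);
}

/*
 * Unlink every unreferenced ban at the tail while holding the mutex
 * once, then free them outside the lock. Returns the number freed.
 */
static int
ban_cleantail_batched(void)
{
	struct banhead freelist = TAILQ_HEAD_INITIALIZER(freelist);
	struct ban *b;
	int n = 0;

	pthread_mutex_lock(&ban_mtx);
	for (;;) {
		b = TAILQ_LAST(&ban_head, banhead);
		/* Assumption: keep the newest ban, stop at the first one in use. */
		if (b == NULL || b == TAILQ_FIRST(&ban_head) || b->refcount > 0)
			break;
		TAILQ_REMOVE(&ban_head, b, list);
		TAILQ_INSERT_TAIL(&freelist, b, list);
	}
	pthread_mutex_unlock(&ban_mtx);

	/* The expensive part (freeing) happens without the lock held. */
	while ((b = TAILQ_FIRST(&freelist)) != NULL) {
		TAILQ_REMOVE(&freelist, b, list);
		free(b);
		n++;
	}
	return (n);
}
```

The private free list keeps the critical section down to pointer
updates, which is the point of collecting the bans in one loop.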

>2) would this assertion be correct?
>
>diff --git a/bin/varnishd/cache/cache_ban_lurker.c b/bin/varnishd/cache/cache_ban_lurker.c
>index 65c552e..fb69f78 100644
>--- a/bin/varnishd/cache/cache_ban_lurker.c
>+++ b/bin/varnishd/cache/cache_ban_lurker.c
>@@ -190,6 +190,7 @@ ban_lurker_test_ban(struct worker *wrk, struct vsl_log *vsl, struct ban *bt,
>                        VSC_C_main->bans_lurker_obj_killed++;
>                } else {
>                        if (oc->ban != bd) {
>+                               assert(oc->ban == bt);
>                                Lck_Lock(&ban_mtx);
>                                oc->ban->refcount--;
>                                VTAILQ_REMOVE(&oc->ban->objcore, oc, ban_list);

I am not sure.  Isn't there a window where a HSH_Lookup could race us?

>3) ban_lurker_getfirst questions:
>
>- for the contention case, shouldn't we continue walking the bt->objcore list
>  and sleep only if we hit the marker?

We'd need to come back to the missed ocs later; the lurker cannot skip
any of them.
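
As a rough sketch of what "coming back later" could mean: keep walking
on contention, but park every item whose lock could not be taken on a
second list and revisit that list afterwards. None of this is from the
Varnish tree. struct item, walk_once() and the per-item pthread mutex
are hypothetical stand-ins for the oc and for whatever lock the lurker
fails to acquire.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-in for an object the lurker has to test. */
struct item {
	pthread_mutex_t		mtx;	/* the lock we may fail to get */
	int			tested;	/* set once the item was examined */
	TAILQ_ENTRY(item)	list;
};
TAILQ_HEAD(itemhead, item);

/*
 * One pass over `todo`: examine every item whose lock is free and move
 * the contended ones to `missed` instead of sleeping on them.
 * Returns the number of items examined in this pass.
 */
static int
walk_once(struct itemhead *todo, struct itemhead *missed)
{
	struct item *it;
	int n = 0;

	while ((it = TAILQ_FIRST(todo)) != NULL) {
		TAILQ_REMOVE(todo, it, list);
		if (pthread_mutex_trylock(&it->mtx) != 0) {
			/* Contended: remember it, it must not be skipped. */
			TAILQ_INSERT_TAIL(missed, it, list);
			continue;
		}
		it->tested = 1;
		pthread_mutex_unlock(&it->mtx);
		n++;
	}
	return (n);
}
```

A caller would repeat walk_once() with the two lists swapped (sleeping
in between if nothing made progress) until both are empty, so that no
item is dropped.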

>- do I understand correctly that getfirst moves the oc to the tail of the bt->objcore
>  list, behind the marker, to ensure we don't re-visit ocs which have not got
>  killed yet, after being handed off to exp?

Not sure I understand the question...

>4) ban_lurker_work
>
>why do we mark_completed and then clean out the completed bans in the next step
>rather than removing the completed bans straight away? Why do we need to do the
>spec fiddling in ban_mark_completed (including a membar (!)) if we're about to
>ditch the ban anyway?

Again, probably just an accident of how the code developed.

Improvements are most welcome.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.


