Varnish performance musings

Nils Goroll slink at schokola.de
Mon Apr 4 23:37:30 CEST 2016


Hi Devon,

thank you very much for the interesting writeup. Although I already have so
much unfinished Varnish work on my list, I'd like to dump some thoughts in the
hope that others may pick up on them:

> All VRT functions operate on the same timestamps (for each VCL callback)

This sounds perfectly reasonable; I think we should just do this.

>  1. TIM_real in our tree showed up in top functions of a synthetic,
> all-hit workload.

I once optimized a CLOCK_REALTIME-bound app by caching the real time and
offsetting it with the TSC for as long as the thread didn't migrate. This turned
a syscall into a handful of instructions on the fast path, but there is a
portability question and, unless the CPU has constant_tsc, inaccuracy due to
speed stepping (CPU frequency changes).
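
A minimal sketch of that caching idea in C follows. It is not the code from
that app: the names tsc_clock, read_ticks and RESYNC_NS are hypothetical, and
it assumes GCC or Clang on Linux/x86, where RDTSCP also returns a value that
identifies the current CPU. It assumes a constant TSC and no migration during
calibration; other platforms fall back to the monotonic clock.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L	/* clock_gettime(), nanosleep() */
#endif
#include <assert.h>
#include <stdint.h>
#include <time.h>

static uint64_t
clock_ns(clockid_t id)
{
	struct timespec ts;
	int r = clock_gettime(id, &ts);

	assert(r == 0);
	(void)r;
	return ((uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec);
}

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>
/* RDTSCP returns the TSC plus IA32_TSC_AUX, which Linux sets per CPU. */
static uint64_t
read_ticks(unsigned *cpu)
{
	return (__rdtscp(cpu));
}
#else
/* No TSC here: use the monotonic clock so the sketch still builds. */
static uint64_t
read_ticks(unsigned *cpu)
{
	*cpu = 0;
	return (clock_ns(CLOCK_MONOTONIC));
}
#endif

#define RESYNC_NS 1000000.0	/* hypothetical: trust the cache for 1 ms */

struct tsc_clock {
	uint64_t	real_base;	/* CLOCK_REALTIME at last sync, ns */
	uint64_t	tick_base;	/* tick counter at last sync */
	unsigned	cpu;		/* CPU id at last sync */
	double		ns_per_tick;	/* calibrated tick length */
};

/* Slow path: one real clock read, remembered together with the TSC. */
static void
tsc_clock_sync(struct tsc_clock *c)
{
	c->real_base = clock_ns(CLOCK_REALTIME);
	c->tick_base = read_ticks(&c->cpu);
}

/* Calibrate the tick length over 20 ms, then take the first sync. */
static void
tsc_clock_init(struct tsc_clock *c)
{
	struct timespec nap = { 0, 20 * 1000 * 1000 };
	unsigned cpu;
	uint64_t t0 = clock_ns(CLOCK_MONOTONIC), k0 = read_ticks(&cpu);

	nanosleep(&nap, NULL);
	uint64_t t1 = clock_ns(CLOCK_MONOTONIC), k1 = read_ticks(&cpu);
	c->ns_per_tick = (double)(t1 - t0) / (double)(k1 - k0);
	tsc_clock_sync(c);
}

/* Fast path: no syscall while on the same CPU and the cache is fresh. */
static uint64_t
tsc_clock_now(struct tsc_clock *c)
{
	unsigned cpu;
	uint64_t ticks = read_ticks(&cpu);
	double delta_ns = (double)(ticks - c->tick_base) * c->ns_per_tick;

	if (cpu != c->cpu || ticks < c->tick_base || delta_ns > RESYNC_NS) {
		tsc_clock_sync(c);
		return (c->real_base);
	}
	return (c->real_base + (uint64_t)delta_ns);
}
```

Each thread would keep its own struct tsc_clock. The value can step slightly
at a resync, so callers that need strictly monotonic time would have to clamp
it themselves.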

> 64-bit counters that represent the number of nanoseconds since the
> epoch. We will run out of those in something like 540 years, and I'm
> happy to make that someone else's problem :).

Besides the fact that this should be "(far) enough (in the future) for
everyone", I'm not even convinced that we need nanosecond resolution in Varnish.
Couldn't we shave off some 3-8 low-order bits or even use only microseconds?
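
For scale, the trade-off can be checked with a few lines of C. This is only
back-of-the-envelope arithmetic: the helper names are made up, years are taken
as 365.25 days, and "shaving bits" is read here as dropping low-order bits of
the nanosecond count. An unsigned 64-bit nanosecond counter spans roughly 584
years (half that if signed); each dropped bit doubles the span and the tick
length.

```c
#include <assert.h>
#include <stdint.h>

/* Years covered by an unsigned 64-bit counter with the given tick length. */
static double
range_years(double tick_ns)
{
	const double secs_per_year = 365.25 * 86400.0;
	const double two_pow_64 = 18446744073709551616.0;

	assert(tick_ns > 0.0);
	return (two_pow_64 * tick_ns / 1e9 / secs_per_year);
}

/* Drop `shift` low-order bits: one tick is then (1 << shift) ns long. */
static uint64_t
ns_to_ticks(uint64_t ns, unsigned shift)
{
	assert(shift < 64);
	return (ns >> shift);
}

/* Back to nanoseconds; the dropped low-order bits come back as zero. */
static uint64_t
ticks_to_ns(uint64_t ticks, unsigned shift)
{
	assert(shift < 64);
	return (ticks << shift);
}
```

With these assumptions, range_years(1.0) is about 584, range_years(8.0)
(3 bits dropped) about 4,700, range_years(256.0) (8 bits dropped) about
150,000 and range_years(1000.0) (microseconds) about 585,000.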

> ## PCRE
> 
> There are other things we've done (like optimizing regexes that are
> obviously prefix and suffix matches -- turns out lots of people write
> things like `if (req.http.x-foo ~ "^bar.*$")` that are effectively `if
> (strncmp(req.http.x-foo, "bar" 3))` because it's easy), but I don't
> see those as being as high priority for upstream; they're largely
> issues for our multi-tenant use case.

Years ago I learned about the Fastly "Host header switch" problem, and it
actually has an interesting generalization: since VCC compiles VCL anyway, it
could just as well compile pattern matchers too. I do this in
https://code.uplex.de/uplex-varnish/dcs_classifier - and the results are pretty
impressive.

I have no experience with it, but there's also re2c, which takes a generic
approach to the problem.

VCC could generate optimal matcher code for common yet simple expressions.
Adding a glob pattern type could be an option.
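
As an illustration of what "common yet simple" could mean, here is a rough C
sketch that recognizes patterns shaped like the quoted "^bar.*$" and reduces
them to a prefix compare. The function names are hypothetical and this is not
existing VCC code. It treats "^lit", "^lit.*" and "^lit.*$" as equivalent,
which assumes the subject contains no newlines (reasonable for header values).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * If re looks like "^literal", "^literal.*" or "^literal.*$" and the
 * literal contains no regex metacharacters, return the literal's length.
 * Return 0 for everything else, which would stay with the regex engine.
 */
static size_t
literal_prefix_len(const char *re)
{
	const char *tail;
	size_t n;

	assert(re != NULL);
	if (re[0] != '^')
		return (0);
	n = strcspn(re + 1, "\\.[](){}*+?|^$");
	if (n == 0)
		return (0);
	tail = re + 1 + n;
	if (*tail == '\0' || strcmp(tail, ".*") == 0 ||
	    strcmp(tail, ".*$") == 0)
		return (n);
	return (0);
}

/* The cheap replacement: compare the first n bytes after the '^'. */
static int
prefix_match(const char *subject, const char *re, size_t n)
{
	assert(subject != NULL && re != NULL);
	return (strncmp(subject, re + 1, n) == 0);
}
```

A compiler pass would call literal_prefix_len() once per pattern and emit the
strncmp() form only when it returns non-zero.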

Ideally, I'd like the "Host header switch problem" solved by having a VCC
statement which would compile something like this...

select req.http.Host {
	"www.foo.com": {
		call vcl_recv_foo;
		break;
	}
	"www.bar.*": {
		call vcl_recv_bar;
		break;
	}
	"*": {
		call ...
	}
}

...into classifier tree code.
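
To give a feel for "classifier tree code", the C below is a hand-written guess
at what could be generated for the three patterns above. It is not actual VCC
or dcs_classifier output, the enum and function names are invented, and the
comparison is case-sensitive for brevity (real Host matching would not be).

```c
#include <assert.h>
#include <string.h>

enum host_class { HOST_DEFAULT, HOST_FOO, HOST_BAR };

/*
 * Decision tree for "www.foo.com", "www.bar.*" and "*": test the shared
 * "www." prefix once, branch on the first differing byte, then verify
 * only the remainder of the selected pattern.
 */
static enum host_class
classify_host(const char *h)
{
	assert(h != NULL);
	if (strncmp(h, "www.", 4) != 0)
		return (HOST_DEFAULT);
	switch (h[4]) {		/* safe: h has at least 4 bytes plus NUL */
	case 'f':		/* exact match on "www.foo.com" */
		return (strcmp(h + 5, "oo.com") == 0 ?
		    HOST_FOO : HOST_DEFAULT);
	case 'b':		/* prefix match on "www.bar." */
		return (strncmp(h + 5, "ar.", 3) == 0 ?
		    HOST_BAR : HOST_DEFAULT);
	default:
		return (HOST_DEFAULT);
	}
}
```

Each input byte is examined at most once regardless of how many patterns
share a prefix, which is where the gain over a chain of regex matches would
come from.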

Nils


