Varnish performance musings
Geoffrey Simmons
geoff at uplex.de
Mon Apr 4 22:12:46 CEST 2016
> On Apr 4, 2016, at 9:21 PM, Devon H. O'Dell <dho at fastly.com> wrote:
>
> ## PCRE
>
> There are other things we've done (like optimizing regexes that are
> obviously prefix and suffix matches -- turns out lots of people write
> things like `if (req.http.x-foo ~ "^bar.*$")` that are effectively
> `if (strncmp(req.http.x-foo, "bar", 3) == 0)` because it's easy), but I don't
> see those as being as high priority for upstream; they're largely
> issues for our multi-tenant use case. We have done this already;
> another thing we would like to do is to check regexes for things like
> backtracking and use DFA-based matching where possible. In the flame
> graph screenshot, the obvious VRT functions are PCRE.
You might be interested in this, although it's as new as can be (tagged v0.1 just today) -- a VMOD for accessing Google's RE2 regular expression library:
https://code.uplex.de/uplex-varnish/libvmod-re2
For those not familiar with RE2: it limits the syntax so that patterns are regular languages in the strictly formal sense. Most notably, backrefs within a pattern are not allowed. That means the matcher can run as a DFA or NFA without ever backtracking, so the time required for a match is always linear in the length of the string being matched.
So far this is just a proof of concept, and I haven't done any performance testing. From the documentation, I suspect that there are certain kinds of use cases for Varnish where RE2 would perform better than PCRE, and many cases where it doesn't make much difference (such as the prefix or suffix matches you mentioned). But that's all speculation until it's been tested.
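To make the prefix/suffix case concrete, here is a minimal C sketch of what such matches reduce to once the regex engine is taken out of the picture. The helper names has_prefix and has_suffix are hypothetical, not taken from Varnish, Fastly's code, or the VMOD, and the equivalence to patterns like "^bar.*$" or "^.*bar$" is assumed to hold only for a literal string with no regex metacharacters and a subject without newlines (which is the case for header values).

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: true if s begins with the literal prefix.
 * Stands in for a pattern like "^bar.*$", assuming s has no newlines.
 */
static bool
has_prefix(const char *s, const char *prefix)
{
	assert(s != NULL && prefix != NULL);
	/* strncmp stops at the first difference or at s's terminating NUL */
	return (strncmp(s, prefix, strlen(prefix)) == 0);
}

/*
 * Hypothetical helper: true if s ends with the literal suffix.
 * Stands in for a pattern like "^.*bar$", assuming s has no newlines.
 */
static bool
has_suffix(const char *s, const char *suffix)
{
	size_t slen, xlen;

	assert(s != NULL && suffix != NULL);
	slen = strlen(s);
	xlen = strlen(suffix);
	if (xlen > slen)
		return (false);
	/* Compare only the last xlen bytes of s */
	return (memcmp(s + slen - xlen, suffix, xlen) == 0);
}
```

Both helpers do a single bounded comparison, so there is presumably little left for either regex engine to win or lose on such patterns; that is still a guess until it has been measured.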
Best,
Geoff