VIP23 (VSL refactoring) design sketch
Poul-Henning Kamp
phk at phk.freebsd.dk
Wed Apr 10 07:13:58 UTC 2019
--------
In message <82919b48-25e0-d22f-1029-6d17db040519 at uplex.de>, Geoff Simmons writes:
>Whether we get the support is yet to be finalized, but if so we'll give
>it high priority. phk asked me to post a design sketch in advance before
>we put in the major time and effort, so that others can comment, and
>warn us off we're going in the wrong direction.
Yes, I really hate it when people go off and do something in quiet
and come back with something which isn't quite right for some reason
they overlooked.
>- That means that varnishd writes binary fields into the log.
For all its problems, a single call to sprintf is actually pretty
efficient in terms of CPU cache etc, because we almost always have
it stuck there.
I'm not very keen to see a lot of macro-work here, for instance
wrapping three memcpy's for Timestamp etc.
In theory one could extend the printf syntax with %V and similar,
but that looses the compilers type-checking of arguments.
However, just because the "const char *fmt" and the varargs
are the *same* as printf, doesn't mean we have to *do* the same
thing as printf with them.
So as a total early morning strawman, before morning tea has
fully kicked in, but *after* I read your email:
VSLb(vsl, SLT_Timestamp, "%s: %.6f %.6f %.6f",
event, now, now - first, now - *pprev);
Becomes:
VSLb(vsl, SLT_Timestamp, "S%sT%fD%fD%f",
event, now, now - first, now - *pprev);
And serialize into VSL as:
<HEADER><ASCIIZ><8 byte double><8 byte double><8 byte double>
Meanwhile, the "S%sT%fD%fD%f" string (possibly transformed) tells
the client side that we have a string, a timestamp and two durations.
We could make the "veprintf" format string come out of the tbl/mumble.h
paired with the SLT_ define:
#include <tbl/mumble.h> magic produces:
#define SLT_Bin_Timestamp SLT_Timestamp | SLT_BINARY, "S%sT%fD%fD%f"
Then in source call becomes:
VSLb(vsl, SLT_Bin_Timestamp, event, now, now - first, now - *pprev);
That way nobody confuses it with a "regular" printf, and the
SLT_Timestamp bit alerts the VSL code to interpret the fmt string
as "veprintf" syntax.
This would be back-wards source compatible, making the the transition
easier, and could allow us to make A/B testing by substituting "real"
printf formats with a compiletime option.
The "veprintf" syntax needs to be carefully thought out, for instance
"S%f" should be a syntax error, and that needs to be caught somewhere,
preferably at compile time, but worst case we could walk all the
strings from <tbl/mumble.h> with a varnishd -x argument as part of
the build.
>- Outputs for structured, typed data, in particular JSON from varnishlog
>-j, fall out easily, since the data is already structured and typed
>(don't need to parse strings in the log).
So where does <tbl/mumble.h> come from ?
It probably makes a lot of sense to wrap all the information one single
place, possibly as .rst-oid format like ".vsc" files and have python
chew on that and output all of docs, JSON metadata and the macro
definitions for varnishd (which would not tbl/mumble.h then.)
Far out:
Given this level of abstraction we could reorder the arguments
between VSL and presentation, so for instance timestamp could be
serialized as:
<8 byte double><8 byte double><8 byte double><ASCIIZ>
So the VSLQ would always know where to find the doubles, but the
presentation code in libvarnishapi would (still) render the string
in front.
We chould also benchmark emitting a length before ASCIIZ to make
it faster to skip strings elements.
Either 'trick' might make VSLQ faster.
--
Poul-Henning Kamp | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG | TCP/IP since RFC 956
FreeBSD committer | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
More information about the varnish-dev
mailing list