Cache request body and user-accessible functions.

Arianna Aondio arianna.aondio at varnish-software.com
Thu Feb 26 10:32:23 CET 2015


VDD Hamburg talking point:

Context:
Starting with Varnish 4, we can buffer the request body (usually for
POST and PUT requests) before sending it to the backend.
Currently only one function is exposed to users:
std.cache_req_body(BYTES size), which initializes the buffering.
Once the request body has been cached, it can be consumed as many
times as needed, which makes it possible to offer other
user-accessible functions, such as:
* request body length access function
* regular expression match on request body
* regular expression substitution on request body
* request body as input in vcl_hash

Problems:
1. Bug #1664: std.cache_req_body(BYTES size) lacks error handling.
If it is called with a request body bigger than size, Varnish crashes,
and if the request is chunked, the function caches the entire request
body, ignoring the provided size limit.
2. Regular expression match on body: what should the user interface
look like? Do we want the function to return a boolean indicating
whether the request body matches the expression the user is looking
for? In VCL this could look like:
sub vcl_recv {
     set req.http.x-boolean1 = std.regex_req_body("varnish rocks");
}

Or do we want to be more aligned with the regex syntax and make the
request body completely available to the user? In VCL this could look
like:
sub vcl_recv {
     if (std.reqbody_re_match() ~ "varnish rocks") {
     ....
     }
}

3. Regular expression substitution on body: this function needs to be
discussed. Do we really need to be able to substitute on the request
body? Is it safe? How do we handle a possible increase in the size of
the request body?
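
To make problems 2 and 3 more concrete, here is a minimal sketch in
Python rather than VCL or Varnish C code. Both function names are
hypothetical and exist only for illustration. It assumes the cached
body is available as one bytes object, shows the boolean-returning
style of match, and shows that a substitution can change the body
length, so any length announced to the backend would have to be
recomputed.

```python
import re


def req_body_matches(body, pattern):
    """Return True if the bytes pattern matches anywhere in the body.

    Hypothetical helper. It works on bytes, not text, so binary
    bodies (including zero bytes) are searched in full.
    """
    return re.search(pattern, body) is not None


def substitute_req_body(body, pattern, replacement):
    """Return the rewritten body together with its new length in bytes.

    Hypothetical helper. The length is returned explicitly because the
    replacement may be longer or shorter than the text it replaces.
    """
    new_body = re.sub(pattern, replacement, body)
    return new_body, len(new_body)
```

For example, replacing "foo" with "foobar" in a 12-byte body yields a
15-byte body, which is the growth that question 3 asks about.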

Proposed solutions:
1. As decided a couple of weeks ago during a bugwash, we either buffer
the whole request body or fail the request.
I have a patch for this: if the request body is bigger than the given
size, we close the connection and move on to the next request.
2. and 3.: to be discussed.
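
The "buffer everything or fail" rule from solution 1 can be sketched
as follows. This is an illustration in Python, not the actual patch:
the function and exception names are hypothetical, and the body is
assumed to arrive as an iterable of byte chunks. The point is that the
limit is checked against the running total, so it also applies to a
chunked request whose length is not known in advance.

```python
class BodyTooLarge(Exception):
    """Raised when the request body exceeds the caller's size limit."""


def cache_req_body(chunks, max_size):
    """Buffer a whole body from an iterable of byte chunks, or fail.

    Hypothetical helper, not a Varnish API. A partial body is never
    returned: either every chunk fits within max_size, or the call
    raises and the caller fails the request.
    """
    buffered = []
    total = 0
    for chunk in chunks:
        total += len(chunk)
        if total > max_size:
            # Stop reading as soon as the limit is crossed.
            raise BodyTooLarge(f"body exceeds {max_size} bytes")
        buffered.append(chunk)
    return b"".join(buffered)
```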

Request body length access function: once the request body has been
cached, we can then iterate over it and return the number of bytes.
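
As a rough sketch of that iteration, assuming the cached body is
stored as a sequence of byte segments (the storage layout and the
function name are assumptions made for illustration):

```python
def req_body_length(segments):
    """Return the total body size by summing every cached segment."""
    return sum(len(segment) for segment in segments)
```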

Request body as input in vcl_hash: once the request body has been
cached, we can hash on it. This function should be available only in
vcl_hash.
Until now we have always just hashed on strings, but if we want to
hash on bodies we need to be aware that they can be binary, so we need
to handle this properly.
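
A small Python sketch of why binary bodies need care: if a body is
treated like a zero-terminated string, everything after the first zero
byte is ignored, and two different bodies can end up with the same
hash. The helper names are hypothetical, and SHA-256 is used here only
as an example digest.

```python
import hashlib


def hash_as_bytes(body):
    """Hash the full body, with its length taken from the buffer."""
    return hashlib.sha256(body).hexdigest()


def hash_as_c_string(body):
    """Mimic string-style hashing: stop at the first zero byte."""
    prefix = body.split(b"\x00", 1)[0]
    return hashlib.sha256(prefix).hexdigest()
```

With two bodies that differ only after a zero byte, the string-style
hash gives the same result for both, while hashing the full byte
buffer tells them apart.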

I think functions for request body manipulation should be part of the
std vmod.


General considerations:
Request bodies may contain binary data that headers should not contain.
Functions have to be able to handle any kind of request body.
