V4 VCL ideas

Wed Jan 9 17:38:57 CET 2013

I like the idea of a vcl_http1_req method that runs before vcl_req, but I don't like the idea of making it protocol specific.

The reason I like it is that it gives me a place to do upfront processing that I only want to do once per request: setting X-Forwarded-For, stripping out cookies, GeoIP lookups, digest computations, etc. Then on a restart I know that those items have already been completed.

The reason I don't like making it protocol specific is that I would then have to add all that logic to each vcl_XXX_req method, or move it down to vcl_req and check whether it's the first time through vcl_req. I would propose using req.proto or something similar.

sub vcl_pre_req {
	/*
	 * V4: These functions are only invoked once, restart goes to vcl_req{}
	 */

	/* XXX V4: The user could add all the protocol specific code if they need it. */
	if (req.proto == "HTTP/1.0" || req.proto == "HTTP/1.1") {
		...
	} elseif (req.proto == "HTTP/2.0") {
		...
	}

	/* XXX V4: Set the X-Forwarded-For for all protocols. */
	if (req.http.x-forwarded-for) {
	    set req.http.X-Forwarded-For =
		req.http.X-Forwarded-For + ", " + client.ip;
	} else {
	    set req.http.X-Forwarded-For = client.ip;
	}
	if (xxx_something_bad) {
		return (error);		/* XXX V4: synthetic */
	}

	/* XXX V4: Do some heavy lifting here. */
	set req.http.Cookie = ...;
	set req.http.X-Country = ...;
	set req.http.X-Bucket = ...;

	return (req);
}

I would adopt the req.proto approach in the rest of the functions also.
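
For comparison, the "check whether it's the first time through" alternative can be sketched in Varnish 3 terms, where req.restarts is 0 on the first pass through vcl_recv. This is only an illustration using the 3.x names (vcl_recv, req.restarts); whether V4 keeps those names is an assumption.

```vcl
sub vcl_recv {
	if (req.restarts == 0) {
		/* First pass only: after a restart req.restarts is > 0 */
		if (req.http.X-Forwarded-For) {
			set req.http.X-Forwarded-For =
			    req.http.X-Forwarded-For + ", " + client.ip;
		} else {
			set req.http.X-Forwarded-For = client.ip;
		}
	}
	/* Per-pass policy would continue here */
}
```

Every once-only step needs the same guard, which is the repetition a dedicated once-only function would avoid.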

How does vcl_lookup work? Are we always guaranteed to have an obj, or do we have to check for existence? If it always exists, how can we tell whether the obj represents a cache miss?

Keep up the great work!

Raul

-----Original Message-----
From: varnish-dev-bounces at varnish-cache.org [mailto:varnish-dev-bounces at varnish-cache.org] On Behalf Of Poul-Henning Kamp
Sent: Wednesday, January 09, 2013 3:17 AM
To: varnish-dev at varnish-cache.org
Subject: V4 VCL ideas

I'm still struggling with designing the V4 VCL language, and thought it was time to run my current sketch past the rest of you.

One thing I'm particularly struggling with now is how much we want to involve VCL in the mechanics of the protocol.

Right now we are somewhat schizophrenic about this: for instance, we handle incoming "Connection: xxx" in C-code, but allow VCL to set it on the way out.

One possible consistent view is that _everything_ which might smell the least of policy, should be in VCL.  The obvious downside is that VCL becomes quite complicated.

Another possible consistent view is that we handle all protocol mechanics in the C-code, and only put "clean content policy" in VCL.  That sounds like it could be a cleaner VCL language to me.

As far as I can tell, the major argument for the first approach is trust management.

For instance: can you trust the X-Forwarded-For header, can you not trust it, or do you not care about it at all?

If it is handled in C-code, invisible to the VCL code, users may not realize that this is something they should think about in the first place.
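
To make the trust question concrete, here is a minimal sketch in Varnish 3 syntax of a policy that only believes X-Forwarded-For when the request arrives from a known proxy. The ACL name trusted_proxies and its address range are made up for illustration, and the use of vcl_recv and req.restarts assumes the 3.x names.

```vcl
acl trusted_proxies {
	"192.0.2.0"/24;		/* hypothetical upstream proxies */
}

sub vcl_recv {
	if (req.restarts == 0) {
		if (!client.ip ~ trusted_proxies) {
			/* Do not believe a header supplied by an untrusted client */
			unset req.http.X-Forwarded-For;
		}
	}
	/* Falling through lets the default code append client.ip */
}
```

Because this policy is visible in VCL, the user has to decide what "trusted" means; if the header were handled in C-code, that decision would be made silently.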

Right now I'm exploring a model where the VCL has a "core" part, which takes an HTTP request and responds to it, and a number of protocol specific parts, to enable people to get their hands on the low-level stuff.

My current mock-up default.vcl presently looks like this. Input, ideas and comments are most welcome.

Don't get too hung up on specific names, those are just sort of placeholders; it's more the overall structure I am interested in.

... and no, I still don't know what to do about vcl_error{}...

Poul-Henning

/*-
 * Copyright (c) 2006 Verdens Gang AS
 * Copyright (c) 2006-2013 Varnish Software AS
 * All rights reserved.
 *
 * Author: Poul-Henning Kamp <phk at phk.freebsd.dk>
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the distribution.
 *
 * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
 * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED.  IN NO EVENT SHALL AUTHOR OR CONTRIBUTORS BE LIABLE
 * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
 * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
 * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * SUCH DAMAGE.
 *
 * The default VCL code.
 *
 * NB! You do NOT need to copy & paste all of these functions into your
 * own vcl code, if you do not provide a definition of one of these
 * functions, the compiler will automatically fall back to the default
 * code from this file.
 *
 * This code will be prefixed with a backend declaration built from the
 * -b argument.
 */

sub vcl_http1_req {
	/*
	 * V4: This is the new protocol specific "get-here-first" function
	 * V4: in the future we may also have vcl_http2_req{}, vcl_spdy_req{}
	 * V4: and so on.
	 * V4: These functions are only invoked once, restart goes to vcl_req{}
	 */

	if (req.http.x-forwarded-for) {
	    set req.http.X-Forwarded-For =
		req.http.X-Forwarded-For + ", " + client.ip;
	} else {
	    set req.http.X-Forwarded-For = client.ip;
	}
	if (xxx_something_bad) {
		return (error);		/* XXX V4: synthetic */
	}
	return (req);
}

sub vcl_req {
	/*
	 * V4: Formerly known as vcl_recv{}
	 */

	if (req.method != "GET" &&
	    req.method != "HEAD" &&
	    req.method != "PUT" &&
	    req.method != "POST" &&
	    req.method != "TRACE" &&
	    req.method != "OPTIONS" &&
	    req.method != "DELETE") {
		/* Non-RFC2616 or CONNECT which is weird. */
		return (pipe);
	}
	if (req.method != "GET" && req.method != "HEAD") {
		/* We only deal with GET and HEAD by default */
		return (pass);		/* XXX V4:  return(bereq) ? */
	}
	if (req.http.Authorization || req.http.Cookie) {
		/* Not cacheable by default */
		return (pass);		/* XXX V4:  return(bereq) ? */
	}
	return (hash);
}

sub vcl_http1_pipe {
	/*
	 * V4: Pipe is protocol specific as far as I can tell
	 */

	# Note that only the first request to the backend will have
	# X-Forwarded-For set.  If you use X-Forwarded-For and want to
	# have it set for all requests, make sure to have:
	# set bereq.http.connection = "close";
	# here.  It is not set by default as it might break some broken web
	# applications, like IIS with NTLM authentication.
	return (pipe);
}

sub vcl_hash {
	hash_data(req.url);
	if (req.http.host) {
		hash_data(req.http.host);
	} else {
		hash_data(server.ip);
		hash_data(server.port);
	}
	return (lookup);
}

sub vcl_lookup {
	/*
	 * V4: Formerly part of vcl_hit{}, vcl_miss{} and vcl_pass{}
	 */
	if (is_pass(obj)) {
		return (pass);	/* XXX: V4: return(bereq) ? */
	}
	if (obj.ttl > 0 s) {
		return (deliver);	/* XXX: V4: return (resp) */
	}
	background_fetch(obj);
	if (obj.grace > 0 s) {
		return (deliver);	/* XXX: V4: return (resp) */
	}
	return (fetch);
}

sub vcl_resp {
	/*
	 * V4: Not obvious if this is/might be protocol specific
	 */
	return (deliver);
}

sub vcl_bereq {
	/*
 	 * V4: Formerly part of vcl_pass{} and vcl_miss{}
	 * V4: Not obvious if this is/might be protocol specific
	 */
}

sub vcl_beresp {
	/*
	 * V4: Formerly vcl_fetch{}
	 * V4: Not obvious if this is/might be protocol specific
	 */
	if (beresp.ttl <= 0s ||
	    beresp.http.Set-Cookie ||
	    beresp.http.Vary == "*") {
		/*
		 * Mark as "Hit-For-Pass" for the next 2 minutes
		 */
		set beresp.ttl = 120 s;
		set beresp.pass = true;
	}
	return (insert);
}

/*
 * We can come here "invisibly" with the following errors:  413, 417 & 503
 */
sub vcl_error {
    set obj.http.Content-Type = "text/html; charset=utf-8";
    set obj.http.Retry-After = "5";
    synthetic {"
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
  <head>
    <title>"} + obj.status + " " + obj.response + {"</title>
  </head>
  <body>
    <h1>Error "} + obj.status + " " + obj.response + {"</h1>
    <p>"} + obj.response + {"</p>
    <h3>Guru Meditation:</h3>
    <p>XID: "} + req.xid + {"</p>
    <hr>
    <p>Varnish cache server</p>
  </body>
</html>
"};
    return (deliver);
}

sub vcl_init {
	return (ok);
}

sub vcl_fini {
	return (ok);
}
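
The note in vcl_http1_pipe{} above about X-Forwarded-For can be illustrated with the Varnish 3 equivalent, vcl_pipe{}. This is a sketch of the override that comment describes, written with the 3.x function name as an assumption; it is not part of the proposed default.

```vcl
sub vcl_pipe {
	/*
	 * Ask the backend to close the connection after this request, so
	 * later requests cannot ride the pipe without X-Forwarded-For.
	 */
	set bereq.http.Connection = "close";
	return (pipe);
}
```

As the original comment warns, this can break applications that depend on a persistent piped connection, such as IIS with NTLM authentication.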

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

_______________________________________________
varnish-dev mailing list
varnish-dev at varnish-cache.org
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-dev