New (be)req fields: path, scheme and authority

Dridi Boukelmoune dridi at varni.sh
Thu Jan 28 12:20:30 CET 2016


Hi all,

I'm starting a discussion that will hopefully lead to a VIP. I have
pondered this one for quite a while now and trac #1847 convinced
me that I should share this with the -dev list. I predict it will be
one of those long emails I'm very good at not synthesizing.

The proposal is to have new request fields:
- req.path (the absolute path)
- req.host (the virtual host)
- req.authority
- req.scheme (the URI scheme)

This is based on my understanding of HTTP/1.1 and it seems like it'd
play nicely with HTTP/2 but I'm not done studying the latter.

In HTTP/1.1 a request starts[1] with a request-line:

  method SP request-target SP HTTP/1.1 CRLF

The request-target is a URI[2] that can take several forms:

  request-target = origin-form => the path, /something
               / absolute-form => [scheme]://[authority][path]
               / authority-form => host name for CONNECT
               / asterisk-form => an actual * for OPTIONS

The problem with the absolute-form is that Varnish is not supposed to
receive this kind of request, since it is meant for forward proxies,
but a server MUST[5] accept it regardless.

The good thing with the absolute-form is that it integrates nicely with
HTTP/2 [3] since all its components map to pseudo headers.

The other thing that Varnish should do with absolute-form URIs is to
ignore[4] the Host header and use the absolute-form authority instead.

One may ask where we should store the asterisk-form, and HTTP/2 says
it belongs in the path[3].

How would it work?

For HTTP/1.1 Varnish would dissect the request and populate the
following fields:
- req.url => request-target
- req.path => origin form OR asterisk-form OR path from absolute-form
- req.authority => authority-form OR authority from absolute-form OR host header
- req.scheme => scheme from absolute-form OR "http"
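
To make the mapping above concrete, here is a minimal Python sketch of
the dissection. It is only an illustration, not Varnish code: the names
Dissected and dissect are made up for this example, keeping the query
string in the path is an assumption (the origin-form carries it too),
and the empty path for the authority-form is a guess since the proposal
does not cover that case.

```python
from typing import NamedTuple, Optional
from urllib.parse import urlsplit


class Dissected(NamedTuple):
    """Hypothetical container mirroring the proposed req.* fields."""
    url: str                  # req.url: the request-target, verbatim
    path: str                 # req.path
    authority: Optional[str]  # req.authority
    scheme: str               # req.scheme


def dissect(method: str, target: str,
            host_header: Optional[str] = None) -> Dissected:
    """Split an HTTP/1.1 request-target into the proposed fields."""
    if target == "*":
        # asterisk-form: stored in the path, authority from the Host header
        return Dissected(target, "*", host_header, "http")
    if target.startswith("/"):
        # origin-form: the target is the path, authority from the Host header
        return Dissected(target, target, host_header, "http")
    if method == "CONNECT":
        # authority-form: the target is the authority
        # (assumption: empty path, the proposal does not specify one)
        return Dissected(target, "", target, "http")
    # absolute-form: the Host header is ignored in favor of the target
    parts = urlsplit(target)
    if not parts.scheme or not parts.netloc:
        raise ValueError("not a valid request-target: %r" % target)
    path = parts.path or "/"
    if parts.query:
        # assumption: the query stays attached to the path
        path += "?" + parts.query
    return Dissected(target, path, parts.netloc, parts.scheme)
```

With this sketch, dissect("GET", "https://example.org/a?b=1",
"other.example") yields the authority "example.org" and the scheme
"https", whatever the Host header says.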

For HTTP/2 I suppose we could populate the new fields straight from
the pseudo headers[3] and reconstruct an absolute-form request-target
from them.
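
As a rough illustration of that reconstruction, a Python sketch could
look like the following. The function name is hypothetical, and the
special cases are assumptions on my part: CONNECT is taken to carry only
:method and :authority, a :path of "*" is mapped back to the
asterisk-form, and the host header is used when :authority is missing.

```python
from typing import Mapping


def target_from_pseudo_headers(headers: Mapping[str, str]) -> str:
    """Rebuild a request-target from HTTP/2 pseudo headers (sketch)."""
    # assumption: fall back to the host header when :authority is absent
    authority = headers.get(":authority") or headers.get("host", "")
    if headers[":method"] == "CONNECT":
        # assumption: CONNECT has no :scheme or :path, so the
        # authority-form is the only thing we can rebuild
        return authority
    path = headers[":path"]
    if path == "*":
        # asterisk-form, which HTTP/2 carries in :path
        return "*"
    # absolute-form: scheme "://" authority path
    return "%s://%s%s" % (headers[":scheme"], authority, path)
```

For example, a GET with :scheme "https", :authority "example.org" and
:path "/a?b=1" gives back "https://example.org/a?b=1".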

Changes in the built-in VCL:
sed -e 's/req\.url/req.path/g' -e 's/req\.http\.host/req.authority/g'

Security concerns:

Mainly, how to deal with an "https" scheme? And for that I'd shift the
responsibility to the user/documentation. If you have a trusted TLS or
HTTPS proxy you can always route the decrypted traffic to a different
port and check it when req.scheme == "https".

Breaking changes:

On top of the breaking changes (only the built-in VCL, I think) I
wouldn't mind renaming req.url to req.target but not because I like to
break things for the sake of breaking them (but I do like breaking
stuff).

The rationale is that since the request-target is not necessarily a
URL, this would be an opportunity to get better semantics wrt the
RFCs (like req.request that became req.method). It would also make
sure that existing VCL fails to compile instead of giving you subtle
bugs because of changes in the built-in VCL.

Thoughts?

Best,
Dridi

[1] https://tools.ietf.org/html/rfc7230#section-3.1.1
[2] https://tools.ietf.org/html/rfc7230#section-5.3
[3] https://tools.ietf.org/html/rfc7540#section-8.1.2.3
[4] https://tools.ietf.org/html/rfc7230#section-5.4
[5] https://tools.ietf.org/html/rfc7230#section-5.3.2


