More on the HAProxy proxy protocol

Mon Nov 11 16:34:14 CET 2013

[ watch out, this became rather long ]

Hi all.

Recently I've been looking into merging Mark Bergsma's proxy protocol[1] work into
Varnish. The goal of this email is to write down our thoughts up until now,
and get a small discussion going on how it should look when it is done.

This is a protocol where a small header is written first in a TCP connection,
telling the next hop what the client ip/port was as seen by the proxy in front.
Typical use is SSL termination where the SSL terminator does not know, or want
to know for performance reasons, anything about the inner protocol, so it cannot
add an X-Forwarded-For header.

There are two header formats, proxy (text based) and proxy2 (binary). The binary
one is preferred in the specification.

I'm only considering frontside proxy support here. We have the standard xff
(or now just f-f) to indicate this information to the backend. The proxy protocol
also supports domain sockets, UDP and other oddities. I've only considered
TCP over IPv4 and IPv6 here.
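
To make the text format concrete, here is a minimal Python sketch of a parser
for the text ("proxy") header, limited to the TCP4/TCP6 cases considered here.
It is an illustration only, not Varnish code: the names ProxyInfo,
ProxyHeaderError and parse_proxy_v1 are made up for this example, and the
binary proxy2 format is not covered.

```python
import ipaddress
from typing import NamedTuple, Union

IPAddress = Union[ipaddress.IPv4Address, ipaddress.IPv6Address]

# Hypothetical names, for illustration only.
ADDRESS_TYPES = {"TCP4": ipaddress.IPv4Address, "TCP6": ipaddress.IPv6Address}


class ProxyInfo(NamedTuple):
    src_ip: IPAddress   # client address as seen by the proxy in front
    src_port: int
    dst_ip: IPAddress   # address the client connected to on the proxy
    dst_port: int


class ProxyHeaderError(ValueError):
    """Raised when the first line is not a usable text proxy header."""


def parse_proxy_v1(line: bytes) -> ProxyInfo:
    # Expected shape: b"PROXY TCP4 <src ip> <dst ip> <src port> <dst port>\r\n"
    if not line.startswith(b"PROXY ") or not line.endswith(b"\r\n"):
        raise ProxyHeaderError("not a text proxy header")
    try:
        fields = line[:-2].decode("ascii").split(" ")
    except UnicodeDecodeError as exc:
        raise ProxyHeaderError("header is not ASCII") from exc
    if len(fields) != 6:
        raise ProxyHeaderError("expected 6 space-separated fields")
    _, family, src, dst, sport, dport = fields
    addr_type = ADDRESS_TYPES.get(family)
    if addr_type is None:
        raise ProxyHeaderError(f"unsupported family {family!r}")
    try:
        info = ProxyInfo(addr_type(src), int(sport), addr_type(dst), int(dport))
    except ValueError as exc:
        raise ProxyHeaderError(str(exc)) from exc
    if not all(0 <= port <= 65535 for port in (info.src_port, info.dst_port)):
        raise ProxyHeaderError("port out of range")
    return info
```

Anything that does not parse raises ProxyHeaderError, which the caller would
treat as a reason to close the connection.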

I've discussed this with Tollef for a bit, and we've come up with the following.

1. Extend the -a startup argument with a protocol definition:

Current behaviour:
    varnishd -a 192.0.2.10:80 -f /etc/varnish/foo.vcl
    varnishd -a plain@192.0.2.10:80 -f /etc/varnish/foo.vcl # New equivalent.

Force proxy or proxy2 on incoming connections:

    varnishd -a proxy@192.0.2.10:80 -f /etc/varnish/foo.vcl
    varnishd -a proxy2@192.0.2.10:80 -f /etc/varnish/foo.vcl

Per the specification any connection not sending a proxy header to such a
socket should be a hard error.
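
As a rough sketch of how the extended -a argument could be split, here is a
small Python illustration. It assumes a protocol@address spelling for the
prefix (the separator is an assumption), and ListenSpec and parse_listen_spec
are hypothetical names, not anything in varnishd.

```python
from typing import NamedTuple

# Protocol names proposed in this mail; "plain" is the default.
KNOWN_PROTOCOLS = ("plain", "proxy", "proxy2")


class ListenSpec(NamedTuple):
    protocol: str
    address: str   # left unparsed here, e.g. "192.0.2.10:80"


def parse_listen_spec(arg: str) -> ListenSpec:
    proto, sep, address = arg.partition("@")
    if not sep:
        # No prefix given: keep today's behaviour.
        return ListenSpec("plain", arg)
    if proto not in KNOWN_PROTOCOLS:
        raise ValueError(f"unknown protocol {proto!r}")
    return ListenSpec(proto, address)
```

An argument without a prefix keeps working as before, so existing startup
scripts would not need to change.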

It might be necessary to filter which clients are allowed to connect to this
socket, so that outside clients cannot spoof their address by writing the proxy
header themselves. Perhaps only localhost is allowed in the default case,
and a predefined ACL must be set/extended in VCL to allow external clients.
(needs discussion/thought)
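
The filtering idea could look roughly like this in Python. This is only a
sketch of the policy, not a proposal for the implementation: the localhost-only
default follows the suggestion above, and DEFAULT_TRUSTED and
may_send_proxy_header are made-up names.

```python
import ipaddress

# Assumed default: only loopback peers may speak the proxy protocol.
DEFAULT_TRUSTED = [
    ipaddress.ip_network("127.0.0.0/8"),
    ipaddress.ip_network("::1/128"),
]


def may_send_proxy_header(peer_ip: str, trusted=DEFAULT_TRUSTED) -> bool:
    """Return True if the TCP peer is allowed to send a proxy header."""
    addr = ipaddress.ip_address(peer_ip)
    # Compare only against networks of the same address family.
    return any(addr.version == net.version and addr in net for net in trusted)
```

The check is made against the real TCP peer, before the header is read, so an
untrusted client never gets the chance to claim another address.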

2. VCL interface

In VCL we currently have client.ip, server.ip and server.port available. These
are (as I understand it) picked directly from the socket endpoints.

New in the proxy protocol is that we have:

    a) proxy front connection source IPv4/IPv6.
    b) proxy front connection source port.
    c) proxy front connection destination IPv4/IPv6. (your SSL IP)
    d) proxy front connection destination port. (often 443)

This information, or parts of it, needs to be put into VCL somehow, so we can
use it for policy decisions like ACL matching.

Mark took the stoic approach:

    a) VCL_IP  req.proxy.client.ip
    b) VCL_INT req.proxy.client.port
    c) VCL_IP  req.proxy.server.ip
    d) VCL_INT req.proxy.server.port
    e) bool    req.proxy (proxy in use or not)

This feels confusing: is req.proxy.server.ip the connection with the envelope
on, or what? And why is it in req? It is a session concept, not a request
concept; we don't have req.client.ip, we have client.ip.

To keep VCL concise, and I think this is the least surprising behaviour for
a user, we suggest that VCL client.ip is set to (a) if the socket is a proxy
socket.

This makes all the ACLs and logging behave like you expect them to. No cluttered
VCL in the default case.

If we do this, we're not sure what server.ip should be any more. Should it be (c),
or should it be the local endpoint IP? Input appreciated.

We also need something new that exposes the local socket endpoint information,
since client.ip/server.ip are modified. Maybe a tcp.(client|server).(ip|port), if
that prefix makes sense.

I think the two ways forward are:
1) treat proxy as an exception to the rule, and add the 5 new proxy variables to VCL
   (but without the req. prefix).
2) change client.ip behind the scenes, and create 5 (?) new tcp.XXX variables in VCL
   to store what would otherwise be in client.ip/server.ip.
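
To compare the two options side by side, here is a rough Python model of what
client.ip would resolve to under each. It is not Varnish code: the Session
class, its field names and the two functions are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Endpoint = Tuple[str, int]   # (ip, port)


@dataclass
class Session:
    tcp_client: Endpoint                      # real socket peer
    tcp_server: Endpoint                      # real local endpoint
    proxy_client: Optional[Endpoint] = None   # (a), (b) from the proxy header
    proxy_server: Optional[Endpoint] = None   # (c), (d) from the proxy header


def client_ip_option1(sess: Session) -> str:
    # Option 1: client.ip stays the socket peer; the proxy data lives in
    # separate proxy.* variables that VCL has to consult explicitly.
    return sess.tcp_client[0]


def client_ip_option2(sess: Session) -> str:
    # Option 2: client.ip follows the proxy header when present; the socket
    # peer would move to a new tcp.client.ip variable.
    if sess.proxy_client is not None:
        return sess.proxy_client[0]
    return sess.tcp_client[0]
```

With option 2 an existing ACL on client.ip matches the real client without any
VCL changes, while with option 1 it matches the proxy in front.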

Any comments or inputs on this are appreciated.

1: http://haproxy.1wt.eu/download/1.5/doc/proxy-protocol.txt
-- 
With regards,
Lasse Karstensen
Varnish Software AS