RFC for VIP17: unix domain sockets for listen and backend addresses

Tue Apr 25 16:42:08 CEST 2017

> # Synopsis
> Allow unix domain sockets as a listen address for Varnish (``-a``
> option) and as addresses for backends, and for use in ACLs. Obtain
> credentials of the peer process connected on a UDS, such as uid and
> gid, for use in VCL.

Except for ACLs, I find the idea compelling so far. I would even like
to see UDS support for admin sockets (-T option).

> # Why?
> * Eliminate the overhead of TCP/loopback for connections with peers
> that are colocated on a host with Varnish.

Yes.

> * Restrict who can send requests to Varnish by setting permissions on
> the UDS path of the listen address.

The whole point of UDSs IMO.

>   * (But see the discussion below about getting this right portably.)
> * Make it possible for a backend peer to require restricted
> credentials for the Varnish process by setting permissions on the UDS
> path on which it listens.

It is technically possible to implement a UDS backend if one is brave
enough to re-implement all the VBE logic. So I'm strongly in favor of
having this capability in varnishd.

> * Peer credentials make it possible to:
>   * Make information about the peer available in VCL and the log.

Why not, no opinion.

>   * Extend ACLs to make it possible to place further restrictions on
> peers connecting to the listen address.

I would use a regex instead of messing with ACLs.

<...snip...>

> I would like to make this contribution for the September 2017 release.
> With the VIP I'd like to clarify:
>
> * Are there any changes planned for VTCP and VSA in the September
> release that would make adding UDS to those interfaces less trivial
> than it is now?

I wouldn't mix UDS with VSA, there's probably room for a different solution.

> * Every platform has a way to get peer credentials from a UDS, but
> there's no standard and it's highly platform-dependent. So how do we
> want to handle that?

Maybe we could start by not having them, being able to use UDSs is
already a huge win IMO.

> * Additions/changes to VCL and other changes in naming, such as the
> ``-a`` option and backend definitions.

I don't think we need to change -a or -T, as long as we force absolute
names we should be able to get away with the current syntax. An address
starting with a slash (/) would denote a UDS. [1]

> * If someone knows a reason why we shouldn't do this at all, this is
> the place to say so.
>
> # How?
> ## Address notation
> I suggest that we require a prefix such as ``unix:`` to identify UDS
> addresses (nginx uses ``unix:``, haproxy uses ``unix@``):
> ```
> varnishd -a unix:/path/to/uds

This should be enough:

    varnishd -a /path/to/uds

> backend uds { .host = "unix:/path/to/uds"; }

I would instead go for a .path field mutually exclusive with .host and
.port, removing ambiguity at vcl.load-time (error messages etc).

> ```
> That makes the interpretation unambiguous. We could simply interpret
> paths as UDS addresses when they appear in those places, but then we
> would need logic like: if the argument cannot be resolved as a host or
> parsed as an IP address, then assume it's a path for UDS, but if the
> path does not exist or cannot be accessed, then fail. So better to
> just make it unambiguous.

As I said earlier, I think a slash [1] is enough to remove ambiguity.

> Parsing UDS addresses would be an extension of ``VSS_Resolver``.

Not if we don't mix paths with IPs/domains

> The name ``.host`` in a backend definition becomes a bit peculiar if
> its value can also be a UDS (we will see a number or examples like
> this). We could:
>
> * stay with the name ``.host``, and document the fact that it might
> not identify a host in some cases
> * replace ``.host`` with a name like ``.peer``, sacrificing backward
> compatibility
> * introduce ``.peer``, retain ``.host`` as a deprecated alias, and
> remove ``.host`` in a future release
>
> I suggest the last option, comments welcome.

Once again, I suggest we don't mix them up so that we don't need to
break anything. I also find .peer ambiguous.

> ``.port`` in a backend definition is already optional, and is
> unnecessary for a UDS. Should it be an error to specify a port when a
> UDS is specified, or should it be ignored? Comments welcome.

As stated above, mutually exclusive with the .path field.

> ## Access permissions on the listen address
> For the ``-a`` address, I suggest an optional means of specifying who
> can access the UDS:
> ```
> varnishd -a unix:/path/to/uds:uid=foo,gid=bar
> ```
> There's an issue here in that the separator (``:`` in the example)
> could not appear in any UDS path. We might just have to forbid a
> certain character in UDS paths. Fortunately we don't have a such a
> problem with backend addresses (which are generated by another server,
> so we have less freedom to impose restrictions on the path names).

I would use the comma separator like -j and -s options for jails and
storage backends. Possibly named parameters like in -j so that order
doesn't matter. But that means breaking the syntax so that the protocol
(HTTP/1 or PROXY) requires a name too.

Example:

    varnishd -a /path/to/socket,uid=...,gid=...,proto=PROXY

<...snip...>

> If no access restrictions were requested, then don't manipulate
> ownership, let bind create the UDS, and set its permissions to 0666.

Wouldn't it be based on umask instead?

<...snip...>

> A minor issue is that the name ``VTCP`` (all of the ``VTCP_*``
> functions, the source name ``vtcp.c``, etc.) becomes a misnomer if it
> also covers UDS. We could just live with that. OTOH a single git
> commit could change it all at once, although we might have to bikeshed
> over a new name (``VSOCK``?).

<...snip...>

> So it appears that the least common denominator is EUID and EGID
> (assuming that's what you get in Linux). I suggest that we just go
> with that, to be used as described below.

That, or nothing for starters. UDSs, huge win already.

<...snip...>

> ## VCL/VRT
> Additions and changes to VCL and VRT involve:
> * VCL variables ``*.ip``: ``client.ip``, ``local.ip``, ``server.ip``,
> ``remote.ip`` and ``beresp.backend.ip``

Or we could leave them alone and introduce a new {server,local}.path
field and depending on the type of connection one of .ip and .path is
null. It is up to the VCL code to check that, and existing operations
such as matching ACLs would fail with a null IP.

> * VCL data type IP

Or a separate type altogether (eg. UNIX).

> * introducing VMOD std functions to return the uid and gid for the
> ``*.ip`` objects, as numbers or names

Yes, similar to std.port

> * extending ACLs to specify UDSen and optionally peer credentials

Not compelling.

> * VRT: types ``VCL_IP`` and ``struct vrt_backend``, and the VRT
> functions related to ``VCL_IP`` and ``suckaddr``
>
> The ``*.ip`` variables essentially encapsulate suckaddrs, which we
> don't have to change. For the string conversion, if the suckaddr wraps
> a sockaddr_un, then return the UDS path.

Again, not sure we should mix them up.

> Here again we have the problem that the names ``*.ip`` are
> inappropriate, since the value could be a UDS. Again I suggest the
> strategy of introducing a new name, in this case ``*.addr``, and
> deprecating the old names, but leaving the old names around until a
> future release.

Again, not mixing them up won't leave us with deprecated syntax.

> ``VCL_IP`` is just a suckaddr, so we don't have to change anything,
> but we have another inappropriate name for UDSen. The same goes for
> data type ``IP``. Again I suggest the strategy of introducing new
> names, ``ADDR`` and ``VCL_ADDR`` (``VCL_ADDR`` defined as exactly the
> same typedef as ``VCL_IP``), and deprecating the old names.

bis repetita

> I suggest adding functions like these to VMOD std, with the obvious
> implementations:
> * ``INT uid_number(ADDR addr, INT fallback)``
> * ``STRING uid_name(ADDR addr, STRING fallback)``
> * ``INT gid_number(ADDR addr, INT fallback)``
> * ``STRING gid_name(ADDR addr, STRING fallback)``

So maybe we should not support uid/gid for starters :)

> Of course these would always return the fallbacks for non-UDS addresses.
>
> ACLs can be extended to include paths for a UDS and restrictions on the uid/gid:
> ```
> acl foo {
>     "/path/to/uds";
>     "/path/with/a/*/wildcard";
>     "/path/with/a/uid/restriction",uid=4711;
>     "/path/with/more/r?strictions",uid=foo,gid=bar;
> }
> ```
>
> So we can: name UDS paths in an ACL, allow filename globbing, include
> restrictions on the uid and gid, and allow both numbers and names for
> uid/gid.

Not compelling.

> I'm not sure what to do about ``struct vrt_backend``, which currently
> has fields for IPv4 and IPv6 addresses, both as strings and suckaddrs.
> I doubt that it makes sense just to add the same fields for UDS
> addresses, since the point is that a backend may have both kinds of IP
> addresses, but it won't also have a UDS address at the same time.
>
> We might have to introduce something like this:
> ```
> union addr {
>     struct {
>         char *ipv4_addr;
>         char *ipv6_addr;
>         struct suckaddr *ipv4_suckaddr;
>         struct suckaddr *ipv6_suckaddr;
>     } ip;
>     struct {
>         char *path;
>         struct suckaddr *uds_suckaddr;
>     } uds;
> };
> ```
> ... and then use the union type for the "address" field of the backend
> definition -- it's either an IP address, which can be one or both of
> IPv4 and IPv6, or a UDS. Comments welcome.

Or keep those fields mutually exclusive and handle the differences in the
VBE subsystem for the backend side. On the client side I haven't given
much thought to the mechanics.

> I think that the VRT functions that currently use ``VCL_IP`` and
> suckaddrs can be adapted either without changes or very
> straightforwardly, but again we'll want to introduce "addr" where "ip"
> currently appears in the names, and deprecate the old names:
> * ``VRT_acl_match``: use the ``VCL_ADDR`` type in the signature
> * ``VRT_ipcmp``: no change
> * ``VRT_IP_string``: introduce ``char *VRT_ADDR_string(VRT_CTX,
> VCL_ADDR)`` with the same function, and deprecate the old one

I do not need to say again that I'm not thrilled by the idea of mixing
them up :)

As promised I shared my feedback. This looks like more work than I
anticipated, but I'd really love to see UDS support landing in Varnish.

Cheers,
Dridi

[1] we can't have a slash in a domain name, right?