Varnish jails, priv-sep, packaging etc.

Mon Apr 13 21:57:12 CEST 2015

I have spent time over Easter thinking about the jail/priv-sep thing,
and in particular about the child process.

(All of this applies only when varnishd is started as root.)

Assume the worker process has been possessed.

How can we prevent it from making the possession persistent ?

For starters it should not be able to manipulate the compiled VCL
shlib files.

Today we use varnish:varnish ($params) for all the privsep subprocesses
(i.e. VCC/CC/DLOPEN/WORKER).

This means that a possessed worker process can in principle replace
the compiled VCL shlib files (because permissions allow CC to write them.)

That could be remedied by changing the ownership of the -n directory
and having the VCC/CC/DLOPEN operate out of a varnish:varnish owned
subdirectory.

But the possessed worker could just continuously scan the -n directory,
jump in while a VCL compilation is happening, and corrupt the result.
So this limits the opportunities somewhat, but it doesn't close the hole.

There is also the general question of file access: I don't think it is
safe to assume that the worker should have access to everything VCC or CC
have access to.

This argues strongly for a separate uid for the worker process.

The next question is the ownership of the default secret file.  If people
specify one with -S it is not our problem, but in the default config it is.

The worker should certainly not have access to this file and
should absolutely not be able to write or replace it, which it can today
because the -n directory is varnish:varnish.

So the result I come up with is the following:

Assume that varnishd is started as root:wheel

Assume that we have a "varnish" group and "vadmin" and "vrun" users.

	-n directory	root:varnish 755
			We cannot make it 750, because then admins
			with wheel group can not get to the secret/vsm files.

	_.secret	root:wheel 440

	_.vsm		vadmin:varnish 644

	vcl$shlib	vadmin:varnish 750

Subprocesses VCC and CC run as vadmin:varnish.  The master process creates
a temporary compilation subdirectory under ${-n} with vadmin:varnish 750
and once the shlib is done, moves it up to ${-n} and changes its
mode, owner and group (m:o:g).
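
To make the compile-then-move step concrete, here is a minimal sketch in
Python.  It is not how varnishd does it; the names "vcl_build.", "vgc.so"
and "vcl_boot.so" are made up for illustration, and the chown is skipped
unless a uid/gid is passed in, since only root can give files away.

```python
import os
import stat
import tempfile


def make_compile_dir(n_dir, mode=0o750):
    """Create a private compilation subdirectory under the -n directory."""
    path = tempfile.mkdtemp(prefix="vcl_build.", dir=n_dir)
    os.chmod(path, mode)  # mkdtemp creates 0700; set the requested mode
    return path


def publish_shlib(build_path, n_dir, name, mode=0o750, uid=-1, gid=-1):
    """Move a finished shlib up into n_dir, then set mode, owner and group."""
    final_path = os.path.join(n_dir, name)
    os.rename(build_path, final_path)  # atomic within one filesystem
    os.chmod(final_path, mode)
    if uid != -1 or gid != -1:
        os.chown(final_path, uid, gid)  # changing the owner needs root
    return final_path


def demo():
    """Run the whole sequence in a scratch directory standing in for -n."""
    n_dir = tempfile.mkdtemp()
    build_dir = make_compile_dir(n_dir)
    build_path = os.path.join(build_dir, "vgc.so")  # hypothetical file name
    with open(build_path, "wb") as f:
        f.write(b"placeholder for compiler output")
    final_path = publish_shlib(build_path, n_dir, "vcl_boot.so")
    os.rmdir(build_dir)  # compile directory is empty after the move
    return final_path, stat.S_IMODE(os.stat(final_path).st_mode)
```

The point of the rename is that the worker only ever sees a complete
file appear in ${-n}; it never gets to watch one being written there.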

Subprocesses DLOPEN and WORKER run as vrun:varnish and therefore cannot
read the _.secret file, and cannot write or replace vcl shlib files.
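
The claims above can be checked against the proposed layout with a toy
model of the classic owner/group/other permission test.  This is only a
sketch: it ignores root, ACLs and supplementary details, and the key
"vcl_shlib" is just a stand-in for the vcl$shlib files.

```python
from collections import namedtuple

Entry = namedtuple("Entry", "owner group mode")
Proc = namedtuple("Proc", "user groups")

# The proposed layout, as listed in the table above.
LAYOUT = {
    "n_dir": Entry("root", "varnish", 0o755),
    "_.secret": Entry("root", "wheel", 0o440),
    "_.vsm": Entry("vadmin", "varnish", 0o644),
    "vcl_shlib": Entry("vadmin", "varnish", 0o750),
}

VADMIN = Proc("vadmin", {"varnish"})  # VCC and CC
VRUN = Proc("vrun", {"varnish"})      # DLOPEN and WORKER

READ, WRITE, SEARCH = 4, 2, 1


def allowed(proc, entry, want):
    """Owner/group/other check for a non-root process."""
    if proc.user == entry.owner:
        bits = entry.mode >> 6
    elif entry.group in proc.groups:
        bits = entry.mode >> 3
    else:
        bits = entry.mode
    return ((bits & 7) & want) == want


def can_read(proc, name):
    """Reading needs search on the directory and read on the file."""
    return allowed(proc, LAYOUT["n_dir"], SEARCH) and allowed(proc, LAYOUT[name], READ)


def can_write(proc, name):
    """Writing in place needs search on the directory and write on the file."""
    return allowed(proc, LAYOUT["n_dir"], SEARCH) and allowed(proc, LAYOUT[name], WRITE)


def can_replace(proc):
    """Replacing a file (unlink + create) needs write and search on the directory."""
    return allowed(proc, LAYOUT["n_dir"], WRITE | SEARCH)
```

With this model, can_read(VRUN, "_.secret"), can_write(VRUN, "vcl_shlib")
and can_replace(VRUN) all come out False, while can_read(VRUN, "_.vsm")
is True.  Note that can_replace(VADMIN) is False as well, which is why
the master process has to do the move into ${-n}.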

In addition:

	The -j argument "ccgroup" is bestowed on the CC process

	The -j argument "secretgroup" overrides the primary group
	from the master process for the _.secret file (allowing it
	to be set for instance to "operator").

I have given up on using "nobody" instead of "vrun" for two reasons:

First "nobody" is used for all sorts of stuff, and it's only a matter
of time before there is an unintended consequence of that.

But more importantly, by having a specific uid for the worker process,
we give admins a way to control access to files VMODs need, GEO-IP
databases etc., and it makes it possible to isolate directories
which VMODs can create and write files to -- "nobody" would be very
inappropriate for that.

So is this acceptable from a packaging perspective ?

What should we call the users ?

The vadmin one could simply be "varnish", but what do we call the
vrun user ?  I think we have to respect the historical 7-char limit
so "varnish-run" is out of reach, and "vrun" is only logical to VTLA
aficionados like us.

Comments, inputs, ideas ?

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.