[CHERI] 54273f22f Second part of the CHERI saga

Poul-Henning Kamp phk at FreeBSD.org
Tue Nov 29 12:48:09 UTC 2022


commit 54273f22fd18404839426b496b2bacc09c37ca9f
Author: Poul-Henning Kamp <phk at FreeBSD.org>
Date:   Sat Nov 26 10:48:34 2022 +0000

    Second part of the CHERI saga

diff --git a/doc/sphinx/phk/cheri1.rst b/doc/sphinx/phk/cheri1.rst
index ce27d6495..03545e568 100644
--- a/doc/sphinx/phk/cheri1.rst
+++ b/doc/sphinx/phk/cheri1.rst
@@ -173,10 +173,10 @@ First test-run
 Just to see how bad it is, we run the main test-scripts::
 
     % cd bin/varnishtest
-    % ./varnistest -i -k -q tests/*.vtc
+    % ./varnishtest -i -k -q tests/*.vtc
     […]
     38 tests failed, 33 tests skipped, 754 tests passed
 
 That's not half bad…
 
-/phk
+*/phk*
diff --git a/doc/sphinx/phk/cheri2.rst b/doc/sphinx/phk/cheri2.rst
new file mode 100644
index 000000000..9c675af29
--- /dev/null
+++ b/doc/sphinx/phk/cheri2.rst
@@ -0,0 +1,121 @@
+.. _phk_cheri_2:
+
+How Varnish met CHERI 2/N
+=========================
+
+CHERI capabilities are twice the size of pointers, and Varnish not
+only uses a lot of pointers per request, it is also stingy with
+RAM, because it is not uncommon to use 100K worker threads.
+
+A number of test-cases fail because they are too stingy with memory
+allocations.  I will deal with them as I get to them, and merely
+note them here as part of the accounting::
+
+    Increase workspace
+    ==================
+    TEST tests/c00108.vtc
+    TEST tests/r01038.vtc
+    TEST tests/r01120.vtc
+    TEST tests/r02219.vtc
+    TEST tests/o00005.vtc
+
+Things you cannot do under CHERI: Pointers in Pipes
+---------------------------------------------------
+
+Varnish has a central "waiter" service, whose job it is to monitor
+file descriptors for idle network connections, and do the right thing
+if data arrives on them, or if they are closed, or should be closed
+after a timeout.
+
+For reasons of performance, we have multiple implementations:
+``kqueue(2)`` (BSD), ``epoll(2)`` (Linux), ``ports(2)`` (Solaris)
+and ``poll(2)`` which should work everywhere POSIX has been read.
+
+We only have the ``poll(2)`` based waiter for portability: it is one
+less issue to deal with during bring-up on new platforms, but its
+performance degrades to uselessness with contemporary loads
+of open network connections.
+
+The way they all work is that we have a single thread sitting
+in the relevant system-call, monitoring tens of thousands
+of file descriptors.
+
+Some of those system calls allow other threads to add fds to the
+list, but ``poll(2)`` does not, so when we start the poll-waiter
+we create a ``pipe(2)``, and have the waiter-thread listen to that
+too.
+
+When another thread wants to add a file descriptor to the inventory,
+it uses ``write(2)`` to send a pointer into that pipe.  The kernel
+provides all the locking and buffering for us, wakes up the waiter-thread
+which reads the pointer, adds the new fd to its inventory and dives
+back into ``poll(2)``.
+
+This is 100% safe, because nobody else can get to a pipe created
+with ``pipe(2)``, but there is no way CHERI could spot that to
+make an exception, so reading pointers out of a file descriptor
+causes fully justified core-dumps.
+
+If the poll-waiter was actually relevant, the proper fix would be
+to let the sending thread stick things on a locked list and just
+write a nonce-byte into the pipe to the waiter-thread, but that
+goes at the bottom of the TODO list, and for now I just remove the
+-Wpoll argument from five tests, which then pass::
+
+    TEST tests/b00009.vtc
+    TEST tests/b00048.vtc
+    TEST tests/b00054.vtc
+    TEST tests/b00059.vtc
+    TEST tests/c00080.vtc
+
+But why five tests ?
+
+It looks like one to test the poll-waiter and four cases of copy&paste.
+
+Never write your own Red-Black Trees
+------------------------------------
+
+In general there are few pieces of code I dare not wade into,
+but there is a LOT of code I don't want to touch, if there
+is any way to avoid it.
+
+Red-Black trees are one of them.
+
+In Varnish we stol^H^H^H^H imported both ``<queue.h>`` and ``<tree.h>``
+from FreeBSD, but as a safety measure we stuck a ``V`` prefix on
+everything in them.
+
+Every so often I will run a small shell-script which does the
+v-thing and compares the result to ``vtree.h`` and ``vqueue.h``,
+to keep up with FreeBSD.
+
+Today that paid off handsomely:  Some poor person on the CHERI
+team had to wade into ``tree.h`` and stick ``__no_subobject_bounds``
+directives on pointers to make that monster work under CHERI.
+
+I just ran my script and 20 more tests pass::
+
+    TEST tests/b00068.vtc
+    TEST tests/c00005.vtc
+    TEST tests/e00003.vtc
+    TEST tests/e00008.vtc
+    TEST tests/e00019.vtc
+    TEST tests/l00002.vtc
+    TEST tests/l00003.vtc
+    TEST tests/l00005.vtc
+    TEST tests/m00053.vtc
+    TEST tests/r01312.vtc
+    TEST tests/r01441.vtc
+    TEST tests/r02451.vtc
+    TEST tests/s00012.vtc
+    TEST tests/u00004.vtc
+    TEST tests/u00010.vtc
+    TEST tests/v00009.vtc
+    TEST tests/v00011.vtc
+    TEST tests/v00017.vtc
+    TEST tests/v00041.vtc
+    TEST tests/v00043.vtc
+
+Only nine failing tests left now.
+
+*/phk*
diff --git a/doc/sphinx/phk/index.rst b/doc/sphinx/phk/index.rst
index 755e75d5c..f5741e337 100644
--- a/doc/sphinx/phk/index.rst
+++ b/doc/sphinx/phk/index.rst
@@ -13,6 +13,7 @@ You may or may not want to know what Poul-Henning thinks.
 .. toctree::
 	:maxdepth: 1
 
+	cheri2.rst
 	cheri1.rst
 	routine.rst
 	503aroundtheworld.rst
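
The "proper fix" mentioned in cheri2.rst (a locked list plus a nonce byte written into the pipe) could be sketched roughly as below. This is only an illustration under stated assumptions, not the Varnish code: ``struct wait_req``, ``struct waiter``, ``waiter_init()``, ``waiter_enter()`` and ``waiter_collect()`` are hypothetical names invented for this sketch. The point it shows is that the pointer itself never travels through the kernel; only a single wakeup byte does.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical request record; not a Varnish structure. */
struct wait_req {
    int fd;                     /* file descriptor to start monitoring */
    struct wait_req *next;
};

/* Hypothetical waiter state: a mutex-protected list plus a wakeup pipe. */
struct waiter {
    pthread_mutex_t mtx;
    struct wait_req *head;      /* pending requests, newest first */
    int pipe_fds[2];            /* [0] is polled by the waiter thread */
};

int
waiter_init(struct waiter *w)
{
    w->head = NULL;
    if (pthread_mutex_init(&w->mtx, NULL) != 0)
        return (-1);
    return (pipe(w->pipe_fds));
}

/* Sender side: queue the request in memory, then write one nonce byte. */
void
waiter_enter(struct waiter *w, struct wait_req *r)
{
    char nonce = 0;
    ssize_t n;

    pthread_mutex_lock(&w->mtx);
    r->next = w->head;
    w->head = r;
    pthread_mutex_unlock(&w->mtx);

    n = write(w->pipe_fds[1], &nonce, 1);
    assert(n == 1);
}

/*
 * Waiter side: call only after poll(2) reported the pipe readable.
 * Drains up to 64 wakeup bytes and detaches the whole pending list.
 * May return NULL if an earlier call already took the requests.
 */
struct wait_req *
waiter_collect(struct waiter *w)
{
    char buf[64];
    struct wait_req *r;
    ssize_t n;

    n = read(w->pipe_fds[0], buf, sizeof buf);
    assert(n > 0);

    pthread_mutex_lock(&w->mtx);
    r = w->head;
    w->head = NULL;
    pthread_mutex_unlock(&w->mtx);
    return (r);
}
```

The waiter thread would include ``pipe_fds[0]`` in its ``poll(2)`` set and call ``waiter_collect()`` when it becomes readable, then add each returned ``fd`` to its inventory. Because the pointers stay in ordinary memory, their CHERI capability tags are never stripped by a trip through a file descriptor.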

