[master] 5d4e916 More 10 year aniversary late-night ruminations.

Poul-Henning Kamp phk at FreeBSD.org
Tue Mar 1 02:08:32 CET 2016

commit 5d4e9160fafbc571b2d5e056f8ee02f65242b681
Author: Poul-Henning Kamp <phk at FreeBSD.org>
Date:   Tue Mar 1 01:07:19 2016 +0000

    More 10 year aniversary late-night ruminations.

diff --git a/doc/sphinx/phk/index.rst b/doc/sphinx/phk/index.rst
index 65108ed..0556769 100644
--- a/doc/sphinx/phk/index.rst
+++ b/doc/sphinx/phk/index.rst
@@ -8,6 +8,7 @@ You may or may not want to know what Poul-Henning thinks.
 .. toctree::
 	:maxdepth: 1
+	thatslow.rst
diff --git a/doc/sphinx/phk/thatslow.rst b/doc/sphinx/phk/thatslow.rst
new file mode 100644
index 0000000..f35b65b
--- /dev/null
+++ b/doc/sphinx/phk/thatslow.rst
@@ -0,0 +1,194 @@
+.. _phk_that_slow:
+
+Going fast slowly
+=================
+
+If I count in my source tree, right here and now, Varnish has 100K lines
+of source code::
+
+	75619 lines in .c files
+	18489 lines in .h files
+	 2625 lines in .py files
+	  670 lines in .vcc files
+	  501 lines in .vcl files
+
+A little over 20K lines of testcases::
+
+	21777 lines in .vtc files
+
+A little over 20K lines of documentation::
+
+	22169 lines in .rst files
+
+And probably about 5K lines of "misc"::
+
+	 1393 lines in .am files
+	  712 lines in .ac files
+	  613 lines in .lnt files
+
+For the sake of simplicity, let us call it a round 150K total lines [#f1]_.
+Varnish has been in existence for 10 years, so that's 15K lines per year.
+200 workdays a year makes that 75 lines a day.
+7.5 hours of work per day gives 10 lines per hour.
+Even though I have written the vast majority of the source code,
+Varnish is far from a one-person project.
+I have no way to estimate the average number of full time persons
+over the last ten years, so let's pick the worst case and say that
+only two persons were full time.
+It follows that there is *no way* the average output of those two persons
+exceeded 5 lines per hour, measured over the ten year history of
+the project.
+Does that number seem low or high to you ?
+
+Anyway, What do programmers do all day?
+---------------------------------------
+
+(`Yeah, yeah, yeah, I know... <http://xkcd.com/303/>`_)
+Back before the dot-com disaster, people had actually spent considerable
+time and effort to find out what kind of productivity to expect from
+a programmer, after all, how could you ever estimate a project without
+knowing that crucial number?
+The results were all over the place, to put it mildly, but they were
+universally much lower than everybody expected.
+With his seminal The Mythical Man-Month, Frederick P. Brooks brought
+the ballpark estimate "10 lines per programmer per day" into common
+use, despite everything he wrote in the text surrounding that number
+arguing for the exact opposite.
+With the ultimate focus on quality and correctness, for instance
+the Apollo and Space Shuttle software, productivity drops to less
+than one line of code per day per employee.
+The estimated upper bound on Varnish productivity is almost an order
+of magnitude above Brooks' ballpark estimate, and another easily
+ignorable magnitude away from the unrealistic goal of being the
+same quality as the software for the Space Shuttle.
+So we are inside Brooks' ballpark, even if a bit on the high side [#f2]_.
+
+What took us so long ?
+----------------------
+
+The surprise over the 5LOC/h number is undoubtedly inversely
+proportional to the age of the reader.
+Back when I was a kid I could write 1000 lines in a single
+sleep-deprived session across midnight [#f3]_, but it didn't take
+that long before I discovered that I had to throw out most of it
+once I woke up again.
+I was 40 years old when I started Varnish and I had 22 years
+of professional experience, a *lot* of them spent staring at, and
+often fixing/improving/refactoring, other people's source code.
+Over the years I came to appreciate Antoine de Saint-Exupéry's words:
+
+	Perfection is attained, not when there is nothing more
+	to add, but when there is nothing more to remove.
+
+And eventually I no longer think about code lines as an asset
+to be accumulated, but rather as an expenditure to be avoided.
+When I started Varnish, one of my main personal goals was to 
+make it my highest quality program - ever [#f4]_.
+This is why Varnish is written in "pidgin C" style and lousy with
+asserts which don't do anything [#f5]_, except clarify programmer
+intent [#f6]_, and in case of mistakes, stop bad things before they
+get out of hand.
+And this is why there are other "pointless overheads" in the Varnish
+source code, from the panic/backtrace code over the "miniobj"
+type-safety to obscure hints to Gimpel Software's FlexeLint product.
+Needless to say, it is also not by accident that the 20K lines of
+testcases exercise over 90% of the varnishd source code lines.
+And insisting on doing things right, rather than *"we can fix it
+properly later"* which is so widespread in FOSS source code [#f7]_,
+is not going to boost your line count either.
+
+But did it work ?
+-----------------
+
+A 10 year project anniversary is a good reason to stop and see
+if the expected roses are there to be smelled.
+We have lots of numbers: commits (10538), bug reports (1864), CVEs
+(2) [#f8]_ or Coverity detections (a dozen?), but it is pretty nigh
+impossible to measure program quality, even though we tend to know
+it when we see it.
+There are also uncountable events which should be in the ledger,
+503s [#f9]_, crashes, hair-tearing, head-scratching, coffee-drinking,
+manual- and source-code thumbing and frustrated cries of help on IRC.
+In the other cup there are equally intangible positives, pats on
+the shoulder, free beers, X-mas and birthday presents from my Amazon
+wish-list (Thanks!), and more snarky tweets about how great Varnish
+is than I can remember.
+All in all, the best I have been able to do, to convince myself that
+I have not *totally* missed my goal, is a kind of "The curious case
+of the dog in the night-time" observation:
+I have never yet had a person tell me Varnish made their life more
+miserable.
+I'll take that.
+.. rubric:: Footnotes
+.. [#f1] We can do a better and more precise estimate if we want.
+   For instance we have not typed in the 30 line BSD-2 blurb *all*
+   314 times, and upwards of 30% of the rest are blank lines.
+   However, there is no way we can reduce the number by an order
+   of magnitude, in particular not because code that was written
+   and subsequently removed is not part of the base data.
+.. [#f2] Which is to be expected really: We don't program on punched cards.
+.. [#f3] And I did.  Migrating an oil company from IBM mainframes
+   to 16-bit UNIX computers in 198x was an interesting challenge.
+.. [#f4] Having half the world adopt your hastily hacked up md5crypt
+   with a glaringly obvious, but fortunately harmless, bug will do
+   that to you.
+.. [#f5] Roughly 10% of the source code lines were asserts last I looked.
+.. [#f6] I prefer asserts over comments for this, since the compiler
+   can also see them.  The good news is, the compiler can also see
+   that they don't do anything so a lot fewer are present in the
+   binary program.  Interestingly, a couple of them allow the compiler
+   to optimize much harder.  No, I won't tell you which those are.
+.. [#f7] The only code where that is a bigger problem is PhD-ware: software
+   written as proof-of-concept and abandoned in haste when the diploma
+   was in hand.
+.. [#f8] Obviously, a high count of CVEs should be a real reason for
+   concern, but there is no meaningful difference between having
+   one, two or three CVEs over the course of ten years.
+   The two CVEs against Varnish were both utterly bogus "trophy-hunter"
+   CVEs in my opinion.  (But don't take my word for it, judge for
+   yourself.)
+.. [#f9] There used to be a link back to the Varnish project on the
+   default.vcl's 503 page, but we removed it after a large national
+   institution in a non-English country showed it to a *lot* of people
+   who clicked on the only link they could see on the page.
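
The back-of-the-envelope arithmetic in the essay (a round 150K lines, 10 years, 200 workdays a year, 7.5 hours a day, at least two full-time people) can be checked with a minimal Python sketch. The line counts are the ones quoted in the essay; the function name and its parameter names are illustrative assumptions and do not come from the Varnish tree.

```python
# Line counts per file type, as quoted in the essay.
LINE_COUNTS = {
    ".c": 75619, ".h": 18489, ".py": 2625, ".vcc": 670, ".vcl": 501,
    ".vtc": 21777, ".rst": 22169,
    ".am": 1393, ".ac": 712, ".lnt": 613,
}


def lines_per_hour(total_lines, years=10, workdays_per_year=200,
                   hours_per_day=7.5, people=1):
    """Average lines produced per person per working hour (hypothetical helper)."""
    return total_lines / (years * workdays_per_year * hours_per_day * people)


# Sum of the listed file types only; the essay rounds the total up to 150K.
counted_total = sum(LINE_COUNTS.values())
rounded_total = 150_000

project_rate = lines_per_hour(rounded_total)               # whole project
per_person_rate = lines_per_hour(rounded_total, people=2)  # two full-time people

print(counted_total)    # 144568
print(project_rate)     # 10.0
print(per_person_rate)  # 5.0
```

Using the counted total instead of the rounded 150K moves the project-wide figure only slightly below 10 lines per hour, so the rounding does not change the conclusion.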