[master] d3a647c A handful of design documents from the very start of Varnish

Poul-Henning Kamp phk at FreeBSD.org
Sat Feb 20 22:08:20 CET 2016


commit d3a647c5280bfe4fda30a7b94f7057fc4d3754e8
Author: Poul-Henning Kamp <phk at FreeBSD.org>
Date:   Sat Feb 20 21:07:48 2016 +0000

    A handful of design documents from the very start of Varnish

diff --git a/doc/sphinx/phk/firstdesign.rst b/doc/sphinx/phk/firstdesign.rst
new file mode 100644
index 0000000..216101a
--- /dev/null
+++ b/doc/sphinx/phk/firstdesign.rst
@@ -0,0 +1,1635 @@
+.. _phk_firstdesign:
+
+===========================
+The first design of Varnish
+===========================
+
+I have been working on a "bit-storage" facility for datamuseum.dk,
+and as part of my "eat your own dog-food" policy, I am converting my
+own personal archive (41 DVDs' worth) as a test.
+
+Along the way I passed through 2006 and found some files from
+the birth of Varnish 10 years ago.
+
+The first Varnish Design notes
+------------------------------
+
+These are notes I took during a meeting in Oslo on February 2nd 2006,
+which in essence consisted of Anders Berg cursing Squid for a couple
+of hours.
+
+(Originally the meeting was scheduled for January 24th, but a SAS pilot
+strike put an end to that.)
+
+To be honest, I knew very little about web-traffic (my own homepage
+was written in HTML in vi(1)), so I had a bit of catching up to do
+on that front, but the overall job was pretty simple:  a program
+to move bytes ... fast.
+
+It is quite interesting to see how many things we got right and
+where we kept thinking in the old frame of reference (i.e. Squid)::
+
+	Notes on Varnish
+	----------------
+
+	Collected 2006-02-02 to 2006-02-..
+
+	Poul-Henning Kamp
+
+
+	Philosophy
+	----------
+
+	It is not enough to deliver a technically superior piece of software
+	if it is not possible for people to deploy it usefully in a sensible
+	and timely fashion.
+
+
+	Deployment scenarios
+	--------------------
+
+	There are two fundamental usage scenarios for Varnish: when the
+	first machine is brought up to offload a struggling backend and
+	when a subsequent machine is brought online to help handle the load.
+
+
+	The first (layer of) Varnish
+	----------------------------
+
+	Somebody's webserver is struggling and they decide to try Varnish.
+
+	Often this will be a skunkworks operation with some random PC
+	purloined from wherever it wasn't being used and the Varnish "HOWTO"
+	in one hand.
+
+	If they do it in an orderly fashion before things reach panic proportions,
+	a sensible model is to set up the Varnish box, test it out from your
+	own browser, see that it answers correctly.  Test it some more and
+	then add the IP# to the DNS records so that it takes 50% of the load
+	off the backend.
+
+	If it happens as firefighting at 3AM, the backend will be moved to
+	another IP, the Varnish box given the main IP, and things had better
+	work really well, really fast.
+
+	In both cases, it would be ideal if all that is necessary to tell
+	Varnish is two pieces of information:
+
+		Storage location
+			Alternatively we can offer an "auto" setting that makes
+			Varnish discover what is available and use what it finds.
+
+		DNS or IP# of backend.
+
+			IP# is useful when the DNS settings are not quite certain
+			or when split-horizon DNS setups are used.
+
+	Ideally this can be done on the commandline so that there is no
+	configuration file to edit to get going, just
+
+		varnish -d /home/varnish -s backend.example.dom
+
+	and you're off running.
+
+	A text, curses or HTML based facility to give some instant
+	feedback and stats is necessary.
+
+	If circumstances are not conducive to a structured approach, it should
+	be possible to repeat this process and set up N independent Varnish
+	boxes and get some sort of relief without having to read any further
+	documentation.
+
+
+	The subsequent (layers of) Varnish
+	----------------------------------
+
+	This is what happens once everybody has caught their breath,
+	and where we start to talk about Varnish clusters.
+
+	We can assume that at this point, the already installed Varnish
+	machines have been configured more precisely and that people
+	have studied Varnish configuration to some level of detail.
+
+	When Varnish machines are put in a cluster, the administrator should
+	be able to consider the cluster as a unit and not have to think
+	about and interact with the individual nodes.
+
+	Some sort of central management node or facility must exist and
+	it would be preferable if this was not a physical but a logical
+	entity so that it can follow the admin to the beach.  Ideally it
+	would give basic functionality in any browser, even mobile phones.
+
+	The focus here is scalability; we want to avoid per-machine
+	configuration if at all possible.  Ideally, preconfigured hardware
+	can be plugged into power and net, find an address with DHCP, contact
+	the preconfigured management node, get a configuration and start working.
+
+	But we also need to think about how we keep a site of Varnish
+	machines from acting like a stampeding horde when the power or
+	connectivity is brought back after a disruption.  Some sort of
+	slow starting ("warm-up" ?) must be implemented to prevent them
+	from hitting the backend with their full force.
+
+	An important aspect of cluster operations is giving a statistically
+	meaningful judgement of the cluster size, in particular answering
+	the question "would adding another machine help ?" precisely.
+
+	We should have a facility that allows the administrator to type
+	in a REGEXP/URL and have all the nodes answer with a checksum, age
+	and expiry timer for any documents they have which match.  The
+	results should be grouped by URL and checksum.
+
+
+	Technical concepts
+	------------------
+
+	We want the central Varnish process to be that, just one process, and
+	we want to keep it small and efficient at all cost.
+
+	Code that will not be used for the central functionality should not
+	be part of the central process.  For instance code to parse, validate
+	and interpret the (possibly) complex configuration file should be a
+	separate program.
+
+	Depending on the situation, the Varnish process can either invoke
+	this program via a pipe or receive the ready to use data structures
+	via a network connection.
+
+	Exported data from the Varnish process should be made as cheap as
+	possible, likely shared memory.  That will allow us to deploy separate
+	processes for log-grabbing, statistics monitoring and similar
+	"off-duty" tasks and let the central process get on with the
+	important job.
+
+
+	Backend interaction
+	-------------------
+
+	We need a way to tune the backend interaction further than what the
+	HTTP protocol offers out of the box.
+
+	We can assume that all documents we get from the backend have an
+	expiry timer; if not, we will set a default timer (configurable of
+	course).
+
+	But we need further policy than that.  Amongst the questions we have
+	to ask are:
+
+		How long after the expiry can we serve a cached copy
+		of this document while we have reason to believe the backend
+		can supply us with an update ?
+		
+		How long after the expiry can we serve a cached copy
+		of this document if the backend does not reply or is
+		unreachable ?
+
+		If we cannot serve this document out of cache and the backend
+		cannot inform us, what do we serve instead (404 ?  A default
+		document of some sort ?)
+
+		Should we just not serve this page at all if we are in a
+		bandwidth crush (DoS/stampede) situation ?
+
+	It may also make sense to have an "emergency detector" which triggers
+	when the backend is overloaded and offer a scaling factor for all
+	timeouts for when in such an emergency state.  Something like "If
+	the average response time of the backend rises above 10 seconds,
+	multiply all expiry timers by two".
+
+	It probably also makes sense to have a bandwidth/request traffic
+	shaper for backend traffic to prevent any one Varnish machine from
+	pummeling the backend in case of attacks or misconfigured
+	expiry headers.
+
+
+	Startup/consistency
+	-------------------
+
+	We need to decide what to do about the cache when the Varnish
+	process starts.  There may be a difference between it starting
+	first time after the machine booted and when it is subsequently
+	(re)started.
+
+	By far the easiest thing to do is to disregard the cache, that saves
+	a lot of code for locating and validating the contents, but this
+	carries a penalty in backend or cluster fetches whenever a node
+	comes up.  Let's call this the "transient cache model".
+
+	The alternative is to allow persistently cached contents to be used
+	according to configured criteria:
+
+		Can expired contents be served if we can't contact the
+		backend ?  (dangerous...)
+
+		Can unexpired contents be served if we can't contact the
+		backend ?  If so, how much past the expiry ?
+
+	It is a very good question how big a fraction of the persistent
+	cache would be usable after typical downtimes:
+
+		After a Varnish process restart:  Nearly all.
+
+		After a power-failure ?  Probably at least half, but probably
+		not the half that contains the most busy pages.
+
+	And we need to take into consideration whether validating the format and
+	contents of the cache might take more resources and time than getting
+	the content from the backend.
+
+	Off the top of my head, I would prefer the transient model any day
+	because of the simplicity and lack of potential consistency problems,
+	but if the load on the back end is intolerable this may not be
+	practically feasible.
+
+	The best way to decide is to carefully analyze a number of cold
+	starts and cache content replacement traces.
+
+	The choice we make does affect the storage management part of Varnish,
+	but I see that as being modular in any instance, so it may merely be
+	that some storage modules come up clean on any start while others
+	will come up with existing objects cached.
+
+
+	Clustering
+	----------
+
+	I'm somewhat torn on clustering for traffic purposes.  For admin
+	and management: Yes, certainly, but starting to pass objects from
+	one machine in a cluster to another is likely to be just a waste
+	of time and code.
+
+	Today one can trivially fit 1TB into a 1U machine so the partitioning
+	argument for cache clusters doesn't sound particularly urgent to me.
+
+	If all machines in the cluster have sufficient cache capacity, the
+	other remaining argument is backend offloading, which would likely
+	be better mitigated by implementing a 1:10 style two-layer cluster
+	with the second level node possibly having twice the storage of
+	the front row nodes.
+
+	The coordination necessary for keeping track of, or discovering in
+	real-time, who has a given object can easily turn into a traffic
+	and cpu load nightmare.
+
+	And from a performance point of view, it only reduces quality:
+	First we send out a discovery multicast, then we wait some amount
+	of time to see if a response arrives, and only then can we start
+	to ask the backend for the object.  With a two-level cluster
+	we can ask the layer-two node right away and if it doesn't have
+	the object it can ask the back-end right away, no timeout is
+	involved in that.
+
+	Finally, consider the impact on a cluster of a "must get" object
+	like an IMG tag with a misspelled URL.  Every hit on the front page
+	results in one GET of the wrong URL.  One machine in the cluster
+	asks everybody else in the cluster "do you have this URL" every
+	time somebody gets the front page.
+
+	If we implement a negative feedback protocol ("No I don't"), then
+	each hit on the wrong URL will result in N+1 packets (assuming multicast).
+
+	If we use a silent negative protocol the result is less severe for
+	the machine that got the request, but still everybody wakes up
+	to find out that no, we didn't have that URL.
+
+	Negative caching can mitigate this to some extent.
+
+
+	Privacy
+	-------
+
+	Configuration data and instructions passed back and forth should
+	be encrypted and signed if so configured.  Using PGP keys is
+	a very tempting and simple solution which would pave the way for
+	administrators typing a short ascii encoded pgp signed message
+	into an SMS from their Bahamas beach vacation...
+
+
+	Implementation ideas
+	--------------------
+
+	The simplest storage method mmap(2)'s a disk or file and puts
+	objects into the virtual memory on page aligned boundaries,
+	using a small struct for metadata.  Data is not persistent
+	across reboots.  Object free is incredibly cheap.  Object
+	allocation should reuse recently freed space if at all possible.
+	"First free hole" is probably a good allocation strategy.
+	Sendfile can be used if filebacked.  If nothing else disks
+	can be used by making a 1-file filesystem on them.
+
+	More complex storage methods are object per file and object
+	in database models.  They are relatively trivial and well
+	understood.  May offer persistence.
+
+	Read-Only storage methods may make sense for getting hold
+	of static emergency contents from CD-ROM etc.
+
+	Treat each disk arm as a separate storage unit and keep track of
+	service time (if possible) to decide storage scheduling.
+
+	Avoid regular expressions at runtime.  If the config file contains
+	regexps, compile them into executable code and dlopen() it
+	into the Varnish process.  Use versioning and refcounts to
+	do memory management on such segments.
+
+	Avoid committing transmit buffer space until we have a bandwidth
+	estimate for the client.  One possible way:  Send the HTTP header
+	and time ACKs getting back, then calculate transmit buffer size
+	and send object.  This makes DoS attacks more harmless and
+	mitigates traffic stampedes.
+
+	Kill all TCP connections after N seconds, nobody waits an hour
+	for a web-page to load.
+
+	Abuse mitigation interface to firewall/traffic shaping:  Allow
+	the central node to put an IP/Net into traffic shaping or take
+	it out of traffic shaping firewall rules.  Monitor/interface
+	process (not main Varnish process) calls script to config
+	firewalling.
+
+	"Warm-up" instructions can take a number of forms and we don't know
+	which is the most efficient or most usable.  Here are some ideas:
+
+	    Start at these URL's then...
+
+		... follow all links down to N levels.
+
+		... follow all links that match REGEXP no deeper than N levels down.
+
+		... follow N random links no deeper than M levels down.
+
+		... load N objects by following random links no deeper than
+		    M levels down.
+
+	    But...
+
+		... never follow any links that match REGEXP
+
+		... never pick up objects larger than N bytes
+
+		... never pick up objects older than T seconds
+
+
+	It makes a lot of sense to not actually implement this in the main
+	Varnish process, but rather supply a template perl or python script
+	that primes the cache by requesting the objects through Varnish.
+	(That would require us to listen separately on 127.0.0.1
+	so the perlscript can get in touch with Varnish while in warm-up.)
+
+	One interesting but quite likely overengineered option in the
+	cluster case is if the central monitor tracks a fraction of the
+	requests through the logs of the running machines in the cluster,
+	spots the hot objects and tells the warming-up varnish what objects
+	to get and from where.
+
+
+	In the cluster configuration, it is probably best to run the cluster
+	interaction in a separate process rather than the main Varnish
+	process.  Information from Varnish to the cluster would go through
+	the shared memory, but we don't want to implement locking in the shmem,
+	so some sort of back-channel (UNIX domain or UDP socket ?) is necessary.
+
+	If we have such a "supervisor" process, it could also be tasked
+	with restarting the varnish process if vital signs fail:  A time
+	stamp in the shmem or kill -0 $pid.
+
+	It may even make sense to run the "supervisor" process in
+	stand-alone mode as well, where it can offer an HTML based interface
+	to the Varnish process (via shmem).
+
+	For cluster use the user would probably just pass an extra argument
+	when he starts up Varnish:
+
+		varnish -c $cluster_args $other_args
+	vs
+
+		varnish $other_args
+
+	and a "varnish" shell script will Do The Right Thing.
+
+
+	Shared memory
+	-------------
+
+	The shared memory layout needs to be thought about somewhat.  On one
+	hand we want it to be stable enough to allow people to write programs
+	or scripts that inspect it; on the other hand, doing it entirely in
+	ascii is both slow and prone to race conditions.
+
+	The various different data types in the shared memory can either be
+	put into one single segment (= 1 file) or into individual segments
+	(= multiple files).  I don't think the number of data types
+	is big enough to make the latter impractical.
+
+	Storing the "big overview" data in shmem in ASCII or HTML would
+	allow one to point cat(1) or a browser directly at the mmaped file
+	with no interpretation necessary, a big plus in my book.
+
+	Similarly, if we don't update them too often, statistics could be stored
+	in shared memory in perl/awk friendly ascii format.
+
+	But the logfile will have to be (one or more) FIFO logs, probably at least
+	three in fact:  Good requests, Bad requests, and exception messages.
+
+	If we decide to make log entries fixed length, we could make them ascii
+	so that a simple "sort -n /tmp/shmem.log" would put them in order after
+	a leading numeric timestamp, but it is probably better to provide a
+	utility to cat/tail-f the log and keep the log in a bytestring FIFO
+	format.  Overruns should be marked in the output.
+
+
+	*END*
+
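+To make the shared memory idea concrete: here is a minimal sketch
+in C, with an invented struct layout and invented names (this is
+not the layout varnishd ended up using), of one writer process
+updating counters in a mmap(2)'ed file that any number of unrelated
+reader processes can map and inspect cheaply::
+
+	#include <fcntl.h>
+	#include <stdint.h>
+	#include <sys/mman.h>
+	#include <unistd.h>
+
+	struct shmem_stats {
+		uint64_t	magic;		/* layout identifier */
+	#define SHMEM_STATS_MAGIC	0x73686d73746174ULL
+		uint64_t	client_req;	/* requests received */
+		uint64_t	cache_hit;	/* served from cache */
+		uint64_t	cache_miss;	/* had to ask the backend */
+	};
+
+	static struct shmem_stats *
+	stats_map(const char *fn, int writer)
+	{
+		struct shmem_stats *st;
+		int fd;
+
+		fd = open(fn, writer ? (O_RDWR | O_CREAT) : O_RDONLY, 0644);
+		if (fd < 0)
+			return (NULL);
+		if (writer && ftruncate(fd, sizeof *st) != 0) {
+			(void)close(fd);
+			return (NULL);
+		}
+		st = mmap(NULL, sizeof *st,
+		    writer ? PROT_READ | PROT_WRITE : PROT_READ,
+		    MAP_SHARED, fd, 0);
+		(void)close(fd);	/* the mapping survives the close */
+		if (st == MAP_FAILED)
+			return (NULL);
+		if (writer)
+			st->magic = SHMEM_STATS_MAGIC;
+		return (st);
+	}
+
+A varnishstat-like tool is then just stats_map(fn, 0) and a loop
+printing the counters; no locking, as the notes insist, provided
+readers tolerate slightly stale numbers.
+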
+The second Varnish Design notes
+-------------------------------
+
+You will notice above that there is no mention of VCL; it took a
+couple of weeks for that particular lightning to strike.
+
+Interestingly I know exactly where the lightning came from, and
+what it hit.
+
+The timeframe was around GCC 4.0.0, which was not their best release,
+and I had for some time been pondering a pre-processor for the C
+language to make up for the ISO-C stagnation and braindamage.
+
+I've read most of the "classic" compiler books, and probably read
+more compilers than most people (still to go: `GIER Algol 4 <http://datamuseum.dk/wiki/GIER/GA4GuideToDocumentationAndCode>`_),
+but to be honest I found them far too theoretical and not very helpful
+from a *practical* compiler construction point of view.
+
+But there is one compiler-book which takes an entirely different
+approach: `Hanson and Fraser's LCC book <http://www.amazon.com/gp/search/?field-isbn=0805316701>`_, which throws LEX and YACC
+under the truck and concentrates on compiling.
+
+Taking their low-down approach to parsing, and emitting C code,
+there really isn't much compiler left to write, and I had done
+several interesting hacks towards my 'K' language.
+
+The lightning rod was all the ideas Anders had for how Varnish
+should be able to manipulate the traffic passing through, how
+to decide what to cache, how long to cache it, where to
+cache it and ... it sounded like a lot of very detailed code
+which had to be incredibly configurable.
+
+Soon those two inspirations collided::
+
+
+	Notes on Varnish
+	----------------
+
+	Collected 2006-02-24 to 2006-02-..
+
+	Poul-Henning Kamp
+
+	-----------------------------------------------------------------------
+	Policy Configuration
+
+	Policy is configured in a simple unidirectional (no loops, no goto)
+	programming language which is compiled into 'C' and from there into
+	binary modules which are dlopen'ed by the main Varnish process.
+
+	The dl object contains one exported symbol, a pointer to a structure
+	which contains a reference count, a number of function pointers,
+	and a couple of string variables with identifying information.
+
+	All access into the config is protected by the reference counts.
+
+	Multiple policy configurations can be loaded at the same time
+	but only one is the "active configuration".  Loading, switching and
+	unloading of policy configurations happen via the management
+	process.
+
+	A global config sequence number is incremented on each switch, and
+	policy-modified object attributes (ttl, cache/nocache) are all
+	qualified by the config-sequence under which they were calculated
+	and are invalid if a different policy is now in effect.
+
+	-----------------------------------------------------------------------
+	Configuration Language
+
+	XXX: include lines.
+
+	BNF:
+		program:	function
+				| program function
+
+		function:	"sub" function_name compound_statement
+
+		compound_statement:	"{" statements "}"
+
+		statements:	/* empty */
+				| statement
+				| statements statement
+				
+
+		statement:	if_statement
+				| call_statement
+				| "finish"
+				| assignment_statement
+				| action_statement
+
+		if_statement:	"if" condition compound_statement elif_parts else_part
+
+		elif_parts:	/* empty */
+				| elif_part
+				| elif_parts elif_part
+
+		elif_part:	"elseif" condition compound_statement
+				| "elsif" condition compound_statement
+				| "else if" condition compound_statement
+
+		else_part:	/* empty */
+				| "else" compound_statement
+
+		call_statement:	"call" function_name
+
+		assignment_statement:	field "=" value
+
+		field:		object
+				| field "." variable
+
+		action_statement:	action arguments
+
+		arguments:	/* empty */
+				| arguments argument
+
+	-----------------------------------------------------------------------
+	Sample request policy program
+
+		sub request_policy {
+
+			if (client.ip in 10.0.0.0/8) {
+				no-cache
+				finish
+			}
+
+			if (req.url.host ~ "cnn.no$") {
+				rewrite	s/cnn.no$/vg.no/
+			}
+
+			if (req.url.path ~ "cgi-bin") {
+				no-cache
+			}
+
+			if (req.useragent ~ "spider") {
+				no-new-cache
+			}
+
+			if (backend.response_time > 0.8s) {
+				set req.ttlfactor = 1.5
+			} elseif (backend.response_time > 1.5s) {
+				set req.ttlfactor = 2.0
+			} elseif (backend.response_time > 2.5s) {
+				set req.ttlfactor = 5.0
+			}
+
+			/*
+			 * the program contains no references to
+			 * maxage, s-maxage and expires, so the
+			 * default handling (RFC2616) applies
+			 */
+		}
+
+	-----------------------------------------------------------------------
+	Sample fetch policy program
+
+		sub backends {
+			set backend.vg.ip = {...}
+			set backend.ads.ip = {...}
+			set backend.chat.ip = {...}
+			set backend.chat.timeout = 10s
+			set backend.chat.bandwidth = 2000 MB/s
+			set backend.other.ip = {...}
+		}
+
+		sub vg_backend {
+			set backend.ip = {10.0.0.1-5}
+			set backend.timeout = 4s
+			set backend.bandwidth = 2000Mb/s
+		}
+
+		sub fetch_policy {
+
+			if (req.url.host ~ "/vg.no$/") {
+				set req.backend = vg
+				call vg_backend
+			} else {
+				/* XXX: specify 404 page url ? */
+				error 404
+			}
+
+			if (backend.response_time > 2.0s) {
+				if (req.url.path ~ "/landbrugspriser/") {
+					error 504
+				}
+			}
+			fetch
+			if (backend.down) {
+				if (obj.exist) {
+					set obj.ttl += 10m
+					finish
+				}
+				switch_config ohhshit
+			}
+			if (obj.result == 404) {
+				error 300 "http://www.vg.no"
+			}
+			if (obj.result != 200) {
+				finish
+			}
+			if (obj.size > 256k) {
+				no-cache
+			} else if (obj.size > 32k && obj.ttl < 2m) {
+				obj.ttl = 5m
+			}
+			if (backend.response_time > 2.0s) {
+				set ttl *= 2.0
+			}
+		}
+
+		sub prefetch_policy {
+
+			if (obj.usage < 10 && obj.ttl < 5m) {
+				fetch
+			}
+		}
+
+	-----------------------------------------------------------------------
+	Purging
+
+	When a purge request comes in, the regexp is tagged with the next
+	generation number and added to the tail of the list of purge regexps.
+
+	Before a sender transmits an object, it is checked against any
+	purge-regexps which have higher generation number than the object
+	and if it matches the request is sent to a fetcher and the object
+	purged.
+
+	If there were purge regexps with higher generation to match, but
+	they didn't match, the object is tagged with the current generation
+	number and moved to the tail of the list.
+
+	Otherwise, the object does not change generation number and is
+	not moved on the generation list.
+
+	New Objects are tagged with the current generation number and put
+	at the tail of the list.
+
+	Objects are removed from the generation list when deleted.
+
+	When a purge object has a lower generation number than the first
+	object on the generation list, the purge object has been completed
+	and will be removed.  A log entry is written with number of compares
+	and number of hits.
+		
+	-----------------------------------------------------------------------
+	Random notes
+
+		swap backed storage
+
+		slowstart by config-flipping
+			start-config has peer servers as backend
+			once hitrate goes above limit, management process
+			flips config to 'real' config.
+
+		stat-object
+			always URL, not regexp
+
+		management + varnish process in one binary, comms via pipe
+
+		Change from config with long expiry to short expiry, how
+		does the ttl drop ?  (config sequence number invalidates
+		all calculated/modified attributes.)
+
+		Mgt process holds copy of acceptor socket ->  Restart without
+		lost client requests.
+
+		BW limit per client IP: create shortlived object (<4sec)
+		to hold status.  Enforce limits by delaying responses.
+
+
+	-----------------------------------------------------------------------
+	Source structure
+
+
+		libvarnish
+			library with interface facilities, for instance
+			functions to open&read shmem log
+
+		varnish
+			varnish sources in three classes
+
+	-----------------------------------------------------------------------
+	protocol cluster/mgt/varnish
+
+	object_query url -> TTL, size, checksum
+	{purge,invalidate} regexp
+	object_status url -> object metadata
+
+	load_config filename
+	switch_config configname
+	list_configs
+	unload_config
+
+	freeze 	# stop the clock, freezes the object store
+	thaw
+
+	suspend	# stop acceptor accepting new requests
+	resume
+
+	stop	# forced stop (exits) varnish process
+	start
+	restart = "stop;start"
+
+	ping $utc_time -> pong $utc_time
+
+	# cluster only
+	config_contents filename $inline -> compilation messages
+
+	stats [-mr] -> $data
+
+	zero stats
+
+	help
+
+	-----------------------------------------------------------------------
+	CLI (local)
+		import protocol from above
+
+		telnet localhost someport
+		authentication:
+			password $secret
+		secret stored in {/usr/local}/etc/varnish.secret (400 root:wheel)
+
+
+	-----------------------------------------------------------------------
+	HTML (local)
+
+		php/cgi-bin thttpd ?
+		(alternatively direct from C-code.)
+		Everything the CLI can do +
+		stats
+			popen("rrdtool");
+		log view
+
+	-----------------------------------------------------------------------
+	CLI (cluster)
+		import protocol from above, prefix machine/all
+		compound stats
+		accept / deny machine (?)
+		curses if you set termtype
+
+	-----------------------------------------------------------------------
+	HTML (cluster)
+		ditto
+		ditto
+
+		http://clustercontrol/purge?regexp=fslkdjfslkfdj
+			POST with list of regexp
+			authentication ? (IP access list)
+
+	-----------------------------------------------------------------------
+	Mail (cluster)
+
+		pgp signed emails with CLI commands
+
+	-----------------------------------------------------------------------
+	connection varnish -> cluster controller
+
+		Encryption
+			SSL
+		Authentication (?)
+			IP number checks.
+
+		varnish -c clusterid -C mycluster_ctrl.vg.no
+
+	-----------------------------------------------------------------------
+	Filer
+		/usr/local/sbin/varnish
+			contains mgt + varnish process.
+			if -C argument, open SSL to cluster controller.
+			Arguments:
+				-p portnumber
+				-c clusterid at cluster_controller
+				-f config_file
+				-m memory_limit
+				-s kind[,storage-options]
+				-l logfile,logsize
+				-b backend ip...
+				-d debug
+				-u uid
+				-a CLI_port
+
+			KILL SIGTERM	-> suspend, stop
+
+		/usr/local/sbin/varnish_cluster
+			Cluster controller.
+			Use syslog
+
+			Arguments:
+				-f config file
+				-d debug
+				-u uid (?)
+
+		/usr/local/sbin/varnish_logger
+			Logfile processor
+			-i shmemfile
+			-e regexp
+			-o "/var/log/varnish.%Y%m%d.traffic" 
+			-e regexp2
+			-n "/var/log/varnish.%Y%m%d.exception"  (NCSA format)
+			-e regexp3
+			-s syslog_level,syslogfacility
+			-r host:port	send via TCP, prefix hostname
+
+			SIGHUP: reopen all files.
+
+		/usr/local/bin/varnish_cli
+			Command line tool.
+
+		/usr/local/share/varnish/etc/varnish.conf
+			default request + fetch + backend scripts
+
+		/usr/local/share/varnish/etc/rfc2616.conf
+			RFC2616 compliant handling function
+
+		/usr/local/etc/varnish.conf (optional)
+			request + fetch + backend scripts
+
+		/usr/local/share/varnish/etc/varnish.startup
+			default startup sequence
+
+		/usr/local/etc/varnish.startup (optional)
+			startup sequence
+
+		/usr/local/etc/varnish_cluster.conf
+			XXX
+
+		{/usr/local}/etc/varnish.secret
+			CLI password file.
+
+	-----------------------------------------------------------------------
+	varnish.startup
+
+		load config /foo/bar startup_conf
+		switch config startup_conf
+		!mypreloadscript
+		load config /foo/real real_conf
+		switch config real_conf
+		resume
+
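+The "one exported symbol" idea is essentially how VCL ended up
+working.  A hypothetical sketch of that structure, with every name
+invented here for illustration, might read::
+
+	/* Sketch only; the real VCL interface became far more elaborate. */
+	struct sess;			/* per-request state, defined elsewhere */
+
+	typedef void policy_func_f(struct sess *sp);
+
+	struct policy_config {
+		const char	*name;		/* identifying information */
+		const char	*source;	/* program it was compiled from */
+		unsigned	refcount;	/* protects all access */
+		unsigned	conf_seq;	/* config sequence number */
+		policy_func_f	*request_policy;
+		policy_func_f	*fetch_policy;
+		policy_func_f	*prefetch_policy;
+	};
+
+	/* The one symbol the compiled-to-C policy module exports: */
+	struct policy_config policy_config;
+
+The management process compiles the policy program to C, builds a
+shared object from it, and the worker process dlopen(3)'s it, finds
+the structure with dlsym(3) and bumps the reference count around
+every call through the function pointers, so a superseded
+configuration can be dlclose(3)'d once its count drops to zero.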
+
+The third Varnish Design notes
+-------------------------------
+
+A couple of days later the ideas had gelled::
+
+
+	Notes on Varnish
+	----------------
+
+	Collected 2006-02-26 to 2006-03-..
+
+	Poul-Henning Kamp
+
+	-----------------------------------------------------------------------
+
+	Objects available to functions in VCL
+
+		client	# The client
+
+		req	# The request
+
+		obj	# The object from which we satisfy it
+
+		backend	# The chosen supplier
+
+	-----------------------------------------------------------------------
+	Configuration Language
+
+	XXX: declare IP lists ?
+
+	BNF:
+		program:	part
+				| program part
+
+		part:		"sub" function_name compound
+				| "backend" backend_name compound
+
+		compound:	"{" statements "}"
+
+		statements:	/* empty */
+				| statement
+				| statements statement
+
+		statement:	conditional
+				| functioncall
+				| "set" field value
+				| field "=" value
+				| "no_cache"
+				| "finish"
+				| "no_new_cache"
+				| call function_name
+				| fetch
+				| error status_code
+				| error status_code string(message)
+				| switch_config config_id
+				| rewrite field string(match) string(replace)
+
+		conditional:	"if" condition compound elif_parts else_part
+
+		elif_parts:	/* empty */
+				| elif_part
+				| elif_parts elif_part
+
+		elif_part:	"elseif" condition compound
+				| "elsif" condition compound
+				| "else if" condition compound
+
+		else_part:	/* empty */
+				| "else" compound
+
+		functioncall:	"call" function_name
+
+		field:		object
+				| field "." variable
+
+		condition:	'(' cond_or ')'
+
+		cond_or:	cond_and
+				| cond_or '||' cond_and
+
+		cond_and:	cond_part
+				| cond_and '&&' cond_part
+
+		cond_part:	'!' cond_part2
+				| cond_part2
+
+		cond_part2:	condition
+				| field(int) '<' number
+				| field(int) '<=' number
+				| field(int) '>' number
+				| field(int) '>=' number
+				| field(int) '=' number
+				| field(int) '!=' number
+				| field(IP)  ~ ip_list
+				| field(string) ~ string(regexp)
+
+	-----------------------------------------------------------------------
+	Sample request policy program
+
+		sub request_policy {
+
+			if (client.ip in 10.0.0.0/8) {
+				no-cache
+				finish
+			}
+
+			if (req.url.host ~ "cnn.no$") {
+				rewrite	s/cnn.no$/vg.no/
+			}
+
+			if (req.url.path ~ "cgi-bin") {
+				no-cache
+			}
+
+			if (req.useragent ~ "spider") {
+				no-new-cache
+			}
+
+			if (backend.response_time > 0.8s) {
+				set req.ttlfactor = 1.5
+			} elseif (backend.response_time > 1.5s) {
+				set req.ttlfactor = 2.0
+			} elseif (backend.response_time > 2.5s) {
+				set req.ttlfactor = 5.0
+			}
+
+			/*
+			 * the program contains no references to
+			 * maxage, s-maxage and expires, so the
+			 * default handling (RFC2616) applies
+			 */
+		}
+
+	-----------------------------------------------------------------------
+	Sample fetch policy program
+
+		sub backends {
+			set backend.vg.ip = {...}
+			set backend.ads.ip = {...}
+			set backend.chat.ip = {...}
+			set backend.chat.timeout = 10s
+			set backend.chat.bandwidth = 2000 MB/s
+			set backend.other.ip = {...}
+		}
+
+		sub vg_backend {
+			set backend.ip = {10.0.0.1-5}
+			set backend.timeout = 4s
+			set backend.bandwidth = 2000Mb/s
+		}
+
+		sub fetch_policy {
+
+			if (req.url.host ~ "/vg.no$/") {
+				set req.backend = vg
+				call vg_backend
+			} else {
+				/* XXX: specify 404 page url ? */
+				error 404
+			}
+
+			if (backend.response_time > 2.0s) {
+				if (req.url.path ~ "/landbrugspriser/") {
+					error 504
+				}
+			}
+			fetch
+			if (backend.down) {
+				if (obj.exist) {
+					set obj.ttl += 10m
+					finish
+				}
+				switch_config ohhshit
+			}
+			if (obj.result == 404) {
+				error 300 "http://www.vg.no"
+			}
+			if (obj.result != 200) {
+				finish
+			}
+			if (obj.size > 256k) {
+				no-cache
+			} else if (obj.size > 32k && obj.ttl < 2m) {
+				obj.ttl = 5m
+			}
+			if (backend.response_time > 2.0s) {
+				set ttl *= 2.0
+			}
+		}
+
+		sub prefetch_policy {
+
+			if (obj.usage < 10 && obj.ttl < 5m) {
+				fetch
+			}
+		}
+
+	-----------------------------------------------------------------------
+	Purging
+
+	When a purge request comes in, the regexp is tagged with the next
+	generation number and added to the tail of the list of purge regexps.
+
+	Before a sender transmits an object, it is checked against any
+	purge-regexps which have higher generation number than the object
+	and if it matches the request is sent to a fetcher and the object
+	purged.
+
+	If there were purge regexps with higher generation to match, but
+	they didn't match, the object is tagged with the current generation
+	number and moved to the tail of the list.
+
+	Otherwise, the object does not change generation number and is
+	not moved on the generation list.
+
+	New Objects are tagged with the current generation number and put
+	at the tail of the list.
+
+	Objects are removed from the generation list when deleted.
+
+	When a purge object has a lower generation number than the first
+	object on the generation list, the purge object has been completed
+	and will be removed.  A log entry is written with number of compares
+	and number of hits.
+		
+	-----------------------------------------------------------------------
+	Random notes
+
+		swap backed storage
+
+		slowstart by config-flipping
+			start-config has peer servers as backend
+			once hitrate goes above limit, management process
+			flips config to 'real' config.
+
+		stat-object
+			always URL, not regexp
+
+		management + varnish process in one binary, comms via pipe
+
+		Change from config with long expiry to short expiry, how
+		does the ttl drop ?  (config sequence number invalidates
+		all calculated/modified attributes.)
+
+		Mgt process holds copy of acceptor socket ->  Restart without
+		lost client requests.
+
+		BW limit per client IP: create shortlived object (<4sec)
+		to hold status.  Enforce limits by delaying responses.
+
+
+	-----------------------------------------------------------------------
+	Source structure
+
+
+		libvarnish
+			library with interface facilities, for instance
+			functions to open&read shmem log
+
+		varnish
+			varnish sources in three classes
+
+	-----------------------------------------------------------------------
+	protocol cluster/mgt/varnish
+
+	object_query url -> TTL, size, checksum
+	{purge,invalidate} regexp
+	object_status url -> object metadata
+
+	load_config filename
+	switch_config configname
+	list_configs
+	unload_config
+
+	freeze 	# stop the clock, freezes the object store
+	thaw
+
+	suspend	# stop acceptor accepting new requests
+	resume
+
+	stop	# forced stop (exits) varnish process
+	start
+	restart = "stop;start"
+
+	ping $utc_time -> pong $utc_time
+
+	# cluster only
+	config_contents filename $inline -> compilation messages
+
+	stats [-mr] -> $data
+
+	zero stats
+
+	help
+
+	-----------------------------------------------------------------------
+	CLI (local)
+		import protocol from above
+
+		telnet localhost someport
+		authentication:
+			password $secret
+		secret stored in {/usr/local}/etc/varnish.secret (400 root:wheel)
+
+
+	-----------------------------------------------------------------------
+	HTML (local)
+
+		php/cgi-bin thttpd ?
+		(alternatively direct from C-code.)
+		Everything the CLI can do +
+		stats
+			popen("rrdtool");
+		log view
+
+	-----------------------------------------------------------------------
+	CLI (cluster)
+		import protocol from above, prefix machine/all
+		compound stats
+		accept / deny machine (?)
+		curses if you set termtype
+
+	-----------------------------------------------------------------------
+	HTML (cluster)
+		ditto
+		ditto
+
+		http://clustercontrol/purge?regexp=fslkdjfslkfdj
+			POST with list of regexp
+			authentication ? (IP access list)
+
+	-----------------------------------------------------------------------
+	Mail (cluster)
+
+		pgp signed emails with CLI commands
+
+	-----------------------------------------------------------------------
+	connection varnish -> cluster controller
+
+		Encryption
+			SSL
+		Authentication (?)
+			IP number checks.
+
+		varnish -c clusterid -C mycluster_ctrl.vg.no
+
+	-----------------------------------------------------------------------
+	Filer
+		/usr/local/sbin/varnish
+			contains mgt + varnish process.
+			if -C argument, open SSL to cluster controller.
+			Arguments:
+				-p portnumber
+				-c clusterid at cluster_controller
+				-f config_file
+				-m memory_limit
+				-s kind[,storage-options]
+				-l logfile,logsize
+				-b backend ip...
+				-d debug
+				-u uid
+				-a CLI_port
+
+			KILL SIGTERM	-> suspend, stop
+
+		/usr/local/sbin/varnish_cluster
+			Cluster controller.
+			Use syslog
+
+			Arguments:
+				-f config file
+				-d debug
+				-u uid (?)
+
+		/usr/local/sbin/varnish_logger
+			Logfile processor
+			-i shmemfile
+			-e regexp
+			-o "/var/log/varnish.%Y%m%d.traffic" 
+			-e regexp2
+			-n "/var/log/varnish.%Y%m%d.exception"  (NCSA format)
+			-e regexp3
+			-s syslog_level,syslogfacility
+			-r host:port	send via TCP, prefix hostname
+
+			SIGHUP: reopen all files.
+
+		/usr/local/bin/varnish_cli
+			Command line tool.
+
+		/usr/local/share/varnish/etc/varnish.conf
+			default request + fetch + backend scripts
+
+		/usr/local/share/varnish/etc/rfc2616.conf
+			RFC2616 compliant handling function
+
+		/usr/local/etc/varnish.conf (optional)
+			request + fetch + backend scripts
+
+		/usr/local/share/varnish/etc/varnish.startup
+			default startup sequence
+
+		/usr/local/etc/varnish.startup (optional)
+			startup sequence
+
+		/usr/local/etc/varnish_cluster.conf
+			XXX
+
+		{/usr/local}/etc/varnish.secret
+			CLI password file.
+
+	-----------------------------------------------------------------------
+	varnish.startup
+
+		load config /foo/bar startup_conf
+		switch config startup_conf
+		!mypreloadscript
+		load config /foo/real real_conf
+		switch config real_conf
+		resume
+
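+The purging scheme described above translates almost directly into
+code.  Here is a sketch using queue(3) lists and regex(3); the data
+structure names are made up for the occasion::
+
+	#include <regex.h>
+	#include <sys/queue.h>
+
+	struct object {
+		unsigned		generation;
+		TAILQ_ENTRY(object)	list;
+		/* ... the cached document ... */
+	};
+
+	struct purge {
+		unsigned		generation;
+		regex_t			re;
+		unsigned		compares, hits;	/* for the log entry */
+		TAILQ_ENTRY(purge)	list;
+	};
+
+	static TAILQ_HEAD(, object) obj_list =
+	    TAILQ_HEAD_INITIALIZER(obj_list);
+	static TAILQ_HEAD(, purge) purge_list =
+	    TAILQ_HEAD_INITIALIZER(purge_list);
+	static unsigned generation;
+
+	/* Returns nonzero if the object must be purged and refetched. */
+	static int
+	purge_check(struct object *o, const char *url)
+	{
+		struct purge *p;
+		int checked = 0;
+
+		TAILQ_FOREACH(p, &purge_list, list) {
+			if (p->generation <= o->generation)
+				continue;	/* older than the object */
+			checked = 1;
+			p->compares++;
+			if (regexec(&p->re, url, 0, NULL, 0) == 0) {
+				p->hits++;
+				return (1);
+			}
+		}
+		if (checked) {
+			/* Caught up:  tag and move to the tail */
+			o->generation = generation;
+			TAILQ_REMOVE(&obj_list, o, list);
+			TAILQ_INSERT_TAIL(&obj_list, o, list);
+		}
+		return (0);
+	}
+
+Adding a purge is then a matter of compiling the regexp, tagging it
+with ++generation and appending it to purge_list; a purge is complete,
+and can be retired and logged, once its generation number is lower
+than that of the first object on obj_list.
+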
+Fourth Varnish Design Note
+--------------------------
+
+You'd think we'd be cookin' with gas now, and indeed we were, but then
+all the difficult details started to raise ugly questions, and it
+has never stopped since::
+
+	Questions:
+
+	*  Which "Host:" do we put in the request to the backend ?
+
+	      The one we got from the client ?
+
+	      The ip/dns-name of the backend ?
+
+	      Configurable in VCL backend declaration ?
+
+	      (test with www.ing.dk)
+
+	*  Construction of headers for queries to backend ?
+
+	      How much do we take from client headers, how much do we make up ?
+
+	      Some sites discriminate contents based on User-Agent header.
+		 (test with www.krak.dk/www.rs-components.dk)
+
+	      Cookies
+
+	*  Mapping of headers from backend reply to the reply to client
+
+	      Which fields come from the backend ?
+
+	      Which fields are made up on the spot ? (expiry time ?)
+
+	      (Static header fields can be prepended to contents in storage)
+
+
+	*  3xx replies from the backend
+
+	      Does varnish follow a redirection or do we pass it to the client ?
+
+	      Do we cache 3xx replies ?
+
+
+The first live traffic
+----------------------
+
+The final bit of history I want to share is the IRC log from the
+first time we tried to put real live traffic through Varnish.
+
+The conversation took place in interscandinavian; it is rendered
+in English below so non-vikings can get the drift::
+
+	**** BEGIN LOGGING AT Thu Jul  6 12:36:48 2006
+
+	Jul 06 12:36:48 *	Now talking on #varnish
+	Jul 06 12:36:48 *	EvilDES gives channel operator status to andersb
+	Jul 06 12:36:53 *	EvilDES gives channel operator status to phk
+	Jul 06 12:36:53 <andersb>	hehe
+	Jul 06 12:36:56 <EvilDES>	there we go
+	Jul 06 12:37:00 <andersb>	Yep, are you guys ready?
+	Jul 06 12:37:08 <phk>	I have varnish up and running with leonora as the backend.
+	Jul 06 12:37:12 *	EvilDES has changed the topic to: Live testing in progress!
+	Jul 06 12:37:16 *	EvilDES sets mode +t #varnish
+	Jul 06 12:37:19 <andersb>	Then I'll turn on the traffic
+	Jul 06 12:37:36 <phk>	andersb: can you start by just giving us traffic for 10 seconds or so ?
+	Jul 06 12:37:49 *	edward (edward at f95.linpro.no) has joined #varnish
+	Jul 06 12:38:32 <andersb>	hmm, first I have to get traffic there.
+	Jul 06 12:38:55 <andersb>	Has anything arrived? Or has it been aksed for /systemmeldinger/h.html which is the health file?
+	Jul 06 12:39:10 <andersb>	s/aksed/asked/
+	Jul 06 12:39:41 <EvilDES>	seeing nothing
+	Jul 06 12:39:45 <phk>	I haven't seen anything yet...
+	Jul 06 12:40:35 <phk>	it is trying port 80
+	Jul 06 12:41:24 <andersb>	okay..
+	Jul 06 12:41:31 <EvilDES>	can't we just run varnishd on port 80?
+	Jul 06 12:41:46 <phk>	ok, I just preferred not to run as root.
+	Jul 06 12:41:47 <andersb>	Is it trying anything else now?
+	Jul 06 12:41:59 <phk>	no, still 80.
+	Jul 06 12:42:03 <phk>	I am starting varnishd as root
+	Jul 06 12:42:08 <EvilDES>	no, wait
+	Jul 06 12:42:08 <andersb>	Great
+	Jul 06 12:42:11 <andersb>	okay
+	Jul 06 12:42:15 <andersb>	did 8080 come through now?
+	Jul 06 12:42:18 <EvilDES>	sysctl reserved_port
+	Jul 06 12:43:04 <andersb>	okay? Are you getting 8080 traffic now?
+	Jul 06 12:43:08 <EvilDES>	sysctl net.inet.ip.portrange.reservedhigh=79
+	Jul 06 12:44:41 <andersb>	Okay, waiting to hear whether we run 8080 or 80.
+	Jul 06 12:45:56 <EvilDES>	starting it on port 80 as root
+	Jul 06 12:46:01 <phk>	it is running now
+	Jul 06 12:46:01 <andersb>	Okay, we have found out that the way I set up 8080 in the load balancer was wrong.
+	Jul 06 12:46:07 <andersb>	okay on 80?
+	Jul 06 12:46:12 <phk>	we are running
+	Jul 06 12:46:14 <EvilDES>	yes, lots of traffic
+	Jul 06 12:46:29 <phk>	about 100 req/sec
+	Jul 06 12:46:37 <phk>	and we're dead...
+	Jul 06 12:46:40 <EvilDES>	stop!
+	Jul 06 12:46:58 <andersb>	it stops automatically.
+	Jul 06 12:47:04 <andersb>	We can just run it like that.
+	Jul 06 12:47:06 <EvilDES>	took a few seconds
+	Jul 06 12:47:20 <andersb>	When it starts answering on 80, the load balancer will find it quickly and send traffic.
+	Jul 06 12:47:41 <EvilDES>	about 1500 connection requests came in before it stopped sending us traffic
+	Jul 06 12:47:49 <EvilDES>	that is, 1500 after varnishd died
+	Jul 06 12:48:02 <andersb>	think that is a good enough way to do it. Then we don't have to reconfigure all the time.
+	Jul 06 12:48:07 <EvilDES>	fine
+	Jul 06 12:48:11 <EvilDES>	they are your readers :)
+	Jul 06 12:48:19 <andersb>	yes :)
+	Jul 06 12:48:35 <andersb>	can lower the retry rate a bit.
+	Jul 06 12:49:15 <andersb>	>> AS3408-2 VG Nett - Real server 21 # retry
+	Jul 06 12:49:16 <andersb>	Current number of failure retries: 4
+	Jul 06 12:49:16 <andersb>	Enter new number of failure retries [1-63]: 1
+	Jul 06 12:49:33 <andersb>	^^ before de decalres dead
+	Jul 06 12:49:41 <andersb>	he declairs :)
+	Jul 06 12:51:45 <phk>	I've saved the core, lets try again for another shot.
+	Jul 06 12:52:09 <andersb>	sure :)
+	Jul 06 12:52:34 <andersb>	When you start port 80 loadbalancer will send 8 req's for h.html then start gicing traficc
+	Jul 06 12:53:00 <andersb>	^^ Microsoft keyboard
+	Jul 06 12:53:09 <phk>	ok, I am starting
+	Jul 06 12:53:10 <EvilDES>	you need to get a Linux keyboard
+	Jul 06 12:53:16 <andersb>	Yeah :)
+	Jul 06 12:53:18 <EvilDES>	woo!
+	Jul 06 12:53:21 <phk>	boom.
+	Jul 06 12:53:25 <EvilDES>	oops
+	Jul 06 12:53:35 <EvilDES>	18 connections, 77 requests
+	Jul 06 12:53:40 <EvilDES>	that didn't last long...
+	Jul 06 12:54:41 <andersb>	longer than me :) *rude joke
+	Jul 06 12:55:04 <phk>	bewm
+	Jul 06 12:55:22 <andersb>	can I follow a log?
+	Jul 06 12:55:39 <andersb>	with: lt-varnishlog ?
+	Jul 06 12:56:27 <phk>	same error
+	Jul 06 12:56:38 <phk>	andersb: I am saving the log files
+	Jul 06 12:57:00 <phk>	bewm
+	Jul 06 12:57:13 <andersb>	phk: Yep, but for my own part, to see when you switch things on etc. That way I learn load balancer things.
+	Jul 06 12:57:51 <phk>	ok, same error again.
+	Jul 06 12:58:02 <phk>	I suggest we take a little break while I debug.
+	Jul 06 12:58:09 <andersb>	sure.
+	Jul 06 12:58:16 <EvilDES>	andersb: cd ~varnish/varnish/trunk/varnish-cache/bin/varnishlog
+	Jul 06 12:58:21 <EvilDES>	andersb: ./varnishlog -o
+	Jul 06 12:58:37 <EvilDES>	andersb: cd ~varnish/varnish/trunk/varnish-cache/bin/varnishstat
+	Jul 06 12:58:43 <EvilDES>	andersb: ./varnishstat -c
+	Jul 06 12:58:44 <phk>	or ./varnislog -r _vlog3 -o | less
+	Jul 06 13:00:02 <andersb>	I am going for a short walk. Back in a moment.
+	Jul 06 13:01:27 <phk>	we are running again
+	Jul 06 13:02:31 <phk>	2k requests
+	Jul 06 13:02:57 <phk>	3k
+	Jul 06 13:03:39 <phk>	5k
+	Jul 06 13:03:55 <EvilDES>	looks very good
+	Jul 06 13:04:06 <EvilDES>	hit rate > 93%
+	Jul 06 13:04:13 <EvilDES>	95%
+	Jul 06 13:05:14 <phk>	800 objects
+	Jul 06 13:05:32 <EvilDES>	load 0.28
+	Jul 06 13:05:37 <EvilDES>	0.22
+	Jul 06 13:05:52 <EvilDES>	CPU 98.9% idle :)
+	Jul 06 13:06:12 <phk>	4-5 Mbit/sec
+	Jul 06 13:06:42 <andersb>	nice :)
+	Jul 06 13:06:49 <andersb>	we run until it crashes?
+	Jul 06 13:06:58 <phk>	yep
+	Jul 06 13:07:05 <phk>	feel free to open up a bit more
+	Jul 06 13:07:20 <andersb>	okay
+	Jul 06 13:07:41 <andersb>	3 times more...
+	Jul 06 13:08:04 <andersb>	say the word when you want more.
+	Jul 06 13:08:24 <phk>	let's give it a couple of minutes at this level
+	Jul 06 13:09:17 <phk>	bewm
+	Jul 06 13:09:31 <EvilDES>	        3351        0.00 Client connections accepted
+	Jul 06 13:09:31 <EvilDES>	       23159        0.00 Client requests received
+	Jul 06 13:09:31 <EvilDES>	       21505        0.00 Cache hits
+	Jul 06 13:09:31 <EvilDES>	        1652        0.00 Cache misses
+	Jul 06 13:10:17 <phk>	running again
+	Jul 06 13:10:19 <EvilDES>	here we go again
+	Jul 06 13:11:06 <phk>	20mbit/sec
+	Jul 06 13:11:09 <phk>	100 req/sec
+	Jul 06 13:12:30 <andersb>	nice :)
+	Jul 06 13:12:46 <andersb>	those are good numbers, and I will tell you why later
+	Jul 06 13:12:49 <phk>	steady 6-8 mbit/sec
+	Jul 06 13:12:52 <andersb>	okay.
+	Jul 06 13:13:00 <phk>	about 50 req/sec
+	Jul 06 13:13:04 <EvilDES>	shall we increase?
+	Jul 06 13:13:14 <phk>	yes, give it double if you can
+	Jul 06 13:13:19 <andersb>	we started with 1 -> 3 -> ?
+	Jul 06 13:13:22 <phk>	6
+	Jul 06 13:13:23 <andersb>	6
+	Jul 06 13:13:34 <andersb>	done
+	Jul 06 13:13:42 <andersb>	it ramps up gracefully.
+	Jul 06 13:13:54 <EvilDES>	boom
+	Jul 06 13:14:06 <andersb>	:)
+	Jul 06 13:14:11 <EvilDES>	but no performance problems
+	Jul 06 13:14:19 <EvilDES>	just bugs in request parsing
+	Jul 06 13:14:20 <phk>	running again
+	Jul 06 13:14:26 <phk>	bewm
+	Jul 06 13:14:31 <phk>	ok, let's pause for a moment...
+	Jul 06 13:17:40 <phk>	I have a problem with "pass" requests, I need to find it before we go on.
+	Jul 06 13:18:51 <andersb>	Sure.
+	Jul 06 13:28:50 <phk>	ok, let's try again
+	Jul 06 13:29:09 <phk>	bewm
+	Jul 06 13:29:35 <phk>	more debugging
+	Jul 06 13:33:56 <phk>	OK, found the/one pass-mode bug
+	Jul 06 13:33:58 <phk>	trying again
+	Jul 06 13:35:23 <phk>	150 req/s 24mbit/s, still alive
+	Jul 06 13:37:02 <EvilDES>	andersb: do you think you can make it here for the talk, or are you completely wrecked?
+	Jul 06 13:37:06 <phk>	andersb: give it 50% more traffic
+	Jul 06 13:39:46 <andersb>	more traffic
+	Jul 06 13:39:56 <andersb>	EvilDES: No :(( But Stein from VG Nett is coming.
+	Jul 06 13:41:25 <EvilDES>	btw, do you have any data on what the load balancer thinks of varnish?
+	Jul 06 13:41:50 <EvilDES>	I assume it keeps some track of how good a job we are doing
+	Jul 06 13:43:10 <phk>	I am just restarting with more worker threads...
+	Jul 06 13:43:43 <phk>	I think 20 worker threads was too few now...
+	Jul 06 13:43:47 <phk>	now it has 220
+	Jul 06 13:44:40 <EvilDES>	        2976      107.89 Client connections accepted
+	Jul 06 13:44:41 <EvilDES>	       10748      409.57 Client requests received
+	Jul 06 13:44:41 <EvilDES>	        9915      389.59 Cache hits
+	Jul 06 13:45:13 <EvilDES>	that was 400 per second :)
+	Jul 06 13:45:45 <phk>	and no obvious errors on the www.vg.no page :-)
+	Jul 06 13:45:54 <phk>	bewm
+	Jul 06 13:47:16 <EvilDES>	andersb: how big a share of the traffic did we have just now?
+	Jul 06 13:48:06 <EvilDES>	that is, weight in the load balancer relative to the total
+	Jul 06 13:49:20 <phk>	ok, only 120 threads then...
+	Jul 06 13:50:48 <andersb>	9
+	Jul 06 13:52:45 <phk>	andersb: 9 -> 12 ?
+	Jul 06 13:52:48 <EvilDES>	andersb: 9 for varnish, but how much is the total weight?
+	Jul 06 13:52:58 <EvilDES>	do we have 1%? 5%? 10%?
+	Jul 06 13:54:37 <EvilDES>	we just passed 50000 requests without a crash
+	Jul 06 13:55:36 <phk>	the machine is doing nothing...  98.5% idle
+	Jul 06 13:56:21 <andersb>	12 machines with weight 20
+	Jul 06 13:56:26 <andersb>	1 with weight 40
+	Jul 06 13:56:29 <andersb>	varnish with 9
+	Jul 06 13:57:01 <andersb>	say the word when you want more traffic.
+	Jul 06 13:57:02 <phk>	9/289 = 3.1%
+	Jul 06 13:57:12 <phk>	andersb: give it 15
+	Jul 06 13:57:44 <andersb>	done
+	Jul 06 13:59:43 <andersb>	this is fun. I have to say it.
+	Jul 06 14:00:27 <phk>	20-23 Mbit/sec steady, 200 req/sec, 92.9% idle
+	Jul 06 14:00:30 <phk>	bewm
+	Jul 06 14:00:46 <EvilDES>	OK
+	Jul 06 14:00:57 <EvilDES>	I think we can conclude that the performance is what it should be
+	Jul 06 14:01:33 <EvilDES>	there are quite a few bugs, but it should be possible to fix them.
+	Jul 06 14:01:34 <andersb>	Yep :) It looked nice...
+	Jul 06 14:01:53 <phk>	I don't think we have seen a shadow of what Varnish can deliver yet...
+	Jul 06 14:01:53 <EvilDES>	andersb: how do we compare to Squid?
+	Jul 06 14:01:58 <andersb>	nicely :)
+	Jul 06 14:02:13 <andersb>	I haven't got SNMP up on this box, I ought to graph it...
+	Jul 06 14:02:23 <EvilDES>	snmp is running on c21
+	Jul 06 14:02:33 <EvilDES>	think agero set it up
+	Jul 06 14:02:36 <EvilDES>	aagero
+	Jul 06 14:02:38 <andersb>	Yes, but I don't have a template in cacti for bsnmpd
+	Jul 06 14:02:43 <EvilDES>	ah, ok
+	Jul 06 14:03:03 <EvilDES>	but it should support the standard v2 mib?
+	Jul 06 14:03:26 <andersb>	it is not a protocol problem :)
+	Jul 06 14:03:42 <andersb>	What is the byte hitratio, by the way?
+	Jul 06 14:03:52 <EvilDES>	I don't think we measure that
+	Jul 06 14:03:55 <EvilDES>	yet
+	Jul 06 14:03:59 <phk>	andersb: I don't have stats for that yet.
+	Jul 06 14:04:22 <phk>	ok, the previous crash looks like a 4k+ HTTP header...
+	Jul 06 14:04:27 <phk>	(or a coding error)
+	Jul 06 14:06:03 <phk>	andersb: try increasing our share to 20
+	Jul 06 14:06:26 <EvilDES>	what weight does each of the other caches have?
+	Jul 06 14:06:49 <phk>	20, and one with 40
+	Jul 06 14:07:50 <andersb>	done
+	Jul 06 14:08:59 <phk>	440 req/s 43mbit/s
+	Jul 06 14:09:17 <phk>	bewm
+	Jul 06 14:09:18 <EvilDES>	bewm
+	Jul 06 14:10:30 <EvilDES>	whoa
+	Jul 06 14:10:39 <EvilDES>	we were up over 800 req/s for a moment
+	Jul 06 14:10:46 <phk>	60mbit/sec
+	Jul 06 14:10:52 <phk>	and 90% idle :-)
+	Jul 06 14:10:59 <EvilDES>	no swapping
+	Jul 06 14:11:58 <EvilDES>	and we are using almost no memory - 3 GB of physical RAM free
+	Jul 06 14:13:02 <phk>	about 60 syscalls / req
+	Jul 06 14:14:31 <andersb>	nice :)
+	Jul 06 14:14:58 <phk>	andersb: try giving us 40
+	Jul 06 14:17:26 <andersb>	done
+	Jul 06 14:18:17 <phk>	it looks like the traffic is dropping here late in the afternoon...
+	Jul 06 14:19:07 <andersb>	yes :)
+	Jul 06 14:19:43 <phk>	andersb: then we should not increase any further, we are approaching what 100Mbit ethernet can handle.
+	Jul 06 14:19:58 <andersb>	good :)
+	Jul 06 14:20:36 <phk>	42mbit/s steady
+	Jul 06 14:20:59 <EvilDES>	40 out of 320?
+	Jul 06 14:21:06 <EvilDES>	12.5%
+	Jul 06 14:21:43 *	nicholas (nicholas at nfsd.linpro.no) has joined #varnish
+	Jul 06 14:22:00 <phk>	that cluster stuff will hardly be needed when we have 87% idle
+	Jul 06 14:23:05 <andersb>	hehe :)
+	Jul 06 14:24:38 <andersb>	have to tune the others down a bit because 48 is max
+	Jul 06 14:24:57 <phk>	I don't think we should go higher until we have gigE
+	Jul 06 14:25:14 <andersb>	4-5MB/s
+	Jul 06 14:25:32 <andersb>	the load balancer backs off at 100 Mbit
+	Jul 06 14:25:35 <andersb>	:)
+	Jul 06 14:25:42 <andersb>	So we can run almost to the ceiling.
+	Jul 06 14:26:01 <andersb>	if there is any point to it.
+	Jul 06 14:26:09 <andersb>	crash :)
+	Jul 06 14:27:33 <phk>	bewm
+	Jul 06 14:29:08 <andersb>	Set them all to weight 5
+	Jul 06 14:29:17 <andersb>	except 1 which is 10
+	Jul 06 14:29:20 <andersb>	varnish is 5
+	Jul 06 14:29:24 <phk>	then give us 20
+	Jul 06 14:29:51 <andersb>	done
+	Jul 06 14:30:58 <phk>	we are only getting 300 req/s
+	Jul 06 14:31:04 <phk>	Ahh, something happened.
+	Jul 06 14:32:41 <phk>	ok, at this load backend connections become a problem, I have seen dns errors and connection refused
+	Jul 06 14:33:10 <phk>	dns error
+	Jul 06 14:33:21 <andersb>	okay, point it at 10.0.2.5
+	Jul 06 14:33:28 <andersb>	that is the layer 2 squid cache
+	Jul 06 14:33:35 <andersb>	fun to test that too.
+	Jul 06 14:33:54 <phk>	I will do that the next time it falls over
+	Jul 06 14:34:48 <phk>	I could of course also just give it leonora's IP# instead... but now we are running against squid
+	Jul 06 14:36:05 <andersb>	yes, give it leonora's IP, that is probably better. Or it could be nice to test against squid too :)
+	Jul 06 14:39:04 <phk>	now we are running with leonora's IP#
+	Jul 06 14:39:33 <phk>	now we are running with leonora's *real* IP#
+	Jul 06 14:41:20 <phk>	Now we are done with this 100Mbit/s ethernet, can we have another one ?  :-)
+	Jul 06 14:41:42 <andersb>	lol :)
+	Jul 06 14:42:00 <andersb>	Let me put it this way. It will not be many days before a Gig switch is ordered :)
+	Jul 06 14:43:05 <phk>	bewm
+	Jul 06 14:43:13 <phk>	ok, I think we should stop here.
+	Jul 06 14:43:41 <EvilDES>	yep, talk in 15 min
+	Jul 06 14:43:57 <andersb>	yep
+	Jul 06 14:44:23 <andersb>	disabled server
+	Jul 06 14:45:29 <EvilDES>	this has been a very good day.
+	Jul 06 14:45:49 <EvilDES>	what shall we do tomorrow? replace the whole Squid rig with a single Varnish box? ;)
+	Jul 06 14:45:53 <andersb>	lol
+	Jul 06 14:46:15 *	EvilDES has to start setting up for the talk
+	Jul 06 14:46:17 <andersb>	then I need a Gig switch. Or maybe just bring an HP machine and hook it straight up to the load balancer :)
+	Jul 06 14:46:22 <phk>	can't we make do with half a varnish box ?
+	Jul 06 14:46:41 <EvilDES>	we need both halves for failover
+	Jul 06 14:47:01 <andersb>	:)
+	Jul 06 14:47:14 <andersb>	can fail back to the others.
+	Jul 06 14:47:25 <andersb>	I cannot make myself stay home.
+	Jul 06 14:47:33 <andersb>	 I will come up in a little while :)
+	Jul 06 14:47:39 <andersb>	Ringing the doorbell.
+	Jul 06 14:48:19 <andersb>	have to go for a walk now :)
+	Jul 06 14:48:29 *	andersb has quit (BitchX: no additives or preservatives)
+	Jul 06 14:49:44 <EvilDES>	http://www.des.no/varnish/
+	**** ENDING LOGGING AT Thu Jul  6 14:52:04 2006
+
+*phk*
+
diff --git a/doc/sphinx/phk/index.rst b/doc/sphinx/phk/index.rst
index 60a3375..65108ed 100644
--- a/doc/sphinx/phk/index.rst
+++ b/doc/sphinx/phk/index.rst
@@ -8,6 +8,7 @@ You may or may not want to know what Poul-Henning thinks.
 .. toctree::
 	:maxdepth: 1
 
+	firstdesign.rst
 	10goingon50.rst
 	brinch-hansens-arrows.rst
 	ssl_again.rst


