[Fwd: Re: My random thoughts]

Poul-Henning Kamp phk at phk.freebsd.dk
Thu Feb 16 11:09:17 CET 2006


In message <65058.193.213.34.102.1140050754.squirrel at denise.vg.no>,
"Anders Berg" writes:

Let me just try to see if I can express the overall threading
strategy I have formed without using a whiteboard:

The [...] is which thread we're in.


[acceptor] Incoming connections are handled by acceptfilters in a
single thread or, if acceptfilters are not available, with a
single-threaded poll loop.
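
As a rough illustration of the poll-loop fallback only, here is a
minimal Python sketch of one pass over the ready sockets that hands
back connections once their HTTP header block is complete.  The
function name gather_requests, the buffers dict and the 4096-byte
read size are assumptions made for the example, not part of the plan
above; accepting new connections is left out.

```python
import selectors
import socket


def gather_requests(sel, buffers):
    """One pass of the poll loop.

    Returns a list of (sock, raw_bytes) for every connection whose
    header block (terminated by a blank line) has fully arrived.
    """
    complete = []
    for key, _events in sel.select(timeout=0.1):
        sock = key.fileobj
        data = sock.recv(4096)
        if not data:
            # Peer closed before sending a full request: drop it.
            sel.unregister(sock)
            buffers.pop(sock, None)
            sock.close()
            continue
        buffers[sock] = buffers.get(sock, b"") + data
        if b"\r\n\r\n" in buffers[sock]:
            # Full header gathered; stop polling this fd and pass it on.
            sel.unregister(sock)
            complete.append((sock, buffers.pop(sock)))
    return complete
```

A partial request simply stays in buffers until a later pass sees the
rest of it.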

[acceptor] Once a full HTTP request has been gathered, the URL is
hashed and looked up to see if we have a hit or not.

[acceptor] If we have a hit, and the object is in a "ready" state,
a thread is pulled off the "sender" queue and given the request to
complete.

[sender] The object will be shipped out according to its state (it
may still be arriving from the backend) and the HTTP headers.
sendfile will be used if at all possible.  Once done, the fd
will be sent back to the acceptor if not closed {can we engage
acceptfilters again ?}  {We may ($config) engage in compression
here and in such case we would embellish the object with the
compressed version (up front) so it can be reused by other senders.}
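
The compression remark could look something like the sketch below:
the first sender that needs a compressed body produces it and stores
it on the object, and later senders reuse it.  The class name
CachedBody, the choice of gzip and the compress_runs counter are
assumptions for illustration; whether compression happens lazily
(as here) or ahead of time is not settled by the text above.

```python
import gzip
import threading


class CachedBody:
    """A cached object body plus an optional, shared compressed copy."""

    def __init__(self, body):
        self.body = body
        self._gz = None
        self._lock = threading.Lock()
        self.compress_runs = 0  # bookkeeping for the example only

    def gzipped(self):
        # The lock keeps two senders from compressing the same object twice.
        with self._lock:
            if self._gz is None:
                self._gz = gzip.compress(self.body)
                self.compress_runs += 1
            return self._gz
```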

[acceptor] If we have a hit, but the object is not in a "ready"
state, (for instance we are trying to get the object from the
backend, but haven't received any of it yet) the request is parked
on the object.

[acceptor] If we have no hit, the header needs to be analyzed (URL
cleanup, rewriting, negative lookup etc etc).  We could use a
"sender" thread to do this, but I would rather not, in order to limit
the amount of potentially expensive work we do here.  My initial
thought therefore is to put the request into a queue to be dealt
with by the "backend" threads.
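
The three acceptor cases (ready hit, not-ready hit, miss) can be
sketched as below.  Everything named here (CacheObject, url_key,
acceptor_dispatch, the two work queues, SHA-256 as the hash) is a
stand-in chosen for the example, not a decided design.

```python
import hashlib
from collections import deque


class CacheObject:
    def __init__(self, ready=False):
        self.ready = ready
        self.parked = deque()  # requests waiting on this object


def url_key(url):
    # Any stable hash would do; SHA-256 is only a placeholder choice.
    return hashlib.sha256(url.encode("utf-8")).digest()


def acceptor_dispatch(url, table, sender_work, backend_work):
    """Route one complete request the way the acceptor would."""
    obj = table.get(url_key(url))
    if obj is None:
        backend_work.append(url)        # miss: a backend thread analyses it
        return "miss"
    if obj.ready:
        sender_work.append((url, obj))  # hit: hand it to a sender
        return "hit"
    obj.parked.append(url)              # hit, but nothing to send yet
    return "parked"
```

The point of the split is that the acceptor itself only ever does a
hash lookup and a queue append.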

[backend] These threads will look for two kinds of work in order
of priority: requests that need analysing and objects nearing
expiration.
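
A minimal sketch of that priority order, assuming two plain FIFO
queues (the names analysis_queue and expiry_queue are made up for the
example):

```python
from collections import deque


def next_backend_job(analysis_queue, expiry_queue):
    """Pick the next job for a backend thread, analysis first."""
    if analysis_queue:
        return ("analyse", analysis_queue.popleft())
    if expiry_queue:
        return ("refresh", expiry_queue.popleft())
    return None  # nothing to do right now
```

Refresh work only gets picked up when no request is waiting for
analysis.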

[backend] Requests needing analysis are chewed upon according to
the configured rules and one of four outcomes is possible:

[backend] Invalid request.  Grab a "sender" and ship out a static
error-object.

[backend] Rematched request, (after analysis it matches an existing
object) treat like the acceptor would for a hash hit.  If configuration
allows: add new hash entry to put this URL on fast track in the
future.

[backend] Unmatched request, cacheable (glob/regexp matching).
Create object, queue request on it.  Add hash entry.  Initiate fetch
from backend.  When HTTP header arrives, set expiry on object
accordingly.  Once some data has arrived, grab sender and pass it
the object (NB: not the request).  Receive full object.

[backend] Unmatched request, uncacheable (glob/regexp matching).
Create (transient) object.  Initiate fetch from backend.  Once some
data has arrived, grab sender thread and pass it object.  Receive
full object.
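
The four outcomes could be told apart roughly as follows.  This is a
sketch under loud assumptions: the validity check (path must start
with "/"), stripping the query string as "cleanup", and the glob
list are all invented for the example; the real rules would come
from configuration.

```python
from fnmatch import fnmatchcase


def classify(url, known_urls, cacheable_globs):
    """Return one of: invalid, rematched, cacheable, uncacheable."""
    if not url.startswith("/"):
        return "invalid"
    clean = url.split("?", 1)[0]  # stand-in for real URL cleanup/rewriting
    if clean in known_urls:
        return "rematched"        # matches an existing object after cleanup
    if any(fnmatchcase(clean, pattern) for pattern in cacheable_globs):
        return "cacheable"
    return "uncacheable"
```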

[backend] Near-expiry objects: Once an object nears expiry (defined
by config) it is eligible for refresh.  A backend thread will determine
if the object is important enough (defined by config) compared to
current backend responsiveness to be refreshed.  If it is, a GET
request is sent to the backend.  (I'm not sure optimizing with a HEAD
is worth much here, maybe a hybrid strategy:  If the object has been
refreshed before and a GET was necessary more often than not, then
do GET otherwise try HEAD first).
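
The hybrid GET/HEAD idea can be written down as a small per-object
counter.  The class and method names are hypothetical, and "more
often than not" is read here as "in strictly more than half of the
previous refreshes".

```python
class RefreshHistory:
    """Per-object record of how earlier refreshes went."""

    def __init__(self):
        self.refreshes = 0
        self.gets_needed = 0

    def record(self, get_was_needed):
        self.refreshes += 1
        if get_was_needed:
            self.gets_needed += 1

    def next_method(self):
        # GET only if the object has been refreshed before and a GET was
        # necessary in more than half of those refreshes; else HEAD first.
        if self.refreshes > 0 and 2 * self.gets_needed > self.refreshes:
            return "GET"
        return "HEAD"
```

With this reading, an object that has never been refreshed starts
out with HEAD.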

[sender] When passed object:  If only one request queued on object,
behave as if passed that request.  If more than one request is
queued, grab a sender for each and pass that request.

[sender] On transient object:  Destroy object after transmission.

[any] If on attempting to pull a sender off the queue, none is
available, the request or object is queued instead.
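
A sketch of the sender-side rules above: one parked request is
handled by the sender itself, several are fanned out to other
senders, and anything that cannot get a sender lands in a backlog.
SenderPool, grab and on_object_passed are names invented for the
example, and senders are represented by plain strings rather than
threads.

```python
from collections import deque


class SenderPool:
    def __init__(self, names):
        self.idle = deque(names)  # idle senders, front of queue first
        self.backlog = deque()    # work waiting for a sender

    def grab(self, work):
        """Return (sender, work) if a sender is idle, else queue the work."""
        if self.idle:
            return (self.idle.popleft(), work)
        self.backlog.append(work)
        return None


def on_object_passed(current_sender, parked_requests, pool):
    """What a sender does when handed an object rather than a request."""
    requests = list(parked_requests)
    if len(requests) == 1:
        # Behave as if that single request had been passed directly.
        return [(current_sender, requests[0])]
    assigned = []
    for req in requests:
        got = pool.grab(req)  # one sender per parked request
        if got is not None:
            assigned.append(got)
    return assigned
```

Whether the current sender should keep one of the requests for itself
in the fan-out case is left open here.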

[overseer] Monitor number of sender threads and create/destroy them
as appropriate.  Sender threads go back to the front of the queue
(for cache efficiency reasons) and if they linger in the tail of the
queue doing nothing for more than $config seconds, they get killed off.
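
The "back to the front, reap from the tail" behaviour amounts to a
LIFO pool with an idle timeout.  A minimal sketch, with the clock
passed in explicitly so it is easy to reason about; IdleSenders and
max_idle_seconds are hypothetical names standing in for the $config
value:

```python
class IdleSenders:
    """Idle sender pool: most recently used sender is reused first."""

    def __init__(self, max_idle_seconds):
        self.max_idle = max_idle_seconds
        self._stack = []  # (sender, time_parked); end of list = front

    def park(self, sender, now):
        self._stack.append((sender, now))

    def grab(self):
        # Take the most recently parked sender (likely still cache-warm).
        return self._stack.pop()[0] if self._stack else None

    def reap(self, now):
        """Remove and return senders idle for longer than max_idle."""
        keep, dead = [], []
        for sender, parked in self._stack:
            target = dead if now - parked > self.max_idle else keep
            target.append((sender, parked))
        self._stack = keep
        return [sender for sender, _ in dead]
```

Because busy senders keep re-entering at the front, the ones at the
tail are exactly those the current load does not need.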

[overseer] Monitor backend responsiveness based on backend thread
statistics.  Switch between various policy states accordingly.

[master] Handle requests coming in via $channel from janitor process.

... or something like that.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk at FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.


