Streaming details

Martin Blix Grydeland martin at varnish-software.com
Fri Jun 24 15:22:01 CEST 2011


Hi Poul-Henning,

I've sketched, in pseudo-something :), how I imagine the call paths and
actions will look once the next part of streaming is in place. I've also
tried to spell out the locking strategy, to get a picture of what will be
locked when and to see whether we are creating too much contention on
anything.

You were working on moving the waiting lists to the busyobj, right?
How is that progressing?

I'm going to start hacking along these lines beginning next week.

Should I create the branch in git for this?

-Martin

Call flow for streaming fetch&delivery (pass is not detailed in this
flow):

Note: The busyobj from HSH_Prealloc has refcnt==1
1. cnt_lookup()
  * Creates the object marked as busy. The object will have a busyobj
    already attached to it (attached when it is created in
    HSH_Prealloc)
  * Next step is STP_MISS
2. cnt_miss()
  * Nothing special
  * Next step is STP_FETCH
3. cnt_fetch()
  * Nothing special
  * Next step is STP_FETCHBODY
4. cnt_fetchbody()
  * If sp->wrk->do_stream, skips calling FetchBody()
  * Next step is STP_PREPRESP
5. cnt_prepresp()
  * If sp->wrk->do_stream, sets next step to STP_STREAMBODY
6. cnt_streambody()
  * This function has the stream_ctx on the stack
  * Does not need to grab a busyobj->refcnt as it is initialized to 1,
    consequently does not need to lock objhdr here
  * Calls RES_StreamStart
  * Then calls FetchBody()
    * FetchBody() (in both the gzip and nop vfp's) will do callbacks
      to RES_StreamPoll if sp->wrk->do_stream is true
      * On its first call (which it can tell from its stream_ctx),
        RES_StreamPoll sets busyobj->can_stream to true. This allows
        subsequent requests coming in to start streaming the content
        directly (not go on waiting list / not receive grace
        candidate). It should also wake (all or just some) sessions
        from the waiting list. These will then compete for tokens.
        (Waking all here might create a thundering herd? Wake as many
        as we have tokens for? And let any thread releasing a token
        try to wake an amount equal to the number of tokens currently
        available?)
      * RES_StreamPoll should (looking at its stream_ctx) know that
        it is the backend receiving thread as well and should:
        * Lock the busyobj->lock
        * Update busyobj counters with new end of data
        * Signal the cond to wake one sleeping thread. This thread
          will in turn trigger another until all the tokens are in
          use.
        * Unlock busyobj->lock
      * Note: RES_StreamPoll will free data as it progresses when in
        pass mode.
  * Calls HSH_Unbusy()
    * While holding objhdr->lock:
      * Grab busyobj->lock
      * Sets busyobj->complete
      * Clears the busy flag so later lookups of this object will not
        take the streaming path, and there will eventually be no more
        refs to the busyobj.
      * Decrements busyobj->refcnt, last one to leave turns off the
        light and frees it.
      * Signals the cond.
      * Release busyobj->lock
  * Calls RES_StreamEnd()
  * Next step is STP_DONE
7. Nothing special from this point on
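
A minimal sketch of the backend-thread side of the flow above, under
stated assumptions: struct bo_sketch, bo_sketch_publish() and
bo_sketch_finish() are hypothetical names, the lock is assumed to be a
pthread_mutex_t, and the objhdr lock that HSH_Unbusy() would hold is left
out. It only illustrates the lock / update / signal / unlock order.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for the proposed busyobj (subset of its fields). */
struct bo_sketch {
	int complete;
	int can_stream;
	int refcnt;
	int current_len;
	pthread_mutex_t lock;
	pthread_cond_t cond;
};

/*
 * Backend receiving thread, called from the RES_StreamPoll callback:
 * publish the new end of data and wake one sleeping thread.
 */
void
bo_sketch_publish(struct bo_sketch *bo, int new_len)
{
	pthread_mutex_lock(&bo->lock);
	assert(new_len >= bo->current_len);
	bo->current_len = new_len;
	bo->can_stream = 1;	/* set on the first poll, harmless afterwards */
	/* The mail wakes one thread; broadcast would be the wake-all variant. */
	pthread_cond_signal(&bo->cond);
	pthread_mutex_unlock(&bo->lock);
}

/*
 * End of fetch (the busyobj part of HSH_Unbusy): mark the body complete
 * and drop the fetcher's reference. Returns the references left; the
 * caller that sees zero is the one that frees the busyobj.
 */
int
bo_sketch_finish(struct bo_sketch *bo)
{
	int left;

	pthread_mutex_lock(&bo->lock);
	bo->complete = 1;
	assert(bo->refcnt > 0);
	left = --bo->refcnt;
	pthread_cond_signal(&bo->cond);
	pthread_mutex_unlock(&bo->lock);
	return (left);
}
```

Returning the remaining count rather than freeing inside the function
keeps the free outside busyobj->lock, which is one way to avoid destroying
a mutex that is still held.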


Call flow for streaming delivery only (hit on streaming object):

1. cnt_lookup()
  * Does a HSH_Lookup and gets an oc that is busy
  * Next step is STP_HIT
2. cnt_hit()
  * nothing special
  * Next step is STP_PREPRESP
3. cnt_prepresp()
  * (Here some work on different length algorithms is done)
  * Also calls RES_BuildHttp
  * Next step is STP_DELIVER
4. cnt_deliver()
  * Will be split into separate paths for streaming and non-streaming
    execution
  * Has the stream_ctx on the stack for this delivery
  * Test if object is busy (OC_F_BUSY). If not, the normal
    implementation is executed
  * Lock objhdr
  * Test again on (OC_F_BUSY && objhdr->objcore->busyobj->can_stream),
    if not unlock objhdr and run normal implementation. Normal here
    means go on the waiting list as before, waiting for the first
    parts of the body to appear.
  * Increase busyobj->refcnt
  * Unlock objhdr
  * Will call RES_StreamStart to set up the client receive state
  * Grab busyobj->lock
  * Loop until all has been delivered (delivered up to
    busyobj->total_len bytes or busyobj->complete is true)
    * Try to get a token (busyobj->tokens--). If unsuccessful, wait on
      busyobj->cond (releasing lock) and continue loop
    * If other tokens available, signal busyobj->cond to give the next
      in line a chance to grab a token
    * Loop around while your stream_ctx->end_of_data <
      busyobj->end_of_data
      * Update your stream_ctx end of data from the busyobj
      * Release busyobj->lock
      * Deliver what you can (call RES_StreamPoll)
      * Grab busyobj->lock
    * Release token (busyobj->tokens++)
    * Wait on busyobj->cond (releasing lock)
  * Call RES_StreamEnd
  * Deref busyobj
    * Grab objhead->lock
    * busyobj->refcnt--
    * If zero
      * Free the busyobj
    * Release objhead->lock
  * Next step is STP_DONE
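
The client-side delivery loop above could be sketched roughly as follows.
All names here (struct stream_bo, struct stream_client, client_deliver,
client_stream) are hypothetical, and client_deliver() merely stands in for
RES_StreamPoll. The sketch departs from the steps above in two labeled
ways: it only takes a token when there is new data to send, and it uses
pthread_cond_broadcast on token release instead of a wake-one chain. It
also assumes the fetching thread wakes all waiters when it publishes data
or completes, so no waiter is left sleeping at the end.

```c
#include <pthread.h>

/* Hypothetical subset of the proposed busyobj. */
struct stream_bo {
	int complete;
	int current_len;	/* end of data published by the fetcher */
	int tokens;
	pthread_mutex_t lock;
	pthread_cond_t cond;
};

/* Hypothetical per-delivery state, standing in for stream_ctx. */
struct stream_client {
	int end_of_data;	/* how far this client may deliver */
	int delivered;		/* how far it has delivered */
};

/* Placeholder for RES_StreamPoll: pretend to send up to end_of_data. */
void
client_deliver(struct stream_client *c)
{
	c->delivered = c->end_of_data;
}

void
client_stream(struct stream_bo *bo, struct stream_client *c)
{
	pthread_mutex_lock(&bo->lock);
	for (;;) {
		if (c->end_of_data >= bo->current_len) {
			/* Caught up: finished, or sleep until more arrives. */
			if (bo->complete)
				break;
			pthread_cond_wait(&bo->cond, &bo->lock);
			continue;
		}
		if (bo->tokens == 0) {
			/* Data is available, but no token to send it with. */
			pthread_cond_wait(&bo->cond, &bo->lock);
			continue;
		}
		bo->tokens--;
		while (c->end_of_data < bo->current_len) {
			c->end_of_data = bo->current_len;
			/* Never hold the lock while writing to the client. */
			pthread_mutex_unlock(&bo->lock);
			client_deliver(c);
			pthread_mutex_lock(&bo->lock);
		}
		bo->tokens++;
		/* Assumption: wake everyone; token waiters re-check tokens. */
		pthread_cond_broadcast(&bo->cond);
	}
	pthread_mutex_unlock(&bo->lock);
}
```

Checking busyobj->complete before every wait matters: if the fetch
finishes while a client is busy delivering, an unconditional wait at the
bottom of the loop could sleep with nobody left to wake it.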

Lifetime of busyobj:
* busyobj is created in HSH_Prealloc and attached to objcore
* busyobj is freed by the current calling thread (and the pointer
  in objcore set to NULL) when its refcnt reaches zero
* busyobj is freed whenever the objcore is freed, if the pointer is
  not NULL
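
The lifetime rules above might look like this in code. This is a sketch
only: struct life_bo and struct life_oc and the life_*() functions are
hypothetical, and oh_lock stands in for the objhdr lock that guards the
pointer in objcore.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct life_bo {
	int refcnt;
};

/* Stand-in for objcore plus the objhdr lock guarding its busyobj pointer. */
struct life_oc {
	pthread_mutex_t oh_lock;
	struct life_bo *busyobj;
};

/* As in HSH_Prealloc: the busyobj is born attached, with refcnt == 1. */
void
life_attach(struct life_oc *oc)
{
	pthread_mutex_init(&oc->oh_lock, NULL);
	oc->busyobj = calloc(1, sizeof *oc->busyobj);
	assert(oc->busyobj != NULL);
	oc->busyobj->refcnt = 1;
}

/* A streaming client takes a reference before it starts delivering. */
void
life_ref(struct life_oc *oc)
{
	pthread_mutex_lock(&oc->oh_lock);
	assert(oc->busyobj != NULL);
	oc->busyobj->refcnt++;
	pthread_mutex_unlock(&oc->oh_lock);
}

/* Drop a reference; returns 1 if this call freed the busyobj. */
int
life_deref(struct life_oc *oc)
{
	int freed = 0;

	pthread_mutex_lock(&oc->oh_lock);
	assert(oc->busyobj != NULL && oc->busyobj->refcnt > 0);
	if (--oc->busyobj->refcnt == 0) {
		free(oc->busyobj);
		oc->busyobj = NULL;	/* nobody can reach it any more */
		freed = 1;
	}
	pthread_mutex_unlock(&oc->oh_lock);
	return (freed);
}

/* Freeing the objcore also frees a busyobj that is still attached. */
void
life_oc_fini(struct life_oc *oc)
{
	free(oc->busyobj);	/* free(NULL) is a no-op */
	oc->busyobj = NULL;
	pthread_mutex_destroy(&oc->oh_lock);
}
```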

struct busyobj {
       unsigned magic;
       #define BUSYOBJ_MAGIC something
       int complete;
       int can_stream;
       int refcnt;
       int total_len; /* If known */
       int current_len;
       int tokens;
       pthread_cond_t cond;
       pthread_mutex_t lock;
};
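
For illustration, allocating and tearing down such a struct could look
like the sketch below. The names bo_full, bo_full_new and bo_full_free are
hypothetical, the magic value is an arbitrary placeholder (the struct
above leaves it open), the lock is assumed to be a pthread_mutex_t, and
using -1 for an unknown total_len is an assumption of this sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define BO_FULL_MAGIC 0x1badcafe	/* placeholder value, not from the mail */

struct bo_full {
	unsigned magic;
	int complete;
	int can_stream;
	int refcnt;
	int total_len;		/* -1 when not known (assumption) */
	int current_len;
	int tokens;
	pthread_cond_t cond;
	pthread_mutex_t lock;
};

/* Allocate a busyobj with one reference and the given number of tokens. */
struct bo_full *
bo_full_new(int tokens)
{
	struct bo_full *bo = calloc(1, sizeof *bo);

	if (bo == NULL)
		return (NULL);
	bo->magic = BO_FULL_MAGIC;
	bo->refcnt = 1;
	bo->total_len = -1;
	bo->tokens = tokens;
	if (pthread_mutex_init(&bo->lock, NULL) != 0) {
		free(bo);
		return (NULL);
	}
	if (pthread_cond_init(&bo->cond, NULL) != 0) {
		pthread_mutex_destroy(&bo->lock);
		free(bo);
		return (NULL);
	}
	return (bo);
}

/* Tear down; only legal once the last reference has been dropped. */
void
bo_full_free(struct bo_full *bo)
{
	assert(bo != NULL && bo->magic == BO_FULL_MAGIC);
	assert(bo->refcnt == 0);
	pthread_cond_destroy(&bo->cond);
	pthread_mutex_destroy(&bo->lock);
	bo->magic = 0;		/* make use-after-free trip the magic check */
	free(bo);
}
```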

--
Martin Blix Grydeland
Varnish Software AS