Proposed restructuring of http_conn and where the data is stored
Rogier R. Mulhuijzen
drwilco at drwilco.net
Tue Nov 22 00:04:24 CET 2011
What about decoupling the workspace instead (or maybe as well)? That way
the workspace can be released (back into a pool) at the end of a request,
and not eat up memory during idle-time for a session. And then the
workspace can go with the http_conn to the next worker in your scenario.
This is from one of our servers:
     1 VCL_Recv
     3 Pipe
     3 Reading_Backend
     3 Waiting_List
    10 Connect_Backend
    28 Background
   223 Writing_Client
   225 Linger
   226 Reading_Backend_Hdr
 17274 Waiting_Client_Poll
182004 Idle
With a 128K sess_workspace, that's 22 gigs in Idle sessions.
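
For reference, here is the arithmetic behind that figure as a small C sketch. The helper names are mine, not anything in Varnish, and I'm taking "128K" to mean 128 * 1024 bytes:

```c
#include <assert.h>
#include <stdint.h>

/* Bytes pinned by sessions in one state: session count * workspace size. */
static uint64_t
pinned_bytes(uint64_t n_sessions, uint64_t workspace_bytes)
{
	return (n_sessions * workspace_bytes);
}

/* Whole GiB, rounded down. */
static uint64_t
bytes_to_gib(uint64_t bytes)
{
	return (bytes >> 30);
}
```

pinned_bytes(182004, 128 * 1024) comes to 23855628288 bytes, and bytes_to_gib() of that is 22.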
Now I know this is sess_workspace, and the worker workspace is what you're
after here, but if we decouple workspaces as a whole, we can move them
between workers for your case too, and not spend extra memory on solving it.
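
To make the idea a bit more concrete, here is a minimal sketch of a workspace pool, assuming fixed-size workspaces on a simple free list. All names (wsbuf, ws_pool, ws_pool_get, ws_pool_put) are made up for illustration and are not Varnish's struct ws. Locking is left out, so a shared pool would still need a mutex around get/put:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical pooled workspace; not Varnish's struct ws. */
struct wsbuf {
	struct wsbuf	*next;		/* free-list link */
	size_t		 size;		/* usable bytes in data[] */
	char		 data[];	/* the workspace itself */
};

struct ws_pool {
	struct wsbuf	*free;		/* unused workspaces */
	size_t		 ws_size;	/* size of each workspace */
	unsigned	 n_free;	/* entries on the free list */
};

/* Take a workspace at the start of a request; allocate if the pool is empty. */
static struct wsbuf *
ws_pool_get(struct ws_pool *p)
{
	struct wsbuf *w = p->free;

	if (w != NULL) {
		p->free = w->next;
		p->n_free--;
	} else {
		w = malloc(sizeof *w + p->ws_size);
		if (w == NULL)
			return (NULL);
		w->size = p->ws_size;
	}
	w->next = NULL;
	return (w);
}

/* Give the workspace back at the end of a request, so an idle session holds none. */
static void
ws_pool_put(struct ws_pool *p, struct wsbuf *w)
{
	assert(w != NULL && w->next == NULL);
	w->next = p->free;
	p->free = w;
	p->n_free++;
}
```

A session would call ws_pool_get() when a request arrives and ws_pool_put() when it goes idle. Handing the request to another worker then just means handing over the wsbuf pointer.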
My 2 cents, at least.
Cheers,
DocWilco
On Mon, 21 Nov 2011, Martin Blix Grydeland wrote:
> For the streaming development, some changes will be needed to the http_conn
> and where it stores its data (buffers while reading headers and such, as
> well as read-ahead and pipeline for the http protocol). Today the data is
> stored on the session workspace (for the client communication) and on the
> worker workspace for the backend communication.
>
> For the streaming development this causes problems when we want to hand
> over the body fetching to another worker, as there is read-ahead data in
> the http_conn buffer that it needs access to, but this will then be
> pointing into the workspace of the previous worker. I'd rather decouple
> this, as it creates a strong relationship between the two threads and
> troubles will come if they are not synchronized with regard to this address
> space (e.g. if the client hangs up, the client thread needs to make sure
> the body fetcher thread has finished with the data before it can reuse
> its workspace).
>
> To get around this, I'm proposing to make the http_conns a pooled
> resource of their own, with their own internal buffer space. Something
> along these lines:
>
> - Each thread pool has a list of unused http_conns.
> - Each worker thread has a pool of unused http_conns. When the worker
>   is idle (goes into the pool or starts processing a new request), this
>   pool is grown or shrunk to a size of 2, using the thread pool's list
>   (or creating new ones). The size is 2 because the worker will need one
>   for the client request, and maybe one for the backend fetch.
> - A worker thread takes http_conns from its pool when it needs an HTC
>   (or creates a new one if the pool is empty). It returns them to its
>   pool when they are no longer in use.
> - http_conns can then be transferred from one thread to another and
>   take their data with them. (The receiving worker will then end up with
>   one more, but this is returned to the thread pool's list when it's
>   finished with the request. The thread giving one away gets one from
>   the thread pool.)
> - Whenever housekeeping is done on the worker's list, it will check
>   the buffer sizes against the current parameter sizes, and free the
>   http_conns and create new ones if they have changed.
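
If I read the list above correctly, the rebalancing step could look roughly like the sketch below. Everything here is hypothetical (struct htc, htc_list, wrk_rebalance and the rest are illustration names, not existing Varnish code). The check of buffer sizes against the current parameters is left out, and so is the lock the thread pool's shared list would need:

```c
#include <assert.h>
#include <stdlib.h>

#define WRK_HTC_TARGET 2	/* one for the client, maybe one for the backend */

/* Hypothetical pooled connection object that owns its own buffer. */
struct htc {
	struct htc	*next;
	size_t		 bufsize;
	char		*buf;
};

struct htc_list {
	struct htc	*head;
	unsigned	 n;
};

static void
htc_push(struct htc_list *l, struct htc *h)
{
	assert(h != NULL);
	h->next = l->head;
	l->head = h;
	l->n++;
}

static struct htc *
htc_pop(struct htc_list *l)
{
	struct htc *h = l->head;

	if (h != NULL) {
		l->head = h->next;
		h->next = NULL;
		l->n--;
	}
	return (h);
}

static struct htc *
htc_new(size_t bufsize)
{
	struct htc *h = calloc(1, sizeof *h);

	if (h == NULL)
		return (NULL);
	h->buf = malloc(bufsize);
	if (h->buf == NULL) {
		free(h);
		return (NULL);
	}
	h->bufsize = bufsize;
	return (h);
}

/*
 * Housekeeping while the worker is idle: bring its private list to
 * WRK_HTC_TARGET entries, spilling to or refilling from the shared list.
 */
static void
wrk_rebalance(struct htc_list *wrk, struct htc_list *shared, size_t bufsize)
{
	struct htc *h;

	while (wrk->n > WRK_HTC_TARGET)
		htc_push(shared, htc_pop(wrk));
	while (wrk->n < WRK_HTC_TARGET) {
		h = htc_pop(shared);
		if (h == NULL)
			h = htc_new(bufsize);
		if (h == NULL)
			break;		/* out of memory: try again later */
		htc_push(wrk, h);
	}
}
```

A handoff is then an htc_pop() on the giving worker's list and an htc_push() on the receiving one. The next wrk_rebalance() on each side evens things out again, and only that step touches the shared list.
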
>
> I believe this creates a mostly lock-free system, while still keeping these
> data structures decoupled from the session/thread, so they can be
> transferred between them when that is needed.
>
> Any comments?
>
> --
> Martin Blix Grydeland
> Varnish Software AS
>