author     Roy T. Fielding <fielding@apache.org>   2000-07-13 08:00:11 +0000
committer  Roy T. Fielding <fielding@apache.org>   2000-07-13 08:00:11 +0000
commit     e465ba32742cbdb931dc0da6b38bed1ca98f7c38 (patch)
tree       72f48eeb9244ee893b70d8431bb2a58e6f2c6836 /buckets
parent     a5c5727561ccdb46821c643dde63d03284b82694 (diff)
download   apr-e465ba32742cbdb931dc0da6b38bed1ca98f7c38.tar.gz
Save another old thread on stacked-io
Submitted by: Ed Korthof, Ben Hyde
git-svn-id: https://svn.apache.org/repos/asf/apr/apr/trunk@60355 13f79535-47bb-0310-9956-ffa450edef68
Diffstat (limited to 'buckets')
-rw-r--r--   buckets/doc_stacked_io.txt   257
1 file changed, 257 insertions, 0 deletions
diff --git a/buckets/doc_stacked_io.txt b/buckets/doc_stacked_io.txt
index 8828abb2d..8adfa8c86 100644
--- a/buckets/doc_stacked_io.txt
+++ b/buckets/doc_stacked_io.txt
@@ -552,3 +552,260 @@

a few bits of metadata to do HTTP: file size and last modified. We need
an etag generation function, it is specific to the filters in use. You
see, I'm envisioning a bottom layer which pulls data out of a database
rather than reading from a file.]


*************************************************************************
Date: Sun, 27 Dec 1998 13:08:22 -0800 (PST)
From: Ed Korthof <ed@bitmechanic.com>
To: new-httpd@apache.org
Subject: I/O filters & reference counts
Message-ID: <Pine.LNX.3.96.981224163237.10687E-100000@crankshaft>

Hi --

A while back, I indicated I'd propose a way to do reference counts with
the layered I/O I want to implement for 2.0 (assuming we don't use
nspr)... for single-threaded Apache, this seems unnecessary (assuming
you don't use shared memory in your filters to share data among the
processes), but in other situations it does have advantages.

Anyway, what I'd propose involves using a special syntax when you want
to use reference counts. This allows Apache to continue using the
'pool'-based memory system (it may not be perfect, but imo it's
reasonably good), without creating difficulties when you wish to free
memory.

If you're creating memory which you'll want to share among multiple
threads, you'll create it using a function more or less like:

    ap_palloc_share(pool *p, size_t size);

You get back a void * pointer for use as normal. When you want to give
someone else a reference to it, you do the following:

    ap_pshare_data(pool *p1, pool *p2, void *data);

where data is the return from above (and it must be the same). Then
both pools have a reference to the data & to a counter; when each pool
is cleaned up, it will automatically decrement the counter, and free
the data if the counter is down to zero.
In addition, a pool can decrement the counter with the following:

    ap_pshare_free(pool *p1, void *data);

after which the data may be freed. There would also be a function,

    ap_pshare_countrefs(pool *p1, void *data);

which would return the number of pools holding a ref to 'data', or 1 if
it's not a shared block.

Internally, the pool might either keep a list of the shared blocks, or
a balanced b-tree; if those are too slow, I'd look into passing back
and forth a (pointer to an) int, and simply use an array. The filter
declaring the shared memory would need to keep track of such an int,
but no one else would.

In the context of I/O filters, this would mean that each read function
returns a const char *, which should not be cast to a non-const char *
(at least, not without calling ap_pshare_countrefs()). If a filter
screwed this up, you'd have a problem -- but that's more or less
unavoidable when sharing data among threads using reference counts.

It might make sense to build a more general reference counting system;
if that's what people want, I'm also up for working on that. But one of
the advantages the pool system has is its simplicity, some of which
would be lost.

Anyway, how does this sound? Reasonable or absurd?

Thanks --

Ed
----------------------------------------
History repeats itself, first as tragedy, second as farce. - Karl Marx

*************************************************************************
From: Ben Hyde <bhyde@pobox.com>
Date: Tue, 29 Dec 1998 11:50:01 -0500 (EST)
To: new-httpd@apache.org
Subject: Re: I/O filters & reference counts
In-Reply-To: <Pine.LNX.3.96.981227192210.10687H-100000@crankshaft>
References: <Pine.GSO.3.96.981227185303.8793B-100000@elaine21.Stanford.EDU>
        <Pine.LNX.3.96.981227192210.10687H-100000@crankshaft>
Message-ID: <13960.60942.186393.799490@zap.ml.org>


There are two problems that reference counts address that we have,
but I still don't like them.
These two are: pipeline memory management, and response paste up. A
good pipeline ought not _require_ memory proportional to the size of
the response, but only proportional to the diameter of the pipe.
Response paste up is interesting because the library of clip art is
longer lived than the response or connection pool. There is a lot to
be said for leveraging the configuration pool life cycle for this kind
of thing.

The pipeline design, and the handling of the memory it uses, become
very entangled after a while -- I can't think about one without the
other. This is the right place to look at this problem. I.e., this
is a problem to be led by buff.c rework, not alloc.c rework.

Many pipeline operations require tight coupling to primitive
operations that happen to be efficient. Neat instructions, memory
mapping, etc. Extreme efficiency in this pipeline makes it desirable
that the chunks in the pipeline be large. I like the phrase "chunks
and pumps" to summarize that there are two elements to design to get
modularity right here.

The paste up problem -- one yearns for a library of fragments (call it
a cache, clip art, or templates if you like) from which readers can
assemble responses. Some librarians like to discard stale bits, and
they need a scheme to know that the readers have all finished. The
library resides in a pool that lives longer than a single response or
connection. If the librarian can be convinced that the server restart
cycles are useful, we can fall back to those.

I can't smell yet where the paste up problem belongs in the 2.0
design: (a) in the core, (b) in a module, (c) as a subpart of the
pipeline design, or (d) ostracized outside 2.0 to await a gift (XML?)
we then fold into Apache. I could probably argue any one of these. A
good coupling between this mechanism and the pipeline is good; limits
on the pipeline design space are very good.
 - ben


*************************************************************************
Date: Mon, 4 Jan 1999 18:26:36 -0800 (PST)
From: Ed Korthof <ed@bitmechanic.com>
To: new-httpd@apache.org
Subject: Re: I/O filters & reference counts
In-Reply-To: <13960.60942.186393.799490@zap.ml.org>
Message-ID: <Pine.LNX.3.96.981231094653.486R-100000@crankshaft>

On Tue, 29 Dec 1998, Ben Hyde wrote:

> There are two problems that reference counts address that we have,
> but I still don't like them.

They certainly add some clutter. But they offer a solution to the
problems listed below... and specifically to an issue which you brought
up a while back: avoiding a memcpy in each read layer which has a read
function other than the default one. Sometimes a memcpy is required,
sometimes not; with "reference counts", you can go either way.

> These two are: pipeline memory management, and response paste up. A
> good pipeline ought not _require_ memory proportional to the size of
> the response, but only proportional to the diameter of the pipe.
> Response paste up is interesting because the library of clip art is
> longer lived than the response or connection pool. There is a lot to
> be said for leveraging the configuration pool life cycle for this
> kind of thing.

I was indeed assuming that we would use pools which would last from one
restart (and a run through of the configuration functions) to the next.

So far as limiting the memory requirements of the pipeline -- this is
primarily a function of the module programming. Because the pipeline
will generally live in a single thread (with the possible exception of
the data source, which could be in another process), the thread will
only be operating on a single filter at a time (unless you added custom
code to create a new thread to handle one part of the pipeline -- ugh).
For writing, the idea would be to print one or more blocks of text with
each call; wait for the write function to return; and then recycle the
buffers used.

Reading has no writev equivalent, so you'd only be able to do it one
block at a time, but this seems alright to me (reading data is actually
a much less complicated procedure in practice -- at least, with the
applications which I've seen).

Recycling read buffers (so as to limit the size of the memory pipeline)
is the hardest part, when we add in this 'reference count' scheme -- but
it can be done, if the modules receiving the data are polite and
indicate when they're done with the buffer. I.e.:

    module 1                          module 2
1.) reads from module 2:
      char * ap_bread(BUFF *, pool *, int);

2.)                                   returns a block of text w/ ref
                                      counts:
                                        str = ap_pshare_alloc(size_t);
                                        ...
                                        return str;
                                      keeps a ref to str.

3.) handles the block of data
    returned, and indicates it's
    finished with:
      void ap_pshare_free(char * block);
    reads more data via
      char * ap_bread(BUFF *, pool *, int);

4.)                                   tries to recycle the buffer used:
                                        if (ap_pshare_count_refs(str) == 1)
                                            reuse str
                                        else
                                            str = ap_pshare_alloc(...)
                                        ...
                                        return str;

5.) handles the block of data
    returned...
...

One disadvantage is that if module 1 doesn't release its hold on a
memory block it got from step 2 until step 5, then the memory block
wouldn't be reused -- you'd pay with a free & a malloc (or with a
significant increase in complexity -- I'd probably choose the free &
malloc). And if the module failed to release the memory (via
ap_pshare_free), then the memory requirements would be as large as the
response (or request).

I believe this is only relevant for clients PUTting large files onto
their servers; but with files which are potentially many gigabytes, it
is important that filters handling reading do this correctly. Of
course, that's currently the situation anyhow.
> The pipeline design, and the handling of the memory it uses, become
> very entangled after a while -- I can't think about one without the
> other. This is the right place to look at this problem. I.e., this
> is a problem to be led by buff.c rework, not alloc.c rework.

Yeah, after thinking about it a little bit I realized that no (or very
little) alloc.c work would be needed to implement the system which I
described. Basically, you'd have an Apache API function which does
malloc on its own, and other functions (also in the API) which register
a cleanup function (for the malloc'ed memory) in appropriate pools.

IMO, the 'pipeline' is likely to be the easiest place to work with
this, at least in terms of getting the most efficient & clean design
which we can.

[snip good comments]
> I can't smell yet where the paste up problem belongs in the 2.0
> design: (a) in the core, (b) in a module, (c) as a subpart of the
> pipeline design, or (d) ostracized outside 2.0 to await a gift (XML?)
> we then fold into Apache. I could probably argue any one of these. A
> good coupling between this mechanism and the pipeline is good; limits
> on the pipeline design space are very good.

An overdesigned pipeline system (or an overly large one) would
definitely not be helpful. If it would be useful, I'm happy to work on
this (even if y'all aren't sure if you'd want to use it); if not, I'm
sure I can find things to do with my time. <g>

Anyway, I went to CPAN and got a copy of sfio... the latest version I
found is from Oct 1997. I'd guess that using it (assuming this is
possible) might give us slightly less efficiency (simply because sfio
wasn't built specifically for Apache, and customizing it is a much more
involved process), but possibly fewer bugs to work out & lots of
interesting features.

thanks --

Ed, slowly reading through the sfio source code