Diffstat (limited to 'buckets/doc_SFmtg.txt')
-rw-r--r-- | buckets/doc_SFmtg.txt | 172 |
1 files changed, 0 insertions, 172 deletions
diff --git a/buckets/doc_SFmtg.txt b/buckets/doc_SFmtg.txt
deleted file mode 100644
index bf2fed23c..000000000
--- a/buckets/doc_SFmtg.txt
+++ /dev/null
@@ -1,172 +0,0 @@
-
-From akosut@leland.Stanford.EDU Thu Jul 23 09:38:40 1998
-Date: Sun, 19 Jul 1998 00:12:37 -0700 (PDT)
-From: Alexei Kosut <akosut@leland.Stanford.EDU>
-To: new-httpd@apache.org
-Subject: Apache 2.0 - an overview
-
-For those not at the Apache meeting in SF, and even for those who were,
-here's a quick overview of (my understanding of) the Apache 2.0
-architecture that we came up with. I present this to make sure that I have
-it right, and to get opinions from the rest of the group. Enjoy.
-
-
-1. "Well, if we haven't released 2.0 by Christmas of 1999, it won't
-   matter anyway."
-
-A couple of notes about this plan: I'm looking at this right now from a
-design standpoint, not an implementation one. If the plan herein were
-actually coded as-is, you'd get a very inefficient web server. But as
-Donald Knuth (Professor emeritus at Stanford, btw... :) points out,
-"premature optimization is the root of all evil." Rest assured there are
-plenty of ways to make sure Apache 2.0 is much faster than Apache 1.3.
-Taking out all the "slowness" code, for example... :)
-
-Also, the main ideas in this document mainly come from Dean Gaudet, Simon
-Spero, Cliff Skolnick and a bunch of other people, from the Apache Group's
-meeting in San Francisco, July 2 and 3, 1998. The other ideas come from
-other people. I'm being vague because I can't quite remember. We should
-have videotaped it. I've titled the sections of this document with quotes
-from our meeting, but they are paraphrased from memory, so don't take them
-too seriously.
-
-2. "But Simon, how can you have a *middle* end?"
-
-One of the main goals of Apache 2.0 is protocol independence (i.e.,
-serving HTTP/1.1, HTTP-NG, and maybe FTP or gopher or something). Another
-is to rid the server of the belief that everything is a file.
Towards this
-end, we divide the server up into three parts, the front end, the middle
-end, and the back end.
-
-The front end is essentially a combination of http_main and http_protocol
-today. It takes care of all network and protocol matters, interpreting the
-request, putting it into a protocol-neutral form, and (possibly) passing
-it off to the rest of the server. This is approximately equivalent to the
-part of Apache contained in Dean's flow stuff, and it also works very well
-in certain non-Unix-like architectures such as clustered mainframes. In
-addition, part of this front-end might be optionally run in kernel space,
-giving a very fast server indeed...
-
-The back end is what generates the content. At the back of the back end we
-have backing stores (Cliff's term), which contain actual data. These might
-represent files on a disk, entries in a database, CGI scripts, etc... The
-back end also consists of other modules, which can alter the request in
-various fashions. The objects the server acts on can be thought of (Cliff
-again) as a filehandle and a set of key/value pairs (metainformation).
-The modules are set up as filters that can alter either one of those,
-stacking I/O routines onto the stream of data, or altering the
-metainformation.
-
-The middle end is what comes between the front and back ends. Think of
-http_request. This section takes care of arranging the modules, backing
-stores, etc... into a manner so that the path of the request will result
-in the correct entity being delivered to the front end and sent to the
-client.
-
-3. "I won't embarrass you guys with the numbers for how well Apache
-   performs compared to IIS." (on NT)
-
-For a server that was designed to handle flat files, Apache does it
-surprisingly poorly, compared with other servers that have been optimized
-for it. And the performance for non-static files is, of course, worse.
-While Apache is still more than fast enough for 95% of Web servers, we'd
-be remiss to dismiss those other 5% (they're the fun ones anyway). Another
-problem Apache has is its lack of a good, caching, proxy module.
-
-Put these together, along with the work Dean has done with the flow and
-mod_mmap_static stuff, and we realize the most important part of Apache
-2.0: a built-in, all-pervasive, cache. Every part of the request process
-will involve caching. In the path outlined above, between each layer of
-the request, between each module, sits the cache, which can (when it is
-useful), cache the response and its metainformation - including its
-variance, so it knows when it is safe to give out the cached copy. This
-gives every opportunity to increase the speed of the server by making sure
-it never has to dynamically create content more than it needs to, and
-renders accelerators such as Squid unnecessary.
-
-This also allows what I alluded to earlier: a kernel (or near-to-kernel)
-based web server component, which could read the request, consult the
-cache to find the requested object, and spit it back out, without so much
-as an interrupt in the way. Of course, the rest of Apache (with all its
-modules - it's generally a bad idea to let unknown, untrusted code, insert
-itself into the kernel) sits up in user-space, ready to handle any request
-the micro-Apache can't.
-
-A built-in cache also makes a real working HTTP/1.1 proxy server trivially
-easy to write.
-
-4. "Stop asking about backwards compatibility with the API. We'll write a
-   compatibility module... later."
-
-If modules are as described above, then obviously they are very much
-distinct from how Apache's current modules function. The only module
-function that is similar to the current model is the handler, or backing
-store, that actually provides the basic stream of data that the server
-alters to produce a response entity.
-
-The basic module's approach to its job is to stack a filter onto the
-output. But it's better to think of the modules not as a stack that the
-request flows through (a layer cake with cache icing between the layers),
-but more of a mosaic (pretend I didn't use that word. I wrote collage. You
-can't prove anything), with modules stuck onto various sides of the
-request at different points, altering the request/response.
-
-Today's Apache modules take an all-or-nothing approach to request
-handlers. They tell Apache what they can do, overestimating, and then are
-supposed to DECLINE if they don't pass a number of checks they are
-supposed to make. Most modules don't do this correctly. The better
-approach is to allow the modules to inform Apache exactly of what they can
-do, and have Apache (the middle-end) take care of invoking them when
-appropriate.
-
-The final goal of all of this, of course, is simply to allow CGI output to
-be parsed for server-side includes. But don't tell Dean that.
-
-5. "Will Apache run without any of the normal Unix binaries installed,
-   only the BSD/POSIX libraries?"
-
-Another major issue is, of course, configuration of the server. There are
-a number of distinct opinions on this, both as to what should be
-configured and how it should be done. We talked mainly about the latter,
-but we did touch on the former. Obviously, with a radically distinct
-module API, the configuration is radically different. We need a good way
-to specify how the modules are supposed to interact, and of controlling
-what they can do, when and how, balancing what the user asks the server to
-do, and what the module (author) wants the server to do. We didn't really
-come up with a good answer to this.
-
-However, we did make some progress on the other side of the issue: We
-agreed that the current configuration system is definitely taking the
-right approach.
Having a well-defined repository of the configuration
-scheme, containing the possible directives, when they are applicable, what
-their parameters are, etc... is the right way to go. We agreed that more
-information and stronger-typing (no RAW_ARGS!) would be good, and may
-enable on-the-fly generated configuration managers.
-
-We agreed that such a program, probably external to Apache, would generate
-a configuration and pass it to Apache, either via a standard config file,
-or by calling Apache API functions. It is desirable to be able to go the
-other way, pulling current configuration from Apache to look at, and
-perhaps change it on the fly, but unfortunately it is unlikely this
-information would always be available; modules may perform optimizations
-on their configuration that makes the original configuration unavailable.
-
-For the language and specification of the configuration, we thought
-perhaps XML might be a good approach, and agreed it should be looked
-into. Other issues, such as SNMP, were brought up and laughed at.
-
-6. "So you're saying that the OS that controls half the banks, and 90% of
-   the airlines, doesn't even have memory protection for separate
-   processes?"
-
-Obviously, there are a lot more items that have to be part of Apache 2.0,
-and we talked about a number of them. However, the four points above, I
-think, represent the core of the architecture we agreed on as a starting
-point.
-
--- Alexei Kosut <akosut@stanford.edu> <http://www.stanford.edu/~akosut/>
-   Stanford University, Class of 2001 * Apache <http://www.apache.org> *
-
-
-
-