1 files changed, 538 insertions, 0 deletions
diff --git a/docs/do-it-yourself-framework.txt b/docs/do-it-yourself-framework.txt
new file mode 100644
index 0000000..ae77ec0
--- /dev/null
+++ b/docs/do-it-yourself-framework.txt
@@ -0,0 +1,538 @@
+A Do-It-Yourself Framework
+++++++++++++++++++++++++++
+
+:author: Ian Bicking <ianb@colorstudy.com>
+:revision: $Rev$
+:date: $LastChangedDate$
+
+This tutorial has been translated `into Portuguese
+<http://montegasppa.blogspot.com/2007/06/um-framework-faa-voc-mesmo.html>`_.
+
+A newer version of this article is available `using WebOb
+<http://pythonpaste.org/webob/do-it-yourself.html>`_.
+
+.. contents::
+
+.. comments:
+
+   Explain SCRIPT_NAME/PATH_INFO better
+
+Introduction and Audience
+=========================
+
+This short tutorial is meant to teach you a little about WSGI, and as
+an example a bit about the architecture that Paste has enabled and
+encourages.
+
+This isn't an introduction to all the parts of Paste -- in fact, we'll
+only use a few, and explain each part.  This isn't to encourage
+everyone to go off and make their own framework (though honestly I
+wouldn't mind).  The goal is that when you have finished reading this
+you feel more comfortable with some of the frameworks built using this
+architecture, and a little more secure that you will understand the
+internals if you look under the hood.
+
+What is WSGI?
+=============
+
+At its simplest WSGI is an interface between web servers and web
+applications.  We'll explain the mechanics of WSGI below, but a higher
+level view is to say that WSGI lets code pass around web requests in a
+fairly formal way.  But there's more!  WSGI is more than just HTTP.
+It might seem like it is just *barely* more than HTTP, but that little
+bit is important:
+
+* You pass around a CGI-like environment, which means data like
+  ``REMOTE_USER`` (the logged-in username) can be securely passed
+  about.
+
+* A CGI-like environment can be passed around with more context --
+  specifically instead of just one path you get two: ``SCRIPT_NAME``
+  (how we got here) and ``PATH_INFO`` (what we have left).
+
+* You can -- and often should -- put your own extensions into the WSGI
+  environment.  This allows for callbacks, extra information,
+  arbitrary Python objects, or whatever you want.  These are things
+  you can't put in custom HTTP headers.
+
+This means that WSGI can be used not just between a web server and an
+application, but can be used at all levels for communication.  This
+allows web applications to become more like libraries -- well
+encapsulated and reusable, but still with rich reusable functionality.
+
+Writing a WSGI Application
+==========================
+
+The first part is about how to use `WSGI
+<http://www.python.org/peps/pep-0333.html>`_ at its most basic.  You
+can read the spec, but I'll do a very brief summary:
+
+* You will be writing a *WSGI application*.  That's an object that
+  responds to requests.  An application is just a callable object
+  (like a function) that takes two arguments: ``environ`` and
+  ``start_response``.
+
+* The environment looks a lot like a CGI environment, with keys like
+  ``REQUEST_METHOD``, ``HTTP_HOST``, etc.
+
+* The environment also has some special keys like ``wsgi.input`` (the
+  input stream, like the body of a POST request).
+
+* ``start_response`` is a function that starts the response -- you
+  give the status and headers here.
+
+* Lastly the application returns an iterable that yields the response
+  body (commonly this is just a list of strings, or just a list
+  containing one string that is the entire body).
+
+So, here's a simple application::
+
+    def app(environ, start_response):
+        start_response('200 OK', [('content-type', 'text/html')])
+        return ['Hello world!']
+
+Well... that's unsatisfying.  Sure, you can imagine what it does, but
+you can't exactly point your web browser at it.
+
+There are other, cleaner ways to do this, but this tutorial isn't
+about *clean*; it's about *easy-to-understand*.  So just add this to
+the bottom of your file::
+
+    if __name__ == '__main__':
+        from paste import httpserver
+        httpserver.serve(app, host='127.0.0.1', port='8080')
+
+Now visit http://localhost:8080 and you should see your new app.
+If you want to understand how a WSGI server works, I'd recommend
+looking at the `CGI WSGI server
+<http://www.python.org/peps/pep-0333.html#the-server-gateway-side>`_
+in the WSGI spec.
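+
+To make the calling convention concrete, here is a minimal sketch of
+what a server does with an application, using only the standard
+library.  The ``hello`` application and the ``call_app`` helper are
+hypothetical names made up for this illustration.  The body is a byte
+string because that is what Python 3 servers expect, whereas the
+examples in this tutorial return plain Python 2 strings::
+
+    from wsgiref.util import setup_testing_defaults
+
+    def hello(environ, start_response):
+        start_response('200 OK', [('content-type', 'text/html')])
+        return [b'Hello world!']
+
+    def call_app(app, path='/'):
+        # Build a minimal, valid WSGI environment for a GET request.
+        environ = {'SCRIPT_NAME': '', 'PATH_INFO': path}
+        setup_testing_defaults(environ)
+        captured = {}
+
+        def start_response(status, headers, exc_info=None):
+            # A real server would write these to the socket.
+            captured['status'] = status
+            captured['headers'] = headers
+
+        # The application returns an iterable of body chunks.
+        body = b''.join(app(environ, start_response))
+        return captured['status'], captured['headers'], body
+
+    status, headers, body = call_app(hello)
+
+Calling an application by hand like this is also a handy way to test
+it without starting a server.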
+
+An Interactive App
+------------------
+
+That last app wasn't very interesting.  Let's at least make it
+interactive.  To do that we'll give a form, and then parse the form
+fields::
+
+    from paste.request import parse_formvars
+
+    def app(environ, start_response):
+        fields = parse_formvars(environ)
+        if environ['REQUEST_METHOD'] == 'POST':
+            start_response('200 OK', [('content-type', 'text/html')])
+            return ['Hello, ', fields['name'], '!']
+        else:
+            start_response('200 OK', [('content-type', 'text/html')])
+            return ['<form method="POST">Name: <input type="text" '
+                    'name="name"><input type="submit"></form>']
+
+The ``parse_formvars`` function just takes the WSGI environment and
+calls the `cgi <http://python.org/doc/current/lib/module-cgi.html>`_
+module (the ``FieldStorage`` class) and turns that into a MultiDict.
+
+Now For a Framework
+===================
+
+Now, this probably feels a bit crude.  After all, we're testing for
+things like ``REQUEST_METHOD`` to handle more than one thing, and it's
+unclear how you can have more than one page.
+
+We want to build a framework, which is just a kind of generic
+application.  In this tutorial we'll implement an *object publisher*,
+which is something you may have seen in Zope, Quixote, or CherryPy.
+
+Object Publishing
+-----------------
+
+In a typical Python object publisher you translate ``/`` to ``.``.  So
+``/articles/view?id=5`` turns into ``root.articles.view(id=5)``.  We
+have to start with some root object, of course, which we'll pass in...
+
+::
+
+    class ObjectPublisher(object):
+
+        def __init__(self, root):
+            self.root = root
+
+        def __call__(self, environ, start_response):
+            ...
+
+    app = ObjectPublisher(my_root_object)
+
+We override ``__call__`` to make instances of ``ObjectPublisher``
+callable objects, just like a function, and just like WSGI
+applications.  Now all we have to do is translate that ``environ``
+into the thing we are publishing, then call that thing, then turn the
+response into what WSGI wants.
+
+The Path
+--------
+
+WSGI puts the requested path into two variables: ``SCRIPT_NAME`` and
+``PATH_INFO``.  ``SCRIPT_NAME`` is everything that was used up
+*getting here*.  ``PATH_INFO`` is everything left over -- it's
+the part the framework should be using to find the object.  If you put
+the two back together, you get the full path used to get to where we
+are right now; this is very useful for generating correct URLs, and
+we'll make sure we preserve this.
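+
+As a rough illustration of that bookkeeping, the sketch below moves
+one path segment from ``PATH_INFO`` to ``SCRIPT_NAME``.  The
+``pop_segment`` helper is a hypothetical name for this example, not
+part of Paste::
+
+    def pop_segment(environ):
+        """Move the next path segment from PATH_INFO to SCRIPT_NAME."""
+        path_info = environ.get('PATH_INFO', '').lstrip('/')
+        if not path_info:
+            # Nothing left to consume.
+            return None
+        parts = path_info.split('/', 1)
+        segment = parts[0]
+        rest = '/' + parts[1] if len(parts) == 2 else ''
+        script_name = environ.get('SCRIPT_NAME', '')
+        environ['SCRIPT_NAME'] = script_name + '/' + segment
+        environ['PATH_INFO'] = rest
+        return segment
+
+    environ = {'SCRIPT_NAME': '', 'PATH_INFO': '/articles/view'}
+    first = pop_segment(environ)
+    # SCRIPT_NAME is now '/articles' and PATH_INFO is '/view'; joined
+    # together they still give the original '/articles/view'.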
+
+So here's how we might implement ``__call__``::
+
+    def __call__(self, environ, start_response):
+        fields = parse_formvars(environ)
+        obj = self.find_object(self.root, environ)
+        response_body = obj(**fields.mixed())
+        start_response('200 OK', [('content-type', 'text/html')])
+        return [response_body]
+
+    def find_object(self, obj, environ):
+        path_info = environ.get('PATH_INFO', '')
+        if not path_info or path_info == '/':
+            # We've arrived!
+            return obj
+        # PATH_INFO always starts with a /, so we'll get rid of it:
+        path_info = path_info.lstrip('/')
+        # Then split the path into the "next" chunk, and everything
+        # after it ("rest"):
+        parts = path_info.split('/', 1)
+        next = parts[0]
+        if len(parts) == 1:
+            rest = ''
+        else:
+            rest = '/' + parts[1]
+        # Hide private methods/attributes:
+        assert not next.startswith('_')
+        # Now we get the attribute; getattr(a, 'b') is equivalent
+        # to a.b...
+        next_obj = getattr(obj, next)
+        # Now fix up SCRIPT_NAME and PATH_INFO...
+        environ['SCRIPT_NAME'] = environ.get('SCRIPT_NAME', '') + '/' + next
+        environ['PATH_INFO'] = rest
+        # and now parse the remaining part of the URL...
+        return self.find_object(next_obj, environ)
+
+And that's it, we've got a framework.
+
+Taking It For a Ride
+--------------------
+
+Now, let's write a little application.  Put that ``ObjectPublisher``
+class into a module ``objectpub``::
+
+    from objectpub import ObjectPublisher
+
+    class Root(object):
+
+        # The "index" method:
+        def __call__(self):
+            return '''
+            <form action="welcome">
+            Name: <input type="text" name="name">
+            <input type="submit">
+            </form>
+            '''
+
+        def welcome(self, name):
+            return 'Hello %s!' % name
+
+    app = ObjectPublisher(Root())
+
+    if __name__ == '__main__':
+        from paste import httpserver
+        httpserver.serve(app, host='127.0.0.1', port='8080')
+
+Alright, done!  Oh, wait.  There are still some big missing features:
+for instance, how do you set headers?  And instead of giving ``404 Not
+Found`` responses in some places, you'll just get an
+``AttributeError``.  We'll fix those up in a later installment...
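+
+In the meantime, here is one way the 404 case might be approached,
+given only as a sketch.  Both ``not_found`` and ``safe_getattr`` are
+hypothetical helpers invented for this example, not part of Paste,
+and the body is a byte string as Python 3 servers expect::
+
+    def not_found(environ, start_response):
+        """A WSGI application that only knows how to say 404."""
+        body = b'Not Found'
+        start_response('404 Not Found',
+                       [('content-type', 'text/plain'),
+                        ('content-length', str(len(body)))])
+        return [body]
+
+    def safe_getattr(obj, name):
+        """Return obj.<name>, or None for private or missing names."""
+        if name.startswith('_'):
+            return None
+        return getattr(obj, name, None)
+
+    class Root(object):
+        def welcome(self, name):
+            return 'Hello %s!' % name
+
+    found = safe_getattr(Root(), 'welcome')
+    missing = safe_getattr(Root(), 'nothing_here')
+    private = safe_getattr(Root(), '__class__')
+
+A publisher could then return ``not_found(environ, start_response)``
+whenever the lookup gives ``None``, instead of letting the exception
+escape.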
+
+Give Me More!
+-------------
+
+You'll notice some things are missing here.  Most specifically,
+there's no way to set the output headers, and the information on the
+request is a little slim.
+
+::
+
+    # This is just a dictionary-like object that has case-
+    # insensitive keys:
+    from paste.response import HeaderDict
+
+    class Request(object):
+        def __init__(self, environ):
+            self.environ = environ
+            self.fields = parse_formvars(environ)
+
+    class Response(object):
+        def __init__(self):
+            self.headers = HeaderDict(
+                {'content-type': 'text/html'})
+
+Now I'll teach you a little trick.  We don't want to change the
+signature of the methods.  But we can't put the request and response
+objects in normal global variables, because we want to be
+thread-friendly, and all threads see the same global variables (even
+if they are processing different requests).
+
+But Python 2.4 introduced a concept of "thread-local values".  That's
+a value that just this one thread can see.  This is in the
+`threading.local <http://docs.python.org/lib/module-threading.html>`_
+object.  When you create an instance of ``local`` any attributes you
+set on that object can only be seen by the thread you set them in.  So
+we'll attach the request and response objects here.
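+
+A small sketch of that behaviour, using only the standard library.
+The ``handle`` function and the ``seen`` dictionary are made up for
+this illustration, and ``threading.Barrier`` assumes Python 3.2 or
+later::
+
+    import threading
+
+    webinfo = threading.local()
+    seen = {}
+    all_set = threading.Barrier(3)
+
+    def handle(name):
+        # Each thread stores its own "request"...
+        webinfo.request = name
+        # ...waits until every other thread has stored one too...
+        all_set.wait()
+        # ...and still reads back only the value it set itself.
+        seen[name] = webinfo.request
+
+    names = ['request-0', 'request-1', 'request-2']
+    threads = [threading.Thread(target=handle, args=(n,)) for n in names]
+    for t in threads:
+        t.start()
+    for t in threads:
+        t.join()
+
+    # The main thread never set the attribute, so it cannot see one.
+    main_sees_request = hasattr(webinfo, 'request')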
+
+So, let's remind ourselves of what the ``__call__`` function looked
+like::
+
+    class ObjectPublisher(object):
+        ...
+
+        def __call__(self, environ, start_response):
+            fields = parse_formvars(environ)
+            obj = self.find_object(self.root, environ)
+            response_body = obj(**fields.mixed())
+            start_response('200 OK', [('content-type', 'text/html')])
+            return [response_body]
+
+Let's update that::
+
+    import threading
+    webinfo = threading.local()
+
+    class ObjectPublisher(object):
+        ...
+
+        def __call__(self, environ, start_response):
+            webinfo.request = Request(environ)
+            webinfo.response = Response()
+            obj = self.find_object(self.root, environ)
+            response_body = obj(**dict(webinfo.request.fields))
+            start_response('200 OK', webinfo.response.headers.items())
+            return [response_body]
+
+Now in our method we might do::
+
+    class Root:
+        def rss(self):
+            webinfo.response.headers['content-type'] = 'text/xml'
+            ...
+
+If we were being fancier we would do things like handle `cookies
+<http://python.org/doc/current/lib/module-Cookie.html>`_ in these
+objects.  But we aren't going to do that now.  You have a framework,
+be happy!
+
+WSGI Middleware
+===============
+
+`Middleware
+<http://www.python.org/peps/pep-0333.html#middleware-components-that-play-both-sides>`_
+is where people get a little intimidated by WSGI and Paste.
+
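+What is middleware?  Middleware is software that serves as an
+intermediary: it sits between the server and an application, looking
+like an application to the server and like a server to the
+application.
+
+So let's write one.  We'll write an authentication middleware, so that
+you can keep your greeting from being seen by just anyone.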
+
+Let's use HTTP authentication, which also can mystify people a bit.
+HTTP authentication is fairly simple:
+
+* When authentication is required, we give a ``401 Authentication
+  Required`` status with a ``WWW-Authenticate: Basic realm="This
+  Realm"`` header
+
+* The client then sends back a header ``Authorization: Basic
+  encoded_info``
+
+* The "encoded_info" is a base-64 encoded version of
+  ``username:password``
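+
+The header format can be sketched with the standard ``base64`` module.
+The two helper names below are hypothetical, chosen for this example
+only, and the code assumes Python 3 string handling (the middleware
+later in this tutorial uses the Python 2 ``decode('base64')`` idiom
+instead)::
+
+    import base64
+
+    def make_basic_header(username, password):
+        """Build the value a client sends in ``Authorization``."""
+        raw = ('%s:%s' % (username, password)).encode('utf-8')
+        return 'Basic ' + base64.b64encode(raw).decode('ascii')
+
+    def parse_basic_header(header):
+        """Return (username, password), or None if not Basic auth."""
+        auth_type, encoded = header.split(None, 1)
+        if auth_type.lower() != 'basic':
+            return None
+        decoded = base64.b64decode(encoded).decode('utf-8')
+        # Only split on the first colon; passwords may contain colons.
+        username, password = decoded.split(':', 1)
+        return username, password
+
+    header = make_basic_header('Aladdin', 'open sesame')
+    credentials = parse_basic_header(header)
+
+Note that base-64 is an encoding, not encryption: anyone who can read
+the header can recover the password.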
+
+So how does this work?  Well, we're writing "middleware", which means
+we'll typically pass the request on to another application.  We could
+change the request, or change the response, but in this case sometimes
+we *won't* pass the request on (like, when we need to give that 401
+response).
+
+To give an example of a really really simple middleware, here's one
+that capitalizes the response::
+
+    class Capitalizer(object):
+
+        # We generally pass in the application to be wrapped to
+        # the middleware constructor:
+        def __init__(self, wrap_app):
+            self.wrap_app = wrap_app
+
+        def __call__(self, environ, start_response):
+            # We call the application we are wrapping with the
+            # same arguments we get...
+            response_iter = self.wrap_app(environ, start_response)
+            # then change the response...
+            response_string = ''.join(response_iter)
+            return [response_string.upper()]
+
+Technically this isn't quite right, because there are two ways to
+return the response body, but we're skimming bits.
+`paste.wsgilib.intercept_output
+<http://pythonpaste.org/module-paste.wsgilib.html#intercept_output>`_
+is a somewhat more thorough implementation of this.
+
+.. note::
+
+   This, like a lot of parts of this (now fairly old) tutorial, is
+   better, more thorough, and easier using `WebOb
+   <http://pythonpaste.org/webob/>`_.  This particular example looks
+   like::
+
+       from webob import Request
+
+       class Capitalizer(object):
+           def __init__(self, app):
+               self.app = app
+           def __call__(self, environ, start_response):
+               req = Request(environ)
+               resp = req.get_response(self.app)
+               resp.body = resp.body.upper()
+               return resp(environ, start_response)
+
+So here's some code that does something useful, authentication::
+
+    class AuthMiddleware(object):
+
+        def __init__(self, wrap_app):
+            self.wrap_app = wrap_app
+
+        def __call__(self, environ, start_response):
+            if not self.authorized(environ.get('HTTP_AUTHORIZATION')):
+                # Essentially self.auth_required is a WSGI application
+                # that only knows how to respond with 401...
+                return self.auth_required(environ, start_response)
+            # But if everything is okay, then pass everything through
+            # to the application we are wrapping...
+            return self.wrap_app(environ, start_response)
+
+        def authorized(self, auth_header):
+            if not auth_header:
+                # If they didn't give a header, they better login...
+                return False
+            # .split(None, 1) means split in two parts on whitespace:
+            auth_type, encoded_info = auth_header.split(None, 1)
+            assert auth_type.lower() == 'basic'
+            unencoded_info = encoded_info.decode('base64')
+            username, password = unencoded_info.split(':', 1)
+            return self.check_password(username, password)
+
+        def check_password(self, username, password):
+            # Not very high security authentication...
+            return username == password
+
+        def auth_required(self, environ, start_response):
+            start_response('401 Authentication Required',
+                [('Content-type', 'text/html'),
+                 ('WWW-Authenticate', 'Basic realm="this realm"')])
+            return ["""
+            <html>
+             <head><title>Authentication Required</title></head>
+             <body>
+              <h1>Authentication Required</h1>
+              If you can't get in, then stay out.
+             </body>
+            </html>"""]
+
+.. note::
+
+   Again, here's the same thing with WebOb::
+
+       from webob import Request, Response
+
+       class AuthMiddleware(object):
+           def __init__(self, app):
+               self.app = app
+           def __call__(self, environ, start_response):
+               req = Request(environ)
+               if not self.authorized(req.headers.get('authorization')):
+                   resp = self.auth_required(req)
+               else:
+                   resp = self.app
+               return resp(environ, start_response)
+           def authorized(self, header):
+               if not header:
+                   return False
+               auth_type, encoded = header.split(None, 1)
+               if not auth_type.lower() == 'basic':
+                   return False
+               username, password = encoded.decode('base64').split(':', 1)
+               return self.check_password(username, password)
+           def check_password(self, username, password):
+               return username == password
+           def auth_required(self, req):
+               return Response(status=401, headers={'WWW-Authenticate': 'Basic realm="this realm"'},
+                               body="""\
+               <html>
+                <head><title>Authentication Required</title></head>
+                <body>
+                 <h1>Authentication Required</h1>
+                 If you can't get in, then stay out.
+                </body>
+               </html>""")
+
+So, how do we use this?
+
+::
+
+    app = ObjectPublisher(Root())
+    wrapped_app = AuthMiddleware(app)
+
+    if __name__ == '__main__':
+        from paste import httpserver
+        httpserver.serve(wrapped_app, host='127.0.0.1', port='8080')
+
+Now you have middleware!  Hurrah!
+
+Give Me More Middleware!
+------------------------
+
+It's even easier to use other people's middleware than to make your
+own, because then you don't have to program.  If you've been following
+along, you've probably encountered a few exceptions, and had to look
+at the console to see the exception reports.  Let's make that a little
+easier, and show the exceptions in the browser...
+
+::
+
+    app = ObjectPublisher(Root())
+    wrapped_app = AuthMiddleware(app)
+    from paste.exceptions.errormiddleware import ErrorMiddleware
+    exc_wrapped_app = ErrorMiddleware(wrapped_app)
+
+Easy!  But let's make it *more* fancy...
+
+::
+
+    app = ObjectPublisher(Root())
+    wrapped_app = AuthMiddleware(app)
+    from paste.evalexception import EvalException
+    exc_wrapped_app = EvalException(wrapped_app)
+
+So go make an error now.  And hit the little +'s.  And type stuff in
+to the boxes.
+
+Conclusion
+==========
+
+Now you've created your framework and application (I'm sure it's much
+nicer than the one I've given so far).  You might keep writing it
+(many people have so far), but even if you don't you should be able to
+recognize these components in other frameworks now, and you'll have a
+better understanding of how they probably work under the covers.
+
+Also check out the version of this tutorial written `using WebOb
+<http://pythonpaste.org/webob/do-it-yourself.html>`_.  That tutorial
+includes things like **testing** and **pattern-matching dispatch**
+(instead of object publishing).