summaryrefslogtreecommitdiff
path: root/docs/user-guides/debugging-techniques.rst
diff options
context:
space:
mode:
Diffstat (limited to 'docs/user-guides/debugging-techniques.rst')
-rw-r--r--docs/user-guides/debugging-techniques.rst1148
1 files changed, 1148 insertions, 0 deletions
diff --git a/docs/user-guides/debugging-techniques.rst b/docs/user-guides/debugging-techniques.rst
new file mode 100644
index 0000000..368b795
--- /dev/null
+++ b/docs/user-guides/debugging-techniques.rst
@@ -0,0 +1,1148 @@
+====================
+Debugging Techniques
+====================
+
+Be it when initially setting up mod_wsgi for the first time, or later
+during development or use of your WSGI application, you are bound to get
+some sort of unexpected Python error. By default all you are usually going
+to see as evidence of this is a HTTP 500 "Internal Server Error" being
+displayed in the browser window and little else.
+
+The purpose of this document is to explain where to go look for more
+details on what caused the error, describe techniques one can use to have
+your WSGI application generate more useful debug information, as well as
+mechanisms that can be employed to interactively debug your application.
+
+Note that although this document is intended to deal with techniques which
+can be used when using mod_wsgi, many of the techniques are also directly
+transferable or adaptable to other web hosting mechanisms for WSGI
+applications.
+
+Apache Error Log Files
+----------------------
+
+When using mod_wsgi, unless you or the web framework you are using takes
+specific action to catch exceptions and present the details in an alternate
+manner, the only place that details of uncaught exceptions will be recorded
+is in the Apache error log files. The Apache error log files are therefore
+your prime source of information when things go wrong.
+
+Do note though that log messages generated by mod_wsgi are logged with
+various serverity levels and which ones will be output to the Apache error
+log files will depend on how Apache has been configured. The standard
+configuration for Apache has the !LogLevel directive being set to 'warn'.
+With this setting any important error messages will be output, but
+informational messages generated by mod_wsgi which can assist in working
+out what it is doing are not. Thus, if new to mod_wsgi or trying to debug a
+problem, it is worthwhile setting the Apache configuration to use 'info'
+log level instead::
+
+ LogLevel info
+
+If your Apache web server is only providing services for one host, it is
+likely that you will only have one error log file. If however the Apache
+web server is configured for multiple virtual hosts, then it is possible
+that there will be multiple error log files, one corresponding to the main
+server host and an additional error log file for each virtual host. Such a
+virtual host specific error log if one is being used, would have been
+configured through the placing of the Apache CustomLog directive within
+the context of the VirtualHost container.
+
+Although your WSGI application may be hosted within a particular virtual
+host and that virtual host has its own error log file, some error and
+informational messages will still go to the main server host error log
+file. Thus you may still need to consult both error log files when using
+virtual hosts.
+
+Messages of note that will end up in the main server host error log file
+include notifications in regard to initialisation of Python and the
+creation and destruction of Python sub interpreters, plus any errors which
+occur when doing this.
+
+Messages of note that would end up in the virtual host error log file, if
+it exists, include details of uncaught Python exceptions which occur when
+the WSGI application script is being loaded, or when the WSGI application
+callable object is being executed.
+
+Messages that are logged by a WSGI application via the 'wsgi.errors' object
+passed through to the application in the WSGI environment are also logged.
+These will got to the virtual host error log file if it exists, or the main
+error log file if the virtual host is not setup with its own error log file.
+Thus, if you want to add debugging messages to your WSGI application code,
+you can use 'wsgi.errors' in conjunction with the 'print' statement as shown
+below::
+
+ def application(environ, start_response):
+ status = '200 OK'
+ output = 'Hello World!'
+
+ print >> environ['wsgi.errors'], "application debug #1"
+
+ response_headers = [('Content-type', 'text/plain'),
+ ('Content-Length', str(len(output)))]
+ start_response(status, response_headers)
+
+ print >> environ['wsgi.errors'], "application debug #2"
+
+ return [output]
+
+If 'wsgi.errors' is not available to the code which needs to output log
+messages, then it should explicitly direct output from the 'print'
+statement to 'sys.stderr'::
+
+ import sys
+
+ def function():
+ print >> sys.stderr, "application debug #3"
+ ...
+
+If ``sys.stderr`` or ``sys.stdout`` is used directly then
+these messages will end up in the main server host error log file and not
+that for the virtual host unless the WSGI application is running in a
+daemon process specifically associated with a virtual host.
+
+Do be aware though that writing to ``sys.stdout`` is by default
+restricted in versions of mod_wsgi prior to 3.0 and will result in an
+exception occurring of the form::
+
+ IOError: sys.stdout access restricted by mod_wsgi
+
+This is because portable WSGI applications should not write to
+``sys.stdout`` or use the 'print' statement without specifying an
+alternate file object besides ``sys.stdout`` as the target. This
+restriction can be disabled for the whole server using the
+WSGIRestrictStdout directive, or by mapping ``sys.stdout`` to
+``sys.stderr`` at global scope within in the WSGI application script
+file::
+
+ import sys
+ sys.stdout = sys.stderr
+
+In general, a WSGI application should always endeavour to only log messages
+via the 'wsgi.errors' object that is passed through to a WSGI application
+in the WSGI environment. This is because this is the only way of logging
+messages for which there is some guarantee that they will end up in a log
+file that you might have access to if using a shared server.
+
+An application shouldn't however cache 'wsgi.errors' and try to use it
+outside of the context of a request. If this is done an exception will be
+raised indicating that the request has expired and the error log object
+is now invalid.
+
+That messages output via ``sys.stderr`` and ``sys.stdout`` end up in
+the Apache error logs at all is provided as a convenience but there is no
+requirement in the WSGI specification that they are valid means of a WSGI
+application logging messages.
+
+Displaying Request Environment
+------------------------------
+
+When a WSGI application is invoked, the request headers are passed as CGI
+variables in the WSGI request environment. The dictionary used for this
+also holds information about the WSGI execution environment and mod_wsgi.
+This includes mod_wsgi specific variables indicating the name of the
+process and application groups within which the WSGI application is
+executing.
+
+Knowing the values of the process and application group variables can be
+important when needing to validate that your Apache configuration is doing
+what you intended as far as ensuring your WSGI application is running in
+daemon mode or otherwise.
+
+A simple way of validating such details or getting access to any of the
+other WSGI request environment variables is to substitute your existing
+WSGI application with one which echos back the details to your browser.
+Such a task can be achieved with the following test application. The
+application could be extended as necessary to display other information as
+well, with process ID, user ID and group ID being shown as examples::
+
+ import cStringIO
+ import os
+
+ def application(environ, start_response):
+ headers = []
+ headers.append(('Content-Type', 'text/plain'))
+ write = start_response('200 OK', headers)
+
+ input = environ['wsgi.input']
+ output = cStringIO.StringIO()
+
+ print >> output, "PID: %s" % os.getpid()
+ print >> output, "UID: %s" % os.getuid()
+ print >> output, "GID: %s" % os.getgid()
+ print >> output
+
+ keys = environ.keys()
+ keys.sort()
+ for key in keys:
+ print >> output, '%s: %s' % (key, repr(environ[key]))
+ print >> output
+
+ output.write(input.read(int(environ.get('CONTENT_LENGTH', '0'))))
+
+ return [output.getvalue()]
+
+For the case of the process group as recorded by the
+'mod_wsgi.process_group' variable in the WSGI request environment, if the
+value is an empty string then the WSGI application is running in embedded
+mode. For any other value it will be running in daemon mode with the process
+group named by the variables value.
+
+Note that by default WSGI applications run in embedded mode, which means
+within the Apache server child processes which accept the original requests.
+Daemon mode processes would only be used through appropriate use of the
+WSGIDaemonProcess and WSGIProcessGroup directives to delegate the WSGI
+application to a named daemon process group.
+
+For the case of the application group as recorded by the
+'mod_wsgi.application_group' variable in the WSGI request environment, if the
+value is an empty string then the WSGI application is running in the main
+Python interpreter. That is, the very first interpreter created when Python
+was initialised. For any other value it indicates it is running in the named
+Python sub interpreter.
+
+Note that by default WSGI applications would always run in a sub
+interpreter rather than the main interpreter. The name of this sub
+interpreter would be automatically constructed from the name of the server
+or virtual host, the URL mount point of the WSGI application and the number
+of the listener port when it is other than ports 80 or 443.
+
+To delegate a WSGI application to run in main Python interpreter, the
+WSGIApplicationGroup directive would need to have been used with the value
+'%{GLOBAL}'. Although the value is '%{GLOBAL}', this translates to the
+empty string seen for the value of 'mod_wsgi.application_group' within the
+WSGI request environment.
+
+The WSGIApplicationGroup directive could also be used to designate a
+specific named sub interpreter rather than that selected automatically.
+
+For newcomers this can all be a bit confusing, which is where the test
+application comes in as you can use it to validate where your WSGI
+application is running is where you intended it to run.
+
+The set of WSGI request environment variables will also show the WSGI
+variables indicating whether process is multithreaded and whether the
+process group is multiprocess or not. For a more complete explanation
+of what that means see documentation of
+:doc:`../user-guides/processes-and-threading`.
+
+Tracking Request and Response
+-----------------------------
+
+Although one can use above test application to display the request
+environment, it is replacing your original WSGI application. Rather than
+replace your existing application you can use a WSGI middleware wrapper
+application which logs the details to the Apache error log instead::
+
+ # Original WSGI application.
+
+ def application(environ, start_response):
+ ...
+
+ # Logging WSGI middleware.
+
+ import pprint
+
+ class LoggingMiddleware:
+
+ def __init__(self, application):
+ self.__application = application
+
+ def __call__(self, environ, start_response):
+ errors = environ['wsgi.errors']
+ pprint.pprint(('REQUEST', environ), stream=errors)
+
+ def _start_response(status, headers, *args):
+ pprint.pprint(('RESPONSE', status, headers), stream=errors)
+ return start_response(status, headers, *args)
+
+ return self.__application(environ, _start_response)
+
+ application = LoggingMiddleware(application)
+
+The output from the middleware would end up in the Apache error log for the
+virtual host, or if no virtual host specific error log file, in the main
+Apache error log file.
+
+For more complicated problems it may also be necessary to track both the
+request and response content as well. A more complicated middleware which
+can log these as well as header information to the file system is as
+follows::
+
+ # Original WSGI application.
+
+ def application(environ, start_response):
+ ...
+
+ # Logging WSGI middleware.
+
+ import threading
+ import pprint
+ import time
+ import os
+
+ class LoggingInstance:
+ def __init__(self, start_response, oheaders, ocontent):
+ self.__start_response = start_response
+ self.__oheaders = oheaders
+ self.__ocontent = ocontent
+
+ def __call__(self, status, headers, *args):
+ pprint.pprint((status, headers)+args), stream=self.__oheaders)
+ self.__oheaders.close()
+
+ self.__write = self.__start_response(status, headers, *args)
+ return self.write
+
+ def __iter__(self):
+ return self
+
+ def write(self, data):
+ self.__ocontent.write(data)
+ self.__ocontent.flush()
+ return self.__write(data)
+
+ def next(self):
+ data = self.__iterable.next()
+ self.__ocontent.write(data)
+ self.__ocontent.flush()
+ return data
+
+ def close(self):
+ if hasattr(self.__iterable, 'close'):
+ self.__iterable.close()
+ self.__ocontent.close()
+
+ def link(self, iterable):
+ self.__iterable = iter(iterable)
+
+ class LoggingMiddleware:
+
+ def __init__(self, application, savedir):
+ self.__application = application
+ self.__savedir = savedir
+ self.__lock = threading.Lock()
+ self.__pid = os.getpid()
+ self.__count = 0
+
+ def __call__(self, environ, start_response):
+ self.__lock.acquire()
+ self.__count += 1
+ count = self.__count
+ self.__lock.release()
+
+ key = "%s-%s-%s" % (time.time(), self.__pid, count)
+
+ iheaders = os.path.join(self.__savedir, key + ".iheaders")
+ iheaders_fp = file(iheaders, 'w')
+
+ icontent = os.path.join(self.__savedir, key + ".icontent")
+ icontent_fp = file(icontent, 'w+b')
+
+ oheaders = os.path.join(self.__savedir, key + ".oheaders")
+ oheaders_fp = file(oheaders, 'w')
+
+ ocontent = os.path.join(self.__savedir, key + ".ocontent")
+ ocontent_fp = file(ocontent, 'w+b')
+
+ errors = environ['wsgi.errors']
+ pprint.pprint(environ, stream=iheaders_fp)
+ iheaders_fp.close()
+
+ length = int(environ.get('CONTENT_LENGTH', '0'))
+ input = environ['wsgi.input']
+ while length != 0:
+ data = input.read(min(4096, length))
+ if data:
+ icontent_fp.write(data)
+ length -= len(data)
+ else:
+ length = 0
+ icontent_fp.flush()
+ icontent_fp.seek(0, os.SEEK_SET)
+ environ['wsgi.input'] = icontent_fp
+
+ iterable = LoggingInstance(start_response, oheaders_fp, ocontent_fp)
+ iterable.link(self.__application(environ, iterable))
+ return iterable
+
+ application = LoggingMiddleware(application, '/tmp/wsgi')
+
+For this middleware, the second argument to the constructor should be a
+preexisting directory. For each request four files will be saved. These
+correspond to input headers, input content, response status and headers,
+and request content.
+
+Poorly Performing Code
+----------------------
+
+The WSGI specification allows any iterable object to be returned as the
+response, so long as the iterable yields string values. That this is the
+case means that one can too easily return an object which satisfies this
+requirement but has some sort of performance related issue.
+
+The worst case of this is where instead of returning a list containing
+strings, a single string is returned. The problem with a string is that
+when it is iterated over, a single character of the string is yielded each
+time. In other words, a single character is written back to the client on
+each loop, with a flush occuring in between to ensure that the character
+has actually been written and isn't just being buffered.
+
+Although for small strings a performance impact may not be noticed, if
+returning large strings the affect on request throughput could be quite
+significant.
+
+Another case which can cause problems is to return a file like object. For
+iteration over a file like object, typically what can occur is that a
+single line within the file is returned each time. If the file is a line
+oriented text file where each line is a of a reasonable length, this may be
+okay, but if the file is a binary file there may not actually be line
+breaks within the file.
+
+For the case where file contains many short lines, throughput would be
+affected much like in the case where a string is returned. For the case
+where the file is just binary data, the result can be that the complete
+file may be read in on the first loop. If the file is large, this could
+cause a large transient spike in memory usage. Once that memory is
+allocated, it will then be retained by the process, albeit that it may be
+reused by the process at a later point.
+
+Because of the performance impacts in terms of throughput and memory usage,
+both these cases should be avoided. For the case of returning a string, it
+should be returned with a single element list. For the case of a file like
+object, the 'wsgi.file_wrapper' extension should be used, or a wrapper
+which suitably breaks the response into chunks.
+
+In order to identify where code may be inadvertantly returning such iterable
+types, the following code can be used::
+
+ import types
+
+ import cStringIO
+ import socket
+ import StringIO
+
+ BAD_ITERABLES = [
+ cStringIO.InputType,
+ socket.SocketType,
+ StringIO.StringIO,
+ types.FileType,
+ types.StringType,
+ ]
+
+ class ValidatingMiddleware:
+
+ def __init__(self, application):
+ self.__application = application
+
+ def __call__(self, environ, start_response):
+ errors = environ['wsgi.errors']
+
+ result = self.__application(environ, start_response)
+
+ value = type(result)
+ if value == types.InstanceType:
+ value = result.__class__
+ if value in BAD_ITERABLES:
+ print >> errors, 'BAD ITERABLE RETURNED: ',
+ print >> errors, 'URL=%s ' % environ['REQUEST_URI'],
+ print >> errors, 'TYPE=%s' % value
+
+ return result
+
+ def application(environ, start_response):
+ ...
+
+ application = ValidatingMiddleware(application)
+
+Error Catching Middleware
+-------------------------
+
+Because mod_wsgi only logs details of uncaught exceptions to the Apache
+error log and returns a generic HTTP 500 "Internal Server Error" response,
+if you want the details of any exception to be displayed in the error
+page and be visible from the browser, you will need to use a WSGI error
+catching middleware component.
+
+One example of WSGI error catching middleware is the ErrorMiddleware class
+from Paste.
+
+ * http://www.pythonpaste.org
+
+This class can be configured not only to catch exceptions and present the
+details to the browser in an error page, it can also be configured to send
+the details of any errors in email to a designated recipient, or log the
+details to an alternate log file.
+
+Being able to have error details sent by email would be useful in a
+production environment or where your application is running on a web
+hosting environment and the Apache error logs would not necessarily be
+closely monitored on a day to day basis. Enabling of that particular
+feature though should possibly only be done when you have some confidence
+in the application else you might end up getting inundated with emails.
+
+To use the error catching middleware from Paste you simply need to wrap
+your existing application with it such that it then becomes the top level
+application entry point::
+
+ def application(environ, start_response):
+ status = '200 OK'
+ output = 'Hello World!\n\n'
+
+ response_headers = [('Content-type', 'text/plain'),
+ ('Content-Length', str(len(output)))]
+ start_response(status, response_headers)
+
+ return [output]
+
+ from paste.exceptions.errormiddleware import ErrorMiddleware
+ application = ErrorMiddleware(application, debug=True)
+
+In addition to displaying information about the Python exception that has
+occurred and the stack traceback, this middleware component will also
+output information about the WSGI environment such that you can see what
+was being passed to the WSGI application. This can be useful if the cause
+of any problem was unexpected values passed in the headers of the HTTP
+request.
+
+Note that error catching middleware is of absolutely no use for trying
+to capture and display in the browser any errors that occur at global scope
+within the WSGI application script when it is being imported. Details of
+any such errors occuring at this point will only be captured in the Apache
+error log files. As much as possible you should avoid performing
+complicated tasks when the WSGI application script file is being imported,
+instead you should only trigger such actions the first time a request is
+received. By doing this you will be able to capture errors in such
+initialisation code with the error catching middleware.
+
+Also note that the debug mode whereby details are displayed in the browser
+should only be used during development and not in a production system. This
+is because details which are displayed may be of use to anyone who may wish
+to compromise your site.
+
+Python Interactive Debugger
+---------------------------
+
+Python debuggers such as implemented by the 'pdb' module can sometimes be
+useful in debugging Python applications, especially where there is a need
+to single step through code and analyse application state at each point.
+Use of such debuggers in web applications can be a bit more tricky than
+normal applications though and especially so with mod_wsgi.
+
+The problem with mod_wsgi is that the Apache web server can create multiple
+child processes to respond to requests. Partly because of this, but also
+just to prevent problems in general, Apache closes off standard input at
+startup. Thus there is no actual way to interact with the Python debugger
+module if it were used.
+
+To get around this requires having complete control of the Apache web
+server that you are using to host your WSGI application. In particular, it
+will be necessary to shutdown the web server and then startup the 'httpd'
+process explicitly in single process debug mode, avoiding the 'apachectl'
+management application altogether::
+
+ $ apachectl stop
+ $ httpd -X
+
+If Apache is normally started as the 'root' user, this also will need to be
+run as the 'root' user otherwise the Apache web server will not have the
+required permissions to write to its log directories etc.
+
+The result of starting the 'httpd' process in this way will be that the
+Apache web server will run everything in one process rather than using
+multiple processes. Further, it will not close off standard input thus
+allowing the Python debugger to be used.
+
+Do note though that one cannot be using the ability of mod_wsgi to run
+your application in a daemon process when doing this. The WSGI application
+must be running within the main Apache process.
+
+To trigger the Python debugger for any call within your code, the following
+customised wrapper for the 'Pdb' class should be used::
+
+ class Debugger:
+
+ def __init__(self, object):
+ self.__object = object
+
+ def __call__(self, *args, **kwargs):
+ import pdb, sys
+ debugger = pdb.Pdb()
+ debugger.use_rawinput = 0
+ debugger.reset()
+ sys.settrace(debugger.trace_dispatch)
+
+ try:
+ return self.__object(*args, **kwargs)
+ finally:
+ debugger.quitting = 1
+ sys.settrace(None)
+
+This might for example be used to wrap the actual WSGI application callable
+object::
+
+ def application(environ, start_response):
+ status = '200 OK'
+ output = 'Hello World!\n\n'
+
+ response_headers = [('Content-type', 'text/plain'),
+ ('Content-Length', str(len(output)))]
+ start_response(status, response_headers)
+
+ return [output]
+
+ application = Debugger(application)
+
+When a request is now received, the Python debugger will be triggered and
+you can interactively debug your application from the window you ran the
+'httpd' process. For example::
+
+ > /usr/local/wsgi/scripts/hello.py(21)application()
+ -> status = '200 OK'
+
+ (Pdb) list
+ 16 finally:
+ 17 debugger.quitting = 1
+ 18 sys.settrace(None)
+ 19
+ 20 def application(environ, start_response):
+ 21 -> status = '200 OK'
+ 22 output = 'Hello World!\n\n'
+ 23
+ 24 response_headers = [('Content-type', 'text/plain'),
+ 25 ('Content-Length', str(len(output)))]
+ 26 start_response(status, response_headers)
+
+ (Pdb) print start_response
+ <built-in method start_response of mod_wsgi.Adapter object at 0x1160180>
+
+ cont
+
+When wishing to allow the request to complete, issue the 'cont' command. If
+wishing to cause the request to abort, issue the 'quit' command. This will
+result in a 'BdbQuit' exception being raised and would result in a HTTP
+500 "Internal Server Error" response being returned to the client. To kill
+off the whole 'httpd' process, after having issued 'cont' or 'quit' to exit
+the debugger, interrupt the process using 'CTRL-C'.
+
+To see what commands the Python debugger accepts, issue the 'help' command
+and also consult the documentation for the 'pdb' module on the Python web
+site.
+
+Note that the Python debugger expects to be able to write to
+``sys.stdout`` to display information to the terminal. Thus if using
+using a Python web framework which replaces ``sys.stdout`` such as
+web.py, you will not be able to use the Python debugger.
+
+Browser Based Debugger
+----------------------
+
+In order to use the Python debugger modules you need to have direct access
+to the host and the Apache web server that is running your WSGI application.
+If your only access to the system is via your web browser this makes the use
+of the full Python debugger impractical.
+
+An alternative to the Python debugger modules which is available is an
+extension of the WSGI error catching middleware previously described. This
+is the EvalException class from Paste. It embodies the error catching
+attributes of the ErrorMiddleware class, but also allows some measure of
+interactive debugging and introspection through the web browser.
+
+As with any WSGI middleware component, to use the class entails creating
+a wrapper around the application you wish to debug::
+
+ def application(environ, start_response):
+ status = '200 OK'
+ output = 'Hello World!\n\n'
+
+ response_headers = [('Content-type', 'text/plain'),
+ ('Content-Length', str(len(output)))]
+ start_response(status, response_headers)
+
+ return [output]
+
+ from paste.evalexception.middleware import EvalException
+ application = EvalException(application)
+
+Like ErrorMiddleware when an unexpected exception occurs a web page is
+presented which shows the location of the error along with the contents of
+the WSGI application environment. Where EvalException is different however
+is that it is possible to inspect the local variables residing within each
+stack frame down to where the error occurred. Further, it is possible to
+enter Python code which can be evaluated within the context of the selected
+stack frame in order to access data or call functions or methods of
+objects.
+
+In order for this to all work requires that subsequent requests back to
+the WSGI application always end up with the same process where the error
+originally occurred. With mod_wsgi this does however present a bit of a
+problem as Apache can create and use multiple child processes to handle
+requests.
+
+Because of this requirement, if you want to be able to use this browser
+based interactive debugger, if running your application in embedded mode of
+mod_wsgi, you will need to configure Apache such that it only starts up one
+child process to handle requests and that it never creates any additional
+processes. The Apache configuration directives required to achieve this are
+as follows::
+
+ StartServers 1
+ ServerLimit 1
+
+The directives must be placed at global scope within the main Apache
+configuration files and will affect the whole Apache web server.
+
+If you are using the worker MPM on a UNIX system, restricting Apache to
+just a single process may not be an issue, at least during development. If
+however you are using the prefork MPM on a UNIX system, you may see issues
+if you are using an AJAX intensive page that relies on being able to
+execute parallel requests, as only one request at a time will be able to be
+handled by the Apache web server.
+
+If using Apache 2.X on a UNIX system, a better approach is to use daemon
+mode of mod_wsgi and delegate your application to run in a single daemon
+process. This process may be single or multithreaded as per any threading
+requirements of your application.
+
+Which ever configuration is used, if the browser based interactive debugger
+is used it should only be used on a development system and should never be
+deployed on a production system or in a web hosting environment. This is
+because the debugger will allow one to execute arbitrary Python code within
+the context of your application from a remote client.
+
+Debugging Crashes With GDB
+--------------------------
+
+In cases where Apache itself crashes for no apparent reason, the above
+techniques are not always particularly useful. This is especially the case
+where the crash occurs in non Python code outside of your WSGI application.
+
+The most common cause of Apache crashing, besides any still latent bugs
+that may exist in mod_wsgi, of which hopefully there aren't any, are shared
+library version mismatches. Another major cause of crashes is third party C
+extension modules for Python which are not compatible with being used in a
+Python sub interpreter which isn't the first interpreter created when
+Python is initialised, or modules which are not compatible with Python sub
+interpreters being destroyed and the module then being used in a new Python
+sub interpreter.
+
+Examples of where shared library version mismatches are known to occur are
+between the version of the 'expat' library used by Apache and that embedded
+within the Python 'pyexpat' module. Another is between the version of the
+MySQL client libraries used by PHP and the Python MySQL module.
+
+Both these can be a cause of crashes where the different components are
+compiled and linked against different versions of the shared library for
+the packages in question. It is vitally important that all packages making
+use of a shared library were compiled against and use the same version of
+a shared library.
+
+Another problematic package is Subversion. In this case there can be
+conflicts between the version of Subversion libraries used by mod_dav_svn
+and the Python Subversion bindings. Certain versions of the Python
+Subversion modules also cause problems because they appear to be
+incompatible with use in a Python sub interpreter which isn't the first
+interpreter created when Python is initialised.
+
+In this latter issue, the sub interpreter problems can often be solved by
+forcing the WSGI application using the Python Subversion modules to run in
+the '%{GLOBAL}' application group. This solution often also resolves issues
+with SWIG generated bindings, especially where the '-thread' option was
+supplied to 'swig' when the bindings were generated.
+
+Whatever the reason, in some cases the only way to determine why Apache or
+Python is crashing is to use a C code debugger such as 'gdb'. Now although
+it is possible to attach 'gdb' to a running process, the preferred method
+for using 'gdb' in conjunction with Apache is to run Apache in single
+process debug mode from within 'gdb'.
+
+To do this it is necessary to first shutdown Apache. The 'gdb' debugger can
+then be started against the 'httpd' executable and then the process started
+up from inside of 'gdb'::
+
+ $ /usr/local/apache/bin/apachectl stop
+ $ sudo gdb /usr/local/apache/bin/httpd
+ GNU gdb 6.1-20040303 (Apple version gdb-384) (Mon Mar 21 00:05:26 GMT 2005)
+ Copyright 2004 Free Software Foundation, Inc.
+ GDB is free software, covered by the GNU General Public License, and you are
+ welcome to change it and/or distribute copies of it under certain conditions.
+ Type "show copying" to see the conditions.
+ There is absolutely no warranty for GDB. Type "show warranty" for details.
+ This GDB was configured as "powerpc-apple-darwin"...Reading symbols for shared
+ libraries ........ done
+
+ (gdb) run -X
+ Starting program: /usr/local/apache/bin/httpd -X
+ Reading symbols for shared libraries .+++ done
+ Reading symbols for shared libraries ..................... done
+
+If Apache is normally started as the 'root' user, this also will need to be
+run as the 'root' user otherwise the Apache web server will not have the
+required permissions to write to its log directories etc.
+
+If Apache was crashing on startup, you should immediately encounter the
+error, otherwise use your web browser to access the URL which is causing
+the crash to occur. You can then commence trying to debug why the crash is
+occuring.
+
+Note that you should ensure that you have not assigned your WSGI
+application to run in a mod_wsgi daemon process using the WSGIDaemonProcess
+and WSGIProcessGroup directives. This is because the above procedure will
+only catch crashes which occur when the application is running in embedded
+mode. If it turns out that the application only crashes when run in mod_wsgi
+daemon mode, an alternate method of using 'gdb' will be required.
+
+In this circumstance you should run Apache as normal, but ensure that you
+only create one mod_wsgi daemon process and have it use only a single
+thread::
+
+ WSGIDaemonProcess debug threads=1
+ WSGIProcessGroup debug
+
+If not running the daemon process as a distinct user where you can tell
+which process it is, then you will also need to ensure that Apache
+!LogLevel directive has been set to 'info'. This is to ensure that
+information about daemon processes created by mod_wsgi are logged to the
+Apache error log. This is necessary, as you will need to consult the Apache
+error logs to determine the process ID of the daemon process that has been
+created for that daemon process group::
+
+ mod_wsgi (pid=666): Starting process 'debug' with threads=1.
+
+Knowing the process ID, you should then run 'gdb', telling it to attach
+directly to the daemon process::
+
+ $ sudo gdb /usr/local/apache/bin/httpd 666
+ GNU gdb 6.1-20040303 (Apple version gdb-384) (Mon Mar 21 00:05:26 GMT 2005)
+ Copyright 2004 Free Software Foundation, Inc.
+ GDB is free software, covered by the GNU General Public License, and you are
+ welcome to change it and/or distribute copies of it under certain conditions.
+ Type "show copying" to see the conditions.
+ There is absolutely no warranty for GDB. Type "show warranty" for details.
+ This GDB was configured as "powerpc-apple-darwin"...Reading symbols for shared
+ libraries ........ done
+
+ /Users/grahamd/666: No such file or directory.
+ Attaching to program: `/usr/local/apache/bin/httpd', process 666.
+ Reading symbols for shared libraries .+++..................... done
+ 0x900c7060 in sigwait ()
+ (gdb) cont
+ Continuing.
+
+Once 'gdb' has been started and attached to the process, then initiate the
+request with the URL that causes the application to crash.
+
+Attaching to the running daemon process can also be useful where a single
+request or the whole process is appearing to hang. In this case one can
+force a stack trace to be output for all running threads to try and
+determine what code is getting stuck. The appropriate gdb command in this
+instance is 'thread apply all bt'::
+
+ sudo gdb /usr/local/apache-2.2/bin/httpd 666
+ GNU gdb 6.3.50-20050815 (Apple version gdb-477) (Sun Apr 30 20:06:22 GMT 2006)
+ Copyright 2004 Free Software Foundation, Inc.
+ GDB is free software, covered by the GNU General Public License, and you are
+ welcome to change it and/or distribute copies of it under certain conditions.
+ Type "show copying" to see the conditions.
+ There is absolutely no warranty for GDB. Type "show warranty" for details.
+ This GDB was configured as "powerpc-apple-darwin"...Reading symbols
+ for shared libraries ....... done
+
+ /Users/grahamd/666: No such file or directory.
+ Attaching to program: `/usr/local/apache/bin/httpd', process 666.
+ Reading symbols for shared libraries .+++..................... done
+ 0x900c7060 in sigwait ()
+ (gdb) thread apply all bt
+
+ Thread 4 (process 666 thread 0xd03):
+ #0 0x9001f7ac in select ()
+ #1 0x004189b4 in apr_pollset_poll (pollset=0x1894650,
+ timeout=-1146117585187099488, num=0xf0182d98, descriptors=0xf0182d9c)
+ at poll/unix/select.c:363
+ #2 0x002a57f0 in wsgi_daemon_thread (thd=0x1889660, data=0x18895e8)
+ at mod_wsgi.c:6980
+ #3 0x9002bc28 in _pthread_body ()
+
+ Thread 3 (process 666 thread 0xc03):
+ #0 0x9001f7ac in select ()
+ #1 0x0041d224 in apr_sleep (t=1000000) at time/unix/time.c:246
+ #2 0x002a2b10 in wsgi_deadlock_thread (thd=0x0, data=0x2aee68) at
+ mod_wsgi.c:7119
+ #3 0x9002bc28 in _pthread_body ()
+
+ Thread 2 (process 666 thread 0xb03):
+ #0 0x9001f7ac in select ()
+ #1 0x0041d224 in apr_sleep (t=299970002) at time/unix/time.c:246
+ #2 0x002a2dec in wsgi_monitor_thread (thd=0x0, data=0x18890e8) at
+ mod_wsgi.c:7197
+ #3 0x9002bc28 in _pthread_body ()
+
+ Thread 1 (process 666 thread 0x203):
+ #0 0x900c7060 in sigwait ()
+ #1 0x0041ba9c in apr_signal_thread (signal_handler=0x2a29a0
+ <wsgi_check_signal>) at threadproc/unix/signals.c:383
+ #2 0x002a3728 in wsgi_start_process (p=0x1806418, daemon=0x18890e8)
+ at mod_wsgi.c:7311
+ #3 0x002a6a4c in wsgi_hook_init (pconf=0x1806418, ptemp=0x0,
+ plog=0xc8, s=0x18be8d4) at mod_wsgi.c:7716
+ #4 0x0000a5b0 in ap_run_post_config (pconf=0x1806418, plog=0x1844418,
+ ptemp=0x180e418, s=0x180da78) at config.c:91
+ #5 0x000033d4 in main (argc=3, argv=0xbffffa8c) at main.c:706
+
+It is suggested when trying to debug such issues that the daemon process be
+made to run with only a single thread. This will reduce how many stack
+traces one needs to analyse.
+
+If you are running with multiple processes within the daemon process group
+and all requests are hanging, you will need to get a snapshot of what is
+happening in all processes in the daemon process group. Because doing this
+by hand will be tedious, it is better to automate it.
+
+To automate capturing the stack traces, first create a file called 'gdb.cmds'
+which contains the following::
+
+ set pagination 0
+ thread apply all bt
+ detach
+ quit
+
+This can then be used in conjunction with 'gdb' to avoid needing to enter
+the commands manually. For example::
+
+ sudo gdb /usr/local/apache-2.2/bin/httpd -x gdb.cmds -p 666
+
+To be able to automate this further and apply it to all processes in a
+daemon process group, then first off ensure that daemon processes are named
+in 'ps' output by using the 'display-name' option to WSGIDaemonProcess
+directive.
+
+For example, to apply default naming strategy as implemented by mod_wsgi, use::
+
+ WSGIDaemonProcess xxx display-name=%{GLOBAL}
+
+In the output of a BSD derived 'ps' command, this will now show the process
+as being named '(wsgi:xxx)'::
+
+ $ ps -cxo command,pid | grep wsgi
+ (wsgi:xxx) 666
+
+Note that the name may be truncated as the resultant name can be no longer
+than what was the length of the original executable path for Apache. You
+may therefore like to name it explicitly::
+
+ WSGIDaemonProcess xxx display-name=(wsgi:xxx)
+
+Having named the processes in the daemon process group, we can now parse the
+output of 'ps' to identify the process and apply the 'gdb' command script to
+each::
+
+ for pid in `ps -cxo command,pid | awk '{ if ($0 ~ /wsgi:xxx/ && $1 !~ /grep/) print $NF }'`; do sudo gdb -x gdb.cmds -p $pid; done
+
+The actual name given to the daemon process group using the 'display-name'
+option should be replaced in this command line. That is, change 'wsgi:xxx'
+appropriately.
+
+If you are having problems with process in daemon process groups hanging,
+you might consider implementing a monitoring system which automatically
+detects somehow when the processes are no longer responding to requests and
+automatically trigger this dump of the stack traces before restarting the
+daemon process group or Apache.
+
+Extracting Python Stack Traces
+------------------------------
+
+Using gdb to get stack traces as described above only gives you information
+about what is happening at the C code level. This will not tell where in the
+actual Python code execution was at. Your only clue is going to be where a
+call out was being made to some distinct C function in a C extension module
+for Python.
+
+One can get stack traces for Python code by using::
+
+ def _stacktraces():
+ code = []
+ for threadId, stack in sys._current_frames().items():
+ code.append("\n# ThreadID: %s" % threadId)
+ for filename, lineno, name, line in traceback.extract_stack(stack):
+ code.append('File: "%s", line %d, in %s' % (filename,
+ lineno, name))
+ if line:
+ code.append(" %s" % (line.strip()))
+
+ for line in code:
+ print >> sys.stderr, line
+
+The caveat here obviously is that the process has to still be running. There
+is also the issue of how you trigger that function to dump stack traces for
+executing Python threads.
+
+If the problem you have is that some request handler threads are stuck,
+either blocked, or stuck in an infinite loop, and you want to know what they
+are doing, then so long as there are still some handler threads left and
+the application is still responding to requests, then you could trigger it
+from a request handler triggered by making a request against a specific URL.
+
+This though depends on you only running your application within a single
+process because as soon as you have multiple processes you have no gaurantee
+that a request will go to the process you want to debug.
+
+A better method therefore is to have a perpetually running background thread
+which monitors for a specific file in the file system. When that file is
+created or the modification time changes, then the background thread would
+dump the stack traces for the process.
+
+Sample code which takes this approach is included below. This code could be
+placed temporarily at the end of your WSGI script file if you know you are
+going to need it because of a recurring problem::
+
+ import os
+ import sys
+ import time
+ import signal
+ import threading
+ import atexit
+ import Queue
+ import traceback
+
+ FILE = '/tmp/dump-stack-traces.txt'
+
+ _interval = 1.0
+
+ _running = False
+ _queue = Queue.Queue()
+ _lock = threading.Lock()
+
+ def _stacktraces():
+ code = []
+ for threadId, stack in sys._current_frames().items():
+ code.append("\n# ProcessId: %s" % os.getpid())
+ code.append("# ThreadID: %s" % threadId)
+ for filename, lineno, name, line in traceback.extract_stack(stack):
+ code.append('File: "%s", line %d, in %s' % (filename,
+ lineno, name))
+ if line:
+ code.append(" %s" % (line.strip()))
+
+ for line in code:
+ print >> sys.stderr, line
+
+ try:
+ mtime = os.path.getmtime(FILE)
+ except:
+ mtime = None
+
+ def _monitor():
+ while 1:
+ global mtime
+
+ try:
+ current = os.path.getmtime(FILE)
+ except:
+ current = None
+
+ if current != mtime:
+ mtime = current
+ _stacktraces()
+
+ # Go to sleep for specified interval.
+
+ try:
+ return _queue.get(timeout=_interval)
+ except:
+ pass
+
+ _thread = threading.Thread(target=_monitor)
+ _thread.setDaemon(True)
+
+ def _exiting():
+ try:
+ _queue.put(True)
+ except:
+ pass
+ _thread.join()
+
+ atexit.register(_exiting)
+
+ def _start(interval=1.0):
+ global _interval
+ if interval < _interval:
+ _interval = interval
+
+ global _running
+ _lock.acquire()
+ if not _running:
+ prefix = 'monitor (pid=%d):' % os.getpid()
+ print >> sys.stderr, '%s Starting stack trace monitor.' % prefix
+ _running = True
+ _thread.start()
+ _lock.release()
+
+ _start()
+
+Once your WSGI script file has been loaded, then touching the file
+'/tmp/dump-stack-traces.txt' will cause stack traces for active Python
+threads to be output to the Apache error log.
+
+Note that the sample code doesn't deal with possibility that with multiple
+processes for same application, that all processes may attempt to dump
+information at the same time. As such, you may get interleaving of output
+from multiple processes in Apache error logs at the same time.
+
+What you may want to do is modify this code to dump out to some special
+directory, distinct files containing the trace where the names of the file
+include the process ID and a date/time. That way each will be separate.
+
+An example of what one might expect to see from the above code is as
+follows::
+
+ # ProcessId: 666
+ # ThreadID: 4352905216
+ File: "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/threading.py", line 497, in __bootstrap
+ self.__bootstrap_inner()
+ File: "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/threading.py", line 522, in __bootstrap_inner
+ self.run()
+ File: "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/threading.py", line 477, in run
+ self.__target(*self.__args, **self.__kwargs)
+ File: "/Library/WebServer/Sites/django-1/htdocs/project.wsgi", line 72, in _monitor
+ _stacktraces()
+ File: "/Library/WebServer/Sites/django-1/htdocs/project.wsgi", line 47, in _stacktraces
+ for filename, lineno, name, line in traceback.extract_stack(stack):
+
+ # ThreadID: 4322832384
+ File: "/Library/WebServer/Sites/django-1/htdocs/project.wsgi", line 21, in application
+ return _application(environ, start_response)
+ File: "/Library/WebServer/Sites/django-1/lib/python2.6/site-packages/django/core/handlers/wsgi.py", line 245, in __call__
+ response = middleware_method(request, response)
+ File: "/Library/WebServer/Sites/django-1/lib/python2.6/site-packages/django/contrib/sessions/middleware.py", line 36, in process_response
+ request.session.save()
+ File: "/Library/WebServer/Sites/django-1/lib/python2.6/site-packages/django/contrib/sessions/backends/db.py", line 63, in save
+ obj.save(force_insert=must_create, using=using)
+ File: "/Library/WebServer/Sites/django-1/lib/python2.6/site-packages/django/db/models/base.py", line 434, in save
+ self.save_base(using=using, force_insert=force_insert, force_update=force_update)
+ File: "/Library/WebServer/Sites/django-1/lib/python2.6/site-packages/django/db/models/base.py", line 527, in save_base
+ result = manager._insert(values, return_id=update_pk, using=using)
+ File: "/Library/WebServer/Sites/django-1/lib/python2.6/site-packages/django/db/models/manager.py", line 195, in _insert
+ return insert_query(self.model, values, **kwargs)
+ File: "/Library/WebServer/Sites/django-1/lib/python2.6/site-packages/django/db/models/query.py", line 1479, in insert_query
+ return query.get_compiler(using=using).execute_sql(return_id)
+ File: "/Library/WebServer/Sites/django-1/lib/python2.6/site-packages/django/db/models/sql/compiler.py", line 783, in execute_sql
+ cursor = super(SQLInsertCompiler, self).execute_sql(None)
+ File: "/Library/WebServer/Sites/django-1/lib/python2.6/site-packages/django/db/models/sql/compiler.py", line 727, in execute_sql
+ cursor.execute(sql, params)
+ File: "/Library/WebServer/Sites/django-1/lib/python2.6/site-packages/debug_toolbar/panels/sql.py", line 95, in execute
+ stacktrace = tidy_stacktrace(traceback.extract_stack())
+ File: "/Library/WebServer/Sites/django-1/lib/python2.6/site-packages/debug_toolbar/panels/sql.py", line 40, in tidy_stacktrace
+ s_path = os.path.realpath(s[0])
+ File: "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/posixpath.py", line 355, in realpath
+ if islink(component):
+ File: "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/posixpath.py", line 132, in islink
+ st = os.lstat(path)
+
+Note that one of the displayed threads will be that for the thread which is
+dumping the stack traces. That stack trace can obviously be ignored.
+
+One could extend the above recipe in more elaborate ways by using a WSGI
+middleware that capture details of each request from the WSGI environment
+and also dumping out from that the URL for the request being handled by
+any threads. This may assist in working out whether problems are related
+to a specific URL.