diff options
Diffstat (limited to 'docs/peps/pep-0258.txt')
-rw-r--r-- | docs/peps/pep-0258.txt | 1029 |
1 files changed, 1029 insertions, 0 deletions
diff --git a/docs/peps/pep-0258.txt b/docs/peps/pep-0258.txt new file mode 100644 index 000000000..0d646bb82 --- /dev/null +++ b/docs/peps/pep-0258.txt @@ -0,0 +1,1029 @@ +PEP: 258 +Title: Docutils Design Specification +Version: $Revision$ +Last-Modified: $Date$ +Author: David Goodger <goodger@users.sourceforge.net> +Discussions-To: <doc-sig@python.org> +Status: Draft +Type: Standards Track +Content-Type: text/x-rst +Requires: 256, 257 +Created: 31-May-2001 +Post-History: 13-Jun-2001 + + +========== + Abstract +========== + +This PEP documents design issues and implementation details for +Docutils, a Python Docstring Processing System (DPS). The rationale +and high-level concepts of a DPS are documented in PEP 256, "Docstring +Processing System Framework" [#PEP-256]_. Also see PEP 256 for a +"Road Map to the Docstring PEPs". + +Docutils is being designed modularly so that any of its components can +be replaced easily. In addition, Docutils is not limited to the +processing of Python docstrings; it processes standalone documents as +well, in several contexts. + +No changes to the core Python language are required by this PEP. Its +deliverables consist of a package for the standard library and its +documentation. + + +=============== + Specification +=============== + +Docutils Project Model +====================== + +Project components and data flow:: + + +---------------------------+ + | Docutils: | + | docutils.core.Publisher, | + | docutils.core.publish_*() | + +---------------------------+ + / | \ + / | \ + 1,3,5 / 6 | \ 7 + +--------+ +-------------+ +--------+ + | READER | ----> | TRANSFORMER | ====> | WRITER | + +--------+ +-------------+ +--------+ + / \\ | + / \\ | + 2 / 4 \\ 8 | + +-------+ +--------+ +--------+ + | INPUT | | PARSER | | OUTPUT | + +-------+ +--------+ +--------+ + +The numbers above each component indicate the path a document's data +takes. Double-width lines between Reader & Parser and between +Transformer & Writer indicate that data sent along these paths should +be standard (pure & unextended) Docutils doc trees. Single-width +lines signify that internal tree extensions or completely unrelated +representations are possible, but they must be supported at both ends. + + +Publisher +--------- + +The ``docutils.core`` module contains a "Publisher" facade class and +several convenience functions: "publish_cmdline()" (for command-line +front ends), "publish_file()" (for programmatic use with file-like +I/O), and "publish_string()" (for programmatic use with string I/O). +The Publisher class encapsulates the high-level logic of a Docutils +system. The Publisher class has overall responsibility for +processing, controlled by the ``Publisher.publish()`` method: + +1. Set up internal settings (may include config files & command-line + options) and I/O objects. + +2. Call the Reader object to read data from the source Input object + and parse the data with the Parser object. A document object is + returned. + +3. Set up and apply transforms via the Transformer object attached to + the document. + +4. Call the Writer object which translates the document to the final + output format and writes the formatted data to the destination + Output object. Depending on the Output object, the output may be + returned from the Writer, and then from the ``publish()`` method. + +Calling the "publish" function (or instantiating a "Publisher" object) +with component names will result in default behavior. For custom +behavior (customizing component settings), create custom component +objects first, and pass *them* to the Publisher or ``publish_*`` +convenience functions. + + +Readers +------- + +Readers understand the input context (where the data is coming from), +send the whole input or discrete "chunks" to the parser, and provide +the context to bind the chunks together back into a cohesive whole. + +Each reader is a module or package exporting a "Reader" class with a +"read" method. The base "Reader" class can be found in the +``docutils/readers/__init__.py`` module. + +Most Readers will have to be told what parser to use. So far (see the +list of examples below), only the Python Source Reader ("PySource"; +still incomplete) will be able to determine the parser on its own. + +Responsibilities: + +* Get input text from the source I/O. + +* Pass the input text to the parser, along with a fresh `document + tree`_ root. + +Examples: + +* Standalone (Raw/Plain): Just read a text file and process it. + The reader needs to be told which parser to use. + + The "Standalone Reader" has been implemented in module + ``docutils.readers.standalone``. + +* Python Source: See `Python Source Reader`_ below. This Reader is + currently in development in the Docutils sandbox. + +* Email: RFC-822 headers, quoted excerpts, signatures, MIME parts. + +* PEP: RFC-822 headers, "PEP xxxx" and "RFC xxxx" conversion to URIs. + The "PEP Reader" has been implemented in module + ``docutils.readers.pep``; see PEP 287 and PEP 12. + +* Wiki: Global reference lookups of "wiki links" incorporated into + transforms. (CamelCase only or unrestricted?) Lazy + indentation? + +* Web Page: As standalone, but recognize meta fields as meta tags. + Support for templates of some sort? (After ``<body>``, before + ``</body>``?) + +* FAQ: Structured "question & answer(s)" constructs. + +* Compound document: Merge chapters into a book. Master manifest + file? + + +Parsers +------- + +Parsers analyze their input and produce a Docutils `document tree`_. +They don't know or care anything about the source or destination of +the data. + +Each input parser is a module or package exporting a "Parser" class +with a "parse" method. The base "Parser" class can be found in the +``docutils/parsers/__init__.py`` module. + +Responsibilities: Given raw input text and a doctree root node, +populate the doctree by parsing the input text. + +Example: The only parser implemented so far is for the +reStructuredText markup. It is implemented in the +``docutils/parsers/rst/`` package. + +The development and integration of other parsers is possible and +encouraged. + + +.. _transforms: + +Transformer +----------- + +The Transformer class, in ``docutils/transforms/__init__.py``, stores +transforms and applies them to documents. A transformer object is +attached to every new document tree. The Publisher_ calls +``Transformer.apply_transforms()`` to apply all stored transforms to +the document tree. Transforms change the document tree from one form +to another, add to the tree, or prune it. Transforms resolve +references and footnote numbers, process interpreted text, and do +other context-sensitive processing. + +Some transforms are specific to components (Readers, Parser, Writers, +Input, Output). Standard component-specific transforms are specified +in the ``default_transforms`` attribute of component classes. After +the Reader has finished processing, the Publisher_ calls +``Transformer.populate_from_components()`` with a list of components +and all default transforms are stored. + +Each transform is a class in a module in the ``docutils/transforms/`` +package, a subclass of ``docutils.tranforms.Transform``. Transform +classes each have a ``default_priority`` attribute which is used by +the Transformer to apply transforms in order (low to high). The +default priority can be overridden when adding transforms to the +Transformer object. + +Transformer responsibilities: + +* Apply transforms to the document tree, in priority order. + +* Store a mapping of component type name ('reader', 'writer', etc.) to + component objects. These are used by certain transforms (such as + "components.Filter") to determine suitability. + +Transform responsibilities: + +* Modify a doctree in-place, either purely transforming one structure + into another, or adding new structures based on the doctree and/or + external data. + +Examples of transforms (in the ``docutils/transforms/`` package): + +* frontmatter.DocInfo: Conversion of document metadata (bibliographic + information). + +* references.AnonymousHyperlinks: Resolution of anonymous references + to corresponding targets. + +* parts.Contents: Generates a table of contents for a document. + +* document.Merger: Combining multiple populated doctrees into one. + (Not yet implemented or fully understood.) + +* document.Splitter: Splits a document into a tree-structure of + subdocuments, perhaps by section. It will have to transform + references appropriately. (Neither implemented not remotely + understood.) + +* components.Filter: Includes or excludes elements which depend on a + specific Docutils component. + + +Writers +------- + +Writers produce the final output (HTML, XML, TeX, etc.). Writers +translate the internal `document tree`_ structure into the final data +format, possibly running Writer-specific transforms_ first. + +By the time the document gets to the Writer, it should be in final +form. The Writer's job is simply (and only) to translate from the +Docutils doctree structure to the target format. Some small +transforms may be required, but they should be local and +format-specific. + +Each writer is a module or package exporting a "Writer" class with a +"write" method. The base "Writer" class can be found in the +``docutils/writers/__init__.py`` module. + +Responsibilities: + +* Translate doctree(s) into specific output formats. + + - Transform references into format-native forms. + +* Write the translated output to the destination I/O. + +Examples: + +* XML: Various forms, such as: + + - Docutils XML (an expression of the internal document tree, + implemented as ``docutils.writers.docutils_xml``). + + - DocBook (being implemented in the Docutils sandbox). + +* HTML (XHTML implemented as ``docutils.writers.html4css1``). + +* PDF (a ReportLabs interface is being developed in the Docutils + sandbox). + +* TeX (a LaTeX Writer is being implemented in the sandbox). + +* Docutils-native pseudo-XML (implemented as + ``docutils.writers.pseudoxml``, used for testing). + +* Plain text + +* reStructuredText? + + +Input/Output +------------ + +I/O classes provide a uniform API for low-level input and output. +Subclasses will exist for a variety of input/output mechanisms. +However, they can be considered an implementation detail. Most +applications should be satisfied using one of the convenience +functions associated with the Publisher_. + +I/O classes are currently in the preliminary stages; there's a lot of +work yet to be done. Issues: + +* How to represent multi-file input (files & directories) in the API? + +* How to represent multi-file output? Perhaps "Writer" variants, one + for each output distribution type? Or Output objects with + associated transforms? + +Responsibilities: + +* Read data from the input source (Input objects) or write data to the + output destination (Output objects). + +Examples of input sources: + +* A single file on disk or a stream (implemented as + ``docutils.io.FileInput``). + +* Multiple files on disk (``MultiFileInput``?). + +* Python source files: modules and packages. + +* Python strings, as received from a client application + (implemented as ``docutils.io.StringInput``). + +Examples of output destinations: + +* A single file on disk or a stream (implemented as + ``docutils.io.FileOutput``). + +* A tree of directories and files on disk. + +* A Python string, returned to a client application (implemented as + ``docutils.io.StringOutput``). + +* No output; useful for programmatic applications where only a portion + of the normal output is to be used (implemented as + ``docutils.io.NullOutput``). + +* A single tree-shaped data structure in memory. + +* Some other set of data structures in memory. + + +Docutils Package Structure +========================== + +* Package "docutils". + + - Module "__init__.py" contains: class "Component", a base class for + Docutils components; class "SettingsSpec", a base class for + specifying runtime settings (used by docutils.frontend); and class + "TransformSpec", a base class for specifying transforms. + + - Module "docutils.core" contains facade class "Publisher" and + convenience functions. See `Publisher`_ above. + + - Module "docutils.frontend" provides runtime settings support, for + programmatic use and front-end tools (including configuration file + support, and command-line argument and option processing). + + - Module "docutils.io" provides a uniform API for low-level input + and output. See `Input/Output`_ above. + + - Module "docutils.nodes" contains the Docutils document tree + element class library plus tree-traversal Visitor pattern base + classes. See `Document Tree`_ below. + + - Module "docutils.statemachine" contains a finite state machine + specialized for regular-expression-based text filters and parsers. + The reStructuredText parser implementation is based on this + module. + + - Module "docutils.urischemes" contains a mapping of known URI + schemes ("http", "ftp", "mail", etc.). + + - Module "docutils.utils" contains utility functions and classes, + including a logger class ("Reporter"; see `Error Handling`_ + below). + + - Package "docutils.parsers": markup parsers_. + + - Function "get_parser_class(parser_name)" returns a parser module + by name. Class "Parser" is the base class of specific parsers. + (``docutils/parsers/__init__.py``) + + - Package "docutils.parsers.rst": the reStructuredText parser. + + - Alternate markup parsers may be added. + + See `Parsers`_ above. + + - Package "docutils.readers": context-aware input readers. + + - Function "get_reader_class(reader_name)" returns a reader module + by name or alias. Class "Reader" is the base class of specific + readers. (``docutils/readers/__init__.py``) + + - Module "docutils.readers.standalone" reads independent document + files. + + - Module "docutils.readers.pep" reads PEPs (Python Enhancement + Proposals). + + - Module "docutils.readers.doctree" is used to re-read a + previously stored document tree for reprocessing. + + - Readers to be added for: Python source code (structure & + docstrings), email, FAQ, and perhaps Wiki and others. + + See `Readers`_ above. + + - Package "docutils.writers": output format writers. + + - Function "get_writer_class(writer_name)" returns a writer module + by name. Class "Writer" is the base class of specific writers. + (``docutils/writers/__init__.py``) + + - Package "docutils.writers.html4css1" is a simple HyperText + Markup Language document tree writer for HTML 4.01 and CSS1. + + - Package "docutils.writers.pep_html" generates HTML from + reStructuredText PEPs. + + - Package "docutils.writers.s5_html" generates S5/HTML slide + shows. + + - Package "docutils.writers.latex2e" writes LaTeX. + + - Package "docutils.writers.newlatex2e" also writes LaTeX; it is a + new implementation. + + - Module "docutils.writers.docutils_xml" writes the internal + document tree in XML form. + + - Module "docutils.writers.pseudoxml" is a simple internal + document tree writer; it writes indented pseudo-XML. + + - Module "docutils.writers.null" is a do-nothing writer; it is + used for specialized purposes such as storing the internal + document tree. + + - Writers to be added: HTML 3.2 or 4.01-loose, XML (various forms, + such as DocBook), PDF, plaintext, reStructuredText, and perhaps + others. + + Subpackages of "docutils.writers" contain modules and data files + (such as stylesheets) that support the individual writers. + + See `Writers`_ above. + + - Package "docutils.transforms": tree transform classes. + + - Class "Transformer" stores transforms and applies them to + document trees. (``docutils/transforms/__init__.py``) + + - Class "Transform" is the base class of specific transforms. + (``docutils/transforms/__init__.py``) + + - Each module contains related transform classes. + + See `Transforms`_ above. + + - Package "docutils.languages": Language modules contain + language-dependent strings and mappings. They are named for their + language identifier (as defined in `Choice of Docstring Format`_ + below), converting dashes to underscores. + + - Function "get_language(language_code)", returns matching + language module. (``docutils/languages/__init__.py``) + + - Modules: en.py (English), de.py (German), fr.py (French), it.py + (Italian), sk.py (Slovak), sv.py (Swedish). + + - Other languages to be added. + +* Third-party modules: "extras" directory. These modules are + installed only if they're not already present in the Python + installation. + + - ``extras/optparse.py`` and ``extras/textwrap.py`` provide + option parsing and command-line help; from Greg Ward's + http://optik.sf.net/ project, included for convenience. + + - ``extras/roman.py`` contains Roman numeral conversion routines. + + +Front-End Tools +=============== + +The ``tools/`` directory contains several front ends for common +Docutils processing. See `Docutils Front-End Tools`_ for details. + +.. _Docutils Front-End Tools: + http://docutils.sourceforge.net/docs/user/tools.html + + +Document Tree +============= + +A single intermediate data structure is used internally by Docutils, +in the interfaces between components; it is defined in the +``docutils.nodes`` module. It is not required that this data +structure be used *internally* by any of the components, just +*between* components as outlined in the diagram in the `Docutils +Project Model`_ above. + +Custom node types are allowed, provided that either (a) a transform +converts them to standard Docutils nodes before they reach the Writer +proper, or (b) the custom node is explicitly supported by certain +Writers, and is wrapped in a filtered "pending" node. An example of +condition (a) is the `Python Source Reader`_ (see below), where a +"stylist" transform converts custom nodes. The HTML ``<meta>`` tag is +an example of condition (b); it is supported by the HTML Writer but +not by others. The reStructuredText "meta" directive creates a +"pending" node, which contains knowledge that the embedded "meta" node +can only be handled by HTML-compatible writers. The "pending" node is +resolved by the ``docutils.transforms.components.Filter`` transform, +which checks that the calling writer supports HTML; if it doesn't, the +"pending" node (and enclosed "meta" node) is removed from the +document. + +The document tree data structure is similar to a DOM tree, but with +specific node names (classes) instead of DOM's generic nodes. The +schema is documented in an XML DTD (eXtensible Markup Language +Document Type Definition), which comes in two parts: + +* the Docutils Generic DTD, docutils.dtd_, and + +* the OASIS Exchange Table Model, soextbl.dtd_. + +The DTD defines a rich set of elements, suitable for many input and +output formats. The DTD retains all information necessary to +reconstruct the original input text, or a reasonable facsimile +thereof. + +See `The Docutils Document Tree`_ for details (incomplete). + + +Error Handling +============== + +When the parser encounters an error in markup, it inserts a system +message (DTD element "system_message"). There are five levels of +system messages: + +* Level-0, "DEBUG": an internal reporting issue. There is no effect + on the processing. Level-0 system messages are handled separately + from the others. + +* Level-1, "INFO": a minor issue that can be ignored. There is little + or no effect on the processing. Typically level-1 system messages + are not reported. + +* Level-2, "WARNING": an issue that should be addressed. If ignored, + there may be minor problems with the output. Typically level-2 + system messages are reported but do not halt processing. + +* Level-3, "ERROR": a major issue that should be addressed. If + ignored, the output will contain unpredictable errors. Typically + level-3 system messages are reported but do not halt processing. + +* Level-4, "SEVERE": a critical error that must be addressed. + Typically level-4 system messages are turned into exceptions which + do halt processing. If ignored, the output will contain severe + errors. + +Although the initial message levels were devised independently, they +have a strong correspondence to `VMS error condition severity +levels`_; the names in quotes for levels 1 through 4 were borrowed +from VMS. Error handling has since been influenced by the `log4j +project`_. + + +Python Source Reader +==================== + +The Python Source Reader ("PySource") is the Docutils component that +reads Python source files, extracts docstrings in context, then +parses, links, and assembles the docstrings into a cohesive whole. It +is a major and non-trivial component, currently under experimental +development in the Docutils sandbox. High-level design issues are +presented here. + + +Processing Model +---------------- + +This model will evolve over time, incorporating experience and +discoveries. + +1. The PySource Reader uses an Input class to read in Python packages + and modules, into a tree of strings. + +2. The Python modules are parsed, converting the tree of strings into + a tree of abstract syntax trees with docstring nodes. + +3. The abstract syntax trees are converted into an internal + representation of the packages/modules. Docstrings are extracted, + as well as code structure details. See `AST Mining`_ below. + Namespaces are constructed for lookup in step 6. + +4. One at a time, the docstrings are parsed, producing standard + Docutils doctrees. + +5. PySource assembles all the individual docstrings' doctrees into a + Python-specific custom Docutils tree paralleling the + package/module/class structure; this is a custom Reader-specific + internal representation (see the `Docutils Python Source DTD`_). + Namespaces must be merged: Python identifiers, hyperlink targets. + +6. Cross-references from docstrings (interpreted text) to Python + identifiers are resolved according to the Python namespace lookup + rules. See `Identifier Cross-References`_ below. + +7. A "Stylist" transform is applied to the custom doctree (by the + Transformer_), custom nodes are rendered using standard nodes as + primitives, and a standard document tree is emitted. See `Stylist + Transforms`_ below. + +8. Other transforms are applied to the standard doctree by the + Transformer_. + +9. The standard doctree is sent to a Writer, which translates the + document into a concrete format (HTML, PDF, etc.). + +10. The Writer uses an Output class to write the resulting data to its + destination (disk file, directories and files, etc.). + + +AST Mining +---------- + +Abstract Syntax Tree mining code will be written (or adapted) that +scans a parsed Python module, and returns an ordered tree containing +the names, docstrings (including attribute and additional docstrings; +see below), and additional info (in parentheses below) of all of the +following objects: + +* packages +* modules +* module attributes (+ initial values) +* classes (+ inheritance) +* class attributes (+ initial values) +* instance attributes (+ initial values) +* methods (+ parameters & defaults) +* functions (+ parameters & defaults) + +(Extract comments too? For example, comments at the start of a module +would be a good place for bibliographic field lists.) + +In order to evaluate interpreted text cross-references, namespaces for +each of the above will also be required. + +See the python-dev/docstring-develop thread "AST mining", started on +2001-08-14. + + +Docstring Extraction Rules +-------------------------- + +1. What to examine: + + a) If the "``__all__``" variable is present in the module being + documented, only identifiers listed in "``__all__``" are + examined for docstrings. + + b) In the absence of "``__all__``", all identifiers are examined, + except those whose names are private (names begin with "_" but + don't begin and end with "__"). + + c) 1a and 1b can be overridden by runtime settings. + +2. Where: + + Docstrings are string literal expressions, and are recognized in + the following places within Python modules: + + a) At the beginning of a module, function definition, class + definition, or method definition, after any comments. This is + the standard for Python ``__doc__`` attributes. + + b) Immediately following a simple assignment at the top level of a + module, class definition, or ``__init__`` method definition, + after any comments. See `Attribute Docstrings`_ below. + + c) Additional string literals found immediately after the + docstrings in (a) and (b) will be recognized, extracted, and + concatenated. See `Additional Docstrings`_ below. + + d) @@@ 2.2-style "properties" with attribute docstrings? Wait for + syntax? + +3. How: + + Whenever possible, Python modules should be parsed by Docutils, not + imported. There are several reasons: + + - Importing untrusted code is inherently insecure. + + - Information from the source is lost when using introspection to + examine an imported module, such as comments and the order of + definitions. + + - Docstrings are to be recognized in places where the byte-code + compiler ignores string literal expressions (2b and 2c above), + meaning importing the module will lose these docstrings. + + Of course, standard Python parsing tools such as the "parser" + library module should be used. + + When the Python source code for a module is not available + (i.e. only the ``.pyc`` file exists) or for C extension modules, to + access docstrings the module can only be imported, and any + limitations must be lived with. + +Since attribute docstrings and additional docstrings are ignored by +the Python byte-code compiler, no namespace pollution or runtime bloat +will result from their use. They are not assigned to ``__doc__`` or +to any other attribute. The initial parsing of a module may take a +slight performance hit. + + +Attribute Docstrings +'''''''''''''''''''' + +(This is a simplified version of PEP 224 [#PEP-224]_.) + +A string literal immediately following an assignment statement is +interpreted by the docstring extraction machinery as the docstring of +the target of the assignment statement, under the following +conditions: + +1. The assignment must be in one of the following contexts: + + a) At the top level of a module (i.e., not nested inside a compound + statement such as a loop or conditional): a module attribute. + + b) At the top level of a class definition: a class attribute. + + c) At the top level of the "``__init__``" method definition of a + class: an instance attribute. Instance attributes assigned in + other methods are assumed to be implementation details. (@@@ + ``__new__`` methods?) + + d) A function attribute assignment at the top level of a module or + class definition. + + Since each of the above contexts are at the top level (i.e., in the + outermost suite of a definition), it may be necessary to place + dummy assignments for attributes assigned conditionally or in a + loop. + +2. The assignment must be to a single target, not to a list or a tuple + of targets. + +3. The form of the target: + + a) For contexts 1a and 1b above, the target must be a simple + identifier (not a dotted identifier, a subscripted expression, + or a sliced expression). + + b) For context 1c above, the target must be of the form + "``self.attrib``", where "``self``" matches the "``__init__``" + method's first parameter (the instance parameter) and "attrib" + is a simple identifier as in 3a. + + c) For context 1d above, the target must be of the form + "``name.attrib``", where "``name``" matches an already-defined + function or method name and "attrib" is a simple identifier as + in 3a. + +Blank lines may be used after attribute docstrings to emphasize the +connection between the assignment and the docstring. + +Examples:: + + g = 'module attribute (module-global variable)' + """This is g's docstring.""" + + class AClass: + + c = 'class attribute' + """This is AClass.c's docstring.""" + + def __init__(self): + """Method __init__'s docstring.""" + + self.i = 'instance attribute' + """This is self.i's docstring.""" + + def f(x): + """Function f's docstring.""" + return x**2 + + f.a = 1 + """Function attribute f.a's docstring.""" + + +Additional Docstrings +''''''''''''''''''''' + +(This idea was adapted from PEP 216 [#PEP-216]_.) + +Many programmers would like to make extensive use of docstrings for +API documentation. However, docstrings do take up space in the +running program, so some programmers are reluctant to "bloat up" their +code. Also, not all API documentation is applicable to interactive +environments, where ``__doc__`` would be displayed. + +Docutils' docstring extraction tools will concatenate all string +literal expressions which appear at the beginning of a definition or +after a simple assignment. Only the first strings in definitions will +be available as ``__doc__``, and can be used for brief usage text +suitable for interactive sessions; subsequent string literals and all +attribute docstrings are ignored by the Python byte-code compiler and +may contain more extensive API information. + +Example:: + + def function(arg): + """This is __doc__, function's docstring.""" + """ + This is an additional docstring, ignored by the byte-code + compiler, but extracted by Docutils. + """ + pass + +.. topic:: Issue: ``from __future__ import`` + + This would break "``from __future__ import``" statements introduced + in Python 2.1 for multiple module docstrings (main docstring plus + additional docstring(s)). The Python Reference Manual specifies: + + A future statement must appear near the top of the module. The + only lines that can appear before a future statement are: + + * the module docstring (if any), + * comments, + * blank lines, and + * other future statements. + + Resolution? + + 1. Should we search for docstrings after a ``__future__`` + statement? Very ugly. + + 2. Redefine ``__future__`` statements to allow multiple preceding + string literals? + + 3. Or should we not even worry about this? There probably + shouldn't be ``__future__`` statements in production code, after + all. Perhaps modules with ``__future__`` statements will simply + have to put up with the single-docstring limitation. + + +Choice of Docstring Format +-------------------------- + +Rather than force everyone to use a single docstring format, multiple +input formats are allowed by the processing system. A special +variable, ``__docformat__``, may appear at the top level of a module +before any function or class definitions. Over time or through +decree, a standard format or set of formats should emerge. + +A module's ``__docformat__`` variable only applies to the objects +defined in the module's file. In particular, the ``__docformat__`` +variable in a package's ``__init__.py`` file does not apply to objects +defined in subpackages and submodules. + +The ``__docformat__`` variable is a string containing the name of the +format being used, a case-insensitive string matching the input +parser's module or package name (i.e., the same name as required to +"import" the module or package), or a registered alias. If no +``__docformat__`` is specified, the default format is "plaintext" for +now; this may be changed to the standard format if one is ever +established. + +The ``__docformat__`` string may contain an optional second field, +separated from the format name (first field) by a single space: a +case-insensitive language identifier as defined in RFC 1766. A +typical language identifier consists of a 2-letter language code from +`ISO 639`_ (3-letter codes used only if no 2-letter code exists; RFC +1766 is currently being revised to allow 3-letter codes). If no +language identifier is specified, the default is "en" for English. +The language identifier is passed to the parser and can be used for +language-dependent markup features. + + +Identifier Cross-References +--------------------------- + +In Python docstrings, interpreted text is used to classify and mark up +program identifiers, such as the names of variables, functions, +classes, and modules. If the identifier alone is given, its role is +inferred implicitly according to the Python namespace lookup rules. +For functions and methods (even when dynamically assigned), +parentheses ('()') may be included:: + + This function uses `another()` to do its work. + +For class, instance and module attributes, dotted identifiers are used +when necessary. For example (using reStructuredText markup):: + + class Keeper(Storer): + + """ + Extend `Storer`. Class attribute `instances` keeps track + of the number of `Keeper` objects instantiated. + """ + + instances = 0 + """How many `Keeper` objects are there?""" + + def __init__(self): + """ + Extend `Storer.__init__()` to keep track of instances. + + Keep count in `Keeper.instances`, data in `self.data`. + """ + Storer.__init__(self) + Keeper.instances += 1 + + self.data = [] + """Store data in a list, most recent last.""" + + def store_data(self, data): + """ + Extend `Storer.store_data()`; append new `data` to a + list (in `self.data`). + """ + self.data = data + +Each of the identifiers quoted with backquotes ("`") will become +references to the definitions of the identifiers themselves. + + +Stylist Transforms +------------------ + +Stylist transforms are specialized transforms specific to the PySource +Reader. The PySource Reader doesn't have to make any decisions as to +style; it just produces a logically constructed document tree, parsed +and linked, including custom node types. Stylist transforms +understand the custom nodes created by the Reader and convert them +into standard Docutils nodes. + +Multiple Stylist transforms may be implemented and one can be chosen +at runtime (through a "--style" or "--stylist" command-line option). +Each Stylist transform implements a different layout or style; thus +the name. They decouple the context-understanding part of the Reader +from the layout-generating part of processing, resulting in a more +flexible and robust system. This also serves to "separate style from +content", the SGML/XML ideal. + +By keeping the piece of code that does the styling small and modular, +it becomes much easier for people to roll their own styles. The +"barrier to entry" is too high with existing tools; extracting the +stylist code will lower the barrier considerably. + + +========================== + References and Footnotes +========================== + +.. [#PEP-256] PEP 256, Docstring Processing System Framework, Goodger + (http://www.python.org/peps/pep-0256.html) + +.. [#PEP-224] PEP 224, Attribute Docstrings, Lemburg + (http://www.python.org/peps/pep-0224.html) + +.. [#PEP-216] PEP 216, Docstring Format, Zadka + (http://www.python.org/peps/pep-0216.html) + +.. _docutils.dtd: + http://docutils.sourceforge.net/docs/ref/docutils.dtd + +.. _soextbl.dtd: + http://docutils.sourceforge.net/docs/ref/soextblx.dtd + +.. _The Docutils Document Tree: + http://docutils.sourceforge.net/docs/ref/doctree.html + +.. _VMS error condition severity levels: + http://www.openvms.compaq.com:8000/73final/5841/841pro_027.html + #error_cond_severity + +.. _log4j project: http://logging.apache.org/log4j/docs/index.html + +.. _Docutils Python Source DTD: + http://docutils.sourceforge.net/docs/dev/pysource.dtd + +.. _ISO 639: http://www.loc.gov/standards/iso639-2/englangn.html + +.. _Python Doc-SIG: http://www.python.org/sigs/doc-sig/ + + + +================== + Project Web Site +================== + +A SourceForge project has been set up for this work at +http://docutils.sourceforge.net/. + + +=========== + Copyright +=========== + +This document has been placed in the public domain. + + +================== + Acknowledgements +================== + +This document borrows ideas from the archives of the `Python +Doc-SIG`_. Thanks to all members past & present. + + + +.. + Local Variables: + mode: indented-text + indent-tabs-mode: nil + sentence-end-double-space: t + fill-column: 70 + End: |