Diffstat (limited to 'docutils/docs/dev')
-rw-r--r-- | docutils/docs/dev/distributing.txt | 146
-rw-r--r-- | docutils/docs/dev/enthought-plan.txt | 480
-rw-r--r-- | docutils/docs/dev/enthought-rfp.txt | 146
-rw-r--r-- | docutils/docs/dev/hacking.txt | 264
-rw-r--r-- | docutils/docs/dev/policies.txt | 549
-rw-r--r-- | docutils/docs/dev/pysource.dtd | 259
-rw-r--r-- | docutils/docs/dev/pysource.txt | 130
-rw-r--r-- | docutils/docs/dev/release.txt | 168
-rw-r--r-- | docutils/docs/dev/repository.txt | 217
-rw-r--r-- | docutils/docs/dev/rst/alternatives.txt | 3129
-rw-r--r-- | docutils/docs/dev/rst/problems.txt | 872
-rw-r--r-- | docutils/docs/dev/semantics.txt | 119
-rw-r--r-- | docutils/docs/dev/testing.txt | 246
-rw-r--r-- | docutils/docs/dev/todo.txt | 1964
-rw-r--r-- | docutils/docs/dev/website.txt | 46
15 files changed, 0 insertions, 8735 deletions
diff --git a/docutils/docs/dev/distributing.txt b/docutils/docs/dev/distributing.txt deleted file mode 100644 index c81807279..000000000 --- a/docutils/docs/dev/distributing.txt +++ /dev/null @@ -1,146 +0,0 @@ -=============================== - Docutils_ Distributor's Guide -=============================== - -:Author: Felix Wiemann -:Contact: Felix.Wiemann@ososo.de -:Revision: $Revision$ -:Date: $Date$ -:Copyright: This document has been placed in the public domain. - -.. _Docutils: http://docutils.sourceforge.net/ - -.. contents:: - -This document describes how to create packages of Docutils (e.g. for -shipping with a Linux distribution). If you have any questions, -please direct them to the Docutils-develop_ mailing list. - -First, please download the most current `release tarball`_ and unpack -it. - -.. _Docutils-develop: ../user/mailing-lists.html#docutils-develop -.. _release tarball: http://docutils.sourceforge.net/#download - - -Dependencies -============ - -Docutils has the following dependencies: - -* Python 2.1 or later is required. While the compiler package from - the Tools/ directory of Python's source distribution must be - installed for the test suite to pass with Python 2.1, the - functionality available to end users should be available without the - compiler package as well. So just use ">= Python 2.1" in the - dependencies. - -* Docutils may optionally make use of the PIL (`Python Imaging - Library`_). If PIL is present, it is automatically detected by - Docutils. - -* There are three files in the ``extras/`` directory of the Docutils - distribution, ``optparse.py``, ``textwrap.py``, and ``roman.py``. - For Python 2.1/2.2, all of them must be installed (into the - ``site-packages/`` directory). Python 2.3 and later versions have - ``textwrap`` and ``optparse`` included in the standard library, so - only ``roman.py`` is required here; installing the other files won't - hurt, though. 
- - These files are automatically installed by the setup script (when - calling "python setup.py install"). - -.. _Python Imaging Library: http://www.pythonware.com/products/pil/ - - -Python Files -============ - -The Docutils Python files must be installed into the -``site-packages/`` directory of Python. Running ``python setup.py -install`` should do the trick, but if you want to place the files -yourself, you can just install the ``docutils/`` directory of the -Docutils tarball to ``/usr/lib/python/site-packages/docutils/``. In -this case you should also compile the Python files to ``.pyc`` and/or -``.pyo`` files so that Docutils doesn't need to be recompiled every -time it's executed. - - -Executables -=========== - -The executable front-end tools are located in the ``tools/`` directory -of the Docutils tarball. - -The ``rst2*.py`` tools (except ``rst2newlatex.py``) are intended for -end-users. You should install them to ``/usr/bin/``. You do not need -to change the names (e.g. to ``docutils-rst2html.py``) because the -``rst2`` prefix is unique. - - -Documentation -============= - -The documentation should be generated using ``buildhtml.py``. To -generate HTML for all documentation files, go to the ``tools/`` -directory and run:: - - # Place html4css1.css in base directory. - cp ../docutils/writers/html4css1/html4css1.css .. - ./buildhtml.py --stylesheet-path=../html4css1.css .. - -Then install the following files to ``/usr/share/doc/docutils/`` (or -wherever you install documentation): - -* All ``.html`` and ``.txt`` files in the base directory. - -* The ``docs/`` directory. - - Do not install the contents of the ``docs/`` directory directly to - ``/usr/share/doc/docutils/``; it's incomplete and would contain - invalid references! - -* The ``licenses/`` directory. - -* ``html4css1.css`` in the base directory. 
- - -Removing the ``.txt`` Files ---------------------------- - -If you are tight with disk space, you can remove all ``.txt`` files in -the tree except for: - -* those in the ``licenses/`` directory because they have not been - processed to HTML and - -* ``user/rst/cheatsheet.txt`` and ``user/rst/demo.txt``, which should - be readable in source form. - -Before you remove the ``.txt`` files you should rerun ``buildhtml.py`` -with the ``--no-source-link`` switch to avoid broken references to the -source files. - - -Other Files -=========== - -You may want to install the Emacs-Lisp files -``tools/editors/emacs/*.el`` into the appropriate directory. - - -Configuration File -================== - -It is possible to have a system-wide configuration file at -``/etc/docutils.conf``. However, this is usually not necessary. You -should *not* install ``tools/docutils.conf`` into ``/etc/``. - - -Tests -===== - -While you probably do not need to ship the tests with your -distribution, you can test your package by installing it and then -running ``alltests.py`` from the ``tests/`` directory of the Docutils -tarball. diff --git a/docutils/docs/dev/enthought-plan.txt b/docutils/docs/dev/enthought-plan.txt deleted file mode 100644 index 0ab0d3c83..000000000 --- a/docutils/docs/dev/enthought-plan.txt +++ /dev/null @@ -1,480 +0,0 @@ -=========================================== - Plan for Enthought API Documentation Tool -=========================================== - -:Author: David Goodger -:Contact: goodger@python.org -:Date: $Date$ -:Revision: $Revision$ -:Copyright: 2004 by `Enthought, Inc. <http://www.enthought.com>`_ -:License: `Enthought License`_ (BSD-style) - -.. _Enthought License: http://docutils.sf.net/licenses/enthought.txt - -This document should be read in conjunction with the `Enthought API -Documentation Tool RFP`__ prepared by Janet Swisher. - -__ enthought-rfp.html - -.. contents:: -.. 
sectnum:: - - -Introduction -============ - -In March 2004 I met Eric Jones, president and CTO of `Enthought, -Inc.`_, at `PyCon 2004`_ in Washington DC. He told me that Enthought -was using reStructuredText_ for source code documentation, but they -had some issues. He asked if I'd be interested in doing some work on -a customized API documentation tool. Shortly after PyCon, Janet -Swisher, Enthought's senior technical writer, contacted me to work out -details. Some email, a trip to Austin in May, and plenty of Texas -hospitality later, we had a project. This document will record the -details, milestones, and evolution of the project. - -In a nutshell, Enthought is sponsoring the implementation of an open -source API documentation tool that meets their needs. Fortuitously, -their needs coincide well with the "Python Source Reader" description -in `PEP 258`_. In other words, Enthought is funding some significant -improvements to Docutils, improvements that were planned but never -implemented due to time and other constraints. The implementation -will take place gradually over several months, on a part-time basis. - -This is an ideal example of cooperation between a corporation and an -open-source project. The corporation, the project, I personally, and -the community all benefit. Enthought, whose commitment to open source -is also evidenced by their sponsorship of SciPy_, benefits by -obtaining a useful piece of software, much more quickly than would -have been possible without their support. Docutils benefits directly -from the implementation of one of its core subsystems. I benefit from -the funding, which allows me to justify the long hours to my wife and -family. All the corporations, projects, and individuals that make up -the community will benefit from the end result, which will be great. - -All that's left now is to actually do the work! - -.. _PyCon 2004: http://pycon.org/dc2004/ -.. _reStructuredText: http://docutils.sf.net/rst.html -.. 
_SciPy: http://www.scipy.org/ - - -Development Plan -================ - -1. Analyze prior art, most notably Epydoc_ and HappyDoc_, to see how - they do what they do. I have no desire to reinvent wheels - unnecessarily. I want to take the best ideas from each tool, - combined with the outline in `PEP 258`_ (which will evolve), and - build at least the foundation of the definitive Python - auto-documentation tool. - - .. _Epydoc: http://epydoc.sourceforge.net/ - .. _HappyDoc: http://happydoc.sourceforge.net/ - .. _PEP 258: - http://docutils.sf.net/docs/peps/pep-0258.html#python-source-reader - -2. Decide on a base platform. The best way to achieve Enthought's - goals in a reasonable time frame may be to extend Epydoc or - HappyDoc. Or it may be necessary to start fresh. - -3. Extend the reStructuredText parser. See `Proposed Changes to - reStructuredText`_ below. - -4. Depending on the base platform chosen, build or extend the - docstring & doc comment extraction tool. This may be the biggest - part of the project, but I won't be able to break it down into - details until more is known. - - -Repository -========== - -If possible, all software and documentation files will be stored in -the Subversion repository of Docutils and/or the base project, which -are all publicly-available via anonymous pserver access. - -The Docutils project is very open about granting Subversion write -access; so far, everyone who asked has been given access. Any -Enthought staff member who would like Subversion write access will get -it. - -If either Epydoc or HappyDoc is chosen as the base platform, I will -ask the project's administrator for CVS access for myself and any -Enthought staff member who wants it. If sufficient access is not -granted -- although I doubt that there would be any problem -- we may -have to begin a fork, which could be hosted on SourceForge, on -Enthought's Subversion server, or anywhere else deemed appropriate. 
- - -Copyright & License -=================== - -Most existing Docutils files have been placed in the public domain, as -follows:: - - :Copyright: This document has been placed in the public domain. - -This is in conjunction with the "Public Domain Dedication" section of -COPYING.txt__. - -__ http://docutils.sourceforge.net/COPYING.html - -The code and documentation originating from Enthought funding will -have Enthought's copyright and license declaration. While I will try -to keep Enthought-specific code and documentation separate from the -existing files, there will inevitably be cases where it makes the most -sense to extend existing files. - -I propose the following: - -1. New files related to this Enthought-funded work will be identified - with the following field-list headers:: - - :Copyright: 2004 by Enthought, Inc. - :License: Enthought License (BSD Style) - - The license field text will be linked to the license file itself. - -2. For significant or major changes to an existing file (more than 10% - change), the headers shall change as follows (for example):: - - :Copyright: 2001-2004 by David Goodger - :Copyright: 2004 by Enthought, Inc. - :License: BSD-style - - If the Enthought-funded portion becomes greater than the previously - existing portion, Enthought's copyright line will be shown first. - -3. In cases of insignificant or minor changes to an existing file - (less than 10% change), the public domain status shall remain - unchanged. - -A section describing all of this will be added to the Docutils -`COPYING`__ instructions file. - -If another project is chosen as the base project, similar changes -would be made to their files, subject to negotiation. - -__ http://docutils.sf.net/COPYING.html - - -Proposed Changes to reStructuredText -==================================== - -Doc Comment Syntax ------------------- - -The "traits" construct is implemented as dictionaries, where -standalone strings would be Python syntax errors. 
Therefore traits -require documentation in comments. We also need a way to -differentiate between ordinary "internal" comments and documentation -comments (doc comments). - -Javadoc uses the following syntax for doc comments:: - - /** - * The first line of a multi-line doc comment begins with a slash - * and *two* asterisks. The doc comment ends normally. - */ - -Python doesn't have multi-line comments; only single-line. A similar -convention in Python might look like this:: - - ## - # The first line of a doc comment begins with *two* hash marks. - # The doc comment ends with the first non-comment line. - 'data' : AnyValue, - - ## The double-hash-marks could occur on the first line of text, - # saving a line in the source. - 'data' : AnyValue, - -How to indicate the end of the doc comment? :: - - ## - # The first line of a doc comment begins with *two* hash marks. - # The doc comment ends with the first non-comment line, or another - # double-hash-mark. - ## - # This is an ordinary, internal, non-doc comment. - 'data' : AnyValue, - - ## First line of a doc comment, terse syntax. - # Second (and last) line. Ends here: ## - # This is an ordinary, internal, non-doc comment. - 'data' : AnyValue, - -Or do we even need to worry about this case? A simple blank line -could be used:: - - ## First line of a doc comment, terse syntax. - # Second (and last) line. Ends with a blank line. - - # This is an ordinary, internal, non-doc comment. - 'data' : AnyValue, - -Other possibilities:: - - #" Instead of double-hash-marks, we could use a hash mark and a - # quotation mark to begin the doc comment. - 'data' : AnyValue, - - ## We could require double-hash-marks on every line. This has the - ## added benefit of delimiting the *end* of the doc comment, as - ## well as working well with line wrapping in Emacs - ## ("fill-paragraph" command). - # Ordinary non-doc comment. 
- 'data' : AnyValue, - - #" A hash mark and a quotation mark on each line looks funny, and - #" it doesn't work well with line wrapping in Emacs. - 'data' : AnyValue, - -These styles (repeated on each line) work well with line wrapping in -Emacs:: - - ## #> #| #- #% #! #* - -These styles do *not* work well with line wrapping in Emacs:: - - #" #' #: #) #. #/ #@ #$ #^ #= #+ #_ #~ - -The style of doc comment indicator used could be a runtime, global -and/or per-module setting. That may add more complexity than it's -worth though. - - -Recommendation -`````````````` - -I recommend adopting "#*" on every line:: - - # This is an ordinary non-doc comment. - - #* This is a documentation comment, with an asterisk after the - #* hash marks on every line. - 'data' : AnyValue, - -I initially recommended adopting double-hash-marks:: - - # This is an ordinary non-doc comment. - - ## This is a documentation comment, with double-hash-marks on - ## every line. - 'data' : AnyValue, - -But Janet Swisher rightly pointed out that this could collide with -ordinary comments that are then block-commented. This applies to -double-hash-marks on the first line only as well. So they're out. - -On the other hand, the JavaDoc-comment style ("##" on the first line -only, "#" after that) is used in Fredrik Lundh's PythonDoc_. It may -be worthwhile to conform to this syntax, reinforcing it as a standard. -PythonDoc does not support terse doc comments (text after "##" on the -first line). - -.. _PythonDoc: http://effbot.org/zone/pythondoc.htm - - -Update -`````` - -Enthought's Traits system has switched to a metaclass base, and traits -are now defined via ordinary attributes. Therefore doc comments are -no longer absolutely necessary; attribute docstrings will suffice. -Doc comments may still be desirable though, since they allow -documentation to precede the thing being documented. 
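The "#*" convention recommended above can be sketched with the standard ``tokenize`` module. This is only an illustration under stated assumptions, not part of Docutils or of the planned tool: the names ``DOC_PREFIX`` and ``extract_doc_comments`` are hypothetical, ordinary comments are simply skipped, and a blank line does not end a doc comment.

```python
import io
import tokenize

# Hypothetical marker, taken from the "#*" recommendation above.
DOC_PREFIX = "#*"

# Token types that carry no code and are skipped when looking for the
# item a doc comment belongs to.
_SKIPPED = (tokenize.NL, tokenize.NEWLINE, tokenize.INDENT,
            tokenize.DEDENT, tokenize.ENDMARKER)


def extract_doc_comments(source):
    """Return (doc_text, code_line) pairs for runs of '#*' comments.

    Each run of doc comment lines is attached to the first line of
    code that follows it.  Ordinary '#' comments are ignored.
    """
    results = []
    pending = []  # doc comment lines collected so far
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            if tok.string.startswith(DOC_PREFIX):
                pending.append(tok.string[len(DOC_PREFIX):].strip())
        elif tok.type in _SKIPPED:
            continue
        elif pending:
            # First real token after the doc comment: attach and reset.
            results.append((" ".join(pending), tok.line.strip()))
            pending = []
    return results


SAMPLE = """\
traits = {
    # This is an ordinary non-doc comment.

    #* This is a documentation comment, with an asterisk after the
    #* hash marks on every line.
    'data': AnyValue,
}
"""

for doc, code in extract_doc_comments(SAMPLE):
    print(code, "<-", doc)
```

Working on tokens rather than raw lines keeps a "#*" that appears inside a string literal from being mistaken for a doc comment, which is one reason a tokenizer may be preferable to a regular expression here.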
- - -Docstring Density & Whitespace Minimization -------------------------------------------- - -One problem with extensively documented classes & functions, is that -there is a lot of screen space wasted on whitespace. Here's some -current Enthought code (from lib/cp/fluids/gassmann.py):: - - def max_gas(temperature, pressure, api, specific_gravity=.56): - """ - Computes the maximum dissolved gas in oil using Batzle and - Wang (1992). - - Parameters - ---------- - temperature : sequence - Temperature in degrees Celsius - pressure : sequence - Pressure in MPa - api : sequence - Stock tank oil API - specific_gravity : sequence - Specific gravity of gas at STP, default is .56 - - Returns - ------- - max_gor : sequence - Maximum dissolved gas in liters/liter - - Description - ----------- - This estimate is based on equations given by Mavko, Mukerji, - and Dvorkin, (1998, pp. 218-219, or 2003, p. 236) obtained - originally from Batzle and Wang (1992). - """ - code... - -The docstring is 24 lines long. - -Rather than using subsections, field lists (which exist now) can save -6 lines:: - - def max_gas(temperature, pressure, api, specific_gravity=.56): - """ - Computes the maximum dissolved gas in oil using Batzle and - Wang (1992). - - :Parameters: - temperature : sequence - Temperature in degrees Celsius - pressure : sequence - Pressure in MPa - api : sequence - Stock tank oil API - specific_gravity : sequence - Specific gravity of gas at STP, default is .56 - :Returns: - max_gor : sequence - Maximum dissolved gas in liters/liter - :Description: This estimate is based on equations given by - Mavko, Mukerji, and Dvorkin, (1998, pp. 218-219, or 2003, - p. 236) obtained originally from Batzle and Wang (1992). - """ - code... - -As with the "Description" field above, field bodies may begin on the -same line as the field name, which also saves space. - -The output for field lists is typically a table structure. 
For -example: - - :Parameters: - temperature : sequence - Temperature in degrees Celsius - pressure : sequence - Pressure in MPa - api : sequence - Stock tank oil API - specific_gravity : sequence - Specific gravity of gas at STP, default is .56 - :Returns: - max_gor : sequence - Maximum dissolved gas in liters/liter - :Description: - This estimate is based on equations given by Mavko, - Mukerji, and Dvorkin, (1998, pp. 218-219, or 2003, p. 236) - obtained originally from Batzle and Wang (1992). - -But the definition lists describing the parameters and return values -are still wasteful of space. There are a lot of half-filled lines. - -Definition lists are currently defined as:: - - term : classifier - definition - -Where the classifier part is optional. Ideas for improvements: - -1. We could allow multiple classifiers:: - - term : classifier one : two : three ... - definition - -2. We could allow the definition on the same line as the term, using - some embedded/inline markup: - - * "--" could be used, but only in limited and well-known contexts:: - - term -- definition - - This is the syntax used by StructuredText (one of - reStructuredText's predecessors). It was not adopted for - reStructuredText because it is ambiguous -- people often use "--" - in their text, as I just did. But given a constrained context, - the ambiguity would be acceptable (or would it?). That context - would be: in docstrings, within a field list, perhaps only with - certain well-defined field names (parameters, returns). - - * The "constrained context" above isn't really enough to make the - ambiguity acceptable. Instead, a slightly more verbose but far - less ambiguous syntax is possible:: - - term === definition - - This syntax has advantages. Equals signs lend themselves to the - connotation of "definition". And whereas one or two equals signs - are commonly used in program code, three equals signs in a row - have no conflicting meanings that I know of. 
(Update: there - *are* uses out there.) - - The problem with this approach is that using inline markup for - structure is inherently ambiguous in reStructuredText. For - example, writing *about* definition lists would be difficult:: - - ``term === definition`` is an example of a compact definition list item - - The parser checks for structural markup before it does inline - markup processing. But the "===" should be protected by its inline - literal context. - -3. We could allow the definition on the same line as the term, using - structural markup. A variation on bullet lists would work well:: - - : term :: definition - : another term :: and a definition that - wraps across lines - - Some ambiguity remains:: - - : term ``containing :: double colons`` :: definition - - But the likelihood of such cases is negligible, and they can be - covered in the documentation. - - Other possibilities for the definition delimiter include:: - - : term : classifier -- definition - : term : classifier --- definition - : term : classifier : : definition - : term : classifier === definition - -The third idea currently has the best chance of being adopted and -implemented. - - -Recommendation -`````````````` - -Combining these ideas, the function definition becomes:: - - def max_gas(temperature, pressure, api, specific_gravity=.56): - """ - Computes the maximum dissolved gas in oil using Batzle and - Wang (1992). - - :Parameters: - : temperature : sequence :: Temperature in degrees Celsius - : pressure : sequence :: Pressure in MPa - : api : sequence :: Stock tank oil API - : specific_gravity : sequence :: Specific gravity of gas at - STP, default is .56 - :Returns: - : max_gor : sequence :: Maximum dissolved gas in liters/liter - :Description: This estimate is based on equations given by - Mavko, Mukerji, and Dvorkin, (1998, pp. 218-219, or 2003, - p. 236) obtained originally from Batzle and Wang (1992). - """ - code... - -The docstring is reduced to 14 lines, from the original 24. 
For -longer docstrings with many parameters and return values, the -difference would be more significant. diff --git a/docutils/docs/dev/enthought-rfp.txt b/docutils/docs/dev/enthought-rfp.txt deleted file mode 100644 index 986f5604f..000000000 --- a/docutils/docs/dev/enthought-rfp.txt +++ /dev/null @@ -1,146 +0,0 @@ -================================== - Enthought API Documentation Tool -================================== ------------------------ - Request for Proposals ------------------------ - -:Author: Janet Swisher, Senior Technical Writer -:Organization: `Enthought, Inc. <http://www.enthought.com>`_ -:Copyright: 2004 by Enthought, Inc. -:License: `Enthought License`_ (BSD Style) - -.. _Enthought License: http://docutils.sf.net/licenses/enthought.txt - -The following is excerpted from the full RFP, and is published here -with permission from `Enthought, Inc.`_ See the `Plan for Enthought -API Documentation Tool`__. - -__ enthought-plan.html - -.. contents:: -.. sectnum:: - - -Requirements -============ - -The documentation tool will address the following high-level goals: - - -Documentation Extraction ------------------------- - -1. Documentation will be generated directly from Python source code, - drawing from the code structure, docstrings, and possibly other - comments. - -2. The tool will extract logical constructs as appropriate, minimizing - the need for comments that are redundant with the code structure. - The output should reflect both documented and undocumented - elements. - - -Source Format -------------- - -1. The docstrings will be formatted in as terse syntax as possible. - Required tags, syntax, and white space should be minimized. - -2. The tool must support the use of Traits. Special comment syntax - for Traits may be necessary. Information about the Traits package - is available at http://code.enthought.com/traits/. 
In the - following example, each trait definition is prefaced by a plain - comment:: - - __traits__ = { - - # The current selection within the frame. - 'selection' : Trait([], TraitInstance(list)), - - # The frame has been activated or deactivated. - 'activated' : TraitEvent(), - - 'closing' : TraitEvent(), - - # The frame is closed. - 'closed' : TraitEvent(), - } - -3. Support for ReStructuredText (ReST) format is desirable, because - much of the existing docstrings uses ReST. However, the complete - ReST specification need not be supported, if a subset can achieve - the project goals. If the tool does not support ReST, the - contractor should also provide a tool or path to convert existing - docstrings. - - -Output Format -------------- - -1. Documentation will be output as a navigable suite of HTML - files. - -2. The style of the HTML files will be customizable by a cascading - style sheet and/or a customizable template. - -3. Page elements such as headers and footer should be customizable, to - support differing requirements from one documentation project to - the next. - - -Output Structure and Navigation -------------------------------- - -1. The navigation scheme for the HTML files should not rely on frames, - and should harmonize with conversion to Microsoft HTML Help (.chm) - format. - -2. The output should be structured to make navigable the architecture - of the Python code. Packages, modules, classes, traits, and - functions should be presented in clear, logical hierarchies. - Diagrams or trees for inheritance, collaboration, sub-packaging, - etc. are desirable but not required. - -3. The output must include indexes that provide a comprehensive view - of all packages, modules, and classes. These indexes will provide - readers with a clear and exhaustive view of the code base. These - indexes should be presented in a way that is easily accessible and - allows easy navigation. - -4. 
Cross-references to other documented elements will be used - throughout the documentation, to enable the reader to move quickly to - relevant information. For example, where type information for an - element is available, the type definition should be - cross-referenced. - -5. The HTML suite should provide consistent navigation back to the - home page, which will include the following information: - - * Bibliographic information - - - Author - - Copyright - - Release date - - Version number - - * Abstract - - * References - - - Links to related internal docs (i.e., other docs for the same - product) - - - Links to related external docs (e.g., supporting development - docs, Python support docs, docs for included packages) - - It should be possible to specify similar information at the top - level of each package, so that packages can be included as - appropriate for a given application. - - -License -======= - -Enthought intends to release the software under an open-source -("BSD-style") license. diff --git a/docutils/docs/dev/hacking.txt b/docutils/docs/dev/hacking.txt deleted file mode 100644 index d0ec9a3fb..000000000 --- a/docutils/docs/dev/hacking.txt +++ /dev/null @@ -1,264 +0,0 @@ -========================== - Docutils_ Hacker's Guide -========================== - -:Author: Felix Wiemann -:Contact: Felix.Wiemann@ososo.de -:Revision: $Revision$ -:Date: $Date$ -:Copyright: This document has been placed in the public domain. - -:Abstract: This is the introduction to Docutils for all persons who - want to extend Docutils in some way. -:Prerequisites: You have used reStructuredText_ and played around with - the `Docutils front-end tools`_ before. Some (basic) Python - knowledge is certainly helpful (though not necessary, strictly - speaking). - -.. _Docutils: http://docutils.sourceforge.net/ -.. _reStructuredText: http://docutils.sourceforge.net/rst.html -.. _Docutils front-end tools: ../user/tools.html - -.. 
contents:: - - -Overview of the Docutils Architecture -===================================== - -To give you an understanding of the Docutils architecture, we'll dive -right into the internals using a practical example. - -Consider the following reStructuredText file:: - - My *favorite* language is Python_. - - .. _Python: http://www.python.org/ - -Using the ``rst2html.py`` front-end tool, you would get an HTML output -which looks like this:: - - [uninteresting HTML code removed] - <body> - <div class="document"> - <p>My <em>favorite</em> language is <a class="reference" href="http://www.python.org/">Python</a>.</p> - </div> - </body> - </html> - -While this looks very simple, it's enough to illustrate all internal -processing stages of Docutils. Let's see how this document is -processed from the reStructuredText source to the final HTML output: - - -Reading the Document --------------------- - -The **Reader** reads the document from the source file and passes it -to the parser (see below). The default reader is the standalone -reader (``docutils/readers/standalone.py``) which just reads the input -data from a single text file. Unless you want to do really fancy -things, there is no need to change that. - -Since you probably won't need to touch readers, we will just move on -to the next stage: - - -Parsing the Document --------------------- - -The **Parser** analyzes the input document and creates a **node -tree** representation. In this case we are using the -**reStructuredText parser** (``docutils/parsers/rst/__init__.py``). 
-To see what that node tree looks like, we call ``quicktest.py`` (which -can be found in the ``tools/`` directory of the Docutils distribution) -with our example file (``test.txt``) as the first parameter (Windows users -might need to type ``python quicktest.py test.txt``):: - - $ quicktest.py test.txt - <document source="test.txt"> - <paragraph> - My - <emphasis> - favorite - language is - <reference name="Python" refname="python"> - Python - . - <target ids="python" names="python" refuri="http://www.python.org/"> - -Let us now examine the node tree: - -The top-level node is ``document``. It has a ``source`` attribute -whose value is ``test.txt``. There are two children: A ``paragraph`` -node and a ``target`` node. The ``paragraph`` in turn has children: A -text node ("My "), an ``emphasis`` node, a text node (" language is "), -a ``reference`` node, and again a ``Text`` node ("."). - -These node types (``document``, ``paragraph``, ``emphasis``, etc.) are -all defined in ``docutils/nodes.py``. The node types are internally -arranged as a class hierarchy (for example, both ``emphasis`` and -``reference`` have the common superclass ``Inline``). To get an -overview of the node class hierarchy, use epydoc (type ``epydoc -nodes.py``) and look at the class hierarchy tree. - - -Transforming the Document -------------------------- - -In the node tree above, the ``reference`` node does not contain the -target URI (``http://www.python.org/``) yet. - -Assigning the target URI (from the ``target`` node) to the -``reference`` node is *not* done by the parser (the parser only -translates the input document into a node tree). - -Instead, it's done by a **Transform**. In this case (resolving a -reference), it's done by the ``ExternalTargets`` transform in -``docutils/transforms/references.py``. - -In fact, there are quite a lot of Transforms, which do various useful -things like creating the table of contents, applying substitution -references or resolving auto-numbered footnotes. 
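To make the idea of a transform concrete, here is a toy model of the reference-resolving step described above. It is a minimal sketch that uses no Docutils code at all: ``Node``, ``walk`` and ``resolve_external_targets`` are hypothetical names invented for this illustration, and the real ``ExternalTargets`` transform is more involved.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A toy stand-in for a Docutils node: a tag, attributes, children."""
    tag: str
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def walk(node):
    """Yield the node and all of its descendants, depth first."""
    yield node
    for child in node.children:
        yield from walk(child)


def resolve_external_targets(document):
    """Replace each reference's refname with the matching target's refuri."""
    # First pass: collect the URI of every named target.
    targets = {}
    for node in walk(document):
        if node.tag == "target" and "refuri" in node.attrs:
            for name in node.attrs.get("names", []):
                targets[name] = node.attrs["refuri"]
    # Second pass: resolve references that point at a known target.
    for node in walk(document):
        if node.tag == "reference" and node.attrs.get("refname") in targets:
            node.attrs["refuri"] = targets[node.attrs.pop("refname")]
    return document


# The node tree from the example above, reduced to the relevant nodes.
document = Node("document", {"source": "test.txt"}, [
    Node("paragraph", children=[
        Node("reference", {"name": "Python", "refname": "python"}),
    ]),
    Node("target", {"ids": ["python"], "names": ["python"],
                    "refuri": "http://www.python.org/"}),
])

resolve_external_targets(document)
reference = document.children[0].children[0]
print(reference.attrs)
```

The two passes mirror the division of labour in the text: the parser only builds the tree, and a later step rewrites it once all targets are known.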
- -The Transforms are applied after parsing. To see how the node tree -has changed after applying the Transforms, we use the -``rst2pseudoxml.py`` tool: - -.. parsed-literal:: - - $ rst2pseudoxml.py test.txt - <document source="test.txt"> - <paragraph> - My - <emphasis> - favorite - language is - <reference name="Python" **refuri="http://www.python.org/"**> - Python - . - <target ids="python" names="python" ``refuri="http://www.python.org/"``> - -For our small test document, the only change is that the ``refname`` -attribute of the reference has been replaced by a ``refuri`` -attribute |---| the reference has been resolved. - -While this does not look very exciting, transforms are a powerful tool -to apply any kind of transformation on the node tree. - -By the way, you can also get a "real" XML representation of the node -tree by using ``rst2xml.py`` instead of ``rst2pseudoxml.py``. - - -Writing the Document --------------------- - -To get an HTML document out of the node tree, we use a **Writer**, the -HTML writer in this case (``docutils/writers/html4css1.py``). - -The writer receives the node tree and returns the output document. 
-For HTML output, we can test this using the ``rst2html.py`` tool:: - - $ rst2html.py --link-stylesheet test.txt - <?xml version="1.0" encoding="utf-8" ?> - <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> - <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> - <head> - <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> - <meta name="generator" content="Docutils 0.3.10: http://docutils.sourceforge.net/" /> - <title></title> - <link rel="stylesheet" href="../docutils/writers/html4css1/html4css1.css" type="text/css" /> - </head> - <body> - <div class="document"> - <p>My <em>favorite</em> language is <a class="reference" href="http://www.python.org/">Python</a>.</p> - </div> - </body> - </html> - -So here we finally have our HTML output. The actual document contents -are in the fourth-last line. Note, by the way, that the HTML writer -did not render the (invisible) ``target`` node |---| only the -``paragraph`` node and its children appear in the HTML output. - - -Extending Docutils -================== - -Now you'll ask, "how do I actually extend Docutils?" - -First of all, once you are clear about *what* you want to achieve, you -have to decide *where* to implement it |---| in the Parser (e.g. by -adding a directive or role to the reStructuredText parser), as a -Transform, or in the Writer. There is often one obvious choice among -those three (Parser, Transform, Writer). If you are unsure, ask on -the Docutils-develop_ mailing list. - -In order to find out how to start, it is often helpful to look at -similar features which are already implemented. For example, if you -want to add a new directive to the reStructuredText parser, look at -the implementation of a similar directive in -``docutils/parsers/rst/directives/``. 
 - - -Modifying the Document Tree Before It Is Written ------------------------------------------------- - -You can modify the document tree right before the writer is called. -One possibility is to use the publish_doctree_ and -publish_from_doctree_ functions. - -To retrieve the document tree, call:: - - document = docutils.core.publish_doctree(...) - -Please see the docstring of publish_doctree for a list of parameters. - -.. XXX Need to write a well-readable list of (commonly used) options - of the publish_* functions. Probably in api/publisher.txt. - -``document`` is the root node of the document tree. You can now -change the document by accessing the ``document`` node and its -children |---| see `The Node Interface`_ below. - -When you're done modifying the document tree, you can write it -out by calling:: - - output = docutils.core.publish_from_doctree(document, ...) - -.. _publish_doctree: ../api/publisher.html#publish_doctree -.. _publish_from_doctree: ../api/publisher.html#publish_from_doctree - - -The Node Interface ------------------- - -As described in the overview above, Docutils' internal representation -of a document is a tree of nodes. We'll now have a look at the -interface of these nodes. - -(To be completed.) - - -What Now? -========= - -This document is not complete. Many topics could (and should) be -covered here. To find out which topics we should write about -first, we are awaiting *your* feedback. So please ask your questions -on the Docutils-develop_ mailing list. - - -.. _Docutils-develop: ../user/mailing-lists.html#docutils-develop - - -.. |---| unicode:: 8212 .. em-dash - :trim: - - -.. 
- Local Variables: - mode: indented-text - indent-tabs-mode: nil - sentence-end-double-space: t - fill-column: 70 - End: diff --git a/docutils/docs/dev/policies.txt b/docutils/docs/dev/policies.txt deleted file mode 100644 index 25fb4f2e9..000000000 --- a/docutils/docs/dev/policies.txt +++ /dev/null @@ -1,549 +0,0 @@ -=========================== - Docutils Project Policies -=========================== - -:Author: David Goodger; open to all Docutils developers -:Contact: goodger@python.org -:Date: $Date$ -:Revision: $Revision$ -:Copyright: This document has been placed in the public domain. - -.. contents:: - -The Docutils project group is a meritocracy based on code contribution -and lots of discussion [#bcs]_. A few quotes sum up the policies of -the Docutils project. The IETF's classic credo (by MIT professor Dave -Clark) is an ideal we can aspire to: - - We reject: kings, presidents, and voting. We believe in: rough - consensus and running code. - -As architect, chief cook and bottle-washer, David Goodger currently -functions as BDFN (Benevolent Dictator For Now). (But he would -happily abdicate the throne given a suitable candidate. Any takers?) - -Eric S. Raymond, anthropologist of the hacker subculture, writes in -his essay `The Magic Cauldron`_: - - The number of contributors [to] projects is strongly and inversely - correlated with the number of hoops each project makes a user go - through to contribute. - -We will endeavour to keep the barrier to entry as low as possible. -The policies below should not be thought of as barriers, but merely as -a codification of experience to date. These are "best practices"; -guidelines, not absolutes. Exceptions are expected, tolerated, and -used as a source of improvement. Feedback and criticism is welcome. 
- -As for control issues, Emmett Plant (CEO of the Xiph.org Foundation, -originators of Ogg Vorbis) put it well when he said: - - Open source dictates that you lose a certain amount of control - over your codebase, and that's okay with us. - -.. [#bcs] Phrase borrowed from `Ben Collins-Sussman of the Subversion - project <http://www.red-bean.com/sussman/svn-anti-fud.html>`__. - -.. _The Magic Cauldron: - http://www.catb.org/~esr/writings/magic-cauldron/ - - -Python Coding Conventions -========================= - -Contributed code will not be refused merely because it does not -strictly adhere to these conditions; as long as it's internally -consistent, clean, and correct, it probably will be accepted. But -don't be surprised if the "offending" code gets fiddled over time to -conform to these conventions. - -The Docutils project shall follow the generic coding conventions as -specified in the `Style Guide for Python Code`_ and `Docstring -Conventions`_ PEPs, summarized, clarified, and extended as follows: - -* 4 spaces per indentation level. No hard tabs. - -* Use only 7-bit ASCII, no 8-bit strings. See `Docutils - Internationalization`_. - -* No one-liner compound statements (i.e., no ``if x: return``: use two - lines & indentation), except for degenerate class or method - definitions (i.e., ``class X: pass`` is OK.). - -* Lines should be no more than 78 characters long. - -* Use "StudlyCaps" for class names (except for element classes in - docutils.nodes). - -* Use "lowercase" or "lowercase_with_underscores" for function, - method, and variable names. For short names, maximum two words, - joined lowercase may be used (e.g. "tagname"). For long names with - three or more words, or where it's hard to parse the split between - two words, use lowercase_with_underscores (e.g., - "note_explicit_target", "explicit_target"). If in doubt, use - underscores. - -* Avoid lambda expressions, which are inherently difficult to - understand. 
Named functions are preferable and superior: they're - faster (no run-time compilation), and well-chosen names serve to - document and aid understanding. - -* Avoid functional constructs (filter, map, etc.). Use list - comprehensions instead. - -* Avoid ``from __future__ import`` constructs. They are inappropriate - for production code. - -* Use 'single quotes' for string literals, and """triple double - quotes""" for docstrings. - -.. _Style Guide for Python Code: - http://www.python.org/peps/pep-0008.html -.. _Docstring Conventions: http://www.python.org/peps/pep-0257.html -.. _Docutils Internationalization: ../howto/i18n.html#python-code - - -Documentation Conventions -========================= - -* Docutils documentation is written using reStructuredText, of course. - -* Use 7-bit ASCII if at all possible, and Unicode substitutions when - necessary. - -* Use the following section title adornment styles:: - - ================ - Document Title - ================ - - -------------------------------------------- - Document Subtitle, or Major Division Title - -------------------------------------------- - - Section - ======= - - Subsection - ---------- - - Sub-Subsection - `````````````` - - Sub-Sub-Subsection - .................. - -* Use two blank lines before each section/subsection/etc. title. One - blank line is sufficient between immediately adjacent titles. - -* Add a bibliographic field list immediately after the document - title/subtitle. See the beginning of this document for an example. - -* Add an Emacs "local variables" block in a comment at the end of the - document. See the end of this document for an example. - - -Copyrights and Licensing -======================== - -The majority of the Docutils project code and documentation has been -placed in the public domain. 
Unless clearly and explicitly indicated -otherwise, any patches (modifications to existing files) submitted to -the project for inclusion (via Subversion, SourceForge trackers, -mailing lists, or private email) are assumed to be in the public -domain as well. - -Any new files contributed to the project should clearly state their -intentions regarding copyright, in one of the following ways: - -* Public domain (preferred): include the statement "This - module/document has been placed in the public domain." - -* Copyright & open source license: include a copyright notice, along - with either an embedded license statement, a reference to an - accompanying license file, or a license URL. - -One of the goals of the Docutils project, once complete, is to be -incorporated into the Python standard library. At that time copyright -of the Docutils code will be assumed by or transferred to the Python -Software Foundation (PSF), and will be released under Python's -license. If the copyright/license option is chosen for new files, the -license should be compatible with Python's current license, and the -author(s) of the files should be willing to assign copyright to the -PSF. The PSF accepts the `Academic Free License v. 2.1 -<http://opensource.org/licenses/afl-2.1.php>`_ and the `Apache -License, Version 2.0 <http://opensource.org/licenses/apache2.0.php>`_. - - -Subversion Repository -===================== - -Please see the `repository documentation`_ for details on how to -access Docutils' Subversion repository. Anyone can access the -repository anonymously. Only project developers can make changes. -(If you would like to become a project developer, just ask!) Also see -`Setting Up For Docutils Development`_ below for some useful info. - -Unless you really *really* know what you're doing, please do *not* use -``svn import``. It's quite easy to mess up the repository with an -import. - -.. 
_repository documentation: repository.html - - -Branches --------- - -(These branch policies go into effect with Docutils 0.4.) - -The "docutils" directory of the **trunk** (a.k.a. the **Docutils -core**) is used for active -- but stable, fully tested, and reviewed --- development. - -There will be at least one active **maintenance branch** at a time, -based on at least the latest feature release. For example, when -Docutils 0.5 is released, its maintenance branch will take over, and -the 0.4.x maintenance branch may be retired. Maintenance branches -will receive bug fixes only; no new features will be allowed here. - -Obvious and uncontroversial bug fixes *with tests* can be checked in -directly to the core and to the maintenance branches. Don't forget to -add test cases! Many (but not all) bug fixes will be applicable both -to the core and to the maintenance branches; these should be applied -to both. No patches or dedicated branches are required for bug fixes, -but they may be used. It is up to the discretion of project -developers to decide which mechanism to use for each case. - -Feature additions and API changes will be done in **feature -branches**. Feature branches will not be managed in any way. -Frequent small checkins are encouraged here. Feature branches must be -discussed on the docutils-develop mailing list and reviewed before -being merged into the core. - - -Review Criteria -``````````````` - -Before a new feature, an API change, or a complex, disruptive, or -controversial bug fix can be checked in to the core or into a -maintenance branch, it must undergo review. These are the criteria: - -* The branch must be complete, and include full documentation and - tests. - -* There should ideally be one branch merge commit per feature or - change. In other words, each branch merge should represent a - coherent change set. - -* The code must be stable and uncontroversial. Moving targets and - features under debate are not ready to be merged. 
- -* The code must work. The test suite must complete with no failures. - See `Docutils Testing`_. - -The review process will ensure that at least one other set of eyeballs -& brains sees the code before it enters the core. In addition to the -above, the general `Check-ins`_ policy (below) also applies. - -.. _Docutils Testing: testing.html - - -Check-ins ---------- - -Changes or additions to the Docutils core and maintenance branches -carry a commitment to the Docutils user community. Developers must be -prepared to fix and maintain any code they have committed. - -The Docutils core (``trunk/docutils`` directory) and maintenance -branches should always be kept in a stable state (usable and as -problem-free as possible). All changes to the Docutils core or -maintenance branches must be in `good shape`_, usable_, documented_, -tested_, and `reasonably complete`_. - -* _`Good shape` means that the code is clean, readable, and free of - junk code (unused legacy code; by analogy to "junk DNA"). - -* _`Usable` means that the code does what it claims to do. An "XYZ - Writer" should produce reasonable XYZ output. - -* _`Documented`: The more complete the documentation the better. - Modules & files must be at least minimally documented internally. - `Docutils Front-End Tools`_ should have a new section for any - front-end tool that is added. `Docutils Configuration Files`_ - should be modified with any settings/options defined. For any - non-trivial change, the HISTORY.txt_ file should be updated. - -* _`Tested` means that unit and/or functional tests, that catch all - bugs fixed and/or cover all new functionality, have been added to - the test suite. These tests must be checked by running the test - suite under all supported Python versions, and the entire test suite - must pass. See `Docutils Testing`_. - -* _`Reasonably complete` means that the code must handle all input. 
- Here "handle" means that no input can cause the code to fail (cause - an exception, or silently and incorrectly produce nothing). - "Reasonably complete" does not mean "finished" (no work left to be - done). For example, a writer must handle every standard element - from the Docutils document model; for unimplemented elements, it - must *at the very least* warn that "Output for element X is not yet - implemented in writer Y". - -If you really want to check code directly into the Docutils core, -you can, but you must ensure that it fulfills the above criteria -first. People will start to use it and they will expect it to work! -If there are any issues with your code, or if you only have time for -gradual development, you should put it on a branch or in the sandbox -first. It's easy to move code over to the Docutils core once it's -complete. - -It is the responsibility and obligation of all developers to keep the -Docutils core and maintenance branches stable. If a commit is made to -the core or maintenance branch which breaks any test, the solution is -simply to revert the change. This is not vindictive; it's practical. -We revert first, and discuss later. - -Docutils will pursue an open and trusting policy for as long as -possible, and deal with any aberrations if (and hopefully not when) -they happen. We'd rather see a torrent of loose contributions than -just a trickle of perfect-as-they-stand changes. The occasional -mistake is easy to fix. That's what Subversion is for! - -.. _Docutils Front-End Tools: ../user/tools.html -.. _Docutils Configuration Files: ../user/config.html -.. _HISTORY.txt: ../../HISTORY.txt - - -Version Numbering -================= - -Docutils version numbering uses a ``major.minor.micro`` scheme (x.y.z; -for example, 0.4.1). - -**Major releases** (x.0, e.g. 1.0) will be rare, and will represent -major changes in API, functionality, or commitment. 
For example, as -long as the major version of Docutils is 0, it is to be considered -*experimental code*. When Docutils reaches version 1.0, the major -APIs will be considered frozen and backward compatibility will become -of paramount importance. - -Releases that change the minor number (x.y, e.g. 0.5) will be -**feature releases**; new features from the `Docutils core`_ will be -included. - -Releases that change the micro number (x.y.z, e.g. 0.4.1) will be -**bug-fix releases**. No new features will be introduced in these -releases; only bug fixes off of `maintenance branches`_ will be -included. - -This policy was adopted in October 2005, and will take effect with -Docutils version 0.4. Prior to version 0.4, Docutils didn't have an -official version numbering policy, and micro releases contained both -bug fixes and new features. - -.. _Docutils core: - http://svn.berlios.de/viewcvs/docutils/trunk/docutils/ -.. _maintenance branches: - http://svn.berlios.de/viewcvs/docutils/branches/ - - -Snapshots -========= - -Snapshot tarballs will be generated regularly from - -* the Docutils core, representing the current cutting-edge state of - development; - -* each active maintenance branch, for bug fixes; - -* each development branch, representing the unstable - seat-of-your-pants bleeding edge. - -The ``sandbox/infrastructure/docutils-update`` shell script, run as an -hourly cron job on the BerliOS server, is responsible for -automatically generating the snapshots and updating the web site. See -the `web site docs <website.html>`__. - - -Setting Up For Docutils Development -=================================== - -When making changes to the code, testing is a must. The code should -be run to verify that it produces the expected results, and the entire -test suite should be run too. The modified Docutils code has to be -accessible to Python for the tests to have any meaning. There are two -ways to keep the Docutils code accessible during development: - -1. 
Update your ``PYTHONPATH`` environment variable so that Python - picks up your local working copy of the code. This is the - recommended method. - - We'll assume that the Docutils trunk is checked out under your - ~/projects/ directory as follows:: - - svn co svn+ssh://<user>@svn.berlios.de/svnroot/repos/docutils/trunk \ - docutils - - For the bash shell, add this to your ``~/.profile``:: - - PYTHONPATH=$HOME/projects/docutils/docutils - PYTHONPATH=$PYTHONPATH:$HOME/projects/docutils/docutils/extras - export PYTHONPATH - - The first line points to the directory containing the ``docutils`` - package. The second line adds the directory containing the - third-party modules Docutils depends on. The third line exports - this environment variable. You may also wish to add the ``tools`` - directory to your ``PATH``:: - - PATH=$PATH:$HOME/projects/docutils/docutils/tools - export PATH - -2. Before you run anything, every time you make a change, reinstall - Docutils:: - - python setup.py install - - .. CAUTION:: - - This method is **not** recommended for day-to-day development; - it's too easy to forget. Confusion inevitably ensues. - - If you install Docutils this way, Python will always pick up the - last-installed copy of the code. If you ever forget to - reinstall the "docutils" package, Python won't see your latest - changes. - -A useful addition to the ``docutils`` top-level directory in branches -and alternate copies of the code is a ``set-PATHS`` file -containing the following lines:: - - # source this file - export PYTHONPATH=$PWD:$PWD/extras - export PATH=$PWD/tools:$PATH - -Open a shell for this branch, ``cd`` to the ``docutils`` top-level -directory, and "source" this file. For example, using the bash -shell:: - - $ cd some-branch/docutils - $ . set-PATHS - - -Mailing Lists -============= - -Developers are recommended to subscribe to all `Docutils mailing -lists`_. - -.. 
_Docutils mailing lists: ../user/mailing-lists.html - - -The Wiki -======== - -There is a development wiki at http://docutils.python-hosting.com/ as -a scratchpad for transient notes. Please use the repository for -permanent document storage. - - -The Sandbox -=========== - -The `sandbox directory`_ is a place to play around, to try out and -share ideas. It's a part of the Subversion repository but it isn't -distributed as part of Docutils releases. Feel free to check in code -to the sandbox; that way people can try it out but you won't have to -worry about it working 100% error-free, as is the goal of the Docutils -core. Each developer who wants to play in the sandbox should create -either a project-specific subdirectory or personal subdirectory -(suggested name: SourceForge ID, nickname, or given name + family -initial). It's OK to make a mess in your personal space! But please, -play nice. - -Please update the `sandbox README`_ file with links and a brief -description of your work. - -In order to minimize the work necessary for others to install and try -out new, experimental components, the following sandbox directory -structure is recommended:: - - sandbox/ - project_name/ # For a collaborative project. - # Structure as in userid/component_name below. - userid/ # For personal space. - component_name/ # A verbose name is best. - README.txt # Please explain the requirements, - # purpose/goals, and usage. - docs/ - ... - component.py # The component is a single module. - # *OR* (but *not* both) - component/ # The component is a package. - __init__.py # Contains the Reader/Writer class. - other1.py # Other modules and data files used - data.txt # by this component. - ... - test/ # Test suite. - ... - tools/ # For front ends etc. - ... - setup.py # Use Distutils to install the component - # code and tools/ files into the right - # places in Docutils. - -Some sandbox projects are destined to become Docutils components once -completed. 
Others, such as add-ons to Docutils or applications of -Docutils, graduate to become `parallel projects`_. - -.. _sandbox README: http://docutils.sf.net/sandbox/README.html -.. _sandbox directory: - http://svn.berlios.de/viewcvs/docutils/trunk/sandbox/ - - -.. _parallel project: - -Parallel Projects -================= - -Parallel projects contain useful code that is not central to the -functioning of Docutils. Examples are specialized add-ons or -plug-ins, and applications of Docutils. They use Docutils, but -Docutils does not require their presence to function. - -An official parallel project will have its own directory beside (or -parallel to) the main ``docutils`` directory in the Subversion -repository. It can have its own web page in the -docutils.sourceforge.net domain, its own file releases and -downloadable snapshots, and even a mailing list if that proves useful. -However, an official parallel project has implications: it is expected -to be maintained and continue to work with changes to the core -Docutils. - -A parallel project requires a project leader, who must commit to -coordinate and maintain the implementation: - -* Answer questions from users and developers. -* Review suggestions, bug reports, and patches. -* Monitor changes and ensure the quality of the code and - documentation. -* Coordinate with Docutils to ensure interoperability. -* Put together official project releases. - -Of course, related projects may be created independently of Docutils. -The advantage of a parallel project is that the SourceForge -environment and the developer and user communities are already -established. Core Docutils developers are available for consultation -and may contribute to the parallel project. It's easier to keep the -projects in sync when there are changes made to the core Docutils -code. - - -.. 
- Local Variables: - mode: indented-text - indent-tabs-mode: nil - sentence-end-double-space: t - fill-column: 70 - End: diff --git a/docutils/docs/dev/pysource.dtd b/docutils/docs/dev/pysource.dtd deleted file mode 100644 index fb8af4091..000000000 --- a/docutils/docs/dev/pysource.dtd +++ /dev/null @@ -1,259 +0,0 @@ -<!-- -====================================================================== - Docutils Python Source DTD -====================================================================== -:Author: David Goodger -:Contact: goodger@users.sourceforge.net -:Revision: $Revision$ -:Date: $Date$ -:Copyright: This DTD has been placed in the public domain. -:Filename: pysource.dtd - -This DTD (document type definition) extends the Generic DTD (see -below). - -More information about this DTD and the Docutils project can be found -at http://docutils.sourceforge.net/. The latest version of this DTD -is available from -http://docutils.sourceforge.net/docs/dev/pysource.dtd. - -The formal public identifier for this DTD is:: - - +//IDN docutils.sourceforge.net//DTD Docutils Python Source//EN//XML ---> - -<!-- -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - Parameter Entity Overrides -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ---> - -<!ENTITY % additional.section.elements - " | package_section | module_section | class_section - | method_section | function_section - | module_attribute_section | function_attribute_section - | class_attribute_section | instance_attribute_section "> - -<!ENTITY % additional.inline.elements - " | package | module | class | method | function - | variable | parameter | type | attribute - | module_attribute | class_attribute | instance_attribute - | exception_class | warning_class "> - -<!-- -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - Generic DTD -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - -This DTD extends the Docutils Generic DTD, 
available from -http://docutils.sourceforge.net/docs/ref/docutils.dtd. ---> - -<!ENTITY % docutils PUBLIC - "+//IDN python.org//DTD Docutils Generic//EN//XML" - "docutils.dtd"> -%docutils; - -<!-- -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - Additional Section Elements -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ---> - -<!ELEMENT package_section - (package, fullname?, import_list?, %structure.model;)> -<!ATTLIST package_section %basic.atts;> - -<!ELEMENT module_section - (module, fullname?, import_list?, %structure.model;)> -<!ATTLIST module_section %basic.atts;> - -<!ELEMENT class_section - (class, inheritance_list?, fullname?, subclasses?, - %structure.model;)> -<!ATTLIST class_section %basic.atts;> - -<!ELEMENT method_section - (method, parameter_list?, fullname?, overrides?, - %structure.model;)> -<!ATTLIST method_section %basic.atts;> - -<!ELEMENT function_section - (function, parameter_list?, fullname?, %structure.model;)> -<!ATTLIST function_section %basic.atts;> - -<!ELEMENT module_attribute_section - (attribute, initial_value?, fullname?, %structure.model;)> -<!ATTLIST module_attribute_section %basic.atts;> - -<!ELEMENT function_attribute_section - (attribute, initial_value?, fullname?, %structure.model;)> -<!ATTLIST function_attribute_section %basic.atts;> - -<!ELEMENT class_attribute_section - (attribute, initial_value?, fullname?, overrides?, - %structure.model;)> -<!ATTLIST class_attribute_section %basic.atts;> - -<!ELEMENT instance_attribute_section - (attribute, initial_value?, fullname?, overrides?, - %structure.model;)> -<!ATTLIST instance_attribute_section %basic.atts;> - -<!-- - Section Subelements -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ---> - -<!ELEMENT fullname - (package | module | class | method | function | attribute)+> -<!ATTLIST fullname %basic.atts;> - -<!ELEMENT import_list (import_item+)> -<!ATTLIST import_list %basic.atts;> - -<!-- -Support 
``import module``, ``import module as alias``, ``from module -import identifier``, and ``from module import identifier as alias``. ---> -<!ELEMENT import_item (fullname, identifier?, alias?)> -<!ATTLIST import_item %basic.atts;> - -<!ELEMENT inheritance_list (class+)> -<!ATTLIST inheritance_list %basic.atts;> - -<!ELEMENT subclasses (class+)> -<!ATTLIST subclasses %basic.atts;> - -<!ELEMENT parameter_list - ((parameter_item+, optional_parameters*) | optional_parameters+)> -<!ATTLIST parameter_list %basic.atts;> - -<!ELEMENT parameter_item - ((parameter | parameter_tuple), parameter_default?)> -<!ATTLIST parameter_item %basic.atts;> - -<!ELEMENT optional_parameters (parameter_item+, optional_parameters*)> -<!ATTLIST optional_parameters %basic.atts;> - -<!ELEMENT parameter_tuple (parameter | parameter_tuple)+> -<!ATTLIST parameter_tuple %basic.atts;> - -<!ELEMENT parameter_default (#PCDATA)> -<!ATTLIST parameter_default %basic.atts;> - -<!ELEMENT overrides (fullname+)> -<!ATTLIST overrides %basic.atts;> - -<!ELEMENT initial_value (#PCDATA)> -<!ATTLIST initial_value %basic.atts;> - -<!-- -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - Additional Inline Elements -~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ---> - -<!-- Also used as the `package_section` identifier/title. --> -<!ELEMENT package (#PCDATA)> -<!ATTLIST package - %basic.atts; - %reference.atts;> - -<!-- Also used as the `module_section` identifier/title. --> -<!ELEMENT module (#PCDATA)> -<!ATTLIST module - %basic.atts; - %reference.atts;> - -<!-- -Also used as the `class_section` identifier/title, and in the -`inheritance` element. ---> -<!ELEMENT class (#PCDATA)> -<!ATTLIST class - %basic.atts; - %reference.atts;> - -<!-- Also used as the `method_section` identifier/title. --> -<!ELEMENT method (#PCDATA)> -<!ATTLIST method - %basic.atts; - %reference.atts;> - -<!-- Also used as the `function_section` identifier/title. 
--> -<!ELEMENT function (#PCDATA)> -<!ATTLIST function - %basic.atts; - %reference.atts;> - -<!-- -??? Use this instead of the ``*_attribute`` elements below? Add a -"type" attribute to differentiate? - -Also used as the identifier/title for `module_attribute_section`, -`class_attribute_section`, and `instance_attribute_section`. ---> -<!ELEMENT attribute (#PCDATA)> -<!ATTLIST attribute - %basic.atts; - %reference.atts;> - -<!-- -Also used as the `module_attribute_section` identifier/title. A module -attribute is an exported module-level global variable. ---> -<!ELEMENT module_attribute (#PCDATA)> -<!ATTLIST module_attribute - %basic.atts; - %reference.atts;> - -<!-- Also used as the `class_attribute_section` identifier/title. --> -<!ELEMENT class_attribute (#PCDATA)> -<!ATTLIST class_attribute - %basic.atts; - %reference.atts;> - -<!-- -Also used as the `instance_attribute_section` identifier/title. ---> -<!ELEMENT instance_attribute (#PCDATA)> -<!ATTLIST instance_attribute - %basic.atts; - %reference.atts;> - -<!ELEMENT variable (#PCDATA)> -<!ATTLIST variable - %basic.atts; - %reference.atts;> - -<!-- Also used in `parameter_list`. 
--> -<!ELEMENT parameter (#PCDATA)> -<!ATTLIST parameter - %basic.atts; - %reference.atts; - excess_positional %yesorno; #IMPLIED - excess_keyword %yesorno; #IMPLIED> - -<!ELEMENT type (#PCDATA)> -<!ATTLIST type - %basic.atts; - %reference.atts;> - -<!ELEMENT exception_class (#PCDATA)> -<!ATTLIST exception_class - %basic.atts; - %reference.atts;> - -<!ELEMENT warning_class (#PCDATA)> -<!ATTLIST warning_class - %basic.atts; - %reference.atts;> - -<!-- -Local Variables: -mode: sgml -indent-tabs-mode: nil -fill-column: 70 -End: ---> diff --git a/docutils/docs/dev/pysource.txt b/docutils/docs/dev/pysource.txt deleted file mode 100644 index 6f173a709..000000000 --- a/docutils/docs/dev/pysource.txt +++ /dev/null @@ -1,130 +0,0 @@ -====================== - Python Source Reader -====================== -:Author: David Goodger -:Contact: goodger@users.sourceforge.net -:Revision: $Revision$ -:Date: $Date$ -:Copyright: This document has been placed in the public domain. - -This document explores issues around extracting and processing -docstrings from Python modules. - -For definitive element hierarchy details, see the "Python Plaintext -Document Interface DTD" XML document type definition, pysource.dtd_ -(which modifies the generic docutils.dtd_). Descriptions below list -'DTD elements' (XML 'generic identifiers' or tag names) corresponding -to syntax constructs. - - -.. contents:: - - -Model -===== - -The Python Source Reader ("PySource") model that's evolving in my mind -goes something like this: - -1. Extract the docstring/namespace [#]_ tree from the module(s) and/or - package(s). - - .. [#] See `Docstring Extractor`_ below. - -2. Run the parser on each docstring in turn, producing a forest of - doctrees (per nodes.py). - -3. 
Join the docstring trees together into a single tree, running - transforms: - - - merge hyperlinks - - merge namespaces - - create various sections like "Module Attributes", "Functions", - "Classes", "Class Attributes", etc.; see pysource.dtd_ - - convert the above special sections to ordinary doctree nodes - -4. Run transforms on the combined doctree. Examples: resolving - cross-references/hyperlinks (including interpreted text on Python - identifiers); footnote auto-numbering; first field list -> - bibliographic elements. - - (Or should step 4's transforms come before step 3?) - -5. Pass the resulting unified tree to the writer/builder. - -I've had trouble reconciling the roles of input parser and output -writer with the idea of modes ("readers" or "directors"). Does the -mode govern the transformation of the input, the output, or both? -Perhaps the mode should be split into two. - -For example, say the source of our input is a Python module. Our -"input mode" should be the "Python Source Reader". It discovers (from -``__docformat__``) that the input parser is "reStructuredText". If we -want HTML, we'll specify the "HTML" output formatter. But there's a -piece missing. What *kind* or *style* of HTML output do we want? -PyDoc-style, LibRefMan style, etc. (many people will want to specify -and control their own style). Is the output style specific to a -particular output format (XML, HTML, etc.)? Is the style specific to -the input mode? Or can/should they be independent? - -I envision interaction between the input parser, an "input mode", and -the output formatter. The same intermediate data format would be used -between each of these, being transformed as it progresses. 
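A minimal sketch of step 1 of the model above (extracting a docstring/namespace tree), assuming Python 3's standard ``ast`` module rather than any Docutils code. The function name ``extract_tree`` and the dictionary layout are hypothetical, chosen only for illustration; attribute docstrings, values, and formal parameters are not handled here.

```python
import ast


def extract_tree(source, name="<module>"):
    """Return a nested dict of names and docstrings found in `source`.

    Hypothetical helper: each entry records the kind of object, its
    name, its docstring (or None), and its children in source order.
    """

    def visit(node, kind, node_name):
        entry = {
            "kind": kind,
            "name": node_name,
            "docstring": ast.get_docstring(node),
            "children": [],
        }
        for child in node.body:
            if isinstance(child, ast.ClassDef):
                entry["children"].append(visit(child, "class", child.name))
            elif isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                # A function defined directly in a class body is a method.
                child_kind = "method" if kind == "class" else "function"
                entry["children"].append(visit(child, child_kind, child.name))
        return entry

    return visit(ast.parse(source), "module", name)
```

Each docstring in the resulting tree could then be handed to the parser in turn (step 2), with the nesting preserved for the later merge of namespaces and hyperlinks.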
- - -Docstring Extractor -=================== - -We need code that scans a parsed Python module, and returns an ordered -tree containing the names, docstrings (including attribute and -additional docstrings), and additional info (in parentheses below) of -all of the following objects: - -- packages -- modules -- module attributes (+ values) -- classes (+ inheritance) -- class attributes (+ values) -- instance attributes (+ values) -- methods (+ formal parameters & defaults) -- functions (+ formal parameters & defaults) - -(Extract comments too? For example, comments at the start of a module -would be a good place for bibliographic field lists.) - -In order to evaluate interpreted text cross-references, namespaces for -each of the above will also be required. - -See python-dev/docstring-develop thread "AST mining", started on -2001-08-14. - - -Interpreted Text -================ - -DTD elements: package, module, class, method, function, -module_attribute, class_attribute, instance_attribute, variable, -parameter, type, exception_class, warning_class. - -To classify identifiers explicitly, the role is given along with the -identifier in either prefix or suffix form:: - - Use :method:`Keeper.storedata` to store the object's data in - `Keeper.data`:instance_attribute:. - -The role may be one of 'package', 'module', 'class', 'method', -'function', 'module_attribute', 'class_attribute', -'instance_attribute', 'variable', 'parameter', 'type', -'exception_class', 'exception', 'warning_class', or 'warning'. Other -roles may be defined. - -.. _pysource.dtd: pysource.dtd -.. _docutils.dtd: ../ref/docutils.dtd - - -.. 
- Local Variables: - mode: indented-text - indent-tabs-mode: nil - fill-column: 70 - End: diff --git a/docutils/docs/dev/release.txt b/docutils/docs/dev/release.txt deleted file mode 100644 index fa58bc46f..000000000 --- a/docutils/docs/dev/release.txt +++ /dev/null @@ -1,168 +0,0 @@ -============================= - Docutils_ Release Procedure -============================= -:Author: David Goodger; Felix Wiemann; open to all Docutils developers -:Contact: goodger@python.org -:Date: $Date$ -:Revision: $Revision$ -:Copyright: This document has been placed in the public domain. - -.. _Docutils: http://docutils.sourceforge.net/ - - -(Steps in boldface text are *not* covered by the release script at -sandbox/fwiemann/release.sh. "Not covered" means that you aren't even -reminded of them. Note: The release.sh script needs to be updated to -reflect the recent move to Subversion!) - -* **Announce a check-in freeze on Docutils-develop. Post a list of - major changes since the last release and ask for additions.** - - .. _CHANGES.txt: - - **You may want to save this list of changes in a file - (e.g. CHANGES.txt) to have it at hand when you need it for posting - announcements or pasting it into forms.** - -* Change ``__version_details__`` in docutils/docutils/__init__.py to - "release" (from "repository"). - -* Bump the _`version number` in the following files: - - + docutils/setup.py - + docutils/docutils/__init__.py - + docutils/test/functional/expected/* ("Generator: Docutils X.Y.Z") - -* Close the "Changes Since ..." section in docutils/HISTORY.txt. - -* Clear/unset the PYTHONPATH environment variable. - -* Create the release tarball: - - (a) Create a new empty directory and ``cd`` into it. 
- - (b) Get a clean snapshot of the main tree:: - - svn export svn://svn.berlios.de/docutils/trunk/docutils - - (c) Use Distutils to create the release tarball:: - - cd docutils - python setup.py sdist - -* Expand and _`install` the release tarball in isolation: - - (a) Expand the tarball in a new location, not over any existing - files. - - (b) Remove the old installation from site-packages (including - roman.py, optparse.py, and textwrap.py). - - Install from expanded directory:: - - cd docutils-X.Y.Z - python setup.py install - - The "install" command may require root permissions. - - (c) Repeat step b) for all supported Python versions. - -* Run the _`test suite` from the expanded archive directory with all - supported Python versions: ``cd test ; python -u alltests.py``. - -* Add a directory X.Y.Z (where X.Y.Z is the current version number - of Docutils) in the webroot (i.e. the ``htdocs/`` directory). - Put all documentation files into it:: - - cd docutils-X.Y.Z - rm -rf build - cd tools/ - ./buildhtml.py .. - cd .. - find -name test -type d -prune -o -name \*.css -print0 \ - -o -name \*.html -print0 -o -name \*.txt -print0 \ - | tar -cjvf docutils-docs.tar.bz2 -T - --null - scp docutils-docs.tar.bz2 <username>@shell.sourceforge.net: - - Now log in to shell.sourceforge.net and:: - - cd /home/groups/d/do/docutils/htdocs/ - mkdir -m g+rwxs X.Y.Z - cd X.Y.Z - tar -xjvf ~/docutils-docs.tar.bz2 - rm ~/docutils-docs.tar.bz2 - -* Upload the release tarball:: - - $ ftp upload.sourceforge.net - Connected to osdn.dl.sourceforge.net. - ... - Name (upload.sourceforge.net:david): anonymous - 331 Anonymous login ok, send your complete e-mail address as password. - Password: - ... - 230 Anonymous access granted, restrictions apply. - ftp> bin - 200 Type set to I. - ftp> cd /incoming - 250 CWD command successful. - ftp> put docutils-X.Y.Z.tar.gz - -* Access the _`file release system` on SourceForge (Admin - interface). 
Fill in the fields: - - :Package ID: docutils - :Release Name: <use release number only, e.g. 0.3> - :Release Date: <today's date> - :Status: Active - :File Name: <select the file just uploaded> - :File Type: Source .gz - :Processor Type: Platform-Independent - :Release Notes: <insert README.txt file here> - :Change Log: <insert summary from CHANGES.txt_> - - Also check the "Preserve my pre-formatted text" box. - -* For verifying the integrity of the release, download the release - tarball (you may need to wait up to 30 minutes), install_ it, and - re-run the `test suite`_. - -* Register with PyPI (``python setup.py register``). - -* Restore ``__version_details__`` in docutils/docutils/__init__.py to - "repository" (from "release"). - -* Bump the `version number`_ again. - -* Add a new empty section "Changes Since ..." in HISTORY.txt. - -* Update the web page (web/index.txt). - -* Run docutils-update on the server. - -* **Send announcement email to:** - - * docutils-develop@lists.sourceforge.net (also announcing the end - of the check-in freeze) - * docutils-users@lists.sourceforge.net - * doc-sig@python.org - * python-announce@python.org - -* **Add a SourceForge News item, with title "Docutils X.Y.Z released" - and containing the release tarball's download URL.** - -* **Register with FreshMeat.** (Add a `new release`__ for the - `Docutils project`__). - - __ http://freshmeat.net/add-release/48702/ - __ http://freshmeat.net/projects/docutils/ - - - -.. 
- Local Variables: - mode: indented-text - indent-tabs-mode: nil - sentence-end-double-space: t - fill-column: 70 - End: diff --git a/docutils/docs/dev/repository.txt b/docutils/docs/dev/repository.txt deleted file mode 100644 index 2c613b10e..000000000 --- a/docutils/docs/dev/repository.txt +++ /dev/null @@ -1,217 +0,0 @@ -===================================== - The Docutils_ Subversion Repository -===================================== - -:Author: Felix Wiemann -:Contact: Felix.Wiemann@ososo.de -:Revision: $Revision$ -:Date: $Date$ -:Copyright: This document has been placed in the public domain. - -.. _Docutils: http://docutils.sourceforge.net/ - -.. contents:: - -Docutils uses a Subversion_ repository located at ``svn.berlios.de``. -Subversion is exhaustively documented in the `Subversion Book`_ -(svnbook). - -.. _Subversion: http://subversion.tigris.org/ -.. _Subversion Book: http://svnbook.red-bean.com/ - -.. Note:: - - While the repository resides at BerliOS, all other project data - (web site, snapshots, releases, mailing lists, trackers) is hosted - at SourceForge. - -For the project policy on repository use (check-in requirements, -branching, etc.), please see the `Docutils Project Policies`__. - -__ policies.html#subversion-repository - - -Accessing the Repository -======================== - -Web Access ----------- - -The repository can be browsed and examined via the web at -http://svn.berlios.de/viewcvs/docutils/. - - -Anonymous Access ----------------- - -Anonymous (read-only) access is available at ``svn://svn.berlios.de/docutils/``. - -To check out the current main source tree of Docutils, type :: - - svn checkout svn://svn.berlios.de/docutils/trunk/docutils - -To check out everything (main tree, sandboxes, and web site), type :: - - svn checkout svn://svn.berlios.de/docutils/trunk docutils - -This will create a working copy of the whole trunk in a new directory -called ``docutils``. 
- -If you cannot use the ``svn`` port, you can also use the HTTP access -method by substituting "http://svn.berlios.de/svnroot/repos" for -"svn://svn.berlios.de". - -Note that you should *not* check out ``svn://svn.berlios.de/docutils`` -(without "trunk"), because then you'd end up fetching the whole -Docutils tree for every branch and tag over and over again, wasting -your and BerliOS's bandwidth. - -To update your working copy later on, cd into the working copy and -type :: - - svn update - - -Developer Access ----------------- - -(Developers who had write-access for Docutils' CVS repository on -SourceForge.net should `register at BerliOS`__ and send a message with -their BerliOS user name to `Felix Wiemann <Felix.Wiemann@ososo.de>`_.) - -__ https://developer.berlios.de/account/register.php - -If you are a developer, you get read-write access via -``svn+ssh://<user>@svn.berlios.de/svnroot/repos/docutils/``, where -``<user>`` is your BerliOS user account name. So to retrieve a -working copy, type :: - - svn checkout svn+ssh://<user>@svn.berlios.de/svnroot/repos/docutils/trunk \ - docutils - -If you previously had an anonymous working copy and gained developer -access, you can switch the URL associated with your working copy by -typing :: - - svn switch --relocate svn://svn.berlios.de/docutils/trunk/docutils \ - svn+ssh://<user>@svn.berlios.de/svnroot/repos/docutils - -(Again, ``<user>`` is your BerliOS user account name.) - -If you cannot use the ``ssh`` port, you can also use the HTTPS access -method by substituting "https://svn.berlios.de" for -"svn+ssh://svn.berlios.de". - - -Setting Up Your Subversion Client For Development -````````````````````````````````````````````````` - -Before committing changes to the repository, please ensure that the -following lines are contained (and uncommented) in your -~/.subversion/config file, so that new files are added with the -correct properties set:: - - [miscellany] - # For your convenience: - global-ignores = ... 
*.pyc ... - # For correct properties: - enable-auto-props = yes - - [auto-props] - *.py = svn:eol-style=native;svn:keywords=Author Date Id Revision - *.txt = svn:eol-style=native;svn:keywords=Author Date Id Revision - *.html = svn:eol-style=native;svn:keywords=Author Date Id Revision - *.xml = svn:eol-style=native;svn:keywords=Author Date Id Revision - *.tex = svn:eol-style=native;svn:keywords=Author Date Id Revision - *.css = svn:eol-style=native;svn:keywords=Author Date Id Revision - *.patch = svn:eol-style=native - *.sh = svn:eol-style=native;svn:executable;svn:keywords=Author Date Id Revision - *.png = svn:mime-type=image/png - *.jpg = svn:mime-type=image/jpeg - *.gif = svn:mime-type=image/gif - - -Setting Up SSH Access -````````````````````` - -With a public & private key pair, you can access the shell and -Subversion servers without having to enter your password. There are -two places to add your SSH public key on BerliOS: your web account and -your shell account. - -* Adding your SSH key to your BerliOS web account: - - 1. Log in on the web at https://developer.berlios.de/. Create your - account first if necessary. You should be taken to your "My - Personal Page" (https://developer.berlios.de/my/). - - 2. Choose "Account Options" from the menu below the top banner. - - 3. At the bottom of the "Account Maintenance" page - (https://developer.berlios.de/account/) you'll find a "Shell - Account Information" section; click on "[Edit Keys]". - - 4. Copy and paste your SSH public key into the edit box on this page - (https://developer.berlios.de/account/editsshkeys.php). Further - instructions are available on this page. - -* Adding your SSH key to your BerliOS shell account: - - 1. Log in to the BerliOS shell server:: - - ssh <user>@shell.berlios.de - - You'll be asked for your password, which you set when you created - your account. - - 2. 
Create a .ssh directory in your home directory, and remove - permissions for group & other:: - - mkdir .ssh - chmod og-rwx .ssh - - Exit the SSH session. - - 3. Copy your public key to the .ssh directory on BerliOS:: - - scp .ssh/id_dsa.pub <user>@shell.berlios.de:.ssh/authorized_keys - - Now you should be able to start an SSH session without needing your - password. - - -Repository Layout -================= - -The following tree shows the repository layout:: - - docutils/ - |-- branches/ - | |-- branch1/ - | | |-- docutils/ - | | |-- sandbox/ - | | `-- web/ - | `-- branch2/ - | |-- docutils/ - | |-- sandbox/ - | `-- web/ - |-- tags/ - | |-- tag1/ - | | |-- docutils/ - | | |-- sandbox/ - | | `-- web/ - | `-- tag2/ - | |-- docutils/ - | |-- sandbox/ - | `-- web/ - `-- trunk/ - |-- docutils/ - |-- sandbox/ - `-- web/ - -``docutils/branches/`` and ``docutils/tags/`` contain (shallow) copies -of the whole trunk. - -The main source tree lives at ``docutils/trunk/docutils/``, next to -the sandboxes (``docutils/trunk/sandbox/``) and the web site files -(``docutils/trunk/web/``). diff --git a/docutils/docs/dev/rst/alternatives.txt b/docutils/docs/dev/rst/alternatives.txt deleted file mode 100644 index 12874c5fb..000000000 --- a/docutils/docs/dev/rst/alternatives.txt +++ /dev/null @@ -1,3129 +0,0 @@ -================================================== - A Record of reStructuredText Syntax Alternatives -================================================== - -:Author: David Goodger -:Contact: goodger@users.sourceforge.net -:Revision: $Revision$ -:Date: $Date$ -:Copyright: This document has been placed in the public domain. - -The following are ideas, alternatives, and justifications that were -considered for reStructuredText syntax, which did not originate with -Setext_ or StructuredText_. For an analysis of constructs which *did* -originate with StructuredText or Setext, please see `Problems With -StructuredText`_. 
See the `reStructuredText Markup Specification`_ -for full details of the established syntax. - -The ideas are divided into sections: - -* Implemented_: already done. The issues and alternatives are - recorded here for posterity. - -* `Not Implemented`_: these ideas won't be implemented. - -* Tabled_: these ideas should be revisited in the future. - -* `To Do`_: these ideas should be implemented. They're just waiting - for a champion to resolve issues and get them done. - -* `... Or Not To Do?`_: possible but questionable. These probably - won't be implemented, but you never know. - -.. _Setext: http://docutils.sourceforge.net/mirror/setext.html -.. _StructuredText: - http://www.zope.org/DevHome/Members/jim/StructuredTextWiki/FrontPage -.. _Problems with StructuredText: problems.html -.. _reStructuredText Markup Specification: - ../../ref/rst/restructuredtext.html - - -.. contents:: - -------------- - Implemented -------------- - -Field Lists -=========== - -Prior to the syntax for field lists being finalized, several -alternatives were proposed. - -1. Unadorned RFC822_ everywhere:: - - Author: Me - Version: 1 - - Advantages: clean, precedent (RFC822-compliant). Disadvantage: - ambiguous (these paragraphs are a prime example). - - Conclusion: rejected. - -2. Special case: use unadorned RFC822_ for the very first or very last - text block of a document:: - - """ - Author: Me - Version: 1 - - The rest of the document... - """ - - Advantages: clean, precedent (RFC822-compliant). Disadvantages: - special case, flat (unnested) field lists only, still ambiguous:: - - """ - Usage: cmdname [options] arg1 arg2 ... - - We obviously *don't* want the line above to be interpreted as a - field list item. Or do we? - """ - - Conclusion: rejected for the general case, accepted for specific - contexts (PEPs, email). - -3. Use a directive:: - - .. fields:: - - Author: Me - Version: 1 - - Advantages: explicit and unambiguous, RFC822-compliant. - Disadvantage: cumbersome. 
- - Conclusion: rejected for the general case (but such a directive - could certainly be written). - -4. Use Javadoc-style:: - - @Author: Me - @Version: 1 - @param a: integer - - Advantages: unambiguous, precedent, flexible. Disadvantages: - non-intuitive, ugly, not RFC822-compliant. - - Conclusion: rejected. - -5. Use leading colons:: - - :Author: Me - :Version: 1 - - Advantages: unambiguous, obvious (*almost* RFC822-compliant), - flexible, perhaps even elegant. Disadvantages: no precedent, not - quite RFC822-compliant. - - Conclusion: accepted! - -6. Use double colons:: - - Author:: Me - Version:: 1 - - Advantages: unambiguous, obvious? (*almost* RFC822-compliant), - flexible, similar to syntax already used for literal blocks and - directives. Disadvantages: no precedent, not quite - RFC822-compliant, similar to syntax already used for literal blocks - and directives. - - Conclusion: rejected because of the syntax similarity & conflicts. - -Why is RFC822 compliance important? It's a universal Internet -standard, and super obvious. Also, I'd like to support the PEP format -(ulterior motive: get PEPs to use reStructuredText as their standard). -But it *would* be easy to get used to an alternative (easy even to -convert PEPs; probably harder to convert python-deviants ;-). - -Unfortunately, without well-defined context (such as in email headers: -RFC822 only applies before any blank lines), the RFC822 format is -ambiguous. It is very common in ordinary text. To implement field -lists unambiguously, we need explicit syntax. - -The following question was posed in a footnote: - - Should "bibliographic field lists" be defined at the parser level, - or at the DPS transformation level? In other words, are they - reStructuredText-specific, or would they also be applicable to - another (many/every other?) syntax? - -The answer is that bibliographic fields are a -reStructuredText-specific markup convention. Other syntaxes may -implement the bibliographic elements explicitly. 
For example, there -would be no need for such a transformation for an XML-based markup -syntax. - -.. _RFC822: http://www.rfc-editor.org/rfc/rfc822.txt - - -Interpreted Text "Roles" -======================== - -The original purpose of interpreted text was as a mechanism for -descriptive markup, to describe the nature or role of a word or -phrase. For example, in XML we could say "<function>len</function>" -to mark up "len" as a function. It is envisaged that within Python -docstrings (inline documentation in Python module source files, the -primary market for reStructuredText) the role of a piece of -interpreted text can be inferred implicitly from the context of the -docstring within the program source. For other applications, however, -the role may have to be indicated explicitly. - -Interpreted text is enclosed in single backquotes (`). - -1. Initially, it was proposed that an explicit role could be indicated - as a word or phrase within the enclosing backquotes: - - - As a prefix, separated by a colon and whitespace:: - - `role: interpreted text` - - - As a suffix, separated by whitespace and a colon:: - - `interpreted text :role` - - There are problems with the initial approach: - - - There could be ambiguity with interpreted text containing colons. - For example, an index entry of "Mission: Impossible" would - require a backslash-escaped colon. - - - The explicit role is descriptive markup, not content, and will - not be visible in the processed output. Putting it inside the - backquotes doesn't feel right; the *role* isn't being quoted. - -2. Tony Ibbs suggested that the role be placed outside the - backquotes:: - - role:`prefix` or `suffix`:role - - This removes the embedded-colons ambiguity, but limits the role - identifier to be a single word (whitespace would be illegal). - Since roles are not meant to be visible after processing, the lack - of whitespace support is not important. 
- - The suggested syntax remains ambiguous with respect to ratios and - some writing styles. For example, suppose there is a "signal" - identifier, and we write:: - - ...calculate the `signal`:noise ratio. - - "noise" looks like a role. - -3. As an improvement on #2, we can bracket the role with colons:: - - :role:`prefix` or `suffix`:role: - - This syntax is similar to that of field lists, which is fine since - both are doing similar things: describing. - - This is the syntax chosen for reStructuredText. - -4. Another alternative is two colons instead of one:: - - role::`prefix` or `suffix`::role - - But this is used for analogies ("A:B::C:D": "A is to B as C is to - D"). - - Both alternative #2 and #4 lack delimiters on both sides of the - role, making it difficult to parse (by the reader). - -5. Some kind of bracketing could be used: - - - Parentheses:: - - (role)`prefix` or `suffix`(role) - - - Braces:: - - {role}`prefix` or `suffix`{role} - - - Square brackets:: - - [role]`prefix` or `suffix`[role] - - - Angle brackets:: - - <role>`prefix` or `suffix`<role> - - (The overlap of \*ML tags with angle brackets would be too - confusing and precludes their use.) - -Syntax #3 was chosen for reStructuredText. - - -Comments -======== - -A problem with comments (actually, with all indented constructs) is -that they cannot be followed by an indented block -- a block quote -- -without swallowing it up. - -I thought that perhaps comments should be one-liners only. But would -this mean that footnotes, hyperlink targets, and directives must then -also be one-liners? Not a good solution. - -Tony Ibbs suggested a "comment" directive. I added that we could -limit a comment to a single text block, and that a "multi-block -comment" could use "comment-start" and "comment-end" directives. This -would remove the indentation incompatibility. A "comment" directive -automatically suggests "footnote" and (hyperlink) "target" directives -as well. This could go on forever! Bad choice. 
- -Garth Kidd suggested that an "empty comment", a ".." explicit markup -start with nothing on the first line (except possibly whitespace) and -a blank line immediately following, could serve as an "unindent". An -empty comment does **not** swallow up indented blocks following it, -so block quotes are safe. "A tiny but practical wart." Accepted. - - -Anonymous Hyperlinks -==================== - -Alan Jaffray came up with this idea, along with the following syntax:: - - Search the `Python DOC-SIG mailing list archives`{}_. - - .. _: http://mail.python.org/pipermail/doc-sig/ - -The idea is sound and useful. I suggested a "double underscore" -syntax:: - - Search the `Python DOC-SIG mailing list archives`__. - - .. __: http://mail.python.org/pipermail/doc-sig/ - -But perhaps single underscores are okay? The syntax looks better, but -the hyperlink itself doesn't explicitly say "anonymous":: - - Search the `Python DOC-SIG mailing list archives`_. - - .. _: http://mail.python.org/pipermail/doc-sig/ - -Mixing anonymous and named hyperlinks becomes confusing. The order of -targets is not significant for named hyperlinks, but it is for -anonymous hyperlinks:: - - Hyperlinks: anonymous_, named_, and another anonymous_. - - .. _named: named - .. _: anonymous1 - .. _: anonymous2 - -Without the extra syntax of double underscores, determining which -hyperlink references are anonymous may be difficult. We'd have to -check which references don't have corresponding targets, and match -those up with anonymous targets. Keeping to a simple consistent -ordering (as with auto-numbered footnotes) seems simplest. - -reStructuredText will use the explicit double-underscore syntax for -anonymous hyperlinks. An alternative (see `Reworking Explicit Markup -(Round 1)`_ below) for the somewhat awkward ".. __:" syntax is "__":: - - An anonymous__ reference. 
- - __ http://anonymous - - -Reworking Explicit Markup (Round 1) -=================================== - -Alan Jaffray came up with the idea of `anonymous hyperlinks`_, added -to reStructuredText. Subsequently it was asserted that hyperlinks -(especially anonymous hyperlinks) would play an increasingly important -role in reStructuredText documents, and therefore they require a -simpler and more concise syntax. This prompted a review of the -current and proposed explicit markup syntaxes with regards to -improving usability. - -1. Original syntax:: - - .. _blah: internal hyperlink target - .. _blah: http://somewhere external hyperlink target - .. _blah: blahblah_ indirect hyperlink target - .. __: anonymous internal target - .. __: http://somewhere anonymous external target - .. __: blahblah_ anonymous indirect target - .. [blah] http://somewhere footnote - .. blah:: http://somewhere directive - .. blah: http://somewhere comment - - .. Note:: - - The comment text was intentionally made to look like a hyperlink - target. - - Origins: - - * Except for the colon (a delimiter necessary to allow for - phrase-links), hyperlink target ``.. _blah:`` comes from Setext. - * Comment syntax from Setext. - * Footnote syntax from StructuredText ("named links"). - * Directives and anonymous hyperlinks original to reStructuredText. - - Advantages: - - + Consistent explicit markup indicator: "..". - + Consistent hyperlink syntax: ".. _" & ":". - - Disadvantages: - - - Anonymous target markup is awkward: ".. __:". - - The explicit markup indicator ("..") is excessively overloaded? - - Comment text is limited (can't look like a footnote, hyperlink, - or directive). But this is probably not important. - -2. 
Alan Jaffray's proposed syntax #1:: - - __ _blah internal hyperlink target - __ blah: http://somewhere external hyperlink target - __ blah: blahblah_ indirect hyperlink target - __ anonymous internal target - __ http://somewhere anonymous external target - __ blahblah_ anonymous indirect target - __ [blah] http://somewhere footnote - .. blah:: http://somewhere directive - .. blah: http://somewhere comment - - The hyperlink-connoted underscores have become first-level syntax. - - Advantages: - - + Anonymous targets are simpler. - + All hyperlink targets are one character shorter. - - Disadvantages: - - - Inconsistent internal hyperlink targets. Unlike all other named - hyperlink targets, there's no colon. There's an extra leading - underscore, but we can't drop it because without it, "blah" looks - like a relative URI. Unless we restore the colon:: - - __ blah: internal hyperlink target - - - Obtrusive markup? - -3. Alan Jaffray's proposed syntax #2:: - - .. _blah internal hyperlink target - .. blah: http://somewhere external hyperlink target - .. blah: blahblah_ indirect hyperlink target - .. anonymous internal target - .. http://somewhere anonymous external target - .. blahblah_ anonymous indirect target - .. [blah] http://somewhere footnote - !! blah: http://somewhere directive - ## blah: http://somewhere comment - - Leading underscores have been (almost) replaced by "..", while - comments and directives have gained their own syntax. - - Advantages: - - + Anonymous hyperlinks are simpler. - + Unique syntax for comments. Connotation of "comment" from - some programming languages (including our favorite). - + Unique syntax for directives. Connotation of "action!". - - Disadvantages: - - - Inconsistent internal hyperlink targets. Again, unlike all other - named hyperlink targets, there's no colon. There's a leading - underscore, matching the trailing underscores of references, - which no other hyperlink targets have. 
We can't drop that one - leading underscore though: without it, "blah" looks like a - relative URI. Again, unless we restore the colon:: - - .. blah: internal hyperlink target - - - All (except for internal) hyperlink targets lack their leading - underscores, losing the "hyperlink" connotation. - - - Obtrusive syntax for comments. Alternatives:: - - ;; blah: http://somewhere - (also comment syntax in Lisp & others) - ,, blah: http://somewhere - ("comma comma": sounds like "comment"!) - - - Iffy syntax for directives. Alternatives? - -4. Tony Ibbs' proposed syntax:: - - .. _blah: internal hyperlink target - .. _blah: http://somewhere external hyperlink target - .. _blah: blahblah_ indirect hyperlink target - .. anonymous internal target - .. http://somewhere anonymous external target - .. blahblah_ anonymous indirect target - .. [blah] http://somewhere footnote - .. blah:: http://somewhere directive - .. blah: http://somewhere comment - - This is the same as the current syntax, except for anonymous - targets which drop their "__: ". - - Advantage: - - + Anonymous targets are simpler. - - Disadvantages: - - - Anonymous targets lack their leading underscores, losing the - "hyperlink" connotation. - - Anonymous targets are almost indistinguishable from comments. - (Better to know "up front".) - -5. David Goodger's proposed syntax: Perhaps going back to one of - Alan's earlier suggestions might be the best solution. How about - simply adding "__ " as a synonym for ".. __: " in the original - syntax? These would become equivalent:: - - .. __: anonymous internal target - .. __: http://somewhere anonymous external target - .. __: blahblah_ anonymous indirect target - - __ anonymous internal target - __ http://somewhere anonymous external target - __ blahblah_ anonymous indirect target - -Alternative 5 has been adopted. - - -Backquotes in Phrase-Links -========================== - -[From a 2001-06-05 Doc-SIG post in reply to questions from Doug -Hellmann.] 
- -The first draft of the spec, posted to the Doc-SIG in November 2000, -used square brackets for phrase-links. I changed my mind because: - -1. In the first draft, I had already decided on single-backquotes for - inline literal text. - -2. However, I wanted to minimize the necessity for backslash escapes, - for example when quoting Python repr-equivalent syntax that uses - backquotes. - -3. The processing of identifiers (function/method/attribute/module - etc. names) into hyperlinks is a useful feature. PyDoc recognizes - identifiers heuristically, but it doesn't take much imagination to - come up with counter-examples where PyDoc's heuristics would result - in embarrassing failure. I wanted to do it deterministically, and - that called for syntax. I called this construct "interpreted - text". - -4. Leveraging off the ``*emphasis*/**strong**`` syntax led to the - idea of using double-backquotes as syntax. - -5. I worked out some rules for inline markup recognition. - -6. In combination with #5, double backquotes lent themselves to inline - literals, neatly satisfying #2, minimizing backslash escapes. In - fact, the spec says that no interpretation of any kind is done - within double-backquote inline literal text; backslashes do *no* - escaping within literal text. - -7. Single backquotes are then freed up for interpreted text. - -8. I already had square brackets required for footnote references. - -9. Since interpreted text will typically turn into hyperlinks, it was - a natural fit to use backquotes as the phrase-quoting syntax for - trailing-underscore hyperlinks. - -The original inspiration for the trailing underscore hyperlink syntax -was Setext. But for phrases Setext used a very cumbersome -``underscores_between_words_like_this_`` syntax. - -The underscores can be viewed as if they were right-pointing arrows: -``-->``. So ``hyperlink_`` points away from the reference, and -``.. _hyperlink:`` points toward the target. 
- - -Substitution Mechanism -====================== - -Substitutions arose out of a Doc-SIG thread begun on 2001-10-28 by -Alan Jaffray, "reStructuredText inline markup". It reminded me of a -missing piece of the reStructuredText puzzle, first referred to in my -contribution to "Documentation markup & processing / PEPs" (Doc-SIG -2001-06-21). - -Substitutions allow the power and flexibility of directives to be -shared by inline text. They are a way to allow arbitrarily complex -inline objects, while keeping the details out of the flow of text. -They are the equivalent of SGML/XML's named entities. For example, an -inline image (using reference syntax alternative 4d (vertical bars) -and definition alternative 3, the alternatives chosen for inclusion in -the spec):: - - The |biohazard| symbol must be used on containers used to dispose - of medical waste. - - .. |biohazard| image:: biohazard.png - [height=20 width=20] - -The ``|biohazard|`` substitution reference will be replaced in-line by -whatever the ``.. |biohazard|`` substitution definition generates (in -this case, an image). A substitution definition contains the -substitution text bracketed with vertical bars, followed by an -embedded inline-compatible directive, such as "image". A transform is -required to complete the substitution. - -Syntax alternatives for the reference: - -1. Use the existing interpreted text syntax, with a predefined role - such as "sub":: - - The `biohazard`:sub: symbol... - - Advantages: existing syntax, explicit. Disadvantages: verbose, - obtrusive. - -2. Use a variant of the interpreted text syntax, with a new suffix - akin to the underscore in phrase-link references:: - - (a) `name`@ - (b) `name`# - (c) `name`& - (d) `name`/ - (e) `name`< - (f) `name`:: - (g) `name`: - - - Due to incompatibility with other constructs and ordinary text - usage, (f) and (g) are not possible. - -3. 
Use interpreted text syntax with a fixed internal format:: - - (a) `:name:` - (b) `name:` - (c) `name::` - (d) `::name::` - (e) `%name%` - (f) `#name#` - (g) `/name/` - (h) `&name&` - (i) `|name|` - (j) `[name]` - (k) `<name>` - (l) `&name;` - (m) `'name'` - - To avoid ML confusion (k) and (l) are definitely out. Square - brackets (j) won't work in the target (the substitution definition - would be indistinguishable from a footnote). - - The ```/name/``` syntax (g) is reminiscent of "s/find/sub" - substitution syntax in ed-like languages. However, it may have a - misleading association with regexps, and looks like an absolute - POSIX path. (i) is visually equivalent and lacking the - connotations. - - A disadvantage of all of these is that they limit interpreted text, - albeit only slightly. - -4. Use specialized syntax, something new:: - - (a) #name# - (b) @name@ - (c) /name/ - (d) |name| - (e) <<name>> - (f) //name// - (g) ||name|| - (h) ^name^ - (i) [[name]] - (j) ~name~ - (k) !name! - (l) =name= - (m) ?name? - (n) >name< - - "#" (a) and "@" (b) are obtrusive. "/" (c) without backquotes - looks just like a POSIX path; it is likely for such usage to appear - in text. - - "|" (d) and "^" (h) are feasible. - -5. Redefine the trailing underscore syntax. See definition syntax - alternative 4, below. - -Syntax alternatives for the definition: - -1. Use the existing directive syntax, with a predefined directive such - as "sub". It contains a further embedded directive resolving to an - inline-compatible object:: - - .. sub:: biohazard - .. image:: biohazard.png - [height=20 width=20] - - .. sub:: parrot - That bird wouldn't *voom* if you put 10,000,000 volts - through it! - - The advantages and disadvantages are the same as in inline - alternative 1. - -2. Use syntax as in #1, but with an embedded directive compressed:: - - .. sub:: biohazard image:: biohazard.png - [height=20 width=20] - - This is a bit better than alternative 1, but still too much. - -3. 
Use a variant of directive syntax, incorporating the substitution - text, obviating the need for a special "sub" directive name. If we - assume reference alternative 4d (vertical bars), the matching - definition would look like this:: - - .. |biohazard| image:: biohazard.png - [height=20 width=20] - -4. (Suggested by Alan Jaffray on Doc-SIG from 2001-11-06.) - - Instead of adding new syntax, redefine the trailing underscore - syntax to mean "substitution reference" instead of "hyperlink - reference". Alan's example:: - - I had lunch with Jonathan_ today. We talked about Zope_. - - .. _Jonathan: lj [user=jhl] - .. _Zope: http://www.zope.org/ - - A problem with the proposed syntax is that URIs which look like - simple reference names (alphanum plus ".", "-", "_") would be - indistinguishable from substitution directive names. A more - consistent syntax would be:: - - I had lunch with Jonathan_ today. We talked about Zope_. - - .. _Jonathan: lj:: user=jhl - .. _Zope: http://www.zope.org/ - - (``::`` after ``.. _Jonathan: lj``.) - - The "Zope" target is a simple external hyperlink, but the - "Jonathan" target contains a directive. Alan proposed that the - reference text be replaced by whatever the referenced directive - (the "directive target") produces. A directive reference becomes a - hyperlink reference if the contents of the directive target resolve - to a hyperlink. If the directive target resolves to an icon, the - reference is replaced by an inline icon. If the directive target - resolves to a hyperlink, the directive reference becomes a - hyperlink reference. - - This seems too indirect and complicated for easy comprehension. - - The reference in the text will sometimes become a link, sometimes - not. Sometimes the reference text will remain, sometimes not. We - don't know *at the reference*:: - - This is a `hyperlink reference`_; its text will remain. - This is an `inline icon`_; its text will disappear. - - That's a problem. 
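To make the pairing of reference alternative 4d with definition alternative 3 concrete, here is a rough Python sketch of the lookup involved. It is an illustration under stated assumptions, not the Docutils transform: ``collect_definitions``, ``substitute``, and the ``render`` callback are hypothetical names, only one-line definitions are recognized, and indented option lines such as ``[height=20 width=20]`` are ignored.

```python
import re

# Hypothetical patterns for illustration; the real parser is more careful.
DEFINITION = re.compile(
    r"^\.\. \|(?P<name>[^|]+)\| (?P<directive>[\w-]+):: *(?P<arg>.*)$"
)
REFERENCE = re.compile(r"\|(?P<name>[^|\s][^|]*)\|")


def collect_definitions(source):
    """Map each substitution name to its (directive, argument) pair."""
    defs = {}
    for line in source.splitlines():
        m = DEFINITION.match(line)
        if m:
            defs[m.group("name")] = (m.group("directive"), m.group("arg"))
    return defs


def substitute(text, defs, render):
    """Replace |name| references using render(directive, argument)."""
    def repl(m):
        name = m.group("name")
        if name not in defs:
            return m.group(0)  # leave unknown references untouched
        return render(*defs[name])
    return REFERENCE.sub(repl, text)
```

Because the name sits between vertical bars in both the reference and the definition, a reader can tell at the reference that a substitution will occur, which is the property alternative 4 lacks.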
- -The syntax that has been incorporated into the spec and parser is -reference alternative 4d with definition alternative 3:: - - The |biohazard| symbol... - - .. |biohazard| image:: biohazard.png - [height=20 width=20] - -We can also combine substitution references with hyperlink references, -by appending a "_" (named hyperlink reference) or "__" (anonymous -hyperlink reference) suffix to the substitution reference. This -allows us to click on an image-link:: - - The |biohazard|_ symbol... - - .. |biohazard| image:: biohazard.png - [height=20 width=20] - .. _biohazard: http://www.cdc.gov/ - -There have been several suggestions for the naming of these -constructs, originally called "substitution references" and -"substitutions". - -1. Candidate names for the reference construct: - - (a) substitution reference - (b) tagging reference - (c) inline directive reference - (d) directive reference - (e) indirect inline directive reference - (f) inline directive placeholder - (g) inline directive insertion reference - (h) directive insertion reference - (i) insertion reference - (j) directive macro reference - (k) macro reference - (l) substitution directive reference - -2. 
Candidate names for the definition construct: - - (a) substitution - (b) substitution directive - (c) tag - (d) tagged directive - (e) directive target - (f) inline directive - (g) inline directive definition - (h) referenced directive - (i) indirect directive - (j) indirect directive definition - (k) directive definition - (l) indirect inline directive - (m) named directive definition - (n) inline directive insertion definition - (o) directive insertion definition - (p) insertion definition - (q) insertion directive - (r) substitution definition - (s) directive macro definition - (t) macro definition - (u) substitution directive definition - (v) substitution definition - -"Inline directive reference" (1c) seems to be an appropriate term at -first, but the term "inline" is redundant in the case of the -reference. Its counterpart "inline directive definition" (2g) is -awkward, because the directive definition itself is not inline. - -"Directive reference" (1d) and "directive definition" (2k) are too -vague. "Directive definition" could be used to refer to any -directive, not just those used for inline substitutions. - -One meaning of the term "macro" (1k, 2s, 2t) is too -programming-language-specific. Also, macros are typically simple text -substitution mechanisms: the text is substituted first and evaluated -later. reStructuredText substitution definitions are evaluated in -place at parse time and substituted afterwards. - -"Insertion" (1h, 1i, 2n-2q) is almost right, but it implies that -something new is getting added rather than one construct being -replaced by another. - -Which brings us back to "substitution". The overall best names are -"substitution reference" (1a) and "substitution definition" (2v). A -long way to go to add one word! - - -Inline External Targets -======================= - -Currently reStructuredText has two hyperlink syntax variations: - -* Named hyperlinks:: - - This is a named reference_ of one word ("reference"). 
Here is - a `phrase reference`_. Phrase references may even cross `line - boundaries`_. - - .. _reference: http://www.example.org/reference/ - .. _phrase reference: http://www.example.org/phrase_reference/ - .. _line boundaries: http://www.example.org/line_boundaries/ - - + Advantages: - - - The plaintext is readable. - - Each target may be reused multiple times (e.g., just write - ``"reference_"`` again). - - No synchronized ordering of references and targets is necessary. - - + Disadvantages: - - - The reference text must be repeated as target names; could lead - to mistakes. - - The target URLs may be located far from the references, and hard - to find in the plaintext. - -* Anonymous hyperlinks (in current reStructuredText):: - - This is an anonymous reference__. Here is an anonymous - `phrase reference`__. Phrase references may even cross `line - boundaries`__. - - __ http://www.example.org/reference/ - __ http://www.example.org/phrase_reference/ - __ http://www.example.org/line_boundaries/ - - + Advantages: - - - The plaintext is readable. - - The reference text does not have to be repeated. - - + Disadvantages: - - - References and targets must be kept in sync. - - Targets cannot be reused. - - The target URLs may be located far from the references. - -For comparison and historical background, StructuredText also has two -syntaxes for hyperlinks: - -* First, ``"reference text":URL``:: - - This is a "reference":http://www.example.org/reference/ - of one word ("reference"). Here is a "phrase - reference":http://www.example.org/phrase_reference/. - -* Second, ``"reference text", http://example.com/absolute_URL``:: - - This is a "reference", http://www.example.org/reference/ - of one word ("reference"). Here is a "phrase reference", - http://www.example.org/phrase_reference/. - -Both syntaxes share advantages and disadvantages: - -+ Advantages: - - - The target is specified immediately adjacent to the reference. 
- -+ Disadvantages: - - - Poor plaintext readability. - - Targets cannot be reused. - - Both syntaxes use double quotes, common in ordinary text. - - In the first syntax, the URL and the last word are stuck - together, exacerbating the line wrap problem. - - The second syntax is too magical; text could easily be written - that way by accident (although only absolute URLs are recognized - here, perhaps because of the potential for ambiguity). - -A new type of "inline external hyperlink" has been proposed. - -1. On 2002-06-28, Simon Budig proposed__ a new syntax for - reStructuredText hyperlinks:: - - This is a reference_(http://www.example.org/reference/) of one - word ("reference"). Here is a `phrase - reference`_(http://www.example.org/phrase_reference/). Are - these examples, (single-underscore), named? If so, `anonymous - references`__(http://www.example.org/anonymous/) using two - underscores would probably be preferable. - - __ http://mail.python.org/pipermail/doc-sig/2002-June/002648.html - - The syntax, advantages, and disadvantages are similar to those of - StructuredText. - - + Advantages: - - - The target is specified immediately adjacent to the reference. - - + Disadvantages: - - - Poor plaintext readability. - - Targets cannot be reused (unless named, but the semantics are - unclear). - - + Problems: - - - The ``"`ref`_(URL)"`` syntax forces the last word of the - reference text to be joined to the URL, making a potentially - very long word that can't be wrapped (URLs can be very long). - The reference and the URL should be separate. This is a - symptom of the following point: - - - The syntax produces a single compound construct made up of two - equally important parts, *with syntax in the middle*, *between* - the reference and the target. This is unprecedented in - reStructuredText. - - - The "inline hyperlink" text is *not* a named reference (there's - no lookup by name), so it shouldn't look like one. 
- - - According to the IETF standards RFC 2396 and RFC 2732, - parentheses are legal URI characters and curly braces are legal - email characters, making their use prohibitively difficult. - - - The named/anonymous semantics are unclear. - -2. After an analysis__ of the syntax of (1) above, we came up with the - following compromise syntax:: - - This is an anonymous reference__ - __<http://www.example.org/reference/> of one word - ("reference"). Here is a `phrase reference`__ - __<http://www.example.org/phrase_reference/>. `Named - references`_ _<http://www.example.org/anonymous/> use single - underscores. - - __ http://mail.python.org/pipermail/doc-sig/2002-July/002670.html - - The syntax builds on that of the existing "inline internal - targets": ``an _`inline internal target`.`` - - + Advantages: - - - The target is specified immediately adjacent to the reference, - improving maintainability: - - - References and targets are easily kept in sync. - - The reference text does not have to be repeated. - - - The construct is executed in two parts: references identical to - existing references, and targets that are new but not too big a - stretch from current syntax. - - - There's overwhelming precedent for quoting URLs with angle - brackets [#]_. - - + Disadvantages: - - - Poor plaintext readability. - - Lots of "line noise". - - Targets cannot be reused (unless named; see below). - - To alleviate the readability issue slightly, we could allow the - target to appear later, such as after the end of the sentence:: - - This is a named reference__ of one word ("reference"). - __<http://www.example.org/reference/> Here is a `phrase - reference`__. __<http://www.example.org/phrase_reference/> - - Problem: this could only work for one reference at a time - (reference/target pairs must be proximate [refA trgA refB trgB], - not interleaved [refA refB trgA trgB] or nested [refA refB trgB - trgA]). 
This variation is too problematic; references and inline - external targets will have to be kept immediately adjacent (see (3) - below). - - The ``"reference__ __<target>"`` syntax is actually for "anonymous - inline external targets", emphasized by the double underscores. It - follows that single trailing and leading underscores would lead to - *implicitly named* inline external targets. This would allow the - reuse of targets by name. So after ``"reference_ _<target>"``, - another ``"reference_"`` would point to the same target. - - .. [#] - From RFC 2396 (URI syntax): - - The angle-bracket "<" and ">" and double-quote (") - characters are excluded [from URIs] because they are often - used as the delimiters around URI in text documents and - protocol fields. - - Using <> angle brackets around each URI is especially - recommended as a delimiting style for URI that contain - whitespace. - - From RFC 822 (email headers): - - Angle brackets ("<" and ">") are generally used to indicate - the presence of a one machine-usable reference (e.g., - delimiting mailboxes), possibly including source-routing to - the machine. - -3. If it is best for references and inline external targets to be - immediately adjacent, then they might as well be integrated. - Here's an alternative syntax embedding the target URL in the - reference:: - - This is an anonymous `reference <http://www.example.org - /reference/>`__ of one word ("reference"). Here is a `phrase - reference <http://www.example.org/phrase_reference/>`__. - - Advantages and disadvantages are similar to those in (2). - Readability is still an issue, but the syntax is a bit less - heavyweight (reduced line noise). Backquotes are required, even - for one-word references; the target URL is included within the - reference text, forcing a phrase context. - - We'll call this variant "embedded URIs". - - Problem: how to refer to a title like "HTML Anchors: <a>" (which - ends with an HTML/SGML/XML tag)? 
We could either require more - syntax on the target (like ``"`reference text - __<http://example.com/>`__"``), or require the odd conflicting - title to be escaped (like ``"`HTML Anchors: \<a>`__"``). The - latter seems preferable, and not too onerous. - - Similarly to (2) above, a single trailing underscore would convert - the reference & inline external target from anonymous to implicitly - named, allowing reuse of targets by name. - - I think this is the least objectionable of the syntax alternatives. - -Other syntax variations have been proposed (by Brett Cannon and Benja -Fallenstein):: - - `phrase reference`->http://www.example.com - - `phrase reference`@http://www.example.com - - `phrase reference`__ ->http://www.example.com - - `phrase reference` [-> http://www.example.com] - - `phrase reference`__ [-> http://www.example.com] - - `phrase reference` <http://www.example.com>_ - -None of these variations are clearly superior to #3 above. Some have -problems that exclude their use. - -With any kind of inline external target syntax it comes down to the -conflict between maintainability and plaintext readability. I don't -see a major problem with reStructuredText's maintainability, and I -don't want to sacrifice plaintext readability to "improve" it. - -The proponents of inline external targets want them for easily -maintainable web pages. The arguments go something like this: - -- Named hyperlinks are difficult to maintain because the reference - text is duplicated as the target name. - - To which I said, "So use anonymous hyperlinks." - -- Anonymous hyperlinks are difficult to maintain because the - references and targets have to be kept in sync. - - "So keep the targets close to the references, grouped after each - paragraph. Maintenance is trivial." - -- But targets grouped after paragraphs break the flow of text. - - "Surely less than URLs embedded in the text! 
And if the intent is - to produce web pages, not readable plaintext, then who cares about - the flow of text?" - -Many participants have voiced their objections to the proposed syntax: - - Garth Kidd: "I strongly prefer the current way of doing it. - Inline is spectactularly messy, IMHO." - - Tony Ibbs: "I vehemently agree... that the inline alternatives - being suggested look messy - there are/were good reasons they've - been taken out... I don't believe I would gain from the new - syntaxes." - - Paul Moore: "I agree as well. The proposed syntax is far too - punctuation-heavy, and any of the alternatives discussed are - ambiguous or too subtle." - -Others have voiced their support: - - fantasai: "I agree with Simon. In many cases, though certainly - not in all, I find parenthesizing the url in plain text flows - better than relegating it to a footnote." - - Ken Manheimer: "I'd like to weigh in requesting some kind of easy, - direct inline reference link." - -(Interesting that those *against* the proposal have been using -reStructuredText for a while, and those *for* the proposal are either -new to the list ["fantasai", background unknown] or longtime -StructuredText users [Ken Manheimer].) - -I was initially ambivalent/against the proposed "inline external -targets". I value reStructuredText's readability very highly, and -although the proposed syntax offers convenience, I don't know if the -convenience is worth the cost in ugliness. Does the proposed syntax -compromise readability too much, or should the choice be left up to -the author? Perhaps if the syntax is *allowed* but its use strongly -*discouraged*, for aesthetic/readability reasons? - -After a great deal of thought and much input from users, I've decided -that there are reasonable use cases for this construct. The -documentation should strongly caution against its use in most -situations, recommending independent block-level targets instead. -Syntax #3 above ("embedded URIs") will be used. 
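The "embedded URIs" syntax of alternative 3 can be sketched in a few lines of Python. This is a hedged illustration rather than the Docutils implementation: ``EMBEDDED_URI`` and ``find_embedded_uris`` are hypothetical names, and removing whitespace inside the angle brackets is an assumption made here so that a long URL can wrap across lines as in the example above. A backslash-escaped ``\<`` is deliberately not treated as the start of a target.

```python
import re

# Hypothetical pattern: `text <uri>`_ (implicitly named) or `text <uri>`__
# (anonymous). The lookbehind rejects an escaped "\<".
EMBEDDED_URI = re.compile(
    r"`(?P<text>[^`<]+?)\s*(?<!\\)<(?P<uri>[^>`]+)>`(?P<suffix>__?)"
)


def find_embedded_uris(source):
    """Return (reference text, uri, is_anonymous) for each embedded URI."""
    results = []
    for m in EMBEDDED_URI.finditer(source):
        text = " ".join(m.group("text").split())   # normalize line wraps
        uri = "".join(m.group("uri").split())      # drop wrapped whitespace
        results.append((text, uri, m.group("suffix") == "__"))
    return results
```

A title such as ```HTML Anchors: \<a>`__`` yields no embedded URI in this sketch, matching the escaping rule preferred above.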
- - -Doctree Representation of Transitions -===================================== - -(Although not reStructuredText-specific, this section fits best in -this document.) - -Having added the "horizontal rule" construct to the `reStructuredText -Markup Specification`_, a decision had to be made as to how to reflect -the construct in the implementation of the document tree. Given this -source:: - - Document - ======== - - Paragraph 1 - - -------- - - Paragraph 2 - -The horizontal rule indicates a "transition" (in prose terms) or the -start of a new "division". Before implementation, the parsed document -tree would be:: - - <document> - <section names="document"> - <title> - Document - <paragraph> - Paragraph 1 - -------- <--- error here - <paragraph> - Paragraph 2 - -There are several possibilities for the implementation: - -1. Implement horizontal rules as "divisions" or segments. A - "division" is a title-less, non-hierarchical section. The first - try at an implementation looked like this:: - - <document> - <section names="document"> - <title> - Document - <paragraph> - Paragraph 1 - <division> - <paragraph> - Paragraph 2 - - But the two paragraphs are really at the same level; they shouldn't - appear to be at different levels. There's really an invisible - "first division". The horizontal rule splits the document body - into two segments, which should be treated uniformly. - -2. Treating "divisions" uniformly brings us to the second - possibility:: - - <document> - <section names="document"> - <title> - Document - <division> - <paragraph> - Paragraph 1 - <division> - <paragraph> - Paragraph 2 - - With this change, documents and sections will directly contain - divisions and sections, but not body elements. Only divisions will - directly contain body elements. Even without a horizontal rule - anywhere, the body elements of a document or section would be - contained within a division element. This makes the document tree - deeper. 
This is similar to the way HTML_ treats document contents: - grouped within a ``<body>`` element. - -3. Implement them as "transitions", empty elements:: - - <document> - <section names="document"> - <title> - Document - <paragraph> - Paragraph 1 - <transition> - <paragraph> - Paragraph 2 - - A transition would be a "point element", not containing anything, - only identifying a point within the document structure. This keeps - the document tree flatter, but the idea of a "point element" like - "transition" smells bad. A transition isn't a thing itself, it's - the space between two divisions. However, transitions are a - practical solution. - -Solution 3 was chosen for incorporation into the document tree model. - -.. _HTML: http://www.w3.org/MarkUp/ - - -Syntax for Line Blocks -====================== - -* An early idea: How about a literal-block-like prefix, perhaps - "``;;``"? (It is, after all, a *semi-literal* literal block, no?) - Example:: - - Take it away, Eric the Orchestra Leader! ;; - - A one, two, a one two three four - - Half a bee, philosophically, - must, *ipso facto*, half not be. - But half the bee has got to be, - *vis a vis* its entity. D'you see? - - But can a bee be said to be - or not to be an entire bee, - when half the bee is not a bee, - due to some ancient injury? - - Singing... - - Kinda lame. - -* Another idea: in an ordinary paragraph, if the first line ends with - a backslash (escaping the newline), interpret the entire paragraph - as a verse block? For example:: - - Add just one backslash\ - And this paragraph becomes - An awful haiku - - (Awful, and arguably invalid, since in Japanese the word "haiku" - contains three syllables not two.) - - This idea was superseded by the rules for escaped whitespace, useful - for `character-level inline markup`_. - -* In a `2004-02-22 docutils-develop message`__, Jarno Elonen proposed - a "plain list" syntax (and also provided a patch):: - - | John Doe - | President, SuperDuper Corp. 
- | jdoe@example.org - - __ http://thread.gmane.org/gmane.text.docutils.devel/1187 - - This syntax is very natural. However, these "plain lists" seem very - similar to line blocks, and I see so little intrinsic "list-ness" - that I'm loath to add a new object. I used the term "blurbs" to - remove the "list" connotation from the originally proposed name. - Perhaps line blocks could be refined to add the two properties they - currently lack: - - A) long lines wrap nicely - B) HTML output doesn't look like program code in non-CSS web - browsers - - (A) is an issue of all 3 aspects of Docutils: syntax (construct - behaviour), internal representation, and output. (B) is partly an - issue of internal representation but mostly of output. - -ReStructuredText will redefine line blocks with the "|"-quoting -syntax. The following is my current thinking. - - -Syntax ------- - -Perhaps line block syntax like this would do:: - - | M6: James Bond - | MIB: Mr. J. - | IMF: not decided yet, but probably one of the following: - | Ethan Hunt - | Jim Phelps - | Claire Phelps - | CIA: Felix Leiter - -Note that the "nested" list does not have nested syntax (the "|" are -not further indented); the leading whitespace would still be -significant somehow (more below). As for long lines in the input, -this could suffice:: - - | John Doe - | Founder, President, Chief Executive Officer, Cook, Bottle - Washer, and All-Round Great Guy - | SuperDuper Corp. - | jdoe@example.org - -The lack of "|" on the third line indicates that it's a continuation -of the second line, wrapped. - -I don't see much point in allowing arbitrary nested content. Multiple -paragraphs or bullet lists inside a "blurb" don't make sense to me. -Simple nested line blocks should suffice. 
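The two rules proposed in this subsection (a leading "|" starts a new line, and a line without "|" continues the previous one) can be sketched in Python as follows. This is only an illustration of the proposal, not the eventual Docutils implementation; ``parse_line_block`` is a hypothetical name, and treating the single space after "|" as part of the syntax is an assumption.

```python
def parse_line_block(source):
    """Return a list of (indent, text) pairs, one per logical line."""
    lines = []
    for raw in source.splitlines():
        if raw.startswith("|"):
            body = raw[1:]
            if body.startswith(" "):
                body = body[1:]  # assume the one space after "|" is syntax
            stripped = body.lstrip(" ")
            # Remaining leading whitespace is the significant indentation.
            lines.append((len(body) - len(stripped), stripped.rstrip()))
        elif raw.strip() and lines:
            # No "|" prefix: a wrapped continuation of the previous line.
            indent, text = lines[-1]
            lines[-1] = (indent, text + " " + raw.strip())
    return lines
```

Applied to the "John Doe" example above, the third source line would be folded into the second, leaving four logical lines at indentation 0.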
- - -Internal Representation ------------------------ - -Line blocks are currently represented as text blobs as follows:: - - <!ELEMENT line_block %text.model;> - <!ATTLIST line_block - %basic.atts; - %fixedspace.att;> - -Instead, we could represent each line by a separate element:: - - <!ELEMENT line_block (line+)> - <!ATTLIST line_block %basic.atts;> - - <!ELEMENT line %text.model;> - <!ATTLIST line %basic.atts;> - -We'd keep the significance of the leading whitespace of each line -either by converting it to non-breaking spaces at output, or with a -per-line margin. Non-breaking spaces are simpler (for HTML, anyway) -but kludgey, and wouldn't support indented long lines that wrap. But -should inter-word whitespace (i.e., not leading whitespace) be -preserved? Currently it is preserved in line blocks. - -Representing a more complex line block may be tricky:: - - | But can a bee be said to be - | or not to be an entire bee, - | when half the bee is not a bee, - | due to some ancient injury? - -Perhaps the representation could allow for nested line blocks:: - - <!ELEMENT line_block (line | line_block)+> - -With this model, leading whitespace would no longer be significant. -Instead, left margins are implied by the nesting. The example above -could be represented as follows:: - - <line_block> - <line> - But can a bee be said to be - <line_block> - <line> - or not to be an entire bee, - <line_block> - <line> - when half the bee is not a bee, - <line_block> - <line> - due to some ancient injury? - -I wasn't sure what to do about even more complex line blocks:: - - | Indented - | Not indented - | Indented a bit - | A bit more - | Only one space - -How should that be parsed and nested? Should the first line have -the same nesting level (== indentation in the output) as the fourth -line, or the same as the last line? Mark Nodine suggested that such -line blocks be parsed similarly to complexly-nested block quotes, -which seems reasonable. 
In the example above, this would result in -the nesting of first line matching the last line's nesting. In -other words, the nesting would be relative to neighboring lines -only. - - -Output ------- - -In HTML, line blocks are currently output as "<pre>" blocks, which -gives us significant whitespace and line breaks, but doesn't allow -long lines to wrap and causes monospaced output without stylesheets. -Instead, we could output "<div>" elements parallelling the -representation above, where each nested <div class="line_block"> would -have an increased left margin (specified in the stylesheet). - -Jarno suggested the following HTML output:: - - <div class="line_block"> - <span class="line">First, top level line</span><br class="hidden"/> - <div class="line_block"><span class="hidden">&nbsp;</span> - <span class="line">Second, once nested</span><br class="hidden"/> - <span class="line">Third, once nested</span><br class="hidden"/> - ... - </div> - ... - </div> - -The ``<br class="hidden" />`` and ``<span -class="hidden">&nbsp;</span>`` are meant to support non-CSS and -non-graphical browsers. I understand the case for "br", but I'm not -so sure about hidden "&nbsp;". I question how much effort should be -put toward supporting non-graphical and especially non-CSS browsers, -at least for html4css1.py output. - -Should the lines themselves be ``<span>`` or ``<div>``? I don't like -mixing inline and block-level elements. - - -Implementation Plan -------------------- - -We'll leave the old implementation in place (via the "line-block" -directive only) until all Writers have been updated to support the new -syntax & implementation. The "line-block" directive can then be -updated to use the new internal representation, and its documentation -will be updated to recommend the new syntax. - - -List-Driven Tables -================== - -The original idea came from Dylan Jay: - - ... to use a two level bulleted list with something to - indicate it should be rendered as a table ... 
- -It's an interesting idea. It could be implemented as a directive -which transforms a uniform two-level list into a table. Using a -directive would allow the author to explicitly set the table's -orientation (by column or by row), the presence of row headers, etc. - -Alternatives: - -1. (Implemented in Docutils 0.3.8). - - Bullet-list-tables might look like this:: - - .. list-table:: - - * - Treat - - Quantity - - Description - * - Albatross! - - 299 - - On a stick! - * - Crunchy Frog! - - 1499 - - If we took the bones out, it wouldn't be crunchy, - now would it? - * - Gannet Ripple! - - 199 - - On a stick! - - This list must be written in two levels. This wouldn't work:: - - .. list-table:: - - * Treat - * Albatross! - * Gannet! - * Crunchy Frog! - - * Quantity - * 299 - * 199 - * 1499 - - * Description - * On a stick! - * On a stick! - * If we took the bones out... - - The above is a single list of 12 items. The blank lines are not - significant to the markup. We'd have to explicitly specify how - many columns or rows to use, which isn't a good idea. - -2. Beni Cherniavsky suggested a field list alternative. It could look - like this:: - - .. field-list-table:: - :headrows: 1 - - - :treat: Treat - :quantity: Quantity - :descr: Description - - - :treat: Albatross! - :quantity: 299 - :descr: On a stick! - - - :treat: Crunchy Frog! - :quantity: 1499 - :descr: If we took the bones out, it wouldn't be - crunchy, now would it? - - Column order is determined from the order of fields in the first - row. Field order in all other rows is ignored. As a side-effect, - this allows trivial re-arrangement of columns. By using named - fields, it becomes possible to omit fields in some rows without - losing track of things, which is important for spans. - -3. An alternative to two-level bullet lists would be to use enumerated - lists for the table cells:: - - .. list-table:: - - * 1. Treat - 2. Quantity - 3. Description - * 1. Albatross! - 2. 299 - 3. On a stick! - * 1. 
Crunchy Frog! - 2. 1499 - 3. If we took the bones out, it wouldn't be crunchy, - now would it? - - That provides better correspondence between cells in the same - column than does bullet-list syntax, but not as good as field list - syntax. I think that were only field-list-tables available, a lot - of users would use the equivalent degenerate case:: - - .. field-list-table:: - - :1: Treat - :2: Quantity - :3: Description - ... - -4. Another natural variant is to allow a description list with field - lists as descriptions:: - - .. list-table:: - :headrows: 1 - - Treat - :quantity: Quantity - :descr: Description - Albatross! - :quantity: 299 - :descr: On a stick! - Crunchy Frog! - :quantity: 1499 - :descr: If we took the bones out, it wouldn't be - crunchy, now would it? - - This would make the whole first column a header column ("stub"). - It's limited to a single column and a single paragraph fitting on - one source line. Also it wouldn't allow for empty cells or row - spans in the first column. But these are limitations that we could - live with, like those of simple tables. - -The List-driven table feature could be done in many ways. Each user -will have their preferred usage. Perhaps a single "list-table" -directive could handle them all, depending on which options and -content are present. - -Issues: - -* How to indicate that there's 1 header row? Perhaps two lists? :: - - .. list-table:: - - + - Treat - - Quantity - - Description - - * - Albatross! - - 299 - - On a stick! - - This is probably too subtle though. Better would be a directive - option, like ``:headrows: 1``. An early suggestion for the header - row(s) was to use a directive option:: - - .. field-list-table:: - :header: - - :treat: Treat - :quantity: Quantity - :descr: Description - - :treat: Albatross! - :quantity: 299 - :descr: On a stick! - - But the table data is at two levels and looks inconsistent. 
- - In general, we cannot extract the header row from field lists' field - names because field names cannot contain everything one might put in - a table cell. A separate header row also allows shorter field names - and doesn't force one to rewrite the whole table when the header - text changes. But for simpler cases, we can offer a ":header: - fields" option, which does extract header cells from field names:: - - .. field-list-table:: - :header: fields - - - :Treat: Albatross! - :Quantity: 299 - :Description: On a stick! - -* How to indicate the column widths? A directive option? :: - - .. list-table:: - :widths: 15 10 35 - - Automatic defaults from the text used? - -* How to handle row and/or column spans? - - In a field list, column-spans can be indicated by specifying the - first and last fields, separated by space-dash-space or ellipsis:: - - - :foo - baz: quuux - - :foo ... baz: quuux - - Commas were proposed for column spans:: - - - :foo, bar: quux - - But non-adjacent columns become problematic. Should we report an - error, or duplicate the value into each span of adjacent columns (as - was suggested)? The latter suggestion is appealing but may be too - clever. Best perhaps to simply specify the two ends. - - It was suggested that comma syntax should be allowed, too, in order - to allow the user to avoid trouble when changing the column order. - But changing the column order of a table with spans is not trivial; - we shouldn't make it easier to mess up. - - One possible syntax for row-spans is to simply treat any row where a - field is missing as a row-span from the last row where it appeared. - Leaving a field empty would still be possible by writing a field - with empty content. But this is too implicit. - - Another way would be to require an explicit continuation marker - (``...``/``-"-``/``"``?) in all but the first row of a spanned - field. Empty comments could work (".."). 
If implemented, the same - marker could also be supported in simple tables, which lack - row-spanning abilities. - - Explicit markup like ":rowspan:" and ":colspan:" was also suggested. - - Sometimes in a table, the first header row contains spans. It may - be necessary to provide a way to specify the column field names - independently of data rows. A directive option would do it. - -* We could specify "column-wise" or "row-wise" ordering, with the same - markup structure. For example, with definition data:: - - .. list-table:: - :column-wise: - - Treat - - Albatross! - - Crunchy Frog! - Quantity - - 299 - - 1499 - Description - - On a stick! - - If we took the bones out, it wouldn't be - crunchy, now would it? - -* A syntax for _`stubs in grid tables` is easy to imagine:: - - +------------------------++------------+----------+ - | Header row, column 1 || Header 2 | Header 3 | - +========================++============+==========+ - | body row 1, column 1 || column 2 | column 3 | - +------------------------++------------+----------+ - - Or this idea from Nick Moffitt:: - - +-----+---+---+ - | XOR # T | F | - +=====+===+===+ - | T # F | T | - +-----+---+---+ - | F # T | F | - +-----+---+---+ - - -Auto-Enumerated Lists -===================== - -Implemented 2005-03-24: combination of variation 1 & 2. - -The advantage of auto-numbered enumerated lists would be similar to -that of auto-numbered footnotes: lists could be written and rearranged -without having to manually renumber them. The disadvantages are also -the same: input and output wouldn't match exactly; the markup may be -ugly or confusing (depending on which alternative is chosen). - -1. Use the "#" symbol. Example:: - - #. Item 1. - #. Item 2. - #. Item 3. - - Advantages: simple, explicit. Disadvantage: enumeration sequence - cannot be specified (limited to arabic numerals); ugly. - -2. As a variation on #1, first initialize the enumeration sequence? - For example:: - - a) Item a. - #) Item b. - #) Item c. 
- - Advantages: simple, explicit, any enumeration sequence possible. - Disadvantages: ugly; perhaps confusing with mixed concrete/abstract - enumerators. - -3. Alternative suggested by Fred Bremmer, from experience with MoinMoin:: - - 1. Item 1. - 1. Item 2. - 1. Item 3. - - Advantages: enumeration sequence is explicit (could be multiple - "a." or "(I)" tokens). Disadvantages: perhaps confusing; otherwise - erroneous input (e.g., a duplicate item "1.") would pass silently, - either causing a problem later in the list (if no blank lines - between items) or creating two lists (with blanks). - - Take this input for example:: - - 1. Item 1. - - 1. Unintentional duplicate of item 1. - - 2. Item 2. - - Currently the parser will produce two lists, "1" and "1,2" (no - warnings, because of the presence of blank lines). Using Fred's - notation, the current behavior is "1,1,2 -> 1 1,2" (without blank - lines between items, it would be "1,1,2 -> 1 [WARNING] 1,2"). What - should the behavior be with auto-numbering? - - Fred has produced a patch__, whose initial behavior is as follows:: - - 1,1,1 -> 1,2,3 - 1,2,2 -> 1,2,3 - 3,3,3 -> 3,4,5 - 1,2,2,3 -> 1,2,3 [WARNING] 3 - 1,1,2 -> 1,2 [WARNING] 2 - - (After the "[WARNING]", the "3" would begin a new list.) - - I have mixed feelings about adding this functionality to the spec & - parser. It would certainly be useful to some users (myself - included; I often have to renumber lists). Perhaps it's too - clever, asking the parser to guess too much. What if you *do* want - three one-item lists in a row, each beginning with "1."? You'd - have to use empty comments to force breaks. Also, I question - whether "1,2,2 -> 1,2,3" is optimal behavior. - - In response, Fred came up with "a stricter and more explicit rule - [which] would be to only auto-number silently if *all* the - enumerators of a list were identical". 
In that case:: - - 1,1,1 -> 1,2,3 - 1,2,2 -> 1,2 [WARNING] 2 - 3,3,3 -> 3,4,5 - 1,2,2,3 -> 1,2 [WARNING] 2,3 - 1,1,2 -> 1,2 [WARNING] 2 - - Should any start-value be allowed ("3,3,3"), or should - auto-numbered lists be limited to begin with ordinal-1 ("1", "A", - "a", "I", or "i")? - - __ http://sourceforge.net/tracker/index.php?func=detail&aid=548802 - &group_id=38414&atid=422032 - -4. Alternative proposed by Tony Ibbs:: - - #1. First item. - #3. Aha - I edited this in later. - #2. Second item. - - The initial proposal required unique enumerators within a list, but - this limits the convenience of a feature of already limited - applicability and convenience. Not a useful requirement; dropped. - - Instead, simply prepend a "#" to a standard list enumerator to - indicate auto-enumeration. The numbers (or letters) of the - enumerators themselves are not significant, except: - - - as a sequence indicator (arabic, roman, alphabetic; upper/lower), - - - and perhaps as a start value (first list item). - - Advantages: explicit, any enumeration sequence possible. - Disadvantages: a bit ugly. - - ------------------ - Not Implemented ------------------ - -Reworking Footnotes -=================== - -As a further wrinkle (see `Reworking Explicit Markup (Round 1)`_ -above), in the wee hours of 2002-02-28 I posted several ideas for -changes to footnote syntax: - - - Change footnote syntax from ``.. [1]`` to ``_[1]``? ... - - Differentiate (with new DTD elements) author-date "citations" - (``[GVR2002]``) from numbered footnotes? ... - - Render footnote references as superscripts without "[]"? ... - -These ideas are all related, and suggest changes in the -reStructuredText syntax as well as the docutils tree model. - -The footnote has been used for both true footnotes (asides expanding -on points or defining terms) and for citations (references to external -works). 
Rather than dealing with one amalgam construct, we could -separate the current footnote concept into strict footnotes and -citations. Citations could be interpreted and treated differently -from footnotes. Footnotes would be limited to numerical labels: -manual ("1") and auto-numbered (anonymous "#", named "#label"). - -The footnote is the only explicit markup construct (starts with ".. ") -that directly translates to a visible body element. I've always been -a little bit uncomfortable with the ".. " marker for footnotes because -of this; ".. " has a connotation of "special", but footnotes aren't -especially "special". Printed texts often put footnotes at the bottom -of the page where the reference occurs (thus "foot note"). Some HTML -designs would leave footnotes to be rendered the same positions where -they're defined. Other online and printed designs will gather -footnotes into a section near the end of the document, converting them -to "endnotes" (perhaps using a directive in our case); but this -"special processing" is not an intrinsic property of the footnote -itself, but a decision made by the document author or processing -system. - -Citations are almost invariably collected in a section at the end of a -document or section. Citations "disappear" from where they are -defined and are magically reinserted at some well-defined point. -There's more of a connection to the "special" connotation of the ".. " -syntax. The point at which the list of citations is inserted could be -defined manually by a directive (e.g., ".. citations::"), and/or have -default behavior (e.g., a section automatically inserted at the end of -the document) that might be influenced by options to the Writer. - -Syntax proposals: - -+ Footnotes: - - - Current syntax:: - - .. [1] Footnote 1 - .. [#] Auto-numbered footnote. - .. [#label] Auto-labeled footnote. - - - The syntax proposed in the original 2002-02-28 Doc-SIG post: - remove the ".. 
", prefix a "_":: - - _[1] Footnote 1 - _[#] Auto-numbered footnote. - _[#label] Auto-labeled footnote. - - The leading underscore syntax (earlier dropped because - ``.. _[1]:`` was too verbose) is a useful reminder that footnotes - are hyperlink targets. - - - Minimal syntax: remove the ".. [" and "]", prefix a "_", and - suffix a ".":: - - _1. Footnote 1. - _#. Auto-numbered footnote. - _#label. Auto-labeled footnote. - - ``_1.``, ``_#.``, and ``_#label.`` are markers, - like list markers. - - Footnotes could be rendered something like this in HTML - - | 1. This is a footnote. The brackets could be dropped - | from the label, and a vertical bar could set them - | off from the rest of the document in the HTML. - - Two-way hyperlinks on the footnote marker ("1." above) would also - help to differentiate footnotes from enumerated lists. - - If converted to endnotes (by a directive/transform), a horizontal - half-line might be used instead. Page-oriented output formats - would typically use the horizontal line for true footnotes. - -+ Footnote references: - - - Current syntax:: - - [1]_, [#]_, [#label]_ - - - Minimal syntax to match the minimal footnote syntax above:: - - 1_, #_, #label_ - - As a consequence, pure-numeric hyperlink references would not be - possible; they'd be interpreted as footnote references. - -+ Citation references: no change is proposed from the current footnote - reference syntax:: - - [GVR2001]_ - -+ Citations: - - - Current syntax (footnote syntax):: - - .. [GVR2001] Python Documentation; van Rossum, Drake, et al.; - http://www.python.org/doc/ - - - Possible new syntax:: - - _[GVR2001] Python Documentation; van Rossum, Drake, et al.; - http://www.python.org/doc/ - - _[DJG2002] - Docutils: Python Documentation Utilities project; Goodger - et al.; http://docutils.sourceforge.net/ - - Without the ".. 
" marker, subsequent lines would either have to - align as in one of the above, or we'd have to allow loose - alignment (I'd rather not):: - - _[GVR2001] Python Documentation; van Rossum, Drake, et al.; - http://www.python.org/doc/ - -I proposed adopting the "minimal" syntax for footnotes and footnote -references, and adding citations and citation references to -reStructuredText's repertoire. The current footnote syntax for -citations is better than the alternatives given. - -From a reply by Tony Ibbs on 2002-03-01: - - However, I think easier with examples, so let's create one:: - - Fans of Terry Pratchett are perhaps more likely to use - footnotes [1]_ in their own writings than other people - [2]_. Of course, in *general*, one only sees footnotes - in academic or technical writing - it's use in fiction - and letter writing is not normally considered good - style [4]_, particularly in emails (not a medium that - lends itself to footnotes). - - .. [1] That is, little bits of referenced text at the - bottom of the page. - .. [2] Because Terry himself does, of course [3]_. - .. [3] Although he has the distinction of being - *funny* when he does it, and his fans don't always - achieve that aim. - .. [4] Presumably because it detracts from linear - reading of the text - this is, of course, the point. - - and look at it with the second syntax proposal:: - - Fans of Terry Pratchett are perhaps more likely to use - footnotes [1]_ in their own writings than other people - [2]_. Of course, in *general*, one only sees footnotes - in academic or technical writing - it's use in fiction - and letter writing is not normally considered good - style [4]_, particularly in emails (not a medium that - lends itself to footnotes). - - _[1] That is, little bits of referenced text at the - bottom of the page. - _[2] Because Terry himself does, of course [3]_. - _[3] Although he has the distinction of being - *funny* when he does it, and his fans don't always - achieve that aim. 
- _[4] Presumably because it detracts from linear - reading of the text - this is, of course, the point. - - (I note here that if I have gotten the indentation of the - footnotes themselves correct, this is clearly not as nice. And if - the indentation should be to the left margin instead, I like that - even less). - - and the third (new) proposal:: - - Fans of Terry Pratchett are perhaps more likely to use - footnotes 1_ in their own writings than other people - 2_. Of course, in *general*, one only sees footnotes - in academic or technical writing - it's use in fiction - and letter writing is not normally considered good - style 4_, particularly in emails (not a medium that - lends itself to footnotes). - - _1. That is, little bits of referenced text at the - bottom of the page. - _2. Because Terry himself does, of course 3_. - _3. Although he has the distinction of being - *funny* when he does it, and his fans don't always - achieve that aim. - _4. Presumably because it detracts from linear - reading of the text - this is, of course, the point. - - I think I don't, in practice, mind the targets too much (the use - of a dot after the number helps a lot here), but I do have a - problem with the body text, in that I don't naturally separate out - the footnotes as different than the rest of the text - instead I - keep wondering why there are numbers interspered in the text. The - use of brackets around the numbers ([ and ]) made me somehow parse - the footnote references as "odd" - i.e., not part of the body text - - and thus both easier to skip, and also (paradoxically) easier to - pick out so that I could follow them. - - Thus, for the moment (and as always susceptable to argument), I'd - say -1 on the new form of footnote reference (i.e., I much prefer - the existing ``[1]_`` over the proposed ``1_``), and ambivalent - over the proposed target change. 
- - That leaves David's problem of wanting to distinguish footnotes - and citations - and the only thing I can propose there is that - footnotes are numeric or # and citations are not (which, as a - human being, I can probably cope with!). - -From a reply by Paul Moore on 2002-03-01: - - I think the current footnote syntax ``[1]_`` is *exactly* the - right balance of distinctness vs unobtrusiveness. I very - definitely don't think this should change. - - On the target change, it doesn't matter much to me. - -From a further reply by Tony Ibbs on 2002-03-01, referring to the -"[1]" form and actual usage in email: - - Clearly this is a form people are used to, and thus we should - consider it strongly (in the same way that the usage of ``*..*`` - to mean emphasis was taken partly from email practise). - - Equally clearly, there is something "magical" for people in the - use of a similar form (i.e., ``[1]``) for both footnote reference - and footnote target - it seems natural to keep them similar. - - ... - - I think that this established plaintext usage leads me to strongly - believe we should retain square brackets at both ends of a - footnote. The markup of the reference end (a single trailing - underscore) seems about as minimal as we can get away with. The - markup of the target end depends on how one envisages the thing - - if ".." means "I am a target" (as I tend to see it), then that's - good, but one can also argue that the "_[1]" syntax has a neat - symmetry with the footnote reference itself, if one wishes (in - which case ".." presumably means "hidden/special" as David seems - to think, which is why one needs a ".." *and* a leading underline - for hyperlink targets. - -Given the persuading arguments voiced, we'll leave footnote & footnote -reference syntax alone. Except that these discussions gave rise to -the "auto-symbol footnote" concept, which has been added. Citations -and citation references have also been added. 
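Tony Ibbs's rule of thumb from the discussion above (footnote labels are numeric or begin with "#", citation labels are not) can be sketched in Python. This is only an illustration: the function name ``classify_label`` and both regular expressions are assumptions made for this sketch, not the Docutils parser's actual patterns. The auto-symbol label "*" is included on the assumption that it follows the "auto-symbol footnote" concept mentioned above.

```python
import re

# Hypothetical patterns, for illustration only; not taken from Docutils.
# Footnote labels: manual ("1"), anonymous auto-numbered ("#"),
# named auto-numbered ("#label"), or auto-symbol ("*").
FOOTNOTE_LABEL = re.compile(r"\d+|#|#[A-Za-z0-9][A-Za-z0-9_.-]*|\*")
# Citation labels: start with a letter, e.g. "GVR2001".
CITATION_LABEL = re.compile(r"[A-Za-z][A-Za-z0-9_.-]*")


def classify_label(label):
    """Return 'footnote', 'citation', or 'unknown' for a bracketed label."""
    if FOOTNOTE_LABEL.fullmatch(label):
        return "footnote"
    if CITATION_LABEL.fullmatch(label):
        return "citation"
    return "unknown"
```

Under these assumed patterns, ``classify_label("1")`` and ``classify_label("#label")`` are classified as footnotes, while ``classify_label("GVR2001")`` is classified as a citation.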
- - -Syntax for Questions & Answers -============================== - -Implement as a generic two-column marked list? As a standalone -(non-directive) construct? (Is the markup ambiguous?) Add support to -parts.contents? - -New elements would be required. Perhaps:: - - <!ELEMENT question_list (question_list_item+)> - <!ATTLIST question_list - numbering (none | local | global) - #IMPLIED - start NUMBER #IMPLIED> - <!ELEMENT question_list_item (question, answer*)> - <!ELEMENT question %text.model;> - <!ELEMENT answer (%body.elements;)+> - -Originally I thought of implementing a Q&A list with special syntax:: - - Q: What am I? - - A: You are a question-and-answer - list. - - Q: What are you? - - A: I am the omniscient "we". - -Where each "Q" and "A" could also be numbered (e.g., "Q1"). However, -a simple enumerated or bulleted list will do just fine for syntax. A -directive could treat the list specially; e.g. the first paragraph -could be treated as a question, the remainder as the answer (multiple -answers could be represented by nested lists). Without special -syntax, this directive becomes low priority. - -As described in the FAQ__, no special syntax or directive is needed -for this application. - -__ http://docutils.sf.net/FAQ.html - #how-can-i-mark-up-a-faq-or-other-list-of-questions-answers - - --------- - Tabled --------- - -Reworking Explicit Markup (Round 2) -=================================== - -See `Reworking Explicit Markup (Round 1)`_ for an earlier discussion. - -In April 2004, a new thread began on docutils-develop: `Inconsistency -in RST markup`__. Several arguments were made; the first argument -begat later arguments. Below, the arguments are paraphrased "in -quotes", with responses. - -__ http://thread.gmane.org/gmane.text.docutils.devel/1386 - -1. References and targets take this form:: - - targetname_ - - .. _targetname: stuff - - But footnotes, "which generate links just like targets do", are - written as:: - - [1]_ - - .. 
[1] stuff - - "Footnotes should be written as":: - - [1]_ - - .. _[1]: stuff - - But they're not the same type of animal. That's not a "footnote - target", it's a *footnote*. Being a target is not a footnote's - primary purpose (an arguable point). It just happens to grow a - target automatically, for convenience. Just as a section title:: - - Title - ===== - - isn't a "title target", it's a *title*, which happens to grow a - target automatically. The consistency is there, it's just deeper - than at first glance. - - Also, ".. [1]" was chosen for footnote syntax because it closely - resembles one form of actual footnote rendering. ".. _[1]:" is too - verbose; excessive punctuation is required to get the job done. - - For more of the reasoning behind the syntax, see `Problems With - StructuredText (Hyperlinks) <problems.html#hyperlinks>`__ and - `Reworking Footnotes`_. - -2. "I expect directives to also look like ``.. this:`` [one colon] - because that also closely parallels the link and footnote target - markup." - - There are good reasons for the two-colon syntax: - - Two colons are used after the directive type for these reasons: - - - Two colons are distinctive, and unlikely to be used in common - text. - - - Two colons avoids clashes with common comment text like:: - - .. Danger: modify at your own risk! - - - If an implementation of reStructuredText does not recognize a - directive (i.e., the directive-handler is not installed), a - level-3 (error) system message is generated, and the entire - directive block (including the directive itself) will be - included as a literal block. Thus "::" is a natural choice. - - -- `restructuredtext.html#directives - <../../ref/rst/restructuredtext.html#directives>`__ - - The last reason is not particularly compelling; it's more of a - convenient coincidence or mnemonic. - -3. "Comments always seemed too easy. I almost never write comments. - I'd have no problem writing '.. comment:' in front of my comments. 
- In fact, it would probably be more readable, as comments *should* - be set off strongly, because they are very different from normal - text." - - Many people do use comments though, and some applications of - reStructuredText require it. For example, all reStructuredText - PEPs (and this document!) have an Emacs stanza at the bottom, in a - comment. Having to write ".. comment::" would be very obtrusive. - - Comments *should* be dirt-easy to do. It should be easy to - "comment out" a block of text. Comments in programming languages - and other markup languages are invariably easy. - - Any author is welcome to preface their comments with "Comment:" or - "Do Not Print" or "Note to Editor" or anything they like. A - "comment" directive could easily be implemented. It might be - confused with admonition directives, like "note" and "caution" - though. In unrelated (and unpublished and unfinished) work, adding - a "comment" directive as a true document element was considered:: - - If structure is necessary, we could use a "comment" directive - (to avoid nonsensical DTD changes, the "comment" directive - could produce an untitled topic element). - -4. "One of the goals of reStructuredText is to be *readable* by people - who don't know it. This construction violates that: it is not at - all obvious to the uninitiated that text marked by '..' is a - comment. On the other hand, '.. comment:' would be totally - transparent." - - Totally transparent, perhaps, but also very obtrusive. Another of - `reStructuredText's goals`_ is to be unobtrusive, and - ".. comment::" would violate that. The goals of reStructuredText - are many, and they conflict. Determining the right set of goals - and finding solutions that best fit is done on a case-by-case - basis. - - Even readability has two aspects. Being readable without any - prior knowledge is one. Being as easily read in raw form as in - processed form is the other. ".." may not contribute to the former - aspect, but ".. 
comment::" would certainly detract from the latter. - - .. _author's note: - .. _reStructuredText's goals: ../../ref/rst/introduction.html#goals - -5. "Recently I sent someone an rst document, and they got confused; I - had to explain to them that '..' marks comments, *unless* it's a - directive, etc..." - - The explanation of directives *is* roundabout, defining comments in - terms of not being other things. That's definitely a wart. - -6. "Under the current system, a mistyped directive (with ':' instead - of '::') will be silently ignored. This is an error that could - easily go unnoticed." - - A parser option/setting like "--comments-on-stderr" would help. - -7. "I'd prefer to see double-dot-space / command / double-colon as the - standard Docutils markup-marker. It's unusual enough to avoid - being accidently used. Everything that starts with a double-dot - should end with a double-colon." - - That would increase the punctuation verbosity of some constructs - considerably. - -8. Edward Loper proposed the following plan for backwards - compatibility: - - 1. ".. foo" will generate a deprecation warning to stderr, and - nothing in the output (no system messages). - 2. ".. foo: bar" will be treated as a directive foo. If there - is no foo directive, then do the normal error output. - 3. ".. foo:: bar" will generate a deprecation warning to - stderr, and be treated as a directive. Or leave it valid? - - So some existing documents might start printing deprecation - warnings, but the only existing documents that would *break* - would be ones that say something like:: - - .. warning: this should be a comment - - instead of:: - - .. warning:: this should be a comment - - Here, we're trading fairly common a silent error (directive - falsely treated as a comment) for a fairly uncommon explicitly - flagged error (comment falsely treated as directive). To make - things even easier, we could add a sentence to the - unknown-directive error. 
Something like "If you intended to - create a comment, please use '.. comment:' instead". - -On one hand, I understand and sympathize with the points raised. On -the other hand, I think the current syntax strikes the right balance -(but I acknowledge a possible lack of objectivity). On the gripping -hand, the comment and directive syntax has become well established, so -even if it's a wart, it may be a wart we have to live with. - -Making any of these changes would cause a lot of breakage or at least -deprecation warnings. I'm not sure the benefit is worth the cost. - -For now, we'll treat this as an unresolved legacy issue. - - -------- - To Do -------- - -Nested Inline Markup -==================== - -These are collected notes on a long-discussed issue. The original -mailing list messages should be referred to for details. - -* In a 2001-10-31 discussion I wrote: - - Try, for example, `Ed Loper's 2001-03-21 post`_, which details - some rules for nested inline markup. I think the complexity is - prohibitive for the marginal benefit. (And if you can understand - that tree without going mad, you're a better man than I. ;-) - - Inline markup is already fragile. Allowing nested inline markup - would only be asking for trouble IMHO. If it proves absolutely - necessary, it can be added later. The rules for what can appear - inside what must be well thought out first though. - - .. _Ed Loper's 2001-03-21 post: - http://mail.python.org/pipermail/doc-sig/2001-March/001487.html - - -- http://mail.python.org/pipermail/doc-sig/2001-October/002354.html - -* In a 2001-11-09 Doc-SIG post, I wrote: - - The problem is that in the - what-you-see-is-more-or-less-what-you-get markup language that - is reStructuredText, the symbols used for inline markup ("*", - "**", "`", "``", etc.) may preclude nesting. - - I've rethought this position. Nested markup is not precluded, just - tricky. People and software parse "double and 'single' quotes" all - the time. 
Continuing, - - I've thought over how we might implement nested inline - markup. The first algorithm ("first identify the outer inline - markup as we do now, then recursively scan for nested inline - markup") won't work; counterexamples were given in my `last post - <http://mail.python.org/pipermail/doc-sig/2001-November/002363.html>`__. - - The second algorithm makes my head hurt:: - - while 1: - scan for start-string - if found: - push on stack - scan for start or end string - if new start string found: - recurse - elif matching end string found: - pop stack - elif non-matching end string found: - if its a markup error: - generate warning - elif the initial start-string was misinterpreted: - # e.g. in this case: ***strong** in emphasis* - restart with the other interpretation - # but it might be several layers back ... - ... - - This is similar to how the parser does section title - recognition, but sections are much more regular and - deterministic. - - Bottom line is, I don't think the benefits are worth the effort, - even if it is possible. I'm not going to try to write the code, - at least not now. If somebody codes up a consistent, working, - general solution, I'll be happy to consider it. - - -- http://mail.python.org/pipermail/doc-sig/2001-November/002388.html - -* In a `2003-05-06 Docutils-Users post`__ Paul Tremblay proposed a new - syntax to allow for easier nesting. It eventually evolved into - this:: - - :role:[inline text] - - The duplication with the existing interpreted text syntax is - problematic though. - - __ http://article.gmane.org/gmane.text.docutils.user/317 - -* Could the parser be extended to parse nested interpreted text? :: - - :emphasis:`Some emphasized text with :strong:`some more - emphasized text` in it and **perhaps** :reference:`a link`` - -* In a `2003-06-18 Docutils-Develop post`__, Mark Nodine reported on - his implementation of a form of nested inline markup in his - Perl-based parser (unpublished). 
He brought up some interesting - ideas. The implementation was flawed, however, by the change in - semantics required for backslash escapes. - - __ http://article.gmane.org/gmane.text.docutils.devel/795 - -* Docutils-develop threads between David Abrahams, David Goodger, and - Mark Nodine (beginning 2004-01-16__ and 2004-01-19__) hashed out - many of the details of a potentially successful implementation, as - described below. David Abrahams checked in code to the "nesting" - branch of CVS, awaiting thorough review. - - __ http://thread.gmane.org/gmane.text.docutils.devel/1102 - __ http://thread.gmane.org/gmane.text.docutils.devel/1125 - -It may be possible to accomplish nested inline markup in general with -a more powerful inline markup parser. There may be some issues, but -I'm not averse to the idea of nested inline markup in general. I just -don't have the time or inclination to write a new parser now. Of -course, a good patch would be welcome! - -I envisage something like this. Explicit-role interpreted text must -be nestable. Prefix-based is probably preferred, since suffix-based -will look like inline literals:: - - ``text`:role1:`:role2: - -But it can be disambiguated, so it ought to be left up to the author:: - - `\ `text`:role1:`:role2: - -In addition, other forms of inline markup may be nested if -unambiguous:: - - *emphasized ``literal`` and |substitution ref| and link_* - -IOW, the parser ought to be as permissive as possible. - - -Index Entries & Indexes -======================= - -Were I writing a book with an index, I guess I'd need two -different kinds of index targets: inline/implicit and -out-of-line/explicit. For example:: - - In this `paragraph`:index:, several words are being - `marked`:index: inline as implicit `index`:index: - entries. - - .. index:: markup - .. index:: syntax - - The explicit index directives above would refer to - this paragraph. It might also make sense to allow multiple - entries in an ``index`` directive: - - .. 
index:: - markup - syntax - -The words "paragraph", "marked", and "index" would become index -entries pointing at the words in the first paragraph. The index -entry words appear verbatim in the text. (Don't worry about the -ugly ":index:" part; if indexing is the only/main application of -interpreted text in your documents, it can be implicit and -omitted.) The two directives provide manual indexing, where the -index entry words ("markup" and "syntax") do not appear in the -main text. We could combine the two directives into one:: - - .. index:: markup; syntax - -Semicolons instead of commas because commas could *be* part of the -index target, like:: - - .. index:: van Rossum, Guido - -Another reason for index directives is because other inline markup -wouldn't be possible within inline index targets. - -Sometimes index entries have multiple levels. Given:: - - .. index:: statement syntax: expression statements - -In a hypothetical index, combined with other entries, it might -look like this:: - - statement syntax - expression statements ..... 56 - assignment ................ 57 - simple statements ......... 58 - compound statements ....... 60 - -Inline multi-level index targets could be done too. Perhaps -something like:: - - When dealing with `expression statements <statement syntax:>`, - we must remember ... - -The opposite sense could also be possible:: - - When dealing with `index entries <:multi-level>`, there are - many permutations to consider. - -Also "see / see also" index entries. - -Given:: - - Here's a paragraph. - - .. index:: paragraph - -(The "index" directive above actually targets the *preceding* -object.) The directive should produce something like this XML:: - - <paragraph> - <index_entry text="paragraph"/> - Here's a paragraph. - </paragraph> - -This kind of content model would also allow true inline -index-entries:: - - Here's a `paragraph`:index:. 
- -If the "index" role were the default for the application, it could be -dropped:: - - Here's a `paragraph`. - -Both of these would result in this XML:: - - <paragraph> - Here's a <index_entry>paragraph</index_entry>. - </paragraph> - - -from 2002-06-24 docutils-develop posts --------------------------------------- - - If all of your index entries will appear verbatim in the text, - this should be sufficient. If not (e.g., if you want "Van Rossum, - Guido" in the index but "Guido van Rossum" in the text), we'll - have to figure out a supplemental mechanism, perhaps using - substitutions. - -I've thought a bit more on this, and I came up with two possibilities: - -1. Using interpreted text, embed the index entry text within the - interpreted text:: - - ... by `Guido van Rossum [Van Rossum, Guido]` ... - - The problem with this is obvious: the text becomes cluttered and - hard to read. The processed output would drop the text in - brackets, which goes against the spirit of interpreted text. - -2. Use substitutions:: - - ... by |Guido van Rossum| ... - - .. |Guido van Rossum| index:: Van Rossum, Guido - - A problem with this is that each substitution definition must have - a unique name. A subsequent ``.. |Guido van Rossum| index:: BDFL`` - would be illegal. Some kind of anonymous substitution definition - mechanism would be required, but I think that's going too far. - -Both of these alternatives are flawed. Any other ideas? - - -------------------- - ... Or Not To Do? -------------------- - -This is the realm of the possible but questionably probable. These -ideas are kept here as a record of what has been proposed, for -posterity and in case any of them prove to be useful. - - -Compound Enumerated Lists -========================= - -Allow for compound enumerators, such as "1.1." or "1.a." or "1(a)", to -allow for nested enumerated lists without indentation? 
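To get a feel for what recognizing such compound enumerators might involve, here is a minimal Python sketch. The pattern and the ``split_compound`` helper are hypothetical illustrations covering only the three forms mentioned above ("1.1.", "1.a.", "1(a)"); nothing like this exists in the Docutils parser.

```python
import re

# Hypothetical pattern: an outer number, then a sub-enumerator written
# either as ".x." or as "(x)", followed by whitespace or end of line.
COMPOUND = re.compile(r"^(\d+)(?:\.(\w+)\.|\((\w+)\))(?:\s|$)")


def split_compound(line):
    """Return (outer, inner) enumerator parts, or None if no match."""
    match = COMPOUND.match(line)
    if match is None:
        return None
    return (match.group(1), match.group(2) or match.group(3))
```

A plain enumerator such as "1. text" is deliberately not matched, so ordinary enumerated lists would be unaffected by this sketch.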
- - -Indented Lists -============== - -Allow for variant styles by interpreting indented lists as if they -weren't indented? For example, currently the list below will be -parsed as a list within a block quote:: - - paragraph - - * list item 1 - * list item 2 - -But a lot of people seem to write that way, and HTML browsers make it -look as if that's the way it should be. The parser could check the -contents of block quotes, and if they contain only a single list, -remove the block quote wrapper. There would be two problems: - -1. What if we actually *do* want a list inside a block quote? - -2. What if such a list comes immediately after an indented construct, - such as a literal block? - -Both could be solved using empty comments (problem 2 already exists -for a block quote after a literal block). But that's a hack. - -Perhaps a runtime setting, allowing or disabling this convenience, -would be appropriate. But that raises issues too: - - User A, who writes lists indented (and their config file is set up - to allow it), sends a file to user B, who doesn't (and their - config file disables indented lists). The result of processing by - the two users will be different. - -It may seem minor, but it adds ambiguity to the parser, which is bad. - -See the `Doc-SIG discussion starting 2001-04-18`__ with Ed Loper's -"Structuring: a summary; and an attempt at EBNF", item 4 (and -follow-ups, here__ and here__). Also `docutils-users, 2003-02-17`__ -and `beginning 2003-08-04`__. - -__ http://mail.python.org/pipermail/doc-sig/2001-April/001776.html -__ http://mail.python.org/pipermail/doc-sig/2001-April/001789.html -__ http://mail.python.org/pipermail/doc-sig/2001-April/001793.html -__ http://sourceforge.net/mailarchive/message.php?msg_id=3838913 -__ http://sf.net/mailarchive/forum.php?thread_id=2957175&forum_id=11444 - - -Sloppy Indentation of List Items -================================ - -Perhaps the indentation shouldn't be so strict. Currently, this is -required:: - - 1. 
First line, - second line. - -Anything wrong with this? :: - - 1. First line, - second line. - -Problem? :: - - 1. First para. - - Block quote. (no good: requires some indent relative to first - para) - - Second Para. - - 2. Have to carefully define where the literal block ends:: - - Literal block - - Literal block? - -Hmm... Non-strict indentation isn't such a good idea. - - -Lazy Indentation of List Items -============================== - -Another approach: Going back to the first draft of reStructuredText -(2000-11-27 post to Doc-SIG):: - - - This is the fourth item of the main list (no blank line above). - The second line of this item is not indented relative to the - bullet, which precludes it from having a second paragraph. - -Change that to *require* a blank line above and below, to reduce -ambiguity. This "loosening" may be added later, once the parser's -been nailed down. However, a serious drawback of this approach is to -limit the content of each list item to a single paragraph. - - -David's Idea for Lazy Indentation ---------------------------------- - -Consider a paragraph in a word processor. It is a single logical line -of text which ends with a newline, soft-wrapped arbitrarily at the -right edge of the page or screen. We can think of a plaintext -paragraph in the same way, as a single logical line of text, ending -with two newlines (a blank line) instead of one, and which may contain -arbitrary line breaks (newlines) where it was accidentally -hard-wrapped by an application. We can compensate for the accidental -hard-wrapping by "unwrapping" every unindented second and subsequent -line. The indentation of the first line of a paragraph or list item -would determine the indentation for the entire element. Blank lines -would be required between list items when using lazy indentation. - -The following example shows the lazy indentation of multiple body -elements:: - - - This is the first paragraph - of the first list item. 
- - Here is the second paragraph - of the first list item. - - - This is the first paragraph - of the second list item. - - Here is the second paragraph - of the second list item. - -A more complex example shows the limitations of lazy indentation:: - - - This is the first paragraph - of the first list item. - - Next is a definition list item: - - Term - Definition. The indentation of the term is - required, as is the indentation of the definition's - first line. - - When the definition extends to more than - one line, lazy indentation may occur. (This is the second - paragraph of the definition.) - - - This is the first paragraph - of the second list item. - - - Here is the first paragraph of - the first item of a nested list. - - So this paragraph would be outside of the nested list, - but inside the second list item of the outer list. - - But this paragraph is not part of the list at all. - -And the ambiguity remains:: - - - Look at the hyphen at the beginning of the next line - - is it a second list item marker, or a dash in the text? - - Similarly, we may want to refer to numbers inside enumerated - lists: - - 1. How many socks in a pair? There are - 2. How many pants in a pair? Exactly - 1. Go figure. - -Literal blocks and block quotes would still require consistent -indentation for all their lines. For block quotes, we might be able -to get away with only requiring that the first line of each contained -element be indented. For example:: - - Here's a paragraph. - - This is a paragraph inside a block quote. - Second and subsequent lines need not be indented at all. - - - A bullet list inside - the block quote. - - Second paragraph of the - bullet list inside the block quote. - -Although feasible, this form of lazy indentation has problems. The -document structure and hierarchy is not obvious from the indentation, -making the source plaintext difficult to read. This will also make -keeping track of the indentation while writing difficult and -error-prone. 
However, these problems may be acceptable for Wikis and -email mode, where we may be able to rely on less complex structure -(few nested lists, for example). - - -Multiple Roles in Interpreted Text -================================== - -In reStructuredText, inline markup cannot be nested (yet; `see -above`__). This also applies to interpreted text. In order to -simultaneously combine multiple roles for a single piece of text, a -syntax extension would be necessary. Ideas: - -1. Initial idea:: - - `interpreted text`:role1,role2: - -2. Suggested by Jason Diamond:: - - `interpreted text`:role1:role2: - -If a document is so complex as to require nested inline markup, -perhaps another markup system should be considered. By design, -reStructuredText does not have the flexibility of XML. - -__ `Nested Inline Markup`_ - - -Parameterized Interpreted Text -============================== - -In some cases it may be expedient to pass parameters to interpreted -text, analogous to function calls. Ideas: - -1. Parameterize the interpreted text role itself (suggested by Jason - Diamond):: - - `interpreted text`:role1(foo=bar): - - Positional parameters could also be supported:: - - `CSS`:acronym(Cascading Style Sheets): is used for HTML, and - `CSS`:acronym(Content Scrambling System): is used for DVDs. - - Technical problem: current interpreted text syntax does not - recognize roles containing whitespace. Design problem: this smells - like programming language syntax, but reStructuredText is not a - programming language. - -2. Put the parameters inside the interpreted text:: - - `CSS (Cascading Style Sheets)`:acronym: is used for HTML, and - `CSS (Content Scrambling System)`:acronym: is used for DVDs. - - Although this could be defined on an individual basis (per role), - we ought to have a standard. 
Hyperlinks with embedded URIs already - use angle brackets; perhaps they could be used here too:: - - `CSS <Cascading Style Sheets>`:acronym: is used for HTML, and - `CSS <Content Scrambling System>`:acronym: is used for DVDs. - - Do angle brackets connote URLs too much for this to be acceptable? - How about the "tag" connotation -- does it save them or doom them? - -3. `Nested inline markup`_ could prove useful here:: - - `CSS :def:`Cascading Style Sheets``:acronym: is used for HTML, - and `CSS :def:`Content Scrambling System``:acronym: is used for - DVDs. - - Inline markup roles could even define the default roles of nested - inline markup, allowing this cleaner syntax:: - - `CSS `Cascading Style Sheets``:acronym: is used for HTML, and - `CSS `Content Scrambling System``:acronym: is used for DVDs. - -Does this push inline markup too far? Readability becomes a serious -issue. Substitutions may provide a better alternative (at the expense -of verbosity and duplication) by pulling the details out of the text -flow:: - - |CSS| is used for HTML, and |CSS-DVD| is used for DVDs. - - .. |CSS| acronym:: Cascading Style Sheets - .. |CSS-DVD| acronym:: Content Scrambling System - :text: CSS - ----------------------------------------------------------------------- - -This whole idea may be going beyond the scope of reStructuredText. -Documents requiring this functionality may be better off using XML or -another markup system. - -This argument comes up regularly when pushing the envelope of -reStructuredText syntax. I think it's a useful argument in that it -provides a check on creeping featurism. In many cases, the resulting -verbosity produces such unreadable plaintext that there's a natural -desire *not* to use it unless absolutely necessary. It's a matter of -finding the right balance. - - -Syntax for Interpreted Text Role Bindings -========================================= - -The following syntax (idea from Jeffrey C. 
Jacobs) could be used to -associate directives with roles:: - - .. :rewrite: class:: rewrite - - `She wore ribbons in her hair and it lay with streaks of - grey`:rewrite: - -The syntax is similar to that of substitution declarations, and the -directive/role association may resolve implementation issues. The -semantics, ramifications, and implementation details would need to be -worked out. - -The example above would implement the "rewrite" role as adding a -``class="rewrite"`` attribute to the interpreted text ("inline" -element). The stylesheet would then pick up on the "class" attribute -to do the actual formatting. - -The advantage of the new syntax would be flexibility. Uses other than -"class" may present themselves. The disadvantage is complexity: -having to implement new syntax for a relatively specialized operation, -and having new semantics in existing directives ("class::" would do -something different). - -The `"role" directive`__ has been implemented. - -__ ../../ref/rst/directives.html#role - - -Character Processing -==================== - -Several people have suggested adding some form of character processing -to reStructuredText: - -* Some sort of automated replacement of ASCII sequences: - - - ``--`` to em-dash (or ``--`` to en-dash, and ``---`` to em-dash). - - Convert quotes to curly quote entities. (Essentially impossible - for HTML? Unnecessary for TeX.) - - Various forms of ``:-)`` to smiley icons. - - ``"\ "`` to &nbsp;. Problem with line-wrapping though: it could - end up escaping the newline. - - Escaped newlines to <BR>. - - Escaped period or quote or dash as a disappearing catalyst to - allow character-level inline markup? - -* XML-style character entities, such as "&copy;" for the copyright - symbol. - -Docutils has no need of a character entity subsystem. Because Docutils -supports Unicode and text encodings, character entities should be directly -represented in the text: a copyright symbol should be represented by -the copyright symbol character. 
If this is not possible in an -authoring environment, a pre-processing stage can be added, or a table -of substitution definitions can be devised. - -A "unicode" directive has been implemented to allow direct -specification of esoteric characters. In combination with the -substitution construct, "include" files defining common sets of -character entities can be defined and used. `A set of character -entity set definition files have been defined`__ (`tarball`__). -There's also `a description and instructions for use`__. - -__ http://docutils.sf.net/tmp/charents/ -__ http://docutils.sf.net/tmp/charents.tgz -__ http://docutils.sf.net/tmp/charents/README.html - -To allow for `character-level inline markup`_, a limited form of -character processing has been added to the spec and parser: escaped -whitespace characters are removed from the processed document. Any -further character processing will be of this functional type, rather -than of the character-encoding type. - -.. _character-level inline markup: - ../../ref/rst/restructuredtext.html#character-level-inline-markup - -* Directive idea:: - - .. text-replace:: "pattern" "replacement" - - - Support Unicode "U+XXXX" codes. - - Support regexps, perhaps with alternative "regexp-replace" - directive. - - Flags for regexps; ":flags:" option, or individuals. - - Specifically, should the default be case-sensitive or - -insensitive? - - -Page Or Line Breaks -=================== - -* Should ^L (or something else in reST) be defined to mean - force/suggest page breaks in whatever output we have? - - A "break" or "page-break" directive would be easy to add. A new - doctree element would be required though (perhaps "break"). The - final behavior would be up to the Writer. The directive argument - could be one of page/column/recto/verso for added flexibility. - - Currently ^L (Python's ``\f``) characters are treated as whitespace. - They're converted to single spaces, actually, as are vertical tabs - (^K, Python's ``\v``). 
It would be possible to recognize form feeds - as markup, but it requires some thought and discussion first. Are - there any downsides? Many editing environments do not allow the - insertion of control characters. Will it cause any harm? It would - be useful as a shorthand for the directive. - - It's common practice to use ^L before Emacs "Local Variables" - lists:: - - ^L - .. - Local Variables: - mode: indented-text - indent-tabs-mode: nil - sentence-end-double-space: t - fill-column: 70 - End: - - These are already present in many PEPs and Docutils project - documents. From the Emacs manual (info): - - A "local variables list" goes near the end of the file, in the - last page. (It is often best to put it on a page by itself.) - - It would be unfortunate if this construct caused a final blank page - to be generated (for those Writers that recognize the page breaks). - We'll have to add a transform that looks for a "break" plus zero or - more comments at the end of a document, and removes them. - - Probably a bad idea because there is no such thing as a page in a - generic document format. - -* Could the "break" concept above be extended to inline forms? - E.g. "^L" in the middle of a sentence could cause a line break. - Only recognize it at the end of a line (i.e., ``\f\n``)? - - Or is formfeed inappropriate? Perhaps vertical tab (``\v``), but - even that's a stretch. Can't use carriage returns, since they're - commonly used for line endings. - - Probably a bad idea as well because we do not want to use control - characters for well-readable and well-writable markup, and after all - we have the line block syntax for line breaks. - - -Superscript Markup -================== - -Add ``^superscript^`` inline markup? The only common non-markup uses -of "^" I can think of are as short hand for "superscript" itself and -for describing control characters ("^C to cancel"). 
The former -supports the proposed syntax, and it could be argued that the latter -ought to be literal text anyhow (e.g. "``^C`` to cancel"). - -However, superscripts are seldom needed, and new syntax would break -existing documents. When it's needed, the ``:superscript:`` -(``:sup:``) role can be used as well. - - -Code Execution -============== - -Add the following directives? - -- "exec": Execute Python code & insert the results. Call it - "python" to allow for other languages? - -- "system": Execute an ``os.system()`` call, and insert the results - (possibly as a literal block). Definitely dangerous! How to make - it safe? Perhaps such processing should be left outside of the - document, in the user's production system (a makefile or a script or - whatever). Or, the directive could be disabled by default and only - enabled with an explicit command-line option or config file setting. - Even then, an interactive prompt may be useful, such as: - - The file.txt document you are processing contains a "system" - directive requesting that the ``sudo rm -rf /`` command be - executed. Allow it to execute? (y/N) - -- "eval": Evaluate an expression & insert the text. At parse - time or at substitution time? Dangerous? Perhaps limit to canned - macros; see text.date_. - - .. _text.date: ../todo.html#text-date - -It's too dangerous (or too complicated in the case of "eval"). We do -not want to have such things in the core. - - -``encoding`` Directive -====================== - -Add an "encoding" directive to specify the character encoding of the -input data? Not a good idea for the following reasons: - -- When it sees the directive, the parser will already have read the - input data, and encoding determination will already have been done. - -- If a file with an "encoding" directive is edited and saved with - a different encoding, the directive may cause data corruption. 
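The second reason can be illustrated with a small Python sketch. The sample text and the variable names are invented for illustration, and no actual "encoding" directive is involved; the sketch only shows what happens when bytes saved in one encoding are decoded according to a stale declaration of another.

```python
# Assumed scenario: the document declares UTF-8, but an editor has
# re-saved the file as Latin-1 without updating the declaration.
original_text = "caf\u00e9"
saved_bytes = original_text.encode("latin-1")
declared_encoding = "utf-8"

# Lenient decoding silently damages the text: the accented character
# is replaced by U+FFFD (the Unicode replacement character).
lenient = saved_bytes.decode(declared_encoding, errors="replace")

# Strict decoding refuses the input instead of corrupting it.
try:
    saved_bytes.decode(declared_encoding)
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True
```

Either outcome is unwelcome, which is why encoding determination is better left to whatever reads the input before parsing starts.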
- - -Support for Annotations -======================= - -Add an "annotation" role, as the equivalent of the HTML "title" -attribute? This is secondary information that may "pop up" when the -pointer hovers over the main text. A corresponding directive would be -required to associate annotations with the original text (by name, or -positionally as in anonymous targets?). - -There have not been many requests for such a feature, though. Also, -cluttering WYSIWYG plaintext with annotations may not seem like a good -idea, and there is no "tool tip" in formats other than HTML. - - -``term`` Role -============= - -Add a "term" role for unfamiliar or specialized terminology? Probably -not; there is no real use case, and emphasis is enough for most cases. - - -.. - Local Variables: - mode: indented-text - indent-tabs-mode: nil - sentence-end-double-space: t - fill-column: 70 - End: diff --git a/docutils/docs/dev/rst/problems.txt b/docutils/docs/dev/rst/problems.txt deleted file mode 100644 index bc0101cbf..000000000 --- a/docutils/docs/dev/rst/problems.txt +++ /dev/null @@ -1,872 +0,0 @@ -============================== - Problems With StructuredText -============================== -:Author: David Goodger -:Contact: goodger@users.sourceforge.net -:Revision: $Revision$ -:Date: $Date$ -:Copyright: This document has been placed in the public domain. - -There are several problems, unresolved issues, and areas of -controversy within StructuredText_ (Classic and Next Generation). In -order to resolve all these issues, this analysis brings all of the -issues out into the open, enumerates all the alternatives, and -proposes solutions to be incorporated into the reStructuredText_ -specification. - - -.. contents:: - - -Formal Specification -==================== - -The description in the original StructuredText.py has been criticized -for being vague. For practical purposes, "the code *is* the spec." 
-Tony Ibbs has been working on deducing a `detailed description`_ from -the documentation and code of StructuredTextNG_. Edward Loper's -STMinus_ is another attempt to formalize a spec. - -For this kind of project, the specification should always precede -the code. Otherwise, the markup is a moving target which can never be -adopted as a standard. Of course, a specification may be revised -during the lifetime of the code, but without a spec there is no visible -control and thus no confidence. - - -Understanding and Extending the Code -==================================== - -The original StructuredText_ is a dense mass of sparsely commented -code and inscrutable regular expressions. It was not designed to be -extended and is very difficult to understand. StructuredTextNG_ has -been designed to allow input (syntax) and output extensions, but its -documentation (both internal [comments & docstrings], and external) is -inadequate for the complexity of the code itself. - -For reStructuredText to become truly useful, perhaps even part of -Python's standard library, it must have clear, understandable -documentation and implementation code. For the implementation of -reStructuredText to be taken seriously, it must be a sterling example -of the potential of docstrings; the implementation must practice what -the specification preaches. - - -Section Structure via Indentation -================================= - -Setext_ required that body text be indented by 2 spaces. The original -StructuredText_ and StructuredTextNG_ require that section structure -be indicated through indentation, as "inspired by Python". For -certain structures with a very limited, local extent (such as lists, -block quotes, and literal blocks), indentation naturally indicates -structure or hierarchy. For sections (which may have a very large -extent), structure via indentation is unnecessary, unnatural and -ambiguous. 
Rather, the syntax of the section title *itself* should -indicate that it is a section title. - -The original StructuredText states that "A single-line paragraph whose -immediately succeeding paragraphs are lower level is treated as a -header." Requiring indentation in this way is: - -- Unnecessary. The vast majority of docstrings and standalone - documents will have no more than one level of section structure. - Requiring indentation for such docstrings is unnecessary and - irritating. - -- Unnatural. Most published works use title style (type size, face, - weight, and position) and/or section/subsection numbering rather - than indentation to indicate hierarchy. This is a tradition with a - very long history. - -- Ambiguous. A StructuredText header is indistinguishable from a - one-line paragraph followed by a block quote (precluding the use of - block quotes). Enumerated section titles are ambiguous (is it a - header? is it a list item?). Some additional adornment must be - required to confirm the line's role as a title, both to a parser and - to the human reader of the source text. - -Python's use of significant whitespace is a wonderful (if not -original) innovation; however, requiring indentation in ordinary -written text is hypergeneralization. - -reStructuredText_ indicates section structure through title adornment -style (as exemplified by this document). This is far more natural. -In fact, it is already in widespread use in plain text documents, -including in Python's standard distribution (such as the toplevel -README_ file). - - -Character Escaping Mechanism -============================ - -No matter what characters are chosen for markup, some day someone will -want to write documentation *about* that markup or using markup -characters in a non-markup context. Therefore, any complete markup -language must have an escaping or encoding mechanism. For a -lightweight markup system, encoding mechanisms like SGML/XML's '&#42;' -are out. 
So an escaping mechanism is in. However, with carefully -chosen markup, it should be necessary to use the escaping mechanism -only infrequently. - -reStructuredText_ needs an escaping mechanism: a way to treat -markup-significant characters as the characters themselves. Currently -there is no such mechanism (although ZWiki uses '!'). What are the -candidates? - -1. ``!`` - (http://www.zope.org/DevHome/Members/jim/StructuredTextWiki/NGEscaping) -2. ``\`` -3. ``~`` -4. doubling of characters - -The best choice for this is the backslash (``\``). It's "the single -most popular escaping character in the world!", therefore familiar and -unsurprising. Since characters only need to be escaped under special -circumstances, which are typically those explaining technical -programming issues, the use of the backslash is natural and -understandable. Python docstrings can be raw (prefixed with an 'r', -as in 'r""'), which would obviate the need for gratuitous doubling-up -of backslashes. - -(On 2001-03-29 on the Doc-SIG mailing list, GvR endorsed backslash -escapes, saying, "'nuff said. Backslash it is." Although neither -legally binding nor irrevocable nor any kind of guarantee of anything, -it is a good sign.) - -The rule would be: An unescaped backslash followed by any markup -character escapes the character. The escaped character represents the -character itself, and is prevented from playing a role in any markup -interpretation. The backslash is removed from the output. A literal -backslash is represented by an "escaped backslash," two backslashes in -a row. - -A carefully constructed set of recognition rules for inline markup -will obviate the need for backslash-escapes in almost all cases; see -`Delimitation of Inline Markup`_ below. - -When an expression (requiring backslashes and other characters used -for markup) becomes too complicated and therefore unreadable, a -literal block may be used instead. 
Inside literal blocks, no markup -is recognized, therefore backslashes (for the purpose of escaping -markup) become unnecessary. - -We could allow backslashes preceding non-markup characters to remain -in the output. This would make describing regular expressions and -other uses of backslashes easier. However, this would complicate the -markup rules and would be confusing. - - -Blank Lines in Lists -==================== - -Oft-requested in Doc-SIG (the earliest reference is dated 1996-08-13) -is the ability to write lists without requiring blank lines between -items. In docstrings, space is at a premium. Authors want to convey -their API or usage information in as compact a form as possible. -StructuredText_ requires blank lines between all body elements, -including list items, even when boundaries are obvious from the markup -itself. - -In reStructuredText, blank lines are optional between list items. -However, in order to eliminate ambiguity, a blank line is required -before the first list item and after the last. Nested lists also -require blank lines before the list start and after the list end. - - -Bullet List Markup -================== - -StructuredText_ includes 'o' as a bullet character. This is dangerous -and counter to the language-independent nature of the markup. There -are many languages in which 'o' is a word. For example, in Spanish:: - - Llamame a la casa - o al trabajo. - - (Call me at home or at work.) - -And in Japanese (when romanized):: - - Senshuu no doyoubi ni tegami - o kakimashita. - - ([I] wrote a letter on Saturday last week.) - -If a paragraph containing an 'o' word wraps such that the 'o' is the -first text on a line, or if a paragraph begins with such a word, it -could be misinterpreted as a bullet list. - -In reStructuredText_, 'o' is not used as a bullet character. '-', -'*', and '+' are the possible bullet characters. 
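The bullet rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ``BULLET_ITEM`` pattern and the ``is_bullet_item`` helper are hypothetical names, the sketch looks at a single line in isolation, and it ignores the blank-line and indentation rules a real parser would also apply.

```python
import re

# One of the three bullet characters named above, followed by
# whitespace (or nothing at all, for an empty list item).
BULLET_ITEM = re.compile(r"^[-*+](?:\s+|$)")


def is_bullet_item(line):
    """Return True if the line starts with '-', '*' or '+' as a bullet."""
    return bool(BULLET_ITEM.match(line))
```

With this rule, a wrapped line such as "o al trabajo." stays ordinary paragraph text, because 'o' is simply not in the set of bullet characters.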
- - -Enumerated List Markup -====================== - -StructuredText enumerated lists are allowed to begin with numbers and -letters followed by a period or right-parenthesis, then whitespace. -This has surprising consequences for writing styles. For example, -this is recognized as an enumerated list item by StructuredText:: - - Mr. Creosote. - -People will write enumerated lists in all different ways. It is folly -to try to come up with the "perfect" format for an enumerated list, -and limit the docstring parser's recognition to that one format only. - -Rather, the parser should recognize a variety of enumerator styles. -It is also recommended that the enumerator of the first list item be -ordinal-1 ('1', 'A', 'a', 'I', or 'i'), as output formats may not be -able to begin a list at an arbitrary enumeration. - -An initial idea was to require two or more consistent enumerated list -items in a row. This idea proved impractical and was dropped. In -practice, the presence of a proper enumerator is enough to reliably -recognize an enumerated list item; any ambiguities are reported by the -parser. Here's the original idea for posterity: - - The parser should recognize a variety of enumerator styles, mark - each block as a potential enumerated list item (PELI), and - interpret the enumerators of adjacent PELIs to decide whether they - make up a consistent enumerated list. - - If a PELI is labeled with a "1.", and is immediately followed by a - PELI labeled with a "2.", we've got an enumerated list. Or "(A)" - followed by "(B)". Or "i)" followed by "ii)", etc. The chances - of accidentally recognizing two adjacent and consistently labeled - PELIs, are acceptably small. 
- - For an enumerated list to be recognized, the following must be - true: - - - the list must consist of multiple adjacent list items (2 or - more) - - the enumerators must all have the same format - - the enumerators must be sequential - - -Definition List Markup -====================== - -StructuredText uses ' -- ' (whitespace, two hyphens, whitespace) on -the first line of a paragraph to indicate a definition list item. The -' -- ' serves to separate the term (on the left) from the definition -(on the right). - -Many people use ' -- ' as an em-dash in their text, conflicting with -the StructuredText usage. Although the Chicago Manual of Style says -that spaces should not be used around an em-dash, Peter Funk pointed -out that this is standard usage in German (according to the Duden, the -official German reference), and possibly in other languages as well. -The widespread use of ' -- ' precludes its use for definition lists; -it would violate the "unsurprising" criterion. - -A simpler, and at least equally visually distinctive construct -(proposed by Guido van Rossum, who incidentally is a frequent user of -' -- ') would do just as well:: - - term 1 - Definition. - - term 2 - Definition 2, paragraph 1. - - Definition 2, paragraph 2. - -A reStructuredText definition list item consists of a term and a -definition. A term is a simple one-line paragraph. A definition is a -block indented relative to the term, and may contain multiple -paragraphs and other body elements. No blank line precedes a -definition (this distinguishes definition lists from block quotes). - - -Literal Blocks -============== - -The StructuredText_ specification has literal blocks indicated by -'example', 'examples', or '::' ending the preceding paragraph. STNG -only recognizes '::'; 'example'/'examples' are not implemented. This -is good; it fixes an unnecessary language dependency. The problem is -what to do with the sometimes- unwanted '::'. 
- -In reStructuredText_ '::' at the end of a paragraph indicates that -subsequent *indented* blocks are treated as literal text. No further -markup interpretation is done within literal blocks (not even -backslash-escapes). If the '::' is preceded by whitespace, '::' is -omitted from the output; if '::' was the sole content of a paragraph, -the entire paragraph is removed (no 'empty' paragraph remains). If -'::' is preceded by a non-whitespace character, '::' is replaced by -':' (i.e., the extra colon is removed). - -Thus, a section could begin with a literal block as follows:: - - Section Title - ------------- - - :: - - print "this is example literal" - - -Tables -====== - -The table markup scheme in classic StructuredText was horrible. Its -omission from StructuredTextNG is welcome, and its markup will not be -repeated here. However, tables themselves are useful in -documentation. Alternatives: - -1. This format is the most natural and obvious. It was independently - invented (no great feat of creation!), and later found to be the - format supported by the `Emacs table mode`_:: - - +------------+------------+------------+--------------+ - | Header 1 | Header 2 | Header 3 | Header 4 | - +============+============+============+==============+ - | Column 1 | Column 2 | Column 3 & 4 span (Row 1) | - +------------+------------+------------+--------------+ - | Column 1 & 2 span | Column 3 | - Column 4 | - +------------+------------+------------+ - Row 2 & 3 | - | 1 | 2 | 3 | - span | - +------------+------------+------------+--------------+ - - Tables are described with a visual outline made up of the - characters '-', '=', '|', and '+': - - - The hyphen ('-') is used for horizontal lines (row separators). - - The equals sign ('=') is optionally used as a header separator - (as of version 1.5.24, this is not supported by the Emacs table - mode). - - The vertical bar ('|') is used for vertical lines (column - separators). 
- - The plus sign ('+') is used for intersections of horizontal and - vertical lines. - - Row and column spans are possible simply by omitting the column or - row separators, respectively. The header row separator must be - complete; in other words, a header cell may not span into the table - body. Each cell contains body elements, and may have multiple - paragraphs, lists, etc. Initial spaces for a left margin are - allowed; the first line of text in a cell determines its left - margin. - -2. Below is a simpler table structure. It may be better suited to - manual input than alternative #1, but there is no Emacs editing - mode available. One disadvantage is that it resembles section - titles; a one-column table would look exactly like section & - subsection titles. :: - - ============ ============ ============ ============== - Header 1 Header 2 Header 3 Header 4 - ============ ============ ============ ============== - Column 1 Column 2 Column 3 & 4 span (Row 1) - ------------ ------------ --------------------------- - Column 1 & 2 span Column 3 - Column 4 - ------------------------- ------------ - Row 2 & 3 - 1 2 3 - span - ============ ============ ============ ============== - - The table begins with a top border of equals signs with a space at - each column boundary (regardless of spans). Each row is - underlined. Internal row separators are underlines of '-', with - spaces at column boundaries. The last of the optional head rows is - underlined with '=', again with spaces at column boundaries. - Column spans have no spaces in their underline. Row spans simply - lack an underline at the row boundary. The bottom boundary of the - table consists of '=' underlines. A blank line is required - following a table. - -3. A minimalist alternative is as follows:: - - ==== ===== ======== ======== ======= ==== ===== ===== - Old State Input Action New State Notes - ----------- -------- ----------------- ----------- - ids types new type sys.msg. 
dupname ids types - ==== ===== ======== ======== ======= ==== ===== ===== - -- -- explicit -- -- new True - -- -- implicit -- -- new False - None False explicit -- -- new True - old False explicit implicit old new True - None True explicit explicit new None True - old True explicit explicit new,old None True [1] - None False implicit implicit new None False - old False implicit implicit new,old None False - None True implicit implicit new None True - old True implicit implicit new old True - ==== ===== ======== ======== ======= ==== ===== ===== - - The table begins with a top border of equals signs with one or more - spaces at each column boundary (regardless of spans). There must - be at least two columns in the table (to differentiate it from - section headers). Each line starts a new row. The rightmost - column is unbounded; text may continue past the edge of the table. - Each row/line must contain spaces at column boundaries, except for - explicit column spans. Underlines of '-' can be used to indicate - column spans, but should be used sparingly if at all. Lines - containing column span underlines may not contain any other text. - The last of the optional head rows is underlined with '=', again - with spaces at column boundaries. The bottom boundary of the table - consists of '=' underlines. A blank line is required following a - table. - - This table sums up the features. Using all the features in such a - small space is not pretty though:: - - ======== ======== ======== - Header 2 & 3 Span - ------------------ - Header 1 Header 2 Header 3 - ======== ======== ======== - Each line is a new row. - Each row consists of one line only. - Row spans are not possible. - The last column may spill over to the right. - Column spans are possible with an underline joining columns. - ---------------------------- - The span is limited to the row above the underline. - ======== ======== ======== - -4. 
As a variation of alternative 3, bullet list syntax in the first - column could be used to indicate row starts. Multi-line rows are - possible, but row spans are not. For example:: - - ===== ===== - col 1 col 2 - ===== ===== - - 1 Second column of row 1. - - 2 Second column of row 2. - Second line of paragraph. - - 3 Second column of row 3. - - Second paragraph of row 3, - column 2 - ===== ===== - - Column spans would be indicated on the line after the last line of - the row. To indicate a real bullet list within a first-column - cell, simply nest the bullets. - -5. In a further variation, we could simply assume that whitespace in - the first column implies a multi-line row; the text in other - columns is continuation text. For example:: - - ===== ===== - col 1 col 2 - ===== ===== - 1 Second column of row 1. - 2 Second column of row 2. - Second line of paragraph. - 3 Second column of row 3. - - Second paragraph of row 3, - column 2 - ===== ===== - - Limitations of this approach: - - - Cells in the first column are limited to one line of text. - - - Cells in the first column *must* contain some text; blank cells - would lead to a misinterpretation. An empty comment ("..") is - sufficient. - -6. Combining alternatives 3 and 4, a bullet list in the first column - could mean multi-line rows, and no bullet list means single-line - rows only. - -Alternatives 1 and 5 have been adopted by reStructuredText. - - -Delimitation of Inline Markup -============================= - -StructuredText specifies that inline markup must begin with -whitespace, precluding such constructs as parenthesized or quoted -emphatic text:: - - "**What?**" she cried. (*exit stage left*) - -The `reStructuredText markup specification`_ allows for such -constructs and disambiguates inline markup through a set of -recognition rules. 
These recognition rules define the context of -markup start-strings and end-strings, allowing markup characters to be -used in most non-markup contexts without a problem (or a backslash). -So we can say, "Use asterisks (*) around words or phrases to -*emphasize* them." The '(*)' will not be recognized as markup. This -reduces the need for markup escaping to the point where an escape -character is *almost* (but not quite!) unnecessary. - - -Underlining -=========== - -StructuredText uses '_text_' to indicate underlining. To quote David -Ascher in his 2000-01-21 Doc-SIG mailing list post, "Docstring -grammar: a very revised proposal": - - The tagging of underlined text with _'s is suboptimal. Underlines - shouldn't be used from a typographic perspective (underlines were - designed to be used in manuscripts to communicate to the - typesetter that the text should be italicized -- no well-typeset - book ever uses underlines), and conflict with double-underscored - Python variable names (__init__ and the like), which would get - truncated and underlined when that effect is not desired. Note - that while *complete* markup would prevent that truncation - ('__init__'), I think of docstring markups much like I think of - type annotations -- they should be optional and above all do no - harm. In this case the underline markup does harm. - -Underlining is not part of the reStructuredText specification. - - -Inline Literals -=============== - -StructuredText's markup for inline literals (text left as-is, -verbatim, usually in a monospaced font; as in HTML <TT>) is single -quotes ('literals'). The problem with single quotes is that they are -too often used for other purposes: - -- Apostrophes: "Don't blame me, 'cause it ain't mine, it's Chris'."; - -- Quoting text: - - First Bruce: "Well Bruce, I heard the prime minister use it. - 'S'hot enough to boil a monkey's bum in 'ere your Majesty,' he - said, and she smiled quietly to herself." 
- - In the UK, single quotes are used for dialogue in published works. - -- String literals: s = '' - -Alternatives:: - - 'text' \'text\' ''text'' "text" \"text\" ""text"" - #text# @text@ `text` ^text^ ``text'' ``text`` - -The examples below contain inline literals, quoted text, and -apostrophes. Each example should evaluate to the following HTML:: - - Some <TT>code</TT>, with a 'quote', "double", ain't it grand? - Does <TT>a[b] = 'c' + "d" + `2^3`</TT> work? - - 0. Some code, with a quote, double, ain't it grand? - Does a[b] = 'c' + "d" + `2^3` work? - 1. Some 'code', with a \'quote\', "double", ain\'t it grand? - Does 'a[b] = \'c\' + "d" + `2^3`' work? - 2. Some \'code\', with a 'quote', "double", ain't it grand? - Does \'a[b] = 'c' + "d" + `2^3`\' work? - 3. Some ''code'', with a 'quote', "double", ain't it grand? - Does ''a[b] = 'c' + "d" + `2^3`'' work? - 4. Some "code", with a 'quote', \"double\", ain't it grand? - Does "a[b] = 'c' + "d" + `2^3`" work? - 5. Some \"code\", with a 'quote', "double", ain't it grand? - Does \"a[b] = 'c' + "d" + `2^3`\" work? - 6. Some ""code"", with a 'quote', "double", ain't it grand? - Does ""a[b] = 'c' + "d" + `2^3`"" work? - 7. Some #code#, with a 'quote', "double", ain't it grand? - Does #a[b] = 'c' + "d" + `2^3`# work? - 8. Some @code@, with a 'quote', "double", ain't it grand? - Does @a[b] = 'c' + "d" + `2^3`@ work? - 9. Some `code`, with a 'quote', "double", ain't it grand? - Does `a[b] = 'c' + "d" + \`2^3\`` work? - 10. Some ^code^, with a 'quote', "double", ain't it grand? - Does ^a[b] = 'c' + "d" + `2\^3`^ work? - 11. Some ``code'', with a 'quote', "double", ain't it grand? - Does ``a[b] = 'c' + "d" + `2^3`'' work? - 12. Some ``code``, with a 'quote', "double", ain't it grand? - Does ``a[b] = 'c' + "d" + `2^3\``` work? - -Backquotes (#9 & #12) are the best choice. They are unobtrusive and -relatively rarely used (more rarely than ' or ", anyhow). 
Backquotes -have the connotation of 'quotes', which other options (like carets, -#10) don't. - -Analogously with ``*emph*`` & ``**strong**``, double-backquotes (#12) -could be used for inline literals. If single-backquotes are used for -'interpreted text' (context-sensitive domain-specific descriptive -markup) such as function name hyperlinks in Python docstrings, then -double-backquotes could be used for absolute-literals, wherein no -processing whatsoever takes place. An advantage of double-backquotes -would be that backslash-escaping would no longer be necessary for -embedded single-backquotes; however, embedded double-backquotes (in an -end-string context) would be illegal. See `Backquotes in -Phrase-Links`__ in `Record of reStructuredText Syntax Alternatives`__. - -__ alternatives.html#backquotes-in-phrase-links -__ alternatives.html - -Alternative choices are carets (#10) and TeX-style quotes (#11). For -examples of TeX-style quoting, see -http://www.zope.org/Members/jim/StructuredTextWiki/CustomizingTheDocumentProcessor. - -Some existing uses of backquotes: - -1. As a synonym for repr() in Python. -2. For command-interpolation in shell scripts. -3. Used as open-quotes in TeX code (and carried over into plaintext - by TeXies). - -The inline markup start-string and end-string recognition rules -defined by the `reStructuredText markup specification`_ would allow -all of these cases inside inline literals, with very few exceptions. -As a fallback, literal blocks could handle all cases. - -Outside of inline literals, the above uses of backquotes would require -backslash-escaping. However, these are all prime examples of text -that should be marked up with inline literals. - -If either backquotes or straight single-quotes are used as markup, -TeX-quotes are too troublesome to support, so no special-casing of -TeX-quotes should be done (at least at first). If TeX-quotes have to -be used outside of literals, a single backslash-escape would suffice: -\``TeX quote''. 
Ugly, true, but very infrequently used. - -Using literal blocks is a fallback option which removes the need for -backslash-escaping:: - - like this:: - - Here, we can do ``absolutely'' anything `'`'\|/|\ we like! - -No mechanism for inline literals is perfect, just as no escaping -mechanism is perfect. No matter what we use, complicated inline -expressions involving the inline literal quote and/or the backslash -will end up looking ugly. We can only choose the least often ugly -option. - -reStructuredText will use double backquotes for inline literals, and -single backquotes for interpreted text. - - -Hyperlinks -========== - -There are three forms of hyperlink currently in StructuredText_: - -1. (Absolute & relative URIs.) Text enclosed by double quotes - followed by a colon, a URI, and concluded by punctuation plus white - space, or just white space, is treated as a hyperlink:: - - "Python":http://www.python.org/ - -2. (Absolute URIs only.) Text enclosed by double quotes followed by a - comma, one or more spaces, an absolute URI and concluded by - punctuation plus white space, or just white space, is treated as a - hyperlink:: - - "mail me", mailto:me@mail.com - -3. (Endnotes.) Text enclosed by brackets links to an endnote at the - end of the document: at the beginning of the line, two dots, a - space, and the same text in brackets, followed by the end note - itself:: - - Please refer to the fine manual [GVR2001]. - - .. [GVR2001] Python Documentation, Release 2.1, van Rossum, - Drake, et al., http://www.python.org/doc/ - -The problem with forms 1 and 2 is that they are neither intuitive nor -unobtrusive (they break design goals 5 & 2). They overload -double-quotes, which are too often used in ordinary text (potentially -breaking design goal 4). The brackets in form 3 are also too common -in ordinary text (such as [nested] asides and Python lists like [12]). - -Alternatives: - -1. Have no special markup for hyperlinks. - -2. A. 
Interpret and mark up hyperlinks as any contiguous text - containing '://' or ':...@' (absolute URI) or '@' (email - address) after an alphanumeric word. To de-emphasize the URI, - simply enclose it in parentheses: - - Python (http://www.python.org/) - - B. Leave special hyperlink markup as a domain-specific extension. - Hyperlinks in ordinary reStructuredText documents would be - required to be standalone (i.e. the URI text inline in the - document text). Processed hyperlinks (where the URI text is - hidden behind the link) are important enough to warrant syntax. - -3. The original Setext_ introduced a mechanism of indirect hyperlinks. - A source link word ('hot word') in the text was given a trailing - underscore:: - - Here is some text with a hyperlink_ built in. - - The hyperlink itself appeared at the end of the document on a line - by itself, beginning with two dots, a space, the link word with a - leading underscore, whitespace, and the URI itself:: - - .. _hyperlink http://www.123.xyz - - Setext used ``underscores_instead_of_spaces_`` for phrase links. - -With some modification, alternative 3 best satisfies the design goals. -It has the advantage of being readable and relatively unobtrusive. -Since each source link must match up to a target, the odd variable -ending in an underscore can be spared being marked up (although it -should generate a "no such link target" warning). The only -disadvantage is that phrase-links aren't possible without some -obtrusive syntax. - -We could achieve phrase-links if we enclose the link text: - -1. in double quotes:: - - "like this"_ - -2. in brackets:: - - [like this]_ - -3. or in backquotes:: - - `like this`_ - -Each gives us somewhat obtrusive markup, but that is unavoidable. The -bracketed syntax (#2) is reminiscent of links on many web pages -(intuitive), although it is somewhat obtrusive. 
Alternative #3 is -much less obtrusive, and is consistent with interpreted text: the -trailing underscore indicates the interpretation of the phrase, as a -hyperlink. #3 also disambiguates hyperlinks from footnote references. -Alternative #3 wins. - -The same trailing underscore markup can also be used for footnote and -citation references, removing the problem with ordinary bracketed text -and Python lists:: - - Please refer to the fine manual [GVR2000]_. - - .. [GVR2000] Python Documentation, van Rossum, Drake, et al., - http://www.python.org/doc/ - -The two-dots-and-a-space syntax was generalized by Setext for -comments, which are removed from the (visible) processed output. -reStructuredText uses this syntax for comments, footnotes, and link -targets, collectively termed "explicit markup". For link targets, in -order to eliminate ambiguity with comments and footnotes, -reStructuredText specifies that a colon always follow the link target -word/phrase. The colon denotes 'maps to'. There is no reason to -restrict target links to the end of the document; they could just as -easily be interspersed. - -Internal hyperlinks (links from one point to another within a single -document) can be expressed by a source link as before, and a target -link with a colon but no URI. In effect, these targets 'map to' the -element immediately following. - -As an added bonus, we now have a perfect candidate for -reStructuredText directives, a simple extension mechanism: explicit -markup containing a single word followed by two colons and whitespace. -The interpretation of subsequent data on the directive line or -following is directive-dependent. - -To summarize:: - - .. This is a comment. - - .. The line below is an example of a directive. - .. version:: 1 - - This is a footnote [1]_. - - This internal hyperlink will take us to the footnotes_ area below. - - Here is a one-word_ external hyperlink. - - Here is `a hyperlink phrase`_. - - .. _footnotes: - .. [1] Footnote text goes here. 
- - .. external hyperlink target mappings: - .. _one-word: http://www.123.xyz - .. _a hyperlink phrase: http://www.123.xyz - -The presence or absence of a colon after the target link -differentiates an indirect hyperlink from a footnote, respectively. A -footnote requires brackets. Backquotes around a target link word or -phrase are required if the phrase contains a colon, optional -otherwise. - -Below are examples using no markup, the two StructuredText hypertext -styles, and the reStructuredText hypertext style. Each example -contains an indirect link, a direct link, a footnote/endnote, and -bracketed text. In HTML, each example should evaluate to:: - - <P>A <A HREF="http://spam.org">URI</A>, see <A HREF="#eggs2000"> - [eggs2000]</A> (in Bacon [Publisher]). Also see - <A HREF="http://eggs.org">http://eggs.org</A>.</P> - - <P><A NAME="eggs2000">[eggs2000]</A> "Spam, Spam, Spam, Eggs, - Bacon, and Spam"</P> - -1. No markup:: - - A URI http://spam.org, see eggs2000 (in Bacon [Publisher]). - Also see http://eggs.org. - - eggs2000 "Spam, Spam, Spam, Eggs, Bacon, and Spam" - -2. StructuredText absolute/relative URI syntax - ("text":http://www.url.org):: - - A "URI":http://spam.org, see [eggs2000] (in Bacon [Publisher]). - Also see "http://eggs.org":http://eggs.org. - - .. [eggs2000] "Spam, Spam, Spam, Eggs, Bacon, and Spam" - - Note that StructuredText does not recognize standalone URIs, - forcing doubling up as shown in the second line of the example - above. - -3. StructuredText absolute-only URI syntax - ("text", mailto:you@your.com):: - - A "URI", http://spam.org, see [eggs2000] (in Bacon - [Publisher]). Also see "http://eggs.org", http://eggs.org. - - .. [eggs2000] "Spam, Spam, Spam, Eggs, Bacon, and Spam" - -4. reStructuredText syntax:: - - 4. A URI_, see [eggs2000]_ (in Bacon [Publisher]). - Also see http://eggs.org. - - .. _URI: http://spam.org - .. 
[eggs2000] "Spam, Spam, Spam, Eggs, Bacon, and Spam" - -The bracketed text '[Publisher]' may be problematic with -StructuredText (syntax 2 & 3). - -reStructuredText's syntax (#4) is definitely the most readable. The -text is separated from the link URI and the footnote, resulting in -cleanly readable text. - -.. _StructuredText: - http://www.zope.org/DevHome/Members/jim/StructuredTextWiki/FrontPage -.. _Setext: http://docutils.sourceforge.net/mirror/setext.html -.. _reStructuredText: http://docutils.sourceforge.net/rst.html -.. _detailed description: - http://homepage.ntlworld.com/tibsnjoan/docutils/STNG-format.html -.. _STMinus: http://www.cis.upenn.edu/~edloper/pydoc/stminus.html -.. _StructuredTextNG: - http://www.zope.org/DevHome/Members/jim/StructuredTextWiki/StructuredTextNG -.. _README: http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/~checkout~/ - python/python/dist/src/README -.. _Emacs table mode: http://table.sourceforge.net/ -.. _reStructuredText Markup Specification: - ../../ref/rst/restructuredtext.html - - -.. - Local Variables: - mode: indented-text - indent-tabs-mode: nil - sentence-end-double-space: t - fill-column: 70 - End: diff --git a/docutils/docs/dev/semantics.txt b/docutils/docs/dev/semantics.txt deleted file mode 100644 index cd20e15f6..000000000 --- a/docutils/docs/dev/semantics.txt +++ /dev/null @@ -1,119 +0,0 @@ -===================== - Docstring Semantics -===================== -:Author: David Goodger -:Contact: goodger@users.sourceforge.net -:Revision: $Revision$ -:Date: $Date$ -:Copyright: This document has been placed in the public domain. - -These are notes for a possible future PEP providing the final piece of -the Python docstring puzzle: docstring semantics or documentation -methodology. `PEP 257`_, Docstring Conventions, sketches out some -guidelines, but does not get into methodology details. 
- -I haven't explored documentation methodology more because, in my -opinion, it is a completely separate issue from syntax, and it's even -more controversial than syntax. Nobody wants to be told how to lay -out their documentation, a la JavaDoc_. I think the JavaDoc way is -butt-ugly, but it *is* an established standard for the Java world. -Any standard documentation methodology has to be formal enough to be -useful but remain light enough to be usable. If the methodology is -too strict, too heavy, or too ugly, many/most will not want to use it. - -I think a standard methodology could benefit the Python community, but -it would be a hard sell. A PEP would be the place to start. For most -human-readable documentation needs, the free-form text approach is -adequate. We'd only need a formal methodology if we want to extract -the parameters into a data dictionary, index, or summary of some kind. - - -PythonDoc -========= - -(Not to be confused with Daniel Larsson's pythondoc_ project.) - -A Python version of the JavaDoc_ semantics (not syntax). A set of -conventions which are understood by the Docutils. What JavaDoc has -done is to establish a syntax that enables a certain documentation -methodology, or standard *semantics*. JavaDoc is not just syntax; it -prescribes a methodology. - -- Use field lists or definition lists for "tagged blocks". By this I - mean that field lists can be used similarly to JavaDoc's ``@tag`` - syntax. That's actually one of the motivators behind field lists. - For example, we could have:: - - """ - :Parameters: - - `lines`: a list of one-line strings without newlines. - - `until_blank`: Stop collecting at the first blank line if - true (1). - - `strip_indent`: Strip common leading indent if true (1, - default). - - :Return: - - a list of indented lines with mininum indent removed; - - the amount of the indent; - - whether or not the block finished with a blank line or at - the end of `lines`. 
- """ - - This is taken straight out of docutils/statemachine.py, in which I - experimented with a simple documentation methodology. Another - variation I've thought of exploits the Grouch_-compatible - "classifier" element of definition lists. For example:: - - :Parameters: - `lines` : [string] - List of one-line strings without newlines. - `until_blank` : boolean - Stop collecting at the first blank line if true (1). - `strip_indent` : boolean - Strip common leading indent if true (1, default). - -- Field lists could even be used in a one-to-one correspondence with - JavaDoc ``@tags``, although I doubt if I'd recommend it. Several - ports of JavaDoc's ``@tag`` methodology exist in Python, most - recently Ed Loper's "epydoc_". - - -Other Ideas -=========== - -- Can we extract comments from parsed modules? Could be handy for - documenting function/method parameters:: - - def method(self, - source, # path of input file - dest # path of output file - ): - - This would save having to repeat parameter names in the docstring. - - Idea from Mark Hammond's 1998-06-23 Doc-SIG post, "Re: [Doc-SIG] - Documentation tool": - - it would be quite hard to add a new param to this method without - realising you should document it - -- Frederic Giacometti's `iPhrase Python documentation conventions`_ is - an attachment to his Doc-SIG post of 2001-05-30. - - -.. _PEP 257: http://www.python.org/peps/pep-0257.html -.. _JavaDoc: http://java.sun.com/j2se/javadoc/ -.. _pythondoc: http://starship.python.net/crew/danilo/pythondoc/ -.. _Grouch: http://www.mems-exchange.org/software/grouch/ -.. _epydoc: http://epydoc.sf.net/ -.. _iPhrase Python documentation conventions: - http://mail.python.org/pipermail/doc-sig/2001-May/001840.html - - -.. 
- Local Variables: - mode: indented-text - indent-tabs-mode: nil - sentence-end-double-space: t - fill-column: 70 - End: diff --git a/docutils/docs/dev/testing.txt b/docutils/docs/dev/testing.txt deleted file mode 100644 index bde54116f..000000000 --- a/docutils/docs/dev/testing.txt +++ /dev/null @@ -1,246 +0,0 @@ -=================== - Docutils_ Testing -=================== - -:Author: Felix Wiemann -:Author: David Goodger -:Revision: $Revision$ -:Date: $Date$ -:Copyright: This document has been placed in the public domain. - -.. _Docutils: http://docutils.sourceforge.net/ - -.. contents:: - -When adding new functionality (or fixing bugs), be sure to add test -cases to the test suite. Practise test-first programming; it's fun, -it's addictive, and it works! - -This document describes how to run the Docutils test suite, how the -tests are organized and how to add new tests or modify existing tests. - - -Running the Test Suite -====================== - -Before checking in any changes, run the entire Docutils test suite to -be sure that you haven't broken anything. From a shell:: - - cd docutils/test - ./alltests.py - - -Python Versions -=============== - -The Docutils 0.4 release supports Python 2.1 [#py21]_ or later, with -some features only working (and being tested) with Python 2.3+. -Therefore, you should actually have Pythons 2.1 [#py21]_, 2.2, 2.3, as -well as the latest Python installed and always run the tests on all of -them. (A good way to do that is to always run the test suite through -a short script that runs ``alltests.py`` under each version of -Python.) If you can't afford installing 3 or more Python versions, the -edge cases (2.1 and 2.3) should cover most of it. - -.. [#py21] Python 2.1 may be used providing the compiler package is - installed. The compiler package can be found in the Tools/ - directory of Python 2.1's source distribution. 
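The "short script" mentioned above might look something like the following. This is only a sketch under stated assumptions: the interpreter names in ``INTERPRETERS`` and the helper names ``build_commands`` and ``run_all`` are hypothetical, and the list should be adjusted to whatever Python versions are actually installed.

```python
import subprocess

# Hypothetical interpreter names; adjust to the versions installed locally.
INTERPRETERS = ['python2.1', 'python2.3', 'python']


def build_commands(interpreters, script='alltests.py'):
    """Return one argument list per interpreter."""
    return [[interpreter, script] for interpreter in interpreters]


def run_all(interpreters, script='alltests.py', cwd='.'):
    """Run `script` under each interpreter.

    Returns a dictionary mapping interpreter name to exit code, or to
    None if the interpreter could not be started.
    """
    results = {}
    for command in build_commands(interpreters, script):
        try:
            results[command[0]] = subprocess.call(command, cwd=cwd)
        except OSError:
            # The interpreter is not installed (or not on the PATH).
            results[command[0]] = None
    return results
```

From the top of a checkout, ``run_all(INTERPRETERS, cwd='docutils/test')`` would then report one exit code per interpreter; a nonzero or ``None`` entry points at the version that needs attention.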
- -Good resources covering the differences between Python versions: - -* `What's New in Python 2.2`__ -* `What's New in Python 2.3`__ -* `What's New in Python 2.4`__ -* `PEP 290 - Code Migration and Modernization`__ - -__ http://www.python.org/doc/2.2.3/whatsnew/whatsnew22.html -__ http://www.python.org/doc/2.3.5/whatsnew/whatsnew23.html -__ http://www.python.org/doc/2.4.1/whatsnew/whatsnew24.html -__ http://www.python.org/peps/pep-0290.html - -.. _Python Check-in Policies: http://www.python.org/dev/tools.html -.. _sandbox directory: - http://svn.berlios.de/viewcvs/docutils/trunk/sandbox/ -.. _nightly repository tarball: - http://svn.berlios.de/svndumps/docutils-repos.gz - - -Unit Tests -========== - -Unit tests test single functions or modules (i.e. whitebox testing). - -If you are implementing a new feature, be sure to write a test case -covering its functionality. It happens very frequently that your -implementation (or even only a part of it) doesn't work with an older -(or even newer) Python version, and the only reliable way to detect -those cases is using tests. - -Often, it's easier to write the test first and then implement the -functionality required to make the test pass. - - -Writing New Tests ------------------ - -When writing new tests, it very often helps to see how a similar test -is implemented. For example, the files in the -``test_parsers/test_rst/`` directory all look very similar. So when -adding a test, you don't have to reinvent the wheel. - -If there is no similar test, you can write a new test from scratch -using Python's ``unittest`` module. For an example, please have a -look at the following imaginary ``test_square.py``:: - - #! /usr/bin/env python - - # Author: your name - # Contact: your email address - # Revision: $Revision$ - # Date: $Date$ - # Copyright: This module has been placed in the public domain. - - """ - Test module for docutils.square. 
- """ - - import unittest - import docutils.square - - - class SquareTest(unittest.TestCase): - - def test_square(self): - self.assertEqual(docutils.square.square(0), 0) - self.assertEqual(docutils.square.square(5), 25) - self.assertEqual(docutils.square.square(7), 49) - - def test_square_root(self): - self.assertEqual(docutils.square.sqrt(49), 7) - self.assertEqual(docutils.square.sqrt(0), 0) - self.assertRaises(docutils.square.SquareRootError, - docutils.square.sqrt, 20) - - - if __name__ == '__main__': - unittest.main() - -For more details on how to write tests, please refer to the -documentation of the ``unittest`` module. - - -.. _functional: - -Functional Tests -================ - -The directory ``test/functional/`` contains data for functional tests. - -Performing functional testing means testing the Docutils system as a -whole (i.e. blackbox testing). - - -Directory Structure -------------------- - -+ ``functional/`` The main data directory. - - + ``input/`` The input files. - - - ``some_test.txt``, for example. - - + ``output/`` The actual output. - - - ``some_test.html``, for example. - - + ``expected/`` The expected output. - - - ``some_test.html``, for example. - - + ``tests/`` The config files for processing the input files. - - - ``some_test.py``, for example. - - - ``_default.py``, the `default configuration file`_. - - -The Testing Process -------------------- - -When running ``test_functional.py``, all config files in -``functional/tests/`` are processed. (Config files whose names begin -with an underscore are ignored.) The current working directory is -always Docutils' main test directory (``test/``). - -For example, ``functional/tests/some_test.py`` could read like this:: - - # Source and destination file names. - test_source = "some_test.txt" - test_destination = "some_test.html" - - # Keyword parameters passed to publish_file. 
- reader_name = "standalone" - parser_name = "rst" - writer_name = "html" - settings_overrides['output-encoding'] = 'utf-8' - # Relative to main ``test/`` directory. - settings_overrides['stylesheet_path'] = '../docutils/writers/html4css1/html4css1.css' - -The two variables ``test_source`` and ``test_destination`` contain the -input file name (relative to ``functional/input/``) and the output -file name (relative to ``functional/output/`` and -``functional/expected/``). Note that the file names can be chosen -arbitrarily. However, the file names in ``functional/output/`` *must* -match the file names in ``functional/expected/``. - -If defined, ``_test_more`` must be a function with the following -signature:: - - def _test_more(expected_dir, output_dir, test_case, parameters): - -This function is called from the test case to perform tests beyond the -simple comparison of expected and actual output files. - -``test_source`` and ``test_destination`` are removed from the -namespace, as are all variables whose names begin with an underscore -("_"). The remaining names are passed as keyword arguments to -``docutils.core.publish_file``, so you can set reader, parser, writer -and anything else you want to configure. Note that -``settings_overrides`` is already initialized as a dictionary *before* -the execution of the config file. - - -Creating New Tests ------------------- - -In order to create a new test, put the input test file into -``functional/input/``. Then create a config file in -``functional/tests/`` which sets at least input and output file names, -reader, parser and writer. - -Now run ``test_functional.py``. The test will fail, of course, -because you do not have an expected output yet. However, an output -file will have been generated in ``functional/output/``. Check this -output file for validity and correctness. Then copy the file to -``functional/expected/``. - -If you rerun ``test_functional.py`` now, it should pass. 
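The config-file handling described under "The Testing Process" can be sketched as follows. This is a simplified illustration under stated assumptions, not the actual code of ``test_functional.py``: ``load_config`` and the ``_helper`` variable are hypothetical, and the call to ``docutils.core.publish_file`` is left out so that the example runs without Docutils installed.

```python
CONFIG = '''
test_source = "some_test.txt"
test_destination = "some_test.html"
reader_name = "standalone"
parser_name = "rst"
writer_name = "html"
settings_overrides['output-encoding'] = 'utf-8'
_helper = 42
'''


def load_config(text):
    """Execute config text and split it into file names and keyword params."""
    # settings_overrides is initialized before the config file is executed.
    namespace = {"settings_overrides": {}}
    exec(text, namespace)
    # The two file-name variables are removed from the namespace ...
    source = namespace.pop("test_source")
    destination = namespace.pop("test_destination")
    # ... as are all names beginning with an underscore (this also drops
    # the "__builtins__" entry that exec() adds).
    params = {key: value for key, value in namespace.items()
              if not key.startswith("_")}
    return source, destination, params


source, destination, params = load_config(CONFIG)
print(source, destination, sorted(params))
```

The remaining ``params`` dictionary is what would be passed on as keyword arguments to ``publish_file``.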
- -If you run ``test_functional.py`` later and the actual output doesn't -match the expected output anymore, the test will fail. - -If this is the case and you made an intentional change, check the -actual output for validity and correctness, copy it to -``functional/expected/`` (overwriting the old expected output), and -commit the change. - - -.. _default configuration file: - -The Default Configuration File ------------------------------- - -The file ``functional/tests/_default.py`` contains default settings. -It is executed just before the actual configuration files, which has -the same effect as if the contents of ``_default.py`` were prepended -to every configuration file. diff --git a/docutils/docs/dev/todo.txt b/docutils/docs/dev/todo.txt deleted file mode 100644 index 6f1c6291d..000000000 --- a/docutils/docs/dev/todo.txt +++ /dev/null @@ -1,1964 +0,0 @@ -====================== - Docutils_ To Do List -====================== - -:Author: David Goodger (with input from many); open to all Docutils - developers -:Contact: goodger@python.org -:Date: $Date$ -:Revision: $Revision$ -:Copyright: This document has been placed in the public domain. - -.. _Docutils: http://docutils.sourceforge.net/ - -.. contents:: - - -Priority items are marked with "@" symbols. The more @s, the higher -the priority. Items in question form (containing "?") are ideas which -require more thought and debate; they are potential to-do's. - -Many of these items are awaiting champions. If you see something -you'd like to tackle, please do! If there's something you'd like to -see done but are unable to implement it yourself, please consider -donating to Docutils: |donate| - -.. |donate| image:: http://images.sourceforge.net/images/project-support.jpg - :target: http://sourceforge.net/donate/index.php?group_id=38414 - :align: middle - :width: 88 - :height: 32 - :alt: Support the Docutils project! - -Please see also the Bugs_ document for a list of bugs in Docutils. - -.. 
_bugs: ../../BUGS.html - - -Release 0.4 -=========== - -We should get Docutils 0.4 out soon, but we shouldn't just cut a -"frozen snapshot" release. Here's a list of features (achievable in -the short term) to include: - -* [DONE in rev. 3901] Move support files to docutils/writers/support. - -* [DONE in rev. 4163] Convert ``docutils/writers/support/*`` into - individual writer packages. - -* [DONE in rev. 3901] Remove docutils.transforms.html.StylesheetCheck - (no longer needed because of the above change). - -* [DONE in rev. 3962] Incorporate new branch policy into the docs. - ("Development strategy" thread on Docutils-develop) - -* [DONE in rev. 4152] Added East-Asian double-width character support. - -* [DONE in rev. 4156] Merge the S5 branch. - -Anything else? - -Once released, - -* Tag it and create a maintenance branch (perhaps "maint-0-4"). - -* Declare that: - - - Docutils 0.4.x is the last version that will support Python 2.1 - (and perhaps higher?) - - - Docutils 0.4.x is the last version that will support (make - compromises for) Netscape Navigator 4 - - -Minimum Requirements for Python Standard Library Candidacy -========================================================== - -Below are action items that must be added and issues that must be -addressed before Docutils can be considered suitable to be proposed -for inclusion in the Python standard library. - -* Support for `document splitting`_. May require some major code - rework. - -* Support for subdocuments (see `large documents`_). - -* `Object numbering and object references`_. - -* `Nested inline markup`_. - -* `Python Source Reader`_. - -* The HTML writer needs to be rewritten (or a second HTML writer - added) to allow for custom classes, and for arbitrary splitting - (stack-based?). - -* Documentation_ of the architecture. Other docs too. - -* Plugin support. - -* A LaTeX writer making use of (La)TeX's power, so that the rendering - of the resulting documents is more easily customizable. 
(Similar to - what you wrote about a new HTML Writer.) - -* Suitability for `Python module documentation - <http://docutils.sf.net/sandbox/README.html#documenting-python>`_. - - -General -======= - -* Allow different report levels for STDERR and system_messages inside - the document? - -* Change the docutils-update script (in sandbox/infrastructure), to - support arbitrary branch snapshots. - -* Add a generic "container" element, equivalent to "inline", to which - a "class" attribute can be attached. Will require a reST directive - also. - -* Move some general-interest sandboxes out of individuals' - directories, into subprojects? - -* Add option for file (and URL) access restriction to make Docutils - usable in Wikis and similar applications. - - 2005-03-21: added ``file_insertion_enabled`` & ``raw_enabled`` - settings. These partially solve the problem, allowing or disabling - **all** file accesses, but not limited access. - -* Configuration file handling needs discussion: - - - There should be some error checking on the contents of config - files. How much checking should be done? How loudly should - Docutils complain if it encounters an error/problem? - - - Docutils doesn't complain when it doesn't find a configuration - file supplied with the ``--config`` option. Should it? (If yes, - error or warning?) - -* Internationalization: - - - I18n needs refactoring, the language dictionaries are difficult to - maintain. Maybe have a look at gettext or similar tools. - - - Language modules: in accented languages it may be useful to have - both accented and unaccented entries in the - ``bibliographic_fields`` mapping for versatility. - - - Add a "--strict-language" option & setting: no English fallback - for language-dependent features. - - - Add internationalization to _`footer boilerplate text` (resulting - from "--generator", "--source-link", and "--date" etc.), allowing - translations. - -* Add validation? See http://pytrex.sourceforge.net, RELAX NG, pyRXP. 
- -* In ``docutils.readers.get_reader_class`` (& ``parsers`` & - ``writers`` too), should we be importing "standalone" or - "docutils.readers.standalone"? (This would avoid importing - top-level modules if the module name is not in docutils/readers. - Potential nastiness.) - -* Perhaps store a _`name-to-id mapping file`? This could be stored - permanently, read by subsequent processing runs, and updated with - new entries. ("Persistent ID mapping"?) - -* Perhaps the ``Component.supports`` method should deal with - individual features ("meta" etc.) instead of formats ("html" etc.)? - -* Add _`object numbering and object references` (tables & figures). - These would be the equivalent of DocBook's "formal" elements. - - We may need _`persistent sequences`, such as chapter numbers. See - `OpenOffice.org XML`_ "fields". Should the sequences be automatic - or manual (user-specifiable)? - - We need to name the objects: - - - "name" option for the "figure" directive? :: - - .. figure:: image.png - :name: image's name - - Same for the "table" directive:: - - .. table:: optional title here - :name: table's name - - ===== ===== - x not x - ===== ===== - True False - False True - ===== ===== - - This would also allow other options to be set, like border - styles. The same technique could be used for other objects. - - A preliminary "table" directive has been implemented, supporting - table titles. Perhaps the name should derive from the title. - - - The object could also be done this way:: - - .. _figure name: - - .. figure:: image.png - - This may be a more general solution, equally applicable to tables. - However, explicit naming using an option seems simpler to users. - - - Perhaps the figure name could be incorporated into the figure - definition, as an optional inline target part of the directive - argument:: - - .. figure:: _`figure name` image.png - - Maybe with a delimiter:: - - .. figure:: _`figure name`: image.png - - Or some other, simpler syntax. 
- - We'll also need syntax for object references. See `OpenOffice.org - XML`_ "reference fields": - - - Parameterized substitutions? For example:: - - See |figure (figure name)| on |page (figure name)|. - - .. |figure (name)| figure-ref:: (name) - .. |page (name)| page-ref:: (name) - - The result would be:: - - See figure 3.11 on page 157. - - But this would require substitution directives to be processed at - reference-time, not at definition-time as they are now. Or, - perhaps the directives could just leave ``pending`` elements - behind, and the transforms do the work? How to pass the data - through? Too complicated. - - - An interpreted text approach is simpler and better:: - - See :figure:`figure name` on :page:`figure name`. - - The "figure" and "page" roles could generate appropriate - boilerplate text. The position of the role (prefix or suffix) - could also be utilized. - - See `Interpreted Text`_ below. - - - We could leave the boilerplate text up to the document:: - - See Figure :fig:`figure name` on page :pg:`figure name`. - - - Reference boilerplate could be specified in the document - (defaulting to nothing):: - - .. fignum:: - :prefix-ref: "Figure " - :prefix-caption: "Fig. " - :suffix-caption: : - - .. _OpenOffice.org XML: http://xml.openoffice.org/ - -* Think about _`large documents` made up of multiple subdocument - files. Issues: continuity (`persistent sequences`_ above), - cross-references (`name-to-id mapping file`_ above and `targets in - other documents`_ below), splitting (`document splitting`_ below). - - When writing a book, the author probably wants to split it up into - files, perhaps one per chapter (but perhaps even more detailed). - However, we'd like to be able to have references from one chapter to - another, and have continuous numbering (pages and chapters, as - applicable). Of course, none of this is implemented yet. 
There has - been some thought put into some aspects; see `the "include" - directive`__ and the `Reference Merging`_ transform below. - - When I was working with SGML in Japan, we had a system where there - was a top-level coordinating file, book.sgml, which contained the - top-level structure of a book: the <book> element, containing the - book <title> and empty component elements (<preface>, <chapter>, - <appendix>, etc.), each with filename attributes pointing to the - actual source for the component. Something like this:: - - <book id="bk01"> - <title>Title of the Book</title> - <preface inrefid="pr01"></preface> - <chapter inrefid="ch01"></chapter> - <chapter inrefid="ch02"></chapter> - <chapter inrefid="ch03"></chapter> - <appendix inrefid="ap01"></appendix> - </book> - - (The "inrefid" attribute stood for "insertion reference ID".) - - The processing system would process each component separately, but - it would recognize and use the book file to coordinate chapter and - page numbering, and keep a persistent ID to (title, page number) - mapping database for cross-references. Docutils could use a similar - system for large-scale, multipart documents. - - __ ../ref/rst/directives.html#including-an-external-document-fragment - - Aahz's idea: - - First the ToC:: - - .. ToC-list:: - Introduction.txt - Objects.txt - Data.txt - Control.txt - - Then a sample use:: - - .. include:: ToC.txt - - As I said earlier in chapter :chapter:`Objects.txt`, the - reference count gets increased every time a binding is made. - - Which produces:: - - As I said earlier in chapter 2, the - reference count gets increased every time a binding is made. - - The ToC in this form doesn't even need to be references to actual - reST documents; I'm simply doing it that way for a minimum of - future-proofing, in case I do want to add the ability to pick up - references within external chapters. 
- - Perhaps, instead of ToC (which would overload the "contents" - directive concept already in use), we could use "manifest". A - "manifest" directive might associate local reference names with - files:: - - .. manifest:: - intro: Introduction.txt - objects: Objects.txt - data: Data.txt - control: Control.txt - - Then the sample becomes:: - - .. include:: manifest.txt - - As I said earlier in chapter :chapter:`objects`, the - reference count gets increased every time a binding is made. - -* Add support for _`multiple output files`. - -* Add testing for Docutils' front end tools? - -* Publisher: "Ordinary setup" shouldn't require specific ordering; at - the very least, there ought to be error checking higher up in the - call chain. [Aahz] - - ``Publisher.get_settings`` requires that all components be set up - before it's called. Perhaps the I/O *objects* shouldn't be set, but - I/O *classes*. Then options are set up (``.set_options``), and - ``Publisher.set_io`` (or equivalent code) is called with source & - destination paths, creating the I/O objects. - - Perhaps I/O objects shouldn't be instantiated until required. For - split output, the Writer may be called multiple times, once for each - doctree, and each doctree should have a separate Output object (with - a different path). Is the "Builder" pattern applicable here? - -* Perhaps I/O objects should become full-fledged components (i.e. - subclasses of ``docutils.Component``, as are Readers, Parsers, and - Writers now), and thus have associated option/setting specs and - transforms. - -* Multiple file I/O suggestion from Michael Hudson: use a file-like - object or something you can iterate over to get file-like objects. - -* Add an "--input-language" option & setting? Specify a different - language module for input (bibliographic fields, directives) than - for output. The "--language" option would set both input & output - languages. - -* Auto-generate reference tables for language-dependent features? 
- Could be generated from the source modules. A special command-line - option could be added to Docutils front ends to do this. (Idea from - Engelbert Gruber.) - -* Enable feedback of some kind from internal decisions, such as - reporting the successful input encoding. Modify runtime settings? - System message? Simple stderr output? - -* Rationalize Writer settings (HTML/LaTeX/PEP) -- share settings. - -* Merge docs/user/latex.txt info into tools.txt and config.txt. - -* Add an "--include file" command-line option (config setting too?), - equivalent to ".. include:: file" as the first line of the doc text? - Especially useful for character entity sets, text transform specs, - boilerplate, etc. - -* Parameterize the Reporter object or class? See the `2004-02-18 - "rest checking and source path"`_ thread. - - .. _2004-02-18 "rest checking and source path": - http://thread.gmane.org/gmane.text.docutils.user/1112 - -* Add a "disable_transforms" setting? And a dummy Writer subclass - that does nothing when its .write() method is called? Would allow - for easy syntax checking. See the `2004-02-18 "rest checking and - source path"`_ thread. - -* Add a generic meta-stylesheet mechanism? An external file could - associate style names ("class" attributes) with specific elements. - Could be generalized to arbitrary output attributes; useful for HTML - & XMLs. Aahz implemented something like this in - sandbox/aahz/Effective/EffMap.py. - -* .. _classes for table cells: - - William Dode suggested that table cells be assigned "class" - attributes by columns, so that stylesheets can affect text - alignment. Unfortunately, there doesn't seem to be a way (in HTML - at least) to leverage the "colspec" elements (HTML "col" tags) by - adding classes to them. The resulting HTML is very verbose:: - - <td class="col1">111</td> - <td class="col2">222</td> - ... - - At the very least, it should be an option. 
People who don't use it - shouldn't be penalized by increases in their HTML file sizes. - - Table rows could also be assigned classes (like odd/even). That - would be easier to implement. - - How should it be implemented? - - * There could be writer options (column classes & row classes) with - standard values. - - * The table directive could grow some options. Something like - ":cell-classes: col1 col2 col3" (either must match the number of - columns, or repeat to fill?) and ":row-classes: odd even" (repeat - to fill; body rows only, or header rows too?). - - Probably per-table directive options are best. The "class" values - could be used by any writer, and applying such classes to all tables - in a document with writer options is too broad. - -* Add file-specific settings support to config files, like:: - - [file index.txt] - compact-lists: no - - Is this even possible? Should the criterion be the name of the - input file or the output file? - -* The "validator" support added to OptionParser is very similar to - "traits_" in SciPy_. Perhaps something could be done with them? - (Had I known about traits when I was implementing docutils.frontend, - I may have used them instead of rolling my own.) - - .. _traits: http://code.enthought.com/traits/ - .. _SciPy: http://www.scipy.org/ - -* tools/buildhtml.py: Extend the --prune option ("prune" config - setting) to accept file names (generic path) in addition to - directories (e.g. --prune=docs/user/rst/cheatsheet.txt, which should - *not* be converted to HTML). - -* Add support for _`plugins`. - -* _`Config directories`: Currently, ~/.docutils, ./docutils.conf/, & - /etc/docutils.conf are read as configuration files. Proposal: allow - ~/.docutils to be a configuration *directory*, along with - /etc/docutils/ and ./docutils.conf/. Within these directories, - check for config.txt files. We can also have subdirectories here, - for plugins, S5 themes, components (readers/writers/parsers) etc. 
- - Docutils will continue to support configuration files for backwards - compatibility. - -* Add support for document decorations other than headers & footers? - For example, top/bottom/side navigation bars for web pages. Generic - decorations? - - Seems like a bad idea as long as it isn't independent from the output - format (for example, navigation bars are only useful for web pages). - -* docutils_update: Check for a ``Makefile`` in a directory, and run - ``make`` if found? This would allow for variant processing on - specific source files, such as running rst2s5.py instead of - rst2html.py. - -* Add a "disable table of contents" setting? The S5 writer could set - it as a default. Rationale: - - The ``contents`` (table of contents) directive must not be used - [in S5/HTML documents]. It changes the CSS class of headings - and they won't show up correctly in the screen presentation. - - -- `Easy Slide Shows With reStructuredText & S5 - <../user/slide-shows.html>`_ - - -Documentation -============= - -User Docs ---------- - -* Add a FAQ entry about using Docutils (with reStructuredText) on a - server and that it's terribly slow. See the first paragraphs in - <http://article.gmane.org/gmane.text.docutils.user/1584>. - -* Add document about what Docutils has previously been used for - (web/use-cases.txt?). - - -Developer Docs --------------- - -* Complete `Docutils Runtime Settings <../api/runtime-settings.html>`_. - -* Improve the internal module documentation (docstrings in the code). - Specific deficiencies listed below. - - - docutils.parsers.rst.states.State.build_table: data structure - required (including StringList). - - - docutils.parsers.rst.states: more complete documentation of parser - internals. - -* docs/ref/doctree.txt: DTD element structural relationships, - semantics, and attributes. In progress; element descriptions to be - completed. - -* Document the ``pending`` elements, how they're generated and what - they do. 
- -* Document the transforms (perhaps in docstrings?): how they're used, - what they do, dependencies & order considerations. - -* Document the HTML classes used by html4css1.py. - -* Write an overview of the Docutils architecture, as an introduction - for developers. What connects to what, why, and how. Either update - PEP 258 (see PEPs_ below) or as a separate doc. - -* Give information about unit tests. Maybe as a howto? - -* Document the docutils.nodes APIs. - -* Complete the docs/api/publisher.txt docs. - - -How-Tos -------- - -* Creating Docutils Writers - -* Creating Docutils Readers - -* Creating Docutils Transforms - -* Creating Docutils Parsers - -* Using Docutils as a Library - - -PEPs ----- - -* Complete PEP 258 Docutils Design Specification. - - - Fill in the blanks in API details. - - - Specify the nodes.py internal data structure implementation? - - [Tibs:] Eventually we need to have direct documentation in - there on how it all hangs together - the DTD is not enough - (indeed, is it still meant to be correct? [Yes, it is. - --DG]). - -* Rework PEP 257, separating style from spec from tools, wrt Docutils? - See Doc-SIG from 2001-06-19/20. - - -Python Source Reader -==================== - -General: - -* Analyze Tony Ibbs' PySource code. - -* Analyze Doug Hellmann's HappyDoc project. - -* Investigate how POD handles literate programming. - -* Take the best ideas and integrate them into Docutils. - -Miscellaneous ideas: - -* Ask Python-dev for opinions (GvR for a pronouncement) on special - variables (__author__, __version__, etc.): convenience vs. namespace - pollution. Ask opinions on whether or not Docutils should recognize - & use them. - -* If we can detect that a comment block begins with ``##``, a la - JavaDoc, it might be useful to indicate interspersed section headers - & explanatory text in a module. 
For example:: - - """Module docstring.""" - - ## - # Constants - # ========= - - a = 1 - b = 2 - - ## - # Exception Classes - # ================= - - class MyException(Exception): pass - - # etc. - -* Should standalone strings also become (module/class) docstrings? - Under what conditions? We want to prevent arbitrary strings from - becoming docstrings of prior attribute assignments etc. Assume - that there must be no blank lines between attributes and attribute - docstrings? (Use lineno of NEWLINE token.) - - Triple-quotes are sometimes used for multi-line comments (such as - commenting out blocks of code). How to reconcile? - -* HappyDoc's idea of using comment blocks when there's no docstring - may be useful to get around the conflict between `additional - docstrings`_ and ``from __future__ import`` for module docstrings. - A module could begin like this:: - - #!/usr/bin/env python - # :Author: Me - # :Copyright: whatever - - """This is the public module docstring (``__doc__``).""" - - # More docs, in comments. - # All comments at the beginning of a module could be - # accumulated as docstrings. - # We can't have another docstring here, because of the - # ``__future__`` statement. - - from __future__ import division - - Using the JavaDoc convention of a doc-comment block beginning with - ``##`` is useful though. It allows doc-comments and implementation - comments. - - .. _additional docstrings: - ../peps/pep-0258.html#additional-docstrings - -* HappyDoc uses an initial comment block to set "parser configuration - values". Do the same thing for Docutils, to set runtime settings on - a per-module basis? I.e.:: - - # Docutils:setting=value - - Could be used to turn on/off function parameter comment recognition - & other marginal features. Could be used as a general mechanism to - augment config files and command-line options (but which takes - precedence?). - -* Multi-file output should be divisible at arbitrary level. 
- -* Support all forms of ``import`` statements: - - - ``import module``: listed as "module" - - ``import module as alias``: "alias (module)" - - ``from module import identifier``: "identifier (from module)" - - ``from module import identifier as alias``: "alias (identifier - from module)" - - ``from module import *``: "all identifiers (``*``) from module" - -* Have links to colorized Python source files from API docs? And - vice-versa: backlinks from the colorized source files to the API - docs! - -* In summaries, use the first *sentence* of a docstring if the first - line is not followed by a blank line. - - -reStructuredText Parser -======================= - -Also see the `... Or Not To Do?`__ list. - -__ rst/alternatives.html#or-not-to-do - -* Treat enumerated lists that are not arabic and consist of only one - item in a single line as ordinary paragraphs. See - <http://article.gmane.org/gmane.text.docutils.user/2635>. - -* The citation syntax could use some enhancements. See - <http://thread.gmane.org/gmane.text.docutils.user/2499> and - <http://thread.gmane.org/gmane.text.docutils.user/2443>. - -* The current list-recognition logic has too many false positives, as - in :: - - * Aorta - * V. cava superior - * V. cava inferior - - Here ``V.`` is recognized as an enumerator, which leads to - confusion. We need to find a solution that resolves such problems - without complicating the spec too much. - - See <http://thread.gmane.org/gmane.text.docutils.user/2524>. - -* Add indirect links via citation references & footnote references. - Example:: - - `Goodger (2005)`_ is helpful. - - .. _Goodger (2005): [goodger2005]_ - .. [goodger2005] citation text - - See <http://thread.gmane.org/gmane.text.docutils.user/2499>. 
- -* Allow multiple block quotes, only separated by attributions - (http://article.gmane.org/gmane.text.docutils.devel/2985), e.g.:: - - quote 1 - - ---Attrib 1 - - quote 2 - - ---Attrib 2 - -* Change the specification so that more punctuation is allowed - before/after inline markup start/end string - (http://article.gmane.org/gmane.text.docutils.cvs/3824). - -* Complain about bad URI characters - (http://article.gmane.org/gmane.text.docutils.user/2046) and - disallow internal whitespace - (http://article.gmane.org/gmane.text.docutils.user/2214). - -* Create ``info``-level system messages for unnecessarily - backslash-escaped characters (as in ``"\something"``, rendered as - "something") to allow checking for errors which silently slipped - through. - -* Add (functional) tests for untested roles. - -* Add test for ":figwidth: image" option of "figure" directive. (Test - code needs to check if PIL is available on the system.) - -* Add support for CJK double-width whitespace (indentation) & - punctuation characters (markup; e.g. double-width "*", "-", "+")? - -* Add motivation sections for constructs in spec. - -* Support generic hyperlink references to _`targets in other - documents`? Not in an HTML-centric way, though (it's trivial to say - ``http://www.example.com/doc#name``, and useless in non-HTML - contexts). XLink/XPointer? ``.. baseref::``? See Doc-SIG - 2001-08-10. - -* .. _adaptable file extensions: - - In target URLs, it would be useful to not explicitly specify the - file extension. If we're generating HTML, then ".html" is - appropriate; if PDF, then ".pdf"; etc. How about using ".*" to - indicate "choose the most appropriate filename extension"? For - example:: - - .. _Another Document: another.* - - What is to be done for output formats that don't *have* hyperlinks? - For example, LaTeX targeted at print. Hyperlinks may be "called - out", as footnotes with explicit URLs. - - But then there's also LaTeX targeted at PDFs, which *can* have - links. 
Perhaps a runtime setting for "*" could explicitly provide - the extension, defaulting to the output file's extension. - - Should the system check for existing files? No, not practical. - - Handle documents only, or objects (images, etc.) also? - - If this handles images also, how to differentiate between document - and image links? Element context (within "image")? Which image - extension to use for which document format? Again, a runtime - setting would suffice. - - This may not be just a parser issue; it may need framework support. - - Mailing list threads: `Images in both HTML and LaTeX`__ (especially - `this summary of Felix's objections`__), `more-universal links?`__, - `Output-format-sensitive link targets?`__ - - __ http://thread.gmane.org/gmane.text.docutils.user/1239 - __ http://article.gmane.org/gmane.text.docutils.user/1278 - __ http://thread.gmane.org/gmane.text.docutils.user/1915 - __ http://thread.gmane.org/gmane.text.docutils.user/2438 - -* Implement the header row separator modification to table.el. (Wrote - to Takaaki Ota & the table.el mailing list on 2001-08-12, suggesting - support for "=====" header rows. On 2001-08-17 he replied, saying - he'd put it on his to-do list, but "don't hold your breath".) - -* Fix the parser's indentation handling to conform with the stricter - definition in the spec. (Explicit markup blocks should be strict or - forgiving?) - - .. XXX What does this mean? Can you elaborate, David? - -* Make the parser modular. Allow syntax constructs to be added or - disabled at run-time. Subclassing is probably not enough because it - makes it difficult to apply multiple extensions. - -* Generalize the "doctest block" construct (which is overly - Python-centric) to other interactive sessions? 
"Doctest block" - could be renamed to "I/O block" or "interactive block", and each of - these could also be recognized as such by the parser: - - - Shell sessions:: - - $ cat example1.txt - A block beginning with a "$ " prompt is interpreted as a shell - session interactive block. As with Doctest blocks, the - interactive block ends with the first blank line, and wouldn't - have to be indented. - - - Root shell sessions:: - - # cat example2.txt - A block beginning with a "# " prompt is interpreted as a root - shell session (the user is or has to be logged in as root) - interactive block. Again, the block ends with a blank line. - - Other standard (and unambiguous) interactive session prompts could - easily be added (such as "> " for WinDOS). - - Tony Ibbs spoke out against this idea (2002-06-14 Doc-SIG thread - "docutils feedback"). - -* The "doctest" element should go away. The construct could simply be - a front-end to generic literal blocks. We could immediately (in - 0.4, or 0.5) remove the doctest node from the doctree, but leave the - syntax in reST. The reST parser could represent doctest blocks as - literal blocks with a class attribute. The syntax could be left in - reST for a set period of time. - -* Add support for pragma (syntax-altering) directives. - - Some pragma directives could be local-scope unless explicitly - specified as global/pragma using ":global:" options. - -* Support whitespace in angle-bracketed standalone URLs according to - Appendix E ("Recommendations for Delimiting URI in Context") of `RFC - 2396`_. - - .. _RFC 2396: http://www.rfc-editor.org/rfc/rfc2396.txt - -* Use the vertical spacing of the source text to determine the - corresponding vertical spacing of the output? - -* [From Mark Nodine] For cells in simple tables that comprise a - single line, the justification can be inferred according to the - following rules: - - 1. If the text begins at the leftmost column of the cell, - then left justification, ELSE - 2. 
If the text begins at the rightmost column of the cell, - then right justification, ELSE - 3. Center justification. - - The onus is on the author to make the text unambiguous by adding - blank columns as necessary. There should be a parser setting to - turn off justification-recognition (normally on would be fine). - - Decimal justification? - - All this shouldn't be done automatically. Only when it's requested - by the user, e.g. with something like this:: - - .. table:: - :auto-indent: - - (Table goes here.) - - Otherwise it will break existing documents. - -* Generate a warning or info message for paragraphs which should have - been lists, like this one:: - - 1. line one - 3. line two - -* Generalize the "target-notes" directive into a command-line option - somehow? See docutils-develop 2003-02-13. - -* Allow a "::"-only paragraph (first line, actually) to introduce a - _`literal block without a blank line`? (Idea from Paul Moore.) :: - - :: - This is a literal block - - Is indentation enough to make the separation between a paragraph - which contains just a ``::`` and the literal text unambiguous? - (There's one problem with this concession: If one wants a definition - list item which defines the term "::", we'd have to escape it.) It - would only be reasonable to apply it to "::"-only paragraphs though. - I think the blank line is visually necessary if there's text before - the "::":: - - The text in this paragraph needs separation - from the literal block following:: - This doesn't look right. - -* Add new syntax for _`nested inline markup`? Or extend the parser to - parse nested inline markup somehow? See the `collected notes - <rst/alternatives.html#nested-inline-markup>`__. - -* Drop the backticks from embedded URIs with omitted reference text? - Should the angle brackets be kept in the output or not? :: - - <file_name>_ - - Probably not worth the trouble. - -* Add _`math markup`. 
We should try for a general solution, that's - applicable to any output format. Using a standard, such as MathML_, - would be best. TeX (or itex_) would be acceptable as a *front-end* - to MathML. See `the culmination of a relevant discussion - <http://article.gmane.org/gmane.text.docutils.user/118>`__. - - Both a directive and an interpreted text role will be necessary (for - each markup). Directive example:: - - .. itex:: - \alpha_t(i) = P(O_1, O_2, \dots O_t, q_t = S_i \lambda) - - The same thing inline:: - - The equation in question is :itex:`\alpha_t(i) = P(O_1, O_2, - \dots O_t, q_t = S_i \lambda)`. - - .. _MathML: http://www.w3.org/TR/MathML2/ - .. _itex: http://pear.math.pitt.edu/mathzilla/itex2mmlItex.html - -* How about a syntax for alternative hyperlink behavior, such as "open - in a new window" (as in HTML's ``<a target="_blank">``)? Double - angle brackets might work for inline targets:: - - The `reference docs <<url>>`__ may be handy. - - But what about explicit targets? - - The MoinMoin wiki uses a caret ("^") at the beginning of the URL - ("^" is not a legal URI character). That could work for both inline - and explicit targets:: - - The `reference docs <^url>`__ may be handy. - - .. _name: ^url - - This may be too specific to HTML. It hasn't been requested very - often either. - -* Add an option to add URI schemes at runtime. - -* _`Segmented lists`:: - - : segment : segment : segment - : segment : segment : very long - segment - : segment : segment : segment - - The initial colon (":") can be thought of as a type of bullet - - We could even have segment titles:: - - :: title : title : title - : segment : segment : segment - : segment : segment : segment - - This would correspond well to DocBook's SegmentedList. Output could - be tabular or "name: value" pairs, as described in DocBook's docs. - -* Allow backslash-escaped colons in field names:: - - :Case Study\: Event Handling: This chapter will be dropped. 
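A minimal sketch of how a field marker with backslash-escaped colons could be recognized. This is not the Docutils parser's own code; the pattern ``FIELD_MARKER`` and the helper ``split_field`` are hypothetical names used only for illustration.

```python
import re

# Hypothetical pattern (an assumption, not taken from Docutils): a field
# marker is ":name:" where the name may contain backslash escapes such as
# "\:" but no unescaped colon.
FIELD_MARKER = re.compile(r'^:((?:\\.|[^:\\])+):(?:\s+|$)')


def split_field(line):
    """Return (name, body) for a field-list line, or None if it isn't one."""
    match = FIELD_MARKER.match(line)
    if match is None:
        return None
    # Drop the backslash from each escape sequence in the field name.
    name = re.sub(r'\\(.)', r'\1', match.group(1))
    return name, line[match.end():]
```

With this sketch, the example line above would split into the field name "Case Study: Event Handling" and the body "This chapter will be dropped."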
- -* _`footnote spaces`: - - When supplying the command line options - --footnote-references=brackets and --use-latex-footnotes with the - LaTeX writer (which might very well happen when using configuration - files), the spaces in front of footnote references aren't trimmed. - -* Enable grid _`tables inside XML comments`, where "--" ends comments. - I see three implementation possibilities: - - 1. Make the table syntax characters into "table" directive options. - This is the most flexible but most difficult, and we probably - don't need that much flexibility. - - 2. Substitute "~" for "-" with a specialized directive option - (e.g. ":tildes:"). - - 3. Make the standard table syntax recognize "~" as well as "-", even - without a directive option. Individual tables would have to be - internally consistent. - - Directive options are preferable to configuration settings, because - tables are document-specific. A pragma directive would be another - approach, to set the syntax once for a whole document. - - In the meantime, the list-table_ directive is a good replacement for - grid tables inside XML comments. - - .. _list-table: ../ref/rst/directives.html#list-table - -* Generalize docinfo contents (bibliographic fields): remove specific - fields, and have only a single generic "field"? - - -Directives ----------- - -Directives below are often referred to as "module.directive", the -directive function. The "module." is not part of the directive name -when used in a document. - -* Make the _`directive interface` object-oriented - (http://article.gmane.org/gmane.text.docutils.user/1871). - -* Allow for field lists in list tables. See - <http://thread.gmane.org/gmane.text.docutils.devel/3392>. - -* .. _unify tables: - - Unify table implementations and unify options of table directives - (http://article.gmane.org/gmane.text.docutils.user/1857). - -* Allow directives to be added at run-time? - -* Use the language module for directive option names? 
- -* Add "substitution_only" and "substitution_ok" function attributes, - and automate context checking? - -* Change directive functions to directive classes? Superclass' - ``__init__()`` could handle all the bookkeeping. - -* Implement options or features on existing directives: - - - Add a "name" option to directives, to set an author-supplied - identifier? - - - All directives that produce titled elements should grow implicit - reference names based on the titles. - - - Allow the _`:trim:` option for all directives when they occur in a - substitution definition, not only the unicode_ directive. - - .. _unicode: ../ref/rst/directives.html#unicode-character-codes - - - _`images.figure`: "title" and "number", to indicate a formal - figure? - - - _`parts.sectnum`: "local"?, "refnum" - - A "local" option could enable numbering for sections from a - certain point down, and sections in the rest of the document are - not numbered. For example, a reference section of a manual might - be numbered, but not the rest. OTOH, an all-or-nothing approach - would probably be enough. - - The "sectnum" directive should be usable multiple times in a - single document. For example, in a long document with "chapter" - and "appendix" sections, there could be a second "sectnum" before - the first appendix, changing the sequence used (from 1,2,3... to - A,B,C...). This is where the "local" concept comes in. This part - of the implementation can be left for later. - - A "refnum" option (better name?) would insert reference names - (targets) consisting of the reference number. Then a URL could be - of the form ``http://host/document.html#2.5`` (or "2-5"?). Allow - internal references by number? Allow name-based *and* - number-based ids at the same time, or only one or the other (which - would the table of contents use)? Usage issue: altering the - section structure of a document could render hyperlinks invalid. - - - _`parts.contents`: Add a "suppress" or "prune" option? 
It would - suppress contents display for sections in a branch from that point - down. Or a new directive, like "prune-contents"? - - Add an option to include topics in the TOC? Another for sidebars? - The "topic" directive could have a "contents" option, or the - "contents" directive could have an "include-topics" option. See - docutils-develop 2003-01-29. - - - _`parts.header` & _`parts.footer`: Support multiple, named headers - & footers? For example, separate headers & footers for odd, even, - and the first page of a document. - - This may be too specific to output formats which have a notion of - "pages". - - - _`misc.class`: - - - Add a ``:parent:`` option for setting the parent's class - (http://article.gmane.org/gmane.text.docutils.devel/3165). - - - _`misc.include`: - - - Option to select a range of lines? - - - Option to label lines? - - - How about an environment variable, say RSTINCLUDEPATH or - RSTPATH, for standard includes (as in ``.. include:: <name>``)? - This could be combined with a setting/option to allow - user-defined include directories. - - - Add support for inclusion by URL? :: - - .. include:: - :url: http://www.example.org/inclusion.txt - - - _`misc.raw`: add a "destination" option to the "raw" directive? :: - - .. raw:: html - :destination: head - - <link ...> - - It needs thought & discussion though, to come up with a consistent - set of destination labels and consistent behavior. - - And placing HTML code inside the <head> element of an HTML - document is rather the job of a templating system. - - - _`body.sidebar`: Allow internal section structure? Adornment - styles would be independent of the main document. - - That is really complicated, however, and the document model - greatly benefits from its simplicity. - -* Implement directives. Each of the list items below begins with an - identifier of the form, "module_name.directive_function_name".
The - directive name itself could be the same as the - directive_function_name, or it could differ. - - - _`html.imagemap` - - It has the disadvantage that it's only easily implementable for - HTML, so it's specific to one output format. - - (For non-HTML writers, the imagemap would have to be replaced with - the image only.) - - - _`parts.endnotes` (or "footnotes"): See `Footnote & Citation Gathering`_. - - - _`parts.citations`: See `Footnote & Citation Gathering`_. - - - _`misc.language`: Specify (= change) the language of a document at - parse time. - - - _`misc.settings`: Set any(?) Docutils runtime setting from within - a document? Needs much thought and discussion. - - - _`misc.gather`: Gather (move, or copy) all instances of a specific - element. A generalization of the "endnotes" & "citations" ideas. - - - Add a custom "directive" directive, equivalent to "role"? For - example:: - - .. directive:: incr - - .. class:: incremental - - .. incr:: - - "``.. incr::``" above is equivalent to "``.. class:: incremental``". - - Another example:: - - .. directive:: printed-links - - .. topic:: Links - :class: print-block - - .. target-notes:: - :class: print-inline - - This acts like macros. The directive contents will have to be - evaluated when referenced, not when defined. - - * Needs a better name? "Macro", "substitution"? - * What to do with directive arguments & options when the - macro/directive is referenced? - - - .. _conditional directives: - - Docutils already has the ability to say "use this content for - Writer X" (via the "raw" directive), but it doesn't have the - ability to say "use this content for any Writer other than X". It - wouldn't be difficult to add this ability though. - - My first idea would be to add a set of conditional directives. - Let's call them "writer-is" and "writer-is-not" for discussion - purposes (don't worry about implementation details). We might - have:: - - ..
writer-is:: text-only - - :: - - +----------+ - | SNMP | - +----------+ - | UDP | - +----------+ - | IP | - +----------+ - | Ethernet | - +----------+ - - .. writer-is:: pdf - - .. figure:: protocol_stack.eps - - .. writer-is-not:: text-only pdf - - .. figure:: protocol_stack.png - - This could be an interface to the Filter transform - (docutils.transforms.components.Filter). - - The ideas in `adaptable file extensions`_ above may also be - applicable here. - - SVG's "switch" statement may provide inspiration. - - Here's an example of a directive that could produce multiple - outputs (*both* raw troff pass-through *and* a GIF, for example) - and allow the Writer to select. :: - - .. eqn:: - - .EQ - delim %% - .EN - %sum from i=o to inf c sup i~=~lim from {m -> inf} - sum from i=0 to m sup i% - .EQ - delim off - .EN - - - _`body.example`: Examples; suggested by Simon Hefti. Semantics as - per Docbook's "example"; admonition-style, numbered, reference, - with a caption/title. - - - _`body.index`: Index targets. - - See `Index Entries & Indexes - <./rst/alternatives.html#index-entries-indexes>`__. - - - _`body.literal`: Literal block, possibly "formal" (see `object - numbering and object references`_ above). Possible options: - - - "highlight" a range of lines - - - include only a specified range of lines - - - "number" or "line-numbers" - - - "styled" could indicate that the directive should check for - style comments at the end of lines to indicate styling or - markup. - - Specific derivatives (i.e., a "python-interactive" directive) - could interpret style based on cues, like the ">>> " prompt and - "input()"/"raw_input()" calls. - - See docutils-users 2003-03-03. - - - _`body.listing`: Code listing with title (to be numbered - eventually), equivalent of "figure" and "table" directives. - - - _`colorize.python`: Colorize Python code. Fine for HTML output, - but what about other formats? Revert to a literal block? Do we - need some kind of "alternate" mechanism? 
Perhaps use a "pending" - transform, which could switch its output based on the "format" in - use. Use a factory function "transformFF()" which returns either - "HTMLTransform()" instance or "GenericTransform" instance? - - If we take a Python-to-HTML pretty-printer and make it output a - Docutils internal doctree (as per nodes.py) instead of HTML, then - each output format's stylesheet (or equivalent) mechanism could - take care of the rest. The pretty-printer code could turn this - doctree fragment:: - - <literal_block xml:space="preserve"> - print 'This is Python code.' - for i in range(10): - print i - </literal_block> - - into something like this ("</>" is end-tag shorthand):: - - <literal_block xml:space="preserve" class="python"> - <keyword>print</> <string>'This is Python code.'</> - <keyword>for</> <identifier>i</> <keyword - >in</> <expression>range(10)</>: - <keyword>print</> <expression>i</> - </literal_block> - - But I'm leaning toward adding a single new general-purpose - element, "phrase", equivalent to HTML's <span>. Here's the - example rewritten using the generic "phrase":: - - <literal_block xml:space="preserve" class="python"> - <phrase class="keyword">print</> <phrase - class="string">'This is Python code.'</> - <phrase class="keyword">for</> <phrase - class="identifier">i</> <phrase class="keyword">in</> <phrase - class="expression">range(10)</>: - <phrase class="keyword">print</> <phrase - class="expression">i</> - </literal_block> - - It's more verbose but more easily extensible and more appropriate - for the case at hand. It allows us to edit style sheets to add - support for new formats, not the Docutils code itself. - - Perhaps a single directive with a format parameter would be - better:: - - .. colorize:: python - - print 'This is Python code.' - for i in range(10): - print i - - But directives can have synonyms for convenience. "format:: - python" was suggested, but "format" seems too generic. 
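As a rough illustration of the "phrase" idea, here is a sketch that classifies Python tokens with the standard library's ``tokenize`` and ``keyword`` modules. The class names ("keyword", "identifier", "string", "comment") and the helper names are assumptions; a real implementation would build doctree nodes rather than plain tuples, and this sketch tokenizes Python 3 source, unlike the Python 2 sample above.

```python
import io
import keyword
import tokenize


def classify(tok):
    """Return a style class for a token, or None if it gets no styling."""
    if tok.type == tokenize.NAME:
        return 'keyword' if keyword.iskeyword(tok.string) else 'identifier'
    if tok.type == tokenize.STRING:
        return 'string'
    if tok.type == tokenize.COMMENT:
        return 'comment'
    return None


def styled_tokens(source):
    """Return (class, text) pairs for the styled tokens in source."""
    readline = io.StringIO(source).readline
    pairs = []
    for tok in tokenize.generate_tokens(readline):
        style = classify(tok)
        if style is not None:
            pairs.append((style, tok.string))
    return pairs
```

Each pair could then become one "phrase" element carrying the class, leaving the actual colours to each output format's stylesheet.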
- - - _`pysource.usage`: Extract a usage message from the program, - either by running it at the command line with a ``--help`` option - or through an exposed API. [Suggestion for Optik.] - - -Interpreted Text ----------------- - -Interpreted text is entirely a reStructuredText markup construct, a -way to get around built-in limitations of the medium. Some roles are -intended to introduce new doctree elements, such as "title-reference". -Others are merely convenience features, like "RFC". - -All supported interpreted text roles must already be known to the -Parser when they are encountered in a document. Whether pre-defined -in core/client code, or in the document, doesn't matter; the roles -just need to have already been declared. Adding a new role may -involve adding a new element to the DTD and may require extensive -support, therefore such additions should be well thought-out. There -should be a limited number of roles. - -The only place where no limit is placed on variation is at the start, -at the Reader/Parser interface. Transforms are inserted by the Reader -into the Transformer's queue, where non-standard elements are -converted. Once past the Transformer, no variation from the standard -Docutils doctree is possible. - -An example is the Python Source Reader, which will use interpreted -text extensively. The default role will be "Python identifier", which -will be further interpreted by namespace context into <class>, -<method>, <module>, <attribute>, etc. elements (see pysource.dtd), -which will be transformed into standard hyperlink references, which -will be processed by the various Writers. No Writer will need to have -any knowledge of the Python-Reader origin of these elements. - -* Add explicit interpreted text roles for the rest of the implicit - inline markup constructs: named-reference, anonymous-reference, - footnote-reference, citation-reference, substitution-reference, - target, uri-reference (& synonyms). 
- -* Add directives for each role as well? This would allow indirect - nested markup:: - - This text contains |nested inline markup|. - - .. |nested inline markup| emphasis:: - - nested ``inline`` markup - -* Implement roles: - - - "_`raw-wrapped`" (or "_`raw-wrap`"): Base role to wrap raw text - around role contents. - - For example, the following reStructuredText source ... :: - - .. role:: red(raw-formatting) - :prefix: - :html: <font color="red"> - :latex: {\color{red} - :suffix: - :html: </font> - :latex: } - - colored :red:`text` - - ... will yield the following document fragment:: - - <paragraph> - colored - <inline classes="red"> - <raw format="html"> - <font color="red"> - <raw format="latex"> - {\color{red} - <inline classes="red"> - text - <raw format="html"> - </font> - <raw format="latex"> - } - - Possibly without the intermediate "inline" node. - - - "acronym" and "abbreviation": Associate the full text with a short - form. Jason Diamond's description: - - I want to translate ```reST`:acronym:`` into ``<acronym - title='reStructuredText'>reST</acronym>``. The value of the - title attribute has to be defined out-of-band since you can't - parameterize interpreted text. Right now I have them in a - separate file but I'm experimenting with creating a directive - that will use some form of reST syntax to let you define them. - - Should Docutils complain about undefined acronyms or - abbreviations? - - What to do if there are multiple definitions? How to - differentiate between CSS (Content Scrambling System) and CSS - (Cascading Style Sheets) in a single document? David Priest - responds, - - The short answer is: you don't. Anyone who did such a thing - would be writing very poor documentation indeed. (Though I - note that `somewhere else in the docs`__, there's mention of - allowing replacement text to be associated with the - abbreviation. 
That takes care of the duplicate - acronyms/abbreviations problem, though a writer would be - foolish to ever need it.) - - __ `inline parameter syntax`_ - - How to define the full text? Possibilities: - - 1. With a directive and a definition list? :: - - .. acronyms:: - - reST - reStructuredText - DPS - Docstring Processing System - - Would this list remain in the document as a glossary, or would - it simply build an internal lookup table? A "glossary" - directive could be used to make the intention clear. - Acronyms/abbreviations and glossaries could work together. - - Then again, a glossary could be formed by gathering individual - definitions from around the document. - - 2. Some kind of `inline parameter syntax`_? :: - - `reST <reStructuredText>`:acronym: is `WYSIWYG <what you - see is what you get>`:acronym: plaintext markup. - - .. _inline parameter syntax: - rst/alternatives.html#parameterized-interpreted-text - - 3. A combination of 1 & 2? - - The multiple definitions issue could be handled by establishing - rules of priority. For example, directive-based lookup tables - have highest priority, followed by the first inline definition. - Multiple definitions in directive-based lookup tables would - trigger warnings, similar to the rules of `implicit hyperlink - targets`__. - - __ ../ref/rst/restructuredtext.html#implicit-hyperlink-targets - - 4. Using substitutions? :: - - .. |reST| acronym:: reST - :text: reStructuredText - - What do we do for other formats than HTML which do not support - tool tips? Put the full text in parentheses? - - - "figure", "table", "listing", "chapter", "page", etc: See `object - numbering and object references`_ above. - - - "glossary-term": This would establish a link to a glossary. It - would require an associated "glossary-entry" directive, whose - contents could be a definition list:: - - .. 
glossary-entry:: - - term1 - definition1 - term2 - definition2 - - This would allow entries to be defined anywhere in the document, - and collected (via a "glossary" directive perhaps) at one point. - - -Unimplemented Transforms -======================== - -* _`Footnote & Citation Gathering` - - Collect and move footnotes & citations to the end of a document. - (Separate transforms.) - -* _`Reference Merging` - - When merging two or more subdocuments (such as docstrings), - conflicting references may need to be resolved. There may be: - - * duplicate reference and/or substitution names that need to be made - unique; and/or - * duplicate footnote numbers that need to be renumbered. - - Should this be done before or after reference-resolving transforms - are applied? What about references from within one subdocument to - inside another? - -* _`Document Splitting` - - If the processed document is written to multiple files (possibly in - a directory tree), it will need to be split up. Internal references - will have to be adjusted. - - (HTML only? Initially, yes. Eventually, anything should be - splittable.) - - Ideas: - - - Insert a "destination" attribute into the root element of each - split-out document, containing the path/filename. The Output - object or Writer will recognize this attribute and split out the - files accordingly. Must allow for common headers & footers, - prev/next, breadcrumbs, etc. - - - Transform a single-root document into a document containing - multiple subdocuments, recursively. The content model of the - "document" element would have to change to:: - - <!ELEMENT document - ( (title, subtitle?)?, - decoration?, - (docinfo, transition?)?, - %structure.model;, - document* )> - - (I.e., add the last line -- 0 or more document elements.) - - Let's look at the case of hierarchical (directories and files) - HTML output. 
Each document element containing further document - elements would correspond to a directory (with an index.html file - for the content preceding the subdocuments). Each document - element containing no subdocuments (i.e., structure model elements - only) corresponds to a concrete file with no directory. - - The natural transform would be to map sections to subdocuments, - but possibly only a given number of levels deep. - -* _`Navigation` - - If a document is split up, each segment will need navigation links: - parent, children (small TOC), previous (preorder), next (preorder). - Part of `Document Splitting`_? - -* _`List of System Messages` - - The ``system_message`` elements are inserted into the document tree, - adjacent to the problems themselves where possible. Some (those - generated post-parse) are kept until later, in - ``document.messages``, and added as a special final section, - "Docutils System Messages". - - Docutils could be made to generate hyperlinks to all known - system_messages and add them to the document, perhaps to the end of - the "Docutils System Messages" section. - - Fred L. Drake, Jr. wrote: - - I'd like to propose that both parse- and transformation-time - messages are included in the "Docutils System Messages" section. - If there are no objections, I can make the change. - - The advantage of the current way of doing things is that parse-time - system messages don't require a transform; they're already in the - document. This is valuable for testing (unit tests, - tools/quicktest.py). So if we do decide to make a change, I think - the insertion of parse-time system messages ought to remain as-is - and the Messages transform ought to move all parse-time system - messages (remove from their originally inserted positions, insert in - System Messages section). - -* _`Index Generation` - - -HTML Writer -=========== - -* Add support for _`multiple stylesheets`. See - <http://thread.gmane.org/gmane.text.docutils.cvs/4336>. 
- -* Idea for field-list rendering: hanging indent:: - - Field name (bold): First paragraph of field body begins - with the field name inline. - - If the first item of a field body is not a paragraph, - it would begin on the following line. - -* Add more support for <link> elements, especially for navigation - bars. - - The framework does not have a notion of document relationships, so - probably raw.destination_ should be used. - - We'll have framework support for document relationships when support - for `multiple output files`_ is added. The HTML writer could - automatically generate <link> elements then. - - .. _raw.destination: misc.raw_ - -* Base list compaction on the spacing of source list? Would require - parser support. (Idea: fantasai, 16 Dec 2002, doc-sig.) - -* Add a tool tip ("title" attribute?) to footnote back-links - identifying them as such. Text in Docutils language module. - - -PEP/HTML Writer -=============== - -* Remove the generic style information (duplicated from html4css1.css) - from pep.css to avoid redundancy. - - We need support for `multiple stylesheets`_ first, though. - - -LaTeX writer -============ - -* Add an ``--embed-stylesheet`` (and ``--link-stylesheet``) option. - - -HTML SlideShow Writer -===================== - -Add a Writer for presentations, derivative of the HTML Writer. Given -an input document containing one section per slide, the output would -consist of a master document for the speaker, and a slide file (or set -of files, one (or more) for each slide). Each slide would contain -the slide text (large, stylesheet-controlled) and images, plus "next" -and "previous" links in consistent places. The speaker's master -document would contain a small version of the slide text with -speaker's notes interspersed. The master document could use -``target="whatever"`` to direct links to a separate window on a second -monitor (e.g., a projector). - -Ideas: - -* Base the output on |S5|_.
I discovered |S5| a few weeks before it - appeared on Slashdot, after writing most of this section. It turns - out that |S5| does most of what I wanted. - - Chris Liechti has `integrated S5 with the HTML writer - <http://homepage.hispeed.ch/py430/python/index.html#rst2s5>`__. - - .. |S5| replace:: S\ :sup:`5` - .. _S5: http://www.meyerweb.com/eric/tools/s5/ - -Below, "[S5]" indicates that |S5| already implements the feature or -may implement all or part of the feature. "[S5 1.1]" indicates that -|S5| version 1.1 implements the feature (a preview of the 1.1 beta is -available in the `S5 testbed`_). - -.. _S5 testbed: http://meyerweb.com/eric/tools/s5/testbed/ - -Features & issues: - -* [S5 1.1] Incremental slides, where each slide adds to the one before - (ticking off items in a list, delaying display of later items). The - speaker's master document would list each transition in the TOC and - provide links in the content. - - * Use transitions to separate stages. Problem with transitions is - that they can't be used everywhere -- not, for example, within a - list (see the example below). - - * Use a special directive to separate stages. Possible names: - pause, delay, break, cut, continue, suspend, hold, stay, stop. - Should the directive be available in all contexts (and ineffectual - in all but SlideShow context), or added at runtime by the - SlideShow Writer? Probably such a "pause" directive should only - be available for slide shows; slide shows are too much of a - special case to justify adding a directive (and node?) to the - core. - - The directive could accept text content, which would be rendered - while paused but would disappear when the slide is continued (the - text could also be a link to the next slide). In the speaker's - master document, the text "paused:" could appear, prefixed to the - directive text. - - * Use a special directive or class to declare incremental content. - This works best with the S5 ideas. 
For example:: - - Slide Title - =========== - - .. incremental:: - - * item one - * item two - * item three - - Add an option to make all bullet lists implicitly incremental? - -* Speaker's notes -- how to intersperse? Could use reST comments - (".."), but make them visible in the speaker's master document. If - structure is necessary, we could use a "comment" directive (to avoid - nonsensical DTD changes, the "comment" directive could produce an - untitled topic element). - - The speaker's notes could (should?) be separate from S5's handout - content. - -* The speaker's master document could use frames for easy navigation: - TOC on the left, content on the right. - - - It would be nice if clicking in the TOC frame simultaneously - linked to both the speaker's notes frame and to the slide window, - synchronizing both. Needs JavaScript? - - - TOC would have to be tightly formatted -- minimal indentation. - - - TOC auto-generated, as in the PEP Reader. (What if there already - is a "contents" directive in the document?) - - - There could be another frame on the left (top-left or bottom-left) - containing a single "Next" link, always pointing to the next slide - (synchronized, of course). Also "Previous" link? FF/Rew go to - the beginning of the next/current parent section? First/Last - also? Tape-player-style buttons like ``|<< << < > >> >>|``? - -* [S5] Need to support templating of some kind, for uniform slide - layout. S5 handles this via CSS. - - Build in support for limited features? E.g., top/bottom or - left/right banners, images on each page, background color and/or - image, etc. - -* [S5?] One layout for all slides, or allow some variation? - - While S5 seems to support only one style per HTML file, it's - pretty easy to split a presentation in different files and - insert a hyperlink to the last slide of the first part and load - the second part by a click on it. 
-
-    -- Chris Liechti
-
-* For nested sections, do we show the section's ancestry on each
-  slide? Optional? No -- leave the implementation to someone who
-  wants it.
-
-* [S5] Stylesheets for slides:
-
-  - Tweaked for different resolutions, 1024x768 etc.
-  - Some layout elements have fixed positions.
-  - Text must be quite large.
-  - Allow 10 lines of text per slide? 15?
-  - Title styles vary by level, but not so much?
-
-* [not required with S5.] Need a transform to number slides for
-  output filenames?, and for hyperlinks?
-
-* Directive to begin a new, untitled (blank) slide?
-
-* Directive to begin a new slide, continuation, using the same title
-  as the previous slide? (Unnecessary?)
-
-* Have a timeout on incremental items, so the colour goes away after 1
-  second.
-
-Here's an example that I was hoping to show at PyCon DC 2005::
-
-    ========================
-     The Docutils SlideShow
-    ========================
-
-    Welcome To The Docutils SlideShow!
-    ==================================
-
-    .. pause::
-
-    David Goodger
-
-    goodger@python.org
-
-    http://python.net/~goodger
-
-    .. (introduce yourself)
-
-    Hi, I'm David Goodger from Montreal, Canada.
-
-    I've been working on Docutils since 2000.
-    Time flies!
-
-    .. pause::
-
-    Docutils
-
-    http://docutils.sourceforge.net
-
-    .. I also volunteer as a Python Enhancement Proposal (or PEP)
-       editor.
-
-    .. SlideShow is a new feature of Docutils. This presentation was
-       written using the Docutils SlideShow system. The slides you
-       are seeing are HTML, rendered by a standard Mozilla Firefox
-       browser.
-
-
-    The Docutils SlideShow System
-    =============================
-
-    .. The Docutils SlideShow System provides
-
-    Easy and open presentations.
-
-
-    Features
-    ========
-
-    * reStructuredText-based input files.
-
-      .. reStructuredText is a what-you-see-is-what-you-get
-         plaintext format. Easy to read & write, non-proprietary,
-         editable in your favourite text editor.
-
-      ..
Parsers for other markup languages can be added to Docutils.
-         In the future, I hope some are.
-
-      .. pause:: ...
-
-    * Stylesheet-driven HTML output.
-
-      .. The format of all elements of the output slides are
-         controlled by CSS (cascading stylesheets).
-
-      .. pause:: ...
-
-    * Works with any modern browser.
-
-      .. that supports CSS, frames, and JavaScript.
-         Tested with Mozilla Firefox.
-
-      .. pause:: ...
-
-    * Works on any OS.
-
-
-    Etc.
-    ====
-
-    That's as far as I got, but you get the idea...
-
-
-Front-End Tools
-===============
-
-* What about if we don't know which Reader and/or Writer we are
-  going to use? If the Reader/Writer is specified on the
-  command-line? (Will this ever happen?)
-
-  Perhaps have different types of front ends:
-
-  a) _`Fully qualified`: Reader and Writer are hard-coded into the
-     front end (e.g. ``pep2html [options]``, ``pysource2pdf
-     [options]``).
-
-  b) _`Partially qualified`: Reader is hard-coded, and the Writer is
-     specified a sub-command (e.g. ``pep2 html [options]``,
-     ``pysource2 pdf [options]``). The Writer is known before option
-     processing happens, allowing the OptionParser to be built
-     dynamically. Alternatively, the Writer could be hard-coded and
-     the Reader specified as a sub-command (e.g. ``htmlfrom pep
-     [options]``).
-
-  c) _`Unqualified`: Reader and Writer are specified as subcommands
-     (e.g. ``publish pep html [options]``, ``publish pysource pdf
-     [options]``). A single front end would be sufficient, but
-     probably only useful for testing purposes.
-
-  d) _`Dynamic`: Reader and/or Writer are specified by options, with
-     defaults if unspecified (e.g. ``publish --writer pdf
-     [options]``). Is this possible? The option parser would have
-     to be told about new options it needs to handle, on the fly.
-     Component-specific options would have to be specified *after*
-     the component-specifying option.
-
-  Allow common options before subcommands, as in CVS? Or group all
-  options together?
In the case of the `fully qualified`_
-  front ends, all the options will have to be grouped together
-  anyway, so there's no advantage (we can't use it to avoid
-  conflicts) to splitting common and component-specific options
-  apart.
-
-* Parameterize help text & defaults somehow? Perhaps a callback? Or
-  initialize ``settings_spec`` in ``__init__`` or ``init_options``?
-
-* Disable common options that don't apply?
-
-* Add ``--section-numbering`` command line option. The "sectnum"
-  directive should override the ``--no-section-numbering`` command
-  line option then.
-
-* Create a single dynamic_ or unqualified_ front end that can be
-  installed?
-
-
-..
-   Local Variables:
-   mode: indented-text
-   indent-tabs-mode: nil
-   sentence-end-double-space: t
-   fill-column: 70
-   End:
diff --git a/docutils/docs/dev/website.txt b/docutils/docs/dev/website.txt
deleted file mode 100644
index 193e9c0f2..000000000
--- a/docutils/docs/dev/website.txt
+++ /dev/null
@@ -1,46 +0,0 @@
-===================
- Docutils Web Site
-===================
-
-:Author: David Goodger; open to all Docutils developers
-:Contact: goodger@python.org
-:Date: $Date$
-:Revision: $Revision$
-:Copyright: This document has been placed in the public domain.
-
-The Docutils web site, <http://docutils.sourceforge.net/>, is
-maintained automatically by the ``docutils-update`` script, run as an
-hourly cron job on shell.berlios.de (by user "felixwiemann"). The
-script will process any .txt file which is newer than the
-corresponding .html file in the project's web directory on
-shell.berlios.de (``/home/groups/docutils/htdocs/aux/htdocs/``) and
-upload the changes to the web site at SourceForge. For a new .txt
-file, just SSH to ``<username>@shell.berlios.de`` and ::
-
-    cd /home/groups/docutils/htdocs/aux/htdocs/
-    touch filename.html
-    chmod g+w filename.html
-    sleep 1
-    touch filename.txt
-
-The script will take care of the rest within an hour.
Thereafter
-whenever the .txt file is modified (checked in to SVN), the .html will
-be regenerated automatically.
-
-After adding directories to SVN, allow the script to run once to
-create the directories in the filesystem before preparing for HTML
-processing as described above.
-
-The docutils-update__ script is located at
-``sandbox/infrastructure/docutils-update``.
-
-__ http://docutils.sf.net/sandbox/infrastructure/docutils-update
-
-
-..
-   Local Variables:
-   mode: indented-text
-   indent-tabs-mode: nil
-   sentence-end-double-space: t
-   fill-column: 70
-   End: