summaryrefslogtreecommitdiff
path: root/docutils/docs/api/publisher.txt
blob: 73cfc0ef2d70327d73a16254144d8c3a787738e4 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
========================
 The Docutils Publisher
========================

:Author: David Goodger
:Contact: goodger@python.org
:Date: $Date$
:Revision: $Revision$
:Copyright: This document has been placed in the public domain.

.. contents::


Publisher Convenience Functions
===============================

Each of these functions set up a ``docutils.core.Publisher`` object,
then call its ``publish`` method.  ``docutils.core.Publisher.publish``
handles everything else.  There are five convenience functions in the
``docutils.core`` module:

:_`publish_cmdline`: for command-line front-end tools, like
  ``rst2html.py``.  There are several examples in the ``tools/``
  directory.  A detailed analysis of one such tool is in `Inside A
  Docutils Command-Line Front-End Tool`_

:_`publish_file`: for programmatic use with file-like I/O.  In
  addition to writing the encoded output to a file, also returns the
  encoded output as a string.

:_`publish_string`: for programmatic use with string I/O.  Returns
  the encoded output as a string.

:_`publish_parts`: for programmatic use with string input; returns a
  dictionary of document parts.  Dictionary keys are the names of
  parts, and values are Unicode strings; encoding is up to the client.
  Useful when only portions of the processed document are desired.
  See `publish_parts Details`_ below.

  There are usage examples in the `docutils/examples.py`_ module.

:_`publish_programmatically`: for custom programmatic use.  This
  function implements common code and is used by ``publish_file``,
  ``publish_string``, and ``publish_parts``.  It returns a 2-tuple:
  the encoded string output and the Publisher object.

.. _Inside A Docutils Command-Line Front-End Tool: ./cmdline-tool.html
.. _docutils/examples.py: ../../docutils/examples.py


Configuration
-------------

To pass application-specific setting defaults to the Publisher
convenience functions, use the ``settings_overrides`` parameter.  Pass
a dictionary of setting names & values, like this::

    overrides = {'input_encoding': 'ascii',
                 'output_encoding': 'latin-1'}
    output = publish_string(..., settings_overrides=overrides)

Settings from command-line options override configuration file
settings, and they override application defaults.  For details, see
`Docutils Runtime Settings`_.  See `Docutils Configuration Files`_ for
details about individual settings.

.. _Docutils Runtime Settings: ./runtime-settings.html
.. _Docutils Configuration Files: ../user/tools.html


Encodings
---------

The default output encoding of Docutils is UTF-8.  If you have any
non-ASCII in your input text, you may have to do a bit more setup.
Docutils may introduce some non-ASCII text if you use
`auto-symbol footnotes`_ or the `"contents" directive`_.

.. _auto-symbol footnotes:
   ../ref/rst/restructuredtext.html#auto-symbol-footnotes
.. _"contents" directive:
   ../ref/rst/directives.html#table-of-contents


``publish_parts`` Details
=========================

The ``docutils.core.publish_parts`` convenience function returns a
dictionary of document parts.  Dictionary keys are the names of parts,
and values are Unicode strings.

Each Writer component may publish a different set of document parts,
described below.  Currently only the HTML Writer implements more than
the "whole" part.


Parts Provided By All Writers
-----------------------------

_`whole`
    ``parts['whole']`` contains the entire formatted document.


Parts Provided By the HTML Writer
---------------------------------

_`body`
    ``parts['body']`` is equivalent to parts['fragment_'].  It is
    *not* equivalent to parts['html_body_'].

_`docinfo`
    ``parts['docinfo']`` contains the document bibliographic data.

_`footer`
    ``parts['footer']`` contains the document footer content, meant to
    appear at the bottom of a web page, or repeated at the bottom of
    every printed page.

_`fragment`
    ``parts['fragment']`` contains the document body (*not* the HTML
    ``<body>``).  In other words, it contains the entire document,
    less the document title, subtitle, docinfo, header, and footer.

_`header`
    ``parts['header']`` contains the document header content, meant to
    appear at the top of a web page, or repeated at the top of every
    printed page.

_`html_body`
    ``parts['html_body']`` contains the HTML ``<body>`` content, less
    the ``<body>`` and ``</body>`` tags themselves.

_`html_head`
    ``parts['html_head']`` contains the HTML ``<head>`` content, less
    the stylesheet link and the ``<head>`` and ``</head>`` tags
    themselves.  Since ``publish_parts`` returns Unicode strings and
    does not know about the output encoding, the "Content-Type" meta
    tag's "charset" value is left unresolved, as "%s"::

        <meta http-equiv="Content-Type" content="text/html; charset=%s" />

    The interpolation should be done by client code.

_`html_prolog`
    ``parts['html_prolog]`` contains the XML declaration and the
    doctype declaration.  The XML declaration's "encoding" attribute's
    value is left unresolved, as "%s"::

        <?xml version="1.0" encoding="%s" ?>

    The interpolation should be done by client code.

_`html_subtitle`
    ``parts['html_subtitle']`` contains the document subtitle,
    including the enclosing ``<h2 class="subtitle">`` & ``</h2>``
    tags.

_`html_title`
    ``parts['html_title']`` contains the document title, including the
    enclosing ``<h1 class="title">`` & ``</h1>`` tags.

_`meta`
    ``parts['meta']`` contains all ``<meta ... />`` tags.

_`stylesheet`
    ``parts['stylesheet']`` contains the document stylesheet link.

_`subtitle`
    ``parts['subtitle']`` contains the document subtitle text and any
    inline markup.  It does not include the enclosing ``<h2>`` &
    ``</h2>`` tags.

_`title`
    ``parts['title']`` contains the document title text and any inline
    markup.  It does not include the enclosing ``<h1>`` & ``</h1>``
    tags.