.. -*- rst-mode -*-
Syntax Highlight
================
.. contents::
.. sectnum::
Syntax highlighting significantly enhances the readability of code. However,
in the current version, docutils does not highlight literal blocks.
This sandbox project aims to add syntax highlighting of code blocks to the
capabilities of docutils. To find its way into the docutils core, it should
meet the requirements laid out in a mail on `Questions about writing
programming manuals and scientific documents`__, by docutils main developer
David Goodger:

  I'd be happy to include Python source colouring support, and other
  languages would be welcome too. A multi-language solution would be
  useful, of course. My issue is providing support for all output formats
  -- HTML and LaTeX and XML and anything in the future -- simultaneously.
  Just HTML isn't good enough. Until there is a generic-output solution,
  this will be something users will have to put together themselves.

__ http://sourceforge.net/mailarchive/message.php?msg_id=12921194
State of the art
----------------
There are already docutils extensions providing syntax colouring, e.g.:
SilverCity_,
a C++ library and Python extension that can provide lexical
analysis for over 20 different programming languages. A recipe__ for a
"code-block" directive provides syntax highlighting with SilverCity.
__ http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/252170
`listings`_,
a LaTeX package providing highly customisable and advanced
syntax highlighting, though only for LaTeX (and LaTeX-derived PS or PDF).
See also section `listings.sty`_ and a proposal__ by Gael Varoquaux.
__ http://article.gmane.org/gmane.text.docutils.devel/3914
Trac_
has `reStructuredText support`__ and offers syntax highlighting with
a "code-block" directive using GNU Enscript_, SilverCity_, or Pygments_.
__ http://trac.edgewall.org/wiki/WikiRestructuredText
rest2web_,
the "site builder", provides the `colorize`__ macro (using the
`Moin-Moin Python colorizer`_).
__ http://www.voidspace.org.uk/python/rest2web/macros.html#colorize
Pygments_
is a generic syntax highlighter written completely in Python.
* Usable as a command-line tool and as a Python package.
* A wide range of common `languages and markup formats`_ is supported.
* Additionally, OpenOffice's ``*.odt`` is supported by the odtwriter_.
* The layout is configurable by style sheets.
* Several built-in styles and an option for line-numbering.
* Built-in output formats include HTML, LaTeX, and RTF.
* Support for new languages, formats, and styles is added easily (modular
structure, Python code, existing documentation).
* Well documented and actively maintained.
* The web site provides a recipe for `using Pygments in ReST documents`_.
It is used in the `Pygments enhanced docutils front-ends`_ below.
Odtwriter_, an experimental writer for Docutils OpenOffice export, supports
syntax colours using Pygments. (See section `Odtwriter syntax`_.)
Pygments_ seems to be the most promising docutils highlighter. For printed
output, the listings_ package has its advantages too.
Pygments enhanced docutils front-ends
-------------------------------------
Syntax highlighting can be achieved with `front-end scripts`_ combining docutils and
pygments.
Advantages:
+ Easy implementation with no changes to the stock docutils_.
+ Separation of code blocks and ordinary literal blocks.
Disadvantages:
1. "code-block" content is formatted by `pygments`_ and inserted in the
   document tree as a "raw" node, making the approach writer-dependent.
2. Documents are incompatible with standard docutils because of the
   locally defined directive.
3. More "invasive" markup distracts from the content
   (no "minimal" code block marker -- three additional lines per code block).
Points 1 and 2 lead to the `code-block directive proposal`_. Point 3
becomes an issue in literate programming, where a code block is the most used
block markup. It is addressed in the proposal for a `configurable literal
block directive`_.
`code-block` directive proposal
-------------------------------
Reading
"""""""
Felix Wiemann provided a `proof of concept`_ script that utilizes the
pygments_ parser to parse a source code string and store the result in
the document tree.
This concept is used in the `pygments_code_block_directive`_ (source:
`pygments_code_block_directive.py`_) to define and register a "code-block"
directive.
* The ``DocutilsInterface`` class uses pygments to parse the content of the
directive and classify the tokens using short CSS class names identical to
pygments HTML output. If pygments is not available, the unparsed code is
returned.
* The ``code_block_directive`` function inserts the tokens in a "rich"
<literal_block> element with "classified" <inline> nodes.
The XML rendering of the small example file `myfunction.py.txt`_ looks like
`myfunction.py.xml`_.
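
The two steps above (classifying tokens, then storing them in a "rich"
<literal_block>) might be sketched as follows. This is an illustration only,
not the code of ``pygments_code_block_directive.py``: the helper names
``classify_tokens`` and ``rich_literal_block`` are hypothetical, the sketch
assumes that ``pygments.token.STANDARD_TYPES`` supplies the short class names,
and it builds plain XML with the standard library instead of real docutils
nodes. The ``classes`` values on the block ("code" plus the language) are an
assumption as well.

```python
import xml.etree.ElementTree as ET

try:
    from pygments import lex
    from pygments.lexers import get_lexer_by_name
    from pygments.token import STANDARD_TYPES
    HAVE_PYGMENTS = True
except ImportError:  # pygments is optional
    HAVE_PYGMENTS = False


def classify_tokens(code, language):
    """Yield (css_class, text) pairs for `code` (hypothetical helper)."""
    if not HAVE_PYGMENTS:
        # Fallback: return the unparsed code as one token without a class.
        yield "", code
        return
    lexer = get_lexer_by_name(language)
    for token_type, text in lex(code, lexer):
        # STANDARD_TYPES maps e.g. Token.Keyword to "k"; plain text maps to "".
        yield STANDARD_TYPES.get(token_type, ""), text


def rich_literal_block(tokens, language):
    """Build a <literal_block> holding one <inline> per classified token."""
    block = ET.Element("literal_block", {"classes": "code " + language})
    last = None
    for css_class, text in tokens:
        if css_class:
            last = ET.SubElement(block, "inline", {"classes": css_class})
            last.text = text
        elif last is None:
            # Unclassified text before the first <inline> element.
            block.text = (block.text or "") + text
        else:
            # Unclassified text following an <inline> element.
            last.tail = (last.tail or "") + text
    return block


source = "def hello():\n    return 'hello world'\n"
block = rich_literal_block(classify_tokens(source, "python"), "python")
xml_text = ET.tostring(block, encoding="unicode")
```

Unclassified text is kept as ordinary character data, so the element still
carries the complete source even when no token receives a class.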
Writing
"""""""
The writers can use the class information in the <inline> elements to render
the tokens. They should ignore the class information if they are unable to
use it or to pass it on.
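
The two options open to a writer can be illustrated with a small sketch. It
is not taken from any docutils writer; the function names and the
``<pre class="code">`` wrapper are hypothetical. One function passes the class
information on as HTML ``<span>`` elements, the other ignores it and emits the
bare text.

```python
from html import escape


def render_html(tokens):
    """Pass the class information on as <span class="..."> elements."""
    parts = []
    for css_class, text in tokens:
        if css_class:
            parts.append('<span class="%s">%s</span>' % (css_class, escape(text)))
        else:
            parts.append(escape(text))
    return '<pre class="code">%s</pre>' % "".join(parts)


def render_plain(tokens):
    """Ignore the class information and keep only the text."""
    return "".join(text for _, text in tokens)


# (class, text) pairs as they might come out of the reading step.
tokens = [("k", "if"), ("", " a "), ("o", "<"), ("", " b:")]
html_output = render_html(tokens)
plain_output = render_plain(tokens)
```

Either way the text content is the same; only the markup around it differs.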
HTML
The "html" writer works out of the box.
* The rst2html-highlight_ front end registers the "code-block" directive and
converts an input file to html.
* Styling is done with the adapted CSS style sheet `pygments-default.css`_
based on docutils' default stylesheet and the output of
``pygmentize -S default -f html``.
* The result looks like `myfunction.py.htm`_.
The "s5" and "pep" writers are not tested yet.
XML
"xml" and "pseudoxml" work out of the box, too. See `myfunction.py.xml`_
and `myfunction.py.pseudoxml`_.
LaTeX
LaTeX writers must be updated to handle the "rich" <literal_block> element
correctly.
* The "latex" writer currently fails to handle "classified" <inline>
doctree elements. The output `myfunction.py.tex`_ contains undefined
control sequences ``\docutilsroleNone``.
* The "newlatex2e" writer produces a valid LaTeX document
(`myfunction.py.newlatex2e.tex`_). However, the `pdflatex` output looks
a bit mixed up (`myfunction.py.newlatex2e.pdf`_).
The pygments-produced style file will not currently work with
"newlatex2e" output.
OpenOffice
The unofficial "odtwriter" provides syntax highlighting with
pygments but uses a different syntax.
TODO
""""
* Fix the "latex" writers.
* Think about an interface for pygments' options (like "encoding" or
  "linenumbers").
.. _proof of concept:
http://article.gmane.org/gmane.text.docutils.user/3689
.. _pygments_code_block_directive.py: ../pygments_code_block_directive.py
.. _pygments_code_block_directive: pygments_code_block_directive-bunt.py.htm
.. _pygments_docutils_interface.py: pygments_docutils_interface.py
.. _myfunction.py.txt: myfunction.py.txt
.. _myfunction.py.xml: myfunction.py.xml
.. _myfunction.py.htm: myfunction.py.htm
.. _myfunction.py.pseudoxml: myfunction.py.pseudoxml
.. _myfunction.py.tex: myfunction.py.tex
.. _myfunction.py.newlatex2e.tex: myfunction.py.newlatex2e.tex
.. _myfunction.py.newlatex2e.pdf: myfunction.py.newlatex2e.pdf
.. _rst2html-highlight: ../rst2html-highlight
.. _pygments-long.css: ../data/pygments-long.css
Configurable literal block directive
------------------------------------
Goal
""""
A clean and simple syntax for highlighted code blocks -- preserving the
space-saving feature of the "minimised" literal block marker (``::`` at the
end of a text paragraph). This is especially desirable in documents with
many code blocks like tutorials or literate programs.
Inline analogue
"""""""""""""""
The *role* of inline `interpreted text` can be customised with the
"default-role" directive. This allows the use of the concise "backtick"
syntax for the most frequently used role. For example, in a chemistry paper,
one could use::

  .. default-role:: subscript

  The triple point of H\ `2`\O is at 0.01 °C.

.. default-role:: subscript

to produce

  The triple point of H\ `2`\O is at 0.01 °C.

This customisation is currently not possible for block markup.
Proposal
""""""""
* Define a new "literal" directive for an ordinary literal block.
  This would insert the block content into the document tree as a
  <literal_block> element with no parsing.
* Define a "literal-block" setting that controls which directive is called
  on a block following ``::``. The default would be the "literal" directive.
  Alternatively, define a new "default-literal-block" directive instead of
  a settings key.
* From a syntax point of view, this would be analogous to the behaviour of
  the odtwriter_.
  (I am not sure about the representation in the document tree, though.)
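
How such a setting could select the directive for a ``::`` block is sketched
below. Nothing here is docutils API: the registry, the handler functions, and
the dictionary standing in for a document tree node are hypothetical, and the
setting value format (directive name followed by arguments) is taken from the
"literal-block" example in this proposal.

```python
def literal(content, arguments):
    """Ordinary literal block: the content is stored without parsing."""
    return {"element": "literal_block", "classes": [], "text": content}


def code_block(content, arguments):
    """Code block: record the language so a highlighter can be applied."""
    return {
        "element": "literal_block",
        "classes": ["code"] + arguments[:1],
        "text": content,
    }


# Hypothetical registry of directives that may handle a block following "::".
DIRECTIVES = {"literal": literal, "code-block": code_block}


def handle_literal_block(content, setting="literal"):
    """Dispatch a "::" block to the directive named by the setting."""
    name, *arguments = setting.split()
    return DIRECTIVES[name](content, arguments)


plain = handle_literal_block("some text typeset in monospace")
coloured = handle_literal_block("def hello(): pass", "code-block python")
```

With the default setting the block stays an ordinary literal block; switching
the setting changes the handler without touching the ``::`` markup itself.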
Motive
""""""
Analogous to customising the default role of "interpreted text" with the
"default-role" directive, the concise ``::`` literal-block markup could be
used for, e.g.,
* a "code-block" or "sourcecode" directive for colourful code
  (analogous to the one in the `pygments enhanced docutils front-ends`_)
* the "line-block" directive for poems or addresses
* the "parsed-literal" directive
Example (using the upcoming "settings" directive)::

  ordinary literal block::

     some text typeset in monospace

  .. settings::
     :literal-block: code-block python

  colourful Python code::

     def hello():
         print("hello world")

Along the same lines, a "default-block-quote" setting or directive could be
considered to configure the role of a block quote.
Odtwriter syntax
----------------
Dave Kuhlman's odtwriter_ extension can add syntax highlighting
to ordinary literal blocks.
The ``--add-syntax-highlighting`` command line flag activates syntax
highlighting in literal blocks. By default, the "python" lexer is used.
You can change this within your reST document with the `sourcecode`
directive::

  .. sourcecode:: off

  ordinary literal block::

     content set in teletype

  .. sourcecode:: on
  .. sourcecode:: python

  colourful Python code::

     def hello():
         print("hello world")

The "sourcecode" directive defined by the odtwriter is fundamentally
different from the "code-block" directive of ``rst2html-pygments``:
* The odtwriter directive does not have content. It is a switch.
* The syntax highlighting state and language/lexer set by this directive
remain in effect until the next sourcecode directive is encountered in the
reST document.
``.. sourcecode:: <newstate>``
make highlighting active or inactive.
<newstate> is either ``on`` or ``off``.
``.. sourcecode:: <lexer>``
change the lexer parsing literal code blocks.
<lexer> should be one of the aliases listed in Pygments' `languages and
markup formats`_.
That is, the odtwriter implements a `configurable literal block directive`_
(but with a slightly different syntax from the proposal above).
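
The switch semantics can be modelled in a few lines. This sketch is not
odtwriter code; the class and function names are hypothetical, and the
document is reduced to a list of events. The initial state (highlighting
active, "python" lexer) assumes the ``--add-syntax-highlighting`` flag was
given.

```python
class SourcecodeSwitch:
    """Highlighting state that persists until the next sourcecode directive."""

    def __init__(self, active=True, lexer="python"):
        self.active = active
        self.lexer = lexer

    def directive(self, argument):
        if argument in ("on", "off"):
            # ".. sourcecode:: <newstate>" toggles highlighting.
            self.active = argument == "on"
        else:
            # ".. sourcecode:: <lexer>" selects the lexer for later blocks.
            self.lexer = argument

    def lexer_for_block(self):
        return self.lexer if self.active else None


def lexers_for_document(events):
    """Return (block text, lexer or None) for every literal block."""
    switch = SourcecodeSwitch()
    result = []
    for kind, value in events:
        if kind == "sourcecode":
            switch.directive(value)
        else:  # a literal block
            result.append((value, switch.lexer_for_block()))
    return result


# The example document from this section, reduced to its events.
events = [
    ("sourcecode", "off"),
    ("block", "content set in teletype"),
    ("sourcecode", "on"),
    ("sourcecode", "python"),
    ("block", "def hello(): ..."),
]
assignments = lexers_for_document(events)
```

The first block is left unhighlighted and the second one is assigned the
"python" lexer, because each directive only changes state for the blocks
that follow it.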
``listings.sty``
----------------
Using the listings_ LaTeX package for syntax highlighting is currently not
possible with the standard latex writer output.
Support for the use of listings_ with docutils is an issue that must be
settled separately from the `code-block directive proposal`_. It needs
* a new, specialized docutils latex writer, or
* a new option (and behaviour) to the existing latex writer.
Ideas and experimental code are in the sandbox under `latex-variants`_.
.. External links
.. _pylit: http://pylit.berlios.de
.. _docutils: http://docutils.sourceforge.net/
.. _rest2web: http://www.voidspace.org.uk/python/rest2web/
.. _Enscript: http://www.gnu.org/software/enscript/enscript.html
.. _SilverCity: http://silvercity.sourceforge.net/
.. _Trac: http://trac.edgewall.org/
.. _Moin-Moin Python colorizer:
http://www.standards-schmandards.com/2005/fangs-093/
.. _odtwriter: http://www.rexx.com/~dkuhlman/odtwriter.html
.. _pygments: http://pygments.org/
.. _listings:
http://www.ctan.org/tex-archive/help/Catalogue/entries/listings.html
.. _fancyvrb:
http://www.ctan.org/tex-archive/help/Catalogue/entries/fancyvrb.html
.. _alltt: http://www.ctan.org/tex-archive/help/Catalogue/entries/alltt.html
.. _moreverb:
http://www.ctan.org/tex-archive/help/Catalogue/entries/moreverb.html
.. _verbatim:
http://www.ctan.org/tex-archive/help/Catalogue/entries/verbatim.html
.. _languages and markup formats: http://pygments.org/languages
.. _Using Pygments in ReST documents: http://pygments.org/docs/rstdirective/
.. _Docutils Document Tree:
http://docutils.sf.net/docs/ref/doctree.html#classes
.. _latex-variants: http://docutils.sourceforge.net/sandbox/latex-variants/
.. Internal links
.. _front-end scripts: ../tools/pygments-enhanced-front-ends
.. _pygments-default.css: ../data/pygments-default.css