1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
|
.. -*- mode: rst -*-
=====================
The full Pygments API
=====================
This page describes the Pygments API.
High-level API
==============
Functions from the `pygments` module:
def `lex(code, lexer):`
Lex `code` with the `lexer` (must be a `Lexer` instance)
and return an iterable of tokens. Currently, this only calls
`lexer.get_tokens()`.
def `format(tokens, formatter, outfile=None):`
Format a token stream (iterable of tokens) `tokens` with the
`formatter` (must be a `Formatter` instance). The result is
written to `outfile`, or if that is ``None``, returned as a
string.
def `highlight(code, lexer, formatter, outfile=None):`
This is the most high-level highlighting function.
It combines `lex` and `format` in one function.
Functions from `pygments.lexers`:
def `get_lexer_by_name(alias, **options):`
Return an instance of a `Lexer` subclass that has `alias` in its
aliases list. The lexer is given the `options` at its
instantiation.
Will raise `ValueError` if no lexer with that alias is found.
def `get_lexer_for_filename(fn, **options):`
Return a `Lexer` subclass instance that has a filename pattern
matching `fn`. The lexer is given the `options` at its
instantiation.
Will raise `ValueError` if no lexer for that filename is found.
Functions from `pygments.formatters`:
def `get_formatter_by_name(alias, **options):`
Return an instance of a `Formatter` subclass that has `alias` in its
aliases list. The formatter is given the `options` at its
instantiation.
Will raise `ValueError` if no formatter with that alias is found.
def `get_formatter_for_filename(fn, **options):`
Return a `Formatter` subclass instance that has a filename pattern
matching `fn`. The formatter is given the `options` at its
instantiation.
Will raise `ValueError` if no formatter for that filename is found.
Lexers
======
A lexer (derived from `pygments.lexer.Lexer`) has the following functions:
def `__init__(self, **options):`
The constructor. Takes a \*\*keywords dictionary of options.
Every subclass must first process its own options and then call
the `Lexer` constructor, since it processes the `stripnl`,
`stripall` and `tabsize` options.
An example looks like this:
.. sourcecode:: python
def __init__(self, **options):
self.compress = options.get('compress', '')
Lexer.__init__(self, **options)
As these options must all be specifiable as strings (due to the
command line usage), there are various utility functions
available to help with that, see `Option processing`_.
def `get_tokens(self, text):`
This method is the basic interface of a lexer. It is called by
the `highlight()` function. It must process the text and return an
iterable of ``(tokentype, value)`` pairs from `text`.
Normally, you don't need to override this method. The default
implementation processes the `stripnl`, `stripall` and `tabsize`
options and then yields all tokens from `get_tokens_unprocessed()`,
with the ``index`` dropped.
def `get_tokens_unprocessed(self, text):`
This method should process the text and return an iterable of
``(index, tokentype, value)`` tuples where ``index`` is the starting
position of the token within the input text.
This method must be overridden by subclasses.
For a list of known tokens have a look at the `Tokens`_ page.
The lexer also recognizes the following attributes that are used by the
builtin lookup mechanism.
`name`
Full name for the lexer, in human-readable form.
`aliases`
A list of short, unique identifiers that can be used to lookup
the lexer from a list.
`filenames`
A list of `fnmatch` patterns that can be used to find a lexer for
a given filename.
.. _Tokens: tokens.txt
Formatters
==========
A formatter (derived from `pygments.formatter.Formatter`) has the following
functions:
def `__init__(self, **options):`
As with lexers, this constructor processes options and then must call
the base class `__init__`.
The `Formatter` class recognizes the options `style`, `full` and
`title`. It is up to the formatter class whether it uses them.
def `get_style_defs(self, arg=''):`
This method must return statements or declarations suitable to define
the current style for subsequent highlighted text (e.g. CSS classes
in the `HTMLFormatter`).
The optional argument `arg` can be used to modify the generation and
is formatter dependent (it is standardized because it can be given on
the command line).
This method is called by the ``-S`` `command-line option`_, the `arg`
is then given by the ``-a`` option.
def `format(self, tokensource, outfile):`
This method must format the tokens from the `tokensource` iterable and
write the formatted version to the file object `outfile`.
Formatter options can control how exactly the tokens are converted.
.. _command-line option: cmdline.txt
Option processing
=================
The `pygments.util` module has some utility functions usable for option
processing:
class `OptionError`
This exception will be raised by all option processing functions if
the type of the argument is not correct.
def `get_bool_opt(options, optname, default=None):`
Interpret the key `optname` from the dictionary `options`
as a boolean and return it. Return `default` if `optname`
is not in `options`.
The valid string values for ``True`` are ``1``, ``yes``,
``true`` and ``on``, the ones for ``False`` are ``0``,
``no``, ``false`` and ``off`` (matched case-insensitively).
def `get_int_opt(options, optname, default=None):`
As `get_bool_opt`, but interpret the value as an integer.
def `get_list_opt(options, optname, default=None):`
If the key `optname` from the dictionary `options` is a string,
split it at whitespace and return it. If it is already a list
or a tuple, it is returned as a list.
|