.. -*- mode: rst -*-

=====================
The full Pygments API
=====================

This page describes the Pygments API.

High-level API
==============

Functions from the `pygments` module:

def `lex(code, lexer):`
    Lex `code` with the `lexer` (must be a `Lexer` instance)
    and return an iterable of tokens. Currently, this only calls
    `lexer.get_tokens()`.

def `format(tokens, formatter, outfile=None):`
    Format a token stream (iterable of tokens) `tokens` with the
    `formatter` (must be a `Formatter` instance). The result is
    written to `outfile`, or if that is ``None``, returned as a
    string.

def `highlight(code, lexer, formatter, outfile=None):`
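    This is the most high-level highlighting function. It combines
    `lex` and `format`: `code` is lexed with `lexer`, and the resulting
    tokens are formatted with `formatter`. As with `format`, the result
    is written to `outfile`, or if that is ``None``, returned as a string.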
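
The relationship between the three functions can be sketched as follows.
This is a minimal example, not a definitive recipe: it assumes that a lexer
with the alias ``python`` and a formatter with the alias ``html`` exist in
the installed version (neither alias is named on this page).

.. sourcecode:: python

    from io import StringIO

    import pygments
    from pygments.lexers import get_lexer_by_name
    from pygments.formatters import get_formatter_by_name

    code = 'print("hello")\n'
    lexer = get_lexer_by_name('python')        # assumed alias
    formatter = get_formatter_by_name('html')  # assumed alias

    # Two explicit steps: lex the code into tokens, then format them.
    tokens = pygments.lex(code, lexer)
    two_step = pygments.format(tokens, formatter)

    # One step: highlight() does both and returns a string
    # because no outfile is given.
    one_step = pygments.highlight(code, lexer, formatter)

    # With an outfile, the result is written there instead.
    buf = StringIO()
    pygments.highlight(code, lexer, formatter, outfile=buf)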


Functions from `pygments.lexers`:

def `get_lexer_by_name(alias, **options):`
    Return an instance of a `Lexer` subclass that has `alias` in its
    aliases list. The lexer is given the `options` at its
    instantiation.

    Will raise `ValueError` if no lexer with that alias is found.

def `get_lexer_for_filename(fn, **options):`
    Return a `Lexer` subclass instance that has a filename pattern
    matching `fn`. The lexer is given the `options` at its
    instantiation.

    Will raise `ValueError` if no lexer for that filename is found.
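
A short sketch of both lookup functions, including the error case. The
alias ``python``, the filename ``example.py`` and the deliberately unknown
alias are assumptions chosen for illustration.

.. sourcecode:: python

    from pygments.lexers import get_lexer_by_name, get_lexer_for_filename

    # Look up by alias; keyword arguments are passed on as lexer options.
    by_alias = get_lexer_by_name('python', stripall=True)

    # Look up by filename pattern.
    by_filename = get_lexer_for_filename('example.py')

    # An unknown alias raises ValueError.
    try:
        get_lexer_by_name('no-such-lexer-alias')
    except ValueError as exc:
        error = exc
    else:
        error = None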


Functions from `pygments.formatters`:

def `get_formatter_by_name(alias, **options):`
    Return an instance of a `Formatter` subclass that has `alias` in its
    aliases list. The formatter is given the `options` at its
    instantiation.

    Will raise `ValueError` if no formatter with that alias is found.

def `get_formatter_for_filename(fn, **options):`
    Return a `Formatter` subclass instance that has a filename pattern
    matching `fn`. The formatter is given the `options` at its
    instantiation.

    Will raise `ValueError` if no formatter for that filename is found.
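
The formatter lookups work the same way. The sketch below assumes an
``html`` formatter that matches ``*.html`` filenames; the option values are
arbitrary examples.

.. sourcecode:: python

    from pygments.formatters import (get_formatter_by_name,
                                     get_formatter_for_filename)

    # Look up by alias; keyword arguments are passed on as formatter options.
    by_alias = get_formatter_by_name('html', full=True, title='Example')

    # Look up by the name of the file the output will be written to.
    by_filename = get_formatter_for_filename('out.html')

    # A filename that matches no formatter raises ValueError.
    try:
        get_formatter_for_filename('out.no-such-extension')
    except ValueError as exc:
        error = exc
    else:
        error = None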


Lexers
======

A lexer (derived from `pygments.lexer.Lexer`) has the following functions:

def `__init__(self, **options):`
    The constructor. Takes the lexer options as keyword arguments
    (``**options``). Every subclass must first process its own options
    and then call the `Lexer` constructor, because the base constructor
    processes the `stripnl`, `stripall` and `tabsize` options.

    An example looks like this:

    .. sourcecode:: python

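        def __init__(self, **options):
            # Process this lexer's own option first ...
            self.compress = options.get('compress', '')
            # ... then let the base class handle stripnl, stripall and tabsize.
            Lexer.__init__(self, **options)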

    Because all of these options must be specifiable as strings (they can
    be given on the command line), utility functions are available to
    convert them; see `Option processing`_.

def `get_tokens(self, text):`
    This method is the basic interface of a lexer. It is called by
    the `highlight()` function. It must process `text` and return an
    iterable of ``(tokentype, value)`` pairs.

    Normally, you don't need to override this method. The default
    implementation processes the `stripnl`, `stripall` and `tabsize`
    options and then yields all tokens from `get_tokens_unprocessed()`,
    with the ``index`` dropped.

def `get_tokens_unprocessed(self, text):`
    This method should process the text and return an iterable of
    ``(index, tokentype, value)`` tuples where ``index`` is the starting
    position of the token within the input text.

    This method must be overridden by subclasses.
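
The difference between the two token methods can be seen by calling both on
the same input. This sketch assumes a lexer with the alias ``python``; any
other lexer would do.

.. sourcecode:: python

    from pygments.lexers import get_lexer_by_name

    lexer = get_lexer_by_name('python')  # assumed alias
    text = 'x = 1\n'

    # (index, tokentype, value) triples; index is the offset of value in text.
    triples = list(lexer.get_tokens_unprocessed(text))

    # (tokentype, value) pairs, produced after option processing
    # and with the index dropped.
    pairs = list(lexer.get_tokens(text))

    # Every value can be found in the input at its reported index.
    offsets_ok = all(text[index:index + len(value)] == value
                     for index, tokentype, value in triples)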

For a list of known tokens, have a look at the `Tokens`_ page.

A lexer class also defines the following attributes, which are used by the
built-in lookup mechanism.

`name`
    Full name for the lexer, in human-readable form.

`aliases`
    A list of short, unique identifiers that can be used to look up
    the lexer, for example with `get_lexer_by_name()`.

`filenames`
    A list of `fnmatch` patterns that can be used to find a lexer for
    a given filename.
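
Putting the methods and attributes together, a complete lexer might look
like the sketch below. Everything specific to it is hypothetical: the class
`WordLexer`, its name, alias and filename pattern, and the `numbers` option.
Declaring the attributes on a standalone class like this does not by itself
make the lookup functions find it.

.. sourcecode:: python

    import re

    from pygments.lexer import Lexer
    from pygments.token import Name, Number, Text
    from pygments.util import get_bool_opt


    class WordLexer(Lexer):
        """Hypothetical lexer that splits text into words and whitespace."""

        name = 'Word example'
        aliases = ['word-example']
        filenames = ['*.words']

        def __init__(self, **options):
            # Process this lexer's own option first, then call the base class.
            self.numbers = get_bool_opt(options, 'numbers', False)
            Lexer.__init__(self, **options)

        def get_tokens_unprocessed(self, text):
            # Yield (index, tokentype, value) for each run of whitespace
            # or non-whitespace characters.
            for match in re.finditer(r'\s+|\S+', text):
                value = match.group()
                if value.isspace():
                    tokentype = Text
                elif self.numbers and value.isdigit():
                    tokentype = Number
                else:
                    tokentype = Name
                yield match.start(), tokentype, value


    # Options arrive as strings when given on the command line.
    pairs = list(WordLexer(numbers='yes').get_tokens('abc 42\n'))
    plain = list(WordLexer().get_tokens('abc 42\n'))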


.. _Tokens: tokens.txt


Formatters
==========

A formatter (derived from `pygments.formatter.Formatter`) has the following
functions:

def `__init__(self, **options):`
    As with lexers, this constructor processes options and then must call
    the base class `__init__`.

    The `Formatter` class recognizes the options `style`, `full` and
    `title`. It is up to the formatter class whether it uses them.

def `get_style_defs(self, arg=''):`
    This method must return statements or declarations suitable to define
    the current style for subsequent highlighted text (e.g. CSS classes
    in the `HTMLFormatter`).

    The optional argument `arg` can be used to modify the generation and
    is formatter dependent (it is standardized because it can be given on
    the command line).

    This method is called by the ``-S`` `command-line option`_; the `arg`
    is then given by the ``-a`` option.

def `format(self, tokensource, outfile):`
    This method must format the tokens from the `tokensource` iterable and
    write the formatted version to the file object `outfile`.

    Formatter options can control how exactly the tokens are converted.
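
A formatter that follows this interface can be very small. In the sketch
below, the class `TaggingFormatter` and its `separator` option are
hypothetical, and the token list is written by hand rather than produced by
a lexer.

.. sourcecode:: python

    from io import StringIO

    from pygments.formatter import Formatter
    from pygments.token import Keyword, Name, Text


    class TaggingFormatter(Formatter):
        """Hypothetical formatter that prefixes values with their token type."""

        def __init__(self, **options):
            # Process this formatter's own option, then call the base class.
            self.separator = options.get('separator', ':')
            Formatter.__init__(self, **options)

        def format(self, tokensource, outfile):
            for tokentype, value in tokensource:
                if value.strip():
                    outfile.write('%s%s%s' % (tokentype, self.separator, value))
                else:
                    # Pass whitespace through unchanged.
                    outfile.write(value)


    tokens = [(Keyword, 'if'), (Text, ' '), (Name, 'x'), (Text, '\n')]
    out = StringIO()
    TaggingFormatter(separator='=').format(tokens, out)
    result = out.getvalue()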

.. _command-line option: cmdline.txt


Option processing
=================

The `pygments.util` module has some utility functions usable for option
processing:

class `OptionError`
    This exception will be raised by all option processing functions if
    the type of the argument is not correct.

def `get_bool_opt(options, optname, default=None):`
    Interpret the key `optname` from the dictionary `options`
    as a boolean and return it. Return `default` if `optname`
    is not in `options`.

    The valid string values for ``True`` are ``1``, ``yes``,
    ``true`` and ``on``; the ones for ``False`` are ``0``,
    ``no``, ``false`` and ``off`` (matched case-insensitively).

def `get_int_opt(options, optname, default=None):`
    Like `get_bool_opt`, but interprets the value as an integer.

def `get_list_opt(options, optname, default=None):`
    If the value for the key `optname` in the dictionary `options` is a
    string, split it at whitespace and return the resulting list. If it
    is already a list or a tuple, it is returned as a list.
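
The following sketch shows the three helpers converting string values, as
they would arrive from the command line. The option names and values in the
dictionary are made up for the example.

.. sourcecode:: python

    from pygments.util import (OptionError, get_bool_opt, get_int_opt,
                               get_list_opt)

    options = {'stripall': 'Yes', 'tabsize': '4', 'names': 'foo bar'}

    stripall = get_bool_opt(options, 'stripall', False)  # True
    tabsize = get_int_opt(options, 'tabsize', 0)         # 4
    names = get_list_opt(options, 'names', [])           # ['foo', 'bar']

    # A key that is not in the dictionary yields the default.
    stripnl = get_bool_opt(options, 'stripnl', True)     # True

    # A value of the wrong type raises OptionError.
    try:
        get_int_opt({'tabsize': 'four'}, 'tabsize', 0)
    except OptionError as exc:
        error = exc
    else:
        error = None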