author | Georg Brandl <georg@python.org> | 2014-10-04 22:44:05 +0200 |
---|---|---|
committer | Georg Brandl <georg@python.org> | 2014-10-04 22:44:05 +0200 |
commit | 8182f8ecafd849532737331f5b71ed099521f729 (patch) | |
tree | 5c9f385d0a539f7d52ad15eb6f5ad464d60e8e23 /doc/docs | |
parent | 902cdc5c54b43d4401535b89636c328df695803e (diff) | |
download | pygments-8182f8ecafd849532737331f5b71ed099521f729.tar.gz |
Include newer features in the lexerdevelopment doc, and update it a bit.
Diffstat (limited to 'doc/docs')
-rw-r--r-- | doc/docs/lexerdevelopment.rst | 25 |
1 files changed, 22 insertions, 3 deletions
diff --git a/doc/docs/lexerdevelopment.rst b/doc/docs/lexerdevelopment.rst
index 83455d65..6ea08dba 100644
--- a/doc/docs/lexerdevelopment.rst
+++ b/doc/docs/lexerdevelopment.rst
@@ -373,19 +373,22 @@ There are a few more things you can do with states:
 Subclassing lexers derived from RegexLexer
 ==========================================
 
+.. versionadded:: 1.6
+
 Sometimes multiple languages are very similar, but should still be lexed by
 different lexer classes.
 
 When subclassing a lexer derived from RegexLexer, the ``tokens`` dictionaries
 defined in the parent and child class are merged. For example::
 
-    from pygments.lexer import RegexLexer, bygroups, include
+    from pygments.lexer import RegexLexer, inherit
     from pygments.token import *
 
     class BaseLexer(RegexLexer):
         tokens = {
             'root': [
                 ('[a-z]+', Name),
+                (r'/\*', Comment, 'comment'),
                 ('"', String, 'string'),
                 ('\s+', Text),
             ],
@@ -393,19 +396,35 @@ defined in the parent and child class are merged. For example::
                 ('[^"]+', String),
                 ('"', String, '#pop'),
             ],
+            'comment': [
+                ...
+            ],
         }
 
     class DerivedLexer(BaseLexer):
         tokens = {
             'root': [
+                ('[0-9]+', Number),
                 inherit,
             ],
             'string': [
-                ('[^"]
+                (r'[^"\\]+', String),
+                (r'\\.', String.Escape),
+                ('"', String, '#pop'),
             ],
         }
 
-    .. versionadded:: 1.6
+The `BaseLexer` defines two states, lexing names and strings. The
+`DerivedLexer` defines its own tokens dictionary, which extends the definitions
+of the base lexer:
+
+* The "root" state has an additional rule and then the special object `inherit`,
+  which tells Pygments to insert the token definitions of the parent class at
+  that point.
+
+* The "string" state is replaced entirely, since there is not `inherit` rule.
+
+* The "comment" state is inherited entirely.
 
 
 Using multiple lexers
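
The merge rules described in the added documentation text can be sketched in plain Python. This is only an illustration of the described semantics, not Pygments' actual implementation: `INHERIT` and `merge_tokens` are hypothetical names, token types are represented as strings, and the "comment" state is left empty because its rules are elided in the commit's example.

```python
# Hypothetical sketch of the merge semantics described in the commit.
# This is NOT Pygments' implementation; INHERIT and merge_tokens are
# illustrative names, and token types are plain strings here.
INHERIT = object()  # stand-in for the special `inherit` object


def merge_tokens(parent, child):
    """Merge two state -> rule-list dicts, expanding INHERIT markers."""
    # States the child does not mention are inherited entirely.
    merged = {state: list(rules) for state, rules in parent.items()}
    for state, rules in child.items():
        expanded = []
        for rule in rules:
            if rule is INHERIT:
                # Splice the parent's rules in at the marker's position.
                expanded.extend(parent.get(state, []))
            else:
                expanded.append(rule)
        # Without an INHERIT marker, the child's list replaces the parent's.
        merged[state] = expanded
    return merged


base = {
    'root': [('[a-z]+', 'Name'), (r'/\*', 'Comment', 'comment'),
             ('"', 'String', 'string'), (r'\s+', 'Text')],
    'string': [('[^"]+', 'String'), ('"', 'String', '#pop')],
    'comment': [],  # rules are elided ("...") in the commit's example
}
derived = {
    'root': [('[0-9]+', 'Number'), INHERIT],
    'string': [(r'[^"\\]+', 'String'), (r'\\.', 'String.Escape'),
               ('"', 'String', '#pop')],
}

merged = merge_tokens(base, derived)
```

Under these assumptions, `merged['root']` starts with the derived lexer's number rule followed by all of the base rules, `merged['string']` is exactly the derived list, and `merged['comment']` is carried over from the base unchanged.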