author: Georg Brandl <georg@python.org>, 2014-10-04 22:44:05 +0200
committer: Georg Brandl <georg@python.org>, 2014-10-04 22:44:05 +0200
commit: 8182f8ecafd849532737331f5b71ed099521f729
tree: 5c9f385d0a539f7d52ad15eb6f5ad464d60e8e23 (/doc/docs)
parent: 902cdc5c54b43d4401535b89636c328df695803e
Include newer features in the lexerdevelopment doc, and update it a bit.
Diffstat (limited to 'doc/docs')
-rw-r--r-- doc/docs/lexerdevelopment.rst | 25
1 files changed, 22 insertions, 3 deletions
diff --git a/doc/docs/lexerdevelopment.rst b/doc/docs/lexerdevelopment.rst
index 83455d65..6ea08dba 100644
--- a/doc/docs/lexerdevelopment.rst
+++ b/doc/docs/lexerdevelopment.rst
@@ -373,19 +373,22 @@ There are a few more things you can do with states:
Subclassing lexers derived from RegexLexer
==========================================
+.. versionadded:: 1.6
+
Sometimes multiple languages are very similar, but should still be lexed by
different lexer classes.
When subclassing a lexer derived from RegexLexer, the ``tokens`` dictionaries
defined in the parent and child class are merged. For example::
- from pygments.lexer import RegexLexer, bygroups, include
+ from pygments.lexer import RegexLexer, inherit
from pygments.token import *
class BaseLexer(RegexLexer):
tokens = {
'root': [
('[a-z]+', Name),
+ (r'/\*', Comment, 'comment'),
('"', String, 'string'),
('\s+', Text),
],
@@ -393,19 +396,35 @@ defined in the parent and child class are merged. For example::
('[^"]+', String),
('"', String, '#pop'),
],
+ 'comment': [
+ ...
+ ],
}
class DerivedLexer(BaseLexer):
tokens = {
'root': [
+ ('[0-9]+', Number),
inherit,
],
'string': [
- ('[^"]
+ (r'[^"\\]+', String),
+ (r'\\.', String.Escape),
+ ('"', String, '#pop'),
],
}
- .. versionadded:: 1.6
+The `BaseLexer` defines three states, lexing names, comments and strings. The
+`DerivedLexer` defines its own tokens dictionary, which extends the definitions
+of the base lexer:
+
+* The "root" state has an additional rule and then the special object `inherit`,
+ which tells Pygments to insert the token definitions of the parent class at
+ that point.
+
+* The "string" state is replaced entirely, since there is no `inherit` rule.
+
+* The "comment" state is inherited entirely.
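
The three merge cases listed above can be illustrated with a small, dependency-free model. This is only a sketch of the behaviour the text describes, not Pygments' actual implementation: `INHERIT` is a hypothetical stand-in for `pygments.lexer.inherit`, `merge_tokens` is an invented helper, token types are plain strings, and the single rule in the "comment" state is a placeholder, because the patch elides that state's contents.

```python
# Hypothetical stand-in for pygments.lexer.inherit (assumption, not the real object).
INHERIT = object()


def merge_tokens(parent, child):
    """Merge two state -> rule-list dicts as the documentation describes."""
    # States the child does not mention are inherited entirely.
    merged = {state: list(rules) for state, rules in parent.items()}
    for state, rules in child.items():
        if INHERIT in rules:
            # Splice the parent's rules in at the position of the marker.
            pos = rules.index(INHERIT)
            merged[state] = rules[:pos] + list(parent.get(state, [])) + rules[pos + 1:]
        else:
            # No marker: the child's state replaces the parent's entirely.
            merged[state] = list(rules)
    return merged


base = {
    'root': [
        ('[a-z]+', 'Name'),
        (r'/\*', 'Comment', 'comment'),
        ('"', 'String', 'string'),
        (r'\s+', 'Text'),
    ],
    'string': [
        ('[^"]+', 'String'),
        ('"', 'String', '#pop'),
    ],
    # Placeholder rule: the original patch leaves this state elided.
    'comment': [('placeholder', 'Comment')],
}

derived = {
    'root': [
        ('[0-9]+', 'Number'),
        INHERIT,
    ],
    'string': [
        (r'[^"\\]+', 'String'),
        (r'\\.', 'String.Escape'),
        ('"', 'String', '#pop'),
    ],
}

merged = merge_tokens(base, derived)
```

In this model, `merged['root']` starts with the number rule followed by all four parent rules, `merged['string']` holds only the derived rules, and `merged['comment']` is carried over unchanged from the base.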
Using multiple lexers