author | blackbird <devnull@localhost> | 2006-10-27 23:59:07 +0200 |
---|---|---|
committer | blackbird <devnull@localhost> | 2006-10-27 23:59:07 +0200 |
commit | 0b4ae9ab3fa6057dce2833a3e34ba01511c10e44 | |
tree | f4afa530b3b9aae10144448d0aaa25bc990fb482 /docs/src | |
parent | a400243228ed76501b820f2a6d0e7f924d5f9882 | |
[svn] checked in changes from the last days. including:
- text in logo
- documentation update
- new `guess_lexer` method
Diffstat (limited to 'docs/src')
-rw-r--r-- | docs/src/lexerdevelopment.txt | 31 |
1 files changed, 31 insertions, 0 deletions
diff --git a/docs/src/lexerdevelopment.txt b/docs/src/lexerdevelopment.txt
index c5d2921b..80822e1d 100644
--- a/docs/src/lexerdevelopment.txt
+++ b/docs/src/lexerdevelopment.txt
@@ -480,3 +480,34 @@ This might sound confusing (and it can really be). But it is needed, and for an
 example look at the Ruby lexer in `agile.py`_.
 
 .. _agile.py: http://trac.pocoo.org/repos/pygments/trunk/pygments/lexers/agile.py
+
+
+Filtering Token Streams
+=======================
+
+Some languages ship a lot of builtin functions (for example PHP). The total
+amount of those functions differs from system to system because not everybody
+has every extension installed. In the case of PHP there are over 3000 builtin
+functions. That's an incredible huge amount of functions, much more than you
+can put into a regular expression.
+
+But because only `Name` tokens can be function names it's solvable by overriding
+the ``get_tokens_unprocessed`` method. The following lexer subclasses the
+`PythonLexer` so that it highlights some additional names as pseudo keywords:
+
+.. sourcecode:: python
+
+    from pykleur.lexers.agile import PythonLexer
+    from pykleur.token import Name, Keyword
+
+    class MyPythonLexer(PythonLexer):
+        EXTRA_KEYWORDS = ['foo', 'bar', 'foobar', 'barfoo', 'spam', 'eggs']
+
+        def get_tokens_unprocessed(self, text):
+            for index, token, value in PythonLexer.get_tokens_unprocessed(self, text):
+                if token is Name and value in self.EXTRA_KEYWORDS:
+                    yield index, Keyword.Pseudo, value
+                else:
+                    yield index, token, value
+
+The `PhpLexer` and `LuaLexer` use this method to resolve builtin functions.
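The filtering step described in the added documentation does not depend on lexer internals: it only rewrites `(index, token, value)` tuples as they stream past. The following is a minimal, library-free sketch of that idea, not code from this commit. The string token types, the `promote_names` helper, and the hand-written stream are stand-ins invented for illustration; a real lexer would yield token objects from its own token module. The sketch also assumes a `frozenset` for the name list, which keeps each membership test roughly constant-time when the list grows to thousands of builtins, as in the PHP case.

```python
from typing import FrozenSet, Iterable, Iterator, Tuple

# Stand-in token types (assumption): plain strings instead of a lexer's
# real token objects, so the sketch runs without any third-party library.
NAME = "Name"
KEYWORD_PSEUDO = "Keyword.Pseudo"
OPERATOR = "Operator"
TEXT = "Text"

Token = Tuple[int, str, str]  # (index into the source text, token type, value)

# Hypothetical list of extra names, stored as a frozenset for fast lookups.
EXTRA_KEYWORDS: FrozenSet[str] = frozenset(
    ["foo", "bar", "foobar", "barfoo", "spam", "eggs"]
)


def promote_names(
    tokens: Iterable[Token], names: FrozenSet[str] = EXTRA_KEYWORDS
) -> Iterator[Token]:
    """Re-tag plain NAME tokens whose value is in `names`; pass the rest through."""
    for index, token, value in tokens:
        if token == NAME and value in names:
            yield index, KEYWORD_PSEUDO, value
        else:
            yield index, token, value


# Hand-written token stream for the source text "spam = ham".
stream = [
    (0, NAME, "spam"),
    (4, TEXT, " "),
    (5, OPERATOR, "="),
    (6, TEXT, " "),
    (7, NAME, "ham"),
]
result = list(promote_names(stream))
```

Because the filter is a generator, it handles tokens lazily and leaves indices and values untouched, so several filters of the same shape can be chained one after another.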