-rw-r--r--  AUTHORS                                  |    3
-rw-r--r--  CHANGES                                  |   10
-rw-r--r--  docs/src/api.txt                         |    2
-rw-r--r--  docs/src/index.txt                       |    2
-rw-r--r--  docs/src/integrate.txt                   |    5
-rw-r--r--  docs/src/java.txt                        |   70
-rw-r--r--  pygments/cmdline.py                      |    8
-rw-r--r--  pygments/lexer.py                        |    5
-rw-r--r--  pygments/lexers/_mapping.py              |    6
-rw-r--r--  pygments/lexers/_robotframeworklexer.py  |  546
-rw-r--r--  pygments/lexers/asm.py                   |    4
-rw-r--r--  pygments/lexers/compiled.py              |  238
-rw-r--r--  pygments/lexers/jvm.py                   |    2
-rw-r--r--  pygments/lexers/math.py                  |   17
-rw-r--r--  pygments/lexers/other.py                 |   95
-rw-r--r--  pygments/lexers/text.py                  |   47
-rw-r--r--  pygments/lexers/web.py                   |    2
-rw-r--r--  tests/examplefiles/BOM.js                |    1
-rw-r--r--  tests/examplefiles/classes.dylan         |   91
-rw-r--r--  tests/examplefiles/nanomsg.intr          |   95
-rw-r--r--  tests/examplefiles/robotframework.txt    |   39
-rw-r--r--  tests/examplefiles/rust_example.rs       |  892
-rw-r--r--  tests/examplefiles/test2.pypylog         |  120
-rw-r--r--  tests/examplefiles/unix-io.lid           |   37
-rw-r--r--  tests/test_basic_api.py                  |    2
-rw-r--r--  tests/test_examplefiles.py               |    2
26 files changed, 1580 insertions, 761 deletions
diff --git a/AUTHORS b/AUTHORS
index cd2b8565..54957dc4 100644
--- a/AUTHORS
+++ b/AUTHORS
@@ -56,6 +56,7 @@ Other contributors, listed alphabetically, are:
* Brian R. Jackson -- Tea lexer
* Dennis Kaarsemaker -- sources.list lexer
* Igor Kalnitsky -- vhdl lexer
+* Pekka Klärck -- Robot Framework lexer
* Eric Knibbe -- Lasso lexer
* Adam Koprowski -- Opa lexer
* Benjamin Kowarsch -- Modula-2 lexer
@@ -73,10 +74,12 @@ Other contributors, listed alphabetically, are:
* Gordon McGregor -- SystemVerilog lexer
* Stephen McKamey -- Duel/JBST lexer
* Brian McKenna -- F# lexer
+* Charles McLaughlin -- Puppet lexer
* Lukas Meuser -- BBCode formatter, Lua lexer
* Paul Miller -- LiveScript lexer
* Hong Minhee -- HTTP lexer
* Michael Mior -- Awk lexer
+* Bruce Mitchener -- Dylan lexer rewrite
* Reuben Morais -- SourcePawn lexer
* Jon Morton -- Rust lexer
* Paulo Moura -- Logtalk lexer
diff --git a/CHANGES b/CHANGES
index 01a4b9a1..b671746b 100644
--- a/CHANGES
+++ b/CHANGES
@@ -6,7 +6,7 @@ http://bitbucket.org/birkenfeld/pygments-main/issues.
Version 1.6
-----------
-(in development, to be released xx November 2012)
+(in development, to be released xx January 2013)
- Lexers added:
@@ -23,8 +23,10 @@ Version 1.6
* LiveScript (PR#84)
* Monkey (PR#117)
* Mscgen (PR#80)
+ * Puppet (PR#133)
* Racket (PR#94)
* Rdoc (PR#99)
+ * Robot Framework (PR#137)
* Rust (PR#67)
* Smali (Dalvik assembly)
* SourcePawn (PR#39)
@@ -34,12 +36,18 @@ Version 1.6
* Windows Registry (#819)
* Xtend (PR#68)
+- Use "colorama" on Windows for console color output (PR#142)
+
- Fix Template Haskell highlighting (PR#63)
- Fix some S/R lexer errors (PR#91)
- Fix a bug in the Prolog lexer with names that start with 'is' (#810)
+- Rewrite Dylan lexer, add Dylan LID lexer (PR#147)
+
+- Add a Java quickstart document (PR#146)
+
Version 1.5
-----------
diff --git a/docs/src/api.txt b/docs/src/api.txt
index b8159379..4276eea2 100644
--- a/docs/src/api.txt
+++ b/docs/src/api.txt
@@ -64,7 +64,7 @@ def `guess_lexer(text, **options):`
def `guess_lexer_for_filename(filename, text, **options):`
As `guess_lexer()`, but only lexers which have a pattern in `filenames`
or `alias_filenames` that matches `filename` are taken into consideration.
-
+
`pygments.util.ClassNotFound` is raised if no lexer thinks it can handle the
content.
diff --git a/docs/src/index.txt b/docs/src/index.txt
index 1fad0f03..b1e099c7 100644
--- a/docs/src/index.txt
+++ b/docs/src/index.txt
@@ -37,7 +37,7 @@ Welcome to the Pygments documentation.
- `Write your own lexer <lexerdevelopment.txt>`_
- `Write your own formatter <formatterdevelopment.txt>`_
-
+
- `Write your own filter <filterdevelopment.txt>`_
- `Register plugins <plugins.txt>`_
diff --git a/docs/src/integrate.txt b/docs/src/integrate.txt
index 51a3dac4..6f8c1253 100644
--- a/docs/src/integrate.txt
+++ b/docs/src/integrate.txt
@@ -41,3 +41,8 @@ Bash completion
The source distribution contains a file ``external/pygments.bashcomp`` that
sets up completion for the ``pygmentize`` command in bash.
+
+Java
+----
+
+See the `Java quickstart <java.txt>`_ document.
diff --git a/docs/src/java.txt b/docs/src/java.txt
new file mode 100644
index 00000000..f40a52d2
--- /dev/null
+++ b/docs/src/java.txt
@@ -0,0 +1,70 @@
+=====================
+Use Pygments in Java
+=====================
+
+Thanks to `Jython <http://www.jython.org>`__ it is possible to use Pygments in
+Java.
+
+This page is a short tutorial that shows how this works. You can then look at
+the `Jython documentation <http://www.jython.org/docs/>`__ for more advanced
+use.
+
+Since version 1.5, Pygments is deployed on `Maven Central
+<http://repo1.maven.org/maven2/org/pygments/pygments/>`__ as a JAR, as is
+Jython, which makes it a lot easier to create the Java project.
+
+Here is an example of a `Maven <http://www.maven.org>`__ ``pom.xml`` file for a
+project running Pygments:
+
+.. sourcecode:: xml
+
+ <?xml version="1.0" encoding="UTF-8"?>
+
+ <project xmlns="http://maven.apache.org/POM/4.0.0"
+ xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+ xsi:schemaLocation="http://maven.apache.org/POM/4.0.0
+ http://maven.apache.org/maven-v4_0_0.xsd">
+ <modelVersion>4.0.0</modelVersion>
+ <groupId>example</groupId>
+ <artifactId>example</artifactId>
+ <version>1.0-SNAPSHOT</version>
+ <dependencies>
+ <dependency>
+ <groupId>org.python</groupId>
+ <artifactId>jython-standalone</artifactId>
+ <version>2.5.3</version>
+ </dependency>
+ <dependency>
+ <groupId>org.pygments</groupId>
+ <artifactId>pygments</artifactId>
+ <version>1.5</version>
+ <scope>runtime</scope>
+ </dependency>
+ </dependencies>
+ </project>
+
+The following Java example:
+
+.. sourcecode:: java
+
+ PythonInterpreter interpreter = new PythonInterpreter();
+
+ // Set a variable with the content you want to work with
+ interpreter.set("code", code);
+
+ // Simply use Pygments as you would in Python
+ interpreter.exec("from pygments import highlight\n"
+ + "from pygments.lexers import PythonLexer\n"
+ + "from pygments.formatters import HtmlFormatter\n"
+ + "\nresult = highlight(code, PythonLexer(), HtmlFormatter())");
+
+ // Get the result that has been set in a variable
+ System.out.println(interpreter.get("result", String.class));
+
+will print something like:
+
+.. sourcecode:: html
+
+ <div class="highlight">
+ <pre><span class="k">print</span> <span class="s">&quot;Hello World&quot;</span></pre>
+ </div>
diff --git a/pygments/cmdline.py b/pygments/cmdline.py
index 1f14cf5d..6fa80305 100644
--- a/pygments/cmdline.py
+++ b/pygments/cmdline.py
@@ -192,6 +192,14 @@ def main(args=sys.argv):
usage = USAGE % ((args[0],) * 6)
+ if sys.platform in ['win32', 'cygwin']:
+ try:
+ # Provide coloring under Windows, if possible
+ import colorama
+ colorama.init()
+ except ImportError:
+ pass
+
try:
popts, args = getopt.getopt(args[1:], "l:f:F:o:O:P:LS:a:N:hVHg")
except getopt.GetoptError, err:
diff --git a/pygments/lexer.py b/pygments/lexer.py
index ad2c72d1..7cb212d5 100644
--- a/pygments/lexer.py
+++ b/pygments/lexer.py
@@ -163,6 +163,10 @@ class Lexer(object):
text = decoded
else:
text = text.decode(self.encoding)
+ else:
+ if text.startswith(u'\ufeff'):
+ text = text[len(u'\ufeff'):]
+
# text now *is* a unicode string
text = text.replace('\r\n', '\n')
text = text.replace('\r', '\n')
@@ -694,4 +698,3 @@ def do_insertions(insertions, tokens):
except StopIteration:
insleft = False
break # not strictly necessary
-
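Review note on the pygments/lexer.py hunk: the new else-branch handles input that is already a unicode string, dropping a single leading U+FEFF before the existing newline normalization runs. A minimal standalone sketch of that behaviour in Python 3 follows. The helper name `strip_bom` is hypothetical and not part of Pygments; the real logic lives inline in `Lexer.get_tokens`.

```python
BOM = '\ufeff'


def strip_bom(text):
    """Drop one leading U+FEFF from decoded text, then normalize newlines.

    Hypothetical helper mirroring the hunk above; not a Pygments API.
    """
    if text.startswith(BOM):
        text = text[len(BOM):]
    # Same normalization order as the surrounding context lines.
    text = text.replace('\r\n', '\n')
    return text.replace('\r', '\n')
```

Only a BOM at the very start is removed; a U+FEFF elsewhere in the text is left untouched, which matches the `startswith` check in the hunk.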
diff --git a/pygments/lexers/_mapping.py b/pygments/lexers/_mapping.py
index ae3d2361..963431d4 100644
--- a/pygments/lexers/_mapping.py
+++ b/pygments/lexers/_mapping.py
@@ -83,7 +83,8 @@ LEXERS = {
'DjangoLexer': ('pygments.lexers.templates', 'Django/Jinja', ('django', 'jinja'), (), ('application/x-django-templating', 'application/x-jinja')),
'DtdLexer': ('pygments.lexers.web', 'DTD', ('dtd',), ('*.dtd',), ('application/xml-dtd',)),
'DuelLexer': ('pygments.lexers.web', 'Duel', ('duel', 'Duel Engine', 'Duel View', 'JBST', 'jbst', 'JsonML+BST'), ('*.duel', '*.jbst'), ('text/x-duel', 'text/x-jbst')),
- 'DylanLexer': ('pygments.lexers.compiled', 'Dylan', ('dylan',), ('*.dylan', '*.dyl'), ('text/x-dylan',)),
+ 'DylanLexer': ('pygments.lexers.compiled', 'Dylan', ('dylan',), ('*.dylan', '*.dyl', '*.intr'), ('text/x-dylan',)),
+ 'DylanLidLexer': ('pygments.lexers.compiled', 'DylanLID', ('dylan-lid', 'lid'), ('*.lid', '*.hdp'), ('text/x-dylan-lid',)),
'ECLLexer': ('pygments.lexers.other', 'ECL', ('ecl',), ('*.ecl',), ('application/x-ecl',)),
'ECLexer': ('pygments.lexers.compiled', 'eC', ('ec',), ('*.ec', '*.eh'), ('text/x-echdr', 'text/x-ecsrc')),
'ElixirConsoleLexer': ('pygments.lexers.functional', 'Elixir iex session', ('iex',), (), ('text/x-elixir-shellsession',)),
@@ -122,6 +123,7 @@ LEXERS = {
'HtmlPhpLexer': ('pygments.lexers.templates', 'HTML+PHP', ('html+php',), ('*.phtml',), ('application/x-php', 'application/x-httpd-php', 'application/x-httpd-php3', 'application/x-httpd-php4', 'application/x-httpd-php5')),
'HtmlSmartyLexer': ('pygments.lexers.templates', 'HTML+Smarty', ('html+smarty',), (), ('text/html+smarty',)),
'HttpLexer': ('pygments.lexers.text', 'HTTP', ('http',), (), ()),
+ 'HxmlLexer': ('pygments.lexers.text', 'Hxml', ('haxeml', 'hxml'), ('*.hxml',), ()),
'HybrisLexer': ('pygments.lexers.other', 'Hybris', ('hybris', 'hy'), ('*.hy', '*.hyb'), ('text/x-hybris', 'application/x-hybris')),
'IniLexer': ('pygments.lexers.text', 'INI', ('ini', 'cfg'), ('*.ini', '*.cfg'), ('text/x-ini',)),
'IoLexer': ('pygments.lexers.agile', 'Io', ('io',), ('*.io',), ('text/x-iosrc',)),
@@ -207,6 +209,7 @@ LEXERS = {
'PrologLexer': ('pygments.lexers.compiled', 'Prolog', ('prolog',), ('*.prolog', '*.pro', '*.pl'), ('text/x-prolog',)),
'PropertiesLexer': ('pygments.lexers.text', 'Properties', ('properties',), ('*.properties',), ('text/x-java-properties',)),
'ProtoBufLexer': ('pygments.lexers.other', 'Protocol Buffer', ('protobuf',), ('*.proto',), ()),
+ 'PuppetLexer': ('pygments.lexers.other', 'Puppet', ('puppet',), ('*.pp',), ()),
'PyPyLogLexer': ('pygments.lexers.text', 'PyPy Log', ('pypylog', 'pypy'), ('*.pypylog',), ('application/x-pypylog',)),
'Python3Lexer': ('pygments.lexers.agile', 'Python 3', ('python3', 'py3'), (), ('text/x-python3', 'application/x-python3')),
'Python3TracebackLexer': ('pygments.lexers.agile', 'Python 3.0 Traceback', ('py3tb',), ('*.py3tb',), ('text/x-python3-traceback',)),
@@ -229,6 +232,7 @@ LEXERS = {
'RedcodeLexer': ('pygments.lexers.other', 'Redcode', ('redcode',), ('*.cw',), ()),
'RegeditLexer': ('pygments.lexers.text', 'reg', (), ('*.reg',), ('text/x-windows-registry',)),
'RhtmlLexer': ('pygments.lexers.templates', 'RHTML', ('rhtml', 'html+erb', 'html+ruby'), ('*.rhtml',), ('text/html+ruby',)),
+ 'RobotFrameworkLexer': ('pygments.lexers.other', 'RobotFramework', ('RobotFramework', 'robotframework'), ('*.txt',), ('text/x-robotframework',)),
'RstLexer': ('pygments.lexers.text', 'reStructuredText', ('rst', 'rest', 'restructuredtext'), ('*.rst', '*.rest'), ('text/x-rst', 'text/prs.fallenstein.rst')),
'RubyConsoleLexer': ('pygments.lexers.agile', 'Ruby irb session', ('rbcon', 'irb'), (), ('text/x-ruby-shellsession',)),
'RubyLexer': ('pygments.lexers.agile', 'Ruby', ('rb', 'ruby', 'duby'), ('*.rb', '*.rbw', 'Rakefile', '*.rake', '*.gemspec', '*.rbx', '*.duby'), ('text/x-ruby', 'application/x-ruby')),
diff --git a/pygments/lexers/_robotframeworklexer.py b/pygments/lexers/_robotframeworklexer.py
new file mode 100644
index 00000000..f3b5f223
--- /dev/null
+++ b/pygments/lexers/_robotframeworklexer.py
@@ -0,0 +1,546 @@
+# Copyright 2012 Nokia Siemens Networks Oyj
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+import re
+
+from pygments.lexer import Lexer
+from pygments.token import Token
+
+
+HEADING = Token.Generic.Heading
+SETTING = Token.Keyword.Namespace
+IMPORT = Token.Name.Namespace
+TC_KW_NAME = Token.Generic.Subheading
+KEYWORD = Token.Name.Function
+ARGUMENT = Token.String
+VARIABLE = Token.Name.Variable
+COMMENT = Token.Comment
+SEPARATOR = Token.Punctuation
+SYNTAX = Token.Punctuation
+GHERKIN = Token.Generic.Emph
+ERROR = Token.Error
+
+
+def normalize(string, remove=''):
+ string = string.lower()
+ for char in remove + ' ':
+ if char in string:
+ string = string.replace(char, '')
+ return string
+
+
+class RobotFrameworkLexer(Lexer):
+ """
+ For `Robot Framework <http://robotframework.org>`_ test data.
+
+ Supports both space and pipe separated plain text formats.
+
+ *New in Pygments 1.6.*
+ """
+ name = 'RobotFramework'
+ aliases = ['RobotFramework', 'robotframework']
+ filenames = ['*.txt']
+ mimetypes = ['text/x-robotframework']
+
+ def __init__(self, **options):
+ options['tabsize'] = 2
+ options['encoding'] = 'UTF-8'
+ Lexer.__init__(self, **options)
+
+ def get_tokens_unprocessed(self, text):
+ row_tokenizer = RowTokenizer()
+ var_tokenizer = VariableTokenizer()
+ index = 0
+ for row in text.splitlines():
+ for value, token in row_tokenizer.tokenize(row):
+ for value, token in var_tokenizer.tokenize(value, token):
+ if value:
+ yield index, token, unicode(value)
+ index += len(value)
+
+
+class VariableTokenizer(object):
+
+ def tokenize(self, string, token):
+ var = VariableSplitter(string, identifiers='$@%')
+ if var.start < 0 or token in (COMMENT, ERROR):
+ yield string, token
+ return
+ for value, token in self._tokenize(var, string, token):
+ if value:
+ yield value, token
+
+ def _tokenize(self, var, string, orig_token):
+ before = string[:var.start]
+ yield before, orig_token
+ yield var.identifier + '{', SYNTAX
+ for value, token in self.tokenize(var.base, VARIABLE):
+ yield value, token
+ yield '}', SYNTAX
+ if var.index:
+ yield '[', SYNTAX
+ for value, token in self.tokenize(var.index, VARIABLE):
+ yield value, token
+ yield ']', SYNTAX
+ for value, token in self.tokenize(string[var.end:], orig_token):
+ yield value, token
+
+
+class RowTokenizer(object):
+
+ def __init__(self):
+ self._table = UnknownTable()
+ self._splitter = RowSplitter()
+ testcases = TestCaseTable()
+ settings = SettingTable(testcases.set_default_template)
+ variables = VariableTable()
+ keywords = KeywordTable()
+ self._tables = {'settings': settings, 'setting': settings,
+ 'metadata': settings,
+ 'variables': variables, 'variable': variables,
+ 'testcases': testcases, 'testcase': testcases,
+ 'keywords': keywords, 'keyword': keywords,
+ 'userkeywords': keywords, 'userkeyword': keywords}
+
+ def tokenize(self, row):
+ commented = False
+ heading = False
+ for index, value in enumerate(self._splitter.split(row)):
+ # First value, and every second after that, is a separator.
+ index, separator = divmod(index-1, 2)
+ if value.startswith('#'):
+ commented = True
+ elif index == 0 and value.startswith('*'):
+ self._table = self._start_table(value)
+ heading = True
+ for value, token in self._tokenize(value, index, commented,
+ separator, heading):
+ yield value, token
+ self._table.end_row()
+
+ def _start_table(self, header):
+ name = normalize(header, remove='*')
+ return self._tables.get(name, UnknownTable())
+
+ def _tokenize(self, value, index, commented, separator, heading):
+ if commented:
+ yield value, COMMENT
+ elif separator:
+ yield value, SEPARATOR
+ elif heading:
+ yield value, HEADING
+ else:
+ for value, token in self._table.tokenize(value, index):
+ yield value, token
+
+
+class RowSplitter(object):
+ _space_splitter = re.compile('( {2,})')
+ _pipe_splitter = re.compile('((?:^| +)\|(?: +|$))')
+
+ def split(self, row):
+ splitter = (row.startswith('| ') and self._split_from_pipes
+ or self._split_from_spaces)
+ for value in splitter(row.rstrip()):
+ yield value
+ yield '\n'
+
+ def _split_from_spaces(self, row):
+ yield '' # Start with (pseudo)separator similarly as with pipes
+ for value in self._space_splitter.split(row):
+ yield value
+
+ def _split_from_pipes(self, row):
+ _, separator, rest = self._pipe_splitter.split(row, 1)
+ yield separator
+ while self._pipe_splitter.search(rest):
+ cell, separator, rest = self._pipe_splitter.split(rest, 1)
+ yield cell
+ yield separator
+ yield rest
+
+
+class Tokenizer(object):
+ _tokens = None
+
+ def __init__(self):
+ self._index = 0
+
+ def tokenize(self, value):
+ values_and_tokens = self._tokenize(value, self._index)
+ self._index += 1
+ if isinstance(values_and_tokens, type(Token)):
+ values_and_tokens = [(value, values_and_tokens)]
+ return values_and_tokens
+
+ def _tokenize(self, value, index):
+ index = min(index, len(self._tokens) - 1)
+ return self._tokens[index]
+
+ def _is_assign(self, value):
+ if value.endswith('='):
+ value = value[:-1].strip()
+ var = VariableSplitter(value, identifiers='$@')
+ return var.start == 0 and var.end == len(value)
+
+
+class Comment(Tokenizer):
+ _tokens = (COMMENT,)
+
+
+class Setting(Tokenizer):
+ _tokens = (SETTING, ARGUMENT)
+ _keyword_settings = ('suitesetup', 'suiteprecondition', 'suiteteardown',
+ 'suitepostcondition', 'testsetup', 'testprecondition',
+ 'testteardown', 'testpostcondition', 'testtemplate')
+ _import_settings = ('library', 'resource', 'variables')
+ _other_settings = ('documentation', 'metadata', 'forcetags', 'defaulttags',
+ 'testtimeout')
+ _custom_tokenizer = None
+
+ def __init__(self, template_setter=None):
+ Tokenizer.__init__(self)
+ self._template_setter = template_setter
+
+ def _tokenize(self, value, index):
+ if index == 1 and self._template_setter:
+ self._template_setter(value)
+ if index == 0:
+ normalized = normalize(value)
+ if normalized in self._keyword_settings:
+ self._custom_tokenizer = KeywordCall(support_assign=False)
+ elif normalized in self._import_settings:
+ self._custom_tokenizer = ImportSetting()
+ elif normalized not in self._other_settings:
+ return ERROR
+ elif self._custom_tokenizer:
+ return self._custom_tokenizer.tokenize(value)
+ return Tokenizer._tokenize(self, value, index)
+
+
+class ImportSetting(Tokenizer):
+ _tokens = (IMPORT, ARGUMENT)
+
+
+class TestCaseSetting(Setting):
+ _keyword_settings = ('setup', 'precondition', 'teardown', 'postcondition',
+ 'template')
+ _import_settings = ()
+ _other_settings = ('documentation', 'tags', 'timeout')
+
+ def _tokenize(self, value, index):
+ if index == 0:
+ type = Setting._tokenize(self, value[1:-1], index)
+ return [('[', SYNTAX), (value[1:-1], type), (']', SYNTAX)]
+ return Setting._tokenize(self, value, index)
+
+
+class KeywordSetting(TestCaseSetting):
+ _keyword_settings = ('teardown',)
+ _other_settings = ('documentation', 'arguments', 'return', 'timeout')
+
+
+class Variable(Tokenizer):
+ _tokens = (SYNTAX, ARGUMENT)
+
+ def _tokenize(self, value, index):
+ if index == 0 and not self._is_assign(value):
+ return ERROR
+ return Tokenizer._tokenize(self, value, index)
+
+
+class KeywordCall(Tokenizer):
+ _tokens = (KEYWORD, ARGUMENT)
+
+ def __init__(self, support_assign=True):
+ Tokenizer.__init__(self)
+ self._keyword_found = not support_assign
+ self._assigns = 0
+
+ def _tokenize(self, value, index):
+ if not self._keyword_found and self._is_assign(value):
+ self._assigns += 1
+ return SYNTAX # VariableTokenizer tokenizes this later.
+ if self._keyword_found:
+ return Tokenizer._tokenize(self, value, index - self._assigns)
+ self._keyword_found = True
+ return GherkinTokenizer().tokenize(value, KEYWORD)
+
+
+class GherkinTokenizer(object):
+ _gherkin_prefix = re.compile('^(Given|When|Then|And) ', re.IGNORECASE)
+
+ def tokenize(self, value, token):
+ match = self._gherkin_prefix.match(value)
+ if not match:
+ return [(value, token)]
+ end = match.end()
+ return [(value[:end], GHERKIN), (value[end:], token)]
+
+
+class TemplatedKeywordCall(Tokenizer):
+ _tokens = (ARGUMENT,)
+
+
+class ForLoop(Tokenizer):
+
+ def __init__(self):
+ Tokenizer.__init__(self)
+ self._in_arguments = False
+
+ def _tokenize(self, value, index):
+ token = self._in_arguments and ARGUMENT or SYNTAX
+ if value.upper() in ('IN', 'IN RANGE'):
+ self._in_arguments = True
+ return token
+
+
+class _Table(object):
+ _tokenizer_class = None
+
+ def __init__(self, prev_tokenizer=None):
+ self._tokenizer = self._tokenizer_class()
+ self._prev_tokenizer = prev_tokenizer
+ self._prev_values_on_row = []
+
+ def tokenize(self, value, index):
+ if self._continues(value, index):
+ self._tokenizer = self._prev_tokenizer
+ yield value, SYNTAX
+ else:
+ for value_and_token in self._tokenize(value, index):
+ yield value_and_token
+ self._prev_values_on_row.append(value)
+
+ def _continues(self, value, index):
+ return value == '...' and all(self._is_empty(t)
+ for t in self._prev_values_on_row)
+
+ def _is_empty(self, value):
+ return value in ('', '\\')
+
+ def _tokenize(self, value, index):
+ return self._tokenizer.tokenize(value)
+
+ def end_row(self):
+ self.__init__(prev_tokenizer=self._tokenizer)
+
+
+class UnknownTable(_Table):
+ _tokenizer_class = Comment
+
+ def _continues(self, value, index):
+ return False
+
+
+class VariableTable(_Table):
+ _tokenizer_class = Variable
+
+
+class SettingTable(_Table):
+ _tokenizer_class = Setting
+
+ def __init__(self, template_setter, prev_tokenizer=None):
+ _Table.__init__(self, prev_tokenizer)
+ self._template_setter = template_setter
+
+ def _tokenize(self, value, index):
+ if index == 0 and normalize(value) == 'testtemplate':
+ self._tokenizer = Setting(self._template_setter)
+ return _Table._tokenize(self, value, index)
+
+ def end_row(self):
+ self.__init__(self._template_setter, prev_tokenizer=self._tokenizer)
+
+
+class TestCaseTable(_Table):
+ _setting_class = TestCaseSetting
+ _test_template = None
+ _default_template = None
+
+ @property
+ def _tokenizer_class(self):
+ if self._test_template or (self._default_template and
+ self._test_template is not False):
+ return TemplatedKeywordCall
+ return KeywordCall
+
+ def _continues(self, value, index):
+ return index > 0 and _Table._continues(self, value, index)
+
+ def _tokenize(self, value, index):
+ if index == 0:
+ if value:
+ self._test_template = None
+ return GherkinTokenizer().tokenize(value, TC_KW_NAME)
+ if index == 1 and self._is_setting(value):
+ if self._is_template(value):
+ self._test_template = False
+ self._tokenizer = self._setting_class(self.set_test_template)
+ else:
+ self._tokenizer = self._setting_class()
+ if index == 1 and self._is_for_loop(value):
+ self._tokenizer = ForLoop()
+ if index == 1 and self._is_empty(value):
+ return [(value, SYNTAX)]
+ return _Table._tokenize(self, value, index)
+
+ def _is_setting(self, value):
+ return value.startswith('[') and value.endswith(']')
+
+ def _is_template(self, value):
+ return normalize(value) == '[template]'
+
+ def _is_for_loop(self, value):
+ return value.startswith(':') and normalize(value, remove=':') == 'for'
+
+ def set_test_template(self, template):
+ self._test_template = self._is_template_set(template)
+
+ def set_default_template(self, template):
+ self._default_template = self._is_template_set(template)
+
+ def _is_template_set(self, template):
+ return normalize(template) not in ('', '\\', 'none', '${empty}')
+
+
+class KeywordTable(TestCaseTable):
+ _tokenizer_class = KeywordCall
+ _setting_class = KeywordSetting
+
+ def _is_template(self, value):
+ return False
+
+
+# Following code copied directly from Robot Framework 2.7.5.
+
+class VariableSplitter:
+
+ def __init__(self, string, identifiers):
+ self.identifier = None
+ self.base = None
+ self.index = None
+ self.start = -1
+ self.end = -1
+ self._identifiers = identifiers
+ self._may_have_internal_variables = False
+ try:
+ self._split(string)
+ except ValueError:
+ pass
+ else:
+ self._finalize()
+
+ def get_replaced_base(self, variables):
+ if self._may_have_internal_variables:
+ return variables.replace_string(self.base)
+ return self.base
+
+ def _finalize(self):
+ self.identifier = self._variable_chars[0]
+ self.base = ''.join(self._variable_chars[2:-1])
+ self.end = self.start + len(self._variable_chars)
+ if self._has_list_variable_index():
+ self.index = ''.join(self._list_variable_index_chars[1:-1])
+ self.end += len(self._list_variable_index_chars)
+
+ def _has_list_variable_index(self):
+ return self._list_variable_index_chars\
+ and self._list_variable_index_chars[-1] == ']'
+
+ def _split(self, string):
+ start_index, max_index = self._find_variable(string)
+ self.start = start_index
+ self._open_curly = 1
+ self._state = self._variable_state
+ self._variable_chars = [string[start_index], '{']
+ self._list_variable_index_chars = []
+ self._string = string
+ start_index += 2
+ for index, char in enumerate(string[start_index:]):
+ index += start_index # Giving start to enumerate only in Py 2.6+
+ try:
+ self._state(char, index)
+ except StopIteration:
+ return
+ if index == max_index and not self._scanning_list_variable_index():
+ return
+
+ def _scanning_list_variable_index(self):
+ return self._state in [self._waiting_list_variable_index_state,
+ self._list_variable_index_state]
+
+ def _find_variable(self, string):
+ max_end_index = string.rfind('}')
+ if max_end_index == -1:
+ raise ValueError('No variable end found')
+ if self._is_escaped(string, max_end_index):
+ return self._find_variable(string[:max_end_index])
+ start_index = self._find_start_index(string, 1, max_end_index)
+ if start_index == -1:
+ raise ValueError('No variable start found')
+ return start_index, max_end_index
+
+ def _find_start_index(self, string, start, end):
+ index = string.find('{', start, end) - 1
+ if index < 0:
+ return -1
+ if self._start_index_is_ok(string, index):
+ return index
+ return self._find_start_index(string, index+2, end)
+
+ def _start_index_is_ok(self, string, index):
+ return string[index] in self._identifiers\
+ and not self._is_escaped(string, index)
+
+ def _is_escaped(self, string, index):
+ escaped = False
+ while index > 0 and string[index-1] == '\\':
+ index -= 1
+ escaped = not escaped
+ return escaped
+
+ def _variable_state(self, char, index):
+ self._variable_chars.append(char)
+ if char == '}' and not self._is_escaped(self._string, index):
+ self._open_curly -= 1
+ if self._open_curly == 0:
+ if not self._is_list_variable():
+ raise StopIteration
+ self._state = self._waiting_list_variable_index_state
+ elif char in self._identifiers:
+ self._state = self._internal_variable_start_state
+
+ def _is_list_variable(self):
+ return self._variable_chars[0] == '@'
+
+ def _internal_variable_start_state(self, char, index):
+ self._state = self._variable_state
+ if char == '{':
+ self._variable_chars.append(char)
+ self._open_curly += 1
+ self._may_have_internal_variables = True
+ else:
+ self._variable_state(char, index)
+
+ def _waiting_list_variable_index_state(self, char, index):
+ if char != '[':
+ raise StopIteration
+ self._list_variable_index_chars.append(char)
+ self._state = self._list_variable_index_state
+
+ def _list_variable_index_state(self, char, index):
+ self._list_variable_index_chars.append(char)
+ if char == ']':
+ raise StopIteration
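Review note on `RowSplitter` in the file above: each row is returned as an alternating sequence of separators and cells, starting with a separator, so that `RowTokenizer` can use `divmod(index-1, 2)` to tell the two apart. The sketch below restates that splitting logic as a single Python 3 function so the output shape is easy to inspect. The function name `split_row` and the module-level pattern names are hypothetical; the regular expressions are the ones from the diff, written as raw strings.

```python
import re

# Patterns taken from RowSplitter in the diff, as raw strings.
SPACE_SPLITTER = re.compile(r'( {2,})')
PIPE_SPLITTER = re.compile(r'((?:^| +)\|(?: +|$))')


def split_row(row):
    """Return separators and cells alternately, ending with a newline.

    Hypothetical condensed form of RowSplitter.split; not a Pygments API.
    """
    use_pipes = row.startswith('| ')
    row = row.rstrip()
    if use_pipes:
        # The leading pipe is the first separator; nothing precedes it.
        _, separator, rest = PIPE_SPLITTER.split(row, 1)
        parts = [separator]
        while PIPE_SPLITTER.search(rest):
            cell, separator, rest = PIPE_SPLITTER.split(rest, 1)
            parts.extend([cell, separator])
        parts.append(rest)
    else:
        # Start with an empty pseudo-separator so both formats line up.
        parts = [''] + SPACE_SPLITTER.split(row)
    return parts + ['\n']
```

Joining the returned parts reproduces the right-stripped row plus a newline, which is why the lexer can advance its running index by `len(value)` for every yielded piece.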
diff --git a/pygments/lexers/asm.py b/pygments/lexers/asm.py
index c1d46bcb..a87ca9c2 100644
--- a/pygments/lexers/asm.py
+++ b/pygments/lexers/asm.py
@@ -240,8 +240,8 @@ class LlvmLexer(RegexLexer):
r'|linkonce_odr|weak|weak_odr|appending|dllimport|dllexport'
r'|common|default|hidden|protected|extern_weak|external'
r'|thread_local|zeroinitializer|undef|null|to|tail|target|triple'
- r'|deplibs|datalayout|volatile|nuw|nsw|exact|inbounds|align'
- r'|addrspace|section|alias|module|asm|sideeffect|gc|dbg'
+ r'|datalayout|volatile|nuw|nsw|nnan|ninf|nsz|arcp|fast|exact|inbounds'
+ r'|align|addrspace|section|alias|module|asm|sideeffect|gc|dbg'
r'|ccc|fastcc|coldcc|x86_stdcallcc|x86_fastcallcc|arm_apcscc'
r'|arm_aapcscc|arm_aapcs_vfpcc'
diff --git a/pygments/lexers/compiled.py b/pygments/lexers/compiled.py
index 66b782e5..f8055fc5 100644
--- a/pygments/lexers/compiled.py
+++ b/pygments/lexers/compiled.py
@@ -27,7 +27,8 @@ __all__ = ['CLexer', 'CppLexer', 'DLexer', 'DelphiLexer', 'ECLexer',
'DylanLexer', 'ObjectiveCLexer', 'FortranLexer', 'GLShaderLexer',
'PrologLexer', 'CythonLexer', 'ValaLexer', 'OocLexer', 'GoLexer',
'FelixLexer', 'AdaLexer', 'Modula2Lexer', 'BlitzMaxLexer',
- 'NimrodLexer', 'FantomLexer', 'RustLexer', 'CudaLexer', 'MonkeyLexer']
+ 'NimrodLexer', 'FantomLexer', 'RustLexer', 'CudaLexer', 'MonkeyLexer',
+ 'DylanLidLexer']
class CLexer(RegexLexer):
@@ -1057,40 +1058,176 @@ class DylanLexer(RegexLexer):
name = 'Dylan'
aliases = ['dylan']
- filenames = ['*.dylan', '*.dyl']
+ filenames = ['*.dylan', '*.dyl', '*.intr']
mimetypes = ['text/x-dylan']
- flags = re.DOTALL
+ flags = re.IGNORECASE
+
+ builtins = set([
+ 'subclass', 'abstract', 'block', 'concrete', 'constant', 'class',
+ 'compiler-open', 'compiler-sideways', 'domain', 'dynamic',
+ 'each-subclass', 'exception', 'exclude', 'function', 'generic',
+ 'handler', 'inherited', 'inline', 'inline-only', 'instance',
+ 'interface', 'import', 'keyword', 'library', 'macro', 'method',
+ 'module', 'open', 'primary', 'required', 'sealed', 'sideways',
+ 'singleton', 'slot', 'thread', 'variable', 'virtual'])
+
+ keywords = set([
+ 'above', 'afterwards', 'begin', 'below', 'by', 'case', 'cleanup',
+ 'create', 'define', 'else', 'elseif', 'end', 'export', 'finally',
+ 'for', 'from', 'if', 'in', 'let', 'local', 'otherwise', 'rename',
+ 'select', 'signal', 'then', 'to', 'unless', 'until', 'use', 'when',
+ 'while'])
+
+ operators = set([
+ '~', '+', '-', '*', '|', '^', '=', '==', '~=', '~==', '<', '<=',
+ '>', '>=', '&', '|'])
+
+ functions = set([
+ 'abort', 'abs', 'add', 'add!', 'add-method', 'add-new', 'add-new!',
+ 'all-superclasses', 'always', 'any?', 'applicable-method?', 'apply',
+ 'aref', 'aref-setter', 'as', 'as-lowercase', 'as-lowercase!',
+ 'as-uppercase', 'as-uppercase!', 'ash', 'backward-iteration-protocol',
+ 'break', 'ceiling', 'ceiling/', 'cerror', 'check-type', 'choose',
+ 'choose-by', 'complement', 'compose', 'concatenate', 'concatenate-as',
+ 'condition-format-arguments', 'condition-format-string', 'conjoin',
+ 'copy-sequence', 'curry', 'default-handler', 'dimension', 'dimensions',
+ 'direct-subclasses', 'direct-superclasses', 'disjoin', 'do',
+ 'do-handlers', 'element', 'element-setter', 'empty?', 'error', 'even?',
+ 'every?', 'false-or', 'fill!', 'find-key', 'find-method', 'first',
+ 'first-setter', 'floor', 'floor/', 'forward-iteration-protocol',
+ 'function-arguments', 'function-return-values',
+ 'function-specializers', 'gcd', 'generic-function-mandatory-keywords',
+ 'generic-function-methods', 'head', 'head-setter', 'identity',
+ 'initialize', 'instance?', 'integral?', 'intersection',
+ 'key-sequence', 'key-test', 'last', 'last-setter', 'lcm', 'limited',
+ 'list', 'logand', 'logbit?', 'logior', 'lognot', 'logxor', 'make',
+ 'map', 'map-as', 'map-into', 'max', 'member?', 'merge-hash-codes',
+ 'min', 'modulo', 'negative', 'negative?', 'next-method',
+ 'object-class', 'object-hash', 'odd?', 'one-of', 'pair', 'pop',
+ 'pop-last', 'positive?', 'push', 'push-last', 'range', 'rank',
+ 'rcurry', 'reduce', 'reduce1', 'remainder', 'remove', 'remove!',
+ 'remove-duplicates', 'remove-duplicates!', 'remove-key!',
+ 'remove-method', 'replace-elements!', 'replace-subsequence!',
+ 'restart-query', 'return-allowed?', 'return-description',
+ 'return-query', 'reverse', 'reverse!', 'round', 'round/',
+ 'row-major-index', 'second', 'second-setter', 'shallow-copy',
+ 'signal', 'singleton', 'size', 'size-setter', 'slot-initialized?',
+ 'sort', 'sort!', 'sorted-applicable-methods', 'subsequence-position',
+ 'subtype?', 'table-protocol', 'tail', 'tail-setter', 'third',
+ 'third-setter', 'truncate', 'truncate/', 'type-error-expected-type',
+ 'type-error-value', 'type-for-copy', 'type-union', 'union', 'values',
+ 'vector', 'zero?'])
+
+ valid_name = '\\\\?[a-zA-Z0-9' + re.escape('!&*<>|^$%@_-+~?/=') + ']+'
+
+ def get_tokens_unprocessed(self, text):
+ for index, token, value in RegexLexer.get_tokens_unprocessed(self, text):
+ if token is Name:
+ if value in self.builtins:
+ yield index, Name.Builtin, value
+ continue
+ if value in self.keywords:
+ yield index, Keyword, value
+ continue
+ if value in self.functions:
+ yield index, Name.Builtin, value
+ continue
+ if value in self.operators:
+ yield index, Operator, value
+ continue
+ yield index, token, value
tokens = {
'root': [
- (r'\b(subclass|abstract|block|c(on(crete|stant)|lass)|domain'
- r'|ex(c(eption|lude)|port)|f(unction(al)?)|generic|handler'
- r'|i(n(herited|line|stance|terface)|mport)|library|m(acro|ethod)'
- r'|open|primary|sealed|si(deways|ngleton)|slot'
- r'|v(ariable|irtual))\b', Name.Builtin),
- (r'<\w+>', Keyword.Type),
+ # Whitespace
+ (r'\s+', Text),
+
+ # single line comment
+ (r'//.*?\n', Comment.Single),
+
+ # lid header
+ (r'([A-Za-z0-9-]+)(:)([ \t]*)(.*(?:\n[ \t].+)*)',
+ bygroups(Name.Attribute, Operator, Text, String)),
+
+ ('', Text, 'code') # no header match, switch to code
+ ],
+ 'code': [
+ # Whitespace
+ (r'\s+', Text),
+
+ # single line comment
(r'//.*?\n', Comment.Single),
- (r'/\*[\w\W]*?\*/', Comment.Multiline),
+
+ # multi-line comment
+ (r'/\*', Comment.Multiline, 'comment'),
+
+ # strings and characters
(r'"', String, 'string'),
(r"'(\\.|\\[0-7]{1,3}|\\x[a-fA-F0-9]{1,2}|[^\\\'\n])'", String.Char),
- (r'=>|\b(a(bove|fterwards)|b(e(gin|low)|y)|c(ase|leanup|reate)'
- r'|define|else(if)?|end|f(inally|or|rom)|i[fn]|l(et|ocal)|otherwise'
- r'|rename|s(elect|ignal)|t(hen|o)|u(n(less|til)|se)|wh(en|ile))\b',
- Keyword),
- (r'([ \t])([!\$%&\*\/:<=>\?~_^a-zA-Z0-9.+\-]*:)',
- bygroups(Text, Name.Variable)),
- (r'([ \t]*)(\S+[^:])([ \t]*)(\()([ \t]*)',
- bygroups(Text, Name.Function, Text, Punctuation, Text)),
- (r'-?[0-9.]+', Number),
- (r'[(),;]', Punctuation),
- (r'\$[a-zA-Z0-9-]+', Name.Constant),
- (r'[!$%&*/:<>=?~^.+\[\]{}-]+', Operator),
- (r'\s+', Text),
- (r'#"[a-zA-Z0-9-]+"', Keyword),
+
+ # binary integer
+ (r'#[bB][01]+', Number),
+
+ # octal integer
+ (r'#[oO][0-7]+', Number.Oct),
+
+ # floating point
+ (r'[-+]?(\d*\.\d+(e[-+]?\d+)?|\d+(\.\d*)?e[-+]?\d+)', Number.Float),
+
+ # decimal integer
+ (r'[-+]?\d+', Number.Integer),
+
+ # hex integer
+ (r'#[xX][0-9a-fA-F]+', Number.Hex),
+
+ # Macro parameters
+ (r'(\?' + valid_name + ')(:)(token|name|variable|expression|body|case-body|\*)',
+ bygroups(Name.Tag, Operator, Name.Builtin)),
+ (r'(\?)(:)(token|name|variable|expression|body|case-body|\*)',
+ bygroups(Name.Tag, Operator, Name.Builtin)),
+ (r'\?' + valid_name, Name.Tag),
+
+ # Punctuation
+ (r'(=>|::|#\(|#\[|##|\?\?|\?=|\?|[(){}\[\],\.;])', Punctuation),
+
+ # Most operators are picked up as names and then re-flagged.
+ # This one isn't valid in a name though, so we pick it up now.
+ (r':=', Operator),
+
+ # Pick up #t / #f before we match other stuff with #.
+ (r'#[tf]', Literal),
+
+ # #"foo" style keywords
+ (r'#"', String.Symbol, 'keyword'),
+
+ # #rest, #key, #all-keys, etc.
(r'#[a-zA-Z0-9-]+', Keyword),
- (r'#(\(|\[)', Punctuation),
- (r'[a-zA-Z0-9-_]+', Name.Variable),
+
+ # required-init-keyword: style keywords.
+ (valid_name + ':', Keyword),
+
+ # class names
+ (r'<' + valid_name + '>', Name.Class),
+
+ # define variable forms.
+ (r'\*' + valid_name + '\*', Name.Variable.Global),
+
+ # define constant forms.
+ (r'\$' + valid_name, Name.Constant),
+
+ # everything else. We re-flag some of these in the method above.
+ (valid_name, Name),
+ ],
+ 'comment': [
+ (r'[^*/]', Comment.Multiline),
+ (r'/\*', Comment.Multiline, '#push'),
+ (r'\*/', Comment.Multiline, '#pop'),
+ (r'[*/]', Comment.Multiline)
+ ],
+ 'keyword': [
+ (r'"', String.Symbol, '#pop'),
+ (r'[^\\"]+', String.Symbol), # all other characters
],
'string': [
(r'"', String, '#pop'),
@@ -1098,7 +1235,36 @@ class DylanLexer(RegexLexer):
(r'[^\\"\n]+', String), # all other characters
(r'\\\n', String), # line continuation
(r'\\', String), # stray backslash
- ],
+ ]
+ }
+
+
+class DylanLidLexer(RegexLexer):
+ """
+ For Dylan LID (Library Interchange Definition) files.
+
+ *New in Pygments 1.6.*
+ """
+
+ name = 'DylanLID'
+ aliases = ['dylan-lid', 'lid']
+ filenames = ['*.lid', '*.hdp']
+ mimetypes = ['text/x-dylan-lid']
+
+ flags = re.IGNORECASE
+
+ tokens = {
+ 'root': [
+ # Whitespace
+ (r'\s+', Text),
+
+ # single line comment
+ (r'//.*?\n', Comment.Single),
+
+ # lid header
+ (r'(.*?)(:)([ \t]*)(.*(?:\n[ \t].+)*)',
+ bygroups(Name.Attribute, Operator, Text, String)),
+ ]
}
@@ -2943,13 +3109,13 @@ class RustLexer(RegexLexer):
(r'/[*](.|\n)*?[*]/', Comment.Multiline),
# Keywords
- (r'(alt|as|assert|be|break|check|claim|class|const'
- r'|cont|copy|crust|do|else|enum|export|fail'
- r'|false|fn|for|if|iface|impl|import|let|log'
- r'|loop|mod|mut|native|pure|resource|ret|true'
- r'|type|unsafe|use|white|note|bind|prove|unchecked'
- r'|with|syntax|u8|u16|u32|u64|i8|i16|i32|i64|uint'
- r'|int|f32|f64)\b', Keyword),
+ (r'(as|assert|break|const'
+ r'|copy|do|else|enum|extern|fail'
+ r'|false|fn|for|if|impl|let|log'
+ r'|loop|match|mod|move|mut|once|priv|pub|pure'
+ r'|ref|return|static|struct|trait|true|type|unsafe|use|while'
+ r'|u8|u16|u32|u64|i8|i16|i32|i64|uint'
+ r'|int|float|f32|f64|str)\b', Keyword),
# Character Literal
(r"""'(\\['"\\nrt]|\\x[0-9a-fA-F]{2}|\\[0-7]{1,3}"""
@@ -2978,8 +3144,8 @@ class RustLexer(RegexLexer):
(r'#\[', Comment.Preproc, 'attribute['),
(r'#\(', Comment.Preproc, 'attribute('),
# Macros
- (r'#[A-Za-z_][A-Za-z0-9_]*\[', Comment.Preproc, 'attribute['),
- (r'#[A-Za-z_][A-Za-z0-9_]*\(', Comment.Preproc, 'attribute('),
+ (r'[A-Za-z_][A-Za-z0-9_]*!\[', Comment.Preproc, 'attribute['),
+ (r'[A-Za-z_][A-Za-z0-9_]*!\(', Comment.Preproc, 'attribute('),
],
'number_lit': [
(r'(([ui](8|16|32|64)?)|(f(32|64)?))?', Keyword, '#pop'),
diff --git a/pygments/lexers/jvm.py b/pygments/lexers/jvm.py
index 83eb02f5..d337c3f2 100644
--- a/pygments/lexers/jvm.py
+++ b/pygments/lexers/jvm.py
@@ -675,7 +675,7 @@ class ClojureLexer(RegexLexer):
(r'::?' + valid_name, String.Symbol),
# special operators
- (r'~@|[`\'#^~&]', Operator),
+ (r'~@|[`\'#^~&@]', Operator),
# highlight the special forms
(_multi_escape(special_forms), Keyword),
diff --git a/pygments/lexers/math.py b/pygments/lexers/math.py
index 0e78bb49..5f6bb7c8 100644
--- a/pygments/lexers/math.py
+++ b/pygments/lexers/math.py
@@ -11,6 +11,7 @@
import re
+from pygments.util import shebang_matches
from pygments.lexer import Lexer, RegexLexer, bygroups, include, \
combined, do_insertions
from pygments.token import Comment, String, Punctuation, Keyword, Name, \
@@ -97,8 +98,8 @@ class JuliaLexer(RegexLexer):
(r'[a-zA-Z_][a-zA-Z0-9_]*', Name),
# numbers
- (r'(\d+\.\d*|\d*\.\d+)([eE][+-]?[0-9]+)?', Number.Float),
- (r'\d+[eE][+-]?[0-9]+', Number.Float),
+ (r'(\d+\.\d*|\d*\.\d+)([eEf][+-]?[0-9]+)?', Number.Float),
+ (r'\d+[eEf][+-]?[0-9]+', Number.Float),
(r'0b[01]+', Number.Binary),
(r'0o[0-7]+', Number.Oct),
(r'0x[a-fA-F0-9]+', Number.Hex),
@@ -342,6 +343,10 @@ class MatlabLexer(RegexLexer):
# (not great, but handles common cases...)
(r'(?<=[\w\)\]])\'', Operator),
+ (r'(\d+\.\d*|\d*\.\d+)([eEf][+-]?[0-9]+)?', Number.Float),
+ (r'\d+[eEf][+-]?[0-9]+', Number.Float),
+ (r'\d+', Number.Integer),
+
(r'(?<![\w\)\]])\'', String, 'string'),
('[a-zA-Z_][a-zA-Z0-9_]*', Name),
(r'.', Text),
@@ -788,6 +793,10 @@ class OctaveLexer(RegexLexer):
(r'"[^"]*"', String),
+ (r'(\d+\.\d*|\d*\.\d+)([eEf][+-]?[0-9]+)?', Number.Float),
+ (r'\d+[eEf][+-]?[0-9]+', Number.Float),
+ (r'\d+', Number.Integer),
+
# quote can be transpose, instead of string:
# (not great, but handles common cases...)
(r'(?<=[\w\)\]])\'', Operator),
@@ -859,6 +868,10 @@ class ScilabLexer(RegexLexer):
(r'(?<=[\w\)\]])\'', Operator),
(r'(?<![\w\)\]])\'', String, 'string'),
+ (r'(\d+\.\d*|\d*\.\d+)([eEf][+-]?[0-9]+)?', Number.Float),
+ (r'\d+[eEf][+-]?[0-9]+', Number.Float),
+ (r'\d+', Number.Integer),
+
('[a-zA-Z_][a-zA-Z0-9_]*', Name),
(r'.', Text),
],
diff --git a/pygments/lexers/other.py b/pygments/lexers/other.py
index 8cef26ca..35a8a813 100644
--- a/pygments/lexers/other.py
+++ b/pygments/lexers/other.py
@@ -19,6 +19,7 @@ from pygments.util import get_bool_opt
from pygments.lexers.web import HtmlLexer
from pygments.lexers._openedgebuiltins import OPENEDGEKEYWORDS
+from pygments.lexers._robotframeworklexer import RobotFrameworkLexer
# backwards compatibility
from pygments.lexers.sql import SqlLexer, MySqlLexer, SqliteConsoleLexer
@@ -32,7 +33,8 @@ __all__ = ['BrainfuckLexer', 'BefungeLexer', 'RedcodeLexer', 'MOOCodeLexer',
'AutohotkeyLexer', 'GoodDataCLLexer', 'MaqlLexer', 'ProtoBufLexer',
'HybrisLexer', 'AwkLexer', 'Cfengine3Lexer', 'SnobolLexer',
'ECLLexer', 'UrbiscriptLexer', 'OpenEdgeLexer', 'BroLexer',
- 'MscgenLexer', 'KconfigLexer', 'VGLLexer', 'SourcePawnLexer', 'NSISLexer']
+ 'MscgenLexer', 'KconfigLexer', 'VGLLexer', 'SourcePawnLexer',
+ 'RobotFrameworkLexer', 'PuppetLexer', 'NSISLexer']
class ECLLexer(RegexLexer):
@@ -1255,7 +1257,8 @@ class ModelicaLexer(RegexLexer):
],
'classes': [
(r'(block|class|connector|function|model|package|'
- r'record|type)\b', Name.Class),
+ r'record|type)(\s+)([A-Za-z_]+)',
+ bygroups(Keyword, Text, Name.Class))
],
'string': [
(r'"', String, '#pop'),
@@ -2771,7 +2774,7 @@ class OpenEdgeLexer(RegexLexer):
keywords = (r'(?i)(^|(?<=[^0-9a-z_\-]))(' +
r'|'.join(OPENEDGEKEYWORDS) +
- r')\s*($|(?=[^0-9a-z_\-]))')
+ r')\s*($|(?=[^0-9a-z_\-]))')
tokens = {
'root': [
(r'/\*', Comment.Multiline, 'comment'),
@@ -3156,10 +3159,91 @@ class SourcePawnLexer(RegexLexer):
if value in self.SM_TYPES:
token = Keyword.Type
elif value in self._functions:
- tokens = Name.Builtin
+ token = Name.Builtin
yield index, token, value
+class PuppetLexer(RegexLexer):
+ """
+ For `Puppet <http://puppetlabs.com/>`__ configuration DSL.
+
+ *New in Pygments 1.6.*
+ """
+ name = 'Puppet'
+ aliases = ['puppet']
+ filenames = ['*.pp']
+
+ tokens = {
+ 'root': [
+ include('comments'),
+ include('keywords'),
+ include('names'),
+ include('numbers'),
+ include('operators'),
+ include('strings'),
+
+ (r'[]{}:(),;[]', Punctuation),
+ (r'[^\S\n]+', Text),
+ ],
+
+ 'comments': [
+ (r'\s*#.*$', Comment),
+ (r'/(\\\n)?[*](.|\n)*?[*](\\\n)?/', Comment.Multiline),
+ ],
+
+ 'operators': [
+ (r'(=>|\?|<|>|=|\+|-|/|\*|~|!|\|)', Operator),
+ (r'(in|and|or|not)\b', Operator.Word),
+ ],
+
+ 'names': [
+ ('[a-zA-Z_][a-zA-Z0-9_]*', Name.Attribute),
+ (r'(\$\S+)(\[)(\S+)(\])', bygroups(Name.Variable, Punctuation,
+ String, Punctuation)),
+ (r'\$\S+', Name.Variable),
+ ],
+
+ 'numbers': [
+ # Copypasta from the Python lexer
+ (r'(\d+\.\d*|\d*\.\d+)([eE][+-]?[0-9]+)?j?', Number.Float),
+ (r'\d+[eE][+-]?[0-9]+j?', Number.Float),
+ (r'0[0-7]+j?', Number.Oct),
+ (r'0[xX][a-fA-F0-9]+', Number.Hex),
+ (r'\d+L', Number.Integer.Long),
+ (r'\d+j?', Number.Integer)
+ ],
+
+ 'keywords': [
+ # Left out 'group' and 'require'
+ # Since they're often used as attributes
+ (r'(?i)(absent|alert|alias|audit|augeas|before|case|check|class|'
+ r'computer|configured|contained|create_resources|crit|cron|debug|'
+ r'default|define|defined|directory|else|elsif|emerg|err|exec|'
+ r'extlookup|fail|false|file|filebucket|fqdn_rand|generate|host|if|'
+ r'import|include|info|inherits|inline_template|installed|'
+ r'interface|k5login|latest|link|loglevel|macauthorization|'
+ r'mailalias|maillist|mcx|md5|mount|mounted|nagios_command|'
+ r'nagios_contact|nagios_contactgroup|nagios_host|'
+ r'nagios_hostdependency|nagios_hostescalation|nagios_hostextinfo|'
+ r'nagios_hostgroup|nagios_service|nagios_servicedependency|'
+ r'nagios_serviceescalation|nagios_serviceextinfo|'
+ r'nagios_servicegroup|nagios_timeperiod|node|noop|notice|notify|'
+ r'package|present|purged|realize|regsubst|resources|role|router|'
+ r'running|schedule|scheduled_task|search|selboolean|selmodule|'
+ r'service|sha1|shellquote|split|sprintf|ssh_authorized_key|sshkey|'
+ r'stage|stopped|subscribe|tag|tagged|template|tidy|true|undef|'
+ r'unmounted|user|versioncmp|vlan|warning|yumrepo|zfs|zone|'
+ r'zpool)\b', Keyword),
+ ],
+
+ 'strings': [
+ (r'"([^"])*"', String),
+ (r'\'([^\'])*\'', String),
+ ],
+
+ }
+
+
class NSISLexer(RegexLexer):
"""
For `NSIS <http://nsis.sourceforge.net/>`_ scripts.
@@ -3187,7 +3271,8 @@ class NSISLexer(RegexLexer):
('.', Text),
],
'basic': [
- (r'(\n)(Function)(\s+)([\.\_a-zA-Z][\.\_a-zA-Z0-9]*)\b',bygroups(Text, Keyword, Text, Name.Function)),
+ (r'(\n)(Function)(\s+)([\.\_a-zA-Z][\.\_a-zA-Z0-9]*)\b',
+ bygroups(Text, Keyword, Text, Name.Function)),
(r'\b([_a-zA-Z][_a-zA-Z0-9]*)(::)([a-zA-Z][a-zA-Z0-9]*)\b',
bygroups(Keyword.Namespace, Punctuation, Name.Function)),
(r'\b([_a-zA-Z][_a-zA-Z0-9]*)(:)', bygroups(Name.Label, Punctuation)),
diff --git a/pygments/lexers/text.py b/pygments/lexers/text.py
index ca50665e..7b3624d0 100644
--- a/pygments/lexers/text.py
+++ b/pygments/lexers/text.py
@@ -25,7 +25,7 @@ __all__ = ['IniLexer', 'PropertiesLexer', 'SourcesListLexer', 'BaseMakefileLexer
'RstLexer', 'VimLexer', 'GettextLexer', 'SquidConfLexer',
'DebianControlLexer', 'DarcsPatchLexer', 'YamlLexer',
'LighttpdConfLexer', 'NginxConfLexer', 'CMakeLexer', 'HttpLexer',
- 'PyPyLogLexer', 'RegeditLexer']
+ 'PyPyLogLexer', 'RegeditLexer', 'HxmlLexer']
class IniLexer(RegexLexer):
@@ -1749,8 +1749,8 @@ class PyPyLogLexer(RegexLexer):
],
"jit-log": [
(r"\[\w+\] jit-log-.*?}$", Keyword, "#pop"),
-
(r"^\+\d+: ", Comment),
+ (r"--end of the loop--", Comment),
(r"[ifp]\d+", Name),
(r"ptr\d+", Name),
(r"(\()(\w+(?:\.\w+)?)(\))",
@@ -1760,7 +1760,7 @@ class PyPyLogLexer(RegexLexer):
(r"-?\d+", Number.Integer),
(r"'.*'", String),
(r"(None|descr|ConstClass|ConstPtr|TargetToken)", Name),
- (r"<.*?>", Name.Builtin),
+ (r"<.*?>+", Name.Builtin),
(r"(label|debug_merge_point|jump|finish)", Name.Class),
(r"(int_add_ovf|int_add|int_sub_ovf|int_sub|int_mul_ovf|int_mul|"
r"int_floordiv|int_mod|int_lshift|int_rshift|int_and|int_or|"
@@ -1800,3 +1800,44 @@ class PyPyLogLexer(RegexLexer):
(r"#.*?$", Comment),
],
}
+
+
+class HxmlLexer(RegexLexer):
+ """
+ Lexer for `haXe build <http://haxe.org/doc/compiler>`_ files.
+
+ *New in Pygments 1.6.*
+ """
+ name = 'Hxml'
+ aliases = ['haxeml', 'hxml']
+ filenames = ['*.hxml']
+
+ tokens = {
+ 'root': [
+ # Separator
+ (r'(--)(next)', bygroups(Punctuation, Generic.Heading)),
+ # Compiler switches with one dash
+ (r'(-)(prompt|debug|v)', bygroups(Punctuation, Keyword)),
+ # Compiler switches with two dashes
+ (r'(--)(neko-source|flash-strict|flash-use-stage|no-opt|no-traces|'
+ r'no-inline|times|no-output)', bygroups(Punctuation, Keyword)),
+ # Targets and other options that take an argument
+ (r'(-)(cpp|js|neko|x|as3|swf9?|swf-lib|php|xml|main|lib|D|resource|'
+ r'cp|cmd)( +)(.+)',
+ bygroups(Punctuation, Keyword, Whitespace, String)),
+ # Options that take only numerical arguments
+ (r'(-)(swf-version)( +)(\d+)',
+ bygroups(Punctuation, Keyword, Whitespace, Number.Integer)),
+ # An option that defines the size, the fps and the background
+ # color of a Flash movie
+ (r'(-)(swf-header)( +)(\d+)(:)(\d+)(:)(\d+)(:)([A-Fa-f0-9]{6})',
+ bygroups(Punctuation, Keyword, Whitespace, Number.Integer,
+ Punctuation, Number.Integer, Punctuation, Number.Integer,
+ Punctuation, Number.Hex)),
+ # Options with two dashes that take arguments
+ (r'(--)(js-namespace|php-front|php-lib|remap|gen-hx-classes)( +)'
+ r'(.+)', bygroups(Punctuation, Keyword, Whitespace, String)),
+ # Single line comment, multiline ones are not allowed.
+ (r'#.*', Comment.Single)
+ ]
+ }
diff --git a/pygments/lexers/web.py b/pygments/lexers/web.py
index 79245d34..c6e58f01 100644
--- a/pygments/lexers/web.py
+++ b/pygments/lexers/web.py
@@ -1762,7 +1762,7 @@ class ScssLexer(RegexLexer):
(r'(@include)( [\w-]+)', bygroups(Keyword, Name.Decorator), 'value'),
(r'@extend', Keyword, 'selector'),
(r'@[a-z0-9_-]+', Keyword, 'selector'),
- (r'(\$[\w-]\w*)([ \t]*:)', bygroups(Name.Variable, Operator), 'value'),
+ (r'(\$[\w-]*\w)([ \t]*:)', bygroups(Name.Variable, Operator), 'value'),
(r'(?=[^;{}][;}])', Name.Attribute, 'attr'),
(r'(?=[^;{}:]+:[^a-z])', Name.Attribute, 'attr'),
(r'', Text, 'selector'),
diff --git a/tests/examplefiles/BOM.js b/tests/examplefiles/BOM.js
new file mode 100644
index 00000000..930599c1
--- /dev/null
+++ b/tests/examplefiles/BOM.js
@@ -0,0 +1 @@
+/* There is a BOM at the beginning of this file. */
\ No newline at end of file
diff --git a/tests/examplefiles/classes.dylan b/tests/examplefiles/classes.dylan
index 6dd55ff2..7bb88faa 100644
--- a/tests/examplefiles/classes.dylan
+++ b/tests/examplefiles/classes.dylan
@@ -1,12 +1,26 @@
+module: sample
+comment: for make sure that does not highlight per word.
+ and it continues on to the next line.
+
define class <car> (<object>)
slot serial-number :: <integer> = unique-serial-number();
- slot model-name :: <string>,
+ constant slot model-name :: <string>,
required-init-keyword: model:;
- slot has-sunroof? :: <boolean>,
+ each-subclass slot has-sunroof? :: <boolean>,
init-keyword: sunroof?:,
init-value: #f;
+ keyword foo:;
+ required keyword bar:;
end class <car>;
+define class <flying-car> (<car>)
+end class <flying-car>;
+
+let flying-car = make(<flying-car>);
+let car? :: <car?> = #f;
+let prefixed-car :: <vehicles/car> = #f;
+let model :: <car-911> = #f;
+
define constant $empty-string = "";
define constant $escaped-backslash = '\\';
define constant $escaped-single-quote = '\'';
@@ -31,10 +45,79 @@ define method foo() => _ :: <boolean>;
#t
end method;
-define method \+()
-end;
+define method \+
+ (offset1 :: <time-offset>, offset2 :: <time-offset>)
+ => (sum :: <time-offset>)
+ let sum = offset1.total-seconds + offset2.total-seconds;
+ make(<time-offset>, total-seconds: sum);
+end method \+;
+
+define method bar ()
+ 1 | 2 & 3
+end
+
+if (bar)
+ 1
+elseif (foo)
+ 2
+else
+ 3
+end if;
+
+select (foo by instance?)
+ <integer> => 1
+ otherwise => 3
+end select;
+
+/* multi
+ line
+ comment
+*/
+
+/* multi line comments
+ /* can be */
+ nested */
define constant $symbol = #"hello";
define variable *vector* = #[3.5, 5]
define constant $list = #(1, 2);
define constant $pair = #(1 . "foo")
+
+let octal-number = #o238;
+let hex-number = #x3890ADEF;
+let binary-number = #b1010;
+let float-exponent = 3.5e10;
+
+block (return)
+ with-lock (lock)
+ return();
+ end;
+exception (e :: <error>)
+ format-out("Oh no");
+cleanup
+ return();
+afterwards
+ format-out("Hello");
+end;
+
+define macro repeat
+ { repeat ?:body end }
+ => { block (?=stop!)
+ local method again() ?body; again() end;
+ again();
+ end }
+end macro repeat;
+
+define macro with-decoded-seconds
+ {
+ with-decoded-seconds
+ (?max:variable, ?min:variable, ?sec:variable = ?time:expression)
+ ?:body
+ end
+ }
+ => {
+ let (?max, ?min, ?sec) = decode-total-seconds(?time);
+ ?body
+ }
+end macro;
+
diff --git a/tests/examplefiles/nanomsg.intr b/tests/examplefiles/nanomsg.intr
new file mode 100644
index 00000000..d21f62cc
--- /dev/null
+++ b/tests/examplefiles/nanomsg.intr
@@ -0,0 +1,95 @@
+module: nanomsg
+synopsis: generated bindings for the nanomsg library
+author: Bruce Mitchener, Jr.
+copyright: See LICENSE file in this distribution.
+
+define simple-C-mapped-subtype <C-buffer-offset> (<C-char*>)
+ export-map <machine-word>, export-function: identity;
+end;
+
+define interface
+ #include {
+ "sp/sp.h",
+ "sp/fanin.h",
+ "sp/inproc.h",
+ "sp/pair.h",
+ "sp/reqrep.h",
+ "sp/survey.h",
+ "sp/fanout.h",
+ "sp/ipc.h",
+ "sp/pubsub.h",
+ "sp/tcp.h"
+ },
+
+ exclude: {
+ "SP_HAUSNUMERO",
+ "SP_PAIR_ID",
+ "SP_PUBSUB_ID",
+ "SP_REQREP_ID",
+ "SP_FANIN_ID",
+ "SP_FANOUT_ID",
+ "SP_SURVEY_ID"
+ },
+
+ equate: {"char *" => <c-string>},
+
+ rename: {
+ "sp_recv" => %sp-recv,
+ "sp_send" => %sp-send,
+ "sp_setsockopt" => %sp-setsockopt
+ };
+
+ function "sp_version",
+ output-argument: 1,
+ output-argument: 2,
+ output-argument: 3;
+
+ function "sp_send",
+ map-argument: { 2 => <C-buffer-offset> };
+
+ function "sp_recv",
+ map-argument: { 2 => <C-buffer-offset> };
+
+end interface;
+
+// Function for adding the base address of the repeated slots of a <buffer>
+// to an offset and returning the result as a <machine-word>. This is
+// necessary for passing <buffer> contents across the FFI.
+
+define function buffer-offset
+ (the-buffer :: <buffer>, data-offset :: <integer>)
+ => (result-offset :: <machine-word>)
+ u%+(data-offset,
+ primitive-wrap-machine-word
+ (primitive-repeated-slot-as-raw
+ (the-buffer, primitive-repeated-slot-offset(the-buffer))))
+end function;
+
+define inline function sp-send (socket :: <integer>, data :: <buffer>, flags :: <integer>) => (res :: <integer>)
+ %sp-send(socket, buffer-offset(data, 0), data.size, flags)
+end;
+
+define inline function sp-recv (socket :: <integer>, data :: <buffer>, flags :: <integer>) => (res :: <integer>)
+ %sp-recv(socket, buffer-offset(data, 0), data.size, flags);
+end;
+
+define inline method sp-setsockopt (socket :: <integer>, level :: <integer>, option :: <integer>, value :: <integer>)
+ with-stack-structure (int :: <C-int*>)
+ pointer-value(int) := value;
+ let setsockopt-result =
+ %sp-setsockopt(socket, level, option, int, size-of(<C-int*>));
+ if (setsockopt-result < 0)
+ // Check error!
+ end;
+ setsockopt-result
+ end;
+end;
+
+define inline method sp-setsockopt (socket :: <integer>, level :: <integer>, option :: <integer>, data :: <byte-string>)
+ let setsockopt-result =
+ %sp-setsockopt(socket, level, option, as(<c-string>, data), data.size);
+ if (setsockopt-result < 0)
+ // Check error!
+ end;
+ setsockopt-result
+end;
diff --git a/tests/examplefiles/robotframework.txt b/tests/examplefiles/robotframework.txt
new file mode 100644
index 00000000..63ba63e6
--- /dev/null
+++ b/tests/examplefiles/robotframework.txt
@@ -0,0 +1,39 @@
+*** Settings ***
+Documentation Simple example demonstrating syntax highlighting.
+Library ExampleLibrary
+Test Setup Keyword argument argument with ${VARIABLE}
+
+*** Variables ***
+${VARIABLE} Variable value
+@{LIST} List variable here
+
+*** Test Cases ***
+Keyword-driven example
+ Initialize System
+ Do Something
+ Result Should Be 42
+ [Teardown] Cleanup System
+
+Data-driven example
+ [Template] Keyword
+ argument1 argument2
+ argument ${VARIABLE}
+ @{LIST}
+
+Gherkin
+ Given system is initialized
+ When something is done
+ Then result should be "42"
+
+| Pipes |
+| | [Documentation] | Also pipe separated format is supported. |
+| | Log | As this example demonstrates. |
+
+*** Keywords ***
+Result Should Be
+ [Arguments] ${expected}
+ ${actual} = Get Value
+ Should be Equal ${actual} ${expected}
+
+Then result should be "${expected}"
+ Result Should Be ${expected}
diff --git a/tests/examplefiles/rust_example.rs b/tests/examplefiles/rust_example.rs
index af791fbc..1c0a70c3 100644
--- a/tests/examplefiles/rust_example.rs
+++ b/tests/examplefiles/rust_example.rs
@@ -1,743 +1,233 @@
-
-#[doc = "json serialization"];
-
-import result::{result, ok, err};
-import io;
-import io::{reader_util, writer_util};
-import map;
-import map::hashmap;
-
-export json;
-export error;
-export to_writer;
-export to_str;
-export from_reader;
-export from_str;
-export eq;
-
-export num;
-export string;
-export boolean;
-export list;
-export dict;
-export null;
-
-#[doc = "Represents a json value"]
-enum json {
- num(float),
- string(str),
- boolean(bool),
- list([json]),
- dict(map::hashmap<str,json>),
- null,
+// Copyright 2012 The Rust Project Developers. See the COPYRIGHT
+// file at the top-level directory of this distribution and at
+// http://rust-lang.org/COPYRIGHT.
+//
+// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
+// http://www.apache.org/licenses/LICENSE-2.0> or the MIT license
+// <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
+// option. This file may not be copied, modified, or distributed
+// except according to those terms.
+
+// based on:
+// http://shootout.alioth.debian.org/u32/benchmark.php?test=nbody&lang=java
+
+extern mod std;
+
+use core::os;
+
+// Using sqrt from the standard library is way slower than using libc
+// directly even though std just calls libc, I guess it must be
+// because the the indirection through another dynamic linker
+// stub. Kind of shocking. Might be able to make it faster still with
+// an llvm intrinsic.
+#[nolink]
+extern mod libc {
+ #[legacy_exports];
+ fn sqrt(n: float) -> float;
}
-type error = {
- line: uint,
- col: uint,
- msg: str,
-};
-
-#[doc = "Serializes a json value into a io::writer"]
-fn to_writer(wr: io::writer, j: json) {
- alt j {
- num(n) { wr.write_str(float::to_str(n, 6u)); }
- string(s) {
- wr.write_char('"');
- let mut escaped = "";
- str::chars_iter(s) { |c|
- alt c {
- '"' { escaped += "\\\""; }
- '\\' { escaped += "\\\\"; }
- '\x08' { escaped += "\\b"; }
- '\x0c' { escaped += "\\f"; }
- '\n' { escaped += "\\n"; }
- '\r' { escaped += "\\r"; }
- '\t' { escaped += "\\t"; }
- _ { escaped += str::from_char(c); }
- }
- };
- wr.write_str(escaped);
- wr.write_char('"');
- }
- boolean(b) {
- wr.write_str(if b { "true" } else { "false" });
- }
- list(v) {
- wr.write_char('[');
- let mut first = true;
- vec::iter(v) { |item|
- if !first {
- wr.write_str(", ");
- }
- first = false;
- to_writer(wr, item);
- };
- wr.write_char(']');
- }
- dict(d) {
- if d.size() == 0u {
- wr.write_str("{}");
- ret;
- }
-
- wr.write_str("{ ");
- let mut first = true;
- d.items { |key, value|
- if !first {
- wr.write_str(", ");
- }
- first = false;
- to_writer(wr, string(key));
- wr.write_str(": ");
- to_writer(wr, value);
- };
- wr.write_str(" }");
- }
- null {
- wr.write_str("null");
- }
- }
+fn main() {
+ let args = os::args();
+ let args = if os::getenv(~"RUST_BENCH").is_some() {
+ ~[~"", ~"4000000"]
+ } else if args.len() <= 1u {
+ ~[~"", ~"100000"]
+ } else {
+ args
+ };
+ let n = int::from_str(args[1]).get();
+ let mut bodies: ~[Body::props] = NBodySystem::make();
+ io::println(fmt!("%f", NBodySystem::energy(bodies)));
+ let mut i = 0;
+ while i < n {
+ NBodySystem::advance(bodies, 0.01);
+ i += 1;
+ }
+ io::println(fmt!("%f", NBodySystem::energy(bodies)));
}
-#[doc = "Serializes a json value into a string"]
-fn to_str(j: json) -> str {
- io::with_str_writer { |wr| to_writer(wr, j) }
-}
+mod NBodySystem {
+ use Body;
-type parser = {
- rdr: io::reader,
- mut ch: char,
- mut line: uint,
- mut col: uint,
-};
+ pub fn make() -> ~[Body::props] {
+ let mut bodies: ~[Body::props] =
+ ~[Body::sun(),
+ Body::jupiter(),
+ Body::saturn(),
+ Body::uranus(),
+ Body::neptune()];
-impl parser for parser {
- fn eof() -> bool { self.ch == -1 as char }
+ let mut px = 0.0;
+ let mut py = 0.0;
+ let mut pz = 0.0;
- fn bump() {
- self.ch = self.rdr.read_char();
+ let mut i = 0;
+ while i < 5 {
+ px += bodies[i].vx * bodies[i].mass;
+ py += bodies[i].vy * bodies[i].mass;
+ pz += bodies[i].vz * bodies[i].mass;
- if self.ch == '\n' {
- self.line += 1u;
- self.col = 1u;
- } else {
- self.col += 1u;
+ i += 1;
}
- }
- fn next_char() -> char {
- self.bump();
- self.ch
- }
+ // side-effecting
+ Body::offset_momentum(&mut bodies[0], px, py, pz);
- fn error<T>(msg: str) -> result<T, error> {
- err({ line: self.line, col: self.col, msg: msg })
+ return bodies;
}
- fn parse() -> result<json, error> {
- alt self.parse_value() {
- ok(value) {
- // Skip trailing whitespaces.
- self.parse_whitespace();
- // Make sure there is no trailing characters.
- if self.eof() {
- ok(value)
- } else {
- self.error("trailing characters")
+ pub fn advance(bodies: &mut [Body::props], dt: float) {
+ let mut i = 0;
+ while i < 5 {
+ let mut j = i + 1;
+ while j < 5 {
+ advance_one(&mut bodies[i],
+ &mut bodies[j], dt);
+ j += 1;
}
- }
- e { e }
- }
- }
- fn parse_value() -> result<json, error> {
- self.parse_whitespace();
-
- if self.eof() { ret self.error("EOF while parsing value"); }
-
- alt self.ch {
- 'n' { self.parse_ident("ull", null) }
- 't' { self.parse_ident("rue", boolean(true)) }
- 'f' { self.parse_ident("alse", boolean(false)) }
- '0' to '9' | '-' { self.parse_number() }
- '"' {
- alt self.parse_str() {
- ok(s) { ok(string(s)) }
- err(e) { err(e) }
- }
- }
- '[' { self.parse_list() }
- '{' { self.parse_object() }
- _ { self.error("invalid syntax") }
+ i += 1;
}
- }
- fn parse_whitespace() {
- while char::is_whitespace(self.ch) { self.bump(); }
- }
-
- fn parse_ident(ident: str, value: json) -> result<json, error> {
- if str::all(ident, { |c| c == self.next_char() }) {
- self.bump();
- ok(value)
- } else {
- self.error("invalid syntax")
+ i = 0;
+ while i < 5 {
+ move_(&mut bodies[i], dt);
+ i += 1;
}
}
- fn parse_number() -> result<json, error> {
- let mut neg = 1f;
+ pub fn advance_one(bi: &mut Body::props,
+ bj: &mut Body::props,
+ dt: float) unsafe {
+ let dx = bi.x - bj.x;
+ let dy = bi.y - bj.y;
+ let dz = bi.z - bj.z;
- if self.ch == '-' {
- self.bump();
- neg = -1f;
- }
+ let dSquared = dx * dx + dy * dy + dz * dz;
- let mut res = alt self.parse_integer() {
- ok(res) { res }
- err(e) { ret err(e); }
- };
+ let distance = ::libc::sqrt(dSquared);
+ let mag = dt / (dSquared * distance);
- if self.ch == '.' {
- alt self.parse_decimal(res) {
- ok(r) { res = r; }
- err(e) { ret err(e); }
- }
- }
-
- if self.ch == 'e' || self.ch == 'E' {
- alt self.parse_exponent(res) {
- ok(r) { res = r; }
- err(e) { ret err(e); }
- }
- }
+ bi.vx -= dx * bj.mass * mag;
+ bi.vy -= dy * bj.mass * mag;
+ bi.vz -= dz * bj.mass * mag;
- ok(num(neg * res))
+ bj.vx += dx * bi.mass * mag;
+ bj.vy += dy * bi.mass * mag;
+ bj.vz += dz * bi.mass * mag;
}
- fn parse_integer() -> result<float, error> {
- let mut res = 0f;
-
- alt self.ch {
- '0' {
- self.bump();
-
- // There can be only one leading '0'.
- alt self.ch {
- '0' to '9' { ret self.error("invalid number"); }
- _ {}
- }
- }
- '1' to '9' {
- while !self.eof() {
- alt self.ch {
- '0' to '9' {
- res *= 10f;
- res += ((self.ch as int) - ('0' as int)) as float;
-
- self.bump();
- }
- _ { break; }
- }
- }
- }
- _ { ret self.error("invalid number"); }
- }
-
- ok(res)
+ pub fn move_(b: &mut Body::props, dt: float) {
+ b.x += dt * b.vx;
+ b.y += dt * b.vy;
+ b.z += dt * b.vz;
}
- fn parse_decimal(res: float) -> result<float, error> {
- self.bump();
+ pub fn energy(bodies: &[Body::props]) -> float unsafe {
+ let mut dx;
+ let mut dy;
+ let mut dz;
+ let mut distance;
+ let mut e = 0.0;
- // Make sure a digit follows the decimal place.
- alt self.ch {
- '0' to '9' {}
- _ { ret self.error("invalid number"); }
- }
-
- let mut res = res;
- let mut dec = 1f;
- while !self.eof() {
- alt self.ch {
- '0' to '9' {
- dec /= 10f;
- res += (((self.ch as int) - ('0' as int)) as float) * dec;
-
- self.bump();
- }
- _ { break; }
- }
- }
-
- ok(res)
- }
-
- fn parse_exponent(res: float) -> result<float, error> {
- self.bump();
+ let mut i = 0;
+ while i < 5 {
+ e +=
+ 0.5 * bodies[i].mass *
+ (bodies[i].vx * bodies[i].vx + bodies[i].vy * bodies[i].vy
+ + bodies[i].vz * bodies[i].vz);
- let mut res = res;
- let mut exp = 0u;
- let mut neg_exp = false;
+ let mut j = i + 1;
+ while j < 5 {
+ dx = bodies[i].x - bodies[j].x;
+ dy = bodies[i].y - bodies[j].y;
+ dz = bodies[i].z - bodies[j].z;
- alt self.ch {
- '+' { self.bump(); }
- '-' { self.bump(); neg_exp = true; }
- _ {}
- }
-
- // Make sure a digit follows the exponent place.
- alt self.ch {
- '0' to '9' {}
- _ { ret self.error("invalid number"); }
- }
-
- while !self.eof() {
- alt self.ch {
- '0' to '9' {
- exp *= 10u;
- exp += (self.ch as uint) - ('0' as uint);
+ distance = ::libc::sqrt(dx * dx + dy * dy + dz * dz);
+ e -= bodies[i].mass * bodies[j].mass / distance;
- self.bump();
- }
- _ { break; }
+ j += 1;
}
- }
- let exp = float::pow_with_uint(10u, exp);
- if neg_exp {
- res /= exp;
- } else {
- res *= exp;
+ i += 1;
}
+ return e;
- ok(res)
- }
-
- fn parse_str() -> result<str, error> {
- let mut escape = false;
- let mut res = "";
-
- while !self.eof() {
- self.bump();
-
- if (escape) {
- alt self.ch {
- '"' { str::push_char(res, '"'); }
- '\\' { str::push_char(res, '\\'); }
- '/' { str::push_char(res, '/'); }
- 'b' { str::push_char(res, '\x08'); }
- 'f' { str::push_char(res, '\x0c'); }
- 'n' { str::push_char(res, '\n'); }
- 'r' { str::push_char(res, '\r'); }
- 't' { str::push_char(res, '\t'); }
- 'u' {
- // Parse \u1234.
- let mut i = 0u;
- let mut n = 0u;
- while i < 4u {
- alt self.next_char() {
- '0' to '9' {
- n = n * 10u +
- (self.ch as uint) - ('0' as uint);
- }
- _ { ret self.error("invalid \\u escape"); }
- }
- i += 1u;
- }
-
- // Error out if we didn't parse 4 digits.
- if i != 4u {
- ret self.error("invalid \\u escape");
- }
-
- str::push_char(res, n as char);
- }
- _ { ret self.error("invalid escape"); }
- }
- escape = false;
- } else if self.ch == '\\' {
- escape = true;
- } else {
- if self.ch == '"' {
- self.bump();
- ret ok(res);
- }
- str::push_char(res, self.ch);
- }
- }
-
- self.error("EOF while parsing string")
- }
-
- fn parse_list() -> result<json, error> {
- self.bump();
- self.parse_whitespace();
-
- let mut values = [];
-
- if self.ch == ']' {
- self.bump();
- ret ok(list(values));
- }
-
- loop {
- alt self.parse_value() {
- ok(v) { vec::push(values, v); }
- e { ret e; }
- }
-
- self.parse_whitespace();
- if self.eof() {
- ret self.error("EOF while parsing list");
- }
-
- alt self.ch {
- ',' { self.bump(); }
- ']' { self.bump(); ret ok(list(values)); }
- _ { ret self.error("expecting ',' or ']'"); }
- }
- };
- }
-
- fn parse_object() -> result<json, error> {
- self.bump();
- self.parse_whitespace();
-
- let values = map::str_hash();
-
- if self.ch == '}' {
- self.bump();
- ret ok(dict(values));
- }
-
- while !self.eof() {
- self.parse_whitespace();
-
- if self.ch != '"' {
- ret self.error("key must be a string");
- }
-
- let key = alt self.parse_str() {
- ok(key) { key }
- err(e) { ret err(e); }
- };
-
- self.parse_whitespace();
-
- if self.ch != ':' {
- if self.eof() { break; }
- ret self.error("expecting ':'");
- }
- self.bump();
-
- alt self.parse_value() {
- ok(value) { values.insert(key, value); }
- e { ret e; }
- }
- self.parse_whitespace();
-
- alt self.ch {
- ',' { self.bump(); }
- '}' { self.bump(); ret ok(dict(values)); }
- _ {
- if self.eof() { break; }
- ret self.error("expecting ',' or '}'");
- }
- }
- }
-
- ret self.error("EOF while parsing object");
- }
-}
-
-#[doc = "Deserializes a json value from an io::reader"]
-fn from_reader(rdr: io::reader) -> result<json, error> {
- let parser = {
- rdr: rdr,
- mut ch: rdr.read_char(),
- mut line: 1u,
- mut col: 1u,
- };
-
- parser.parse()
-}
-
-#[doc = "Deserializes a json value from a string"]
-fn from_str(s: str) -> result<json, error> {
- io::with_str_reader(s, from_reader)
-}
-
-#[doc = "Test if two json values are equal"]
-fn eq(value0: json, value1: json) -> bool {
- alt (value0, value1) {
- (num(f0), num(f1)) { f0 == f1 }
- (string(s0), string(s1)) { s0 == s1 }
- (boolean(b0), boolean(b1)) { b0 == b1 }
- (list(l0), list(l1)) { vec::all2(l0, l1, eq) }
- (dict(d0), dict(d1)) {
- if d0.size() == d1.size() {
- let mut equal = true;
- d0.items { |k, v0|
- alt d1.find(k) {
- some(v1) {
- if !eq(v0, v1) { equal = false; } }
- none { equal = false; }
- }
- };
- equal
- } else {
- false
- }
- }
- (null, null) { true }
- _ { false }
}
}
-#[cfg(test)]
-mod tests {
- fn mk_dict(items: [(str, json)]) -> json {
- let d = map::str_hash();
-
- vec::iter(items) { |item|
- let (key, value) = item;
- d.insert(key, value);
- };
-
- dict(d)
- }
-
- #[test]
- fn test_write_null() {
- assert to_str(null) == "null";
- }
-
- #[test]
- fn test_write_num() {
- assert to_str(num(3f)) == "3";
- assert to_str(num(3.1f)) == "3.1";
- assert to_str(num(-1.5f)) == "-1.5";
- assert to_str(num(0.5f)) == "0.5";
- }
-
- #[test]
- fn test_write_str() {
- assert to_str(string("")) == "\"\"";
- assert to_str(string("foo")) == "\"foo\"";
- }
-
- #[test]
- fn test_write_bool() {
- assert to_str(boolean(true)) == "true";
- assert to_str(boolean(false)) == "false";
- }
-
- #[test]
- fn test_write_list() {
- assert to_str(list([])) == "[]";
- assert to_str(list([boolean(true)])) == "[true]";
- assert to_str(list([
- boolean(false),
- null,
- list([string("foo\nbar"), num(3.5f)])
- ])) == "[false, null, [\"foo\\nbar\", 3.5]]";
- }
-
- #[test]
- fn test_write_dict() {
- assert to_str(mk_dict([])) == "{}";
- assert to_str(mk_dict([("a", boolean(true))])) == "{ \"a\": true }";
- assert to_str(mk_dict([
- ("a", boolean(true)),
- ("b", list([
- mk_dict([("c", string("\x0c\r"))]),
- mk_dict([("d", string(""))])
- ]))
- ])) ==
- "{ " +
- "\"a\": true, " +
- "\"b\": [" +
- "{ \"c\": \"\\f\\r\" }, " +
- "{ \"d\": \"\" }" +
- "]" +
- " }";
+mod Body {
+ use Body;
+
+ pub const PI: float = 3.141592653589793;
+ pub const SOLAR_MASS: float = 39.478417604357432;
+ // was 4 * PI * PI originally
+ pub const DAYS_PER_YEAR: float = 365.24;
+
+ pub type props =
+ {mut x: float,
+ mut y: float,
+ mut z: float,
+ mut vx: float,
+ mut vy: float,
+ mut vz: float,
+ mass: float};
+
+ pub fn jupiter() -> Body::props {
+ return {mut x: 4.84143144246472090e+00,
+ mut y: -1.16032004402742839e+00,
+ mut z: -1.03622044471123109e-01,
+ mut vx: 1.66007664274403694e-03 * DAYS_PER_YEAR,
+ mut vy: 7.69901118419740425e-03 * DAYS_PER_YEAR,
+ mut vz: -6.90460016972063023e-05 * DAYS_PER_YEAR,
+ mass: 9.54791938424326609e-04 * SOLAR_MASS};
+ }
+
+ pub fn saturn() -> Body::props {
+ return {mut x: 8.34336671824457987e+00,
+ mut y: 4.12479856412430479e+00,
+ mut z: -4.03523417114321381e-01,
+ mut vx: -2.76742510726862411e-03 * DAYS_PER_YEAR,
+ mut vy: 4.99852801234917238e-03 * DAYS_PER_YEAR,
+ mut vz: 2.30417297573763929e-05 * DAYS_PER_YEAR,
+ mass: 2.85885980666130812e-04 * SOLAR_MASS};
+ }
+
+ pub fn uranus() -> Body::props {
+ return {mut x: 1.28943695621391310e+01,
+ mut y: -1.51111514016986312e+01,
+ mut z: -2.23307578892655734e-01,
+ mut vx: 2.96460137564761618e-03 * DAYS_PER_YEAR,
+ mut vy: 2.37847173959480950e-03 * DAYS_PER_YEAR,
+ mut vz: -2.96589568540237556e-05 * DAYS_PER_YEAR,
+ mass: 4.36624404335156298e-05 * SOLAR_MASS};
+ }
+
+ pub fn neptune() -> Body::props {
+ return {mut x: 1.53796971148509165e+01,
+ mut y: -2.59193146099879641e+01,
+ mut z: 1.79258772950371181e-01,
+ mut vx: 2.68067772490389322e-03 * DAYS_PER_YEAR,
+ mut vy: 1.62824170038242295e-03 * DAYS_PER_YEAR,
+ mut vz: -9.51592254519715870e-05 * DAYS_PER_YEAR,
+ mass: 5.15138902046611451e-05 * SOLAR_MASS};
+ }
+
+ pub fn sun() -> Body::props {
+ return {mut x: 0.0,
+ mut y: 0.0,
+ mut z: 0.0,
+ mut vx: 0.0,
+ mut vy: 0.0,
+ mut vz: 0.0,
+ mass: SOLAR_MASS};
+ }
+
+ pub fn offset_momentum(props: &mut Body::props,
+ px: float, py: float, pz: float) {
+ props.vx = -px / SOLAR_MASS;
+ props.vy = -py / SOLAR_MASS;
+ props.vz = -pz / SOLAR_MASS;
}
- #[test]
- fn test_trailing_characters() {
- assert from_str("nulla") ==
- err({line: 1u, col: 5u, msg: "trailing characters"});
- assert from_str("truea") ==
- err({line: 1u, col: 5u, msg: "trailing characters"});
- assert from_str("falsea") ==
- err({line: 1u, col: 6u, msg: "trailing characters"});
- assert from_str("1a") ==
- err({line: 1u, col: 2u, msg: "trailing characters"});
- assert from_str("[]a") ==
- err({line: 1u, col: 3u, msg: "trailing characters"});
- assert from_str("{}a") ==
- err({line: 1u, col: 3u, msg: "trailing characters"});
- }
-
- #[test]
- fn test_read_identifiers() {
- assert from_str("n") ==
- err({line: 1u, col: 2u, msg: "invalid syntax"});
- assert from_str("nul") ==
- err({line: 1u, col: 4u, msg: "invalid syntax"});
-
- assert from_str("t") ==
- err({line: 1u, col: 2u, msg: "invalid syntax"});
- assert from_str("truz") ==
- err({line: 1u, col: 4u, msg: "invalid syntax"});
-
- assert from_str("f") ==
- err({line: 1u, col: 2u, msg: "invalid syntax"});
- assert from_str("faz") ==
- err({line: 1u, col: 3u, msg: "invalid syntax"});
-
- assert from_str("null") == ok(null);
- assert from_str("true") == ok(boolean(true));
- assert from_str("false") == ok(boolean(false));
- assert from_str(" null ") == ok(null);
- assert from_str(" true ") == ok(boolean(true));
- assert from_str(" false ") == ok(boolean(false));
- }
-
- #[test]
- fn test_read_num() {
- assert from_str("+") ==
- err({line: 1u, col: 1u, msg: "invalid syntax"});
- assert from_str(".") ==
- err({line: 1u, col: 1u, msg: "invalid syntax"});
-
- assert from_str("-") ==
- err({line: 1u, col: 2u, msg: "invalid number"});
- assert from_str("00") ==
- err({line: 1u, col: 2u, msg: "invalid number"});
- assert from_str("1.") ==
- err({line: 1u, col: 3u, msg: "invalid number"});
- assert from_str("1e") ==
- err({line: 1u, col: 3u, msg: "invalid number"});
- assert from_str("1e+") ==
- err({line: 1u, col: 4u, msg: "invalid number"});
-
- assert from_str("3") == ok(num(3f));
- assert from_str("3.1") == ok(num(3.1f));
- assert from_str("-1.2") == ok(num(-1.2f));
- assert from_str("0.4") == ok(num(0.4f));
- assert from_str("0.4e5") == ok(num(0.4e5f));
- assert from_str("0.4e+15") == ok(num(0.4e15f));
- assert from_str("0.4e-01") == ok(num(0.4e-01f));
- assert from_str(" 3 ") == ok(num(3f));
- }
-
- #[test]
- fn test_read_str() {
- assert from_str("\"") ==
- err({line: 1u, col: 2u, msg: "EOF while parsing string"});
- assert from_str("\"lol") ==
- err({line: 1u, col: 5u, msg: "EOF while parsing string"});
-
- assert from_str("\"\"") == ok(string(""));
- assert from_str("\"foo\"") == ok(string("foo"));
- assert from_str("\"\\\"\"") == ok(string("\""));
- assert from_str("\"\\b\"") == ok(string("\x08"));
- assert from_str("\"\\n\"") == ok(string("\n"));
- assert from_str("\"\\r\"") == ok(string("\r"));
- assert from_str("\"\\t\"") == ok(string("\t"));
- assert from_str(" \"foo\" ") == ok(string("foo"));
- }
-
- #[test]
- fn test_read_list() {
- assert from_str("[") ==
- err({line: 1u, col: 2u, msg: "EOF while parsing value"});
- assert from_str("[1") ==
- err({line: 1u, col: 3u, msg: "EOF while parsing list"});
- assert from_str("[1,") ==
- err({line: 1u, col: 4u, msg: "EOF while parsing value"});
- assert from_str("[1,]") ==
- err({line: 1u, col: 4u, msg: "invalid syntax"});
- assert from_str("[6 7]") ==
- err({line: 1u, col: 4u, msg: "expecting ',' or ']'"});
-
- assert from_str("[]") == ok(list([]));
- assert from_str("[ ]") == ok(list([]));
- assert from_str("[true]") == ok(list([boolean(true)]));
- assert from_str("[ false ]") == ok(list([boolean(false)]));
- assert from_str("[null]") == ok(list([null]));
- assert from_str("[3, 1]") == ok(list([num(3f), num(1f)]));
- assert from_str("\n[3, 2]\n") == ok(list([num(3f), num(2f)]));
- assert from_str("[2, [4, 1]]") ==
- ok(list([num(2f), list([num(4f), num(1f)])]));
- }
-
- #[test]
- fn test_read_dict() {
- assert from_str("{") ==
- err({line: 1u, col: 2u, msg: "EOF while parsing object"});
- assert from_str("{ ") ==
- err({line: 1u, col: 3u, msg: "EOF while parsing object"});
- assert from_str("{1") ==
- err({line: 1u, col: 2u, msg: "key must be a string"});
- assert from_str("{ \"a\"") ==
- err({line: 1u, col: 6u, msg: "EOF while parsing object"});
- assert from_str("{\"a\"") ==
- err({line: 1u, col: 5u, msg: "EOF while parsing object"});
- assert from_str("{\"a\" ") ==
- err({line: 1u, col: 6u, msg: "EOF while parsing object"});
-
- assert from_str("{\"a\" 1") ==
- err({line: 1u, col: 6u, msg: "expecting ':'"});
- assert from_str("{\"a\":") ==
- err({line: 1u, col: 6u, msg: "EOF while parsing value"});
- assert from_str("{\"a\":1") ==
- err({line: 1u, col: 7u, msg: "EOF while parsing object"});
- assert from_str("{\"a\":1 1") ==
- err({line: 1u, col: 8u, msg: "expecting ',' or '}'"});
- assert from_str("{\"a\":1,") ==
- err({line: 1u, col: 8u, msg: "EOF while parsing object"});
-
- assert eq(result::get(from_str("{}")), mk_dict([]));
- assert eq(result::get(from_str("{\"a\": 3}")),
- mk_dict([("a", num(3.0f))]));
-
- assert eq(result::get(from_str("{ \"a\": null, \"b\" : true }")),
- mk_dict([("a", null), ("b", boolean(true))]));
- assert eq(result::get(from_str("\n{ \"a\": null, \"b\" : true }\n")),
- mk_dict([("a", null), ("b", boolean(true))]));
- assert eq(result::get(from_str("{\"a\" : 1.0 ,\"b\": [ true ]}")),
- mk_dict([
- ("a", num(1.0)),
- ("b", list([boolean(true)]))
- ]));
- assert eq(result::get(from_str(
- "{" +
- "\"a\": 1.0, " +
- "\"b\": [" +
- "true," +
- "\"foo\\nbar\", " +
- "{ \"c\": {\"d\": null} } " +
- "]" +
- "}")),
- mk_dict([
- ("a", num(1.0f)),
- ("b", list([
- boolean(true),
- string("foo\nbar"),
- mk_dict([
- ("c", mk_dict([("d", null)]))
- ])
- ]))
- ]));
- }
-
- #[test]
- fn test_multiline_errors() {
- assert from_str("{\n \"foo\":\n \"bar\"") ==
- err({line: 3u, col: 8u, msg: "EOF while parsing object"});
- }
}
diff --git a/tests/examplefiles/test2.pypylog b/tests/examplefiles/test2.pypylog
new file mode 100644
index 00000000..543e21dd
--- /dev/null
+++ b/tests/examplefiles/test2.pypylog
@@ -0,0 +1,120 @@
+[2f1dd6c3b8b7] {jit-log-opt-loop
+# Loop 0 (<Function object at 0xb720e550> ds1dr4 dsdr3 ds1dr4) : loop with 115 ops
+[p0, p1]
++33: label(p0, p1, descr=TargetToken(-1223434224))
+debug_merge_point(0, 0, '<Function object at 0xb710b120> ds1dr4 dsdr3 ds1dr4')
++33: guard_nonnull_class(p1, 138371488, descr=<Guard2>) [p1, p0]
++54: p3 = getfield_gc_pure(p1, descr=<FieldP pyhaskell.interpreter.haskell.Substitution.inst_rhs 8>)
++57: guard_value(p3, ConstPtr(ptr4), descr=<Guard3>) [p1, p0, p3]
++69: p5 = getfield_gc_pure(p1, descr=<FieldP pyhaskell.interpreter.haskell.Substitution.inst_subst 12>)
++72: p7 = getarrayitem_gc(p5, 0, descr=<ArrayP 4>)
++75: guard_class(p7, 138371552, descr=<Guard4>) [p0, p5, p7]
++88: p9 = getfield_gc(p7, descr=<FieldP pyhaskell.interpreter.haskell.Thunk.inst_application 8>)
++91: guard_nonnull_class(p9, 138373024, descr=<Guard5>) [p0, p5, p7, p9]
++109: p12 = getarrayitem_gc(p5, 1, descr=<ArrayP 4>)
++112: guard_class(p12, 138371552, descr=<Guard6>) [p0, p5, p12, p7]
++125: p14 = getfield_gc(p12, descr=<FieldP pyhaskell.interpreter.haskell.Thunk.inst_application 8>)
++128: guard_nonnull_class(p14, 138373024, descr=<Guard7>) [p0, p5, p12, p14, p7]
+debug_merge_point(0, 0, 'None')
+debug_merge_point(0, 0, 'None')
++146: p16 = getfield_gc_pure(p9, descr=<FieldP pyhaskell.interpreter.haskell.Application.inst_function 8>)
++149: guard_value(p16, ConstPtr(ptr17), descr=<Guard8>) [p16, p9, p0, p12, p7]
++161: p18 = getfield_gc_pure(p9, descr=<FieldP pyhaskell.interpreter.haskell.Application3.inst_arg0 12>)
++164: guard_class(p18, 138371648, descr=<Guard9>) [p18, p9, p0, p12, p7]
++177: p20 = getfield_gc_pure(p9, descr=<FieldP pyhaskell.interpreter.haskell.Application3.inst_arg1 16>)
++180: guard_class(p20, 138371648, descr=<Guard10>) [p20, p9, p18, p0, p12, p7]
++193: p22 = getfield_gc_pure(p9, descr=<FieldP pyhaskell.interpreter.haskell.Application3.inst_arg2 20>)
++196: guard_class(p22, 138371936, descr=<Guard11>) [p22, p9, p20, p18, p0, p12, p7]
+debug_merge_point(0, 0, 'None')
++209: p24 = getfield_gc_pure(p22, descr=<FieldP pyhaskell.interpreter.haskell.Application.inst_function 8>)
++215: guard_value(p24, ConstPtr(ptr25), descr=<Guard12>) [p24, p22, p9, None, None, p0, p12, p7]
++227: p27 = getfield_gc_pure(p22, descr=<FieldP pyhaskell.interpreter.haskell.Application1.inst_arg0 12>)
++230: guard_class(p27, 138371648, descr=<Guard13>) [p22, p27, p9, None, None, p0, p12, p7]
+debug_merge_point(0, 0, '_')
+debug_merge_point(0, 0, 'None')
++243: p30 = getfield_gc(ConstPtr(ptr29), descr=<FieldP pyhaskell.interpreter.module.CoreMod.inst_qvars 24>)
++249: i34 = call(ConstClass(ll_dict_lookup_trampoline__v64___simple_call__function_ll), p30, ConstPtr(ptr32), 360200661, descr=<Calli 4 rri EF=4>)
++281: guard_no_exception(, descr=<Guard14>) [p27, p20, p18, i34, p30, None, None, None, p0, p12, p7]
++294: i36 = int_and(i34, -2147483648)
++302: i37 = int_is_true(i36)
+guard_false(i37, descr=<Guard15>) [p27, p20, p18, i34, p30, None, None, None, p0, p12, p7]
++311: p38 = getfield_gc(p30, descr=<FieldP dicttable.entries 12>)
++314: p39 = getinteriorfield_gc(p38, i34, descr=<InteriorFieldDescr <FieldP dictentry.value 4>>)
++318: i40 = instance_ptr_eq(p18, p39)
+guard_true(i40, descr=<Guard16>) [p27, p20, None, None, None, p0, p12, p7]
+debug_merge_point(0, 0, 'None')
++327: i41 = getfield_gc_pure(p20, descr=<FieldS pyhaskell.interpreter.primtype.Int.inst_value 8>)
++330: i42 = getfield_gc_pure(p27, descr=<FieldS pyhaskell.interpreter.primtype.Int.inst_value 8>)
++333: i43 = int_sub(i41, i42)
+debug_merge_point(0, 0, 'None')
+debug_merge_point(0, 0, 'None')
+debug_merge_point(0, 0, 'None')
++335: i45 = int_eq(0, i43)
+guard_false(i45, descr=<Guard17>) [p0, i43, None, None, None, None, p12, p7]
+p47 = new_with_vtable(138371648)
++393: setfield_gc(p47, i43, descr=<FieldS pyhaskell.interpreter.primtype.Int.inst_value 8>)
+setfield_gc(p7, p47, descr=<FieldP pyhaskell.interpreter.haskell.Thunk.inst_application 8>)
++414: p48 = getfield_gc(p12, descr=<FieldP pyhaskell.interpreter.haskell.Thunk.inst_application 8>)
++420: guard_nonnull_class(p48, 138371648, descr=<Guard18>) [p0, p48, p12, p47, p7]
+debug_merge_point(0, 0, '<PrimFunction object at 0x83f3f6c> 1 <Function object at 0xb710b3b0> 1 <Function object at 0xb710b3c0> <PrimFunction object at 0x83f3f3c> 1 dsdr3 <Function object at 0xb710b210> 1')
+debug_merge_point(0, 0, 'None')
+debug_merge_point(0, 0, '_')
+debug_merge_point(0, 0, 'None')
+debug_merge_point(0, 0, 'None')
+debug_merge_point(0, 0, '<Function object at 0xb710b3d0> dsdr3 dsdr3')
+debug_merge_point(0, 0, '<Function object at 0xb710b120> ds1dr4 dsdr3 ds1dr4')
++438: label(p0, p48, p30, p38, descr=TargetToken(-1223434176))
+debug_merge_point(0, 0, '<Function object at 0xb710b120> ds1dr4 dsdr3 ds1dr4')
+debug_merge_point(0, 0, 'None')
+debug_merge_point(0, 0, 'None')
+debug_merge_point(0, 0, 'None')
+debug_merge_point(0, 0, '_')
+debug_merge_point(0, 0, 'None')
++438: i50 = call(ConstClass(ll_dict_lookup_trampoline__v64___simple_call__function_ll), p30, ConstPtr(ptr32), 360200661, descr=<Calli 4 rri EF=4>)
++464: guard_no_exception(, descr=<Guard19>) [p48, i50, p30, p0]
++477: i51 = int_and(i50, -2147483648)
++485: i52 = int_is_true(i51)
+guard_false(i52, descr=<Guard20>) [p48, i50, p30, p0]
++494: p53 = getinteriorfield_gc(p38, i50, descr=<InteriorFieldDescr <FieldP dictentry.value 4>>)
++501: i55 = instance_ptr_eq(ConstPtr(ptr54), p53)
+guard_true(i55, descr=<Guard21>) [p48, p0]
+debug_merge_point(0, 0, 'None')
++513: i56 = getfield_gc_pure(p48, descr=<FieldS pyhaskell.interpreter.primtype.Int.inst_value 8>)
++516: i58 = int_sub(i56, 1)
+debug_merge_point(0, 0, 'None')
+debug_merge_point(0, 0, 'None')
+debug_merge_point(0, 0, 'None')
++519: i59 = int_eq(0, i58)
+guard_false(i59, descr=<Guard22>) [i58, p48, p0]
+debug_merge_point(0, 0, '<PrimFunction object at 0x83f3f6c> 1 <Function object at 0xb710b3b0> 1 <Function object at 0xb710b3c0> <PrimFunction object at 0x83f3f3c> 1 dsdr3 <Function object at 0xb710b210> 1')
+debug_merge_point(0, 0, 'None')
+debug_merge_point(0, 0, '_')
+debug_merge_point(0, 0, 'None')
+debug_merge_point(0, 0, 'None')
+debug_merge_point(0, 0, '<Function object at 0xb710b3d0> dsdr3 dsdr3')
+debug_merge_point(0, 0, '<Function object at 0xb710b120> ds1dr4 dsdr3 ds1dr4')
+p61 = new_with_vtable(138371700)
+p63 = new_with_vtable(138373024)
+p65 = new_with_vtable(138371936)
++606: setfield_gc(p63, ConstPtr(ptr66), descr=<FieldP pyhaskell.interpreter.haskell.Application.inst_function 8>)
+p68 = new_with_vtable(138373024)
++632: setfield_gc(p65, ConstPtr(ptr69), descr=<FieldP pyhaskell.interpreter.haskell.Application.inst_function 8>)
+p71 = new_with_vtable(138371936)
++658: setfield_gc(p68, ConstPtr(ptr17), descr=<FieldP pyhaskell.interpreter.haskell.Application.inst_function 8>)
++665: setfield_gc(p71, ConstPtr(ptr72), descr=<FieldP pyhaskell.interpreter.haskell.Application1.inst_arg0 12>)
++672: setfield_gc(p68, p71, descr=<FieldP pyhaskell.interpreter.haskell.Application3.inst_arg2 20>)
++675: setfield_gc(p68, p48, descr=<FieldP pyhaskell.interpreter.haskell.Application3.inst_arg1 16>)
++678: setfield_gc(p68, ConstPtr(ptr54), descr=<FieldP pyhaskell.interpreter.haskell.Application3.inst_arg0 12>)
+p73 = new_with_vtable(138371648)
++701: setfield_gc(p61, p0, descr=<FieldP pyhaskell.interpreter.haskell.StackElement.inst_next 8>)
++716: setfield_gc(p61, 2, descr=<FieldS pyhaskell.interpreter.haskell.CopyStackElement.inst_index 16>)
++723: setfield_gc(p71, ConstPtr(ptr25), descr=<FieldP pyhaskell.interpreter.haskell.Application.inst_function 8>)
++730: setfield_gc(p65, p68, descr=<FieldP pyhaskell.interpreter.haskell.Application1.inst_arg0 12>)
++733: setfield_gc(p63, p65, descr=<FieldP pyhaskell.interpreter.haskell.Application3.inst_arg2 20>)
++736: setfield_gc(p63, ConstPtr(ptr75), descr=<FieldP pyhaskell.interpreter.haskell.Application3.inst_arg1 16>)
++743: setfield_gc(p63, ConstPtr(ptr54), descr=<FieldP pyhaskell.interpreter.haskell.Application3.inst_arg0 12>)
++750: setfield_gc(p61, p63, descr=<FieldP pyhaskell.interpreter.haskell.CopyStackElement.inst_application 12>)
++753: setfield_gc(p73, i58, descr=<FieldS pyhaskell.interpreter.primtype.Int.inst_value 8>)
++762: jump(p61, p73, p30, p38, descr=TargetToken(-1223434176))
++775: --end of the loop--
+[2f1dd6da3b99] jit-log-opt-loop}
diff --git a/tests/examplefiles/unix-io.lid b/tests/examplefiles/unix-io.lid
new file mode 100644
index 00000000..617fcaa4
--- /dev/null
+++ b/tests/examplefiles/unix-io.lid
@@ -0,0 +1,37 @@
+Library: io
+Synopsis: A portable IO library
+Author: Gail Zacharias
+Files: library
+ streams/defs
+ streams/stream
+ streams/sequence-stream
+ streams/native-buffer
+ streams/buffer
+ streams/typed-stream
+ streams/external-stream
+ streams/buffered-stream
+ streams/convenience
+ streams/wrapper-stream
+ streams/cleanup-streams
+ streams/native-speed
+ streams/async-writes
+ streams/file-stream
+ streams/multi-buffered-streams
+ pprint
+ print
+ print-double-integer-kludge
+ format
+ buffered-format
+ format-condition
+ unix-file-accessor
+ unix-standard-io
+ unix-interface
+ format-out
+C-Source-Files: unix-portability.c
+Major-Version: 2
+Minor-Version: 1
+Target-Type: dll
+Copyright: Original Code is Copyright (c) 1995-2004 Functional Objects, Inc.
+ All rights reserved.
+License: See License.txt in this distribution for details.
+Warranty: Distributed WITHOUT WARRANTY OF ANY KIND
diff --git a/tests/test_basic_api.py b/tests/test_basic_api.py
index bc4eb771..0e9d218d 100644
--- a/tests/test_basic_api.py
+++ b/tests/test_basic_api.py
@@ -93,7 +93,7 @@ def test_lexer_options():
'PythonConsoleLexer', 'RConsoleLexer', 'RubyConsoleLexer',
'SqliteConsoleLexer', 'MatlabSessionLexer', 'ErlangShellLexer',
'BashSessionLexer', 'LiterateHaskellLexer', 'PostgresConsoleLexer',
- 'ElixirConsoleLexer', 'JuliaConsoleLexer'):
+ 'ElixirConsoleLexer', 'JuliaConsoleLexer', 'RobotFrameworkLexer'):
inst = cls(ensurenl=False)
ensure(inst.get_tokens('a\nb'), 'a\nb')
inst = cls(ensurenl=False, stripall=True)
diff --git a/tests/test_examplefiles.py b/tests/test_examplefiles.py
index 41acf4ef..a938ebaa 100644
--- a/tests/test_examplefiles.py
+++ b/tests/test_examplefiles.py
@@ -58,6 +58,8 @@ def check_lexer(lx, absfn, outfn):
text = text.strip(b('\n')) + b('\n')
try:
text = text.decode('utf-8')
+ if text.startswith(u'\ufeff'):
+ text = text[len(u'\ufeff'):]
except UnicodeError:
text = text.decode('latin1')
ntext = []