author     David Beazley <dave@dabeaz.com>    2008-05-28 16:13:22 +0000
committer  David Beazley <dave@dabeaz.com>    2008-05-28 16:13:22 +0000
commit     d2151e3557667c0ea0cff5f41a5dc58b275f107f
tree       70b685901d26af63ede13b37cba8f9c23433926c
parent     241e3c771322a9d3014d2f0d48975f15bcba732a
download   ply-d2151e3557667c0ea0cff5f41a5dc58b275f107f.tar.gz
Bug fixes
-rw-r--r--  ANNOUNCE                  | 13
-rw-r--r--  README                    |  4
-rw-r--r--  doc/ply.html              | 27
-rw-r--r--  example/ansic/clex.py     |  4
-rw-r--r--  example/yply/ylex.py      |  4
-rw-r--r--  ply/lex.py                | 45
-rw-r--r--  ply/yacc.py               |  2
-rw-r--r--  test/lex_ignore.exp       |  3
-rw-r--r--  test/lex_re1.exp          |  3
-rw-r--r--  test/lex_re2.exp          |  3
-rw-r--r--  test/lex_re3.exp          |  3
-rw-r--r--  test/lex_state1.exp       |  3
-rw-r--r--  test/lex_state2.exp       |  3
-rw-r--r--  test/lex_state3.exp       |  3
-rw-r--r--  test/lex_state4.exp       |  3
-rw-r--r--  test/lex_state5.exp       |  3
-rw-r--r--  test/lex_state_norule.exp |  3
17 files changed, 58 insertions(+), 71 deletions(-)
--- a/ANNOUNCE
+++ b/ANNOUNCE
@@ -1,11 +1,11 @@
-May 17, 2008
+May 28, 2008
 
-                 Announcing :  PLY-2.4 (Python Lex-Yacc)
+                 Announcing :  PLY-2.5 (Python Lex-Yacc)
 
                         http://www.dabeaz.com/ply
 
 I'm pleased to announce a significant new update to PLY---a 100% Python
-implementation of the common parsing tools lex and yacc.  PLY-2.4 fixes
+implementation of the common parsing tools lex and yacc.  PLY-2.5 fixes
 some bugs in error handling and provides some performance improvements.
 
 If you are new to PLY, here are a few highlights:
@@ -29,13 +29,6 @@ If you are new to PLY, here are a few highlights:
   problems.  Currently, PLY can build its parsing tables using either
   SLR or LALR(1) algorithms.
 
-- PLY can be used to build parsers for large programming languages.
-  Although it is not ultra-fast due to its Python implementation,
-  PLY can be used to parse grammars consisting of several hundred
-  rules (as might be found for a language like C).  The lexer and LR
-  parser are also reasonably efficient when parsing normal
-  sized programs.
-
 More information about PLY can be obtained on the PLY webpage at:
 
                         http://www.dabeaz.com/ply
--- a/README
+++ b/README
@@ -1,8 +1,8 @@
-PLY (Python Lex-Yacc)                   Version 2.4  (May, 2008)
+PLY (Python Lex-Yacc)                   Version 2.5  (May 28, 2008)
 
 David M. Beazley (dave@dabeaz.com)
 
-Copyright (C) 2001-2007   David M. Beazley
+Copyright (C) 2001-2008   David M. Beazley
 
 This library is free software; you can redistribute it and/or modify
 it under the terms of the GNU Lesser General Public
diff --git a/doc/ply.html b/doc/ply.html
index 43af3d7..13a2631 100644
--- a/doc/ply.html
+++ b/doc/ply.html
@@ -12,7 +12,7 @@
 dave@dabeaz.com<br>
 </b>
 <p>
-<b>PLY Version: 2.4</b>
+<b>PLY Version: 2.5</b>
 <p>
 
 <!-- INDEX -->
@@ -472,7 +472,8 @@ def t_ID(t):
 </blockquote>
 
 It is important to note that storing data in other attribute names is
 <em>not</em> recommended.  The <tt>yacc.py</tt> module only exposes the
-contents of the <tt>value</tt> attribute.  Thus, accessing other attributes may be unnecessarily awkward.
+contents of the <tt>value</tt> attribute.  Thus, accessing other attributes may be unnecessarily awkward.  If you
+need to store multiple values on a token, assign a tuple, dictionary, or instance to <tt>value</tt>.
 
 <H3><a name="ply_nn8"></a>3.5 Discarded tokens</H3>
 
@@ -894,14 +895,18 @@ m.test("3 + 4")     # Test it
 </pre>
 </blockquote>
 
-For reasons that are subtle, you should <em>NOT</em> invoke <tt>lex.lex()</tt> inside the <tt>__init__()</tt> method of your class. If you
-do, it may cause bizarre behavior if someone tries to duplicate a lexer object. Keep reading.
+When building a lexer from class, you should construct the lexer from
+an instance of the class, not the class object itself.  Also, for
+reasons that are subtle, you should <em>NOT</em>
+invoke <tt>lex.lex()</tt> inside the <tt>__init__()</tt> method of
+your class.  If you do, it may cause bizarre behavior if someone tries
+to duplicate a lexer object.
 
 <H3><a name="ply_nn18"></a>3.15 Maintaining state</H3>
-
 In your lexer, you may want to maintain a variety of state
 information.  This might include mode settings, symbol tables, and
 other details.  There are a few
-different ways to handle this situation.  First, you could just keep some global variables:
+different ways to handle this situation.  One way to do this is to keep a set of global variables in the module
+where you created the lexer.  For example:
 
 <blockquote>
 <pre>
@@ -940,9 +945,9 @@ lexer.num_count = 0     # Set the initial count
 </pre>
 </blockquote>
 
 This latter approach has the advantage of storing information inside
-the lexer itself---something that may be useful if multiple instances
+the lexer object itself---something that may be useful if multiple instances
 of the same lexer have been created.  However, it may also feel kind
-of "hacky" to the purists.  Just to put their mind at some ease, all
+of "hacky" to the OO purists.  Just to put their mind at some ease, all
 internal attributes of the lexer (with the exception of
 <tt>lineno</tt>) have names that are prefixed by <tt>lex</tt>
 (e.g., <tt>lexdata</tt>,<tt>lexpos</tt>, etc.).  Thus,
 it should be perfectly safe to store attributes in the lexer that
@@ -977,12 +982,12 @@ lexer = lex.lex(object=m)
 </pre>
 </blockquote>
 
-The class approach may be the easiest to manage if your application is going to be creating multiple instances of the same lexer and
-you need to manage a lot of state.
+The class approach may be the easiest to manage if your application is
+going to be creating multiple instances of the same lexer and you need
+to manage a lot of state.
 
 <H3><a name="ply_nn19"></a>3.16 Lexer cloning</H3>
-
 <p>
 If necessary, a lexer object can be quickly duplicated by invoking its <tt>clone()</tt> method.  For example:
diff --git a/example/ansic/clex.py b/example/ansic/clex.py
index 7dc98cd..7ec6d32 100644
--- a/example/ansic/clex.py
+++ b/example/ansic/clex.py
@@ -143,12 +143,12 @@ t_CCONST = r'(L)?\'([^\\\n]|(\\.))*?\''
 
 # Comments
 def t_comment(t):
     r'/\*(.|\n)*?\*/'
-    t.lineno += t.value.count('\n')
+    t.lexer.lineno += t.value.count('\n')
 
 # Preprocessor directive (ignored)
 def t_preprocessor(t):
     r'\#(.)*?\n'
-    t.lineno += 1
+    t.lexer.lineno += 1
 
 def t_error(t):
     print "Illegal character %s" % repr(t.value[0])
diff --git a/example/yply/ylex.py b/example/yply/ylex.py
index 67d2354..84f2f7a 100644
--- a/example/yply/ylex.py
+++ b/example/yply/ylex.py
@@ -42,7 +42,7 @@ def t_SECTION(t):
 
 # Comments
 def t_ccomment(t):
     r'/\*(.|\n)*?\*/'
-    t.lineno += t.value.count('\n')
+    t.lexer.lineno += t.value.count('\n')
 
 t_ignore_cppcomment = r'//.*'
 
@@ -95,7 +95,7 @@ def t_code_error(t):
     raise RuntimeError
 
 def t_error(t):
-    print "%d: Illegal character '%s'" % (t.lineno, t.value[0])
+    print "%d: Illegal character '%s'" % (t.lexer.lineno, t.value[0])
     print t.value
     t.lexer.skip(1)
diff --git a/ply/lex.py b/ply/lex.py
--- a/ply/lex.py
+++ b/ply/lex.py
@@ -22,7 +22,7 @@
 # See the file COPYING for a complete copy of the LGPL.
 # -----------------------------------------------------------------------------
 
-__version__    = "2.4"
+__version__    = "2.5"
 __tabversion__ = "2.4"       # Version of table file used
 
 import re, sys, types, copy, os
@@ -89,6 +89,7 @@ class Lexer:
         self.lexretext = None          # Current regular expression strings
         self.lexstatere = {}           # Dictionary mapping lexer states to master regexs
         self.lexstateretext = {}       # Dictionary mapping lexer states to regex strings
+        self.lexstaterenames = {}      # Dictionary mapping lexer states to symbol names
         self.lexstate = "INITIAL"      # Current lexer state
         self.lexstatestack = []        # Stack of lexer states
         self.lexstateinfo = None       # State information
@@ -161,7 +162,7 @@ class Lexer:
         for key, lre in self.lexstatere.items():
             titem = []
             for i in range(len(lre)):
-                titem.append((self.lexstateretext[key][i],_funcs_to_names(lre[i][1],key,initialfuncs)))
+                titem.append((self.lexstateretext[key][i],_funcs_to_names(lre[i][1],self.lexstaterenames[key][i])))
             tabre[key] = titem
         tf.write("_lexstatere = %s\n" % repr(tabre))
@@ -409,20 +410,11 @@ def _validate_file(filename):
 # suitable for output to a table file
 # -----------------------------------------------------------------------------
 
-def _funcs_to_names(funclist,state,initial):
-    # If this is the initial state, we clear the state and initial list
-    if state == 'INITIAL':
-        state = ""
-        initial = []
+def _funcs_to_names(funclist,namelist):
     result = []
-    for f in funclist:
+    for f,name in zip(funclist,namelist):
         if f and f[0]:
-            # If a function is defined, make sure it's name corresponds to the correct state
-            if not initial or f in initial:
-                statestr = "t_"
-            else:
-                statestr = "t_"+state+"_"
-            result.append((statestr+ f[1],f[1]))
+            result.append((name, f[1]))
         else:
             result.append(f)
     return result
@@ -459,25 +451,27 @@ def _form_master_re(relist,reflags,ldict,toknames):
 
         # Build the index to function map for the matching engine
         lexindexfunc = [ None ] * (max(lexre.groupindex.values())+1)
+        lexindexnames = lexindexfunc[:]
+
         for f,i in lexre.groupindex.items():
             handle = ldict.get(f,None)
             if type(handle) in (types.FunctionType, types.MethodType):
                 lexindexfunc[i] = (handle,toknames[f])
+                lexindexnames[i] = f
             elif handle is not None:
-                # If rule was specified as a string, we build an anonymous
-                # callback function to carry out the action
+                lexindexnames[i] = f
                 if f.find("ignore_") > 0:
                     lexindexfunc[i] = (None,None)
                 else:
                     lexindexfunc[i] = (None, toknames[f])
-
-        return [(lexre,lexindexfunc)],[regex]
+
+        return [(lexre,lexindexfunc)],[regex],[lexindexnames]
     except Exception,e:
         m = int(len(relist)/2)
         if m == 0: m = 1
-        llist, lre  = _form_master_re(relist[:m],reflags,ldict,toknames)
-        rlist, rre  = _form_master_re(relist[m:],reflags,ldict,toknames)
-        return llist+rlist, lre+rre
+        llist, lre, lnames = _form_master_re(relist[:m],reflags,ldict,toknames)
+        rlist, rre, rnames = _form_master_re(relist[m:],reflags,ldict,toknames)
+        return llist+rlist, lre+rre, lnames+rnames
 
 # -----------------------------------------------------------------------------
 # def _statetoken(s,names)
@@ -794,9 +788,10 @@ def lex(module=None,object=None,debug=0,optimize=0,lextab="lextab",reflags=0,now
 
     # Build the master regular expressions
     for state in regexs.keys():
-        lexre, re_text = _form_master_re(regexs[state],reflags,ldict,toknames)
+        lexre, re_text, re_names = _form_master_re(regexs[state],reflags,ldict,toknames)
         lexobj.lexstatere[state] = lexre
         lexobj.lexstateretext[state] = re_text
+        lexobj.lexstaterenames[state] = re_names
         if debug:
             for i in range(len(re_text)):
                 print "lex: state '%s'. regex[%d] = '%s'" % (state, i, re_text[i])
@@ -806,6 +801,7 @@ def lex(module=None,object=None,debug=0,optimize=0,lextab="lextab",reflags=0,now
         if state != "INITIAL" and type == 'inclusive':
             lexobj.lexstatere[state].extend(lexobj.lexstatere['INITIAL'])
             lexobj.lexstateretext[state].extend(lexobj.lexstateretext['INITIAL'])
+            lexobj.lexstaterenames[state].extend(lexobj.lexstaterenames['INITIAL'])
 
     lexobj.lexstateinfo = stateinfo
     lexobj.lexre = lexobj.lexstatere["INITIAL"]
@@ -888,7 +884,10 @@ def runmain(lexer=None,data=None):
 
 def TOKEN(r):
     def set_doc(f):
-        f.__doc__ = r
+        if callable(r):
+            f.__doc__ = r.__doc__
+        else:
+            f.__doc__ = r
         return f
     return set_doc
diff --git a/ply/yacc.py b/ply/yacc.py
index 6609c2f..bf3a30b 100644
--- a/ply/yacc.py
+++ b/ply/yacc.py
@@ -50,7 +50,7 @@
 # own risk!
 # ----------------------------------------------------------------------------
 
-__version__    = "2.4"
+__version__    = "2.5"
 __tabversion__ = "2.4"       # Table version
 
 #-----------------------------------------------------------------------------
diff --git a/test/lex_ignore.exp b/test/lex_ignore.exp
index fdcf61a..f7bdfdd 100644
--- a/test/lex_ignore.exp
+++ b/test/lex_ignore.exp
@@ -2,6 +2,5 @@
 Traceback (most recent call last):
   File "./lex_ignore.py", line 29, in <module>
     lex.lex()
-  File "../ply/lex.py", line 772, in lex
-    raise SyntaxError,"lex: Unable to build lexer."
+  File "../../ply/lex.py", line 783, in lex
 SyntaxError: lex: Unable to build lexer.
diff --git a/test/lex_re1.exp b/test/lex_re1.exp
index c49472e..968454d 100644
--- a/test/lex_re1.exp
+++ b/test/lex_re1.exp
@@ -2,6 +2,5 @@ lex: Invalid regular expression for rule 't_NUMBER'. unbalanced parenthesis
 Traceback (most recent call last):
   File "./lex_re1.py", line 25, in <module>
     lex.lex()
-  File "../ply/lex.py", line 772, in lex
-    raise SyntaxError,"lex: Unable to build lexer."
+  File "../../ply/lex.py", line 783, in lex
 SyntaxError: lex: Unable to build lexer.
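The `TOKEN` decorator change to ply/lex.py above lets a rule borrow its regex from another rule function as well as from a string. A minimal standalone sketch of the updated decorator, with hypothetical rules (the `identifier` pattern and the rule bodies are illustrative only):

```python
def TOKEN(r):
    """Attach a regex to a lexer rule; r may be a pattern string or, as of
    this commit, another rule function whose docstring (regex) is reused."""
    def set_doc(f):
        if callable(r):
            f.__doc__ = r.__doc__
        else:
            f.__doc__ = r
        return f
    return set_doc

identifier = r'[A-Za-z_][A-Za-z0-9_]*'

@TOKEN(identifier)          # regex given as a string (old behavior)
def t_ID(t):
    return t

@TOKEN(t_ID)                # regex borrowed from an existing rule (new behavior)
def t_TYPEID(t):
    return t
```

Both rules end up with the same docstring, which is where PLY reads each rule's regular expression from.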
diff --git a/test/lex_re2.exp b/test/lex_re2.exp
index 76d1178..cceb7a6 100644
--- a/test/lex_re2.exp
+++ b/test/lex_re2.exp
@@ -2,6 +2,5 @@ lex: Regular expression for rule 't_PLUS' matches empty string.
 Traceback (most recent call last):
   File "./lex_re2.py", line 25, in <module>
     lex.lex()
-  File "../ply/lex.py", line 772, in lex
-    raise SyntaxError,"lex: Unable to build lexer."
+  File "../../ply/lex.py", line 783, in lex
 SyntaxError: lex: Unable to build lexer.
diff --git a/test/lex_re3.exp b/test/lex_re3.exp
index 615c02e..38ce8ca 100644
--- a/test/lex_re3.exp
+++ b/test/lex_re3.exp
@@ -3,6 +3,5 @@ lex: Make sure '#' in rule 't_POUND' is escaped with '\#'.
 Traceback (most recent call last):
   File "./lex_re3.py", line 27, in <module>
     lex.lex()
-  File "../ply/lex.py", line 772, in lex
-    raise SyntaxError,"lex: Unable to build lexer."
+  File "../../ply/lex.py", line 783, in lex
 SyntaxError: lex: Unable to build lexer.
diff --git a/test/lex_state1.exp b/test/lex_state1.exp
index 5793b16..b89f8ad 100644
--- a/test/lex_state1.exp
+++ b/test/lex_state1.exp
@@ -2,6 +2,5 @@ lex: states must be defined as a tuple or list.
 Traceback (most recent call last):
   File "./lex_state1.py", line 38, in <module>
     lex.lex()
-  File "../ply/lex.py", line 772, in lex
-    raise SyntaxError,"lex: Unable to build lexer."
+  File "../../ply/lex.py", line 783, in lex
 SyntaxError: lex: Unable to build lexer.
diff --git a/test/lex_state2.exp b/test/lex_state2.exp
index 206b194..d7458f3 100644
--- a/test/lex_state2.exp
+++ b/test/lex_state2.exp
@@ -3,6 +3,5 @@ lex: invalid state specifier 'example'. Must be a tuple (statename,'exclusive|in
 Traceback (most recent call last):
   File "./lex_state2.py", line 38, in <module>
     lex.lex()
-  File "../ply/lex.py", line 772, in lex
-    raise SyntaxError,"lex: Unable to build lexer."
+  File "../../ply/lex.py", line 783, in lex
 SyntaxError: lex: Unable to build lexer.
diff --git a/test/lex_state3.exp b/test/lex_state3.exp
index 8284cf0..054ec8d 100644
--- a/test/lex_state3.exp
+++ b/test/lex_state3.exp
@@ -3,6 +3,5 @@ lex: No rules defined for state 'example'
 Traceback (most recent call last):
   File "./lex_state3.py", line 40, in <module>
     lex.lex()
-  File "../ply/lex.py", line 772, in lex
-    raise SyntaxError,"lex: Unable to build lexer."
+  File "../../ply/lex.py", line 783, in lex
 SyntaxError: lex: Unable to build lexer.
diff --git a/test/lex_state4.exp b/test/lex_state4.exp
index 9537dc5..4c77243 100644
--- a/test/lex_state4.exp
+++ b/test/lex_state4.exp
@@ -2,6 +2,5 @@ lex: state type for state comment must be 'inclusive' or 'exclusive'
 Traceback (most recent call last):
   File "./lex_state4.py", line 39, in <module>
     lex.lex()
-  File "../ply/lex.py", line 772, in lex
-    raise SyntaxError,"lex: Unable to build lexer."
+  File "../../ply/lex.py", line 783, in lex
 SyntaxError: lex: Unable to build lexer.
diff --git a/test/lex_state5.exp b/test/lex_state5.exp
index 5d088ac..301f398 100644
--- a/test/lex_state5.exp
+++ b/test/lex_state5.exp
@@ -2,6 +2,5 @@ lex: state 'comment' already defined.
 Traceback (most recent call last):
   File "./lex_state5.py", line 40, in <module>
     lex.lex()
-  File "../ply/lex.py", line 772, in lex
-    raise SyntaxError,"lex: Unable to build lexer."
+  File "../../ply/lex.py", line 783, in lex
 SyntaxError: lex: Unable to build lexer.
diff --git a/test/lex_state_norule.exp b/test/lex_state_norule.exp
index afb0651..07f03d2 100644
--- a/test/lex_state_norule.exp
+++ b/test/lex_state_norule.exp
@@ -2,6 +2,5 @@ lex: No rules defined for state 'example'
 Traceback (most recent call last):
   File "./lex_state_norule.py", line 40, in <module>
     lex.lex()
-  File "../ply/lex.py", line 772, in lex
-    raise SyntaxError,"lex: Unable to build lexer."
+  File "../../ply/lex.py", line 783, in lex
 SyntaxError: lex: Unable to build lexer.
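The doc/ply.html change in this commit now tells users to pack extra per-token data into the `value` attribute (as a tuple, dictionary, or instance) rather than inventing new token attributes, since yacc.py only exposes `value`. A sketch of the recommended pattern, using a hypothetical minimal token class and symbol table rather than PLY's real `LexToken`:

```python
class FakeToken:
    """Hypothetical stand-in for PLY's token object (illustration only)."""
    def __init__(self, type_, value):
        self.type = type_
        self.value = value

symtab = {'x': 'int'}   # made-up symbol table for the example

def t_ID(t):
    # Recommended: pack the lexeme and its symbol-table entry into value.
    # yacc.py only exposes t.value, so extra attributes would be lost.
    t.value = (t.value, symtab.get(t.value))
    return t

tok = t_ID(FakeToken('ID', 'x'))
```

In a grammar action, the parser then unpacks the tuple (e.g. `name, typ = p[1]`) instead of reaching for attributes it cannot see.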