author     David Beazley <dave@dabeaz.com>   2008-05-28 16:13:22 +0000
committer  David Beazley <dave@dabeaz.com>   2008-05-28 16:13:22 +0000
commit     d2151e3557667c0ea0cff5f41a5dc58b275f107f (patch)
tree       70b685901d26af63ede13b37cba8f9c23433926c
parent     241e3c771322a9d3014d2f0d48975f15bcba732a (diff)
download   ply-d2151e3557667c0ea0cff5f41a5dc58b275f107f.tar.gz
Bug fixes
-rw-r--r--  ANNOUNCE                   13
-rw-r--r--  README                      4
-rw-r--r--  doc/ply.html               27
-rw-r--r--  example/ansic/clex.py       4
-rw-r--r--  example/yply/ylex.py        4
-rw-r--r--  ply/lex.py                 45
-rw-r--r--  ply/yacc.py                 2
-rw-r--r--  test/lex_ignore.exp         3
-rw-r--r--  test/lex_re1.exp            3
-rw-r--r--  test/lex_re2.exp            3
-rw-r--r--  test/lex_re3.exp            3
-rw-r--r--  test/lex_state1.exp         3
-rw-r--r--  test/lex_state2.exp         3
-rw-r--r--  test/lex_state3.exp         3
-rw-r--r--  test/lex_state4.exp         3
-rw-r--r--  test/lex_state5.exp         3
-rw-r--r--  test/lex_state_norule.exp   3
17 files changed, 58 insertions, 71 deletions
diff --git a/ANNOUNCE b/ANNOUNCE
index 828f11e..02b6cde 100644
--- a/ANNOUNCE
+++ b/ANNOUNCE
@@ -1,11 +1,11 @@
-May 17, 2008
+May 28, 2008
- Announcing : PLY-2.4 (Python Lex-Yacc)
+ Announcing : PLY-2.5 (Python Lex-Yacc)
http://www.dabeaz.com/ply
I'm pleased to announce a significant new update to PLY---a 100% Python
-implementation of the common parsing tools lex and yacc. PLY-2.4 fixes
+implementation of the common parsing tools lex and yacc. PLY-2.5 fixes
some bugs in error handling and provides some performance improvements.
If you are new to PLY, here are a few highlights:
@@ -29,13 +29,6 @@ If you are new to PLY, here are a few highlights:
problems. Currently, PLY can build its parsing tables using
either SLR or LALR(1) algorithms.
-- PLY can be used to build parsers for large programming languages.
- Although it is not ultra-fast due to its Python implementation,
- PLY can be used to parse grammars consisting of several hundred
- rules (as might be found for a language like C). The lexer and LR
- parser are also reasonably efficient when parsing normal
- sized programs.
-
More information about PLY can be obtained on the PLY webpage at:
http://www.dabeaz.com/ply
diff --git a/README b/README
index 4538f20..d533919 100644
--- a/README
+++ b/README
@@ -1,8 +1,8 @@
-PLY (Python Lex-Yacc) Version 2.4 (May, 2008)
+PLY (Python Lex-Yacc) Version 2.5 (May 28, 2008)
David M. Beazley (dave@dabeaz.com)
-Copyright (C) 2001-2007 David M. Beazley
+Copyright (C) 2001-2008 David M. Beazley
This library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
diff --git a/doc/ply.html b/doc/ply.html
index 43af3d7..13a2631 100644
--- a/doc/ply.html
+++ b/doc/ply.html
@@ -12,7 +12,7 @@ dave@dabeaz.com<br>
</b>
<p>
-<b>PLY Version: 2.4</b>
+<b>PLY Version: 2.5</b>
<p>
<!-- INDEX -->
@@ -472,7 +472,8 @@ def t_ID(t):
</blockquote>
It is important to note that storing data in other attribute names is <em>not</em> recommended. The <tt>yacc.py</tt> module only exposes the
-contents of the <tt>value</tt> attribute. Thus, accessing other attributes may be unnecessarily awkward.
+contents of the <tt>value</tt> attribute. Thus, accessing other attributes may be unnecessarily awkward. If you
+need to store multiple values on a token, assign a tuple, dictionary, or instance to <tt>value</tt>.
<H3><a name="ply_nn8"></a>3.5 Discarded tokens</H3>
@@ -894,14 +895,18 @@ m.test("3 + 4") # Test it
</pre>
</blockquote>
-For reasons that are subtle, you should <em>NOT</em> invoke <tt>lex.lex()</tt> inside the <tt>__init__()</tt> method of your class. If you
-do, it may cause bizarre behavior if someone tries to duplicate a lexer object. Keep reading.
+When building a lexer from class, you should construct the lexer from
+an instance of the class, not the class object itself. Also, for
+reasons that are subtle, you should <em>NOT</em>
+invoke <tt>lex.lex()</tt> inside the <tt>__init__()</tt> method of
+your class. If you do, it may cause bizarre behavior if someone tries
+to duplicate a lexer object.
<H3><a name="ply_nn18"></a>3.15 Maintaining state</H3>
-
In your lexer, you may want to maintain a variety of state information. This might include mode settings, symbol tables, and other details. There are a few
-different ways to handle this situation. First, you could just keep some global variables:
+different ways to handle this situation. One way to do this is to keep a set of global variables in the module
+where you created the lexer. For example:
<blockquote>
<pre>
@@ -940,9 +945,9 @@ lexer.num_count = 0 # Set the initial count
</blockquote>
This latter approach has the advantage of storing information inside
-the lexer itself---something that may be useful if multiple instances
+the lexer object itself---something that may be useful if multiple instances
of the same lexer have been created. However, it may also feel kind
-of "hacky" to the purists. Just to put their mind at some ease, all
+of "hacky" to the OO purists. Just to put their mind at some ease, all
internal attributes of the lexer (with the exception of <tt>lineno</tt>) have names that are prefixed
by <tt>lex</tt> (e.g., <tt>lexdata</tt>,<tt>lexpos</tt>, etc.). Thus,
it should be perfectly safe to store attributes in the lexer that
@@ -977,12 +982,12 @@ lexer = lex.lex(object=m)
</pre>
</blockquote>
-The class approach may be the easiest to manage if your application is going to be creating multiple instances of the same lexer and
-you need to manage a lot of state.
+The class approach may be the easiest to manage if your application is
+going to be creating multiple instances of the same lexer and you need
+to manage a lot of state.
<H3><a name="ply_nn19"></a>3.16 Lexer cloning</H3>
-
<p>
If necessary, a lexer object can be quickly duplicated by invoking its <tt>clone()</tt> method. For example:
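
The documentation changes above recommend two idioms: pack several values into a token's value attribute (a tuple, dictionary, or instance) rather than inventing extra token attributes, and keep per-lexer counters on the lexer object itself instead of in module globals. A minimal sketch combining both, with hypothetical token names, might look like this:

    import ply.lex as lex

    tokens = ('ID', 'NUMBER')
    t_ignore = ' \t\n'

    def t_ID(t):
        r'[A-Za-z_][A-Za-z0-9_]*'
        # Pack extra data into t.value instead of adding new token attributes.
        t.value = (t.value, len(t.value))     # (text, length) -- illustrative only
        return t

    def t_NUMBER(t):
        r'\d+'
        t.value = int(t.value)
        t.lexer.num_count += 1                # per-lexer state lives on the lexer object
        return t

    def t_error(t):
        print "Illegal character %r" % t.value[0]
        t.lexer.skip(1)

    lexer = lex.lex()
    lexer.num_count = 0                       # set the initial count
    lexer.input("abc 12 34")
    for tok in iter(lexer.token, None):
        print tok.type, tok.value
    print "numbers seen:", lexer.num_count
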
diff --git a/example/ansic/clex.py b/example/ansic/clex.py
index 7dc98cd..7ec6d32 100644
--- a/example/ansic/clex.py
+++ b/example/ansic/clex.py
@@ -143,12 +143,12 @@ t_CCONST = r'(L)?\'([^\\\n]|(\\.))*?\''
# Comments
def t_comment(t):
r'/\*(.|\n)*?\*/'
- t.lineno += t.value.count('\n')
+ t.lexer.lineno += t.value.count('\n')
# Preprocessor directive (ignored)
def t_preprocessor(t):
r'\#(.)*?\n'
- t.lineno += 1
+ t.lexer.lineno += 1
def t_error(t):
print "Illegal character %s" % repr(t.value[0])
diff --git a/example/yply/ylex.py b/example/yply/ylex.py
index 67d2354..84f2f7a 100644
--- a/example/yply/ylex.py
+++ b/example/yply/ylex.py
@@ -42,7 +42,7 @@ def t_SECTION(t):
# Comments
def t_ccomment(t):
r'/\*(.|\n)*?\*/'
- t.lineno += t.value.count('\n')
+ t.lexer.lineno += t.value.count('\n')
t_ignore_cppcomment = r'//.*'
@@ -95,7 +95,7 @@ def t_code_error(t):
raise RuntimeError
def t_error(t):
- print "%d: Illegal character '%s'" % (t.lineno, t.value[0])
+ print "%d: Illegal character '%s'" % (t.lexer.lineno, t.value[0])
print t.value
t.lexer.skip(1)
diff --git a/ply/lex.py b/ply/lex.py
index f8eed0f..c5beb8c 100644
--- a/ply/lex.py
+++ b/ply/lex.py
@@ -22,7 +22,7 @@
# See the file COPYING for a complete copy of the LGPL.
# -----------------------------------------------------------------------------
-__version__ = "2.4"
+__version__ = "2.5"
__tabversion__ = "2.4" # Version of table file used
import re, sys, types, copy, os
@@ -89,6 +89,7 @@ class Lexer:
self.lexretext = None # Current regular expression strings
self.lexstatere = {} # Dictionary mapping lexer states to master regexs
self.lexstateretext = {} # Dictionary mapping lexer states to regex strings
+ self.lexstaterenames = {} # Dictionary mapping lexer states to symbol names
self.lexstate = "INITIAL" # Current lexer state
self.lexstatestack = [] # Stack of lexer states
self.lexstateinfo = None # State information
@@ -161,7 +162,7 @@ class Lexer:
for key, lre in self.lexstatere.items():
titem = []
for i in range(len(lre)):
- titem.append((self.lexstateretext[key][i],_funcs_to_names(lre[i][1],key,initialfuncs)))
+ titem.append((self.lexstateretext[key][i],_funcs_to_names(lre[i][1],self.lexstaterenames[key][i])))
tabre[key] = titem
tf.write("_lexstatere = %s\n" % repr(tabre))
@@ -409,20 +410,11 @@ def _validate_file(filename):
# suitable for output to a table file
# -----------------------------------------------------------------------------
-def _funcs_to_names(funclist,state,initial):
- # If this is the initial state, we clear the state and initial list
- if state == 'INITIAL':
- state = ""
- initial = []
+def _funcs_to_names(funclist,namelist):
result = []
- for f in funclist:
+ for f,name in zip(funclist,namelist):
if f and f[0]:
- # If a function is defined, make sure it's name corresponds to the correct state
- if not initial or f in initial:
- statestr = "t_"
- else:
- statestr = "t_"+state+"_"
- result.append((statestr+ f[1],f[1]))
+ result.append((name, f[1]))
else:
result.append(f)
return result
@@ -459,25 +451,27 @@ def _form_master_re(relist,reflags,ldict,toknames):
# Build the index to function map for the matching engine
lexindexfunc = [ None ] * (max(lexre.groupindex.values())+1)
+ lexindexnames = lexindexfunc[:]
+
for f,i in lexre.groupindex.items():
handle = ldict.get(f,None)
if type(handle) in (types.FunctionType, types.MethodType):
lexindexfunc[i] = (handle,toknames[f])
+ lexindexnames[i] = f
elif handle is not None:
- # If rule was specified as a string, we build an anonymous
- # callback function to carry out the action
+ lexindexnames[i] = f
if f.find("ignore_") > 0:
lexindexfunc[i] = (None,None)
else:
lexindexfunc[i] = (None, toknames[f])
-
- return [(lexre,lexindexfunc)],[regex]
+
+ return [(lexre,lexindexfunc)],[regex],[lexindexnames]
except Exception,e:
m = int(len(relist)/2)
if m == 0: m = 1
- llist, lre = _form_master_re(relist[:m],reflags,ldict,toknames)
- rlist, rre = _form_master_re(relist[m:],reflags,ldict,toknames)
- return llist+rlist, lre+rre
+ llist, lre, lnames = _form_master_re(relist[:m],reflags,ldict,toknames)
+ rlist, rre, rnames = _form_master_re(relist[m:],reflags,ldict,toknames)
+ return llist+rlist, lre+rre, lnames+rnames
# -----------------------------------------------------------------------------
# def _statetoken(s,names)
@@ -794,9 +788,10 @@ def lex(module=None,object=None,debug=0,optimize=0,lextab="lextab",reflags=0,now
# Build the master regular expressions
for state in regexs.keys():
- lexre, re_text = _form_master_re(regexs[state],reflags,ldict,toknames)
+ lexre, re_text, re_names = _form_master_re(regexs[state],reflags,ldict,toknames)
lexobj.lexstatere[state] = lexre
lexobj.lexstateretext[state] = re_text
+ lexobj.lexstaterenames[state] = re_names
if debug:
for i in range(len(re_text)):
print "lex: state '%s'. regex[%d] = '%s'" % (state, i, re_text[i])
@@ -806,6 +801,7 @@ def lex(module=None,object=None,debug=0,optimize=0,lextab="lextab",reflags=0,now
if state != "INITIAL" and type == 'inclusive':
lexobj.lexstatere[state].extend(lexobj.lexstatere['INITIAL'])
lexobj.lexstateretext[state].extend(lexobj.lexstateretext['INITIAL'])
+ lexobj.lexstaterenames[state].extend(lexobj.lexstaterenames['INITIAL'])
lexobj.lexstateinfo = stateinfo
lexobj.lexre = lexobj.lexstatere["INITIAL"]
@@ -888,7 +884,10 @@ def runmain(lexer=None,data=None):
def TOKEN(r):
def set_doc(f):
- f.__doc__ = r
+ if callable(r):
+ f.__doc__ = r.__doc__
+ else:
+ f.__doc__ = r
return f
return set_doc
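
With the change to set_doc above, the TOKEN decorator accepts either a regular-expression string or a callable, copying the callable's docstring onto the rule function. A small usage sketch (the helper and rule names are illustrative):

    from ply.lex import TOKEN

    def identifier():
        r'[A-Za-z_][A-Za-z0-9_]*'
        # Hypothetical helper: its docstring carries the regular expression.

    @TOKEN(identifier)     # 2.5 behavior: a callable's __doc__ becomes the rule's regex
    def t_ID(t):
        return t

    @TOKEN(r'\d+')         # the original string form still works
    def t_NUMBER(t):
        t.value = int(t.value)
        return t
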
diff --git a/ply/yacc.py b/ply/yacc.py
index 6609c2f..bf3a30b 100644
--- a/ply/yacc.py
+++ b/ply/yacc.py
@@ -50,7 +50,7 @@
# own risk!
# ----------------------------------------------------------------------------
-__version__ = "2.4"
+__version__ = "2.5"
__tabversion__ = "2.4" # Table version
#-----------------------------------------------------------------------------
diff --git a/test/lex_ignore.exp b/test/lex_ignore.exp
index fdcf61a..f7bdfdd 100644
--- a/test/lex_ignore.exp
+++ b/test/lex_ignore.exp
@@ -2,6 +2,5 @@
Traceback (most recent call last):
File "./lex_ignore.py", line 29, in <module>
lex.lex()
- File "../ply/lex.py", line 772, in lex
- raise SyntaxError,"lex: Unable to build lexer."
+ File "../../ply/lex.py", line 783, in lex
SyntaxError: lex: Unable to build lexer.
diff --git a/test/lex_re1.exp b/test/lex_re1.exp
index c49472e..968454d 100644
--- a/test/lex_re1.exp
+++ b/test/lex_re1.exp
@@ -2,6 +2,5 @@ lex: Invalid regular expression for rule 't_NUMBER'. unbalanced parenthesis
Traceback (most recent call last):
File "./lex_re1.py", line 25, in <module>
lex.lex()
- File "../ply/lex.py", line 772, in lex
- raise SyntaxError,"lex: Unable to build lexer."
+ File "../../ply/lex.py", line 783, in lex
SyntaxError: lex: Unable to build lexer.
diff --git a/test/lex_re2.exp b/test/lex_re2.exp
index 76d1178..cceb7a6 100644
--- a/test/lex_re2.exp
+++ b/test/lex_re2.exp
@@ -2,6 +2,5 @@ lex: Regular expression for rule 't_PLUS' matches empty string.
Traceback (most recent call last):
File "./lex_re2.py", line 25, in <module>
lex.lex()
- File "../ply/lex.py", line 772, in lex
- raise SyntaxError,"lex: Unable to build lexer."
+ File "../../ply/lex.py", line 783, in lex
SyntaxError: lex: Unable to build lexer.
diff --git a/test/lex_re3.exp b/test/lex_re3.exp
index 615c02e..38ce8ca 100644
--- a/test/lex_re3.exp
+++ b/test/lex_re3.exp
@@ -3,6 +3,5 @@ lex: Make sure '#' in rule 't_POUND' is escaped with '\#'.
Traceback (most recent call last):
File "./lex_re3.py", line 27, in <module>
lex.lex()
- File "../ply/lex.py", line 772, in lex
- raise SyntaxError,"lex: Unable to build lexer."
+ File "../../ply/lex.py", line 783, in lex
SyntaxError: lex: Unable to build lexer.
diff --git a/test/lex_state1.exp b/test/lex_state1.exp
index 5793b16..b89f8ad 100644
--- a/test/lex_state1.exp
+++ b/test/lex_state1.exp
@@ -2,6 +2,5 @@ lex: states must be defined as a tuple or list.
Traceback (most recent call last):
File "./lex_state1.py", line 38, in <module>
lex.lex()
- File "../ply/lex.py", line 772, in lex
- raise SyntaxError,"lex: Unable to build lexer."
+ File "../../ply/lex.py", line 783, in lex
SyntaxError: lex: Unable to build lexer.
diff --git a/test/lex_state2.exp b/test/lex_state2.exp
index 206b194..d7458f3 100644
--- a/test/lex_state2.exp
+++ b/test/lex_state2.exp
@@ -3,6 +3,5 @@ lex: invalid state specifier 'example'. Must be a tuple (statename,'exclusive|in
Traceback (most recent call last):
File "./lex_state2.py", line 38, in <module>
lex.lex()
- File "../ply/lex.py", line 772, in lex
- raise SyntaxError,"lex: Unable to build lexer."
+ File "../../ply/lex.py", line 783, in lex
SyntaxError: lex: Unable to build lexer.
diff --git a/test/lex_state3.exp b/test/lex_state3.exp
index 8284cf0..054ec8d 100644
--- a/test/lex_state3.exp
+++ b/test/lex_state3.exp
@@ -3,6 +3,5 @@ lex: No rules defined for state 'example'
Traceback (most recent call last):
File "./lex_state3.py", line 40, in <module>
lex.lex()
- File "../ply/lex.py", line 772, in lex
- raise SyntaxError,"lex: Unable to build lexer."
+ File "../../ply/lex.py", line 783, in lex
SyntaxError: lex: Unable to build lexer.
diff --git a/test/lex_state4.exp b/test/lex_state4.exp
index 9537dc5..4c77243 100644
--- a/test/lex_state4.exp
+++ b/test/lex_state4.exp
@@ -2,6 +2,5 @@ lex: state type for state comment must be 'inclusive' or 'exclusive'
Traceback (most recent call last):
File "./lex_state4.py", line 39, in <module>
lex.lex()
- File "../ply/lex.py", line 772, in lex
- raise SyntaxError,"lex: Unable to build lexer."
+ File "../../ply/lex.py", line 783, in lex
SyntaxError: lex: Unable to build lexer.
diff --git a/test/lex_state5.exp b/test/lex_state5.exp
index 5d088ac..301f398 100644
--- a/test/lex_state5.exp
+++ b/test/lex_state5.exp
@@ -2,6 +2,5 @@ lex: state 'comment' already defined.
Traceback (most recent call last):
File "./lex_state5.py", line 40, in <module>
lex.lex()
- File "../ply/lex.py", line 772, in lex
- raise SyntaxError,"lex: Unable to build lexer."
+ File "../../ply/lex.py", line 783, in lex
SyntaxError: lex: Unable to build lexer.
diff --git a/test/lex_state_norule.exp b/test/lex_state_norule.exp
index afb0651..07f03d2 100644
--- a/test/lex_state_norule.exp
+++ b/test/lex_state_norule.exp
@@ -2,6 +2,5 @@ lex: No rules defined for state 'example'
Traceback (most recent call last):
File "./lex_state_norule.py", line 40, in <module>
lex.lex()
- File "../ply/lex.py", line 772, in lex
- raise SyntaxError,"lex: Unable to build lexer."
+ File "../../ply/lex.py", line 783, in lex
SyntaxError: lex: Unable to build lexer.