Fixes to debugging output and docs

author: David Beazley <dave@dabeaz.com> 2009-02-03 18:22:47 +0000
committer: David Beazley <dave@dabeaz.com> 2009-02-03 18:22:47 +0000
commit: 2509247839cfff7bb5cb762591ac718619e4c619 (patch)
tree: 47eb6062ebb7910671590eceae917d1b08e79711 /doc
parent: 0516f5fcd0b0931a018128c5f4a070b14c76a040 (diff)
download: ply-2509247839cfff7bb5cb762591ac718619e4c619.tar.gz
1 files changed, 129 insertions, 24 deletions
diff --git a/doc/ply.html b/doc/ply.html
index f9fe036..fca0966 100644
--- a/doc/ply.html
+++ b/doc/ply.html
@@ -72,10 +72,23 @@ dave@dabeaz.com<br>
 <!-- INDEX -->
 
 
+<h2>Preface and Requirements</h2>
 
+<p>
+This document provides an overview of lexing and parsing with PLY.
+Given the intrinsic complexity of parsing, I would strongly advise 
+that you read (or at least skim) this entire document before jumping
+into a big development project with PLY.  
+</p>
 
-
-
+<p>
+PLY-3.0 is compatible with both Python 2 and Python 3.  Be aware that
+Python 3 support is new and has not been extensively tested (although
+all of the examples and unit tests pass under Python 3.0).   If you are
+using Python 2, you should try to use Python 2.4 or newer.  Although PLY
+works with versions as far back as Python 2.2, some of its optional features
+require more modern library modules.
+</p>
 
 <H2><a name="ply_nn1"></a>1. Introduction</H2>
 
@@ -392,11 +405,7 @@ converts the string into a Python integer.
 <pre>
 def t_NUMBER(t):
     r'\d+'
-    try:
-         t.value = int(t.value)
-    except ValueError:
-         print "Number %s is too large!" % t.value
-	 t.value = 0
+    t.value = int(t.value)
     return t
 </pre>
 </blockquote>
@@ -427,8 +436,8 @@ expressions in order of decreasing length, this problem is solved for rules defi
 the order can be explicitly controlled since rules appearing first are checked first.
 
 <p>
-To handle reserved words, it is usually easier to just match an identifier and do a special name lookup in a function
-like this:
+To handle reserved words, you should write a single rule to match an
+identifier and do a special name lookup in a function like this:
 
 <blockquote>
 <pre>
@@ -741,12 +750,16 @@ lexer = lex.lex(debug=1)
 </pre>
 </blockquote>
 
-This will result in a large amount of debugging information to be printed including all of the added rules and the master
-regular expressions.
+<p>
+This will produce various sorts of debugging information including all of the added rules,
+the master regular expressions used by the lexer, and tokens generated during lexing.
+</p>
 
+<p>
 In addition, <tt>lex.py</tt> comes with a simple main function which
 will either tokenize input read from standard input or from a file specified
 on the command line. To use it, simply put this in your lexer:
+</p>
 
 <blockquote>
 <pre>
@@ -755,6 +768,9 @@ if __name__ == '__main__':
 </pre>
 </blockquote>
 
+Please refer to the "Advanced Debugging" section near the end for more details on
+debugging output.
+
 <H3><a name="ply_nn17"></a>3.14 Alternative specification of lexers</H3>
 
 
@@ -2990,16 +3006,7 @@ each time it runs (which may take awhile depending on how large your grammar is)
 
 <blockquote>
 <pre>
-yacc.parse(debug=n)      # Pick n > 1 for increased amounts of debugging
-</pre>
-</blockquote>
-
-<p>
-<li>To redirect the debugging output to a filename of your choosing, use:
-
-<blockquote>
-<pre>
-yacc.parse(debug=n, debugfile="debugging.out")   # Pick n > 1 for increasing amount of debugging
+yacc.parse(debug=1)     
 </pre>
 </blockquote>
 
@@ -3117,9 +3124,107 @@ the tables without the need for doc strings.
 <p>
 Beware: running PLY in optimized mode disables a lot of error
 checking.  You should only do this when your project has stabilized
-and you don't need to do any debugging.
-  
-<H2><a name="ply_nn39"></a>8. Where to go from here?</H2>
+and you don't need to do any debugging.   One of the purposes of
+optimized mode is to substantially decrease the startup time of
+your compiler (by assuming that everything is already properly
+specified and works).
+
+<H2>8. Advanced Debugging</H2>
+
+<p>
+Debugging a compiler is typically not an easy task. PLY provides some
+advanced diagnostic capabilities through the use of Python's
+<tt>logging</tt> module.   The next two sections describe this:
+
+<h3>8.1 Debugging the lex() and yacc() commands</h3>
+
+<p>
+Both the <tt>lex()</tt> and <tt>yacc()</tt> commands have a debugging
+mode that can be enabled using the <tt>debug</tt> flag.  For example:
+
+<blockquote>
+<pre>
+lex.lex(debug=True)
+yacc.yacc(debug=True)
+</pre>
+</blockquote>
+
+Normally, the output produced by debugging is routed to either
+standard error or, in the case of <tt>yacc()</tt>, to a file
+<tt>parser.out</tt>.  This output can be more carefully controlled
+by supplying a logging object.  Here is an example that adds
+information about where different debugging messages are coming from:
+
+<blockquote>
+<pre>
+# Set up a logging object
+import logging
+logging.basicConfig(
+    level = logging.DEBUG,
+    filename = "parselog.txt",
+    filemode = "w",
+    format = "%(filename)10s:%(lineno)4d:%(message)s"
+)
+log = logging.getLogger()
+
+lex.lex(debug=True,debuglog=log)
+yacc.yacc(debug=True,debuglog=log)
+</pre>
+</blockquote>
+
+If you supply a custom logger, the amount of debugging
+information produced can be controlled by setting the logging level.
+Typically, debugging messages are issued at the <tt>DEBUG</tt>,
+<tt>INFO</tt>, or <tt>WARNING</tt> levels.
+
+<p>
+PLY's error messages and warnings are also produced using the logging
+interface.  This can be controlled by passing a logging object
+using the <tt>errorlog</tt> parameter.
+
+<blockquote>
+<pre>
+lex.lex(errorlog=log)
+yacc.yacc(errorlog=log)
+</pre>
+</blockquote>
+
+If you want to completely silence warnings, you can either pass in a
+logging object with an appropriate filter level or use the <tt>NullLogger</tt>
+object defined in either <tt>lex</tt> or <tt>yacc</tt>.  For example:
+
+<blockquote>
+<pre>
+yacc.yacc(errorlog=yacc.NullLogger())
+</pre>
+</blockquote>
+
+<h3>8.2 Run-time Debugging</h3>
+
+<p>
+To enable run-time debugging of a parser, use the <tt>debug</tt> option to parse. This
+option can either be an integer (which simply turns debugging on or off) or an instance
+of a logger object. For example:
+
+<blockquote>
+<pre>
+log = logging.getLogger()
+parser.parse(input,debug=log)
+</pre>
+</blockquote>
+
+If a logging object is passed, you can use its filtering level to control how much
+output gets generated.   The <tt>INFO</tt> level is used to produce information
+about rule reductions.  The <tt>DEBUG</tt> level will show information about the
+parsing stack, token shifts, and other details.  The <tt>ERROR</tt> level shows information
+related to parsing errors.
+
+<p>
+For very complicated problems, you should pass in a logging object that
+redirects to a file where you can more easily inspect the output after
+execution.
+
+<H2><a name="ply_nn39"></a>9. Where to go from here?</H2>
 
 
 The <tt>examples</tt> directory of the PLY distribution contains several simple examples.   Please consult a
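
The new "Run-time Debugging" text in this patch says that a logger's filtering level controls how much output is produced: DEBUG for stack and shift details, INFO for rule reductions, ERROR for parsing errors. The following is a minimal sketch of that filtering behaviour using only the standard `logging` module. It does not import PLY; `make_logger`, `fake_parse`, and `captured_levels` are hypothetical helpers, and the messages are invented stand-ins for what a parser might emit.

```python
import io
import logging


def make_logger(name, level, stream):
    """Return a logger that writes 'LEVEL:message' lines to the given stream."""
    logger = logging.getLogger(name)
    logger.handlers.clear()      # drop handlers left over from earlier calls
    logger.setLevel(level)       # the filtering level discussed in the docs
    logger.propagate = False     # keep output out of the root logger
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s:%(message)s"))
    logger.addHandler(handler)
    return logger


def fake_parse(log):
    """Hypothetical stand-in for a parser run; the messages are invented."""
    log.debug("shift NUMBER")                 # stack/shift detail
    log.info("reduce expression -> NUMBER")   # rule reduction
    log.error("syntax error at token PLUS")   # parsing error


def captured_levels(level):
    """Run fake_parse with a logger at `level`; return the levels that got through."""
    stream = io.StringIO()
    log = make_logger("ply_sketch", level, stream)
    fake_parse(log)
    return [line.split(":", 1)[0] for line in stream.getvalue().splitlines()]


if __name__ == "__main__":
    for lvl in (logging.DEBUG, logging.INFO, logging.ERROR):
        print(logging.getLevelName(lvl), captured_levels(lvl))
```

With a real parser, the patch indicates that a logger configured this way would be passed as `parser.parse(input, debug=log)`. Raising the level from DEBUG to INFO is then the way to keep the reduction trace while dropping the per-token detail.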