From 2509247839cfff7bb5cb762591ac718619e4c619 Mon Sep 17 00:00:00 2001
From: David Beazley
Date: Tue, 3 Feb 2009 18:22:47 +0000
Subject: Fixes to debugging output and docs

---
 doc/ply.html | 153 +++++++++++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 129 insertions(+), 24 deletions(-)

diff --git a/doc/ply.html b/doc/ply.html
index f9fe036..fca0966 100644
--- a/doc/ply.html
+++ b/doc/ply.html
@@ -72,10 +72,23 @@ dave@dabeaz.com
+

Preface and Requirements

+

+This document provides an overview of lexing and parsing with PLY. +Given the intrinsic complexity of parsing, I would strongly advise +that you read (or at least skim) this entire document before jumping +into a big development project with PLY. +

- - +

+PLY-3.0 is compatible with both Python 2 and Python 3. Be aware that +Python 3 support is new and has not been extensively tested (although +all of the examples and unit tests pass under Python 3.0). If you are +using Python 2, you should try to use Python 2.4 or newer. Although PLY +works with versions as far back as Python 2.2, some of its optional features +require more modern library modules. +

1. Introduction

@@ -392,11 +405,7 @@ converts the string into a Python integer.
 def t_NUMBER(t):
     r'\d+'
-    try:
-         t.value = int(t.value)
-    except ValueError:
-         print "Number %s is too large!" % t.value
-	 t.value = 0
+    t.value = int(t.value)
     return t
 
@@ -427,8 +436,8 @@ expressions in order of decreasing length, this problem is solved for rules defi the order can be explicitly controlled since rules appearing first are checked first.

-To handle reserved words, it is usually easier to just match an identifier and do a special name lookup in a function -like this: +To handle reserved words, you should write a single rule to match an +identifier and do a special name lookup in a function like this:
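A minimal sketch of the reserved-word lookup described above. The `reserved` table and the `SimpleNamespace` stand-in for a token are illustrative assumptions so the rule can be exercised without running a lexer; in a real lexer PLY passes its own token object to the rule.

```python
from types import SimpleNamespace

# Reserved words mapped to their token types (illustrative subset).
reserved = {
    'if': 'IF',
    'else': 'ELSE',
    'while': 'WHILE',
}

def t_ID(t):
    r'[a-zA-Z_][a-zA-Z_0-9]*'
    # One rule matches every identifier; reserved words are then
    # reclassified by a dictionary lookup, everything else stays ID.
    t.type = reserved.get(t.value, 'ID')
    return t

# Stand-in for a token (assumption: only .type and .value matter here).
keyword_tok = t_ID(SimpleNamespace(type='ID', value='while'))
name_tok = t_ID(SimpleNamespace(type='ID', value='counter'))
```

Keeping the lookup in a single rule means adding a keyword only requires a new entry in the table, not a new regular expression.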

@@ -741,12 +750,16 @@ lexer = lex.lex(debug=1)
 
-This will result in a large amount of debugging information to be printed including all of the added rules and the master -regular expressions. +

+This will produce various sorts of debugging information including all of the added rules, +the master regular expressions used by the lexer, and tokens generated during lexing. +

+

In addition, lex.py comes with a simple main function which will either tokenize input read from standard input or from a file specified on the command line. To use it, simply put this in your lexer: +

@@ -755,6 +768,9 @@ if __name__ == '__main__':
 
+Please refer to the "Advanced Debugging" section near the end for more advanced details +of debugging. +

3.14 Alternative specification of lexers

@@ -2990,16 +3006,7 @@ each time it runs (which may take awhile depending on how large your grammar is)
-yacc.parse(debug=n)      # Pick n > 1 for increased amounts of debugging
-
-
- -

-

  • To redirect the debugging output to a filename of your choosing, use: - -
    -
    -yacc.parse(debug=n, debugfile="debugging.out")   # Pick n > 1 for increasing amount of debugging
    +yacc.parse(debug=1)     
     
    @@ -3117,9 +3124,107 @@ the tables without the need for doc strings.

    Beware: running PLY in optimized mode disables a lot of error checking. You should only do this when your project has stabilized -and you don't need to do any debugging. - -

    8. Where to go from here?

    +and you don't need to do any debugging. One of the purposes of +optimized mode is to substantially decrease the startup time of +your compiler (by assuming that everything is already properly +specified and works). + +

    8. Advanced Debugging

    + +

    +Debugging a compiler is typically not an easy task. PLY provides some +advanced diagnostic capabilities through the use of Python's +logging module. The next two sections describe this: + +

    8.1 Debugging the lex() and yacc() commands

    + +

    +Both the lex() and yacc() commands have a debugging +mode that can be enabled using the debug flag. For example: + +

    +
    +lex.lex(debug=True)
    +yacc.yacc(debug=True)
    +
    +
    + +Normally, the output produced by debugging is routed to either +standard error or, in the case of yacc(), to a file +parser.out. This output can be more carefully controlled +by supplying a logging object. Here is an example that adds +information about where different debugging messages are coming from: + +
    +
    +# Set up a logging object
    +import logging
    +logging.basicConfig(
    +    level = logging.DEBUG,
    +    filename = "parselog.txt",
    +    filemode = "w",
    +    format = "%(filename)10s:%(lineno)4d:%(message)s"
    +)
    +log = logging.getLogger()
    +
    +lex.lex(debug=True,debuglog=log)
    +yacc.yacc(debug=True,debuglog=log)
    +
    +
    + +If you supply a custom logger, the amount of debugging +information produced can be controlled by setting the logging level. +Typically, debugging messages are issued at the DEBUG, +INFO, or WARNING level. + +
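The effect of the logging level can be sketched with the standard logging module alone. The logger name and the messages below are made up for illustration; they are not PLY's actual output.

```python
import io
import logging

# Hypothetical logger name; any logging.Logger instance would do.
log = logging.getLogger("ply.debug.example")
log.setLevel(logging.INFO)   # drop DEBUG, keep INFO and above
log.propagate = False        # keep messages out of the root logger

# Capture output in memory so the filtering is easy to inspect.
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter("%(levelname)s:%(message)s"))
log.addHandler(handler)

# Stand-in messages at the three levels mentioned above.
log.debug("detailed table construction output")
log.info("summary information")
log.warning("possible problem in the specification")

captured = buffer.getvalue().splitlines()
```

With the level set to INFO, only the INFO and WARNING messages are captured. A logger configured this way is what would be passed as the debuglog argument.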

    +PLY's error messages and warnings are also produced using the logging +interface. This can be controlled by passing a logging object +using the errorlog parameter. + +

    +
    +lex.lex(errorlog=log)
    +yacc.yacc(errorlog=log)
    +
    +
    + +If you want to completely silence warnings, you can either pass in a +logging object with an appropriate filter level or use the NullLogger +object defined in either lex or yacc. For example: + +
    +
    +yacc.yacc(errorlog=yacc.NullLogger())
    +
    +
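As a sketch of the first option (a logging object with an appropriate filter level), the logger below is set above CRITICAL so that nothing gets through. The logger name and the `ListHandler` helper are hypothetical and exist only to show that no messages are emitted.

```python
import logging

class ListHandler(logging.Handler):
    """Hypothetical helper: records messages so we can see what got through."""
    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())

quiet = logging.getLogger("ply.quiet.example")   # hypothetical name
quiet.propagate = False                          # do not hand records to the root logger
quiet.setLevel(logging.CRITICAL + 1)             # higher than any standard level

sink = ListHandler()
quiet.addHandler(sink)

# Stand-in warnings and errors; all of them are filtered out.
quiet.warning("unused token")
quiet.error("undefined grammar symbol")
```

Such a logger could then be passed as the errorlog argument in place of NullLogger.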
    + +

    8.2 Run-time Debugging

    + +

    +To enable run-time debugging of a parser, use the debug option to parse. This +option can either be an integer (which simply turns debugging on or off) or an instance +of a logger object. For example: + +

    +
    +log = logging.getLogger()
    +parser.parse(input,debug=log)
    +
    +
    + +If a logging object is passed, you can use its filtering level to control how much +output gets generated. The INFO level is used to produce information +about rule reductions. The DEBUG level will show information about the +parsing stack, token shifts, and other details. The ERROR level shows information +related to parsing errors. + +

    +For very complicated problems, you should pass in a logging object that +redirects to a file where you can more easily inspect the output after +execution. + +
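A logging object that redirects to a file can be built with the standard logging module, roughly as follows. The file name, logger name, and messages are assumptions for illustration; real parser output will look different.

```python
import logging
import os
import tempfile

# Hypothetical location for the debugging output.
log_path = os.path.join(tempfile.mkdtemp(), "parsedebug.txt")

parse_log = logging.getLogger("ply.parse.example")   # hypothetical name
parse_log.setLevel(logging.DEBUG)                    # keep everything
parse_log.propagate = False

file_handler = logging.FileHandler(log_path, mode="w")
file_handler.setFormatter(logging.Formatter("%(levelname)-8s %(message)s"))
parse_log.addHandler(file_handler)

# Stand-in messages at the levels described above.
parse_log.debug("stack and shift details")
parse_log.info("rule reduction")
parse_log.error("syntax error")

# Detach and close the handler so the file is complete, then read it back.
parse_log.removeHandler(file_handler)
file_handler.close()
with open(log_path) as f:
    lines = f.read().splitlines()
```

A logger set up like parse_log (before its handler is closed) is the kind of object that would be passed as debug=parse_log to parse, and the file can be inspected once parsing has finished.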

    9. Where to go from here?

    The examples directory of the PLY distribution contains several simple examples. Please consult a