Author: Eli Bendersky
pycparser is a parser for the C language, written in pure Python. It is a module designed to be easily integrated into applications that need to parse C source code.
Anything that needs C code to be parsed. The following are some uses for pycparser, taken from real user reports:
pycparser is unique in the sense that it's written in pure Python - a very high level language that's easy to experiment with and tweak. To people familiar with Lex and Yacc, pycparser's code will be simple to understand.
pycparser aims to support the full C99 language (according to the standard ISO/IEC 9899). This is a new feature in the version 2.x series - earlier versions only supported C89. For more information on the change, read this wiki page.
pycparser doesn't support any GCC extensions.
pycparser very closely follows the C grammar provided at the end of the C99 standard document.
Drop me an email at eliben@gmail.com with any questions regarding pycparser. To report problems with pycparser or submit feature requests, the best way is to open an issue on the pycparser page at Google Code.
Installing pycparser is very simple. Once you download it from its website and unzip the package, you just have to execute the standard python setup.py install. The setup script will then place the pycparser module into the site-packages directory of your Python installation.
It's recommended to run _build_tables.py in the pycparser code directory after installation to make sure the parsing tables of PLY are pre-generated. This can make your code run faster.
In order to be compilable, C code must be preprocessed by the C preprocessor - cpp. cpp handles preprocessing directives like #include and #define, removes comments, and does other minor tasks that prepare the C code for compilation.
For all but the most trivial snippets of C code, pycparser, like a C compiler, must receive preprocessed C code in order to function correctly. If you import the top-level parse_file function from the pycparser package, it will interact with cpp for you, as long as it's in your PATH, or you provide a path to it.
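As a rough illustration of the two entry points described above, the sketch below parses a trivial snippet directly and wraps parse_file for files that need preprocessing. The helper names parse_snippet and parse_with_cpp are made up for this example, and the second helper assumes a working cpp is reachable through cpp_path.

```python
from pycparser import c_parser, parse_file


def parse_snippet(source):
    """Parse C source text that needs no preprocessing and return its AST.

    Hypothetical helper: only suitable for trivial snippets without
    #include, #define or comments.
    """
    parser = c_parser.CParser()
    return parser.parse(source, filename='<snippet>')


def parse_with_cpp(path, cpp_path='cpp'):
    """Let pycparser run cpp on the file at `path`, then parse the output.

    Hypothetical helper: assumes `cpp_path` names an executable that is in
    PATH, or is a full path to one.
    """
    return parse_file(path, use_cpp=True, cpp_path=cpp_path)


if __name__ == '__main__':
    # Print the AST of a one-line declaration.
    parse_snippet('int x = 1;').show()
```

For real source files, parse_with_cpp('myfile.c') would be the usual route; myfile.c is a placeholder name.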
On the vast majority of Linux systems, cpp is installed and is in the PATH. If you're on Windows and don't have cpp somewhere, you can use the one provided in the utils directory in pycparser's distribution. This cpp executable was compiled from the LCC distribution, and is provided under LCC's license terms.
C code almost always includes various header files from the standard C library, like stdio.h. While, with some effort, pycparser can be made to parse the standard headers from any C compiler, it's much simpler to use the provided "fake" standard includes in utils/fake_libc_include. These are standard C header files that contain only the bare necessities to allow valid parsing of the files that use them. As a bonus, since they're minimal, they can significantly speed up the parsing of C files.
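A minimal sketch of pointing cpp at the fake headers might look like the following. The helper names and the pycparser_root parameter are assumptions for illustration; pycparser_root is taken to be the directory where the distribution was unpacked, so that utils/fake_libc_include lives beneath it.

```python
import os

from pycparser import parse_file


def fake_libc_cpp_args(pycparser_root, extra_args=()):
    """Build the cpp argument list that adds the fake libc include directory.

    Hypothetical helper: `pycparser_root` is assumed to be the unpacked
    pycparser distribution directory.
    """
    include_dir = os.path.join(pycparser_root, 'utils', 'fake_libc_include')
    return ['-I' + include_dir] + list(extra_args)


def parse_with_fake_libc(c_file, pycparser_root, cpp_path='cpp'):
    """Preprocess `c_file` against the fake headers, then parse it."""
    return parse_file(c_file,
                      use_cpp=True,
                      cpp_path=cpp_path,
                      cpp_args=fake_libc_cpp_args(pycparser_root))
```

Extra preprocessor options, such as a -D definition, can be passed through extra_args if a file needs them.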
See the using_cpp_libc.py example for more details.
Take a look at the examples directory of the distribution for a few examples of using pycparser. These should be enough to get you started.
The public interface of pycparser is well documented with comments in pycparser/c_parser.py. For a detailed overview of the various AST nodes created by the parser, see pycparser/_c_ast.cfg.
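To give a feel for working with the AST nodes, here is a small sketch that collects the names of all functions defined in a piece of C code by subclassing c_ast.NodeVisitor. The class and function names are invented for this example, and the input is assumed to be already preprocessed.

```python
from pycparser import c_ast, c_parser


class FuncDefCollector(c_ast.NodeVisitor):
    """Record the name of every function definition encountered."""

    def __init__(self):
        self.names = []

    def visit_FuncDef(self, node):
        # A FuncDef node keeps its declaration (including the name) in .decl
        self.names.append(node.decl.name)


def function_names(source):
    """Return the names of the functions defined in `source`, in order."""
    ast = c_parser.CParser().parse(source)
    collector = FuncDefCollector()
    collector.visit(ast)
    return collector.names


if __name__ == '__main__':
    code = 'int f(int a) { return a; }\nint main(void) { return f(1); }'
    print(function_names(code))
```

The same pattern works for other node types: define a visit_<NodeName> method for each kind of node you care about.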
There's also a FAQ available here. In any case, you can always drop me an email for help.
There are a few points to keep in mind when modifying pycparser:
Once you unzip the pycparser package, you'll see the following files and directories:
Some people have contributed to pycparser by opening issues on bugs they've found and/or submitting patches. The list of contributors is at this pycparser Wiki page.