author | nigel <nigel@2f5784b3-3f2a-0410-8824-cb99058d5e15> | 2007-02-24 21:40:45 +0000 |
---|---|---|
committer | nigel <nigel@2f5784b3-3f2a-0410-8824-cb99058d5e15> | 2007-02-24 21:40:45 +0000 |
commit | 97cb05691b9cabed35f1a853c74d48c692aaabcf (patch) | |
tree | cb7c68a44f0b79c6d90d9a18a7ec640c8435a5e7 /README | |
parent | 455fcc7e13a175722acfd2cca6ab99caa9606a22 (diff) | |
download | pcre-97cb05691b9cabed35f1a853c74d48c692aaabcf.tar.gz |
Load pcre-6.0 into code/trunk.
git-svn-id: svn://vcs.exim.org/pcre/code/trunk@77 2f5784b3-3f2a-0410-8824-cb99058d5e15
Diffstat (limited to 'README')
-rw-r--r-- | README | 186 |
1 files changed, 124 insertions, 62 deletions
@@ -7,14 +7,22 @@ The latest release of PCRE is always available from Please read the NEWS file if you are upgrading from a previous release. -PCRE has its own native API, but a set of "wrapper" functions that are based on -the POSIX API are also supplied in the library libpcreposix. Note that this -just provides a POSIX calling interface to PCRE: the regular expressions -themselves still follow Perl syntax and semantics. The header file -for the POSIX-style functions is called pcreposix.h. The official POSIX name is -regex.h, but I didn't want to risk possible problems with existing files of -that name by distributing it that way. To use it with an existing program that -uses the POSIX API, it will have to be renamed or pointed at by a link. + +The PCRE APIs +------------- + +PCRE is written in C, and it has its own API. The distribution now includes a +set of C++ wrapper functions, courtesy of Google Inc. (see the pcrecpp man page +for details). + +Also included are a set of C wrapper functions that are based on the POSIX +API. These end up in the library called libpcreposix. Note that this just +provides a POSIX calling interface to PCRE: the regular expressions themselves +still follow Perl syntax and semantics. The header file for the POSIX-style +functions is called pcreposix.h. The official POSIX name is regex.h, but I +didn't want to risk possible problems with existing files of that name by +distributing it that way. To use it with an existing program that uses the +POSIX API, it will have to be renamed or pointed at by a link. If you are using the POSIX interface to PCRE and there is already a POSIX regex library installed on your system, you must take care when linking programs to @@ -112,7 +120,7 @@ library. You can read more about them in the pcrebuild man page. on the "configure" command. -. PCRE has a counter which can be set to limit the amount of resources it uses. +. PCRE has a counter that can be set to limit the amount of resources it uses. 
If the limit is exceeded during a match, the match fails. The default is ten million. You can change the default by setting, for example, @@ -130,31 +138,56 @@ library. You can read more about them in the pcrebuild man page. is a representation of the compiled pattern, and this changes with the link size. -. You can build PCRE so that its match() function does not call itself - recursively. Instead, it uses blocks of data from the heap via special - functions pcre_stack_malloc() and pcre_stack_free() to save data that would - otherwise be saved on the stack. To build PCRE like this, use +. You can build PCRE so that its internal match() function that is called from + pcre_exec() does not call itself recursively. Instead, it uses blocks of data + from the heap via special functions pcre_stack_malloc() and pcre_stack_free() + to save data that would otherwise be saved on the stack. To build PCRE like + this, use --disable-stack-for-recursion on the "configure" command. PCRE runs more slowly in this mode, but it may be - necessary in environments with limited stack sizes. + necessary in environments with limited stack sizes. This applies only to the + pcre_exec() function; it does not apply to pcre_dfa_exec(), which does not + use deeply nested recursion. -The "configure" script builds seven files: +The "configure" script builds eight files for the basic C library: -. pcre.h is build by copying pcre.in and making substitutions -. Makefile is built by copying Makefile.in and making substitutions. -. config.h is built by copying config.in and making substitutions. -. pcre-config is built by copying pcre-config.in and making substitutions. -. libpcre.pc is data for the pkg-config command, built from libpcre.pc.in +. pcre.h is the header file for C programs that call PCRE +. Makefile is the makefile that builds the library +. config.h contains build-time configuration options for the library +. pcre-config is a script that shows the settings of "configure" options +. 
libpcre.pc is data for the pkg-config command . libtool is a script that builds shared and/or static libraries -. RunTest is a script for running tests +. RunTest is a script for running tests on the library +. RunGrepTest is a script for running tests on the pcregrep command + +In addition, if a C++ compiler is found, the following are also built: -Once "configure" has run, you can run "make". It builds two libraries called +. pcrecpp.h is the header file for programs that call PCRE via the C++ wrapper +. pcre_stringpiece.h is the header for the C++ "stringpiece" functions + +The "configure" script also creates config.status, which is an executable +script that can be run to recreate the configuration, and config.log, which +contains compiler output from tests that "configure" runs. + +Once "configure" has run, you can run "make". It builds two libraries, called libpcre and libpcreposix, a test program called pcretest, and the pcregrep -command. You can use "make install" to copy these, the public header files -pcre.h and pcreposix.h, and the man pages to appropriate live directories on -your system, in the normal way. +command. If a C++ compiler was found on your system, it also builds the C++ +wrapper library, which is called libpcrecpp, and some test programs called +pcrecpp_unittest, pcre_scanner_unittest, and pcre_stringpiece_unittest. + +The command "make test" runs all the appropriate tests. Details of the PCRE +tests are given in a separate section of this document, below. + +You can use "make install" to copy the libraries, the public header files +pcre.h, pcreposix.h, pcrecpp.h, and pcre_stringpiece.h (the last two only if +the C++ wrapper was built), and the man pages to appropriate live directories +on your system, in the normal way. + +If you want to remove PCRE from your system, you can run "make uninstall". +This removes all the files that "make install" installed. 
However, it does not +remove any directories, because these are often shared with other programs. Retrieving configuration information on Unix-like systems @@ -187,9 +220,9 @@ pkgconfig. Shared libraries on Unix-like systems ------------------------------------- -The default distribution builds PCRE as two shared libraries and two static -libraries, as long as the operating system supports shared libraries. Shared -library support relies on the "libtool" script which is built as part of the +The default distribution builds PCRE as shared libraries and static libraries, +as long as the operating system supports shared libraries. Shared library +support relies on the "libtool" script which is built as part of the "configure" process. The libtool script is used to compile and link both shared and static @@ -218,7 +251,8 @@ order to cross-compile PCRE for some other host. However, during the building process, the dftables.c source file is compiled *and run* on the local host, in order to generate the default character tables (the chartables.c file). It therefore needs to be compiled with the local compiler, not the cross compiler. -You can do this by specifying CC_FOR_BUILD (and if necessary CFLAGS_FOR_BUILD) +You can do this by specifying CC_FOR_BUILD (and if necessary CFLAGS_FOR_BUILD; +there are also CXX_FOR_BUILD and CXXFLAGS_FOR_BUILD for the C++ wrapper) when calling the "configure" command. If they are not specified, they default to the values of CC and CFLAGS. @@ -240,13 +274,19 @@ Testing PCRE ------------ To test PCRE on a Unix system, run the RunTest script that is created by the -configuring process. (This can also be run by "make runtest", "make check", or -"make test".) For other systems, see the instructions in NON-UNIX-USE. 
- -The script runs the pcretest test program (which is documented in its own man -page) on each of the testinput files (in the testdata directory) in turn, -and compares the output with the contents of the corresponding testoutput file. -A file called testtry is used to hold the main output from pcretest +configuring process. There is also a script called RunGrepTest that tests the +options of the pcregrep command. If the C++ wrapper library is build, three +test programs called pcrecpp_unittest, pcre_scanner_unittest, and +pcre_stringpiece_unittest are provided. + +Both the scripts and all the program tests are run if you obey "make runtest", +"make check", or "make test". For other systems, see the instructions in +NON-UNIX-USE. + +The RunTest script runs the pcretest test program (which is documented in its +own man page) on each of the testinput files (in the testdata directory) in +turn, and compares the output with the contents of the corresponding testoutput +file. A file called testtry is used to hold the main output from pcretest (testsavedregex is also used as a working file). To run pcretest on just one of the test files, give its number as an argument to RunTest, for example: @@ -294,9 +334,14 @@ commented in the script, can be be used.) The fifth test checks error handling with UTF-8 encoding, and internal UTF-8 features of PCRE that are not relevant to Perl. -The sixth and final test checks the support for Unicode character properties. -It it not run automatically unless PCRE is built with Unicode property support. -To to this you must set --enable-unicode-properties when running "configure". +The sixth and test checks the support for Unicode character properties. It it +not run automatically unless PCRE is built with Unicode property support. To to +this you must set --enable-unicode-properties when running "configure". 
+ +The seventh, eighth, and ninth tests check the pcre_dfa_exec() alternative +matching function, in non-UTF-8 mode, UTF-8 mode, and UTF-8 mode with Unicode +property support, respectively. The eighth and ninth tests are not run +automatically unless PCRE is build with the relevant support. Character tables @@ -348,14 +393,27 @@ The distribution should contain the following files: dftables.c auxiliary program for building chartables.c - get.c ) - maketables.c ) - study.c ) source of the functions - pcre.c ) in the library pcreposix.c ) - printint.c ) - - ucp.c ) + pcre_compile.c ) + pcre_config.c ) + pcre_dfa_exec.c ) + pcre_exec.c ) + pcre_fullinfo.c ) + pcre_get.c ) sources for the functions in the library, + pcre_globals.c ) and some internal functions that they use + pcre_info.c ) + pcre_maketables.c ) + pcre_ord2utf8.c ) + pcre_printint.c ) + pcre_study.c ) + pcre_tables.c ) + pcre_try_flipped.c ) + pcre_ucp_findchar.c ) + pcre_valid_utf8.c ) + pcre_version.c ) + pcre_xclass.c ) + + ucp_findchar.c ) ucp.h ) source for the code that is used for ucpinternal.h ) Unicode property handling ucptable.c ) @@ -364,9 +422,17 @@ The distribution should contain the following files: pcre.in "source" for the header for the external API; pcre.h is built from this by "configure" pcreposix.h header for the external POSIX wrapper API - internal.h header for internal use + pcre_internal.h header for internal use config.in template for config.h, which is built by configure + pcrecpp.h.in "source" for the header file for the C++ wrapper + pcrecpp.cc ) + pcre_scanner.cc ) source for the C++ wrapper library + + pcre_stringpiece.h.in "source" for pcre_stringpiece.h, the header for the + C++ stringpiece functions + pcre_stringpiece.cc source for the C++ stringpiece functions + (B) Auxiliary files: AUTHORS information about the author of PCRE @@ -379,6 +445,7 @@ The distribution should contain the following files: NON-UNIX-USE notes on building PCRE on non-Unix systems README this file 
RunTest.in template for a Unix shell script for running tests + RunGrepTest.in template for a Unix shell script for pcregrep tests config.guess ) files used by libtool, config.sub ) used only when building a shared library configure a configuring shell script (built by autoconf) @@ -399,22 +466,15 @@ The distribution should contain the following files: perltest Perl test program pcregrep.c source of a grep utility that uses PCRE pcre-config.in source of script which retains PCRE information - testdata/testinput1 test data, compatible with Perl - testdata/testinput2 test data for error messages and non-Perl things - testdata/testinput3 test data for locale-specific tests - testdata/testinput4 test data for UTF-8 tests compatible with Perl - testdata/testinput5 test data for other UTF-8 tests - testdata/testinput6 test data for Unicode property support tests - testdata/testoutput1 test results corresponding to testinput1 - testdata/testoutput2 test results corresponding to testinput2 - testdata/testoutput3 test results corresponding to testinput3 - testdata/testoutput4 test results corresponding to testinput4 - testdata/testoutput5 test results corresponding to testinput5 - testdata/testoutput6 test results corresponding to testinput6 + pcrecpp_unittest.c ) + pcre_scanner_unittest.c ) test programs for the C++ wrapper + pcre_stringpiece_unittest.c ) + testdata/testinput* test data for main library tests + testdata/testoutput* expected test results + testdata/grep* input and output for pcregrep tests (C) Auxiliary files for Win32 DLL - dll.mk libpcre.def libpcreposix.def pcre.def @@ -423,5 +483,7 @@ The distribution should contain the following files: makevp.bat -Philip Hazel <ph10@cam.ac.uk> -September 2004 +Philip Hazel +Email local part: ph10 +Email domain: cam.ac.uk +June 2005 |