author nigel <nigel@2f5784b3-3f2a-0410-8824-cb99058d5e15> 2007-02-24 21:40:45 +0000
committer nigel <nigel@2f5784b3-3f2a-0410-8824-cb99058d5e15> 2007-02-24 21:40:45 +0000
commit 97cb05691b9cabed35f1a853c74d48c692aaabcf
tree cb7c68a44f0b79c6d90d9a18a7ec640c8435a5e7 /README
parent 455fcc7e13a175722acfd2cca6ab99caa9606a22
Load pcre-6.0 into code/trunk.
git-svn-id: svn://vcs.exim.org/pcre/code/trunk@77 2f5784b3-3f2a-0410-8824-cb99058d5e15
Diffstat (limited to 'README')
-rw-r--r-- README | 186
1 files changed, 124 insertions, 62 deletions
diff --git a/README b/README
index fc5397e..f8d63bb 100644
--- a/README
+++ b/README
@@ -7,14 +7,22 @@ The latest release of PCRE is always available from
Please read the NEWS file if you are upgrading from a previous release.
-PCRE has its own native API, but a set of "wrapper" functions that are based on
-the POSIX API are also supplied in the library libpcreposix. Note that this
-just provides a POSIX calling interface to PCRE: the regular expressions
-themselves still follow Perl syntax and semantics. The header file
-for the POSIX-style functions is called pcreposix.h. The official POSIX name is
-regex.h, but I didn't want to risk possible problems with existing files of
-that name by distributing it that way. To use it with an existing program that
-uses the POSIX API, it will have to be renamed or pointed at by a link.
+
+The PCRE APIs
+-------------
+
+PCRE is written in C, and it has its own API. The distribution now includes a
+set of C++ wrapper functions, courtesy of Google Inc. (see the pcrecpp man page
+for details).
+
+Also included are a set of C wrapper functions that are based on the POSIX
+API. These end up in the library called libpcreposix. Note that this just
+provides a POSIX calling interface to PCRE: the regular expressions themselves
+still follow Perl syntax and semantics. The header file for the POSIX-style
+functions is called pcreposix.h. The official POSIX name is regex.h, but I
+didn't want to risk possible problems with existing files of that name by
+distributing it that way. To use it with an existing program that uses the
+POSIX API, it will have to be renamed or pointed at by a link.
If you are using the POSIX interface to PCRE and there is already a POSIX regex
library installed on your system, you must take care when linking programs to
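
Note on the POSIX wrapper described in this hunk: the following is a minimal sketch of the kind of calling interface the paragraph refers to. It is written against the system's regex.h so that it compiles without PCRE. Switching the include to pcreposix.h and linking with libpcreposix is an assumption drawn from the README text, not something verified here. The function name first_number and the pattern are hypothetical; the pattern [0-9]+ was picked because it should mean the same thing in POSIX extended syntax and in Perl syntax.

```c
#include <assert.h>
#include <regex.h>    /* assumption: with PCRE's wrapper this would be pcreposix.h */
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the first run of decimal digits found in
 * subject into out (always NUL-terminated, truncated to fit).
 * Returns the number of characters copied, 0 if there is no match,
 * or -1 if the pattern fails to compile. */
int first_number(const char *subject, char *out, size_t outlen)
{
    regex_t re;
    regmatch_t m[1];
    int len = 0;

    assert(subject != NULL && out != NULL && outlen > 0);
    out[0] = '\0';

    /* REG_EXTENDED is needed by the system library for "+". */
    if (regcomp(&re, "[0-9]+", REG_EXTENDED) != 0)
        return -1;

    if (regexec(&re, subject, 1, m, 0) == 0) {
        len = (int)(m[0].rm_eo - m[0].rm_so);
        if ((size_t)len >= outlen)
            len = (int)outlen - 1;          /* leave room for the NUL */
        memcpy(out, subject + m[0].rm_so, (size_t)len);
        out[len] = '\0';
    }

    regfree(&re);
    return len;
}
```

Calling first_number("pcre-6.0", buf, sizeof buf) stores "6" in buf and returns 1. The calls themselves (regcomp, regexec, regfree) are the standard POSIX ones; only the header and library named at build time would differ when PCRE's wrapper is used.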
@@ -112,7 +120,7 @@ library. You can read more about them in the pcrebuild man page.
on the "configure" command.
-. PCRE has a counter which can be set to limit the amount of resources it uses.
+. PCRE has a counter that can be set to limit the amount of resources it uses.
If the limit is exceeded during a match, the match fails. The default is ten
million. You can change the default by setting, for example,
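
Note on the resource counter mentioned in this hunk: the sketch below shows the general idea of a match limit on a toy backtracking matcher. It is not PCRE's code. It assumes the counter tracks calls to the recursive matching function, and it supports only literals, '.', and '*'. All names (limited_match, match_budget, LIMIT_HIT) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MATCH       1
#define NOMATCH     0
#define LIMIT_HIT (-1)

/* Hypothetical budget: how many match_here() calls have been made,
 * and how many are allowed before the match is abandoned. */
typedef struct {
    unsigned long calls;
    unsigned long limit;
} match_budget;

static int match_here(const char *re, const char *text, match_budget *b);

/* Match c* followed by re: consume greedily, then back off one
 * character at a time until the rest of the pattern matches. */
static int match_star(int c, const char *re, const char *text, match_budget *b)
{
    const char *t = text;

    while (*t != '\0' && (c == '.' || *t == c))
        t++;
    for (;;) {
        int rc = match_here(re, t, b);
        if (rc != NOMATCH)
            return rc;              /* MATCH or LIMIT_HIT */
        if (t == text)
            return NOMATCH;
        t--;
    }
}

/* Match re against the whole of text, charging one unit per call. */
static int match_here(const char *re, const char *text, match_budget *b)
{
    if (++b->calls > b->limit)
        return LIMIT_HIT;
    if (re[0] == '\0')
        return (*text == '\0') ? MATCH : NOMATCH;
    if (re[1] == '*')
        return match_star(re[0], re + 2, text, b);
    if (*text != '\0' && (re[0] == '.' || re[0] == *text))
        return match_here(re + 1, text + 1, b);
    return NOMATCH;
}

/* Entry point: returns MATCH, NOMATCH, or LIMIT_HIT. */
int limited_match(const char *re, const char *text, unsigned long limit)
{
    match_budget b = { 0, limit };

    assert(re != NULL && text != NULL);
    return match_here(re, text, &b);
}
```

A pattern such as a*a*a*a*b run against twenty "a" characters backtracks heavily before failing. With a limit of 100 the sketch gives up and returns LIMIT_HIT, while a limit of ten million lets it run to an ordinary NOMATCH.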
@@ -130,31 +138,56 @@ library. You can read more about them in the pcrebuild man page.
is a representation of the compiled pattern, and this changes with the link
size.
-. You can build PCRE so that its match() function does not call itself
- recursively. Instead, it uses blocks of data from the heap via special
- functions pcre_stack_malloc() and pcre_stack_free() to save data that would
- otherwise be saved on the stack. To build PCRE like this, use
+. You can build PCRE so that its internal match() function that is called from
+ pcre_exec() does not call itself recursively. Instead, it uses blocks of data
+ from the heap via special functions pcre_stack_malloc() and pcre_stack_free()
+ to save data that would otherwise be saved on the stack. To build PCRE like
+ this, use
--disable-stack-for-recursion
on the "configure" command. PCRE runs more slowly in this mode, but it may be
- necessary in environments with limited stack sizes.
+ necessary in environments with limited stack sizes. This applies only to the
+ pcre_exec() function; it does not apply to pcre_dfa_exec(), which does not
+ use deeply nested recursion.
-The "configure" script builds seven files:
+The "configure" script builds eight files for the basic C library:
-. pcre.h is build by copying pcre.in and making substitutions
-. Makefile is built by copying Makefile.in and making substitutions.
-. config.h is built by copying config.in and making substitutions.
-. pcre-config is built by copying pcre-config.in and making substitutions.
-. libpcre.pc is data for the pkg-config command, built from libpcre.pc.in
+. pcre.h is the header file for C programs that call PCRE
+. Makefile is the makefile that builds the library
+. config.h contains build-time configuration options for the library
+. pcre-config is a script that shows the settings of "configure" options
+. libpcre.pc is data for the pkg-config command
. libtool is a script that builds shared and/or static libraries
-. RunTest is a script for running tests
+. RunTest is a script for running tests on the library
+. RunGrepTest is a script for running tests on the pcregrep command
+
+In addition, if a C++ compiler is found, the following are also built:
-Once "configure" has run, you can run "make". It builds two libraries called
+. pcrecpp.h is the header file for programs that call PCRE via the C++ wrapper
+. pcre_stringpiece.h is the header for the C++ "stringpiece" functions
+
+The "configure" script also creates config.status, which is an executable
+script that can be run to recreate the configuration, and config.log, which
+contains compiler output from tests that "configure" runs.
+
+Once "configure" has run, you can run "make". It builds two libraries, called
libpcre and libpcreposix, a test program called pcretest, and the pcregrep
-command. You can use "make install" to copy these, the public header files
-pcre.h and pcreposix.h, and the man pages to appropriate live directories on
-your system, in the normal way.
+command. If a C++ compiler was found on your system, it also builds the C++
+wrapper library, which is called libpcrecpp, and some test programs called
+pcrecpp_unittest, pcre_scanner_unittest, and pcre_stringpiece_unittest.
+
+The command "make test" runs all the appropriate tests. Details of the PCRE
+tests are given in a separate section of this document, below.
+
+You can use "make install" to copy the libraries, the public header files
+pcre.h, pcreposix.h, pcrecpp.h, and pcre_stringpiece.h (the last two only if
+the C++ wrapper was built), and the man pages to appropriate live directories
+on your system, in the normal way.
+
+If you want to remove PCRE from your system, you can run "make uninstall".
+This removes all the files that "make install" installed. However, it does not
+remove any directories, because these are often shared with other programs.
Retrieving configuration information on Unix-like systems
@@ -187,9 +220,9 @@ pkgconfig.
Shared libraries on Unix-like systems
-------------------------------------
-The default distribution builds PCRE as two shared libraries and two static
-libraries, as long as the operating system supports shared libraries. Shared
-library support relies on the "libtool" script which is built as part of the
+The default distribution builds PCRE as shared libraries and static libraries,
+as long as the operating system supports shared libraries. Shared library
+support relies on the "libtool" script which is built as part of the
"configure" process.
The libtool script is used to compile and link both shared and static
@@ -218,7 +251,8 @@ order to cross-compile PCRE for some other host. However, during the building
process, the dftables.c source file is compiled *and run* on the local host, in
order to generate the default character tables (the chartables.c file). It
therefore needs to be compiled with the local compiler, not the cross compiler.
-You can do this by specifying CC_FOR_BUILD (and if necessary CFLAGS_FOR_BUILD)
+You can do this by specifying CC_FOR_BUILD (and if necessary CFLAGS_FOR_BUILD;
+there are also CXX_FOR_BUILD and CXXFLAGS_FOR_BUILD for the C++ wrapper)
when calling the "configure" command. If they are not specified, they default
to the values of CC and CFLAGS.
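
Note on why dftables.c must be built with the local compiler: it is a generator that has to execute during the build. The sketch below illustrates that pattern with a single lower-casing table. It is not the contents of dftables.c, and the names build_lower_table and emit_table are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Fill table so that table[c] is the lower-case form of c in the
 * current locale (the "C" locale unless setlocale() has been called). */
void build_lower_table(unsigned char table[256])
{
    int i;

    assert(table != NULL);
    for (i = 0; i < 256; i++)
        table[i] = (unsigned char)tolower(i);
}

/* Write the table as C source, eight entries per line, so that the
 * output can be compiled into the library in a later build step. */
void emit_table(FILE *out, const char *name, const unsigned char table[256])
{
    int i;

    assert(out != NULL && name != NULL && table != NULL);
    fprintf(out, "static const unsigned char %s[256] = {\n", name);
    for (i = 0; i < 256; i++)
        fprintf(out, "%s%3d%s",
                (i % 8 == 0) ? "  " : "",
                table[i],
                (i == 255) ? "\n" : (i % 8 == 7) ? ",\n" : ", ");
    fprintf(out, "};\n");
}
```

A generator like this produces source text for the target but runs on the machine doing the build, which is why a cross compiler cannot be used for it and a separate CC_FOR_BUILD setting is needed.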
@@ -240,13 +274,19 @@ Testing PCRE
------------
To test PCRE on a Unix system, run the RunTest script that is created by the
-configuring process. (This can also be run by "make runtest", "make check", or
-"make test".) For other systems, see the instructions in NON-UNIX-USE.
-
-The script runs the pcretest test program (which is documented in its own man
-page) on each of the testinput files (in the testdata directory) in turn,
-and compares the output with the contents of the corresponding testoutput file.
-A file called testtry is used to hold the main output from pcretest
+configuring process. There is also a script called RunGrepTest that tests the
+options of the pcregrep command. If the C++ wrapper library is built, three
+test programs called pcrecpp_unittest, pcre_scanner_unittest, and
+pcre_stringpiece_unittest are provided.
+
+Both the scripts and all the program tests are run if you obey "make runtest",
+"make check", or "make test". For other systems, see the instructions in
+NON-UNIX-USE.
+
+The RunTest script runs the pcretest test program (which is documented in its
+own man page) on each of the testinput files (in the testdata directory) in
+turn, and compares the output with the contents of the corresponding testoutput
+file. A file called testtry is used to hold the main output from pcretest
(testsavedregex is also used as a working file). To run pcretest on just one of
the test files, give its number as an argument to RunTest, for example:
@@ -294,9 +334,14 @@ commented in the script, can be be used.)
The fifth test checks error handling with UTF-8 encoding, and internal UTF-8
features of PCRE that are not relevant to Perl.
-The sixth and final test checks the support for Unicode character properties.
-It it not run automatically unless PCRE is built with Unicode property support.
-To to this you must set --enable-unicode-properties when running "configure".
+The sixth test checks the support for Unicode character properties. It is not
+run automatically unless PCRE is built with Unicode property support. To do
+this you must set --enable-unicode-properties when running "configure".
+
+The seventh, eighth, and ninth tests check the pcre_dfa_exec() alternative
+matching function, in non-UTF-8 mode, UTF-8 mode, and UTF-8 mode with Unicode
+property support, respectively. The eighth and ninth tests are not run
+automatically unless PCRE is built with the relevant support.
Character tables
@@ -348,14 +393,27 @@ The distribution should contain the following files:
dftables.c auxiliary program for building chartables.c
- get.c )
- maketables.c )
- study.c ) source of the functions
- pcre.c ) in the library
pcreposix.c )
- printint.c )
-
- ucp.c )
+ pcre_compile.c )
+ pcre_config.c )
+ pcre_dfa_exec.c )
+ pcre_exec.c )
+ pcre_fullinfo.c )
+ pcre_get.c ) sources for the functions in the library,
+ pcre_globals.c ) and some internal functions that they use
+ pcre_info.c )
+ pcre_maketables.c )
+ pcre_ord2utf8.c )
+ pcre_printint.c )
+ pcre_study.c )
+ pcre_tables.c )
+ pcre_try_flipped.c )
+ pcre_ucp_findchar.c )
+ pcre_valid_utf8.c )
+ pcre_version.c )
+ pcre_xclass.c )
+
+ ucp_findchar.c )
ucp.h ) source for the code that is used for
ucpinternal.h ) Unicode property handling
ucptable.c )
@@ -364,9 +422,17 @@ The distribution should contain the following files:
pcre.in "source" for the header for the external API; pcre.h
is built from this by "configure"
pcreposix.h header for the external POSIX wrapper API
- internal.h header for internal use
+ pcre_internal.h header for internal use
config.in template for config.h, which is built by configure
+ pcrecpp.h.in "source" for the header file for the C++ wrapper
+ pcrecpp.cc )
+ pcre_scanner.cc ) source for the C++ wrapper library
+
+ pcre_stringpiece.h.in "source" for pcre_stringpiece.h, the header for the
+ C++ stringpiece functions
+ pcre_stringpiece.cc source for the C++ stringpiece functions
+
(B) Auxiliary files:
AUTHORS information about the author of PCRE
@@ -379,6 +445,7 @@ The distribution should contain the following files:
NON-UNIX-USE notes on building PCRE on non-Unix systems
README this file
RunTest.in template for a Unix shell script for running tests
+ RunGrepTest.in template for a Unix shell script for pcregrep tests
config.guess ) files used by libtool,
config.sub ) used only when building a shared library
configure a configuring shell script (built by autoconf)
@@ -399,22 +466,15 @@ The distribution should contain the following files:
perltest Perl test program
pcregrep.c source of a grep utility that uses PCRE
pcre-config.in source of script which retains PCRE information
- testdata/testinput1 test data, compatible with Perl
- testdata/testinput2 test data for error messages and non-Perl things
- testdata/testinput3 test data for locale-specific tests
- testdata/testinput4 test data for UTF-8 tests compatible with Perl
- testdata/testinput5 test data for other UTF-8 tests
- testdata/testinput6 test data for Unicode property support tests
- testdata/testoutput1 test results corresponding to testinput1
- testdata/testoutput2 test results corresponding to testinput2
- testdata/testoutput3 test results corresponding to testinput3
- testdata/testoutput4 test results corresponding to testinput4
- testdata/testoutput5 test results corresponding to testinput5
- testdata/testoutput6 test results corresponding to testinput6
+ pcrecpp_unittest.c )
+ pcre_scanner_unittest.c ) test programs for the C++ wrapper
+ pcre_stringpiece_unittest.c )
+ testdata/testinput* test data for main library tests
+ testdata/testoutput* expected test results
+ testdata/grep* input and output for pcregrep tests
(C) Auxiliary files for Win32 DLL
- dll.mk
libpcre.def
libpcreposix.def
pcre.def
@@ -423,5 +483,7 @@ The distribution should contain the following files:
makevp.bat
-Philip Hazel <ph10@cam.ac.uk>
-September 2004
+Philip Hazel
+Email local part: ph10
+Email domain: cam.ac.uk
+June 2005