summaryrefslogtreecommitdiff
path: root/srclib/pcre/README
diff options
context:
space:
mode:
Diffstat (limited to 'srclib/pcre/README')
-rw-r--r--srclib/pcre/README174
1 files changed, 121 insertions, 53 deletions
diff --git a/srclib/pcre/README b/srclib/pcre/README
index 90aaf4d6a6..7557374791 100644
--- a/srclib/pcre/README
+++ b/srclib/pcre/README
@@ -7,30 +7,73 @@ The latest release of PCRE is always available from
Please read the NEWS file if you are upgrading from a previous release.
+PCRE has its own native API, but a set of "wrapper" functions that are based on
+the POSIX API are also supplied in the library libpcreposix. Note that this
+just provides a POSIX calling interface to PCRE: the regular expressions
+themselves still follow Perl syntax and semantics. The header file
+for the POSIX-style functions is called pcreposix.h. The official POSIX name is
+regex.h, but I didn't want to risk possible problems with existing files of
+that name by distributing it that way. To use it with an existing program that
+uses the POSIX API, it will have to be renamed or pointed at by a link.
+
+
+Contributions by users of PCRE
+------------------------------
+
+You can find contributions from PCRE users in the directory
+
+ ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/Contrib
+
+where there is also a README file giving brief descriptions of what they are.
+Several of them provide support for compiling PCRE on various flavours of
+Windows systems (I myself do not use Windows). Some are complete in themselves;
+others are pointers to URLs containing relevant files.
+
Building PCRE on a Unix system
------------------------------
-To build PCRE on a Unix system, run the "configure" command in the PCRE
-distribution directory. This is a standard GNU "autoconf" configuration script,
-for which generic instructions are supplied in INSTALL. On many systems just
-running "./configure" is sufficient, but the usual methods of changing standard
-defaults are available. For example
+To build PCRE on a Unix system, first run the "configure" command from the PCRE
+distribution directory, with your current directory set to the directory where
+you want the files to be created. This command is a standard GNU "autoconf"
+configuration script, for which generic instructions are supplied in INSTALL.
+
+Most commonly, people build PCRE within its own distribution directory, and in
+this case, on many systems, just running "./configure" is sufficient, but the
+usual methods of changing standard defaults are available. For example,
CFLAGS='-O2 -Wall' ./configure --prefix=/opt/local
specifies that the C compiler should be run with the flags '-O2 -Wall' instead
of the default, and that "make install" should install PCRE under /opt/local
-instead of the default /usr/local. The "configure" script builds thre files:
+instead of the default /usr/local.
+
+If you want to build in a different directory, just run "configure" with that
+directory as current. For example, suppose you have unpacked the PCRE source
+into /source/pcre/pcre-xxx, but you want to build it in /build/pcre/pcre-xxx:
+
+cd /build/pcre/pcre-xxx
+/source/pcre/pcre-xxx/configure
+
+If you want to make use of the experimential, incomplete support for UTF-8
+character strings in PCRE, you must add --enable-utf8 to the "configure"
+command. Without it, the code for handling UTF-8 is not included in the
+library. (Even when included, it still has to be enabled by an option at run
+time.)
+The "configure" script builds five files:
+
+. libtool is a script that builds shared and/or static libraries
. Makefile is built by copying Makefile.in and making substitutions.
. config.h is built by copying config.in and making substitutions.
. pcre-config is built by copying pcre-config.in and making substitutions.
+. RunTest is a script for running tests
Once "configure" has run, you can run "make". It builds two libraries called
-libpcre and libpcreposix, a test program called pcretest, and the pgrep
-command. You can use "make install" to copy these, and the public header file
-pcre.h, to appropriate live directories on your system, in the normal way.
+libpcre and libpcreposix, a test program called pcretest, and the pcregrep
+command. You can use "make install" to copy these, the public header files
+pcre.h and pcreposix.h, and the man pages to appropriate live directories on
+your system, in the normal way.
Running "make install" also installs the command pcre-config, which can be used
to recall information about the PCRE configuration and installation. For
@@ -46,26 +89,38 @@ outputs information about where the library is installed. This command can be
included in makefiles for programs that use PCRE, saving the programmer from
having to remember too many details.
+There is one esoteric feature that is controlled by "configure". It concerns
+the character value used for "newline", and is something that you probably do
+not want to change on a Unix system. The default is to use whatever value your
+compiler gives to '\n'. By using --enable-newline-is-cr or
+--enable-newline-is-lf you can force the value to be CR (13) or LF (10) if you
+really want to.
+
Shared libraries on Unix systems
--------------------------------
-The default distribution builds PCRE as two shared libraries. This support is
-new and experimental and may not work on all systems. It relies on the
-"libtool" scripts - these are distributed with PCRE. It should build a
-"libtool" script and use this to compile and link shared libraries, which are
-placed in a subdirectory called .libs. The programs pcretest and pgrep are
-built to use these uninstalled libraries by means of wrapper scripts. When you
-use "make install" to install shared libraries, pgrep and pcretest are
-automatically re-built to use the newly installed libraries. However, only
-pgrep is installed, as pcretest is really just a test program.
-
-To build PCRE using static libraries you must use --disable-shared when
+The default distribution builds PCRE as two shared libraries and two static
+libraries, as long as the operating system supports shared libraries. Shared
+library support relies on the "libtool" script which is built as part of the
+"configure" process.
+
+The libtool script is used to compile and link both shared and static
+libraries. They are placed in a subdirectory called .libs when they are newly
+built. The programs pcretest and pcregrep are built to use these uninstalled
+libraries (by means of wrapper scripts in the case of shared libraries). When
+you use "make install" to install shared libraries, pcregrep and pcretest are
+automatically re-built to use the newly installed shared libraries before being
+installed themselves. However, the versions left in the source directory still
+use the uninstalled libraries.
+
+To build PCRE using static libraries only you must use --disable-shared when
configuring it. For example
./configure --prefix=/usr/gnu --disable-shared
-Then run "make" in the usual way.
+Then run "make" in the usual way. Similarly, you can use --disable-static to
+build only shared libraries.
Building on non-Unix systems
@@ -81,28 +136,40 @@ Standard C functions.
Testing PCRE
------------
-To test PCRE on a Unix system, run the RunTest script in the pcre directory.
-(This can also be run by "make runtest" or "make check".) For other systems,
-see the instruction in NON-UNIX-USE.
+To test PCRE on a Unix system, run the RunTest script that is created by the
+configuring process. (This can also be run by "make runtest", "make check", or
+"make test".) For other systems, see the instruction in NON-UNIX-USE.
-The script runs the pcretest test program (which is documented in
-doc/pcretest.txt) on each of the testinput files (in the testdata directory) in
-turn, and compares the output with the contents of the corresponding testoutput
-file. A file called testtry is used to hold the output from pcretest. To run
-pcretest on just one of the test files, give its number as an argument to
-RunTest, for example:
+The script runs the pcretest test program (which is documented in the doc
+directory) on each of the testinput files (in the testdata directory) in turn,
+and compares the output with the contents of the corresponding testoutput file.
+A file called testtry is used to hold the output from pcretest. To run pcretest
+on just one of the test files, give its number as an argument to RunTest, for
+example:
RunTest 3
The first and third test files can also be fed directly into the perltest
script to check that Perl gives the same results. The third file requires the
additional features of release 5.005, which is why it is kept separate from the
-main test input, which needs only Perl 5.004. In the long run, when 5.005 is
-widespread, these two test files may get amalgamated.
-
-The second set of tests check pcre_info(), pcre_study(), pcre_copy_substring(),
-pcre_get_substring(), pcre_get_substring_list(), error detection and run-time
-flags that are specific to PCRE, as well as the POSIX wrapper API.
+main test input, which needs only Perl 5.004. In the long run, when 5.005 (or
+higher) is widespread, these two test files may get amalgamated.
+
+The second set of tests check pcre_fullinfo(), pcre_info(), pcre_study(),
+pcre_copy_substring(), pcre_get_substring(), pcre_get_substring_list(), error
+detection, and run-time flags that are specific to PCRE, as well as the POSIX
+wrapper API. It also uses the debugging flag to check some of the internals of
+pcre_compile().
+
+If you build PCRE with a locale setting that is not the standard C locale, the
+character tables may be different (see next paragraph). In some cases, this may
+cause failures in the second set of tests. For example, in a locale where the
+isprint() function yields TRUE for characters in the range 128-255, the use of
+[:isascii:] inside a character class defines a different set of characters, and
+this shows up in this test as a difference in the compiled code, which is being
+listed for checking. Where the comparison test output contains [\x00-\x7f] the
+test will contain [\x00-\xff], and similarly in some other cases. This is not a
+bug in PCRE.
The fourth set of tests checks pcre_maketables(), the facility for building a
set of character tables for a specific locale and using them instead of the
@@ -117,14 +184,10 @@ output to say why. If running this test produces instances of the error
in the comparison output, it means that locale is not available on your system,
despite being listed by "locale". This does not mean that PCRE is broken.
-PCRE has its own native API, but a set of "wrapper" functions that are based on
-the POSIX API are also supplied in the library libpcreposix.a. Note that this
-just provides a POSIX calling interface to PCRE: the regular expressions
-themselves still follow Perl syntax and semantics. The header file
-for the POSIX-style functions is called pcreposix.h. The official POSIX name is
-regex.h, but I didn't want to risk possible problems with existing files of
-that name by distributing it that way. To use it with an existing program that
-uses the POSIX API, it will have to be renamed or pointed at by a link.
+The fifth test checks the experimental, incomplete UTF-8 support. It is not run
+automatically unless PCRE is built with UTF-8 support. This file can be fed
+directly to the perltest8 script, which requires Perl 5.6 or higher. The sixth
+file tests internal UTF-8 features of PCRE that are not relevant to Perl.
Character tables
@@ -197,7 +260,7 @@ The distribution should contain the following files:
NEWS important changes in this release
NON-UNIX-USE notes on building PCRE on non-Unix systems
README this file
- RunTest a Unix shell script for running tests
+ RunTest.in template for a Unix shell script for running tests
config.guess ) files used by libtool,
config.sub ) used only when building a shared library
configure a configuring shell script (built by autoconf)
@@ -211,24 +274,29 @@ The distribution should contain the following files:
doc/pcreposix.txt plain text version
doc/pcretest.txt documentation of test program
doc/perltest.txt documentation of Perl test program
- doc/pgrep.1 man page source for the pgrep utility
- doc/pgrep.html HTML version
- doc/pgrep.txt plain text version
+ doc/pcregrep.1 man page source for the pcregrep utility
+ doc/pcregrep.html HTML version
+ doc/pcregrep.txt plain text version
install-sh a shell script for installing files
- ltconfig ) files used to build "libtool",
- ltmain.sh ) used only when building a shared library
- pcretest.c test program
+ ltmain.sh file used to build a libtool script
+ pcretest.c comprehensive test program
+ pcredemo.c simple demonstration of coding calls to PCRE
perltest Perl test program
- pgrep.c source of a grep utility that uses PCRE
+ perltest8 Perl test program for UTF-8 tests
+ pcregrep.c source of a grep utility that uses PCRE
pcre-config.in source of script which retains PCRE information
testdata/testinput1 test data, compatible with Perl 5.004 and 5.005
testdata/testinput2 test data for error messages and non-Perl things
testdata/testinput3 test data, compatible with Perl 5.005
testdata/testinput4 test data for locale-specific tests
+ testdata/testinput5 test data for UTF-8 tests compatible with Perl 5.6
+ testdata/testinput6 test data for other UTF-8 tests
testdata/testoutput1 test results corresponding to testinput1
testdata/testoutput2 test results corresponding to testinput2
testdata/testoutput3 test results corresponding to testinput3
testdata/testoutput4 test results corresponding to testinput4
+ testdata/testoutput5 test results corresponding to testinput5
+ testdata/testoutput6 test results corresponding to testinput6
(C) Auxiliary files for Win32 DLL
@@ -236,4 +304,4 @@ The distribution should contain the following files:
pcre.def
Philip Hazel <ph10@cam.ac.uk>
-February 2000
+August 2001