author ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15> 2007-09-21 08:37:48 +0000
committer ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15> 2007-09-21 08:37:48 +0000
commit 9c74abda0f1247b4d108930b1a396161cf7a9cb6
tree cd5a2eddfb24b99c2b9a4287b84bb5d60abd6293
parent b63ec439c9129c143040c7a2f8d8e0bbd2822430
Final doc updates and file tidies for 7.4.
git-svn-id: svn://vcs.exim.org/pcre/code/trunk@261 2f5784b3-3f2a-0410-8824-cb99058d5e15
-rw-r--r-- ChangeLog 3
-rw-r--r-- NEWS 4
-rw-r--r-- NON-UNIX-USE 2
-rw-r--r-- doc/html/pcrebuild.html 18
-rw-r--r-- doc/html/pcresyntax.html 8
-rw-r--r-- doc/pcre.txt 179
-rw-r--r-- doc/pcresyntax.3 4
7 files changed, 119 insertions, 99 deletions
diff --git a/ChangeLog b/ChangeLog
index 1edbf26..cff7367 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -72,6 +72,9 @@ Version 7.4 21-Sep-07
and instead check for _strtoi64 explicitly, and avoid the use of snprintf()
entirely. This removes changes made in 7 above.
+17. The CMake files have been updated, and there is now more information about
+ building with CMake in the NON-UNIX-USE document.
+
Version 7.3 28-Aug-07
---------------------
diff --git a/NEWS b/NEWS
index ec470cc..8ef40b0 100644
--- a/NEWS
+++ b/NEWS
@@ -8,7 +8,9 @@ Release 7.4 21-Sep-07
The only change of specification is the addition of options to control whether
\R matches any Unicode line ending (the default) or just CR, LF, and CRLF.
Otherwise, the changes are bug fixes and a refactoring to reduce the number of
-relocations needed in a shared library.
+relocations needed in a shared library. There have also been some documentation
+updates, in particular, some more information about using CMake to build PCRE
+has been added to the NON-UNIX-USE file.
Release 7.3 28-Aug-07
diff --git a/NON-UNIX-USE b/NON-UNIX-USE
index 1a3ca85..fe6cd02 100644
--- a/NON-UNIX-USE
+++ b/NON-UNIX-USE
@@ -169,7 +169,7 @@ fail because of this. Normally, running out of stack causes a crash, but there
have been cases where the test program has just died silently. See your linker
documentation for how to increase stack size if you experience problems. The
Linux default of 8Mb is a reasonable choice for the stack, though even that can
-be too small for some pattern/subject combinations.
+be too small for some pattern/subject combinations.
PCRE has a compile configuration option to disable the use of stack for
recursion so that heap is used instead. However, pattern matching is
diff --git a/doc/html/pcrebuild.html b/doc/html/pcrebuild.html
index 42656ed..4cb9f60 100644
--- a/doc/html/pcrebuild.html
+++ b/doc/html/pcrebuild.html
@@ -33,11 +33,17 @@ man page, in case the conversion went wrong.
<br><a name="SEC1" href="#TOC1">PCRE BUILD-TIME OPTIONS</a><br>
<P>
This document describes the optional features of PCRE that can be selected when
-the library is compiled. They are all selected, or deselected, by providing
-options to the <b>configure</b> script that is run before the <b>make</b>
-command. The complete list of options for <b>configure</b> (which includes the
-standard ones such as the selection of the installation directory) can be
-obtained by running
+the library is compiled. It assumes use of the <b>configure</b> script, where
+the optional features are selected or deselected by providing options to
+<b>configure</b> before running the <b>make</b> command. However, the same
+options can be selected in both Unix-like and non-Unix-like environments using
+the GUI facility of <b>CMakeSetup</b> if you are using <b>CMake</b> instead of
+<b>configure</b> to build PCRE.
+</P>
+<P>
+The complete list of options for <b>configure</b> (which includes the standard
+ones such as the selection of the installation directory) can be obtained by
+running
<pre>
./configure --help
</pre>
@@ -279,7 +285,7 @@ Cambridge CB2 3QH, England.
</P>
<br><a name="SEC16" href="#TOC1">REVISION</a><br>
<P>
-Last updated: 11 September 2007
+Last updated: 21 September 2007
<br>
Copyright &copy; 1997-2007 University of Cambridge.
<br>
diff --git a/doc/html/pcresyntax.html b/doc/html/pcresyntax.html
index a667f2e..cbb1394 100644
--- a/doc/html/pcresyntax.html
+++ b/doc/html/pcresyntax.html
@@ -398,7 +398,8 @@ pattern is not anchored.
</P>
<br><a name="SEC21" href="#TOC1">NEWLINE CONVENTIONS</a><br>
<P>
-These are recognized only at the very start of a pattern.
+These are recognized only at the very start of the pattern or after a
+(*BSR_...) option.
<pre>
(*CR)
(*LF)
@@ -409,7 +410,8 @@ These are recognized only at the very start of a pattern.
</P>
<br><a name="SEC22" href="#TOC1">WHAT \R MATCHES</a><br>
<P>
-These are recognized only at the very start of a pattern.
+These are recognized only at the very start of the pattern or after a
+(*...) option that sets the newline convention.
<pre>
(*BSR_ANYCRLF)
(*BSR_UNICODE)
@@ -438,7 +440,7 @@ Cambridge CB2 3QH, England.
</P>
<br><a name="SEC26" href="#TOC1">REVISION</a><br>
<P>
-Last updated: 11 September 2007
+Last updated: 21 September 2007
<br>
Copyright &copy; 1997-2007 University of Cambridge.
<br>
diff --git a/doc/pcre.txt b/doc/pcre.txt
index 38adb47..fe89473 100644
--- a/doc/pcre.txt
+++ b/doc/pcre.txt
@@ -271,19 +271,24 @@ NAME
PCRE BUILD-TIME OPTIONS
This document describes the optional features of PCRE that can be
- selected when the library is compiled. They are all selected, or dese-
- lected, by providing options to the configure script that is run before
- the make command. The complete list of options for configure (which
- includes the standard ones such as the selection of the installation
- directory) can be obtained by running
+ selected when the library is compiled. It assumes use of the configure
+ script, where the optional features are selected or deselected by pro-
+ viding options to configure before running the make command. However,
+ the same options can be selected in both Unix-like and non-Unix-like
+ environments using the GUI facility of CMakeSetup if you are using
+ CMake instead of configure to build PCRE.
+
+ The complete list of options for configure (which includes the standard
+ ones such as the selection of the installation directory) can be
+ obtained by running
./configure --help
- The following sections include descriptions of options whose names
+ The following sections include descriptions of options whose names
begin with --enable or --disable. These settings specify changes to the
- defaults for the configure command. Because of the way that configure
- works, --enable and --disable always come in pairs, so the complemen-
- tary option always exists as well, but as it specifies the default, it
+ defaults for the configure command. Because of the way that configure
+ works, --enable and --disable always come in pairs, so the complemen-
+ tary option always exists as well, but as it specifies the default, it
is not described.
@@ -304,40 +309,40 @@ UTF-8 SUPPORT
--enable-utf8
- to the configure command. Of itself, this does not make PCRE treat
- strings as UTF-8. As well as compiling PCRE with this option, you also
- have have to set the PCRE_UTF8 option when you call the pcre_compile()
+ to the configure command. Of itself, this does not make PCRE treat
+ strings as UTF-8. As well as compiling PCRE with this option, you also
+ have have to set the PCRE_UTF8 option when you call the pcre_compile()
function.
UNICODE CHARACTER PROPERTY SUPPORT
- UTF-8 support allows PCRE to process character values greater than 255
- in the strings that it handles. On its own, however, it does not pro-
+ UTF-8 support allows PCRE to process character values greater than 255
+ in the strings that it handles. On its own, however, it does not pro-
vide any facilities for accessing the properties of such characters. If
- you want to be able to use the pattern escapes \P, \p, and \X, which
+ you want to be able to use the pattern escapes \P, \p, and \X, which
refer to Unicode character properties, you must add
--enable-unicode-properties
- to the configure command. This implies UTF-8 support, even if you have
+ to the configure command. This implies UTF-8 support, even if you have
not explicitly requested it.
- Including Unicode property support adds around 30K of tables to the
- PCRE library. Only the general category properties such as Lu and Nd
+ Including Unicode property support adds around 30K of tables to the
+ PCRE library. Only the general category properties such as Lu and Nd
are supported. Details are given in the pcrepattern documentation.
CODE VALUE OF NEWLINE
- By default, PCRE interprets character 10 (linefeed, LF) as indicating
- the end of a line. This is the normal newline character on Unix-like
+ By default, PCRE interprets character 10 (linefeed, LF) as indicating
+ the end of a line. This is the normal newline character on Unix-like
systems. You can compile PCRE to use character 13 (carriage return, CR)
instead, by adding
--enable-newline-is-cr
- to the configure command. There is also a --enable-newline-is-lf
+ to the configure command. There is also a --enable-newline-is-lf
option, which explicitly specifies linefeed as the newline character.
Alternatively, you can specify that line endings are to be indicated by
@@ -349,35 +354,35 @@ CODE VALUE OF NEWLINE
--enable-newline-is-anycrlf
- which causes PCRE to recognize any of the three sequences CR, LF, or
+ which causes PCRE to recognize any of the three sequences CR, LF, or
CRLF as indicating a line ending. Finally, a fifth option, specified by
--enable-newline-is-any
causes PCRE to recognize any Unicode newline sequence.
- Whatever line ending convention is selected when PCRE is built can be
- overridden when the library functions are called. At build time it is
+ Whatever line ending convention is selected when PCRE is built can be
+ overridden when the library functions are called. At build time it is
conventional to use the standard for your operating system.
WHAT \R MATCHES
- By default, the sequence \R in a pattern matches any Unicode newline
- sequence, whatever has been selected as the line ending sequence. If
+ By default, the sequence \R in a pattern matches any Unicode newline
+ sequence, whatever has been selected as the line ending sequence. If
you specify
--enable-bsr-anycrlf
- the default is changed so that \R matches only CR, LF, or CRLF. What-
- ever is selected when PCRE is built can be overridden when the library
+ the default is changed so that \R matches only CR, LF, or CRLF. What-
+ ever is selected when PCRE is built can be overridden when the library
functions are called.
BUILDING SHARED AND STATIC LIBRARIES
- The PCRE building process uses libtool to build both shared and static
- Unix libraries by default. You can suppress one of these by adding one
+ The PCRE building process uses libtool to build both shared and static
+ Unix libraries by default. You can suppress one of these by adding one
of
--disable-shared
@@ -389,9 +394,9 @@ BUILDING SHARED AND STATIC LIBRARIES
POSIX MALLOC USAGE
When PCRE is called through the POSIX interface (see the pcreposix doc-
- umentation), additional working storage is required for holding the
- pointers to capturing substrings, because PCRE requires three integers
- per substring, whereas the POSIX interface provides only two. If the
+ umentation), additional working storage is required for holding the
+ pointers to capturing substrings, because PCRE requires three integers
+ per substring, whereas the POSIX interface provides only two. If the
number of expected substrings is small, the wrapper function uses space
on the stack, because this is faster than using malloc() for each call.
The default threshold above which the stack is no longer used is 10; it
@@ -404,111 +409,111 @@ POSIX MALLOC USAGE
HANDLING VERY LARGE PATTERNS
- Within a compiled pattern, offset values are used to point from one
- part to another (for example, from an opening parenthesis to an alter-
- nation metacharacter). By default, two-byte values are used for these
- offsets, leading to a maximum size for a compiled pattern of around
- 64K. This is sufficient to handle all but the most gigantic patterns.
- Nevertheless, some people do want to process enormous patterns, so it
- is possible to compile PCRE to use three-byte or four-byte offsets by
+ Within a compiled pattern, offset values are used to point from one
+ part to another (for example, from an opening parenthesis to an alter-
+ nation metacharacter). By default, two-byte values are used for these
+ offsets, leading to a maximum size for a compiled pattern of around
+ 64K. This is sufficient to handle all but the most gigantic patterns.
+ Nevertheless, some people do want to process enormous patterns, so it
+ is possible to compile PCRE to use three-byte or four-byte offsets by
adding a setting such as
--with-link-size=3
- to the configure command. The value given must be 2, 3, or 4. Using
- longer offsets slows down the operation of PCRE because it has to load
+ to the configure command. The value given must be 2, 3, or 4. Using
+ longer offsets slows down the operation of PCRE because it has to load
additional bytes when handling them.
AVOIDING EXCESSIVE STACK USAGE
When matching with the pcre_exec() function, PCRE implements backtrack-
- ing by making recursive calls to an internal function called match().
- In environments where the size of the stack is limited, this can se-
- verely limit PCRE's operation. (The Unix environment does not usually
+ ing by making recursive calls to an internal function called match().
+ In environments where the size of the stack is limited, this can se-
+ verely limit PCRE's operation. (The Unix environment does not usually
suffer from this problem, but it may sometimes be necessary to increase
- the maximum stack size. There is a discussion in the pcrestack docu-
- mentation.) An alternative approach to recursion that uses memory from
- the heap to remember data, instead of using recursive function calls,
- has been implemented to work round the problem of limited stack size.
+ the maximum stack size. There is a discussion in the pcrestack docu-
+ mentation.) An alternative approach to recursion that uses memory from
+ the heap to remember data, instead of using recursive function calls,
+ has been implemented to work round the problem of limited stack size.
If you want to build a version of PCRE that works this way, add
--disable-stack-for-recursion
- to the configure command. With this configuration, PCRE will use the
- pcre_stack_malloc and pcre_stack_free variables to call memory manage-
- ment functions. By default these point to malloc() and free(), but you
+ to the configure command. With this configuration, PCRE will use the
+ pcre_stack_malloc and pcre_stack_free variables to call memory manage-
+ ment functions. By default these point to malloc() and free(), but you
can replace the pointers so that your own functions are used.
- Separate functions are provided rather than using pcre_malloc and
- pcre_free because the usage is very predictable: the block sizes
- requested are always the same, and the blocks are always freed in
- reverse order. A calling program might be able to implement optimized
- functions that perform better than malloc() and free(). PCRE runs
+ Separate functions are provided rather than using pcre_malloc and
+ pcre_free because the usage is very predictable: the block sizes
+ requested are always the same, and the blocks are always freed in
+ reverse order. A calling program might be able to implement optimized
+ functions that perform better than malloc() and free(). PCRE runs
noticeably more slowly when built in this way. This option affects only
- the pcre_exec() function; it is not relevant for the the
+ the pcre_exec() function; it is not relevant for the the
pcre_dfa_exec() function.
LIMITING PCRE RESOURCE USAGE
- Internally, PCRE has a function called match(), which it calls repeat-
- edly (sometimes recursively) when matching a pattern with the
- pcre_exec() function. By controlling the maximum number of times this
- function may be called during a single matching operation, a limit can
- be placed on the resources used by a single call to pcre_exec(). The
- limit can be changed at run time, as described in the pcreapi documen-
- tation. The default is 10 million, but this can be changed by adding a
+ Internally, PCRE has a function called match(), which it calls repeat-
+ edly (sometimes recursively) when matching a pattern with the
+ pcre_exec() function. By controlling the maximum number of times this
+ function may be called during a single matching operation, a limit can
+ be placed on the resources used by a single call to pcre_exec(). The
+ limit can be changed at run time, as described in the pcreapi documen-
+ tation. The default is 10 million, but this can be changed by adding a
setting such as
--with-match-limit=500000
- to the configure command. This setting has no effect on the
+ to the configure command. This setting has no effect on the
pcre_dfa_exec() matching function.
- In some environments it is desirable to limit the depth of recursive
+ In some environments it is desirable to limit the depth of recursive
calls of match() more strictly than the total number of calls, in order
- to restrict the maximum amount of stack (or heap, if --disable-stack-
+ to restrict the maximum amount of stack (or heap, if --disable-stack-
for-recursion is specified) that is used. A second limit controls this;
- it defaults to the value that is set for --with-match-limit, which
- imposes no additional constraints. However, you can set a lower limit
+ it defaults to the value that is set for --with-match-limit, which
+ imposes no additional constraints. However, you can set a lower limit
by adding, for example,
--with-match-limit-recursion=10000
- to the configure command. This value can also be overridden at run
+ to the configure command. This value can also be overridden at run
time.
CREATING CHARACTER TABLES AT BUILD TIME
- PCRE uses fixed tables for processing characters whose code values are
- less than 256. By default, PCRE is built with a set of tables that are
- distributed in the file pcre_chartables.c.dist. These tables are for
+ PCRE uses fixed tables for processing characters whose code values are
+ less than 256. By default, PCRE is built with a set of tables that are
+ distributed in the file pcre_chartables.c.dist. These tables are for
ASCII codes only. If you add
--enable-rebuild-chartables
- to the configure command, the distributed tables are no longer used.
- Instead, a program called dftables is compiled and run. This outputs
+ to the configure command, the distributed tables are no longer used.
+ Instead, a program called dftables is compiled and run. This outputs
the source for new set of tables, created in the default locale of your
C runtime system. (This method of replacing the tables does not work if
- you are cross compiling, because dftables is run on the local host. If
- you need to create alternative tables when cross compiling, you will
+ you are cross compiling, because dftables is run on the local host. If
+ you need to create alternative tables when cross compiling, you will
have to do so "by hand".)
USING EBCDIC CODE
- PCRE assumes by default that it will run in an environment where the
- character code is ASCII (or Unicode, which is a superset of ASCII).
- This is the case for most computer operating systems. PCRE can, how-
+ PCRE assumes by default that it will run in an environment where the
+ character code is ASCII (or Unicode, which is a superset of ASCII).
+ This is the case for most computer operating systems. PCRE can, how-
ever, be compiled to run in an EBCDIC environment by adding
--enable-ebcdic
to the configure command. This setting implies --enable-rebuild-charta-
- bles. You should only use it if you know that you are in an EBCDIC
+ bles. You should only use it if you know that you are in an EBCDIC
environment (for example, an IBM mainframe operating system).
@@ -526,7 +531,7 @@ AUTHOR
REVISION
- Last updated: 11 September 2007
+ Last updated: 21 September 2007
Copyright (c) 1997-2007 University of Cambridge.
------------------------------------------------------------------------------
@@ -5161,7 +5166,8 @@ BACKTRACKING CONTROL
NEWLINE CONVENTIONS
- These are recognized only at the very start of a pattern.
+ These are recognized only at the very start of the pattern or after a
+ (*BSR_...) option.
(*CR)
(*LF)
@@ -5172,7 +5178,8 @@ NEWLINE CONVENTIONS
WHAT \R MATCHES
- These are recognized only at the very start of a pattern.
+ These are recognized only at the very start of the pattern or after a
+ (*...) option that sets the newline convention.
(*BSR_ANYCRLF)
(*BSR_UNICODE)
@@ -5198,7 +5205,7 @@ AUTHOR
REVISION
- Last updated: 11 September 2007
+ Last updated: 21 September 2007
Copyright (c) 1997-2007 University of Cambridge.
------------------------------------------------------------------------------
diff --git a/doc/pcresyntax.3 b/doc/pcresyntax.3
index 53e51ac..2817820 100644
--- a/doc/pcresyntax.3
+++ b/doc/pcresyntax.3
@@ -370,7 +370,7 @@ pattern is not anchored.
.SH "NEWLINE CONVENTIONS"
.rs
.sp
-These are recognized only at the very start of the pattern or after a
+These are recognized only at the very start of the pattern or after a
(*BSR_...) option.
.sp
(*CR)
@@ -383,7 +383,7 @@ These are recognized only at the very start of the pattern or after a
.SH "WHAT \eR MATCHES"
.rs
.sp
-These are recognized only at the very start of the pattern or after a
+These are recognized only at the very start of the pattern or after a
(*...) option that sets the newline convention.
.sp
(*BSR_ANYCRLF)