author ph10 <ph10@6239d852-aaf2-0410-a92c-79f79f948069> 2017-06-05 18:25:47 +0000
committer ph10 <ph10@6239d852-aaf2-0410-a92c-79f79f948069> 2017-06-05 18:25:47 +0000
commit 3bb618c91795bc56f3062d7e09f4950b84a064d9
tree 6b00b279d05ab6ecd19d83a5783f9034b6cf12a6 /doc/html/pcre2posix.html
parent 5b20a763d32c58e8b2184a5393f8efe0144c28b9
Implement REG_PEND (GNU extension) for the POSIX wrapper.
git-svn-id: svn://vcs.exim.org/pcre2/code/trunk@820 6239d852-aaf2-0410-a92c-79f79f948069
Diffstat (limited to 'doc/html/pcre2posix.html')
-rw-r--r-- doc/html/pcre2posix.html 61
1 files changed, 42 insertions, 19 deletions
diff --git a/doc/html/pcre2posix.html b/doc/html/pcre2posix.html
index 1d5fe63..a6d75e1 100644
--- a/doc/html/pcre2posix.html
+++ b/doc/html/pcre2posix.html
@@ -69,7 +69,7 @@ replacement library. Other POSIX options are not even defined.
<P>
There are also some options that are not defined by POSIX. These have been
added at the request of users who want to make use of certain PCRE2-specific
-features via the POSIX calling interface.
+features via the POSIX calling interface or to add BSD or GNU functionality.
</P>
<P>
When PCRE2 is called via these functions, it is only the API that is POSIX-like
@@ -91,10 +91,11 @@ identifying error codes.
<br><a name="SEC3" href="#TOC1">COMPILING A PATTERN</a><br>
<P>
The function <b>regcomp()</b> is called to compile a pattern into an
-internal form. The pattern is a C string terminated by a binary zero, and
-is passed in the argument <i>pattern</i>. The <i>preg</i> argument is a pointer
-to a <b>regex_t</b> structure that is used as a base for storing information
-about the compiled regular expression.
+internal form. By default, the pattern is a C string terminated by a binary
+zero (but see REG_PEND below). The <i>preg</i> argument is a pointer to a
+<b>regex_t</b> structure that is used as a base for storing information about
+the compiled regular expression. (It is also used for input when REG_PEND is
+set.)
</P>
<P>
The argument <i>cflags</i> is either zero, or contains one or more of the bits
@@ -125,6 +126,16 @@ captured strings are returned. Versions of the PCRE library prior to 10.22 used
to set the PCRE2_NO_AUTO_CAPTURE compile option, but this no longer happens
because it disables the use of back references.
<pre>
+ REG_PEND
+</pre>
+If this option is set, the <b>re_endp</b> field in the <i>preg</i> structure
+(which has the type const char *) must be set to point to the character beyond
+the end of the pattern before calling <b>regcomp()</b>. The pattern itself may
+now contain binary zeroes, which are treated as data characters. Without
+REG_PEND, a binary zero terminates the pattern and the <b>re_endp</b> field is
+ignored. This is a GNU extension to the POSIX standard and should be used with
+caution in software intended to be portable to other systems.
+<pre>
REG_UCP
</pre>
The PCRE2_UCP option is set when the regular expression is passed for
@@ -156,9 +167,10 @@ class such as [^a] (they are).
</P>
<P>
The yield of <b>regcomp()</b> is zero on success, and non-zero otherwise. The
-<i>preg</i> structure is filled in on success, and one member of the structure
-is public: <i>re_nsub</i> contains the number of capturing subpatterns in
-the regular expression. Various error codes are defined in the header file.
+<i>preg</i> structure is filled in on success, and one other member of the
+structure (as well as <i>re_endp</i>) is public: <i>re_nsub</i> contains the
+number of capturing subpatterns in the regular expression. Various error codes
+are defined in the header file.
</P>
<P>
NOTE: If the yield of <b>regcomp()</b> is non-zero, you must not attempt to
@@ -228,15 +240,26 @@ function.
<pre>
REG_STARTEND
</pre>
-The string is considered to start at <i>string</i> + <i>pmatch[0].rm_so</i> and
-to have a terminating NUL located at <i>string</i> + <i>pmatch[0].rm_eo</i>
-(there need not actually be a NUL at that location), regardless of the value of
-<i>nmatch</i>. This is a BSD extension, compatible with but not specified by
-IEEE Standard 1003.2 (POSIX.2), and should be used with caution in software
-intended to be portable to other systems. Note that a non-zero <i>rm_so</i> does
-not imply REG_NOTBOL; REG_STARTEND affects only the location of the string, not
-how it is matched. Setting REG_STARTEND and passing <i>pmatch</i> as NULL are
-mutually exclusive; the error REG_INVARG is returned.
+When this option is set, the subject string starts at <i>string</i> +
+<i>pmatch[0].rm_so</i> and ends at <i>string</i> + <i>pmatch[0].rm_eo</i>, which
+should point to the first character beyond the string. There may be binary
+zeroes within the subject string, and indeed, using REG_STARTEND is the only
+way to pass a subject string that contains a binary zero.
+</P>
+<P>
+Whatever the value of <i>pmatch[0].rm_so</i>, the offsets of the matched string
+and any captured substrings are still given relative to the start of
+<i>string</i> itself. (Before PCRE2 release 10.30 these were given relative to
+<i>string</i> + <i>pmatch[0].rm_so</i>, but this differs from other
+implementations.)
+</P>
+<P>
+This is a BSD extension, compatible with but not specified by IEEE Standard
+1003.2 (POSIX.2), and should be used with caution in software intended to be
+portable to other systems. Note that a non-zero <i>rm_so</i> does not imply
+REG_NOTBOL; REG_STARTEND affects only the location and length of the string,
+not how it is matched. Setting REG_STARTEND and passing <i>pmatch</i> as NULL
+are mutually exclusive; the error REG_INVARG is returned.
</P>
<P>
If the pattern was compiled with the REG_NOSUB flag, no data about any matched
@@ -291,9 +314,9 @@ Cambridge, England.
</P>
<br><a name="SEC9" href="#TOC1">REVISION</a><br>
<P>
-Last updated: 31 January 2016
+Last updated: 05 June 2017
<br>
-Copyright &copy; 1997-2016 University of Cambridge.
+Copyright &copy; 1997-2017 University of Cambridge.
<br>
<p>
Return to the <a href="index.html">PCRE2 index page</a>.
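
The REG_STARTEND behaviour documented in this patch can be illustrated with a small sketch. It is not taken from the PCRE2 sources. The helper name match_range and its parameters are hypothetical, and the code assumes a <regex.h> that defines REG_STARTEND (a BSD extension, so not available everywhere). To try it against PCRE2 instead, the assumption is that including pcre2posix.h in place of <regex.h> and linking against the PCRE2 POSIX wrapper library is enough.

```c
#include <regex.h>
#include <stddef.h>

#ifndef REG_STARTEND
#error "This sketch assumes a regex.h that provides the REG_STARTEND extension"
#endif

/* Hypothetical helper: search bytes [start, end) of buf for pattern.
   The range may contain binary zeroes, because REG_STARTEND passes an
   explicit end offset instead of relying on a terminating NUL.
   Returns 0 on a match and stores the match offsets in *so and *eo;
   otherwise returns the non-zero code from regcomp() or regexec(). */
static int match_range(const char *pattern, const char *buf,
                       size_t start, size_t end,
                       regoff_t *so, regoff_t *eo)
{
    regex_t re;
    regmatch_t pm[1];
    int rc;

    rc = regcomp(&re, pattern, REG_EXTENDED);
    if (rc != 0)
        return rc;              /* re must not be used after a failure */

    /* With REG_STARTEND, pmatch[0] is an input as well as an output:
       rm_so is the first byte to examine, rm_eo is one past the last. */
    pm[0].rm_so = (regoff_t)start;
    pm[0].rm_eo = (regoff_t)end;

    rc = regexec(&re, buf, 1, pm, REG_STARTEND);
    if (rc == 0) {
        *so = pm[0].rm_so;
        *eo = pm[0].rm_eo;
    }

    regfree(&re);
    return rc;
}
```

A caller might pass a buffer such as "ab\0cd42ef" with start 1 and end 9, so that the range begins after the first byte and includes the embedded zero. Under the behaviour described above for release 10.30 and later, the offsets stored in *so and *eo are counted from buf itself, not from buf + start.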