Diffstat (limited to 'srclib/pcre/doc/pcrecompat.3')
-rw-r--r-- srclib/pcre/doc/pcrecompat.3 121
1 files changed, 0 insertions, 121 deletions
diff --git a/srclib/pcre/doc/pcrecompat.3 b/srclib/pcre/doc/pcrecompat.3
deleted file mode 100644
index 6a853e072a..0000000000
--- a/srclib/pcre/doc/pcrecompat.3
+++ /dev/null
@@ -1,121 +0,0 @@
-.TH PCRE 3
-.SH NAME
-PCRE - Perl-compatible regular expressions
-.SH "DIFFERENCES BETWEEN PCRE AND PERL"
-.rs
-.sp
-This document describes the differences in the ways that PCRE and Perl handle
-regular expressions. The differences described here are with respect to Perl
-5.8.
-.P
-1. PCRE does not have full UTF-8 support. Details of what it does have are
-given in the
-.\" HTML <a href="pcre.html#utf8support">
-.\" </a>
-section on UTF-8 support
-.\"
-in the main
-.\" HREF
-\fBpcre\fP
-.\"
-page.
-.P
-2. PCRE does not allow repeat quantifiers on lookahead assertions. Perl permits
-them, but they do not mean what you might think. For example, (?!a){3} does
-not assert that the next three characters are not "a". It just asserts that the
-next character is not "a" three times.
-.P
-3. Capturing subpatterns that occur inside negative lookahead assertions are
-counted, but their entries in the offsets vector are never set. Perl sets its
-numerical variables from any such patterns that are matched before the
-assertion fails to match something (thereby succeeding), but only if the
-negative lookahead assertion contains just one branch.
-.P
-4. Though binary zero characters are supported in the subject string, they are
-not allowed in a pattern string because it is passed as a normal C string,
-terminated by zero. The escape sequence \e0 can be used in the pattern to
-represent a binary zero.
-.P
-5. The following Perl escape sequences are not supported: \el, \eu, \eL,
-\eU, and \eN. In fact these are implemented by Perl's general string-handling
-and are not part of its pattern matching engine. If any of these are
-encountered by PCRE, an error is generated.
-.P
-6. The Perl escape sequences \ep, \eP, and \eX are supported only if PCRE is
-built with Unicode character property support. The properties that can be
-tested with \ep and \eP are limited to the general category properties such as
-Lu and Nd.
-.P
-7. PCRE does support the \eQ...\eE escape for quoting substrings. Characters in
-between are treated as literals. This is slightly different from Perl in that $
-and @ are also handled as literals inside the quotes. In Perl, they cause
-variable interpolation (but of course PCRE does not have variables). Note the
-following examples:
-.sp
- Pattern PCRE matches Perl matches
-.sp
-.\" JOIN
- \eQabc$xyz\eE abc$xyz abc followed by the
- contents of $xyz
- \eQabc\e$xyz\eE abc\e$xyz abc\e$xyz
- \eQabc\eE\e$\eQxyz\eE abc$xyz abc$xyz
-.sp
-The \eQ...\eE sequence is recognized both inside and outside character classes.
-.P
-8. Fairly obviously, PCRE does not support the (?{code}) and (?p{code})
-constructions. However, there is support for recursive patterns using the
-non-Perl items (?R), (?number), and (?P>name). Also, the PCRE "callout" feature
-allows an external function to be called during pattern matching. See the
-.\" HREF
-\fBpcrecallout\fP
-.\"
-documentation for details.
-.P
-9. There are some differences that are concerned with the settings of captured
-strings when part of a pattern is repeated. For example, matching "aba" against
-the pattern /^(a(b)?)+$/ in Perl leaves $2 unset, but in PCRE it is set to "b".
-.P
-10. PCRE provides some extensions to the Perl regular expression facilities:
-.sp
-(a) Although lookbehind assertions must match fixed length strings, each
-alternative branch of a lookbehind assertion can match a different length of
-string. Perl requires them all to have the same length.
-.sp
-(b) If PCRE_DOLLAR_ENDONLY is set and PCRE_MULTILINE is not set, the $
-meta-character matches only at the very end of the string.
-.sp
-(c) If PCRE_EXTRA is set, a backslash followed by a letter with no special
-meaning is faulted.
-.sp
-(d) If PCRE_UNGREEDY is set, the greediness of the repetition quantifiers is
-inverted, that is, by default they are not greedy, but if followed by a
-question mark they are.
-.sp
-(e) PCRE_ANCHORED can be used at matching time to force a pattern to be tried
-only at the first matching position in the subject string.
-.sp
-(f) The PCRE_NOTBOL, PCRE_NOTEOL, PCRE_NOTEMPTY, and PCRE_NO_AUTO_CAPTURE
-options for \fBpcre_exec()\fP have no Perl equivalents.
-.sp
-(g) The (?R), (?number), and (?P>name) constructs allow for recursive pattern
-matching (Perl can do this using the (?p{code}) construct, which PCRE cannot
-support.)
-.sp
-(h) PCRE supports named capturing substrings, using the Python syntax.
-.sp
-(i) PCRE supports the possessive quantifier "++" syntax, taken from Sun's Java
-package.
-.sp
-(j) The (R) condition, for testing recursion, is a PCRE extension.
-.sp
-(k) The callout facility is PCRE-specific.
-.sp
-(l) The partial matching facility is PCRE-specific.
-.sp
-(m) Patterns compiled by PCRE can be saved and re-used at a later time, even on
-different hosts that have the other endianness.
-.P
-.in 0
-Last updated: 09 September 2004
-.br
-Copyright (c) 1997-2004 University of Cambridge.
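
Item 10(h) of the removed page says that PCRE supports named capturing substrings using the Python syntax. For readers unfamiliar with that syntax, here is a minimal sketch in Python's own `re` module, not in PCRE itself. The names `DATE` and `parse_date` are illustrative assumptions and do not come from the removed file.

```python
import re

# Python's named-group syntax, (?P<name>...), which item 10(h) says
# PCRE adopted. DATE and parse_date are hypothetical names used only
# for this illustration.
DATE = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})")


def parse_date(text):
    """Return the named groups of the first ISO-style date in text, or None."""
    match = DATE.search(text)
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(parse_date("Last updated: 2004-09-09"))
```

Each group is retrievable by name rather than by position, which is the convenience the page refers to. Note that the `(?P>name)` recursion form listed in items 8 and 10(g) is described there as a non-Perl PCRE item; it is not part of Python's `re` syntax and is not shown here.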