summaryrefslogtreecommitdiff
path: root/doc/html/pcre2compat.html
diff options
context:
space:
mode:
authorph10 <ph10@6239d852-aaf2-0410-a92c-79f79f948069>2017-03-31 16:49:33 +0000
committerph10 <ph10@6239d852-aaf2-0410-a92c-79f79f948069>2017-03-31 16:49:33 +0000
commit0b5cd88c0756b6b9101feff0d580dba70afadc18 (patch)
tree74c0b1ea9c09df7052d4bdf263c52fc2bf39aeb9 /doc/html/pcre2compat.html
parent88703fe34a96e29afa7ce63472da4f6ba83a84b2 (diff)
downloadpcre2-0b5cd88c0756b6b9101feff0d580dba70afadc18.tar.gz
Documentation update
git-svn-id: svn://vcs.exim.org/pcre2/code/trunk@722 6239d852-aaf2-0410-a92c-79f79f948069
Diffstat (limited to 'doc/html/pcre2compat.html')
-rw-r--r--doc/html/pcre2compat.html64
1 files changed, 29 insertions, 35 deletions
diff --git a/doc/html/pcre2compat.html b/doc/html/pcre2compat.html
index 993dfd1..b55ab82 100644
--- a/doc/html/pcre2compat.html
+++ b/doc/html/pcre2compat.html
@@ -18,7 +18,8 @@ DIFFERENCES BETWEEN PCRE2 AND PERL
<P>
This document describes the differences in the ways that PCRE2 and Perl handle
regular expressions. The differences described here are with respect to Perl
-versions 5.10 and above.
+versions 5.24, but as both Perl and PCRE2 are continually changing, the
+information may sometimes be out of date.
</P>
<P>
1. PCRE2 has only a subset of Perl's Unicode support. Details of what it does
@@ -27,17 +28,18 @@ have are given in the
page.
</P>
<P>
-2. PCRE2 allows repeat quantifiers only on parenthesized assertions, but they
-do not mean what you might think. For example, (?!a){3} does not assert that
-the next three characters are not "a". It just asserts that the next character
-is not "a" three times (in principle: PCRE2 optimizes this to run the assertion
-just once). Perl allows repeat quantifiers on other assertions such as \b, but
-these do not seem to have any use.
+2. Like Perl, PCRE2 allows repeat quantifiers on parenthesized assertions, but
+they do not mean what you might think. For example, (?!a){3} does not assert
+that the next three characters are not "a". It just asserts that the next
+character is not "a" three times (in principle: PCRE2 optimizes this to run the
+assertion just once). Perl allows some repeat quantifiers on other assertions,
+for example, \b* (but not \b{3}), but these do not seem to have any use.
</P>
<P>
-3. Capturing subpatterns that occur inside negative lookahead assertions are
-counted, but their entries in the offsets vector are never set. Perl sometimes
-(but not always) sets its numerical variables from inside negative assertions.
+3. Capturing subpatterns that occur inside negative lookaround assertions are
+counted, but their entries in the offsets vector are set only if the assertion
+is a condition. Perl has changed its behaviour in this regard from time to
+time.
</P>
<P>
4. The following Perl escape sequences are not supported: \l, \u, \L,
@@ -50,13 +52,13 @@ generated by default. However, if the PCRE2_ALT_BSUX option is set,
</P>
<P>
5. The Perl escape sequences \p, \P, and \X are supported only if PCRE2 is
-built with Unicode support. The properties that can be tested with \p and \P
-are limited to the general category properties such as Lu and Nd, script names
-such as Greek or Han, and the derived properties Any and L&. PCRE2 does support
-the Cs (surrogate) property, which Perl does not; the Perl documentation says
-"Because Perl hides the need for the user to understand the internal
-representation of Unicode characters, there is no need to implement the
-somewhat messy concept of surrogates."
+built with Unicode support (the default). The properties that can be tested
+with \p and \P are limited to the general category properties such as Lu and
+Nd, script names such as Greek or Han, and the derived properties Any and L&.
+PCRE2 does support the Cs (surrogate) property, which Perl does not; the Perl
+documentation says "Because Perl hides the need for the user to understand the
+internal representation of Unicode characters, there is no need to implement
+the somewhat messy concept of surrogates."
</P>
<P>
6. PCRE2 does support the \Q...\E escape for quoting substrings. Characters
@@ -75,23 +77,15 @@ The \Q...\E sequence is recognized both inside and outside character classes.
</P>
<P>
7. Fairly obviously, PCRE2 does not support the (?{code}) and (??{code})
-constructions. However, there is support for recursive patterns. This is not
-available in Perl 5.8, but it is in Perl 5.10. Also, the PCRE2 "callout"
-feature allows an external function to be called during pattern matching. See
-the
+constructions. However, there is support PCRE2's "callout" feature, which
+allows an external function to be called during pattern matching. See the
<a href="pcre2callout.html"><b>pcre2callout</b></a>
documentation for details.
</P>
<P>
-8. Subroutine calls (whether recursive or not) are treated as atomic groups.
-Atomic recursion is like Python, but unlike Perl. Captured values that are set
-outside a subroutine call can be referenced from inside in PCRE2, but not in
-Perl. There is a discussion that explains these differences in more detail in
-the
-<a href="pcre2pattern.html#recursiondifference">section on recursion differences from Perl</a>
-in the
-<a href="pcre2pattern.html"><b>pcre2pattern</b></a>
-page.
+8. Subroutine calls (whether recursive or not) were treated as atomic groups up
+to PCRE2 release 10.23, but from release 10.30 this changed, and backtracking
+into subroutine calls is now supported, as in Perl.
</P>
<P>
9. If any of the backtracking control verbs are used in a subpattern that is
@@ -147,14 +141,14 @@ certainly user mistakes.
16. In PCRE2, the upper/lower case character properties Lu and Ll are not
affected when case-independent matching is specified. For example, \p{Lu}
always matches an upper case letter. I think Perl has changed in this respect;
-in the release at the time of writing (5.16), \p{Lu} and \p{Ll} match all
+in the release at the time of writing (5.24), \p{Lu} and \p{Ll} match all
letters, regardless of case, when case independence is specified.
</P>
<P>
17. PCRE2 provides some extensions to the Perl regular expression facilities.
Perl 5.10 includes new features that are not in earlier versions of Perl, some
-of which (such as named parentheses) have been in PCRE2 for some time. This
-list is with respect to Perl 5.10:
+of which (such as named parentheses) were in PCRE2 for some time before. This
+list is with respect to Perl 5.24:
<br>
<br>
(a) Although lookbehind assertions in PCRE2 must match fixed length strings,
@@ -220,9 +214,9 @@ Cambridge, England.
REVISION
</b><br>
<P>
-Last updated: 18 October 2016
+Last updated: 29 March 2017
<br>
-Copyright &copy; 1997-2016 University of Cambridge.
+Copyright &copy; 1997-2017 University of Cambridge.
<br>
<p>
Return to the <a href="index.html">PCRE2 index page</a>.