author ph10 <ph10@6239d852-aaf2-0410-a92c-79f79f948069> 2019-02-06 18:11:36 +0000
committer ph10 <ph10@6239d852-aaf2-0410-a92c-79f79f948069> 2019-02-06 18:11:36 +0000
commit 03c006cfda40d5218d2248674ddc3824f8169897
tree 8bfb007e8adba8eb8e1256afba09001b52509905 /doc/pcre2compat.3
parent 2aee0809b4ec6f9c2fdbb33a0c200b17a9fd333c
Allow non-ASCII in group names when UTF is set; revise group naming terminology
in documentation to use "capture group", as Perl does.

git-svn-id: svn://vcs.exim.org/pcre2/code/trunk@1066 6239d852-aaf2-0410-a92c-79f79f948069
Diffstat (limited to 'doc/pcre2compat.3')
-rw-r--r-- doc/pcre2compat.3 52
1 files changed, 25 insertions, 27 deletions
diff --git a/doc/pcre2compat.3 b/doc/pcre2compat.3
index 6e448f6..a2fbf48 100644
--- a/doc/pcre2compat.3
+++ b/doc/pcre2compat.3
@@ -1,4 +1,4 @@
-.TH PCRE2COMPAT 3 "28 July 2018" "PCRE2 10.32"
+.TH PCRE2COMPAT 3 "03 February 2019" "PCRE2 10.33"
.SH NAME
PCRE2 - Perl-compatible regular expressions (revised API)
.SH "DIFFERENCES BETWEEN PCRE2 AND PERL"
@@ -23,10 +23,9 @@ character is not "a" three times (in principle; PCRE2 optimizes this to run the
assertion just once). Perl allows some repeat quantifiers on other assertions,
for example, \eb* (but not \eb{3}), but these do not seem to have any use.
.P
-3. Capturing subpatterns that occur inside negative lookaround assertions are
-counted, but their entries in the offsets vector are set only when a negative
-assertion is a condition that has a matching branch (that is, the condition is
-false).
+3. Capture groups that occur inside negative lookaround assertions are counted,
+but their entries in the offsets vector are set only when a negative assertion
+is a condition that has a matching branch (that is, the condition is false).
.P
4. The following Perl escape sequences are not supported: \eF, \el, \eL, \eu,
\eU, and \eN when followed by a character name. \eN on its own, matching a
@@ -79,13 +78,13 @@ documentation for details.
to PCRE2 release 10.23, but from release 10.30 this changed, and backtracking
into subroutine calls is now supported, as in Perl.
.P
-9. If any of the backtracking control verbs are used in a subpattern that is
-called as a subroutine (whether or not recursively), their effect is confined
-to that subpattern; it does not extend to the surrounding pattern. This is not
-always the case in Perl. In particular, if (*THEN) is present in a group that
-is called as a subroutine, its action is limited to that group, even if the
-group does not contain any | characters. Note that such subpatterns are
-processed as anchored at the point where they are tested.
+9. If any of the backtracking control verbs are used in a group that is called
+as a subroutine (whether or not recursively), their effect is confined to that
+group; it does not extend to the surrounding pattern. This is not always the
+case in Perl. In particular, if (*THEN) is present in a group that is called as
+a subroutine, its action is limited to that group, even if the group does not
+contain any | characters. Note that such groups are processed as anchored
+at the point where they are tested.
.P
10. If a pattern contains more than one backtracking control verb, the first
one that is backtracked onto acts. For example, in the pattern
@@ -101,21 +100,20 @@ strings when part of a pattern is repeated. For example, matching "aba" against
the pattern /^(a(b)?)+$/ in Perl leaves $2 unset, but in PCRE2 it is set to
"b".
.P
-13. PCRE2's handling of duplicate subpattern numbers and duplicate subpattern
-names is not as general as Perl's. This is a consequence of the fact the PCRE2
-works internally just with numbers, using an external table to translate
-between numbers and names. In particular, a pattern such as (?|(?<a>A)|(?<b>B),
-where the two capturing parentheses have the same number but different names,
-is not supported, and causes an error at compile time. If it were allowed, it
-would not be possible to distinguish which parentheses matched, because both
-names map to capturing subpattern number 1. To avoid this confusing situation,
-an error is given at compile time.
+13. PCRE2's handling of duplicate capture group numbers and names is not as
+general as Perl's. This is a consequence of the fact the PCRE2 works internally
+just with numbers, using an external table to translate between numbers and
+names. In particular, a pattern such as (?|(?<a>A)|(?<b>B), where the two
+capture groups have the same number but different names, is not supported, and
+causes an error at compile time. If it were allowed, it would not be possible
+to distinguish which group matched, because both names map to capture group
+number 1. To avoid this confusing situation, an error is given at compile time.
.P
14. Perl used to recognize comments in some places that PCRE2 does not, for
-example, between the ( and ? at the start of a subpattern. If the /x modifier
-is set, Perl allowed white space between ( and ? though the latest Perls give
-an error (for a while it was just deprecated). There may still be some cases
-where Perl behaves differently.
+example, between the ( and ? at the start of a group. If the /x modifier is
+set, Perl allowed white space between ( and ? though the latest Perls give an
+error (for a while it was just deprecated). There may still be some cases where
+Perl behaves differently.
.P
15. Perl, when in warning mode, gives warnings for character classes such as
[A-\ed] or [a-[:digit:]]. It then treats the hyphens as literals. PCRE2 has no
@@ -200,6 +198,6 @@ Cambridge, England.
.rs
.sp
.nf
-Last updated: 28 July 2018
-Copyright (c) 1997-2018 University of Cambridge.
+Last updated: 03 February 2019
+Copyright (c) 1997-2019 University of Cambridge.
.fi
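
Note on item 13 in the patched text: the reasoning there is that names are resolved through a table that maps each name to a group number, so two different names on one number cannot be told apart afterwards. The sketch below is a toy model of that idea only. It is not PCRE2's implementation, and `build_name_table` and its (name, number) input format are hypothetical names introduced for illustration.

```python
def build_name_table(groups):
    """Map group names to numbers, rejecting two names for one number.

    `groups` is an iterable of (name, number) pairs in pattern order.
    This is an illustrative model, not PCRE2's actual data structure.
    """
    name_for_number = {}  # number -> the single name recorded for it
    table = {}            # name -> number, the lookup a caller would use
    for name, number in groups:
        first = name_for_number.setdefault(number, name)
        if first != name:
            # Both names would resolve to the same number, so a caller
            # could not tell which alternative actually matched.
            raise ValueError(
                f"group {number} would have two names: {first!r} and {name!r}"
            )
        table[name] = number
    return table


# Shape of (?|(?<a>A)|(?<a>B)): one name, one number. The model accepts it.
print(build_name_table([("a", 1), ("a", 1)]))

# Shape of (?|(?<a>A)|(?<b>B)): two names, one number. The model rejects it.
try:
    build_name_table([("a", 1), ("b", 1)])
except ValueError as err:
    print("rejected:", err)
```

The model raises as soon as the table is built, which mirrors the documented choice to report the problem at compile time instead of leaving an ambiguous result at match time.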