Diffstat (limited to 'pcre/doc/pcrepattern.3')
-rw-r--r-- | pcre/doc/pcrepattern.3 | 37 |
1 files changed, 20 insertions, 17 deletions
diff --git a/pcre/doc/pcrepattern.3 b/pcre/doc/pcrepattern.3
index 3b8c6393d21..97df217fdb2 100644
--- a/pcre/doc/pcrepattern.3
+++ b/pcre/doc/pcrepattern.3
@@ -1,4 +1,4 @@
-.TH PCREPATTERN 3 "14 June 2015" "PCRE 8.38"
+.TH PCREPATTERN 3 "23 October 2016" "PCRE 8.40"
 .SH NAME
 PCRE - Perl-compatible regular expressions
 .SH "PCRE REGULAR EXPRESSION DETAILS"
@@ -336,22 +336,22 @@ When PCRE is compiled in EBCDIC mode, \ea, \ee, \ef, \en, \er, and \et
 generate the appropriate EBCDIC code values. The \ec escape is processed as
 specified for Perl in the \fBperlebcdic\fP document. The only characters that
 are allowed after \ec are A-Z, a-z, or one of @, [, \e, ], ^, _, or ?. Any
-other character provokes a compile-time error. The sequence \e@ encodes
-character code 0; the letters (in either case) encode characters 1-26 (hex 01
-to hex 1A); [, \e, ], ^, and _ encode characters 27-31 (hex 1B to hex 1F), and
-\e? becomes either 255 (hex FF) or 95 (hex 5F).
+other character provokes a compile-time error. The sequence \ec@ encodes
+character code 0; after \ec the letters (in either case) encode characters 1-26
+(hex 01 to hex 1A); [, \e, ], ^, and _ encode characters 27-31 (hex 1B to hex
+1F), and \ec? becomes either 255 (hex FF) or 95 (hex 5F).
 .P
-Thus, apart from \e?, these escapes generate the same character code values as
+Thus, apart from \ec?, these escapes generate the same character code values as
 they do in an ASCII environment, though the meanings of the values mostly
-differ. For example, \eG always generates code value 7, which is BEL in ASCII
+differ. For example, \ecG always generates code value 7, which is BEL in ASCII
 but DEL in EBCDIC.
 .P
-The sequence \e? generates DEL (127, hex 7F) in an ASCII environment, but
+The sequence \ec? generates DEL (127, hex 7F) in an ASCII environment, but
 because 127 is not a control character in EBCDIC, Perl makes it generate the
 APC character. Unfortunately, there are several variants of EBCDIC. In most of
 them the APC character has the value 255 (hex FF), but in the one Perl calls
 POSIX-BC its value is 95 (hex 5F). If certain other characters have POSIX-BC
-values, PCRE makes \e? generate 95; otherwise it generates 255.
+values, PCRE makes \ec? generate 95; otherwise it generates 255.
 .P
 After \e0 up to two further octal digits are read. If there are fewer than two
 digits, just those that are present are used. Thus the sequence \e0\ex\e015
@@ -1511,12 +1511,8 @@ J, U and X respectively.
 .P
 When one of these option changes occurs at top level (that is, not inside
 subpattern parentheses), the change applies to the remainder of the pattern
-that follows. If the change is placed right at the start of a pattern, PCRE
-extracts it into the global options (and it will therefore show up in data
-extracted by the \fBpcre_fullinfo()\fP function).
-.P
-An option change within a subpattern (see below for a description of
-subpatterns) affects only that part of the subpattern that follows it, so
+that follows. An option change within a subpattern (see below for a description
+of subpatterns) affects only that part of the subpattern that follows it, so
 .sp
   (a(?i)b)c
 .sp
@@ -2171,6 +2167,13 @@ numbering the capturing subpatterns in the whole pattern. However, substring
 capturing is carried out only for positive assertions. (Perl sometimes, but not
 always, does do capturing in negative assertions.)
 .P
+WARNING: If a positive assertion containing one or more capturing subpatterns
+succeeds, but failure to match later in the pattern causes backtracking over
+this assertion, the captures within the assertion are reset only if no higher
+numbered captures are already set. This is, unfortunately, a fundamental
+limitation of the current implementation, and as PCRE1 is now in
+maintenance-only status, it is unlikely ever to change.
+.P
 For compatibility with Perl, assertion subpatterns may be repeated; though it
 makes no sense to assert the same thing several times, the side effect of
 capturing parentheses may occasionally be useful. In practice, there only three
@@ -3296,6 +3299,6 @@ Cambridge CB2 3QH, England.
 .rs
 .sp
 .nf
-Last updated: 14 June 2015
-Copyright (c) 1997-2015 University of Cambridge.
+Last updated: 23 October 2016
+Copyright (c) 1997-2016 University of Cambridge.
 .fi