author | ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15> | 2010-10-31 14:06:43 +0000 |
---|---|---|
committer | ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15> | 2010-10-31 14:06:43 +0000 |
commit | 6b21094f294885a4e1d4d255ceeecea503e56e17 (patch) | |
tree | c40748d20f0d00854db8a04a5537525e3996f2e6 | |
parent | 70ccf43822dbf6bc1c6fa18746192fb9c6ebdfed (diff) | |
download | pcre-6b21094f294885a4e1d4d255ceeecea503e56e17.tar.gz |
Clarify documentation about comments in patterns.
git-svn-id: svn://vcs.exim.org/pcre/code/trunk@562 2f5784b3-3f2a-0410-8824-cb99058d5e15
-rw-r--r-- | doc/pcrecompat.3 | 9 |
-rw-r--r-- | doc/pcrepattern.3 | 31 |
2 files changed, 23 insertions, 17 deletions
diff --git a/doc/pcrecompat.3 b/doc/pcrecompat.3
index 6c1adda..3ebaac0 100644
--- a/doc/pcrecompat.3
+++ b/doc/pcrecompat.3
@@ -6,7 +6,7 @@ PCRE - Perl-compatible regular expressions
 .sp
 This document describes the differences in the ways that PCRE and Perl handle
 regular expressions. The differences described here are with respect to Perl
-5.10/5.11.
+versions 5.10 and above.
 .P
 1. PCRE has only a subset of Perl's UTF-8 and Unicode support. Details of what
 it does have are given in the
@@ -103,7 +103,10 @@ would not be possible to distinguish which parentheses matched, because both
 names map to capturing subpattern number 1. To avoid this confusing situation,
 an error is given at compile time.
 .P
-12. PCRE provides some extensions to the Perl regular expression facilities.
+12. Perl recognizes comments in some places that PCRE doesn't, for example,
+between the ( and ? at the start of a subpattern.
+.P
+13. PCRE provides some extensions to the Perl regular expression facilities.
 Perl 5.10 includes new features that are not in earlier versions of Perl, some
 of which (such as named parentheses) have been in PCRE for some time. This list
 is with respect to Perl 5.10:
@@ -160,6 +163,6 @@ Cambridge CB2 3QH, England.
 .rs
 .sp
 .nf
-Last updated: 12 May 2010
+Last updated: 31 October 2010
 Copyright (c) 1997-2010 University of Cambridge.
 .fi
diff --git a/doc/pcrepattern.3 b/doc/pcrepattern.3
index ecbcc9f..fe332f0 100644
--- a/doc/pcrepattern.3
+++ b/doc/pcrepattern.3
@@ -2111,23 +2111,26 @@ dd-aaa-dd or dd-dd-dd, where aaa are letters and dd are digits.
 .SH COMMENTS
 .rs
 .sp
+There are two ways of including comments in patterns that are processed by
+PCRE. In both cases, the start of the comment must not be in a character class,
+nor in the middle of any other sequence of related characters such as (?: or a
+subpattern name or number. The characters that make up a comment play no part
+in the pattern matching.
+.P
 The sequence (?# marks the start of a comment that continues up to the next
-closing parenthesis. Nested parentheses are not permitted. The characters
-that make up a comment play no part in the pattern matching at all.
-.P
-If the PCRE_EXTENDED option is set, an unescaped # character outside a
-character class introduces a comment that continues to immediately after the
-next newline character or character sequence in the pattern. Which characters
-are interpreted as newlines is controlled by the options passed to
-\fBpcre_compile()\fP or by a special sequence at the start of the pattern, as
-described in the section entitled
+closing parenthesis. Nested parentheses are not permitted. If the PCRE_EXTENDED
+option is set, an unescaped # character also introduces a comment, which in
+this case continues to immediately after the next newline character or
+character sequence in the pattern. Which characters are interpreted as newlines
+is controlled by the options passed to \fBpcre_compile()\fP or by a special
+sequence at the start of the pattern, as described in the section entitled
 .\" HTML <a href="#recursion">
 .\" </a>
 "Newline conventions"
 .\"
-above. Note that end of a comment is a literal newline sequence in the pattern;
-escape sequences that happen to represent a newline do not terminate a comment.
-For example, consider this pattern when PCRE_EXTENDED is set, and the default
+above. Note that end of this type of comment is a literal newline sequence in
+the pattern; escape sequences that happen to represent a newline do not count.
+For example, consider this pattern when PCRE_EXTENDED is set, and the default
 newline convention is in force:
 .sp
   abc #comment \en still comment
@@ -2135,7 +2138,7 @@ newline convention is in force:
 On encountering the # character, \fBpcre_compile()\fP skips along, looking for
 a newline in the pattern. The sequence \en is still literal at this stage, so
 it does not terminate the comment. Only an actual character with the code value
-0x0a does so.
+0x0a (the default newline) does so.
 .
 .
 .\" HTML <a name="recursion"></a>
@@ -2711,6 +2714,6 @@ Cambridge CB2 3QH, England.
 .rs
 .sp
 .nf
-Last updated: 26 October 2010
+Last updated: 31 October 2010
 Copyright (c) 1997-2010 University of Cambridge.
 .fi
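
The comment rules described in the patched COMMENTS text can be tried out with a rough sketch. This one uses Python's built-in re module as a stand-in, not PCRE itself: re.VERBOSE is assumed here to play the role of PCRE_EXTENDED, and Python only treats a 0x0a character as ending a # comment, which corresponds to the default newline convention mentioned above. The variable names and the matches() helper are made up for illustration.

```python
import re

# NOTE: this is Python's re module, used as an analogy for the PCRE
# behaviour described in the documentation; it is not PCRE.

# A (?#...) comment needs no option and runs to the next closing parenthesis.
inline = re.compile(r"abc(?#this text is ignored)def")

# Raw string: the pattern holds a backslash followed by 'n', not a newline,
# so the # comment runs on to the end of the pattern.
escaped = re.compile(r"abc #comment \n still comment", re.VERBOSE)

# Ordinary string: the pattern holds a real 0x0a character, which ends the
# comment. The text after it is pattern text again (verbose mode also
# ignores the spaces in it).
literal = re.compile("abc #comment \n still comment", re.VERBOSE)

# Inside a character class, # does not start a comment.
in_class = re.compile(r"[#]x", re.VERBOSE)


def matches(pattern, text):
    """Return True if the whole of text is matched by pattern."""
    return pattern.fullmatch(text) is not None


if __name__ == "__main__":
    for name, pattern, text in [
        ("inline", inline, "abcdef"),
        ("escaped", escaped, "abc"),
        ("literal", literal, "abcstillcomment"),
        ("in_class", in_class, "#x"),
    ]:
        print(name, matches(pattern, text))
```

The escaped and literal cases differ only in whether the pattern text contains the two characters backslash and n or a single newline character, which is the distinction the documentation change spells out.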