author ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15> 2010-10-31 14:06:43 +0000
committer ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15> 2010-10-31 14:06:43 +0000
commit 6b21094f294885a4e1d4d255ceeecea503e56e17
tree c40748d20f0d00854db8a04a5537525e3996f2e6
parent 70ccf43822dbf6bc1c6fa18746192fb9c6ebdfed
Clarify documentation about comments in patterns.
git-svn-id: svn://vcs.exim.org/pcre/code/trunk@562 2f5784b3-3f2a-0410-8824-cb99058d5e15
-rw-r--r-- doc/pcrecompat.3 9
-rw-r--r-- doc/pcrepattern.3 31
2 files changed, 23 insertions, 17 deletions
diff --git a/doc/pcrecompat.3 b/doc/pcrecompat.3
index 6c1adda..3ebaac0 100644
--- a/doc/pcrecompat.3
+++ b/doc/pcrecompat.3
@@ -6,7 +6,7 @@ PCRE - Perl-compatible regular expressions
.sp
This document describes the differences in the ways that PCRE and Perl handle
regular expressions. The differences described here are with respect to Perl
-5.10/5.11.
+versions 5.10 and above.
.P
1. PCRE has only a subset of Perl's UTF-8 and Unicode support. Details of what
it does have are given in the
@@ -103,7 +103,10 @@ would not be possible to distinguish which parentheses matched, because both
names map to capturing subpattern number 1. To avoid this confusing situation,
an error is given at compile time.
.P
-12. PCRE provides some extensions to the Perl regular expression facilities.
+12. Perl recognizes comments in some places that PCRE doesn't, for example,
+between the ( and ? at the start of a subpattern.
+.P
+13. PCRE provides some extensions to the Perl regular expression facilities.
Perl 5.10 includes new features that are not in earlier versions of Perl, some
of which (such as named parentheses) have been in PCRE for some time. This list
is with respect to Perl 5.10:
@@ -160,6 +163,6 @@ Cambridge CB2 3QH, England.
.rs
.sp
.nf
-Last updated: 12 May 2010
+Last updated: 31 October 2010
Copyright (c) 1997-2010 University of Cambridge.
.fi
diff --git a/doc/pcrepattern.3 b/doc/pcrepattern.3
index ecbcc9f..fe332f0 100644
--- a/doc/pcrepattern.3
+++ b/doc/pcrepattern.3
@@ -2111,23 +2111,26 @@ dd-aaa-dd or dd-dd-dd, where aaa are letters and dd are digits.
.SH COMMENTS
.rs
.sp
+There are two ways of including comments in patterns that are processed by
+PCRE. In both cases, the start of the comment must not be in a character class,
+nor in the middle of any other sequence of related characters such as (?: or a
+subpattern name or number. The characters that make up a comment play no part
+in the pattern matching.
+.P
The sequence (?# marks the start of a comment that continues up to the next
-closing parenthesis. Nested parentheses are not permitted. The characters
-that make up a comment play no part in the pattern matching at all.
-.P
-If the PCRE_EXTENDED option is set, an unescaped # character outside a
-character class introduces a comment that continues to immediately after the
-next newline character or character sequence in the pattern. Which characters
-are interpreted as newlines is controlled by the options passed to
-\fBpcre_compile()\fP or by a special sequence at the start of the pattern, as
-described in the section entitled
+closing parenthesis. Nested parentheses are not permitted. If the PCRE_EXTENDED
+option is set, an unescaped # character also introduces a comment, which in
+this case continues to immediately after the next newline character or
+character sequence in the pattern. Which characters are interpreted as newlines
+is controlled by the options passed to \fBpcre_compile()\fP or by a special
+sequence at the start of the pattern, as described in the section entitled
.\" HTML <a href="#recursion">
.\" </a>
"Newline conventions"
.\"
-above. Note that end of a comment is a literal newline sequence in the pattern;
-escape sequences that happen to represent a newline do not terminate a comment.
-For example, consider this pattern when PCRE_EXTENDED is set, and the default
+above. Note that end of this type of comment is a literal newline sequence in
+the pattern; escape sequences that happen to represent a newline do not count.
+For example, consider this pattern when PCRE_EXTENDED is set, and the default
newline convention is in force:
.sp
abc #comment \en still comment
@@ -2135,7 +2138,7 @@ newline convention is in force:
On encountering the # character, \fBpcre_compile()\fP skips along, looking for
a newline in the pattern. The sequence \en is still literal at this stage, so
it does not terminate the comment. Only an actual character with the code value
-0x0a does so.
+0x0a (the default newline) does so.
.
.
.\" HTML <a name="recursion"></a>
@@ -2711,6 +2714,6 @@ Cambridge CB2 3QH, England.
.rs
.sp
.nf
-Last updated: 26 October 2010
+Last updated: 31 October 2010
Copyright (c) 1997-2010 University of Cambridge.
.fi
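
The comment rule documented in the pcrepattern.3 hunk above (an escape sequence such as \en does not end a # comment, only a literal newline character does) can be tried out with a short sketch. This is an analogy only: it uses Python's standard re module with re.VERBOSE, not PCRE or pcre_compile(), on the assumption that Python's extended mode treats # comments the same way for this case. The names escaped, literal, inline and matches are illustrative and do not come from the PCRE sources.

```python
import re

# Raw string: the pattern holds a backslash followed by 'n' (two characters),
# so the '#' comment runs to the end of the pattern and only "abc" remains.
escaped = re.compile(r"abc #comment \n still comment", re.VERBOSE)

# Non-raw string: the pattern holds a real 0x0a character, which ends the
# comment. "still comment" is then part of the pattern (whitespace is ignored
# in extended mode).
literal = re.compile("abc #comment \n still comment", re.VERBOSE)

# The (?# ... ) form needs no option and ends at the next closing parenthesis.
inline = re.compile(r"ab(?#a comment)c")


def matches(regex, subject):
    """Return True if the whole subject string matches the compiled regex."""
    return regex.fullmatch(subject) is not None


if __name__ == "__main__":
    print(matches(escaped, "abc"))
    print(matches(literal, "abcstillcomment"))
    print(matches(inline, "abc"))
```

The raw-string versus non-raw-string distinction is what stands in for "escape sequence" versus "actual character with the code value 0x0a" in the man page text. Behaviour under PCRE itself should be checked against PCRE, for example with pcretest.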