author | ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15> | 2007-06-19 13:39:46 +0000 |
---|---|---|
committer | ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15> | 2007-06-19 13:39:46 +0000 |
commit | 09f910f7ee8eb276aa67b38764b967506d04d3e5 (patch) | |
tree | e3b6252b42782e3883415317482d1a68fa8851b2 | |
parent | 5369bcdf22a5d865be5c920e0642012eb09d2cfe (diff) | |
download | pcre-09f910f7ee8eb276aa67b38764b967506d04d3e5.tar.gz |
Documentation final tidies for 7.2 release.
git-svn-id: svn://vcs.exim.org/pcre/code/trunk@185 2f5784b3-3f2a-0410-8824-cb99058d5e15
-rw-r--r-- | ChangeLog | 4 |
-rw-r--r-- | doc/html/pcrepattern.html | 11 |
-rw-r--r-- | doc/pcre.txt | 12 |
-rw-r--r-- | doc/pcrepattern.3 | 2 |
4 files changed, 18 insertions, 11 deletions
@@ -68,9 +68,9 @@ Version 7.2 19-June-07
 pcrecpp::RE("a*").FullMatch("aaa") matches, while
 pcrecpp::RE("a*?").FullMatch("aaa") does not, and
 pcrecpp::RE("a*?\\z").FullMatch("aaa") does again.
-
+
 12. If \p or \P was used in non-UTF-8 mode on a character greater than 127
-it matched the wrong number of bytes.
+it matched the wrong number of bytes.
 
 
 Version 7.1 24-Apr-07
diff --git a/doc/html/pcrepattern.html b/doc/html/pcrepattern.html
index 8d603a1..a5ce66d 100644
--- a/doc/html/pcrepattern.html
+++ b/doc/html/pcrepattern.html
@@ -384,8 +384,10 @@ Unicode character properties
 </b><br>
 <P>
 When PCRE is built with Unicode character property support, three additional
-escape sequences to match character properties are available when UTF-8 mode
-is selected. They are:
+escape sequences that match characters with specific properties are available.
+When not in UTF-8 mode, these sequences are of course limited to testing
+characters whose codepoints are less than 256, but they do work in this mode.
+The extra escape sequences are:
 <pre>
 \p{<i>xx</i>} a character with the <i>xx</i> property
 \P{<i>xx</i>} a character without the <i>xx</i> property
@@ -566,7 +568,8 @@ or more characters with the "mark" property, and treats the sequence as an
 atomic group
 <a href="#atomicgroup">(see below).</a>
 Characters with the "mark" property are typically accents that affect the
-preceding character.
+preceding character. None of them have codepoints less than 256, so in
+non-UTF-8 mode \X matches any one character.
 </P>
 <P>
 Matching characters by Unicode property is not fast, because PCRE has to search
@@ -1987,7 +1990,7 @@ Cambridge CB2 3QH, England.
 </P>
 <br><a name="SEC25" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 13 June 2007
+Last updated: 19 June 2007
 <br>
 Copyright © 1997-2007 University of Cambridge.
 <br>
diff --git a/doc/pcre.txt b/doc/pcre.txt
index e55cf01..823f15c 100644
--- a/doc/pcre.txt
+++ b/doc/pcre.txt
@@ -3047,8 +3047,10 @@ BACKSLASH
 Unicode character properties
 
 When PCRE is built with Unicode character property support, three addi-
-tional escape sequences to match character properties are available
-when UTF-8 mode is selected. They are:
+tional escape sequences that match characters with specific properties
+are available. When not in UTF-8 mode, these sequences are of course
+limited to testing characters whose codepoints are less than 256, but
+they do work in this mode. The extra escape sequences are:
 
 \p{xx} a character with the xx property
 \P{xx} a character without the xx property
@@ -3162,7 +3164,9 @@ BACKSLASH
 That is, it matches a character without the "mark" property, followed
 by zero or more characters with the "mark" property, and treats the
 sequence as an atomic group (see below). Characters with the "mark"
-property are typically accents that affect the preceding character.
+property are typically accents that affect the preceding character.
+None of them have codepoints less than 256, so in non-UTF-8 mode \X
+matches any one character.
 
 Matching characters by Unicode property is not fast, because PCRE has
 to search a structure that contains data for over fifteen thousand
@@ -4539,7 +4543,7 @@ AUTHOR
 
 REVISION
 
-Last updated: 13 June 2007
+Last updated: 19 June 2007
 Copyright (c) 1997-2007 University of Cambridge.
 ------------------------------------------------------------------------------
 
diff --git a/doc/pcrepattern.3 b/doc/pcrepattern.3
index 0b79c42..4b7a909 100644
--- a/doc/pcrepattern.3
+++ b/doc/pcrepattern.3
@@ -555,7 +555,7 @@ atomic group
 (see below).
 .\"
 Characters with the "mark" property are typically accents that affect the
-preceding character. None of them have codepoints less than 256, so in
+preceding character. None of them have codepoints less than 256, so in
 non-UTF-8 mode \eX matches any one character.
 .P
 Matching characters by Unicode property is not fast, because PCRE has to search
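
The documentation text added in this commit describes \X as a character without the "mark" property followed by zero or more characters with it, and states that no mark character has a codepoint below 256. The Python sketch below is not PCRE code. It assumes Python's unicodedata module as a stand-in for PCRE's property tables (Python ships a newer Unicode version than PCRE 7.2 did), and the helper names is_mark and match_extended_sequence are hypothetical, chosen only for illustration.

```python
import unicodedata


def is_mark(ch):
    # General categories Mn, Mc and Me together make up the "mark" property.
    return unicodedata.category(ch).startswith("M")


def match_extended_sequence(text, pos=0):
    # Emulates the documented behaviour of \X at position pos:
    # one non-mark character, then as many mark characters as follow.
    # Returns the end index of the match, or None if there is no match.
    if pos >= len(text) or is_mark(text[pos]):
        return None
    end = pos + 1
    while end < len(text) and is_mark(text[end]):
        end += 1
    return end


def any_mark_below_256():
    # Checks the claim that no mark character has a codepoint below 256,
    # according to the Unicode data bundled with this Python build.
    return any(is_mark(chr(cp)) for cp in range(256))


if __name__ == "__main__":
    # "e" followed by U+0301 COMBINING ACUTE ACCENT, then "x".
    sample = "e\u0301x"
    print(match_extended_sequence(sample, 0))
    print(match_extended_sequence(sample, 2))
    print(any_mark_below_256())
```

Because the loop over marks never runs for input restricted to codepoints below 256, the helper always consumes exactly one character there, which is the behaviour the new text documents for \X in non-UTF-8 mode.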