author | ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15> | 2014-01-03 15:15:00 +0000 |
---|---|---|
committer | ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15> | 2014-01-03 15:15:00 +0000 |
commit | 07655468f70a257382c954ee9a12810f2418310f (patch) | |
tree | 8436579dfd665e39f105ff1a8eaf95b3cf40d074 /doc | |
parent | 7a1b87172d72044111aaf64400b531323899a766 (diff) | |
Reword pcretest messages and clarify "first char" meaning.
git-svn-id: svn://vcs.exim.org/pcre/code/trunk@1433 2f5784b3-3f2a-0410-8824-cb99058d5e15
Diffstat (limited to 'doc')
-rw-r--r-- | doc/pcreapi.3 | 84 |
-rw-r--r-- | doc/pcretest.1 | 11 |
2 files changed, 48 insertions, 47 deletions
diff --git a/doc/pcreapi.3 b/doc/pcreapi.3
index ebbd20f..0404939 100644
--- a/doc/pcreapi.3
+++ b/doc/pcreapi.3
@@ -1,4 +1,4 @@
-.TH PCREAPI 3 "12 November 2013" "PCRE 8.34"
+.TH PCREAPI 3 "03 January 2014" "PCRE 8.35"
 .SH NAME
 PCRE - Perl-compatible regular expressions
 .sp
@@ -1248,12 +1248,15 @@ information call is provided for internal use by the \fBpcre_study()\fP
 function. External callers can cause PCRE to use its internal tables by passing
 a NULL table pointer.
 .sp
-  PCRE_INFO_FIRSTBYTE
+  PCRE_INFO_FIRSTBYTE (deprecated)
 .sp
 Return information about the first data unit of any matched string, for a
-non-anchored pattern. (The name of this option refers to the 8-bit library,
-where data units are bytes.) The fourth argument should point to an \fBint\fP
-variable.
+non-anchored pattern. The name of this option refers to the 8-bit library,
+where data units are bytes. The fourth argument should point to an \fBint\fP
+variable. Negative values are used for special cases. However, this means that
+when the 32-bit library is in non-UTF-32 mode, the full 32-bit range of
+characters cannot be returned. For this reason, this value is deprecated; use
+PCRE_INFO_FIRSTCHARACTERFLAGS and PCRE_INFO_FIRSTCHARACTER instead.
 .P
 If there is a fixed first value, for example, the letter "c" from a pattern
 such as (cat|cow|coyote), its value is returned. In the 8-bit library, the
@@ -1271,11 +1274,38 @@ starts with "^", or
 -1 is returned, indicating that the pattern matches only at the start of a
 subject string or after any newline within the string. Otherwise -2 is
 returned. For anchored patterns, -2 is returned.
+.sp
+  PCRE_INFO_FIRSTCHARACTER
+.sp
+Return the value of the first data unit (non-UTF character) of any matched
+string in the situation where PCRE_INFO_FIRSTCHARACTERFLAGS returns 1;
+otherwise return 0. The fourth argument should point to an \fBuint_t\fP
+variable.
 .P
-Since for the 32-bit library using the non-UTF-32 mode, this function is unable
-to return the full 32-bit range of the character, this value is deprecated;
-instead the PCRE_INFO_FIRSTCHARACTERFLAGS and PCRE_INFO_FIRSTCHARACTER values
-should be used.
+In the 8-bit library, the value is always less than 256. In the 16-bit library
+the value can be up to 0xffff. In the 32-bit library in UTF-32 mode the value
+can be up to 0x10ffff, and up to 0xffffffff when not using UTF-32 mode.
+.sp
+  PCRE_INFO_FIRSTCHARACTERFLAGS
+.sp
+Return information about the first data unit of any matched string, for a
+non-anchored pattern. The fourth argument should point to an \fBint\fP
+variable.
+.P
+If there is a fixed first value, for example, the letter "c" from a pattern
+such as (cat|cow|coyote), 1 is returned, and the character value can be
+retrieved using PCRE_INFO_FIRSTCHARACTER. If there is no fixed first value, and
+if either
+.sp
+(a) the pattern was compiled with the PCRE_MULTILINE option, and every branch
+starts with "^", or
+.sp
+(b) every branch of the pattern starts with ".*" and PCRE_DOTALL is not set
+(if it were set, the pattern would be anchored),
+.sp
+2 is returned, indicating that the pattern matches only at the start of a
+subject string or after any newline within the string. Otherwise 0 is
+returned. For anchored patterns, 0 is returned.
 .sp
   PCRE_INFO_FIRSTTABLE
 .sp
@@ -1499,38 +1529,6 @@ is made available via this option so that it can be saved and restored (see the
 .\"
 documentation for details).
 .sp
-  PCRE_INFO_FIRSTCHARACTERFLAGS
-.sp
-Return information about the first data unit of any matched string, for a
-non-anchored pattern. The fourth argument should point to an \fBint\fP
-variable.
-.P
-If there is a fixed first value, for example, the letter "c" from a pattern
-such as (cat|cow|coyote), 1 is returned, and the character value can be
-retrieved using PCRE_INFO_FIRSTCHARACTER.
-.P
-If there is no fixed first value, and if either
-.sp
-(a) the pattern was compiled with the PCRE_MULTILINE option, and every branch
-starts with "^", or
-.sp
-(b) every branch of the pattern starts with ".*" and PCRE_DOTALL is not set
-(if it were set, the pattern would be anchored),
-.sp
-2 is returned, indicating that the pattern matches only at the start of a
-subject string or after any newline within the string. Otherwise 0 is
-returned. For anchored patterns, 0 is returned.
-.sp
-  PCRE_INFO_FIRSTCHARACTER
-.sp
-Return the fixed first character value in the situation where
-PCRE_INFO_FIRSTCHARACTERFLAGS returns 1; otherwise return 0. The fourth
-argument should point to an \fBuint_t\fP variable.
-.P
-In the 8-bit library, the value is always less than 256. In the 16-bit library
-the value can be up to 0xffff. In the 32-bit library in UTF-32 mode the value
-can be up to 0x10ffff, and up to 0xffffffff when not using UTF-32 mode.
-.sp
   PCRE_INFO_REQUIREDCHARFLAGS
 .sp
 Returns 1 if there is a rightmost literal data unit that must exist in any
@@ -2900,6 +2898,6 @@ Cambridge CB2 3QH, England.
 .rs
 .sp
 .nf
-Last updated: 12 November 2013
-Copyright (c) 1997-2013 University of Cambridge.
+Last updated: 03 January 2014
+Copyright (c) 1997-2014 University of Cambridge.
 .fi
diff --git a/doc/pcretest.1 b/doc/pcretest.1
index f17c6f2..5a8ec58 100644
--- a/doc/pcretest.1
+++ b/doc/pcretest.1
@@ -1,4 +1,4 @@
-.TH PCRETEST 1 "12 November 2013" "PCRE 8.34"
+.TH PCRETEST 1 "03 January 2014" "PCRE 8.35"
 .SH NAME
 pcretest - a program for testing Perl-compatible regular expressions.
 .SH SYNOPSIS
@@ -483,7 +483,10 @@ below.
 The \fB/I\fP modifier requests that \fBpcretest\fP output information about
 the compiled pattern (whether it is anchored, has a fixed first character, and
 so on). It does this by calling \fBpcre[16|32]_fullinfo()\fP after compiling a
-pattern. If the pattern is studied, the results of that are also output.
+pattern. If the pattern is studied, the results of that are also output. In
+this output, the word "char" means a non-UTF character, that is, the value of a
+single data item (8-bit, 16-bit, or 32-bit, depending on the library that is
+being tested).
 .P
 The \fB/K\fP modifier requests \fBpcretest\fP to show names from backtracking
 control verbs that are returned from calls to \fBpcre[16|32]_exec()\fP. It causes
@@ -1135,6 +1138,6 @@ Cambridge CB2 3QH, England.
 .rs
 .sp
 .nf
-Last updated: 12 November 2013
-Copyright (c) 1997-2013 University of Cambridge.
+Last updated: 03 January 2014
+Copyright (c) 1997-2014 University of Cambridge.
 .fi
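Reviewer note: the pcreapi.3 text in this patch describes two encodings of the same "first data unit" information. The deprecated PCRE_INFO_FIRSTBYTE packs everything into one int (the value itself, -1, or -2), while the replacement pair reports a flag (1, 2, or 0) plus a separate unsigned value. The C sketch below only illustrates that mapping and the documented value ranges. It does not call PCRE, and the names `first_char_flags`, `legacy_first_byte`, and `max_first_char` are hypothetical helpers invented for this note, not part of the library.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical labels for the three PCRE_INFO_FIRSTCHARACTERFLAGS results
   described in the patch; these names are not PCRE identifiers. */
enum first_char_flags {
  FIRST_NONE = 0,       /* no fixed first data unit, or anchored pattern */
  FIRST_FIXED = 1,      /* fixed first data unit; read PCRE_INFO_FIRSTCHARACTER */
  FIRST_LINE_START = 2  /* matches only at subject start or after a newline */
};

/* Hypothetical helper: map a (flags, value) pair onto the single int that
   the deprecated PCRE_INFO_FIRSTBYTE is documented to use (the value itself,
   -1, or -2). Returns 1 on success and 0 when the value does not fit in a
   non-negative int, which is the case the deprecation note is about. */
int legacy_first_byte(int flags, uint32_t value, int *legacy)
{
  assert(flags >= FIRST_NONE && flags <= FIRST_LINE_START);
  assert(legacy != NULL);

  if (flags == FIRST_LINE_START) { *legacy = -1; return 1; }
  if (flags == FIRST_NONE)       { *legacy = -2; return 1; }

  /* FIRST_FIXED: negative ints are reserved for the special cases above,
     so only values up to INT_MAX can be represented. */
  if (value > (uint32_t)INT_MAX) return 0;
  *legacy = (int)value;
  return 1;
}

/* Hypothetical helper: largest first-character value the patch documents
   for each library width. Returns 0 for an unknown width. */
uint32_t max_first_char(int width_bits, int utf)
{
  switch (width_bits) {
    case 8:  return 0xffu;                          /* always less than 256 */
    case 16: return 0xffffu;
    case 32: return utf ? 0x10ffffu : 0xffffffffu;  /* UTF-32 vs non-UTF-32 */
    default: return 0;
  }
}
```

With these helpers, a fixed first unit of "c" maps to the legacy value 'c', the two special cases map to -1 and -2, and a non-UTF 32-bit value such as 0xffffffff is reported as unrepresentable, which is why the patch marks PCRE_INFO_FIRSTBYTE as deprecated.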