author | ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15> | 2014-01-03 15:15:00 +0000 |
---|---|---|
committer | ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15> | 2014-01-03 15:15:00 +0000 |
commit | 07655468f70a257382c954ee9a12810f2418310f (patch) | |
tree | 8436579dfd665e39f105ff1a8eaf95b3cf40d074 | |
parent | 7a1b87172d72044111aaf64400b531323899a766 (diff) | |
download | pcre-07655468f70a257382c954ee9a12810f2418310f.tar.gz |
Reword pcretest messages and clarify "first char" meaning.
git-svn-id: svn://vcs.exim.org/pcre/code/trunk@1433 2f5784b3-3f2a-0410-8824-cb99058d5e15
-rw-r--r-- | ChangeLog | 6 |
-rw-r--r-- | doc/pcreapi.3 | 84 |
-rw-r--r-- | doc/pcretest.1 | 11 |
-rw-r--r-- | pcretest.c | 4 |
-rw-r--r-- | testdata/testoutput12 | 8 |
-rw-r--r-- | testdata/testoutput13 | 2 |
-rw-r--r-- | testdata/testoutput14 | 12 |
-rw-r--r-- | testdata/testoutput15 | 36 |
-rw-r--r-- | testdata/testoutput16 | 8 |
-rw-r--r-- | testdata/testoutput17 | 16 |
-rw-r--r-- | testdata/testoutput18-16 | 38 |
-rw-r--r-- | testdata/testoutput18-32 | 38 |
-rw-r--r-- | testdata/testoutput19 | 2 |
-rw-r--r-- | testdata/testoutput2 | 168 |
-rw-r--r-- | testdata/testoutput21-16 | 4 |
-rw-r--r-- | testdata/testoutput21-32 | 4 |
-rw-r--r-- | testdata/testoutput22-16 | 4 |
-rw-r--r-- | testdata/testoutput22-32 | 4 |
-rw-r--r-- | testdata/testoutput23 | 4 |
-rw-r--r-- | testdata/testoutput25 | 4 |
-rw-r--r-- | testdata/testoutput3 | 4 |
-rw-r--r-- | testdata/testoutput5 | 8 |
-rw-r--r-- | testdata/testoutput8 | 2 |
23 files changed, 239 insertions, 232 deletions
@@ -40,6 +40,12 @@ Version 8.35-RC1 xx-xxxx-201x
 8.  Add missing (new) files sljitNativeTILEGX.c and sljitNativeTILEGX-encoder.c
     to the export list in Makefile.am (they were accidentally omitted from the
     8.34 tarball).
+
+9.  The informational output from pcretest used the phrase "starting byte set"
+    which is inappropriate for the 16-bit and 32-bit libraries. As the output
+    for "first char" and "need char" really means "non-UTF-char", I've changed
+    "byte" to "char", and slightly reworded the output. The documentation about
+    these values has also been (I hope) clarified.
 
 
 Version 8.34 15-December-2013
diff --git a/doc/pcreapi.3 b/doc/pcreapi.3
index ebbd20f..0404939 100644
--- a/doc/pcreapi.3
+++ b/doc/pcreapi.3
@@ -1,4 +1,4 @@
-.TH PCREAPI 3 "12 November 2013" "PCRE 8.34"
+.TH PCREAPI 3 "03 January 2014" "PCRE 8.35"
 .SH NAME
 PCRE - Perl-compatible regular expressions
 .sp
@@ -1248,12 +1248,15 @@ information call is provided for internal use by the \fBpcre_study()\fP
 function. External callers can cause PCRE to use its internal tables by passing
 a NULL table pointer.
 .sp
-  PCRE_INFO_FIRSTBYTE
+  PCRE_INFO_FIRSTBYTE (deprecated)
 .sp
 Return information about the first data unit of any matched string, for a
-non-anchored pattern. (The name of this option refers to the 8-bit library,
-where data units are bytes.) The fourth argument should point to an \fBint\fP
-variable.
+non-anchored pattern. The name of this option refers to the 8-bit library,
+where data units are bytes. The fourth argument should point to an \fBint\fP
+variable. Negative values are used for special cases. However, this means that
+when the 32-bit library is in non-UTF-32 mode, the full 32-bit range of
+characters cannot be returned. For this reason, this value is deprecated; use
+PCRE_INFO_FIRSTCHARACTERFLAGS and PCRE_INFO_FIRSTCHARACTER instead.
 .P
 If there is a fixed first value, for example, the letter "c" from a pattern
 such as (cat|cow|coyote), its value is returned. In the 8-bit library, the
@@ -1271,11 +1274,38 @@ starts with "^", or
 -1 is returned, indicating that the pattern matches only at the start of a
 subject string or after any newline within the string. Otherwise -2 is
 returned. For anchored patterns, -2 is returned.
+.sp
+  PCRE_INFO_FIRSTCHARACTER
+.sp
+Return the value of the first data unit (non-UTF character) of any matched
+string in the situation where PCRE_INFO_FIRSTCHARACTERFLAGS returns 1;
+otherwise return 0. The fourth argument should point to an \fBuint_t\fP
+variable.
 .P
-Since for the 32-bit library using the non-UTF-32 mode, this function is unable
-to return the full 32-bit range of the character, this value is deprecated;
-instead the PCRE_INFO_FIRSTCHARACTERFLAGS and PCRE_INFO_FIRSTCHARACTER values
-should be used.
+In the 8-bit library, the value is always less than 256. In the 16-bit library
+the value can be up to 0xffff. In the 32-bit library in UTF-32 mode the value
+can be up to 0x10ffff, and up to 0xffffffff when not using UTF-32 mode.
+.sp
+  PCRE_INFO_FIRSTCHARACTERFLAGS
+.sp
+Return information about the first data unit of any matched string, for a
+non-anchored pattern. The fourth argument should point to an \fBint\fP
+variable.
+.P
+If there is a fixed first value, for example, the letter "c" from a pattern
+such as (cat|cow|coyote), 1 is returned, and the character value can be
+retrieved using PCRE_INFO_FIRSTCHARACTER. If there is no fixed first value, and
+if either
+.sp
+(a) the pattern was compiled with the PCRE_MULTILINE option, and every branch
+starts with "^", or
+.sp
+(b) every branch of the pattern starts with ".*" and PCRE_DOTALL is not set
+(if it were set, the pattern would be anchored),
+.sp
+2 is returned, indicating that the pattern matches only at the start of a
+subject string or after any newline within the string. Otherwise 0 is
+returned. For anchored patterns, 0 is returned.
 .sp
   PCRE_INFO_FIRSTTABLE
 .sp
@@ -1499,38 +1529,6 @@ is made available via this option so that it can be saved and restored (see the
 .\"
 documentation for details).
 .sp
-  PCRE_INFO_FIRSTCHARACTERFLAGS
-.sp
-Return information about the first data unit of any matched string, for a
-non-anchored pattern. The fourth argument should point to an \fBint\fP
-variable.
-.P
-If there is a fixed first value, for example, the letter "c" from a pattern
-such as (cat|cow|coyote), 1 is returned, and the character value can be
-retrieved using PCRE_INFO_FIRSTCHARACTER.
-.P
-If there is no fixed first value, and if either
-.sp
-(a) the pattern was compiled with the PCRE_MULTILINE option, and every branch
-starts with "^", or
-.sp
-(b) every branch of the pattern starts with ".*" and PCRE_DOTALL is not set
-(if it were set, the pattern would be anchored),
-.sp
-2 is returned, indicating that the pattern matches only at the start of a
-subject string or after any newline within the string. Otherwise 0 is
-returned. For anchored patterns, 0 is returned.
-.sp
-  PCRE_INFO_FIRSTCHARACTER
-.sp
-Return the fixed first character value in the situation where
-PCRE_INFO_FIRSTCHARACTERFLAGS returns 1; otherwise return 0. The fourth
-argument should point to an \fBuint_t\fP variable.
-.P
-In the 8-bit library, the value is always less than 256. In the 16-bit library
-the value can be up to 0xffff. In the 32-bit library in UTF-32 mode the value
-can be up to 0x10ffff, and up to 0xffffffff when not using UTF-32 mode.
-.sp
   PCRE_INFO_REQUIREDCHARFLAGS
 .sp
 Returns 1 if there is a rightmost literal data unit that must exist in any
@@ -2900,6 +2898,6 @@ Cambridge CB2 3QH, England.
 .rs
 .sp
 .nf
-Last updated: 12 November 2013
-Copyright (c) 1997-2013 University of Cambridge.
+Last updated: 03 January 2014
+Copyright (c) 1997-2014 University of Cambridge.
 .fi
diff --git a/doc/pcretest.1 b/doc/pcretest.1
index f17c6f2..5a8ec58 100644
--- a/doc/pcretest.1
+++ b/doc/pcretest.1
@@ -1,4 +1,4 @@
-.TH PCRETEST 1 "12 November 2013" "PCRE 8.34"
+.TH PCRETEST 1 "03 January 2014" "PCRE 8.35"
 .SH NAME
 pcretest - a program for testing Perl-compatible regular expressions.
 .SH SYNOPSIS
@@ -483,7 +483,10 @@ below.
 The \fB/I\fP modifier requests that \fBpcretest\fP output information about the
 compiled pattern (whether it is anchored, has a fixed first character, and so
 on). It does this by calling \fBpcre[16|32]_fullinfo()\fP after compiling a
-pattern. If the pattern is studied, the results of that are also output.
+pattern. If the pattern is studied, the results of that are also output. In
+this output, the word "char" means a non-UTF character, that is, the value of a
+single data item (8-bit, 16-bit, or 32-bit, depending on the library that is
+being tested).
 .P
 The \fB/K\fP modifier requests \fBpcretest\fP to show names from backtracking
 control verbs that are returned from calls to \fBpcre[16|32]_exec()\fP. It causes
@@ -1135,6 +1138,6 @@ Cambridge CB2 3QH, England.
 .rs
 .sp
 .nf
-Last updated: 12 November 2013
-Copyright (c) 1997-2013 University of Cambridge.
+Last updated: 03 January 2014
+Copyright (c) 1997-2014 University of Cambridge.
 .fi
.fi @@ -4282,12 +4282,12 @@ while (!done) if (new_info(re, extra, PCRE_INFO_FIRSTTABLE, &start_bits) == 0) { if (start_bits == NULL) - fprintf(outfile, "No set of starting bytes\n"); + fprintf(outfile, "No starting char list\n"); else { int i; int c = 24; - fprintf(outfile, "Starting byte set: "); + fprintf(outfile, "Starting chars: "); for (i = 0; i < 256; i++) { if ((start_bits[i/8] & (1<<(i&7))) != 0) diff --git a/testdata/testoutput12 b/testdata/testoutput12 index a76e2ae..67ad2c8 100644 --- a/testdata/testoutput12 +++ b/testdata/testoutput12 @@ -8,7 +8,7 @@ No options First char = 'a' Need char = 'c' Subject length lower bound = 3 -No set of starting bytes +No starting char list JIT study was successful /(?(?C1)(?=a)a)/S+I @@ -27,7 +27,7 @@ No options No first char No need char Subject length lower bound = -1 -No set of starting bytes +No starting char list JIT study was not successful /abc/S+I>testsavedregex @@ -36,7 +36,7 @@ No options First char = 'a' Need char = 'c' Subject length lower bound = 3 -No set of starting bytes +No starting char list JIT study was successful Compiled pattern written to testsavedregex Study data written to testsavedregex @@ -165,7 +165,7 @@ No options First char = 'a' Need char = 'd' Subject length lower bound = 4 -No set of starting bytes +No starting char list JIT study was successful /(*NO_START_OPT)a(*:m)b/KS++ diff --git a/testdata/testoutput13 b/testdata/testoutput13 index 9f73c50..d6fb8a5 100644 --- a/testdata/testoutput13 +++ b/testdata/testoutput13 @@ -8,7 +8,7 @@ No options First char = 'a' Need char = 'c' Subject length lower bound = 3 -No set of starting bytes +No starting char list JIT support is not available in this version of PCRE /a*/SI diff --git a/testdata/testoutput14 b/testdata/testoutput14 index 52680a8..ae85681 100644 --- a/testdata/testoutput14 +++ b/testdata/testoutput14 @@ -361,7 +361,7 @@ Options: extended No first char No need char Subject length lower bound = 3 -Starting byte set: \x09 \x20 ! 
" # $ % & ' ( * + - / 0 1 2 3 4 5 6 7 8 +Starting chars: \x09 \x20 ! " # $ % & ' ( * + - / 0 1 2 3 4 5 6 7 8 9 = ? A B C D E F G H I J K L M N O P Q R S T U V W X Y Z ^ _ ` a b c d e f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f @@ -388,7 +388,7 @@ No options No first char No need char Subject length lower bound = 1 -Starting byte set: \x09 \x20 \xa0 +Starting chars: \x09 \x20 \xa0 /\H/SI Capturing subpattern count = 0 @@ -396,7 +396,7 @@ No options No first char No need char Subject length lower bound = 1 -No set of starting bytes +No starting char list /\v/SI Capturing subpattern count = 0 @@ -404,7 +404,7 @@ No options No first char No need char Subject length lower bound = 1 -Starting byte set: \x0a \x0b \x0c \x0d \x85 +Starting chars: \x0a \x0b \x0c \x0d \x85 /\V/SI Capturing subpattern count = 0 @@ -412,7 +412,7 @@ No options No first char No need char Subject length lower bound = 1 -No set of starting bytes +No starting char list /\R/SI Capturing subpattern count = 0 @@ -420,7 +420,7 @@ No options No first char No need char Subject length lower bound = 1 -Starting byte set: \x0a \x0b \x0c \x0d \x85 +Starting chars: \x0a \x0b \x0c \x0d \x85 /[\h]/BZ ------------------------------------------------------------------ diff --git a/testdata/testoutput15 b/testdata/testoutput15 index 5792be7..5af369d 100644 --- a/testdata/testoutput15 +++ b/testdata/testoutput15 @@ -481,7 +481,7 @@ Options: utf No first char No need char Subject length lower bound = 1 -Starting byte set: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a +Starting chars: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a \x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d \x1e \x1f \x20 ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? 
@ A B C D E F G H I J K L M N O P Q R S T U V W X Y @@ -519,7 +519,7 @@ Options: utf First char = \x{c4} Need char = \x{80} Subject length lower bound = 3 -No set of starting bytes +No starting char list \x{100}\x{100}\x{100}\x{100\x{100} 0: \x{100}\x{100}\x{100} @@ -539,7 +539,7 @@ Options: utf No first char No need char Subject length lower bound = 1 -Starting byte set: x \xc4 +Starting chars: x \xc4 /(\x{100}*a|x)/8SDZ ------------------------------------------------------------------ @@ -558,7 +558,7 @@ Options: utf No first char No need char Subject length lower bound = 1 -Starting byte set: a x \xc4 +Starting chars: a x \xc4 /(\x{100}{0,2}a|x)/8SDZ ------------------------------------------------------------------ @@ -577,7 +577,7 @@ Options: utf No first char No need char Subject length lower bound = 1 -Starting byte set: a x \xc4 +Starting chars: a x \xc4 /(\x{100}{1,2}a|x)/8SDZ ------------------------------------------------------------------ @@ -597,7 +597,7 @@ Options: utf No first char No need char Subject length lower bound = 1 -Starting byte set: x \xc4 +Starting chars: x \xc4 /\x{100}/8DZ ------------------------------------------------------------------ @@ -799,7 +799,7 @@ Options: utf No first char No need char Subject length lower bound = 1 -Starting byte set: \x09 \x20 \xc2 \xe1 \xe2 \xe3 +Starting chars: \x09 \x20 \xc2 \xe1 \xe2 \xe3 ABC\x{09} 0: \x{09} ABC\x{20} @@ -825,7 +825,7 @@ Options: utf No first char No need char Subject length lower bound = 1 -Starting byte set: \x0a \x0b \x0c \x0d \xc2 \xe2 +Starting chars: \x0a \x0b \x0c \x0d \xc2 \xe2 ABC\x{0a} 0: \x{0a} ABC\x{0b} @@ -845,7 +845,7 @@ Options: utf No first char Need char = 'A' Subject length lower bound = 1 -Starting byte set: \x09 \x20 A \xc2 \xe1 \xe2 \xe3 +Starting chars: \x09 \x20 A \xc2 \xe1 \xe2 \xe3 CDBABC 0: A @@ -855,7 +855,7 @@ Options: utf No first char Need char = 'A' Subject length lower bound = 2 -Starting byte set: \x0a \x0b \x0c \x0d \xc2 \xe2 +Starting chars: \x0a 
\x0b \x0c \x0d \xc2 \xe2 /\s?xxx\s/8SI Capturing subpattern count = 0 @@ -863,7 +863,7 @@ Options: utf No first char Need char = 'x' Subject length lower bound = 4 -Starting byte set: \x09 \x0a \x0b \x0c \x0d \x20 x +Starting chars: \x09 \x0a \x0b \x0c \x0d \x20 x /\sxxx\s/I8ST1 Capturing subpattern count = 0 @@ -871,7 +871,7 @@ Options: utf No first char Need char = 'x' Subject length lower bound = 5 -Starting byte set: \x09 \x0a \x0c \x0d \x20 \xc2 +Starting chars: \x09 \x0a \x0c \x0d \x20 \xc2 AB\x{85}xxx\x{a0}XYZ 0: \x{85}xxx\x{a0} AB\x{a0}xxx\x{85}XYZ @@ -883,7 +883,7 @@ Options: utf No first char Need char = ' ' Subject length lower bound = 3 -Starting byte set: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0b \x0e +Starting chars: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0b \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d \x1e \x1f ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e @@ -917,7 +917,7 @@ Options: caseless utf No first char No need char Subject length lower bound = 1 -Starting byte set: \xe1 +Starting chars: \xe1 /\x{1234}+?/iS8I Capturing subpattern count = 0 @@ -925,7 +925,7 @@ Options: caseless utf No first char No need char Subject length lower bound = 1 -Starting byte set: \xe1 +Starting chars: \xe1 /\x{1234}++/iS8I Capturing subpattern count = 0 @@ -933,7 +933,7 @@ Options: caseless utf No first char No need char Subject length lower bound = 1 -Starting byte set: \xe1 +Starting chars: \xe1 /\x{1234}{2}/iS8I Capturing subpattern count = 0 @@ -941,7 +941,7 @@ Options: caseless utf No first char No need char Subject length lower bound = 2 -Starting byte set: \xe1 +Starting chars: \xe1 /[^\x{c4}]/8DZ ------------------------------------------------------------------ @@ -974,7 +974,7 @@ Options: utf No first char No need char Subject length lower bound = 1 -Starting byte set: \x0a \x0b \x0c \x0d \xc2 \xe2 +Starting chars: 
\x0a \x0b \x0c \x0d \xc2 \xe2 /\777/8DZ ------------------------------------------------------------------ diff --git a/testdata/testoutput16 b/testdata/testoutput16 index 1d5f31d..63e9eb0 100644 --- a/testdata/testoutput16 +++ b/testdata/testoutput16 @@ -64,7 +64,7 @@ Options: caseless utf No first char No need char Subject length lower bound = 17 -Starting byte set: \xd0 \xd1 +Starting chars: \xd0 \xd1 \x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f} 0: \x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f} \x{451}\x{440}\x{441}\x{442}\x{443}\x{444}\x{445}\x{446}\x{447}\x{448}\x{449}\x{44a}\x{44b}\x{44c}\x{44d}\x{44e}\x{44f} @@ -92,7 +92,7 @@ No options No first char No need char Subject length lower bound = 1 -Starting byte set: \x09 \x20 \xa0 +Starting chars: \x09 \x20 \xa0 /\v/SI Capturing subpattern count = 0 @@ -100,7 +100,7 @@ No options No first char No need char Subject length lower bound = 1 -Starting byte set: \x0a \x0b \x0c \x0d \x85 +Starting chars: \x0a \x0b \x0c \x0d \x85 /\R/SI Capturing subpattern count = 0 @@ -108,7 +108,7 @@ No options No first char No need char Subject length lower bound = 1 -Starting byte set: \x0a \x0b \x0c \x0d \x85 +Starting chars: \x0a \x0b \x0c \x0d \x85 /[[:blank:]]/WBZ ------------------------------------------------------------------ diff --git a/testdata/testoutput17 b/testdata/testoutput17 index c406c66..1a3b492 100644 --- a/testdata/testoutput17 +++ b/testdata/testoutput17 @@ -228,7 +228,7 @@ Options: extended No first char No need char Subject length lower bound = 3 -Starting byte set: \x09 \x20 ! " # $ % & ' ( * + - / 0 1 2 3 4 5 6 7 8 +Starting chars: \x09 \x20 ! " # $ % & ' ( * + - / 0 1 2 3 4 5 6 7 8 9 = ? 
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z ^ _ ` a b c d e f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f \xff @@ -274,7 +274,7 @@ No options No first char No need char Subject length lower bound = 1 -Starting byte set: \x09 \x20 \xa0 \xff +Starting chars: \x09 \x20 \xa0 \xff \x{1681}\x{200b}\x{1680}\x{2000}\x{202f}\x{3000} 0: \x{1680}\x{2000}\x{202f}\x{3000} \x{3001}\x{2fff}\x{200a}\xa0\x{2000} @@ -292,7 +292,7 @@ No options No first char No need char Subject length lower bound = 1 -Starting byte set: \x09 \x20 \xa0 \xff +Starting chars: \x09 \x20 \xa0 \xff \x{1681}\x{200b}\x{1680}\x{2000}\x{202f}\x{3000} 0: \x{1680}\x{2000}\x{202f}\x{3000} \x{3001}\x{2fff}\x{200a}\xa0\x{2000} @@ -304,7 +304,7 @@ No options No first char No need char Subject length lower bound = 1 -No set of starting bytes +No starting char list \x{1680}\x{180e}\x{167f}\x{1681}\x{180d}\x{180f} 0: \x{167f}\x{1681}\x{180d}\x{180f} \x{2000}\x{200a}\x{1fff}\x{200b} @@ -330,7 +330,7 @@ No options No first char No need char Subject length lower bound = 1 -Starting byte set: \x0a \x0b \x0c \x0d \x85 \xff +Starting chars: \x0a \x0b \x0c \x0d \x85 \xff \x{2027}\x{2030}\x{2028}\x{2029} 0: \x{2028}\x{2029} \x09\x0e\x84\x86\x85\x0a\x0b\x0c\x0d @@ -348,7 +348,7 @@ No options No first char No need char Subject length lower bound = 1 -Starting byte set: \x0a \x0b \x0c \x0d \x85 \xff +Starting chars: \x0a \x0b \x0c \x0d \x85 \xff \x{2027}\x{2030}\x{2028}\x{2029} 0: \x{2028}\x{2029} \x09\x0e\x84\x86\x85\x0a\x0b\x0c\x0d @@ -360,7 +360,7 @@ No options No first char No need char Subject length lower bound = 1 -No set of starting bytes +No starting char list \x{2028}\x{2029}\x{2027}\x{2030} 0: \x{2027}\x{2030} \x85\x0a\x0b\x0c\x0d\x09\x0e\x84\x86 @@ -378,7 +378,7 @@ Options: bsr_unicode No first char No need char Subject length lower bound = 1 -Starting byte set: \x0a \x0b \x0c \x0d \x85 \xff +Starting chars: \x0a \x0b \x0c \x0d \x85 \xff \x{2027}\x{2030}\x{2028}\x{2029} 0: \x{2028}\x{2029} 
\x09\x0e\x84\x86\x85\x0a\x0b\x0c\x0d diff --git a/testdata/testoutput18-16 b/testdata/testoutput18-16 index 1ca9ee7..0538d7c 100644 --- a/testdata/testoutput18-16 +++ b/testdata/testoutput18-16 @@ -339,7 +339,7 @@ Options: utf No first char No need char Subject length lower bound = 1 -Starting byte set: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a +Starting chars: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a \x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d \x1e \x1f \x20 ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y @@ -378,7 +378,7 @@ Options: utf First char = \x{100} Need char = \x{100} Subject length lower bound = 3 -No set of starting bytes +No starting char list \x{100}\x{100}\x{100}\x{100\x{100} 0: \x{100}\x{100}\x{100} @@ -398,7 +398,7 @@ Options: utf No first char No need char Subject length lower bound = 1 -Starting byte set: x \xff +Starting chars: x \xff /(\x{100}*a|x)/8SDZ ------------------------------------------------------------------ @@ -417,7 +417,7 @@ Options: utf No first char No need char Subject length lower bound = 1 -Starting byte set: a x \xff +Starting chars: a x \xff /(\x{100}{0,2}a|x)/8SDZ ------------------------------------------------------------------ @@ -436,7 +436,7 @@ Options: utf No first char No need char Subject length lower bound = 1 -Starting byte set: a x \xff +Starting chars: a x \xff /(\x{100}{1,2}a|x)/8SDZ ------------------------------------------------------------------ @@ -456,7 +456,7 @@ Options: utf No first char No need char Subject length lower bound = 1 -Starting byte set: x \xff +Starting chars: x \xff /\x{100}/8DZ ------------------------------------------------------------------ @@ -666,7 +666,7 @@ Options: utf No first char No need char Subject length lower bound = 1 -Starting byte set: \x09 \x20 \xa0 \xff +Starting chars: \x09 \x20 \xa0 \xff ABC\x{09} 0: \x{09} ABC\x{20} @@ 
-692,7 +692,7 @@ Options: utf No first char No need char Subject length lower bound = 1 -Starting byte set: \x0a \x0b \x0c \x0d \x85 \xff +Starting chars: \x0a \x0b \x0c \x0d \x85 \xff ABC\x{0a} 0: \x{0a} ABC\x{0b} @@ -712,7 +712,7 @@ Options: utf No first char Need char = 'A' Subject length lower bound = 1 -Starting byte set: \x09 \x20 A \xa0 \xff +Starting chars: \x09 \x20 A \xa0 \xff CDBABC 0: A \x{2000}ABC @@ -724,7 +724,7 @@ Options: utf No first char Need char = 'A' Subject length lower bound = 1 -Starting byte set: \x0a \x0b \x0c \x0d A \x85 \xff +Starting chars: \x0a \x0b \x0c \x0d A \x85 \xff CDBABC 0: A \x{2028}A @@ -736,7 +736,7 @@ Options: utf No first char Need char = 'A' Subject length lower bound = 2 -Starting byte set: \x0a \x0b \x0c \x0d \x85 \xff +Starting chars: \x0a \x0b \x0c \x0d \x85 \xff /\s?xxx\s/8SI Capturing subpattern count = 0 @@ -744,7 +744,7 @@ Options: utf No first char Need char = 'x' Subject length lower bound = 4 -Starting byte set: \x09 \x0a \x0b \x0c \x0d \x20 x +Starting chars: \x09 \x0a \x0b \x0c \x0d \x20 x /\sxxx\s/I8ST1 Capturing subpattern count = 0 @@ -752,7 +752,7 @@ Options: utf No first char Need char = 'x' Subject length lower bound = 5 -Starting byte set: \x09 \x0a \x0c \x0d \x20 \x85 \xa0 +Starting chars: \x09 \x0a \x0c \x0d \x20 \x85 \xa0 AB\x{85}xxx\x{a0}XYZ 0: \x{85}xxx\x{a0} AB\x{a0}xxx\x{85}XYZ @@ -764,7 +764,7 @@ Options: utf No first char Need char = ' ' Subject length lower bound = 3 -Starting byte set: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0b \x0e +Starting chars: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0b \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d \x1e \x1f ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? 
@ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e @@ -803,7 +803,7 @@ Options: caseless utf First char = \x{1234} No need char Subject length lower bound = 1 -No set of starting bytes +No starting char list /\x{1234}+?/iS8I Capturing subpattern count = 0 @@ -811,7 +811,7 @@ Options: caseless utf First char = \x{1234} No need char Subject length lower bound = 1 -No set of starting bytes +No starting char list /\x{1234}++/iS8I Capturing subpattern count = 0 @@ -819,7 +819,7 @@ Options: caseless utf First char = \x{1234} No need char Subject length lower bound = 1 -No set of starting bytes +No starting char list /\x{1234}{2}/iS8I Capturing subpattern count = 0 @@ -827,7 +827,7 @@ Options: caseless utf First char = \x{1234} Need char = \x{1234} Subject length lower bound = 2 -No set of starting bytes +No starting char list /[^\x{c4}]/8DZ ------------------------------------------------------------------ @@ -860,7 +860,7 @@ Options: utf No first char No need char Subject length lower bound = 1 -Starting byte set: \x0a \x0b \x0c \x0d \x85 \xff +Starting chars: \x0a \x0b \x0c \x0d \x85 \xff /-- Check bad offset --/ diff --git a/testdata/testoutput18-32 b/testdata/testoutput18-32 index 89be3a4..f2d1b0c 100644 --- a/testdata/testoutput18-32 +++ b/testdata/testoutput18-32 @@ -337,7 +337,7 @@ Options: utf No first char No need char Subject length lower bound = 1 -Starting byte set: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a +Starting chars: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a \x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d \x1e \x1f \x20 ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? 
@ A B C D E F G H I J K L M N O P Q R S T U V W X Y @@ -376,7 +376,7 @@ Options: utf First char = \x{100} Need char = \x{100} Subject length lower bound = 3 -No set of starting bytes +No starting char list \x{100}\x{100}\x{100}\x{100\x{100} 0: \x{100}\x{100}\x{100} @@ -396,7 +396,7 @@ Options: utf No first char No need char Subject length lower bound = 1 -Starting byte set: x \xff +Starting chars: x \xff /(\x{100}*a|x)/8SDZ ------------------------------------------------------------------ @@ -415,7 +415,7 @@ Options: utf No first char No need char Subject length lower bound = 1 -Starting byte set: a x \xff +Starting chars: a x \xff /(\x{100}{0,2}a|x)/8SDZ ------------------------------------------------------------------ @@ -434,7 +434,7 @@ Options: utf No first char No need char Subject length lower bound = 1 -Starting byte set: a x \xff +Starting chars: a x \xff /(\x{100}{1,2}a|x)/8SDZ ------------------------------------------------------------------ @@ -454,7 +454,7 @@ Options: utf No first char No need char Subject length lower bound = 1 -Starting byte set: x \xff +Starting chars: x \xff /\x{100}/8DZ ------------------------------------------------------------------ @@ -663,7 +663,7 @@ Options: utf No first char No need char Subject length lower bound = 1 -Starting byte set: \x09 \x20 \xa0 \xff +Starting chars: \x09 \x20 \xa0 \xff ABC\x{09} 0: \x{09} ABC\x{20} @@ -689,7 +689,7 @@ Options: utf No first char No need char Subject length lower bound = 1 -Starting byte set: \x0a \x0b \x0c \x0d \x85 \xff +Starting chars: \x0a \x0b \x0c \x0d \x85 \xff ABC\x{0a} 0: \x{0a} ABC\x{0b} @@ -709,7 +709,7 @@ Options: utf No first char Need char = 'A' Subject length lower bound = 1 -Starting byte set: \x09 \x20 A \xa0 \xff +Starting chars: \x09 \x20 A \xa0 \xff CDBABC 0: A \x{2000}ABC @@ -721,7 +721,7 @@ Options: utf No first char Need char = 'A' Subject length lower bound = 1 -Starting byte set: \x0a \x0b \x0c \x0d A \x85 \xff +Starting chars: \x0a \x0b \x0c \x0d A \x85 
\xff CDBABC 0: A \x{2028}A @@ -733,7 +733,7 @@ Options: utf No first char Need char = 'A' Subject length lower bound = 2 -Starting byte set: \x0a \x0b \x0c \x0d \x85 \xff +Starting chars: \x0a \x0b \x0c \x0d \x85 \xff /\s?xxx\s/8SI Capturing subpattern count = 0 @@ -741,7 +741,7 @@ Options: utf No first char Need char = 'x' Subject length lower bound = 4 -Starting byte set: \x09 \x0a \x0b \x0c \x0d \x20 x +Starting chars: \x09 \x0a \x0b \x0c \x0d \x20 x /\sxxx\s/I8ST1 Capturing subpattern count = 0 @@ -749,7 +749,7 @@ Options: utf No first char Need char = 'x' Subject length lower bound = 5 -Starting byte set: \x09 \x0a \x0c \x0d \x20 \x85 \xa0 +Starting chars: \x09 \x0a \x0c \x0d \x20 \x85 \xa0 AB\x{85}xxx\x{a0}XYZ 0: \x{85}xxx\x{a0} AB\x{a0}xxx\x{85}XYZ @@ -761,7 +761,7 @@ Options: utf No first char Need char = ' ' Subject length lower bound = 3 -Starting byte set: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0b \x0e +Starting chars: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0b \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d \x1e \x1f ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? 
@ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e @@ -800,7 +800,7 @@ Options: caseless utf First char = \x{1234} No need char Subject length lower bound = 1 -No set of starting bytes +No starting char list /\x{1234}+?/iS8I Capturing subpattern count = 0 @@ -808,7 +808,7 @@ Options: caseless utf First char = \x{1234} No need char Subject length lower bound = 1 -No set of starting bytes +No starting char list /\x{1234}++/iS8I Capturing subpattern count = 0 @@ -816,7 +816,7 @@ Options: caseless utf First char = \x{1234} No need char Subject length lower bound = 1 -No set of starting bytes +No starting char list /\x{1234}{2}/iS8I Capturing subpattern count = 0 @@ -824,7 +824,7 @@ Options: caseless utf First char = \x{1234} Need char = \x{1234} Subject length lower bound = 2 -No set of starting bytes +No starting char list /[^\x{c4}]/8DZ ------------------------------------------------------------------ @@ -857,7 +857,7 @@ Options: utf No first char No need char Subject length lower bound = 1 -Starting byte set: \x0a \x0b \x0c \x0d \x85 \xff +Starting chars: \x0a \x0b \x0c \x0d \x85 \xff /-- Check bad offset --/ diff --git a/testdata/testoutput19 b/testdata/testoutput19 index ccc198c..21fe677 100644 --- a/testdata/testoutput19 +++ b/testdata/testoutput19 @@ -55,7 +55,7 @@ Options: caseless utf First char = \x{401} (caseless) Need char = \x{42f} (caseless) Subject length lower bound = 17 -No set of starting bytes +No starting char list \x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f} 0: \x{401}\x{420}\x{421}\x{422}\x{423}\x{424}\x{425}\x{426}\x{427}\x{428}\x{429}\x{42a}\x{42b}\x{42c}\x{42d}\x{42e}\x{42f} \x{451}\x{440}\x{441}\x{442}\x{443}\x{444}\x{445}\x{446}\x{447}\x{448}\x{449}\x{44a}\x{44b}\x{44c}\x{44d}\x{44e}\x{44f} diff --git a/testdata/testoutput2 b/testdata/testoutput2 index 70e7ceb..b82488c 100644 --- a/testdata/testoutput2 +++ b/testdata/testoutput2 @@ -178,7 
+178,7 @@ No options No first char No need char Subject length lower bound = 3 -Starting byte set: c d e +Starting chars: c d e this sentence eventually mentions a cat 0: cat this sentences rambles on and on for a while and then reaches elephant @@ -190,7 +190,7 @@ Options: caseless No first char No need char Subject length lower bound = 3 -Starting byte set: C D E c d e +Starting chars: C D E c d e this sentence eventually mentions a CAT cat 0: CAT this sentences rambles on and on for a while to elephant ElePhant @@ -202,7 +202,7 @@ No options No first char No need char Subject length lower bound = 1 -Starting byte set: a b c d +Starting chars: a b c d /(a|[^\dZ])/IS Capturing subpattern count = 1 @@ -210,7 +210,7 @@ No options No first char No need char Subject length lower bound = 1 -Starting byte set: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a +Starting chars: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0a \x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d \x1e \x1f \x20 ! " # $ % & ' ( ) * + , - . / : ; < = > ? 
@ A B C D E F G H I J K L M N O P Q R S T U V W X Y [ \ ] ^ _ ` a b c d @@ -231,7 +231,7 @@ No options No first char No need char Subject length lower bound = 1 -Starting byte set: \x09 \x0a \x0b \x0c \x0d \x20 a b +Starting chars: \x09 \x0a \x0b \x0c \x0d \x20 a b /(ab\2)/ Failed: reference to non-existent subpattern at offset 6 @@ -512,7 +512,7 @@ No options No first char No need char Subject length lower bound = 1 -Starting byte set: a b c d +Starting chars: a b c d /(?i)[abcd]/IS Capturing subpattern count = 0 @@ -520,7 +520,7 @@ Options: caseless No first char No need char Subject length lower bound = 1 -Starting byte set: A B C D a b c d +Starting chars: A B C D a b c d /(?m)[xy]|(b|c)/IS Capturing subpattern count = 1 @@ -528,7 +528,7 @@ Options: multiline No first char No need char Subject length lower bound = 1 -Starting byte set: b c x y +Starting chars: b c x y /(^a|^b)/Im Capturing subpattern count = 1 @@ -591,7 +591,7 @@ No options First char = 'b' (caseless) No need char Subject length lower bound = 1 -No set of starting bytes +No starting char list /(a*b|(?i:c*(?-i)d))/IS Capturing subpattern count = 1 @@ -599,7 +599,7 @@ No options No first char No need char Subject length lower bound = 1 -Starting byte set: C a b c d +Starting chars: C a b c d /a$/I Capturing subpattern count = 0 @@ -666,7 +666,7 @@ No options No first char No need char Subject length lower bound = 1 -Starting byte set: a b +Starting chars: a b /(?<!foo)(alpha|omega)/IS Capturing subpattern count = 1 @@ -675,7 +675,7 @@ No options No first char Need char = 'a' Subject length lower bound = 5 -Starting byte set: a o +Starting chars: a o /(?!alphabet)[ab]/IS Capturing subpattern count = 0 @@ -683,7 +683,7 @@ No options No first char No need char Subject length lower bound = 1 -Starting byte set: a b +Starting chars: a b /(?<=foo\n)^bar/Im Capturing subpattern count = 0 @@ -1642,7 +1642,7 @@ Options: anchored No first char Need char = 'd' Subject length lower bound = 4 -No set of 
starting bytes +No starting char list /\( # ( at start (?: # Non-capturing bracket @@ -1875,7 +1875,7 @@ No options No first char No need char Subject length lower bound = 1 -Starting byte set: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z +Starting chars: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z /^[[:ascii:]]/DZ @@ -1937,7 +1937,7 @@ No options No first char No need char Subject length lower bound = 1 -Starting byte set: \x09 \x0a \x0b \x0c \x0d \x20 +Starting chars: \x09 \x0a \x0b \x0c \x0d \x20 /^[[:cntrl:]]/DZ ------------------------------------------------------------------ @@ -3434,7 +3434,7 @@ No options No first char No need char Subject length lower bound = 1 -Starting byte set: a b +Starting chars: a b /[^a]/I Capturing subpattern count = 0 @@ -3454,7 +3454,7 @@ No options No first char Need char = '6' Subject length lower bound = 4 -Starting byte set: 0 1 2 3 4 5 6 7 8 9 +Starting chars: 0 1 2 3 4 5 6 7 8 9 /a^b/I Capturing subpattern count = 0 @@ -3488,7 +3488,7 @@ Options: caseless No first char No need char Subject length lower bound = 1 -Starting byte set: A B a b +Starting chars: A B a b /[ab](?i)cd/IS Capturing subpattern count = 0 @@ -3496,7 +3496,7 @@ No options No first char Need char = 'd' (caseless) Subject length lower bound = 3 -Starting byte set: a b +Starting chars: a b /abc(?C)def/I Capturing subpattern count = 0 @@ -3537,7 +3537,7 @@ No options No first char Need char = 'f' Subject length lower bound = 7 -Starting byte set: 0 1 2 3 4 5 6 7 8 9 +Starting chars: 0 1 2 3 4 5 6 7 8 9 1234abcdef --->1234abcdef 1 ^ \d @@ -3856,7 +3856,7 @@ No options No first char No need char Subject length lower bound = 1 -Starting byte set: a b +Starting chars: a b /(?R)/I Failed: recursive call could loop indefinitely at offset 3 @@ -4637,7 +4637,7 @@ Options: caseless No first char Need char = 'g' (caseless) Subject length lower bound = 8 -No set of starting bytes +No starting char 
list Baby Bjorn Active Carrier - With free SHIPPING!! 0: Baby Bjorn Active Carrier - With free SHIPPING!! 1: Baby Bjorn Active Carrier - With free SHIPPING!! @@ -4656,7 +4656,7 @@ No options No first char Need char = 'b' Subject length lower bound = 1 -No set of starting bytes +No starting char list /(a|b)*.?c/ISDZ ------------------------------------------------------------------ @@ -4677,7 +4677,7 @@ No options No first char Need char = 'c' Subject length lower bound = 1 -No set of starting bytes +No starting char list /abc(?C255)de(?C)f/DZ ------------------------------------------------------------------ @@ -4750,7 +4750,7 @@ Options: No first char Need char = 'b' Subject length lower bound = 1 -Starting byte set: a b +Starting chars: a b ab --->ab +0 ^ a* @@ -4893,7 +4893,7 @@ Options: No first char Need char = 'x' Subject length lower bound = 4 -Starting byte set: a d +Starting chars: a d abcx --->abcx +0 ^ (abc|def) @@ -5127,7 +5127,7 @@ Options: No first char No need char Subject length lower bound = 2 -Starting byte set: a b x +Starting chars: a b x Note: that { does NOT introduce a quantifier --->Note: that { does NOT introduce a quantifier +0 ^ ([ab]{,4}c|xy) @@ -5607,7 +5607,7 @@ No options First char = 'a' Need char = 'c' Subject length lower bound = 3 -No set of starting bytes +No starting char list Compiled pattern written to testsavedregex Study data written to testsavedregex <testsavedregex @@ -5642,7 +5642,7 @@ No options First char = 'a' Need char = 'c' Subject length lower bound = 3 -No set of starting bytes +No starting char list Compiled pattern written to testsavedregex Study data written to testsavedregex <testsavedregex @@ -5677,7 +5677,7 @@ No options No first char No need char Subject length lower bound = 1 -Starting byte set: a b +Starting chars: a b Compiled pattern written to testsavedregex Study data written to testsavedregex <testsavedregex @@ -5716,7 +5716,7 @@ No options No first char No need char Subject length lower bound = 1 
-Starting byte set: a b +Starting chars: a b Compiled pattern written to testsavedregex Study data written to testsavedregex <testsavedregex @@ -6431,7 +6431,7 @@ No options No first char Need char = ',' Subject length lower bound = 1 -Starting byte set: \x09 \x0a \x0b \x0c \x0d \x20 , +Starting chars: \x09 \x0a \x0b \x0c \x0d \x20 , \x0b,\x0b 0: \x0b,\x0b \x0c,\x0d @@ -6738,7 +6738,7 @@ No options No first char No need char Subject length lower bound = 1 -Starting byte set: C a b c d +Starting chars: C a b c d /()[ab]xyz/IS Capturing subpattern count = 1 @@ -6746,7 +6746,7 @@ No options No first char Need char = 'z' Subject length lower bound = 4 -Starting byte set: a b +Starting chars: a b /(|)[ab]xyz/IS Capturing subpattern count = 1 @@ -6754,7 +6754,7 @@ No options No first char Need char = 'z' Subject length lower bound = 4 -Starting byte set: a b +Starting chars: a b /(|c)[ab]xyz/IS Capturing subpattern count = 1 @@ -6762,7 +6762,7 @@ No options No first char Need char = 'z' Subject length lower bound = 4 -Starting byte set: a b c +Starting chars: a b c /(|c?)[ab]xyz/IS Capturing subpattern count = 1 @@ -6770,7 +6770,7 @@ No options No first char Need char = 'z' Subject length lower bound = 4 -Starting byte set: a b c +Starting chars: a b c /(d?|c?)[ab]xyz/IS Capturing subpattern count = 1 @@ -6778,7 +6778,7 @@ No options No first char Need char = 'z' Subject length lower bound = 4 -Starting byte set: a b c d +Starting chars: a b c d /(d?|c)[ab]xyz/IS Capturing subpattern count = 1 @@ -6786,7 +6786,7 @@ No options No first char Need char = 'z' Subject length lower bound = 4 -Starting byte set: a b c d +Starting chars: a b c d /^a*b\d/DZ ------------------------------------------------------------------ @@ -6879,7 +6879,7 @@ No options No first char No need char Subject length lower bound = 1 -Starting byte set: a b c d +Starting chars: a b c d /(a+|b*)[cd]/IS Capturing subpattern count = 1 @@ -6887,7 +6887,7 @@ No options No first char No need char Subject 
length lower bound = 1 -Starting byte set: a b c d +Starting chars: a b c d /(a*|b+)[cd]/IS Capturing subpattern count = 1 @@ -6895,7 +6895,7 @@ No options No first char No need char Subject length lower bound = 1 -Starting byte set: a b c d +Starting chars: a b c d /(a+|b+)[cd]/IS Capturing subpattern count = 1 @@ -6903,7 +6903,7 @@ No options No first char No need char Subject length lower bound = 2 -Starting byte set: a b +Starting chars: a b /(((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((( (((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((( @@ -9307,7 +9307,7 @@ No options No first char No need char Subject length lower bound = 1 -Starting byte set: x y z +Starting chars: x y z /(?(?=.*b)b|^)/CI Capturing subpattern count = 0 @@ -10096,7 +10096,7 @@ No options No first char No need char Subject length lower bound = 2 -Starting byte set: a b +Starting chars: a b /(a|bc)\1{2,3}/SI Capturing subpattern count = 1 @@ -10105,7 +10105,7 @@ No options No first char No need char Subject length lower bound = 3 -Starting byte set: a b +Starting chars: a b /(a|bc)(?1)/SI Capturing subpattern count = 1 @@ -10113,7 +10113,7 @@ No options No first char No need char Subject length lower bound = 2 -Starting byte set: a b +Starting chars: a b /(a|b\1)(a|b\1)/SI Capturing subpattern count = 2 @@ -10122,7 +10122,7 @@ No options No first char No need char Subject length lower bound = 2 -Starting byte set: a b +Starting chars: a b /(a|b\1){2}/SI Capturing subpattern count = 1 @@ -10131,7 +10131,7 @@ No options No first char No need char Subject length lower bound = 2 -Starting byte set: a b +Starting chars: a b /(a|bbbb\1)(a|bbbb\1)/SI Capturing subpattern count = 2 @@ -10140,7 +10140,7 @@ No options No first char No need char Subject length lower bound = 2 -Starting byte set: a b +Starting chars: a b /(a|bbbb\1){2}/SI Capturing subpattern count = 1 @@ -10149,7 +10149,7 @@ No 
options No first char No need char Subject length lower bound = 2 -Starting byte set: a b +Starting chars: a b /^From +([^ ]+) +[a-zA-Z][a-zA-Z][a-zA-Z] +[a-zA-Z][a-zA-Z][a-zA-Z] +[0-9]?[0-9] +[0-9][0-9]:[0-9][0-9]/SI Capturing subpattern count = 1 @@ -10157,7 +10157,7 @@ Options: anchored No first char Need char = ':' Subject length lower bound = 22 -No set of starting bytes +No starting char list /<tr([\w\W\s\d][^<>]{0,})><TD([\w\W\s\d][^<>]{0,})>([\d]{0,}\.)(.*)((<BR>([\w\W\s\d][^<>]{0,})|[\s]{0,}))<\/a><\/TD><TD([\w\W\s\d][^<>]{0,})>([\w\W\s\d][^<>]{0,})<\/TD><TD([\w\W\s\d][^<>]{0,})>([\w\W\s\d][^<>]{0,})<\/TD><\/TR>/isIS Capturing subpattern count = 11 @@ -10165,7 +10165,7 @@ Options: caseless dotall First char = '<' Need char = '>' Subject length lower bound = 47 -No set of starting bytes +No starting char list "(?>.*/)foo"SI Capturing subpattern count = 0 @@ -10173,7 +10173,7 @@ No options No first char Need char = 'o' Subject length lower bound = 4 -No set of starting bytes +No starting char list /(?(?=[^a-z]+[a-z]) \d{2}-[a-z]{3}-\d{2} | \d{2}-\d{2}-\d{2} ) /xSI Capturing subpattern count = 0 @@ -10181,7 +10181,7 @@ Options: extended No first char Need char = '-' Subject length lower bound = 8 -No set of starting bytes +No starting char list /(?:(?:(?:(?:(?:(?:(?:(?:(?:(a|b|c))))))))))/iSI Capturing subpattern count = 1 @@ -10189,7 +10189,7 @@ Options: caseless No first char No need char Subject length lower bound = 1 -Starting byte set: A B C a b c +Starting chars: A B C a b c /(?:c|d)(?:)(?:aaaaaaaa(?:)(?:bbbbbbbb)(?:bbbbbbbb(?:))(?:bbbbbbbb(?:)(?:bbbbbbbb)))/SI Capturing subpattern count = 0 @@ -10197,7 +10197,7 @@ No options No first char Need char = 'b' Subject length lower bound = 41 -Starting byte set: c d +Starting chars: c d /<a[\s]+href[\s]*=[\s]* # find <a href= ([\"\'])? 
# find single or double quote @@ -10210,7 +10210,7 @@ Options: caseless extended dotall First char = '<' Need char = '=' Subject length lower bound = 9 -No set of starting bytes +No starting char list /^(?!:) # colon disallowed at start (?: # start of item @@ -10226,7 +10226,7 @@ Options: anchored caseless extended No first char Need char = ':' Subject length lower bound = 2 -No set of starting bytes +No starting char list /(?|(?<a>A)|(?<a>B))/I Capturing subpattern count = 1 @@ -10450,7 +10450,7 @@ Options: No first char Need char = 'a' Subject length lower bound = 1 -No set of starting bytes +No starting char list cat 0: a 1: @@ -10464,7 +10464,7 @@ No options No first char Need char = 'a' Subject length lower bound = 3 -No set of starting bytes +No starting char list cat No match @@ -10476,7 +10476,7 @@ No options First char = 'i' No need char Subject length lower bound = 1 -No set of starting bytes +No starting char list i 0: i @@ -10486,7 +10486,7 @@ No options No first char Need char = 'i' Subject length lower bound = 1 -Starting byte set: i +Starting chars: i ia 0: ia 1: @@ -11080,7 +11080,7 @@ No options First char = 'a' Need char = '4' Subject length lower bound = 5 -No set of starting bytes +No starting char list /([abc])++1234/SI Capturing subpattern count = 1 @@ -11088,7 +11088,7 @@ No options No first char Need char = '4' Subject length lower bound = 5 -Starting byte set: a b c +Starting chars: a b c /(?<=(abc)+)X/ Failed: lookbehind assertion is not fixed length at offset 10 @@ -11369,7 +11369,7 @@ No options No first char No need char Subject length lower bound = 1 -No set of starting bytes +No starting char list /(a(?2)|b)(b(?1)|a)(?:(?1)|(?2))/SI Capturing subpattern count = 2 @@ -11377,7 +11377,7 @@ No options No first char No need char Subject length lower bound = 3 -Starting byte set: a b +Starting chars: a b /(a(?2)|b)(b(?1)|a)(?1)(?2)/SI Capturing subpattern count = 2 @@ -11385,7 +11385,7 @@ No options No first char No need char Subject length 
lower bound = 4 -Starting byte set: a b +Starting chars: a b /(abc)(?1)/SI Capturing subpattern count = 1 @@ -11393,7 +11393,7 @@ No options First char = 'a' Need char = 'c' Subject length lower bound = 6 -No set of starting bytes +No starting char list /^(?>a)++/ aa\M @@ -11711,7 +11711,7 @@ No options First char = 't' Need char = 't' Subject length lower bound = 18 -No set of starting bytes +No starting char list /\btype\b\W*?\btext\b\W*?\bjavascript\b|\burl\b\W*?\bshell:|<input\b.*?\btype\b\W*?\bimage\b|\bonkeyup\b\W*?\=/IS Capturing subpattern count = 0 @@ -11720,7 +11720,7 @@ No options No first char No need char Subject length lower bound = 8 -Starting byte set: < o t u +Starting chars: < o t u /a(*SKIP)c|b(*ACCEPT)|/+S!I Capturing subpattern count = 0 @@ -11729,7 +11729,7 @@ No options No first char No need char Subject length lower bound = -1 -No set of starting bytes +No starting char list a 0: 0+ @@ -11740,7 +11740,7 @@ No options No first char No need char Subject length lower bound = -1 -Starting byte set: a b x +Starting chars: a b x ax 0: x @@ -12436,7 +12436,7 @@ No options No first char No need char Subject length lower bound = -1 -No set of starting bytes +No starting char list /(?:(a)+(?C1)bb|aa(?C2)b)/ aab\C+ @@ -12722,7 +12722,7 @@ No options No first char Need char = 'z' Subject length lower bound = 2 -Starting byte set: a z +Starting chars: a z aaaaaaaaaaaaaz Error -21 (recursion limit exceeded) aaaaaaaaaaaaaz\Q1000 @@ -12735,7 +12735,7 @@ No options No first char Need char = 'z' Subject length lower bound = 2 -Starting byte set: a z +Starting chars: a z aaaaaaaaaaaaaz Error -21 (recursion limit exceeded) @@ -12746,7 +12746,7 @@ No options No first char Need char = 'z' Subject length lower bound = 2 -Starting byte set: a z +Starting chars: a z aaaaaaaaaaaaaz No match aaaaaaaaaaaaaz\Q10 @@ -12790,7 +12790,7 @@ Options: dupnames First char = 'a' Need char = 'z' Subject length lower bound = 5 -No set of starting bytes +No starting char list 
/a*[bcd]/BZ ------------------------------------------------------------------ @@ -13902,7 +13902,7 @@ No options No first char Need char = 'd' Subject length lower bound = 1 -Starting byte set: a b c d +Starting chars: a b c d /[a-c]+d/DZS ------------------------------------------------------------------ @@ -13917,7 +13917,7 @@ No options No first char Need char = 'd' Subject length lower bound = 2 -Starting byte set: a b c +Starting chars: a b c /[a-c]?d/DZS ------------------------------------------------------------------ @@ -13932,7 +13932,7 @@ No options No first char Need char = 'd' Subject length lower bound = 1 -Starting byte set: a b c d +Starting chars: a b c d /[a-c]{4,6}d/DZS ------------------------------------------------------------------ @@ -13947,7 +13947,7 @@ No options No first char Need char = 'd' Subject length lower bound = 5 -Starting byte set: a b c +Starting chars: a b c /[a-c]{0,6}d/DZS ------------------------------------------------------------------ @@ -13962,7 +13962,7 @@ No options No first char Need char = 'd' Subject length lower bound = 1 -Starting byte set: a b c d +Starting chars: a b c d /-- End of special auto-possessive tests --/ diff --git a/testdata/testoutput21-16 b/testdata/testoutput21-16 index 0e21350..da194d9 100644 --- a/testdata/testoutput21-16 +++ b/testdata/testoutput21-16 @@ -50,7 +50,7 @@ Options: anchored extended No first char No need char Subject length lower bound = 6 -No set of starting bytes +No starting char list <!testsaved16BE-1 Compiled pattern loaded from testsaved16BE-1 @@ -83,7 +83,7 @@ Options: anchored extended No first char No need char Subject length lower bound = 6 -No set of starting bytes +No starting char list <!testsaved32LE-1 Compiled pattern loaded from testsaved32LE-1 diff --git a/testdata/testoutput21-32 b/testdata/testoutput21-32 index 183487a..d087bb6 100644 --- a/testdata/testoutput21-32 +++ b/testdata/testoutput21-32 @@ -62,7 +62,7 @@ Options: anchored extended No first char No need 
char Subject length lower bound = 6 -No set of starting bytes +No starting char list <!testsaved32BE-1 Compiled pattern loaded from testsaved32BE-1 @@ -95,6 +95,6 @@ Options: anchored extended No first char No need char Subject length lower bound = 6 -No set of starting bytes +No starting char list /-- End of testinput21 --/ diff --git a/testdata/testoutput22-16 b/testdata/testoutput22-16 index f896b13..32a71cd 100644 --- a/testdata/testoutput22-16 +++ b/testdata/testoutput22-16 @@ -37,7 +37,7 @@ Options: extended utf No first char No need char Subject length lower bound = 2 -No set of starting bytes +No starting char list <!testsaved16BE-2 Compiled pattern loaded from testsaved16BE-2 @@ -64,7 +64,7 @@ Options: extended utf No first char No need char Subject length lower bound = 2 -No set of starting bytes +No starting char list <!testsaved32LE-2 Compiled pattern loaded from testsaved32LE-2 diff --git a/testdata/testoutput22-32 b/testdata/testoutput22-32 index 783926b..13e441d 100644 --- a/testdata/testoutput22-32 +++ b/testdata/testoutput22-32 @@ -49,7 +49,7 @@ Options: extended utf No first char No need char Subject length lower bound = 2 -No set of starting bytes +No starting char list <!testsaved32BE-2 Compiled pattern loaded from testsaved32BE-2 @@ -76,6 +76,6 @@ Options: extended utf No first char No need char Subject length lower bound = 2 -No set of starting bytes +No starting char list /-- End of testinput22 --/ diff --git a/testdata/testoutput23 b/testdata/testoutput23 index eedbb60..6dabf03 100644 --- a/testdata/testoutput23 +++ b/testdata/testoutput23 @@ -27,7 +27,7 @@ No options No first char No need char Subject length lower bound = 1 -Starting byte set: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0a \x0b +Starting chars: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0a \x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d \x1e \x1f ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? 
@ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ @@ -54,7 +54,7 @@ No options No first char No need char Subject length lower bound = 1 -Starting byte set: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0e +Starting chars: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d \x1e \x1f \x20 ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c diff --git a/testdata/testoutput25 b/testdata/testoutput25 index 0561e7c..4c62c8d 100644 --- a/testdata/testoutput25 +++ b/testdata/testoutput25 @@ -74,7 +74,7 @@ No options No first char No need char Subject length lower bound = 1 -Starting byte set: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0a \x0b +Starting chars: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x0a \x0b \x0c \x0d \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d \x1e \x1f ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ @@ -101,7 +101,7 @@ No options No first char No need char Subject length lower bound = 1 -Starting byte set: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0e +Starting chars: \x00 \x01 \x02 \x03 \x04 \x05 \x06 \x07 \x08 \x09 \x0e \x0f \x10 \x11 \x12 \x13 \x14 \x15 \x16 \x17 \x18 \x19 \x1a \x1b \x1c \x1d \x1e \x1f \x20 ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? 
@ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c diff --git a/testdata/testoutput3 b/testdata/testoutput3 index 12ffc99..6241623 100644 --- a/testdata/testoutput3 +++ b/testdata/testoutput3 @@ -90,7 +90,7 @@ No options No first char No need char Subject length lower bound = 1 -Starting byte set: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P +Starting chars: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z /\w/ISLfr_FR @@ -99,7 +99,7 @@ No options No first char No need char Subject length lower bound = 1 -Starting byte set: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P +Starting chars: 0 1 2 3 4 5 6 7 8 9 A B C D E F G H I J K L M N O P Q R S T U V W X Y Z _ a b c d e f g h i j k l m n o p q r s t u v w x y z ª µ º À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï Ð Ñ Ò Ó Ô Õ Ö Ø Ù Ú Û Ü Ý Þ ß à á â ã ä å æ ç è é ê ë ì í î ï ð ñ ò ó ô õ ö ø ù ú û ü ý þ ÿ diff --git a/testdata/testoutput5 b/testdata/testoutput5 index 4d67d1c..5c098e6 100644 --- a/testdata/testoutput5 +++ b/testdata/testoutput5 @@ -1536,7 +1536,7 @@ Options: caseless utf No first char No need char Subject length lower bound = 1 -No set of starting bytes +No starting char list /[^\x{1234}]+?/iS8I Capturing subpattern count = 0 @@ -1544,7 +1544,7 @@ Options: caseless utf No first char No need char Subject length lower bound = 1 -No set of starting bytes +No starting char list /[^\x{1234}]++/iS8I Capturing subpattern count = 0 @@ -1552,7 +1552,7 @@ Options: caseless utf No first char No need char Subject length lower bound = 1 -No set of starting bytes +No starting char list /[^\x{1234}]{2}/iS8I Capturing subpattern count = 0 @@ -1560,7 +1560,7 @@ Options: caseless utf No first char No need char Subject length lower bound = 2 -No set of starting bytes +No starting char list //<bsr_anycrlf><bsr_unicode> Failed: inconsistent NEWLINE options at offset 0 diff --git a/testdata/testoutput8 b/testdata/testoutput8 index 
bb68d3e..3861ea4 100644 --- a/testdata/testoutput8 +++ b/testdata/testoutput8 @@ -7232,7 +7232,7 @@ No options No first char No need char Subject length lower bound = 3 -Starting byte set: a d x +Starting chars: a d x terhjk;abcdaadsfe 0: abc the quick xyz brown fox