Diffstat (limited to 'doc/html/pcresyntax.html')
-rw-r--r-- | doc/html/pcresyntax.html | 101 |
1 files changed, 56 insertions, 45 deletions
diff --git a/doc/html/pcresyntax.html b/doc/html/pcresyntax.html
index 1a2749f..ad4399d 100644
--- a/doc/html/pcresyntax.html
+++ b/doc/html/pcresyntax.html
@@ -17,28 +17,29 @@ man page, in case the conversion went wrong.
 <li><a name="TOC2" href="#SEC2">QUOTING</a>
 <li><a name="TOC3" href="#SEC3">CHARACTERS</a>
 <li><a name="TOC4" href="#SEC4">CHARACTER TYPES</a>
-<li><a name="TOC5" href="#SEC5">GENERAL CATEGORY PROPERTY CODES FOR \p and \P</a>
-<li><a name="TOC6" href="#SEC6">SCRIPT NAMES FOR \p AND \P</a>
-<li><a name="TOC7" href="#SEC7">CHARACTER CLASSES</a>
-<li><a name="TOC8" href="#SEC8">QUANTIFIERS</a>
-<li><a name="TOC9" href="#SEC9">ANCHORS AND SIMPLE ASSERTIONS</a>
-<li><a name="TOC10" href="#SEC10">MATCH POINT RESET</a>
-<li><a name="TOC11" href="#SEC11">ALTERNATION</a>
-<li><a name="TOC12" href="#SEC12">CAPTURING</a>
-<li><a name="TOC13" href="#SEC13">ATOMIC GROUPS</a>
-<li><a name="TOC14" href="#SEC14">COMMENT</a>
-<li><a name="TOC15" href="#SEC15">OPTION SETTING</a>
-<li><a name="TOC16" href="#SEC16">LOOKAHEAD AND LOOKBEHIND ASSERTIONS</a>
-<li><a name="TOC17" href="#SEC17">BACKREFERENCES</a>
-<li><a name="TOC18" href="#SEC18">SUBROUTINE REFERENCES (POSSIBLY RECURSIVE)</a>
-<li><a name="TOC19" href="#SEC19">CONDITIONAL PATTERNS</a>
-<li><a name="TOC20" href="#SEC20">BACKTRACKING CONTROL</a>
-<li><a name="TOC21" href="#SEC21">NEWLINE CONVENTIONS</a>
-<li><a name="TOC22" href="#SEC22">WHAT \R MATCHES</a>
-<li><a name="TOC23" href="#SEC23">CALLOUTS</a>
-<li><a name="TOC24" href="#SEC24">SEE ALSO</a>
-<li><a name="TOC25" href="#SEC25">AUTHOR</a>
-<li><a name="TOC26" href="#SEC26">REVISION</a>
+<li><a name="TOC5" href="#SEC5">GENERAL CATEGORY PROPERTIES FOR \p and \P</a>
+<li><a name="TOC6" href="#SEC6">PCRE SPECIAL CATEGORY PROPERTIES FOR \p and \P</a>
+<li><a name="TOC7" href="#SEC7">SCRIPT NAMES FOR \p AND \P</a>
+<li><a name="TOC8" href="#SEC8">CHARACTER CLASSES</a>
+<li><a name="TOC9" href="#SEC9">QUANTIFIERS</a>
+<li><a name="TOC10" href="#SEC10">ANCHORS AND SIMPLE ASSERTIONS</a>
+<li><a name="TOC11" href="#SEC11">MATCH POINT RESET</a>
+<li><a name="TOC12" href="#SEC12">ALTERNATION</a>
+<li><a name="TOC13" href="#SEC13">CAPTURING</a>
+<li><a name="TOC14" href="#SEC14">ATOMIC GROUPS</a>
+<li><a name="TOC15" href="#SEC15">COMMENT</a>
+<li><a name="TOC16" href="#SEC16">OPTION SETTING</a>
+<li><a name="TOC17" href="#SEC17">LOOKAHEAD AND LOOKBEHIND ASSERTIONS</a>
+<li><a name="TOC18" href="#SEC18">BACKREFERENCES</a>
+<li><a name="TOC19" href="#SEC19">SUBROUTINE REFERENCES (POSSIBLY RECURSIVE)</a>
+<li><a name="TOC20" href="#SEC20">CONDITIONAL PATTERNS</a>
+<li><a name="TOC21" href="#SEC21">BACKTRACKING CONTROL</a>
+<li><a name="TOC22" href="#SEC22">NEWLINE CONVENTIONS</a>
+<li><a name="TOC23" href="#SEC23">WHAT \R MATCHES</a>
+<li><a name="TOC24" href="#SEC24">CALLOUTS</a>
+<li><a name="TOC25" href="#SEC25">SEE ALSO</a>
+<li><a name="TOC26" href="#SEC26">AUTHOR</a>
+<li><a name="TOC27" href="#SEC27">REVISION</a>
 </ul>
 <br><a name="SEC1" href="#TOC1">PCRE REGULAR EXPRESSION SYNTAX SUMMARY</a><br>
 <P>
@@ -80,6 +81,7 @@ syntax.
  \D a character that is not a decimal digit
  \h a horizontal whitespace character
  \H a character that is not a horizontal whitespace character
+ \N a character that is not a newline
  \p{<i>xx</i>} a character with the <i>xx</i> property
  \P{<i>xx</i>} a character without the <i>xx</i> property
  \R a newline sequence
@@ -93,7 +95,7 @@ syntax.
 </pre>
 In PCRE, \d, \D, \s, \S, \w, and \W recognize only ASCII characters.
 </P>
-<br><a name="SEC5" href="#TOC1">GENERAL CATEGORY PROPERTY CODES FOR \p and \P</a><br>
+<br><a name="SEC5" href="#TOC1">GENERAL CATEGORY PROPERTIES FOR \p and \P</a><br>
 <P>
 <pre>
  C Other
@@ -142,7 +144,16 @@ In PCRE, \d, \D, \s, \S, \w, and \W recognize only ASCII characters.
  Zs Space separator
 </PRE>
 </P>
-<br><a name="SEC6" href="#TOC1">SCRIPT NAMES FOR \p AND \P</a><br>
+<br><a name="SEC6" href="#TOC1">PCRE SPECIAL CATEGORY PROPERTIES FOR \p and \P</a><br>
+<P>
+<pre>
+ Xan Alphanumeric: union of properties L and N
+ Xps POSIX space: property Z or tab, NL, VT, FF, CR
+ Xsp Perl space: property Z or tab, NL, FF, CR
+ Xwd Perl word: property Xan or underscore
+</PRE>
+</P>
+<br><a name="SEC7" href="#TOC1">SCRIPT NAMES FOR \p AND \P</a><br>
 <P>
 Arabic,
 Armenian,
@@ -237,7 +248,7 @@ Ugaritic,
 Vai,
 Yi.
 </P>
-<br><a name="SEC7" href="#TOC1">CHARACTER CLASSES</a><br>
+<br><a name="SEC8" href="#TOC1">CHARACTER CLASSES</a><br>
 <P>
 <pre>
  [...] positive character class
@@ -264,7 +275,7 @@ Yi.
 In PCRE, POSIX character set names recognize only ASCII characters. You can use
 \Q...\E inside a character class.
 </P>
-<br><a name="SEC8" href="#TOC1">QUANTIFIERS</a><br>
+<br><a name="SEC9" href="#TOC1">QUANTIFIERS</a><br>
 <P>
 <pre>
  ? 0 or 1, greedy
@@ -285,7 +296,7 @@ In PCRE, POSIX character set names recognize only ASCII characters. You can use
  {n,}? n or more, lazy
 </PRE>
 </P>
-<br><a name="SEC9" href="#TOC1">ANCHORS AND SIMPLE ASSERTIONS</a><br>
+<br><a name="SEC10" href="#TOC1">ANCHORS AND SIMPLE ASSERTIONS</a><br>
 <P>
 <pre>
  \b word boundary (only ASCII letters recognized)
@@ -302,19 +313,19 @@ In PCRE, POSIX character set names recognize only ASCII characters. You can use
  \G first matching position in subject
 </PRE>
 </P>
-<br><a name="SEC10" href="#TOC1">MATCH POINT RESET</a><br>
+<br><a name="SEC11" href="#TOC1">MATCH POINT RESET</a><br>
 <P>
 <pre>
  \K reset start of match
 </PRE>
 </P>
-<br><a name="SEC11" href="#TOC1">ALTERNATION</a><br>
+<br><a name="SEC12" href="#TOC1">ALTERNATION</a><br>
 <P>
 <pre>
  expr|expr|expr...
 </PRE>
 </P>
-<br><a name="SEC12" href="#TOC1">CAPTURING</a><br>
+<br><a name="SEC13" href="#TOC1">CAPTURING</a><br>
 <P>
 <pre>
  (...) capturing group
@@ -326,19 +337,19 @@ In PCRE, POSIX character set names recognize only ASCII characters. You can use
  capturing groups in each alternative
 </PRE>
 </P>
-<br><a name="SEC13" href="#TOC1">ATOMIC GROUPS</a><br>
+<br><a name="SEC14" href="#TOC1">ATOMIC GROUPS</a><br>
 <P>
 <pre>
  (?>...) atomic, non-capturing group
 </PRE>
 </P>
-<br><a name="SEC14" href="#TOC1">COMMENT</a><br>
+<br><a name="SEC15" href="#TOC1">COMMENT</a><br>
 <P>
 <pre>
  (?#....) comment (not nestable)
 </PRE>
 </P>
-<br><a name="SEC15" href="#TOC1">OPTION SETTING</a><br>
+<br><a name="SEC16" href="#TOC1">OPTION SETTING</a><br>
 <P>
 <pre>
  (?i) caseless
@@ -355,7 +366,7 @@ newline-setting options with similar syntax:
  (*UTF8) set UTF-8 mode
 </PRE>
 </P>
-<br><a name="SEC16" href="#TOC1">LOOKAHEAD AND LOOKBEHIND ASSERTIONS</a><br>
+<br><a name="SEC17" href="#TOC1">LOOKAHEAD AND LOOKBEHIND ASSERTIONS</a><br>
 <P>
 <pre>
  (?=...) positive look ahead
@@ -365,7 +376,7 @@ newline-setting options with similar syntax:
 </pre>
 Each top-level branch of a look behind must be of a fixed length.
 </P>
-<br><a name="SEC17" href="#TOC1">BACKREFERENCES</a><br>
+<br><a name="SEC18" href="#TOC1">BACKREFERENCES</a><br>
 <P>
 <pre>
  \n reference by number (can be ambiguous)
@@ -379,7 +390,7 @@ Each top-level branch of a look behind must be of a fixed length.
  (?P=name) reference by name (Python)
 </PRE>
 </P>
-<br><a name="SEC18" href="#TOC1">SUBROUTINE REFERENCES (POSSIBLY RECURSIVE)</a><br>
+<br><a name="SEC19" href="#TOC1">SUBROUTINE REFERENCES (POSSIBLY RECURSIVE)</a><br>
 <P>
 <pre>
  (?R) recurse whole pattern
@@ -398,7 +409,7 @@ Each top-level branch of a look behind must be of a fixed length.
  \g'-n' call subpattern by relative number (PCRE extension)
 </PRE>
 </P>
-<br><a name="SEC19" href="#TOC1">CONDITIONAL PATTERNS</a><br>
+<br><a name="SEC20" href="#TOC1">CONDITIONAL PATTERNS</a><br>
 <P>
 <pre>
  (?(condition)yes-pattern)
@@ -417,7 +428,7 @@ Each top-level branch of a look behind must be of a fixed length.
  (?(assert)... assertion condition
 </PRE>
 </P>
-<br><a name="SEC20" href="#TOC1">BACKTRACKING CONTROL</a><br>
+<br><a name="SEC21" href="#TOC1">BACKTRACKING CONTROL</a><br>
 <P>
 The following act immediately they are reached:
 <pre>
@@ -435,7 +446,7 @@ pattern is not anchored.
  (*THEN) local failure, backtrack to next alternation
 </PRE>
 </P>
-<br><a name="SEC21" href="#TOC1">NEWLINE CONVENTIONS</a><br>
+<br><a name="SEC22" href="#TOC1">NEWLINE CONVENTIONS</a><br>
 <P>
 These are recognized only at the very start of the pattern or after a
 (*BSR_...) or (*UTF8) option.
@@ -447,7 +458,7 @@ These are recognized only at the very start of the pattern or after a
  (*ANY) any Unicode newline sequence
 </PRE>
 </P>
-<br><a name="SEC22" href="#TOC1">WHAT \R MATCHES</a><br>
+<br><a name="SEC23" href="#TOC1">WHAT \R MATCHES</a><br>
 <P>
 These are recognized only at the very start of the pattern or after a
 (*...) option that sets the newline convention or UTF-8 mode.
@@ -456,19 +467,19 @@ These are recognized only at the very start of the pattern or after a
  (*BSR_UNICODE) any Unicode newline sequence
 </PRE>
 </P>
-<br><a name="SEC23" href="#TOC1">CALLOUTS</a><br>
+<br><a name="SEC24" href="#TOC1">CALLOUTS</a><br>
 <P>
 <pre>
  (?C) callout
  (?Cn) callout with data n
 </PRE>
 </P>
-<br><a name="SEC24" href="#TOC1">SEE ALSO</a><br>
+<br><a name="SEC25" href="#TOC1">SEE ALSO</a><br>
 <P>
 <b>pcrepattern</b>(3), <b>pcreapi</b>(3), <b>pcrecallout</b>(3),
 <b>pcrematching</b>(3), <b>pcre</b>(3).
 </P>
-<br><a name="SEC25" href="#TOC1">AUTHOR</a><br>
+<br><a name="SEC26" href="#TOC1">AUTHOR</a><br>
 <P>
 Philip Hazel
 <br>
@@ -477,9 +488,9 @@ University Computing Service
 Cambridge CB2 3QH, England.
 <br>
 </P>
-<br><a name="SEC26" href="#TOC1">REVISION</a><br>
+<br><a name="SEC27" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 01 March 2010
+Last updated: 05 May 2010
 <br>
 Copyright © 1997-2010 University of Cambridge.
 <br>
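
Note on the new SEC6 table: the four PCRE special properties are defined purely in terms of Unicode general categories plus a few control characters, so their membership can be approximated outside PCRE. The following is a minimal sketch, assuming Python's standard `unicodedata` module as a stand-in for PCRE's own property tables (the two may be built from different Unicode versions). The function names `is_xan`, `is_xsp`, `is_xps` and `is_xwd` are hypothetical helpers for illustration, not part of PCRE or of this patch.

```python
import unicodedata

# Control characters listed in the SEC6 table: tab, NL, FF, CR (and VT for Xps).
_PERL_SPACE_CONTROLS = frozenset("\t\n\f\r")


def _major_category(ch: str) -> str:
    """Return the first letter of the general category, e.g. 'L' for 'Lu'."""
    return unicodedata.category(ch)[0]


def is_xan(ch: str) -> bool:
    """Xan: alphanumeric, the union of properties L and N."""
    return _major_category(ch) in ("L", "N")


def is_xsp(ch: str) -> bool:
    """Xsp: Perl space, property Z or tab, NL, FF, CR."""
    return _major_category(ch) == "Z" or ch in _PERL_SPACE_CONTROLS


def is_xps(ch: str) -> bool:
    """Xps: POSIX space, the same as Xsp plus VT."""
    return is_xsp(ch) or ch == "\v"


def is_xwd(ch: str) -> bool:
    """Xwd: Perl word, property Xan or underscore."""
    return is_xan(ch) or ch == "_"
```

In this reading, the only difference between Xps and Xsp is the vertical tab, and Xwd differs from Xan only by the underscore, which matches the four table rows added in the hunk above.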