author | ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15> | 2008-04-28 15:10:02 +0000 |
---|---|---|
committer | ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15> | 2008-04-28 15:10:02 +0000 |
commit | 5866158e01cc19c2a8fff7fffa61de5376a938d0 (patch) | |
tree | 7759638de83997a18a99299a741082b8e5b32477 /doc/html/pcrepattern.html | |
parent | ccea1b4ed51d39d72efa77127d0ebbc10c1ea7fe (diff) | |
download | pcre-5866158e01cc19c2a8fff7fffa61de5376a938d0.tar.gz |
Tidies for the 7.7-RC1 distribution.
git-svn-id: svn://vcs.exim.org/pcre/code/trunk@345 2f5784b3-3f2a-0410-8824-cb99058d5e15
Diffstat (limited to 'doc/html/pcrepattern.html')
-rw-r--r-- | doc/html/pcrepattern.html | 75 |
1 files changed, 58 insertions, 17 deletions
diff --git a/doc/html/pcrepattern.html b/doc/html/pcrepattern.html
index 237816f..9cc055c 100644
--- a/doc/html/pcrepattern.html
+++ b/doc/html/pcrepattern.html
@@ -35,18 +35,25 @@ man page, in case the conversion went wrong.
 <li><a name="TOC20" href="#SEC20">COMMENTS</a>
 <li><a name="TOC21" href="#SEC21">RECURSIVE PATTERNS</a>
 <li><a name="TOC22" href="#SEC22">SUBPATTERNS AS SUBROUTINES</a>
-<li><a name="TOC23" href="#SEC23">CALLOUTS</a>
-<li><a name="TOC24" href="#SEC24">BACKTRACKING CONTROL</a>
-<li><a name="TOC25" href="#SEC25">SEE ALSO</a>
-<li><a name="TOC26" href="#SEC26">AUTHOR</a>
-<li><a name="TOC27" href="#SEC27">REVISION</a>
+<li><a name="TOC23" href="#SEC23">ONIGURUMA SUBROUTINE SYNTAX</a>
+<li><a name="TOC24" href="#SEC24">CALLOUTS</a>
+<li><a name="TOC25" href="#SEC25">BACKTRACKING CONTROL</a>
+<li><a name="TOC26" href="#SEC26">SEE ALSO</a>
+<li><a name="TOC27" href="#SEC27">AUTHOR</a>
+<li><a name="TOC28" href="#SEC28">REVISION</a>
 </ul>
 <br><a name="SEC1" href="#TOC1">PCRE REGULAR EXPRESSION DETAILS</a><br>
 <P>
 The syntax and semantics of the regular expressions that are supported by PCRE
 are described in detail below. There is a quick-reference syntax summary in the
 <a href="pcresyntax.html"><b>pcresyntax</b></a>
-page. Perl's regular expressions are described in its own documentation, and
+page. PCRE tries to match Perl syntax and semantics as closely as it can. PCRE
+also supports some alternative regular expression syntax (which does not
+conflict with the Perl syntax) in order to provide some compatibility with
+regular expressions in Python, .NET, and Oniguruma.
+</P>
+<P>
+Perl's regular expressions are described in its own documentation, and
 regular expressions in general are covered in a number of books, some of which
 have copious examples. Jeffrey Friedl's "Mastering Regular Expressions",
 published by O'Reilly, covers regular expressions in great detail. This
@@ -312,6 +319,17 @@ following the discussion of
 <a href="#subpattern">parenthesized subpatterns.</a>
 </P>
 <br><b>
+Absolute and relative subroutine calls
+</b><br>
+<P>
+For compatibility with Oniguruma, the non-Perl syntax \g followed by a name or
+a number enclosed either in angle brackets or single quotes, is an alternative
+syntax for referencing a subpattern as a "subroutine". Details are discussed
+<a href="#onigurumasubroutines">later.</a>
+Note that \g{...} (Perl syntax) and \g<...> (Oniguruma syntax) are <i>not</i>
+synonymous. The former is a back reference; the latter is a subroutine call.
+</P>
+<br><b>
 Generic character types
 </b><br>
 <P>
@@ -1231,7 +1249,11 @@ which may be several bytes long (and they may be of different lengths).
 </P>
 <P>
 The quantifier {0} is permitted, causing the expression to behave as if the
-previous item and the quantifier were not present.
+previous item and the quantifier were not present. This may be useful for
+subpatterns that are referenced as
+<a href="#subpatternsassubroutines">subroutines</a>
+from elsewhere in the pattern. Items other than subpatterns that have a {0}
+quantifier are omitted from the compiled pattern.
 </P>
 <P>
 For convenience, the three most common quantifiers have single-character
@@ -2031,8 +2053,26 @@ changed for different calls. For example, consider this pattern:
 </pre>
 It matches "abcabc". It does not match "abcABC" because the change of
 processing option does not affect the called subpattern.
+<a name="onigurumasubroutines"></a></P>
+<br><a name="SEC23" href="#TOC1">ONIGURUMA SUBROUTINE SYNTAX</a><br>
+<P>
+For compatibility with Oniguruma, the non-Perl syntax \g followed by a name or
+a number enclosed either in angle brackets or single quotes, is an alternative
+syntax for referencing a subpattern as a subroutine, possibly recursively. Here
+are two of the examples used above, rewritten using this syntax:
+<pre>
+ (?<pn> \( ( (?>[^()]+) | \g<pn> )* \) )
+ (sens|respons)e and \g'1'ibility
+</pre>
+PCRE supports an extension to Oniguruma: if a number is preceded by a
+plus or a minus sign it is taken as a relative reference. For example:
+<pre>
+ (abc)(?i:\g<-1>)
+</pre>
+Note that \g{...} (Perl syntax) and \g<...> (Oniguruma syntax) are <i>not</i>
+synonymous. The former is a back reference; the latter is a subroutine call.
 </P>
-<br><a name="SEC23" href="#TOC1">CALLOUTS</a><br>
+<br><a name="SEC24" href="#TOC1">CALLOUTS</a><br>
 <P>
 Perl has a feature whereby using the sequence (?{...}) causes arbitrary Perl
 code to be obeyed in the middle of matching a regular expression. This makes it
@@ -2067,7 +2107,7 @@ description of the interface to the callout function is given in the
 <a href="pcrecallout.html"><b>pcrecallout</b></a>
 documentation.
 </P>
-<br><a name="SEC24" href="#TOC1">BACKTRACKING CONTROL</a><br>
+<br><a name="SEC25" href="#TOC1">BACKTRACKING CONTROL</a><br>
 <P>
 Perl 5.10 introduced a number of "Special Backtracking Control Verbs", which
 are described in the Perl documentation as "experimental and subject to change
@@ -2076,9 +2116,10 @@ production code should be noted to avoid problems during upgrades." The same
 remarks apply to the PCRE features described in this section.
 </P>
 <P>
-Since these verbs are specifically related to backtracking, they can be used
-only when the pattern is to be matched using <b>pcre_exec()</b>, which uses a
-backtracking algorithm. They cause an error if encountered by
+Since these verbs are specifically related to backtracking, most of them can be
+used only when the pattern is to be matched using <b>pcre_exec()</b>, which uses
+a backtracking algorithm. With the exception of (*FAIL), which behaves like a
+failing negative assertion, they cause an error if encountered by
 <b>pcre_dfa_exec()</b>.
 </P>
 <P>
@@ -2182,11 +2223,11 @@ the end of the group if FOO succeeds); on failure the matcher skips to the
 second alternative and tries COND2, without backtracking into COND1. If (*THEN)
 is used outside of any alternation, it acts exactly like (*PRUNE).
 </P>
-<br><a name="SEC25" href="#TOC1">SEE ALSO</a><br>
+<br><a name="SEC26" href="#TOC1">SEE ALSO</a><br>
 <P>
 <b>pcreapi</b>(3), <b>pcrecallout</b>(3), <b>pcrematching</b>(3), <b>pcre</b>(3).
 </P>
-<br><a name="SEC26" href="#TOC1">AUTHOR</a><br>
+<br><a name="SEC27" href="#TOC1">AUTHOR</a><br>
 <P>
 Philip Hazel
 <br>
@@ -2195,11 +2236,11 @@ University Computing Service
 Cambridge CB2 3QH, England.
 <br>
 </P>
-<br><a name="SEC27" href="#TOC1">REVISION</a><br>
+<br><a name="SEC28" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 17 September 2007
+Last updated: 19 April 2008
 <br>
-Copyright © 1997-2007 University of Cambridge.
+Copyright © 1997-2008 University of Cambridge.
 <br>
 <p>
 Return to the <a href="index.html">PCRE index page</a>.
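
Note on the documentation change: the new text stresses that \g{...} is a back reference while \g<...> is a subroutine call. The sketch below illustrates that difference using the "(sens|respons)e and ...ibility" example from the diff. It is not PCRE code. It assumes Python's standard `re` module, which has no subroutine-call syntax, so the call is emulated by writing the subpattern out a second time. The names `BACKREF`, `SUBROUTINE`, and `matches` are hypothetical and used only for illustration.

```python
import re

# Back reference: \1 must match the same text that group 1 captured.
BACKREF = re.compile(r"(sens|respons)e and \1ibility")

# Subroutine call, emulated: Python's re module has no \g<1> call syntax,
# so the subpattern is repeated verbatim (an assumption for illustration).
# The second copy is matched afresh, independent of what group 1 captured.
SUBROUTINE = re.compile(r"(sens|respons)e and (?:sens|respons)ibility")


def matches(pattern, text):
    """Return True if the whole of text matches pattern."""
    return pattern.fullmatch(text) is not None


if __name__ == "__main__":
    for text in ("sense and sensibility", "sense and responsibility"):
        print(text, matches(BACKREF, text), matches(SUBROUTINE, text))
```

With a back reference, the second half has to repeat the captured text, so "sense and responsibility" is rejected. With a subroutine call the subpattern is applied again from scratch, so the same string is accepted.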