author | Karl Williamson <khw@cpan.org> | 2019-10-27 08:23:01 -0600 |
---|---|---|
committer | Karl Williamson <khw@cpan.org> | 2019-10-31 18:49:15 -0600 |
commit | b98b45946c976c326f3bfd53f1e3519ed9ca2be4 (patch) | |
tree | 31e50ff380e5d695de5953e67a65435e181542cc /pod/perlretut.pod | |
parent | 49e19b770ea6cfa141e896d9b45613db3dd05324 (diff) | |
download | perl-b98b45946c976c326f3bfd53f1e3519ed9ca2be4.tar.gz |
Accept experimental alpha_assertions feature
Diffstat (limited to 'pod/perlretut.pod')
-rw-r--r-- | pod/perlretut.pod | 53 |
1 files changed, 25 insertions, 28 deletions
diff --git a/pod/perlretut.pod b/pod/perlretut.pod
index c722d47e04..78050a1556 100644
--- a/pod/perlretut.pod
+++ b/pod/perlretut.pod
@@ -2321,11 +2321,15 @@ characters on either side differ in their "word-ness".
 The lookahead and lookbehind assertions are generalizations of the
 anchor concept. Lookahead and lookbehind are zero-width assertions
 that let us specify which characters we want to test for. The
-lookahead assertion is denoted by C<(?=regexp)> and the lookbehind
-assertion is denoted by C<< (?<=fixed-regexp) >>. Some examples are
+lookahead assertion is denoted by C<(?=regexp)> or (starting in 5.32,
+experimentally in 5.28) C<(*pla:regexp)> or
+C<(*positive_lookahead:regexp)>; and the lookbehind assertion is denoted
+by C<< (?<=fixed-regexp) >> or (starting in 5.32, experimentally in
+5.28) C<(*plb:fixed-regexp)> or C<(*positive_lookbehind:fixed-regexp)>.
+Some examples are
 
     $x = "I catch the housecat 'Tom-cat' with catnip";
-    $x =~ /cat(?=\s)/;   # matches 'cat' in 'housecat'
+    $x =~ /cat(*pla:\s)/;   # matches 'cat' in 'housecat'
     @catwords = ($x =~ /(?<=\s)cat\w+/g);  # matches,
                                            # $catwords[0] = 'catch'
                                            # $catwords[1] = 'catnip'
@@ -2333,15 +2337,19 @@ assertion is denoted by C<< (?<=fixed-regexp) >>. Some examples are
     $x =~ /(?<=\s)cat(?=\s)/; # doesn't match; no isolated 'cat' in
                               # middle of $x
 
-Note that the parentheses in C<(?=regexp)> and C<< (?<=regexp) >> are
+Note that the parentheses in these are
 non-capturing, since these are zero-width assertions. Thus in the
 second regexp, the substrings captured are those of the whole regexp
-itself. Lookahead C<(?=regexp)> can match arbitrary regexps, but
-lookbehind C<< (?<=fixed-regexp) >> only works for regexps of fixed
-width, I<i.e.>, a fixed number of characters long. Thus
-C<< (?<=(ab|bc)) >> is fine, but C<< (?<=(ab)*) >> is not. The
-negated versions of the lookahead and lookbehind assertions are
+itself. Lookahead can match arbitrary regexps, but
+lookbehind prior to 5.30 C<< (?<=fixed-regexp) >> only works for regexps
+of fixed width, I<i.e.>, a fixed number of characters long. Thus
+C<< (?<=(ab|bc)) >> is fine, but C<< (?<=(ab)*) >> prior to 5.30 is not.
+
+The negated versions of the lookahead and lookbehind assertions are
 denoted by C<(?!regexp)> and C<< (?<!fixed-regexp) >> respectively.
+Or, starting in 5.32 (experimentally in 5.28), C<(*nla:regexp)>,
+C<(*negative_lookahead:regexp)>, C<(*nlb:regexp)>, or
+C<(*negative_lookbehind:regexp)>.
 They evaluate true if the regexps do I<not> match:
 
     $x = "foobar";
@@ -2361,27 +2369,16 @@ by looking ahead and behind:
                   | (?<=-)  (?=\S)   # a '-' followed by any non-space
                   /x, $str;          # @toks = qw(one two - - - 6 - 8)
 
-Starting in Perl 5.28, experimentally, alphabetic equivalents to these
-assertions are added, so you can use whichever is most memorable for
-your tastes.
-
-    (?=...)        (*pla:...) or (*positive_lookahead:...)
-    (?!...)        (*nla:...) or (*negative_lookahead:...)
-    (?<=...)       (*plb:...) or (*positive_lookbehind:...)
-    (?<!...)       (*nlb:...) or (*negative_lookbehind:...)
-    (?>...)        (*atomic:...)
-
-Using any of these will raise (unless turned off) a warning in the
-C<experimental::alpha_assertions> category.
-
 =head2 Using independent subexpressions to prevent backtracking
 
-I<Independent subexpressions> are regular expressions, in the
-context of a larger regular expression, that function independently of
-the larger regular expression. That is, they consume as much or as
-little of the string as they wish without regard for the ability of
-the larger regexp to match. Independent subexpressions are represented
-by C<< (?>regexp) >>. We can illustrate their behavior by first
+I<Independent subexpressions> (or atomic subexpressions) are regular
+expressions, in the context of a larger regular expression, that
+function independently of the larger regular expression. That is, they
+consume as much or as little of the string as they wish without regard
+for the ability of the larger regexp to match. Independent
+subexpressions are represented by
+C<< (?>regexp) >> or (starting in 5.32, experimentally in 5.28)
+C<(*atomic:regexp)>. We can illustrate their behavior by first
 considering an ordinary regexp:
 
     $x = "ab";
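
The patch documents the alphabetic spellings as drop-in equivalents of the symbolic assertions. The following is a minimal sketch for checking that equivalence, not part of the commit. It assumes Perl 5.28 or later, and the variable names are illustrative only. The `no warnings` line is only needed on 5.28 and 5.30, where these forms still warn as experimental.

```perl
use strict;
use warnings;
# Assumption: Perl 5.28+. On 5.28/5.30 the alphabetic forms warn in this
# category; on 5.32+ the line is harmless.
no warnings 'experimental::alpha_assertions';

my $x = "I catch the housecat 'Tom-cat' with catnip";

# Positive lookahead: both spellings should match at the same offset.
$x =~ /cat(?=\s)/    or die "symbolic lookahead did not match";
my $pos_symbolic = $-[0];
$x =~ /cat(*pla:\s)/ or die "alphabetic lookahead did not match";
my $pos_alpha = $-[0];

# Positive lookbehind, alphabetic spelling of (?<=\s).
my @catwords = ($x =~ /(*plb:\s)cat\w+/g);

# Negative lookahead, alphabetic spelling of (?!...).
my $nla_bar = ("foobar" =~ /foo(*nla:bar)/) ? 1 : 0;   # 'bar' follows 'foo'
my $nla_baz = ("foobar" =~ /foo(*nla:baz)/) ? 1 : 0;   # 'baz' does not

# Atomic (independent) subexpression: a* keeps the 'a' it consumed,
# so the trailing 'ab' cannot match against the string "ab".
my $plain    = ("ab" =~ /a*ab/)           ? 1 : 0;
my $atomic   = ("ab" =~ /(?>a*)ab/)       ? 1 : 0;
my $atomic_a = ("ab" =~ /(*atomic:a*)ab/) ? 1 : 0;

print "lookahead offsets: $pos_symbolic $pos_alpha\n";
print "catwords: @catwords\n";
print "nla: $nla_bar $nla_baz\n";
print "plain/atomic/alpha-atomic: $plain $atomic $atomic_a\n";
```

Each pair of spellings is expected to agree. The long names such as `(*positive_lookahead:...)` can be substituted for the short ones in the same way.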