author: David Mitchell <davem@iabyn.com> 2012-03-18 15:53:40 +0000
committer: David Mitchell <davem@iabyn.com> 2012-06-13 13:32:50 +0100
commit: d24ca0c5f11250dcd2552c84a048bda5786ba8d1
tree: fcb4bd939264b649391c45a71cca9af257d5c6dc /pod/perldebguts.pod
parent: e485beb85b12595f4a784d37e5f42d36644128ba
Fix up runtime regex codeblocks.

The previous commits in this branch have brought literal code blocks
into the New World Order; now do the same for runtime blocks, i.e. those
needing "use re 'eval'".

The main user-visible changes from this commit are that:

* the code is now fully parsed, rather than needing balanced {}'s; i.e.
  this now works:

      my $code = q[ (?{ $a = '{' }) ];
      use re 'eval';
      /$code/

* warnings and errors are now reported as coming from "(eval NNN)" rather
  than "(re_eval NNN)" (although see the next commit for some fixups to
  that). Indeed, the string "re_eval" has been expunged from the source
  and documentation.

The big internal difference is that the sv_compile_2op() and
sv_compile_2op_is_broken() functions are no longer used, and will be
removed shortly.

It works by the regex compiler detecting the presence of run-time code
blocks, and feeding the whole pattern string back into the parser (where
the run-time blocks are now seen as compile-time), then extracting out
any compiled code blocks and adding them to the mix.

For example, in the following:

    $c = '(?{"runtime"})d';
    use re 'eval';
    /a(?{"literal"})\b'$c/

At the point the regex compiler is called, the perl parser will already
have compiled the literal code block and presented it to the regex
engine. The engine examines the pattern string, sees two '(?{', but only
one accounted for by the parser, and so constructs a short string to be
evalled: based on the pattern, but with literal code-blocks blanked out,
and \ and ' escaped. In the above example, the pattern string is

    a(?{"literal"})\b'(?{"runtime"})d

and we call eval_sv() with an SV containing the text

    qr'a              \\b\'(?{"runtime"})d'

The returned qr will contain the new code-block (and associated CV and
pad) which can be extracted and added to the list of compiled code blocks
of the original pattern.

Note that with this scheme, the requirement for "use re 'eval'" is easily
determined, and no longer requires all the pp_regcreset /
PL_reginterp_cnt machinery, which will be removed shortly.

Two subtleties of this scheme are that normally, \\ isn't collapsed into
\ for literal regexes (unlike literal strings), and hints aren't
inherited when using eval_sv(). We get round both of these by adding and
setting a new flag, PL_reg_state.re_reparsing, which indicates that we
are refeeding a pattern into the perl parser.
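
The blanking and escaping step described in the message can be sketched in
pure Perl. This is only an illustration, not the C code in the regex
compiler: build_reparse_text() and its [offset, length] argument format are
hypothetical names invented here, and the sketch assumes the caller already
knows where the parser-compiled literal blocks sit in the pattern string.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of perl): build the text that would be
# handed back to the parser for re-parsing.
#   $pat    - the full pattern string as seen by the regex compiler
#   @blocks - [offset, length] pairs for code blocks the parser has
#             already compiled (assumed to be known by the caller)
sub build_reparse_text {
    my ($pat, @blocks) = @_;
    my $text = $pat;

    # Blank out each already-compiled block with spaces, so the offsets
    # of everything else in the pattern stay unchanged.
    for my $blk (@blocks) {
        my ($off, $len) = @$blk;
        substr($text, $off, $len) = ' ' x $len;
    }

    # Escape backslash and single quote so the text survives qr'...'.
    $text =~ s/([\\'])/\\$1/g;

    return "qr'$text'";
}

# The pattern string from the example in the commit message.
my $pat     = q{a(?{"literal"})\b'(?{"runtime"})d};
my $literal = '(?{"literal"})';

my $text = build_reparse_text($pat, [ index($pat, $literal), length $literal ]);
print "$text\n";
```

Padding with spaces rather than deleting the literal block keeps the
remaining code block at its original offset, which is presumably what lets
the compiled block be matched back up with the original pattern.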
Diffstat (limited to 'pod/perldebguts.pod')
-rw-r--r-- pod/perldebguts.pod | 7
1 files changed, 3 insertions, 4 deletions
diff --git a/pod/perldebguts.pod b/pod/perldebguts.pod
index 8ae6e7baa9..fdddf4a0f6 100644
--- a/pod/perldebguts.pod
+++ b/pod/perldebguts.pod
@@ -38,7 +38,6 @@ Each array C<@{"_<$filename"}> holds the lines of $filename for a
file compiled by Perl. The same is also true for C<eval>ed strings
that contain subroutines, or which are currently being executed.
The $filename for C<eval>ed strings looks like C<(eval 34)>.
-Code assertions in regexes look like C<(re_eval 19)>.
Values in this array are magical in numeric context: they compare
equal to zero only if the line is not breakable.
@@ -53,14 +52,14 @@ C<"$break_condition\0$action">.
The same holds for evaluated strings that contain subroutines, or
which are currently being executed. The $filename for C<eval>ed strings
-looks like C<(eval 34)> or C<(re_eval 19)>.
+looks like C<(eval 34)>.
=item *
Each scalar C<${"_<$filename"}> contains C<"_<$filename">. This is
also the case for evaluated strings that contain subroutines, or
which are currently being executed. The $filename for C<eval>ed
-strings looks like C<(eval 34)> or C<(re_eval 19)>.
+strings looks like C<(eval 34)>.
=item *
@@ -81,7 +80,7 @@ also exists.
A hash C<%DB::sub> is maintained, whose keys are subroutine names
and whose values have the form C<filename:startline-endline>.
C<filename> has the form C<(eval 34)> for subroutines defined inside
-C<eval>s, or C<(re_eval 19)> for those within regex code assertions.
+C<eval>s.
=item *