author: David Mitchell <davem@iabyn.com> 2012-03-18 15:53:40 +0000
committer: David Mitchell <davem@iabyn.com> 2012-06-13 13:32:50 +0100
commit: d24ca0c5f11250dcd2552c84a048bda5786ba8d1 (patch)
tree: fcb4bd939264b649391c45a71cca9af257d5c6dc /embed.fnc
parent: e485beb85b12595f4a784d37e5f42d36644128ba (diff)
Fix up runtime regex codeblocks.

The previous commits in this branch have brought literal code blocks
into the New World Order; now do the same for runtime blocks, i.e. those
needing "use re 'eval'".

The main user-visible changes from this commit are that:

* the code is now fully parsed, rather than needing balanced {}'s; i.e.
  this now works:

      my $code = q[ (?{ $a = '{' }) ];
      use re 'eval';
      /$code/

* warnings and errors are now reported as coming from "(eval NNN)" rather
  than "(re_eval NNN)" (although see the next commit for some fixups to
  that). Indeed, the string "re_eval" has been expunged from the source
  and documentation.

The big internal difference is that the sv_compile_2op() and
sv_compile_2op_is_broken() functions are no longer used, and will be
removed shortly.

It works by the regex compiler detecting the presence of run-time code
blocks, and feeding the whole pattern string back into the parser (where
the run-time blocks are now seen as compile-time), then extracting out
any compiled code blocks and adding them to the mix.

For example, in the following:

    $c = '(?{"runtime"})d';
    use re 'eval';
    /a(?{"literal"})\b'$c/

At the point the regex compiler is called, the perl parser will already
have compiled the literal code block and presented it to the regex engine.
The engine examines the pattern string, sees two '(?{', but only one
accounted for by the parser, and so constructs a short string to be
evalled: based on the pattern, but with literal code-blocks blanked out,
and \ and ' escaped.

In the above example, the pattern string is

    a(?{"literal"})\b'(?{"runtime"})d

and we call eval_sv() with an SV containing the text

    qr'a              \\b\'(?{"runtime"})d'

The returned qr will contain the new code-block (and associated CV and
pad) which can be extracted and added to the list of compiled code blocks
of the original pattern.

Note that with this scheme, the requirement for "use re 'eval'" is easily
determined, and no longer requires all the pp_regcreset / PL_reginterp_cnt
machinery, which will be removed shortly.

Two subtleties of this scheme are that normally, \\ isn't collapsed into \
for literal regexes (unlike literal strings), and hints aren't inherited
when using eval_sv(). We get round both of these by adding and setting a
new flag, PL_reg_state.re_reparsing, which indicates that we are refeeding
a pattern into the perl parser.
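
The string-construction step described in the message (blank out the already-compiled literal code blocks, escape \ and ', wrap the result in qr'...') can be sketched in standalone C. This is only an illustration under stated assumptions, not the code in the perl source: the code_span type and the build_reparse_string() function are hypothetical names invented here, and the sketch assumes the caller already knows the byte offsets of each literal code block.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical span type: byte offsets [start, end) of one code block
 * that the parser has already compiled. Not a perl core structure. */
typedef struct {
    size_t start;
    size_t end;
} code_span;

/* Hypothetical helper: build "qr'...'" from pat. Bytes inside any span
 * become spaces; every backslash or single quote outside the spans is
 * preceded by a backslash. Returns a malloc'd NUL-terminated string,
 * or NULL if allocation fails. The caller frees the result. */
char *build_reparse_string(const char *pat, const code_span *spans,
                           size_t nspans)
{
    assert(pat != NULL);

    size_t len = strlen(pat);
    /* Worst case: every byte is escaped, plus "qr'", "'" and the NUL. */
    char *out = malloc(2 * len + 5);
    if (out == NULL)
        return NULL;

    char *p = out;
    *p++ = 'q';
    *p++ = 'r';
    *p++ = '\'';

    for (size_t i = 0; i < len; i++) {
        int in_literal = 0;
        for (size_t j = 0; j < nspans; j++) {
            if (i >= spans[j].start && i < spans[j].end) {
                in_literal = 1;
                break;
            }
        }
        if (in_literal) {
            /* Already compiled by the parser: blank it out. */
            *p++ = ' ';
        } else {
            if (pat[i] == '\\' || pat[i] == '\'')
                *p++ = '\\';
            *p++ = pat[i];
        }
    }

    *p++ = '\'';
    *p = '\0';
    return out;
}
```

Called on the pattern from the message with a single span covering (?{"literal"}), i.e. offsets 1 to 15, the sketch yields the same text as shown above: qr'a, fourteen spaces, then \\b\'(?{"runtime"})d'. Replacing the literal block with spaces rather than deleting it keeps every remaining character at its original offset, which is one plausible reason for blanking; the message itself does not state the motivation.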
Diffstat (limited to 'embed.fnc')
-rw-r--r-- embed.fnc | 3
1 files changed, 2 insertions, 1 deletions
diff --git a/embed.fnc b/embed.fnc
index e05af38344..faf1f85a44 100644
--- a/embed.fnc
+++ b/embed.fnc
@@ -2121,7 +2121,8 @@ s |char* |scan_ident |NN char *s|NN const char *send|NN char *dest \
|STRLEN destlen|I32 ck_uni
sR |char* |scan_inputsymbol|NN char *start
sR |char* |scan_pat |NN char *start|I32 type
-sR |char* |scan_str |NN char *start|int keep_quoted|int keep_delims
+sR |char* |scan_str |NN char *start|int keep_quoted \
+ |int keep_delims|int re_reparse
sR |char* |scan_subst |NN char *start
sR |char* |scan_trans |NN char *start
s |char* |scan_word |NN char *s|NN char *dest|STRLEN destlen \