author: David Mitchell <davem@iabyn.com> 2013-05-18 15:05:57 +0100
committer: David Mitchell <davem@iabyn.com> 2013-06-02 22:28:50 +0100
commit: 52a21eb36148cc4f249f436a989e2cfe5c6bab1f
tree: e85cc834c1b706d4f5245c6aef489970a0e6621a /pod/perlreapi.pod
parent: f9176b44e50593d8f3446da63d3989558f6d4c20
add strbeg argument to Perl_re_intuit_start()
(Note that this is a change both to the perl API and to the regex engine plugin API.)

Currently, Perl_re_intuit_start() is passed an SV, plus pointers to where in the string to start matching (strpos) and to the end of the string (strend).

Unlike Perl_regexec_flags(), it doesn't also have a strbeg arg. Because of this, it guesses strbeg: based on the passed SV if it's SvPOK(), or just set to strpos otherwise. The latter can happen if, for example, the SV is overloaded. Note also that this latter guess is wrong, and could in theory make /\b.../ fail.

But just to confuse matters, although Perl_re_intuit_start() itself uses its guesstimate strbeg var, some of the functions it calls use the global value of PL_bostr instead. To make this work, the *callers* of Perl_re_intuit_start() currently set PL_bostr first. This is why \b doesn't actually break.

The fix to this unholy mess is simply to add a strbeg arg to Perl_re_intuit_start(). It's also the first step to eliminating PL_bostr altogether.
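To illustrate why guessing strbeg as strpos is wrong, here is a minimal standalone sketch in plain C. It does not use the perl source or API: is_word_char() and at_word_boundary() are hypothetical helpers written only for this illustration, and the word-character test is a simplified ASCII stand-in for what \b really checks.

```c
#include <ctype.h>

/* Hypothetical helper (not part of perl): ASCII-only notion of a word char. */
static int is_word_char(char c)
{
    return isalnum((unsigned char)c) || c == '_';
}

/* Hypothetical helper (not part of perl): is there a \b-style boundary at p,
 * for a string occupying [strbeg, strend)?  The character before p is only
 * examined if p lies after strbeg, so the answer depends on strbeg being
 * the real beginning of the string. */
static int at_word_boundary(const char *strbeg, const char *p,
                            const char *strend)
{
    int before = (p > strbeg) && is_word_char(p[-1]);
    int after  = (p < strend) && is_word_char(*p);
    return before != after;
}
```

Under these assumptions, take the string "foobar" with strpos pointing at the 'b' (offset 3). With the real strbeg, at_word_boundary(s, s + 3, s + 6) is 0, because 'o' and 'b' are both word characters. With strbeg guessed as strpos, at_word_boundary(s + 3, s + 3, s + 6) is 1, because the check can no longer see the preceding 'o'. Passing the true strbeg explicitly removes the need for the guess.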
Diffstat (limited to 'pod/perlreapi.pod')
-rw-r--r-- pod/perlreapi.pod 27
1 files changed, 24 insertions, 3 deletions
diff --git a/pod/perlreapi.pod b/pod/perlreapi.pod
index 3d0962ac8d..c4e30cb4a8 100644
--- a/pod/perlreapi.pod
+++ b/pod/perlreapi.pod
@@ -21,6 +21,7 @@ following format:
void* data, U32 flags);
char* (*intuit) (pTHX_
REGEXP * const rx, SV *sv,
+ const char * const strbeg,
char *strpos, char *strend, U32 flags,
struct re_scream_pos_data_s *data);
SV* (*checkstr) (pTHX_ REGEXP * const rx);
@@ -286,9 +287,14 @@ Optimisation flags; subject to change.
=head2 intuit
- char* intuit(pTHX_ REGEXP * const rx,
- SV *sv, char *strpos, char *strend,
- const U32 flags, struct re_scream_pos_data_s *data);
+ char* intuit(pTHX_
+ REGEXP * const rx,
+ SV *sv,
+ const char * const strbeg,
+ char *strpos,
+ char *strend,
+ const U32 flags,
+ struct re_scream_pos_data_s *data);
Find the start position where a regex match should be attempted,
or possibly if the regex engine should not be run because the
@@ -296,6 +302,21 @@ pattern can't match. This is called, as appropriate, by the core,
depending on the values of the C<extflags> member of the C<regexp>
structure.
+Arguments:
+
+ rx: the regex to match against
+ sv: the SV being matched: only used for utf8 flag; the string
+ itself is accessed via the pointers below. Note that on
+ something like an overloaded SV, SvPOK(sv) may be false
+ and the string pointers may point to something unrelated to
+ the SV itself.
+ strbeg: real beginning of string
+ strpos: the point in the string at which to begin matching
+ strend: pointer to the byte following the last char of the string
+ flags currently unused; set to 0
+ data: currently unused; set to NULL
+
+
=head2 checkstr
SV* checkstr(pTHX_ REGEXP * const rx);