author Jarkko Hietaniemi <jhi@iki.fi> 2001-01-15 05:02:24 +0000
committer Jarkko Hietaniemi <jhi@iki.fi> 2001-01-15 05:02:24 +0000
commit 9aa983d27b0af31badfcbbb76567f6e557076b41
tree a3290ebe9e4a9773e967a8beb2895428f7e717c7 /pp_ctl.c
parent 78d8f6e05211de1a60b4bb9b795b8ff72f179ebe
More UTF-8 patches from Inaba Hiroto.
- The substr lval was still not okay.
- Now pp_stringify and sv_setsv copies source's UTF8 flag even if IN_BYTE.
  pp_stringify is called from fold_constants at optimization phase and
  "\x{100}" was made SvUTF8_off under use bytes (the bytes pragma is for
  "byte semantics" and not for "do not produce UTF8 data")
- New `qu' operator to generate UTF8 string explicitly. Though I agree with
  the policy "0x00-0xff always produce bytes", sometimes want to such a
  string to be coded in UTF8. I can use pack"U0a*" but it requires more
  typing and has runtime overhead.
- Fix pp_regcomp bug uncovered by "0x00-0xff always produce bytes" change,
  the bug appears if a pm has PMdf_UTF8 flag but interpolated string is not
  UTF8_on and has char 0x80-0xff.

TODO: document and test qu.

p4raw-id: //depot/perl@8439
Diffstat (limited to 'pp_ctl.c')
-rw-r--r-- pp_ctl.c 7
1 files changed, 6 insertions, 1 deletions
diff --git a/pp_ctl.c b/pp_ctl.c
index 07545dc28a..5490221d0b 100644
--- a/pp_ctl.c
+++ b/pp_ctl.c
@@ -116,9 +116,14 @@ PP(pp_regcomp)
         pm->op_pmflags = pm->op_pmpermflags; /* reset case sensitivity */
         if (DO_UTF8(tmpstr))
             pm->op_pmdynflags |= PMdf_DYN_UTF8;
-        else
+        else {
             pm->op_pmdynflags &= ~PMdf_DYN_UTF8;
+            if (pm->op_pmdynflags & PMdf_UTF8)
+                t = bytes_to_utf8(t, &len);
+        }
         pm->op_pmregexp = CALLREGCOMP(aTHX_ t, t + len, pm);
+        if (!DO_UTF8(tmpstr) && (pm->op_pmdynflags & PMdf_UTF8))
+            Safefree(t);
         PL_reginterp_cnt = 0; /* XXXX Be extra paranoid - needed
                                  inside tie/overload accessors. */
     }
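
Note on the hunk above: the pattern string is upgraded from bytes to UTF-8 before it is compiled, and the upgraded copy is freed afterwards because bytes_to_utf8 returns newly allocated memory. The following is a minimal standalone sketch of that kind of upgrade, not Perl's actual implementation. The name latin1_to_utf8 is hypothetical, and plain malloc/free stand in for Perl's allocator and Safefree.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for Perl's bytes_to_utf8(): treats each input byte
 * as a code point U+0000..U+00FF and returns a newly allocated UTF-8 copy.
 * *len is updated to the encoded length (excluding the trailing NUL).
 * Returns NULL if allocation fails; the caller owns the result. */
unsigned char *latin1_to_utf8(const unsigned char *src, size_t *len)
{
    /* Worst case: every byte expands to two bytes, plus a trailing NUL. */
    unsigned char *dst = malloc(*len * 2 + 1);
    unsigned char *d = dst;
    size_t i;

    if (dst == NULL)
        return NULL;

    for (i = 0; i < *len; i++) {
        unsigned char c = src[i];
        if (c < 0x80) {
            /* 0x00-0x7f: identical in bytes and in UTF-8. */
            *d++ = c;
        } else {
            /* 0x80-0xff: two-byte sequence 110xxxxx 10xxxxxx. */
            *d++ = (unsigned char)(0xC0 | (c >> 6));
            *d++ = (unsigned char)(0x80 | (c & 0x3F));
        }
    }
    *d = '\0';
    *len = (size_t)(d - dst);
    return dst;
}
```

Because the result is a fresh buffer, the caller has to release it once it is no longer needed, which is the role of the Safefree(t) call added after CALLREGCOMP in the patch.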