Make shared hash keys \0-terminated:

one possible resolution for "UTF-8, weird \w behaviour after HASH-KEY-ification"
http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2002-01/msg01327.html

The hash keys were shared (the SvLEN(sv) = 0 was the giveaway). The hash keys weren't \0-terminated. This meant that the EOL ($) in regmatch() got the nextchr beyond the last character. Since the keys were UTF-8, the nextchr was \1, not the usual string-terminating \0. Wham, no match.

I think another possible resolution could be to stop the nextchr computation in regmatch() from peeking beyond the last character of the string:

    nextchr = locinput < PL_regeol ? UCHARAT(locinput) : 0;

p4raw-id: //depot/perl@14908
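
The alternative resolution mentioned in the message can be sketched in isolation. This is a minimal illustration, not the regmatch() code: guarded_nextchr() and unguarded_nextchr() are hypothetical names, and the plain pointers stand in for locinput and PL_regeol.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the guarded peek: return the byte at loc,
 * or 0 when loc is at or past the logical end of the string. */
static unsigned char guarded_nextchr(const unsigned char *loc,
                                     const unsigned char *end)
{
    assert(loc != NULL && end != NULL);
    return loc < end ? *loc : 0;
}

/* The unguarded form: it trusts the buffer to be \0-terminated and
 * returns whatever byte happens to follow the last character. */
static unsigned char unguarded_nextchr(const unsigned char *loc)
{
    assert(loc != NULL);
    return *loc;
}
```

With a 4-byte buffer holding 'a', 'b', 'c', 1 and a logical end at offset 3, the unguarded peek at the end position yields the stray 1, while the guarded peek yields 0, which is what an end-of-line test expects.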
author: Jarkko Hietaniemi <jhi@iki.fi> 2002-02-28 05:43:45 +0000
committer: Jarkko Hietaniemi <jhi@iki.fi> 2002-02-28 05:43:45 +0000
commit: e05949c7fbf3ae0363947bc70c1c662248b91b93 (patch)
tree: a4673734f9526bd3b579333443c956499954741f /hv.c
parent: 4379a6f8153cde10f045a82b5434d852f701ae7a (diff)
download: perl-e05949c7fbf3ae0363947bc70c1c662248b91b93.tar.gz
1 files changed, 2 insertions, 1 deletions
diff --git a/hv.c b/hv.c
index e4cc6c999d..7efa0869db 100644
--- a/hv.c
+++ b/hv.c
@@ -85,9 +85,10 @@ S_save_hek(pTHX_ const char *str, I32 len, U32 hash)
       is_utf8 = TRUE;
     }
 
-    New(54, k, HEK_BASESIZE + len + 1, char);
+    New(54, k, HEK_BASESIZE + len + 2, char);
     hek = (HEK*)k;
     Copy(str, HEK_KEY(hek), len, char);
+    HEK_KEY(hek)[len] = 0;
     HEK_LEN(hek) = len;
     HEK_HASH(hek) = hash;
     HEK_UTF8(hek) = (char)is_utf8;
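
The idea behind the change, copying len bytes and then writing an explicit terminator into a slightly larger allocation, can be sketched outside of Perl's internals. This is a simplified illustration under stated assumptions: struct shared_key and save_key() are hypothetical and do not reproduce the real HEK layout or the New()/Copy() macros.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a HEK: a length-tagged key. */
struct shared_key {
    size_t len;
    char key[];    /* len bytes of key data, then one '\0' */
};

/* Copy len bytes of str into a fresh key and terminate it, so code
 * that peeks one byte past the key sees '\0' rather than garbage.
 * Returns NULL if the allocation fails; the caller frees the result. */
static struct shared_key *save_key(const char *str, size_t len)
{
    struct shared_key *k;

    assert(str != NULL);
    k = malloc(sizeof *k + len + 1);   /* +1 for the terminator */
    if (k == NULL)
        return NULL;
    memcpy(k->key, str, len);
    k->key[len] = '\0';
    k->len = len;
    return k;
}
```

The stored length stays authoritative; the terminator is only a safety net for readers that look at the byte just past the key.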