author | Hugo van der Sanden <hv@crypt.org> | 2003-01-21 01:37:03 +0000 |
---|---|---|
committer | hv <hv@crypt.org> | 2003-01-21 01:37:03 +0000 |
commit | 7e8c5daceba7cb185532328a3b67d4ca7ba4811b (patch) | |
tree | 6fbcb357ec74beb075b5c99dbc3f3ac52e68114f /embed.h | |
parent | 388cc4de5f48b02cc9fe9b962f02cf603af02178 (diff) | |
integrate (by hand) #18353 and #18359 from maint-5.8:
Introduce a cache for UTF-8 data: length and byte<->char offset
mapping are stored in a new type of magic. Speeds up length(),
substr(), index(), rindex(), pos(), and some parts of s///.
The speedup varies a lot (depending on the usual suspects: the
access pattern of the data, compiler, CPU), but should be at
least one order of magnitude, and getting to the same magnitude
as byte string speeds, and in some cases (length on unchanged data)
even reaching the byte string speed. On the other hand, in some
cases (index) the byte speed is still faster by a factor of five
or so, but the bottleneck there no longer seems to be
the byte<->char offset mapping (instead, it is the fbm_instr() speed).
There is one cache slot for the length, and only two for the
byte<->char offset mapping (the first one for the start->offset,
and the second for the offset->offset+length, when talking
in substr() terms).
Code this hairy is bound to have hairy trolls hiding under it.
[...]
A small tweak on top of #18353: don't display mg_len bytes of
mg_ptr for PERL_MAGIC_utf8 because that's not what's there.
p4raw-id: //depot/perl@18530
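
The cache layout the commit message describes (one slot for the length, two slots for byte<->char offset pairs) can be sketched roughly as below. This is only an illustration under stated assumptions: the `utf8_cache` struct and the `utf8_cache_reset`, `cached_length`, and `char_to_byte` helpers are hypothetical names, not Perl's actual `PERL_MAGIC_utf8` layout or the real `S_utf8_mg_pos` logic, and the input is assumed to be valid UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical cache: one length slot and two (char, byte) offset pairs.
 * Illustrative only; not the layout Perl stores in its magic. */
typedef struct {
    int    have_len;
    size_t char_len;      /* cached length in characters */
    int    have_pos[2];
    size_t char_pos[2];   /* cached character offsets */
    size_t byte_pos[2];   /* matching byte offsets */
} utf8_cache;

/* Must be called whenever the underlying string changes. */
static void utf8_cache_reset(utf8_cache *c)
{
    memset(c, 0, sizeof *c);
}

/* Advance over n characters starting at byte offset b (clamped at nbytes). */
static size_t skip_chars(const unsigned char *s, size_t nbytes,
                         size_t b, size_t n)
{
    while (n > 0 && b < nbytes) {
        b++;                                  /* lead byte */
        while (b < nbytes && (s[b] & 0xC0) == 0x80)
            b++;                              /* continuation bytes */
        n--;
    }
    return b;
}

/* Length in characters; scans once, then answers from the cache. */
static size_t cached_length(utf8_cache *c, const unsigned char *s,
                            size_t nbytes)
{
    assert(c != NULL && s != NULL);
    if (!c->have_len) {
        size_t i, n = 0;
        for (i = 0; i < nbytes; i++)
            if ((s[i] & 0xC0) != 0x80)
                n++;                          /* count non-continuation bytes */
        c->char_len = n;
        c->have_len = 1;
    }
    return c->char_len;
}

/* Map a character offset to a byte offset. The scan starts from the
 * nearest cached pair at or before char_off instead of from byte 0,
 * and the result is stored in the given slot (0 = start, 1 = end). */
static size_t char_to_byte(utf8_cache *c, int slot, const unsigned char *s,
                           size_t nbytes, size_t char_off)
{
    size_t start_c = 0, start_b = 0, b;
    int i;
    assert(c != NULL && s != NULL && (slot == 0 || slot == 1));
    for (i = 0; i < 2; i++) {
        if (c->have_pos[i] && c->char_pos[i] <= char_off
            && c->char_pos[i] >= start_c) {
            start_c = c->char_pos[i];
            start_b = c->byte_pos[i];
        }
    }
    b = skip_chars(s, nbytes, start_b, char_off - start_c);
    c->have_pos[slot] = 1;
    c->char_pos[slot] = char_off;
    c->byte_pos[slot] = b;
    return b;
}
```

In substr() terms, slot 0 would hold the mapping for the start offset and slot 1 the mapping for offset+length, so a second lookup on the same string can resume from the first cached position rather than rescanning from the beginning.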
Diffstat (limited to 'embed.h')
-rw-r--r-- | embed.h | 6 |
1 files changed, 6 insertions, 0 deletions
@@ -362,6 +362,7 @@
 #define magic_settaint Perl_magic_settaint
 #define magic_setuvar Perl_magic_setuvar
 #define magic_setvec Perl_magic_setvec
+#define magic_setutf8 Perl_magic_setutf8
 #define magic_set_all_env Perl_magic_set_all_env
 #define magic_sizepack Perl_magic_sizepack
 #define magic_wipepack Perl_magic_wipepack
@@ -1093,6 +1094,8 @@
 # if defined(USE_ITHREADS)
 #define gv_share S_gv_share
 # endif
+#define utf8_mg_pos S_utf8_mg_pos
+#define utf8_mg_pos_init S_utf8_mg_pos_init
 #if defined(PERL_COPY_ON_WRITE)
 #define sv_release_COW S_sv_release_COW
 #endif
@@ -1924,6 +1927,7 @@
 #define magic_settaint(a,b) Perl_magic_settaint(aTHX_ a,b)
 #define magic_setuvar(a,b) Perl_magic_setuvar(aTHX_ a,b)
 #define magic_setvec(a,b) Perl_magic_setvec(aTHX_ a,b)
+#define magic_setutf8(a,b) Perl_magic_setutf8(aTHX_ a,b)
 #define magic_set_all_env(a,b) Perl_magic_set_all_env(aTHX_ a,b)
 #define magic_sizepack(a,b) Perl_magic_sizepack(aTHX_ a,b)
 #define magic_wipepack(a,b) Perl_magic_wipepack(aTHX_ a,b)
@@ -2643,6 +2647,8 @@
 # if defined(USE_ITHREADS)
 #define gv_share(a,b) S_gv_share(aTHX_ a,b)
 # endif
+#define utf8_mg_pos(a,b,c,d,e,f,g,h,i) S_utf8_mg_pos(aTHX_ a,b,c,d,e,f,g,h,i)
+#define utf8_mg_pos_init(a,b,c,d,e,f,g) S_utf8_mg_pos_init(aTHX_ a,b,c,d,e,f,g)
 #if defined(PERL_COPY_ON_WRITE)
 #define sv_release_COW(a,b,c,d,e,f) S_sv_release_COW(aTHX_ a,b,c,d,e,f)
 #endif