author | Nicholas Clark <nick@ccl4.org> | 2010-11-11 16:08:43 +0000 |
---|---|---|
committer | Nicholas Clark <nick@ccl4.org> | 2010-11-11 16:08:43 +0000 |
commit | fed3ba5d6b9222e6e73844680734b059e616c86b (patch) | |
tree | c8a449308b28520170011d015883c39c887fb9e8 /embed.h | |
parent | 08a6f934b8306af074a22b05f6de14f564a9da18 (diff) | |
download | perl-fed3ba5d6b9222e6e73844680734b059e616c86b.tar.gz |
Add Perl_bytes_cmp_utf8() to compare character sequences in different encodings
Convert sv_eq_flags() and sv_cmp_flags() to use it.
Previously, to compare two strings of characters, where one was in UTF-8 and
one was not, you had to either:
1: Upgrade the second to UTF-8
2: Compare the resulting octet sequence
3: Free the temporary UTF-8 string
or:
1: Attempt to downgrade the first to bytes. If it can't be, they aren't equal
2: Else compare the resulting octet sequence
3: Free the temporary byte string
In the general case, either route involves a malloc()/free() and at least two
O(n) scans per comparison.
By contrast, this approach needs no allocation and makes a single O(n) scan,
which terminates as early as the best case for the second approach.
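
To illustrate the single-scan idea, here is a minimal standalone sketch. It is
not the code added by this commit: the name latin1_cmp_utf8(), its signature
and its -1/0/+1 return convention are hypothetical, and it assumes the second
argument is well-formed UTF-8. It walks both strings once, decoding each UTF-8
character on the fly and comparing it with the corresponding byte, so nothing
is allocated and the loop stops at the first difference.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch, not Perl_bytes_cmp_utf8() itself.
 * b holds one code point per byte (0..255); u is assumed to be
 * well-formed UTF-8. Returns -1, 0 or +1 by code point order. */
static int latin1_cmp_utf8(const unsigned char *b, size_t blen,
                           const unsigned char *u, size_t ulen)
{
    const unsigned char *bend = b + blen;
    const unsigned char *uend = u + ulen;

    assert(b != NULL && u != NULL);

    while (b < bend && u < uend) {
        unsigned int c = *u++;

        if (c >= 0x80) {
            /* Lead bytes 0xC2 and 0xC3 encode U+0080..U+00FF. A larger
             * lead byte starts a code point above 0xFF, which no single
             * byte can equal, so the byte string sorts first. A truncated
             * sequence is treated the same way in this sketch. */
            if (c > 0xC3 || u == uend)
                return -1;
            c = ((c & 0x1F) << 6) | (*u++ & 0x3F);
        }
        if (*b != c)
            return *b < c ? -1 : 1;   /* first difference ends the scan */
        b++;
    }

    if (b == bend && u == uend)
        return 0;                     /* same characters, same length */
    return b == bend ? -1 : 1;        /* the shorter string sorts first */
}
```

For example, the four bytes 'c' 'a' 'f' 0xE9 compare equal to the five UTF-8
octets 'c' 'a' 'f' 0xC3 0xA9, because 0xC3 0xA9 decodes to U+00E9.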
Diffstat (limited to 'embed.h')
-rw-r--r-- | embed.h | 1 |
1 files changed, 1 insertions, 0 deletions
@@ -47,6 +47,7 @@
 #define av_undef(a) Perl_av_undef(aTHX_ a)
 #define av_unshift(a,b) Perl_av_unshift(aTHX_ a,b)
 #define block_gimme() Perl_block_gimme(aTHX)
+#define bytes_cmp_utf8(a,b,c,d) Perl_bytes_cmp_utf8(aTHX_ a,b,c,d)
 #define bytes_from_utf8(a,b,c) Perl_bytes_from_utf8(aTHX_ a,b,c)
 #define bytes_to_utf8(a,b) Perl_bytes_to_utf8(aTHX_ a,b)
 #define call_argv(a,b,c) Perl_call_argv(aTHX_ a,b,c)