author: Karl Williamson <khw@cpan.org> 2016-11-28 09:09:23 -0700
committer: Karl Williamson <khw@cpan.org> 2016-11-28 17:15:24 -0700
commit: afc4976faee3dbcd0f85100736d54a8694d26645
tree: 03a53207363b338503970f81a52e6eb1cd3feecb /intrpvar.h
parent: 1e4c96768cc9fe7008eef89b69243de628c78837
PATCH: [perl #129953] lib/locale.t failures on FREEBSD
I thought this bug was in FreeBSD, but when I went to gather the information needed to report it to the vendor, it turned out to be a mistake of my own. The problem is essentially double encoding into UTF-8.

To save CPU time, in a UTF-8 locale I had stored a string already encoded as UTF-8. This string is later inserted into a larger string. What I neglected to consider is that not all strings in such locales need be in UTF-8. The UTF-8 encoded insert could be added to a non-UTF-8 string, and when the result was later converted to UTF-8, each of the inserted string's bytes was individually converted, effectively encoding it a second time.

This is a problem only if the inserted string's UTF-8 encoding differs from its non-UTF-8 form. For this particular usage, the string was UTF-8 invariant on most platforms, so the bug appeared only on platforms where it was variant.

The solution is to store the replacement as a code point and encode it as UTF-8 only if necessary, and only once. This actually simplifies the code.
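The double encoding described above can be reproduced in a few lines. The following is only a minimal sketch, not the code from this patch: `append_cp` and `upgrade_to_utf8` are hypothetical helpers invented for illustration, and the sketch assumes an ASCII platform where an upgrade to UTF-8 treats each byte as a code point in the range 0..255.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a Perl API): append one code point in 0..255
 * to buf, encoding it as UTF-8 only when the target string is UTF-8.
 * Returns the number of bytes written (1 or 2). */
static size_t append_cp(unsigned char *buf, unsigned char cp, int target_is_utf8)
{
    assert(buf != NULL);
    if (!target_is_utf8 || cp < 0x80) {
        buf[0] = cp;            /* invariant, or target is not UTF-8 */
        return 1;
    }
    buf[0] = (unsigned char)(0xC0 | (cp >> 6));    /* lead byte */
    buf[1] = (unsigned char)(0x80 | (cp & 0x3F));  /* continuation byte */
    return 2;
}

/* Hypothetical helper: upgrade a non-UTF-8 byte string to UTF-8 by
 * treating every byte as a code point. dst needs room for 2 * len bytes.
 * Returns the number of bytes written. */
static size_t upgrade_to_utf8(unsigned char *dst, const unsigned char *src,
                              size_t len)
{
    size_t n = 0;
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < len; i++)
        n += append_cp(dst + n, src[i], 1);
    return n;
}
```

With these helpers, storing code point 0xE9 pre-encoded gives the bytes C3 A9. If those bytes land in a non-UTF-8 string that is upgraded later, they become C3 83 C2 A9, which is the bug. Storing the bare code point and encoding it once at insertion time yields C3 A9 in a UTF-8 target and the single byte E9 in a non-UTF-8 target, and a later upgrade of that target still produces C3 A9. A code point below 0x80 is UTF-8 invariant, which is why the problem stayed hidden on most platforms.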
Diffstat (limited to 'intrpvar.h')
-rw-r--r-- intrpvar.h | 3
1 files changed, 2 insertions, 1 deletions
diff --git a/intrpvar.h b/intrpvar.h
index 4243fc87aa..db6251cecc 100644
--- a/intrpvar.h
+++ b/intrpvar.h
@@ -565,7 +565,8 @@ PERLVAR(I, collation_name, char *) /* Name of current collation */
 PERLVAR(I, collxfrm_base, Size_t) /* Basic overhead in *xfrm() */
 PERLVARI(I, collxfrm_mult,Size_t, 2) /* Expansion factor in *xfrm() */
 PERLVARI(I, collation_ix, U32, 0) /* Collation generation index */
-PERLVARA(I, strxfrm_min_char, 3, char)
+PERLVARI(I, strxfrm_min_char, U8, 0) /* Code point that sorts earliest in
+ locale */
 PERLVARI(I, strxfrm_is_behaved, bool, TRUE)
 /* Assume until proven otherwise that it works */
 PERLVARI(I, strxfrm_max_cp, U8, 0) /* Highest collating cp in locale */