author | Karl Williamson <public@khwilliamson.com> | 2013-07-02 10:49:04 -0600 |
---|---|---|
committer | Karl Williamson <public@khwilliamson.com> | 2013-07-05 22:30:00 -0600 |
commit | fa9b773e4c910c139e3ea8836f829e5ddce890f1 (patch) | |
tree | 4583b6c1b65633e02b6bc809cd9940a572680347 /perl.h | |
parent | 7d74bb61140f55c3b1a63a3ca309682bfca6a465 (diff) | |
locale.c: Further checks for utf8ness of a locale

In reality, the return value of setlocale() is documented to be opaque,
so using it to determine if a locale is UTF-8 or not may not work. It
is a char*, which we treat as a name. We can safely assume that if the
name contains UTF-8 (or slight variations thereof), it is a UTF-8
locale. But if the name doesn't contain that, it still could be one.
In fact there are currently many locales on our dromedary machine that
fall into this category. Similarly, something containing 8859 is not
going to be UTF-8.
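
The name-based heuristic described above can be sketched roughly as follows. This is an illustration only, not the code in locale.c: the `name_verdict` enum and the functions `contains_ci()` and `classify_locale_name()` are hypothetical names, and the exact set of "slight variations" that Perl accepts is assumed here to be just "UTF-8" and "UTF8" in any case.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical three-way verdict; these names are not from locale.c. */
typedef enum { NAME_NOT_UTF8, NAME_IS_UTF8, NAME_UNKNOWN } name_verdict;

/* Case-insensitive substring test (strcasestr() is not standard C). */
static int contains_ci(const char *hay, const char *needle)
{
    size_t n = strlen(needle);

    for (; *hay != '\0'; hay++) {
        size_t i = 0;
        while (i < n && hay[i] != '\0'
               && tolower((unsigned char)hay[i])
                  == tolower((unsigned char)needle[i]))
            i++;
        if (i == n)
            return 1;
    }
    return n == 0;
}

/* Classify the (nominally opaque) string returned by setlocale(). */
name_verdict classify_locale_name(const char *name)
{
    assert(name != NULL);

    /* Assumed variations: "UTF-8" and "UTF8", any letter case. */
    if (contains_ci(name, "UTF-8") || contains_ci(name, "UTF8"))
        return NAME_IS_UTF8;

    /* An ISO 8859 name is a single-byte encoding, so not UTF-8. */
    if (strstr(name, "8859") != NULL)
        return NAME_NOT_UTF8;

    /* The name is unhelpful; the locale may or may not be UTF-8. */
    return NAME_UNKNOWN;
}
```

A name such as "ja_JP" yields NAME_UNKNOWN, which is the case the rest of this commit tries to narrow down.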

This commit adds another test for cases where there is no nl_langinfo(),
and the locale name isn't helpful. It looks at the currency symbol,
which typically will be in the locale's script. If that is illegal
UTF-8, we know for sure that the locale isn't UTF-8 (or is corrupted).
If it is legal UTF-8 (but not ASCII) we can be pretty sure that the
locale is UTF-8. If it is ASCII, we still don't know one way or the
other, so we err on the side of it not being UTF-8.
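
A minimal sketch of that three-way decision, assuming the currency symbol is obtained through the standard localeconv() call. The names `sym_verdict`, `classify_symbol()` and `currency_symbol_verdict()` are hypothetical, and the validity check is a simplified structural one rather than the utf8ness test Perl itself uses.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical verdicts; these names are not from locale.c. */
typedef enum { SYM_NOT_UTF8, SYM_LIKELY_UTF8, SYM_UNKNOWN } sym_verdict;

/* Classify a NUL-terminated byte string.
 * Structural check only: it verifies lead and continuation bytes but
 * does not reject every overlong form or surrogate code point. */
sym_verdict classify_symbol(const char *sym)
{
    const unsigned char *p = (const unsigned char *)sym;
    int saw_multibyte = 0;

    assert(sym != NULL);

    while (*p != '\0') {
        int extra;                  /* continuation bytes expected */

        if (*p < 0x80) {            /* plain ASCII tells us nothing */
            p++;
            continue;
        }
        if (*p >= 0xC2 && *p <= 0xDF)
            extra = 1;
        else if ((*p & 0xF0) == 0xE0)
            extra = 2;
        else if (*p >= 0xF0 && *p <= 0xF4)
            extra = 3;
        else
            return SYM_NOT_UTF8;    /* stray continuation or bad lead byte */

        p++;
        while (extra-- > 0) {
            /* A NUL here fails the test too, so we never read past it. */
            if ((*p & 0xC0) != 0x80)
                return SYM_NOT_UTF8;
            p++;
        }
        saw_multibyte = 1;
    }

    /* All ASCII (or empty): undecided, which the caller treats as not UTF-8. */
    return saw_multibyte ? SYM_LIKELY_UTF8 : SYM_UNKNOWN;
}

/* Apply the check to the current LC_MONETARY locale's currency symbol. */
sym_verdict currency_symbol_verdict(void)
{
    const struct lconv *lc = localeconv();
    return classify_symbol(lc->currency_symbol);
}
```

For instance, the bytes E2 82 AC (the euro sign in UTF-8) classify as SYM_LIKELY_UTF8, the single byte A4 (the euro sign in ISO-8859-15) as SYM_NOT_UTF8, and "$" as SYM_UNKNOWN.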

Originally, I was going to use the locale's error message strings,
returned from strerror(), the source for $!, to check for this.
These are supposed to be in terms of the LC_MESSAGES locale. Chances
are vanishingly small that the locale is not UTF-8 if all the messages
pass a utf8ness test, provided that the messages aren't just ASCII.
However, on dromedary, the messages for many of the exotic locales
haven't been translated, and are still in English, which doesn't help at
all. I figure that this is quite likely to be the case generally, and
the currency symbol is much more likely to have been translated.
I left the code in though, commented out for possible future use.

Note that this test will run only on systems that don't have
nl_langinfo(). The test can also be turned off by setting the C compiler
flag -DNO_LOCALE_MONETARY (and -DNO_LOCALE_MESSAGES for the
commented-out part), corresponding to the way the other categories can
be turned off (none of which is documented).
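
To illustrate how such a flag takes effect, here is a small standalone sketch of the same guard pattern. It omits the enclosing NO_LOCALE / HAS_SETLOCALE condition that perl.h uses, and `monetary_check_compiled_in()` is a hypothetical helper that exists only for this example.

```c
#include <locale.h>

/* Same shape as the guard added to perl.h: the feature macro is defined
 * unless the builder passed -DNO_LOCALE_MONETARY or the platform lacks
 * LC_MONETARY. */
#if !defined(NO_LOCALE_MONETARY) && defined(LC_MONETARY)
#  define USE_LOCALE_MONETARY
#endif

/* Hypothetical helper: reports whether the currency-symbol test would
 * be compiled into this build. */
int monetary_check_compiled_in(void)
{
#ifdef USE_LOCALE_MONETARY
    return 1;
#else
    return 0;
#endif
}
```

Compiled normally on a system whose <locale.h> defines LC_MONETARY, the helper returns 1; compiled with -DNO_LOCALE_MONETARY, it returns 0.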
Diffstat (limited to 'perl.h')
-rw-r--r-- | perl.h | 6 |
1 files changed, 6 insertions, 0 deletions
@@ -701,6 +701,12 @@ struct op *Perl_op asm(stringify(OP_IN_REGISTER));
 # if !defined(NO_LOCALE_NUMERIC) && defined(LC_NUMERIC)
 # define USE_LOCALE_NUMERIC
 # endif
+# if !defined(NO_LOCALE_MESSAGES) && defined(LC_MESSAGES)
+# define USE_LOCALE_MESSAGES
+# endif
+# if !defined(NO_LOCALE_MONETARY) && defined(LC_MONETARY)
+# define USE_LOCALE_MONETARY
+# endif
 #endif /* !NO_LOCALE && HAS_SETLOCALE */
 
 #include <setjmp.h>