author | Karl Williamson <khw@cpan.org> | 2018-03-12 12:24:04 -0600 |
---|---|---|
committer | Karl Williamson <khw@cpan.org> | 2018-03-12 12:47:18 -0600 |
commit | 9487427ba26d65e7adf5954069fc2fde3bdedf41 (patch) | |
tree | 61eb06214fe38b3ff158e2c52e3b3464444a4947 | |
parent | 02916c24e049d202108ea97c1e420790acb6090a (diff) | |
download | perl-9487427ba26d65e7adf5954069fc2fde3bdedf41.tar.gz |
Fix comments/pod for LC_NUMERIC not always C
In recent Perl versions, the underlying locale for LC_NUMERIC has been
kept in C because XS code is expecting a dot radix character. But if
the LC_NUMERIC locale has a dot, that is unnecessary. (There is also
the thousands grouping separator which for safety we verify is empty.)
Thus 5.27 doesn't always keep the underlying locale in C; it does so
only if necessary.
This commit updates various comments and pods to reflect this change.
-rw-r--r-- | locale.c | 30 |
-rw-r--r-- | perl.h | 32 |
2 files changed, 35 insertions, 27 deletions
--- a/locale.c
+++ b/locale.c
@@ -2081,9 +2081,13 @@ S_win32_setlocale(pTHX_ int category, const char* locale)
 
 This is an (almost) drop-in replacement for the system L<C<setlocale(3)>>,
 taking the same parameters, and returning the same information, except that it
-returns the correct underlying C<LC_NUMERIC> locale, instead of C<C> always, as
-perl keeps that locale category as C<C>, changing it briefly during the
-operations where the underlying one is required.
+returns the correct underlying C<LC_NUMERIC> locale. Regular C<setlocale> will
+instead return C<C> if the underlying locale has a non-dot decimal point
+character, or a non-empty thousands separator for displaying floating point
+numbers. This is because perl keeps that locale category such that it has a
+dot and empty separator, changing the locale briefly during the operations
+where the underlying one is required. C<Perl_setlocale> knows about this, and
+compensates; regular C<setlocale> doesn't.
 
 Another reason it isn't completely a drop-in replacement is that it is
 declared to return S<C<const char *>>, whereas the system setlocale omits the
@@ -2123,8 +2127,9 @@ Perl_setlocale(const int category, const char * locale)
 
     /* A NULL locale means only query what the current one is. We have the
      * LC_NUMERIC name saved, because we are normally switched into the C
-     * locale for it. For an LC_ALL query, switch back to get the correct
-     * results. All other categories don't require special handling */
+     * (or equivalent) locale for it. For an LC_ALL query, switch back to get
+     * the correct results. All other categories don't require special
+     * handling */
     if (locale == NULL) {
         if (category == LC_NUMERIC) {
 
@@ -2291,13 +2296,14 @@ rather than getting segfaults at runtime.
 It delivers the correct results for the C<RADIXCHAR> and C<THOUSEP> items,
 without you having to write extra code. The reason for the extra code would be
 because these are from the C<LC_NUMERIC> locale category, which is normally
-kept set to the C locale by Perl, no matter what the underlying locale is
-supposed to be, and so to get the expected results, you have to temporarily
-toggle into the underlying locale, and later toggle back. (You could use plain
-C<nl_langinfo> and C<L</STORE_LC_NUMERIC_FORCE_TO_UNDERLYING>> for this but
-then you wouldn't get the other advantages of C<Perl_langinfo()>; not keeping
-C<LC_NUMERIC> in the C locale would break a lot of CPAN, which is expecting the
-radix (decimal point) character to be a dot.)
+kept set by Perl so that the radix is a dot, and the separator is the empty
+string, no matter what the underlying locale is supposed to be, and so to get
+the expected results, you have to temporarily toggle into the underlying
+locale, and later toggle back. (You could use plain C<nl_langinfo> and
+C<L</STORE_LC_NUMERIC_FORCE_TO_UNDERLYING>> for this but then you wouldn't get
+the other advantages of C<Perl_langinfo()>; not keeping C<LC_NUMERIC> in the C
+(or equivalent) locale would break a lot of CPAN, which is expecting the radix
+(decimal point) character to be a dot.)
 
 =item *
 
--- a/perl.h
+++ b/perl.h
@@ -5807,7 +5807,10 @@ typedef struct am_table_short AMTS;
 #ifdef USE_LOCALE_NUMERIC
 
 /* These macros are for toggling between the underlying locale (UNDERLYING or
- * LOCAL) and the C locale (STANDARD).
+ * LOCAL) and the C locale (STANDARD). (Actually we don't have to use the C
+ * locale if the underlying locale is indistinguishable from it in the numeric
+ * operations used by Perl, namely the decimal point, and even the thousands
+ * separator.)
 
 =head1 Locale-related functions and macros
 
@@ -5851,10 +5854,11 @@ close by, and guaranteed to be called.
 
 =for apidoc Am|void|STORE_LC_NUMERIC_SET_TO_NEEDED
 
-This is used to help wrap XS or C code that that is C<LC_NUMERIC> locale-aware.
-This locale category is generally kept set to the C locale by Perl for
-backwards compatibility, and because most XS code that reads floating point
-values can cope only with the decimal radix character being a dot.
+This is used to help wrap XS or C code that is C<LC_NUMERIC> locale-aware.
+This locale category is generally kept set to a locale where the decimal radix
+character is a dot, and the separator between groups of digits is empty. This
+is because most XS code that reads floating point numbers is expecting them to
+have this syntax.
 
 This macro makes sure the current C<LC_NUMERIC> state is set properly, to be
 aware of locale if the call to the XS or C code from the Perl program is
@@ -5906,16 +5910,14 @@ expression, but with an empty argument list, like this:
 
 */
 
-/* The numeric locale is generally kept in the C locale instead of the
- * underlying locale. The current status is known by looking at two words.
- * One is non-zero if the current numeric locale is the standard C/POSIX one.
- * The other is non-zero if the current locale is the underlying locale. Both
- * can be non-zero if, as often happens, the underlying locale is C.
- *
- * Its slightly more complicated than this, as the PL_numeric_standard variable
- * is set if the current numeric locale is indistinguishable from the C locale.
- * This happens when the radix character is a dot, and the thousands separator
- * is the empty string.
+/* If the underlying numeric locale has a non-dot decimal point or has a
+ * non-empty floating point thousands separator, the current locale is instead
+ * generally kept in the C locale instead of that underlying locale. The
+ * current status is known by looking at two words. One is non-zero if the
+ * current numeric locale is the standard C/POSIX one or is indistinguishable
+ * from C. The other is non-zero if the current locale is the underlying
+ * locale. Both can be non-zero if, as often happens, the underlying locale is
+ * C or indistinguishable from it.
  *
  * khw believes the reason for the variables instead of the bits in a single
  * word is to avoid having to have masking instructions. */