path: root/locale.c
Commit message | Author | Age | Files | Lines
* Add sync_locale() | Karl Williamson | 2014-08-14 | 1 | -0/+35
| | | | | | This trivial function is to be used by XS code when it changes the program's locale. It hides the details from that code of what needs to be done, which could change in the future.
* locale.c: Clarify comment | Karl Williamson | 2014-08-13 | 1 | -1/+1
|
* locale.c: Use PERL_UNUSED_RESULT | Karl Williamson | 2014-08-13 | 1 | -3/+1
| | | | | The previous way to suppress messages wasn't working for all gcc versions. Spotted by Jarkko Hietaniemi.
* Use grok_atou instead of atoi. | Jarkko Hietaniemi | 2014-07-22 | 1 | -1/+2
| | | | | Remaining atoi() uses include at least: ext/DynaLoader/dl_aix.xs, os2/os2.c, vms/vms.c
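The motivation for moving away from atoi() is that it reports no errors and its behaviour on overflow is undefined. The following is only a sketch of a stricter digit parser in plain C; `parse_uint_strict` is a hypothetical helper written for illustration and is not Perl's grok_atou(), whose exact signature is not shown in this log.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper (NOT Perl's grok_atou): parse the run of decimal
 * digits at the start of s.  Returns 1 and stores the value in *out on
 * success; returns 0 if s does not begin with a digit or if the value
 * does not fit in an unsigned long.  If endp is non-NULL it receives
 * the address just past the last digit consumed. */
static int parse_uint_strict(const char *s, unsigned long *out,
                             const char **endp)
{
    char *end;
    unsigned long v;

    /* strtoul() alone would accept leading whitespace and a sign;
     * insisting on a leading digit rejects both. */
    if (s == NULL || !isdigit((unsigned char)*s))
        return 0;

    errno = 0;
    v = strtoul(s, &end, 10);
    if (errno == ERANGE)        /* overflow is reported, not ignored */
        return 0;

    *out = v;
    if (endp != NULL)
        *endp = end;
    return 1;
}
```

Unlike atoi(), a caller can tell "0" apart from a string that contains no number at all, and can see where parsing stopped.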
* locale.c: Improve some comments | Karl Williamson | 2014-07-12 | 1 | -3/+6
|
* locale.c: Fix some unused code for potential future use | Karl Williamson | 2014-07-12 | 1 | -33/+46
| | | | | | | | | | This code extends the heuristics used to determine if a locale is UTF-8 or not on older platforms. It has been #ifdef'd out because it only added a little value on dromedary. Now the previous commit has added new heuristics, and tests on dromedary show that this adds nothing to that. But I'm leaving it in the source in case it might ever prove useful. In order to test it, I compiled it and found some problems with the earlier version that this now fixes.
* locale.c: Add new heuristic for finding if locale is UTF-8 | Karl Williamson | 2014-07-12 | 1 | -1/+93
| | | | | | | | On older platforms that don't conform to POSIX 2001 nor C99, heuristics are employed to try to determine if a locale is UTF-8 or not. This commit improves those heuristics by looking at names of the months and days of the week to see if they are UTF-8 or not. This is done if looking at the currency symbol failed to help.
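As a rough illustration of the month and day name heuristic, the sketch below classifies a set of strings as "looks like UTF-8" only when at least one of them contains a well-formed multi-byte sequence and none contains malformed bytes. The function names are hypothetical, the validation is simplified (it does not reject overlong forms or surrogates), and how locale.c actually obtains and weighs the names is not shown in this log.

```c
#include <stddef.h>

/* Returns 1 if s is well-formed UTF-8 with at least one non-ASCII
 * character, 0 if s is pure ASCII (no evidence either way), and -1 if
 * s contains bytes that cannot be UTF-8.  Simplified check: overlong
 * encodings and surrogates are not rejected. */
static int utf8_evidence(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;
    int saw_multibyte = 0;

    while (*p) {
        size_t extra, i;

        if (*p < 0x80) {            /* plain ASCII byte */
            p++;
            continue;
        }
        if (*p >= 0xC2 && *p <= 0xDF)
            extra = 1;
        else if (*p >= 0xE0 && *p <= 0xEF)
            extra = 2;
        else if (*p >= 0xF0 && *p <= 0xF4)
            extra = 3;
        else
            return -1;              /* not a valid lead byte */

        /* A NUL terminator is not a continuation byte, so this loop
         * never reads past the end of the string. */
        for (i = 1; i <= extra; i++)
            if ((p[i] & 0xC0) != 0x80)
                return -1;

        p += extra + 1;
        saw_multibyte = 1;
    }
    return saw_multibyte;
}

/* Hypothetical wrapper: names would be the locale's month and day
 * names.  Returns 1 only if some name is non-ASCII UTF-8 and no name
 * is malformed. */
static int names_look_utf8(const char *const *names, size_t n)
{
    size_t i;
    int found = 0;

    for (i = 0; i < n; i++) {
        int e = utf8_evidence(names[i]);
        if (e < 0)
            return 0;
        if (e > 0)
            found = 1;
    }
    return found;
}
```

An all-ASCII set of names (English, for instance) yields 0, which is why a heuristic like this can only supplement other checks.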
* locale.c: White-space only | Karl Williamson | 2014-07-12 | 1 | -12/+12
| | | | | Indent and outdent blocks of code to conform to newly formed or removed braces
* locale.c: Refactor UTF8ness of currency symbol code | Karl Williamson | 2014-07-12 | 1 | -26/+24
| | | | | | | | | On older platforms that are neither C99 nor POSIX 2001, locale.c uses the currency symbol to try to see if a locale is UTF-8 or not. This commit refactors it somewhat to make it cleaner, which also fixes several problems. The least issue was that it sometimes did a setlocale() unnecessarily. Others are that in some circumstances it called localeconv() and/or looked at the result while within the wrong locale.
* locale.c: Use ptr's value before freeing it, not after | Karl Williamson | 2014-07-12 | 1 | -1/+1
| | | | This only affected runs with the -DL parameter to perl set.
* locale.c: Use safer code practice | Karl Williamson | 2014-07-12 | 1 | -5/+6
| | | | | | | The interior-most function can return NULL. Currently savepv() which is the next outer function handles this correctly, as does the next outer function, but it is dangerous to rely on that behavior. So we test for NULL before calling functions on a NULL ptr.
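The defensive pattern described here can be sketched in plain C as follows. `dup_or_null` is a hypothetical stand-in for savepv(), and `save_current_locale` is an illustrative name; the point is only that the NULL test happens at the call site instead of depending on every callee tolerating NULL.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for savepv(): copy a string, passing NULL
 * through unchanged. */
static char *dup_or_null(const char *s)
{
    char *copy;

    if (s == NULL)
        return NULL;
    copy = malloc(strlen(s) + 1);
    if (copy != NULL)
        strcpy(copy, s);
    return copy;
}

/* Returns a malloc'd copy of the current locale name for category, or
 * NULL if it cannot be determined.  The caller frees the result. */
static char *save_current_locale(int category)
{
    const char *cur = setlocale(category, NULL);   /* may return NULL */

    /* Test explicitly here; do not rely on the copy routine happening
     * to accept NULL. */
    if (cur == NULL)
        return NULL;
    return dup_or_null(cur);
}
```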
* locale.c: Skip compiling fallback code on modern platforms | Karl Williamson | 2014-07-12 | 1 | -1/+5
| | | | | | | | | In the function that determines if a POSIX locale is UTF-8 or not, if either nl_langinfo or MB_CUR_MAX are defined, it can reliably determine the answer. If they are not defined, it uses heuristics to figure things out as best it can. This code doesn't add value for those platforms where one of the two symbols is defined, so can just be ifdef'd out
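On platforms that do have nl_langinfo(), the reliable path amounts to asking libc for the codeset name. The sketch below assumes a POSIX system with <langinfo.h>; the function names are hypothetical, and the set of spellings accepted as UTF-8 is an assumption, not a list taken from locale.c.

```c
#include <langinfo.h>
#include <locale.h>
#include <strings.h>    /* strcasecmp (POSIX) */

/* Returns 1 if the codeset name denotes UTF-8.  Both spellings are
 * accepted because platforms differ in how they report it. */
static int codeset_is_utf8(const char *codeset)
{
    return codeset != NULL
        && (strcasecmp(codeset, "UTF-8") == 0
            || strcasecmp(codeset, "UTF8") == 0);
}

/* Asks libc for the codeset of the current LC_CTYPE locale. */
static int current_ctype_is_utf8(void)
{
    return codeset_is_utf8(nl_langinfo(CODESET));
}
```

When this answer is available there is nothing left for the currency symbol or locale name heuristics to contribute, which is the reason given for compiling them out.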
* locale.c: name should be last resort when deciding if locale is utf8 | Karl Williamson | 2014-07-12 | 1 | -73/+78
| | | | | Looking at if the currency symbol is UTF-8 should come ahead of looking at the locale name.
* locale.c: Prepare for rearrangement of code blocks | Karl Williamson | 2014-07-12 | 1 | -7/+6
| | | | | | | | This section of code generally just returned. This commit changes it so that it drops off the end if it can't determine if the current locale is UTF-8 or not, so that additional tests can be added later. The function defaults to not UTF-8 if this drops off the end, so there should be no functionality change.
* locale.c: Fix misplaced parenthesis | Karl Williamson | 2014-07-10 | 1 | -2/+2
| | | | | | Commit a39edc4c877304d4075679b1d8de1904671a9c37 got a parenthesis misplaced so it wasn't really looking at the next character, like it was supposed to be doing
* locale.c: White-space only | Karl Williamson | 2014-07-09 | 1 | -8/+9
| | | | Outdent because the previous commit removed the enclosing block.
* locale.c: Remove conditionals. | Karl Williamson | 2014-07-09 | 1 | -11/+10
| | | | | | | | | These two functions are supposed to normally be called through macro interfaces which check whether they actually should be called or not. That means the conditionals removed by this commit are redundant from the normal interface. By removing them, we allow the exceptional case where the code should be executed unconditionally, to happen, by just calling the functions directly, not using the macro interface.
* locale.c: Keep better track of C/non-C locale | Karl Williamson | 2014-07-09 | 1 | -2/+2
| | | | | | | | | | | Perl uses three interpreter-level (but private) variables to keep track of numeric locales. PL_numeric_name is the current underlying locale. PL_numeric_standard is a boolean to indicate if we are switched to the C (or POSIX) locale, and PL_numeric_local is a boolean to indicate if we are switched to the underlying one. The reason there are two booleans is that if the underlying locale is C, both can be true at the same time. But the code that is being changed by this commit didn't realize this, and could unnecessarily set the booleans to FALSE. This could cause unnecessary switching of locales.
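The two-boolean bookkeeping can be modelled outside Perl as below. This is a sketch under stated assumptions: the struct and function names are hypothetical, and the real interpreter variables and macros are not reproduced here. It shows why both flags must be allowed to be true at once, and why a switch is needed only when the wanted flag is false.

```c
#include <string.h>

/* Hypothetical model of the three pieces of state in the commit text. */
struct numeric_state {
    const char *name;   /* underlying LC_NUMERIC locale name     */
    int standard;       /* currently behaving as the C locale?   */
    int local;          /* currently in the underlying locale?   */
};

static int is_c_locale(const char *name)
{
    return strcmp(name, "C") == 0 || strcmp(name, "POSIX") == 0;
}

/* Record a new underlying locale that we are now switched into.  If
 * that locale is C (or POSIX), we are in "standard" at the same time,
 * so both flags end up true. */
static void set_underlying(struct numeric_state *st, const char *name)
{
    st->name = name;
    st->local = 1;
    st->standard = is_c_locale(name);
}

/* A switch is needed only when the wanted flag is false; the state of
 * the other flag is irrelevant. */
static int need_switch_to_standard(const struct numeric_state *st)
{
    return !st->standard;
}

static int need_switch_to_local(const struct numeric_state *st)
{
    return !st->local;
}
```

With an underlying locale of "C" neither test asks for a switch, which avoids the unnecessary locale changes the commit describes.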
* locale.c: Make a common idiom into a macro | Karl Williamson | 2014-07-09 | 1 | -10/+18
|
* Avoid unused warnings from locale-less systems. | Jarkko Hietaniemi | 2014-07-03 | 1 | -0/+6
| | | | | | From Brian Fraser: "Technically, any Perl compiled with -Accflags="-UUSE_LOCALE", or -Ui_locale -Ud_setlocale... realistically, for Android".
* Remove or downgrade unnecessary dVAR. | Jarkko Hietaniemi | 2014-06-25 | 1 | -11/+0
| | | | | | | | You need to configure with g++ *and* -Accflags=-DPERL_GLOBAL_STRUCT or -Accflags=-DPERL_GLOBAL_STRUCT_PRIVATE to see any difference. (g++ does not do the "post-annotation" form of "unused".) The version code has some of these issues, reported upstream.
* Put back an #if-0-ed chunk 7053d92 removed. | Jarkko Hietaniemi | 2014-06-13 | 1 | -0/+75
| | | | | | | The chunk is not MAD-related but instead locale stuff. I have no idea why that chunk got removed (I used a combination of unifdef(1) and editor). It's #if-0-ed, so no change of behavior either way, but let's keep the code for now, since it seems to have "historical significance".
* Remove MAD. | Jarkko Hietaniemi | 2014-06-13 | 1 | -75/+0
| | | | | | MAD = Misc Attribute Decoration; unmaintained attempt at preserving the Perl parse tree more faithfully so that automatic conversion to Perl 6 would have been easier.
* locale.c: Fix uncomplemented 'if' test | Karl Williamson | 2014-06-07 | 1 | -1/+1
| | | | | Somehow the ! in this if () got dropped, and there were no tests to catch it. Now both are remedied.
* fix locale.c under -DPERL_GLOBAL_STRUCT | David Mitchell | 2014-06-07 | 1 | -0/+1
|
* Use C locale for "$!" outside 'use locale' scope | Karl Williamson | 2014-06-05 | 1 | -0/+33
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The stringification of $! has long been an outlier in Perl locale handling. The theory has been that these operating system messages are likely to be of use to the final user, and should be in their language. Things like "No space left on device" or "Can't fork" are not something the program is likely to handle, but could be meaningfully helpful to the end-user. There are problems with this though. One is that many perl messages are in English, with the $! appended to them, so that the resultant message is of mixed language, and may need to be translated anyway. Things like "No space left on device" probably won't need the remaining portion of the message to give someone a clear indication as to what's wrong. But there are many other messages where both the OS error and the Perl error would be needed together to understand the problem. An on-line translation tool can be used to do this. Another problem is that it can lead to garbage coming out on the user's terminal when the program is not expecting UTF-8, but the underlying locale is UTF-8. This is what happens in Bug #112208, and another that was merged with it. It's a lot harder to translate mojibake via an online tool than English. This commit solves that by using the C locale for messages, except within the scope of 'use locale'. It is extremely likely that the messages in the C locale will be English, but if not they will be ASCII, and there will be no garbage printed. A program that says "use locale" is indicating that it has the intelligence necessary to deal with locales.
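In libc terms, the idea is to fetch the OS error text while LC_MESSAGES is temporarily "C", then put the previous locale back. The following is a minimal sketch, not the code from locale.c: `strerror_in_c_locale` is a hypothetical name, LC_MESSAGES is a POSIX category and may be absent elsewhere, and setlocale() is process-wide, so the sketch is not safe in multi-threaded programs.

```c
#include <errno.h>
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Copies the message for errnum into buf, produced with LC_MESSAGES
 * set to "C", then restores the previous locale.  Returns 0 on
 * success and -1 if the saved locale name could not be allocated. */
static int strerror_in_c_locale(int errnum, char *buf, size_t buflen)
{
    char *saved = NULL;

#ifdef LC_MESSAGES
    const char *cur = setlocale(LC_MESSAGES, NULL);

    if (cur != NULL) {
        /* The returned string may be overwritten by the next
         * setlocale() call, so keep a private copy. */
        saved = malloc(strlen(cur) + 1);
        if (saved == NULL)
            return -1;
        strcpy(saved, cur);
        setlocale(LC_MESSAGES, "C");
    }
#endif

    snprintf(buf, buflen, "%s", strerror(errnum));

    if (saved != NULL) {
#ifdef LC_MESSAGES
        setlocale(LC_MESSAGES, saved);   /* restore the caller's locale */
#endif
        free(saved);
    }
    return 0;
}
```

A caller inside the equivalent of 'use locale' scope would simply call strerror() directly and accept whatever language and encoding the underlying locale produces.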
* Add parameters to "use locale" | Karl Williamson | 2014-06-05 | 1 | -0/+21
| | | | | | | This commit allows one to specify to enable locale-awareness for only a specified subset of the locale categories. Thus you could make a section of code LC_MESSAGES aware, with no locale-awareness for the other categories.
* Allow dynamic lock of LC_NUMERIC | Karl Williamson | 2014-06-05 | 1 | -3/+6
| | | | | | | | | | | When processing version strings, the radix character must be a dot even if we otherwise would be using some other character. vutil.c upg_version() changes to the dot, but calls sv_catpvf() which may try to change the character to something else. This commit introduces a way to lock the character to a dot around the call to sv_catpvf(). vutil.c is cpan-upstream, but already blead and cpan have diverged, so this just updates the SHA of the new version.
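A plain libc analogue of "force the radix to a dot around one formatting call" is sketched below. It is an illustration only: Perl does this with its own macros around sv_catpvf(), which are not shown in this log, and `format_with_dot_radix` is a hypothetical name. Like any setlocale()-based approach it affects the whole process while the switch is in effect.

```c
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Formats value with three decimals, guaranteeing '.' as the radix
 * character regardless of the current LC_NUMERIC locale.  Returns the
 * snprintf() result, or -1 if the locale name could not be saved. */
static int format_with_dot_radix(char *buf, size_t buflen, double value)
{
    const char *cur = setlocale(LC_NUMERIC, NULL);
    char *saved = NULL;
    int n;

    /* Switch only when we are not already in the C locale. */
    if (cur != NULL && strcmp(cur, "C") != 0) {
        saved = malloc(strlen(cur) + 1);
        if (saved == NULL)
            return -1;
        strcpy(saved, cur);
        setlocale(LC_NUMERIC, "C");
    }

    n = snprintf(buf, buflen, "%.3f", value);

    if (saved != NULL) {
        setlocale(LC_NUMERIC, saved);    /* undo the temporary switch */
        free(saved);
    }
    return n;
}
```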
* Fix up LC_NUMERIC wrap macros | Karl Williamson | 2014-06-05 | 1 | -2/+2
| | | | | | | | | | | | | | | | | | | | | | | perl.h has some macros used to manipulate the locale exposed for the category LC_NUMERIC. These are currently undocumented, but will need to be documented as the development of 5.21 progresses. This fixes these up in several ways: The tests for if we are in the correct state are made into macros. This is in preparation for the next commit, which will make one of them more complicated, and so that complication will only have to be in one place. The variable declared by them is renamed to be preceded by an underscore. It is dangerous practice to have a name used in a macro, as it could conflict with a name used by outside code. This alleviates it somewhat by making it even less likely to conflict. This will have to be revisited when some of these macros are made part of the public API. The tests to see if things need to change are reversed. Previously we said we need to change to standard, for example, if the variable for 'local' is set. But both can be true at the same time if the underlying locale is C. In this case, we only need to change if we aren't in standard. Whether that is also local is irrelevant.
* Keep LC_NUMERIC in C locale, except for brief periods | Karl Williamson | 2014-06-05 | 1 | -0/+6
| | | | | | | This is for XS modules, so they don't have to worry about the radix being a non-dot. When the locale needs to be in the underlying one, the operation should be wrapped using macros for the purpose. That API may change as we gain experience in 5.21, so I'm not including it now.
* Set utf8 flag properly in localeconv | Karl Williamson | 2014-06-05 | 1 | -5/+5
| | | | | | | | | Rare, but not unheard of, is for the strings returned by localeconv to be in UTF-8. This commit looks for and sets the UTF-8 flag if they are so encoded. A private function had to be changed from static for this. It is renamed to begin with an underscore to emphasize its private nature.
* Use system default locale only if there is one. | Jarkko Hietaniemi | 2014-05-29 | 1 | -4/+13
| | | | | | | | | | | (Currently, only Win32 has one.) [perl #121865] Fix for Coverity perl5 CID 28949: Logically dead code (DEADCODE) dead_error_line: Execution cannot reach this statement name = system_default_locale;
* Leaked string in failure path. | Jarkko Hietaniemi | 2014-05-28 | 1 | -0/+1
| | | | | | | | | Fix for Coverity perl5 CID 29058: Resource leak (RESOURCE_LEAK) leaked_storage: Variable codeset going out of scope leaks the storage it points to. The savepv-ed codeset was not freed in failure path. (The save_input_locale is freed just few lines later.)
* g++ cleanups. | Jarkko Hietaniemi | 2014-05-28 | 1 | -1/+1
| | | | | | regcomp.c:11083: warning: suggest a space before ';' or explicit braces around empty body in 'for' statement locale.c:1113: warning: comparison between signed and unsigned integer expressions
* Fix for Coverity perl5 CID 45366: Use after free (USE_AFTER_FREE) | Jarkko Hietaniemi | 2014-04-29 | 1 | -2/+2
| | | | | | pass_freed_arg: Passing freed pointer save_input_locale as an argument to PerlIO_printf. Printfing save-pvs after freeing them.
* locale.c: Change 'and' to '&&' | Karl Williamson | 2014-02-19 | 1 | -6/+6
| | | | To actually compile on Windows
* locale.c: Another POSIX emulation fix on Windows | Karl Williamson | 2014-02-19 | 1 | -11/+8
| | | | | | Right after I pushed the previous commit, I realized that the system default locale on Windows should also have lower priority (besides LANG) than the LC_foo environment variables. This should do that.
* locale.c: Better emulate POSIX locale setting on Windows | Karl Williamson | 2014-02-19 | 1 | -2/+58
| | | | | | | | | | | | | | Commit b385bb4ddcb252e69a1044d702646741e2e489fb introduced my_setlocale() compiled only under Windows which emulates the POSIX rules for setting the locale. It differs from Windows only if the locale passed in is "". Unfortunately it was buggy if the category being set was LC_ALL, and there is a LANG environment variable. LANG has lower precedence than the other environment variables, like LC_NUMERIC, but my_setlocale() was giving it higher priority when set through LC_ALL. This should solve the problems being seen since 7cd8b56846670e577e1f62479eab8f38fb11da14
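The precedence being emulated is the POSIX one: LC_ALL first, then the category's own variable, then LANG, with empty values treated as unset. The sketch below resolves a single category that way; when the request is for LC_ALL, a wrapper would apply it to each category in turn. The function names are hypothetical and this is not the my_setlocale() code.

```c
#include <stdlib.h>

/* Returns the value of var, or NULL if it is unset or empty (POSIX
 * treats an empty value the same as an unset one). */
static const char *nonempty_env(const char *var)
{
    const char *v = getenv(var);

    return (v != NULL && *v != '\0') ? v : NULL;
}

/* category_var is the environment variable named after one category,
 * for example "LC_NUMERIC".  Returns the locale name the environment
 * selects for that category, or NULL if nothing is set, in which case
 * the caller would fall back to the system default locale. */
static const char *locale_from_env(const char *category_var)
{
    const char *v;

    if ((v = nonempty_env("LC_ALL")) != NULL)
        return v;                   /* highest priority          */
    if ((v = nonempty_env(category_var)) != NULL)
        return v;                   /* category-specific setting */
    return nonempty_env("LANG");    /* lowest priority           */
}
```

Resolving per category is what keeps LANG from overriding, say, LC_NUMERIC when the locale is being set through LC_ALL.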
* locale.c: Fix initialization compile error for HP | Karl Williamson | 2014-02-17 | 1 | -4/+4
| | | | | One of the HP compilers would not compile the compile-time array initialization; so do it at runtime.
* locale.c: Remove vars unused on some platforms | Karl Williamson | 2014-02-17 | 1 | -3/+0
| | | | A Darwin compiler noted these are unused.
* locale.c: Handle case where LC_ALL isn't "all" | Karl Williamson | 2014-02-16 | 1 | -1/+12
| | | | | | | | | | Setting the LC_ALL locale category on NetBSD does not necessarily change all the categories to the requested locale. Sometimes the LC_COLLATE category is set to POSIX. I presume that is because collation has not been defined for the given locale, so it uses a basic locale instead. The code in locale.c that does locale initialization for the Perl program at start-up, depended on LC_ALL setting all categories to the same locale.
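One way to detect the situation described here is to query each category after setting LC_ALL and compare the reported names. The sketch below is an assumption-laden illustration: the function names are hypothetical, only the ISO C categories are checked, and a plain string comparison can report a mismatch on platforms that canonicalize locale names.

```c
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the category currently reports exactly the name wanted. */
static int category_matches(int category, const char *wanted)
{
    const char *actual = setlocale(category, NULL);

    return actual != NULL && strcmp(actual, wanted) == 0;
}

/* Sets LC_ALL to name, then checks the individual categories.
 * Returns -1 if setlocale() rejected the name, 1 if every checked
 * category reports name, and 0 if at least one category ended up in
 * some other locale. */
static int set_all_and_verify(const char *name)
{
    static const int categories[] = {
        LC_COLLATE, LC_CTYPE, LC_MONETARY, LC_NUMERIC, LC_TIME
    };
    size_t i;

    if (setlocale(LC_ALL, name) == NULL)
        return -1;

    for (i = 0; i < sizeof categories / sizeof categories[0]; i++)
        if (!category_matches(categories[i], name))
            return 0;
    return 1;
}
```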
* Make sure LC_MONETARY is initialized | Karl Williamson | 2014-02-15 | 1 | -0/+13
| | | | | | | | This is only an issue for those few platforms without LC_ALL, as that is initialized, and includes LC_MONETARY. This commit extends the proper initialization to those other platforms. Perl doesn't use LC_MONETARY itself, but it should be properly initialized for modules that do.
* Initialize LC_MESSAGES at start-up | Karl Williamson | 2014-02-15 | 1 | -1/+13
| | | | | | | | | The code did not explicitly initialize LC_MESSAGES at startup, unlike most of the other standard categories; I don't know why. This is only an issue for those few platforms without LC_ALL, as that is initialized, and includes LC_MESSAGES. This commit extends the proper initialization to those other platforms.
* locale.c: White-space, useless brace removal only | Karl Williamson | 2014-02-15 | 1 | -84/+82
| | | | | | | | This takes one piece of code that is needlessly enclosed in braces and removes the braces, outdenting and reflowing the comments. Otherwise, it changes to correct indentation for the addition and removal of braces by the previous commit.
* Improve fallback during locale initialization | Karl Williamson | 2014-02-15 | 1 | -46/+162
| | | | | | | | | | | | | If Perl encounters a problem during startup trying to initialize the locales from the environment it has immediately reverted to the "C" locale. This commit generalizes that so it tries each of the applicable environment variables in order of priority until it works, or it gives up and uses the "C" locale. For example, if LC_ALL is set to something that is invalid, but LANG is valid, LANG will be used. This was motivated by trying to get the Windows system default locale used in preference to "C" if all else fails.
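The generalized fallback can be pictured as a loop over candidate locale names in priority order, stopping at the first one setlocale() accepts and using "C" only when all of them fail. This is a sketch with a hypothetical function name; how locale.c builds its candidate list and reports failures is not shown in this log.

```c
#include <locale.h>
#include <stddef.h>

/* Tries each candidate in order.  Returns the index of the first one
 * that setlocale() accepts, or -1 after falling back to "C" because
 * none worked.  NULL entries (for example, unset environment
 * variables) are skipped. */
static int set_first_working_locale(int category,
                                    const char *const *candidates,
                                    size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (candidates[i] != NULL
            && setlocale(category, candidates[i]) != NULL)
            return (int)i;
    }

    setlocale(category, "C");   /* last resort */
    return -1;
}
```

A caller would typically fill the array from LC_ALL, the category variable, LANG and, where one exists, the system default, so that an invalid LC_ALL no longer forces an immediate drop to "C".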
* locale.c: Add, move some comments, and a declaration | Karl Williamson | 2014-02-15 | 1 | -7/+16
| | | | | | | | | | | This adds some more comments at the beginning of a function concerning its API, and moves them to before any declarations. It also moves the declaration for 'done' to the block of other declarations, and adds a PERL_UNUSED_VAR call if the code that uses it is #ifdef'd out. Previously it was too easy to not notice the declaration separate from the others, and to insert code between the two, which would not compile under C89, but only on Ultrix machines.
* Emulate POSIX locale setting on Windows | Karl Williamson | 2014-02-15 | 1 | -8/+87
| | | | | | | | | | | | | Locale initialization and setting on Windows haven't been as described in perllocale for setting locales to "". This is because that tells Windows to use the system default locale, as set through the Control Panel, but on POSIX systems, it means to look at various environment variables. This commit creates a wrapper for setlocale, used only on Windows, that looks for the appropriate environment variables when called with a "" input locale. If none are found, it continues to use the system default locale.
* regexec.c, locale.c: Silence some compiler warnings | Karl Williamson | 2014-02-12 | 1 | -0/+2
| | | | | | | | | | | | For regexec.c, one compiler amongst our smokers believes there is a path where this array can be used uninitialized; it's easiest to just initialize it, even though I think the compiler is wrong, unless it is optimizing incorrectly, in which case, it would still be best to initialize it. For locale.c, this is just the well-known gcc bug that they refuse to fix concerning a (void) cast when the function has been declared to require not ignoring the result.
* Add -DL option to trace setlocale calls | Karl Williamson | 2014-02-03 | 1 | -0/+64
| | | | This will help field debugging of locale issues.
* locale.c: Fix failure to find UTF-8 locales | Karl Williamson | 2014-01-29 | 1 | -31/+34
| | | | | | | | | | | | | | Commit 119ee68b changed the method to determine if a locale is a UTF-8 one to a method that was usable on more platforms, by using the C99 libc function mbtowc(). I didn't realize that there needs to be a special call to this function preceding the main call to make sure it is in the initial state. This commit fixes that. In looking at the results from several different platforms, I decided it is best to use nl_langinfo() in preference to mbtowc() when available, and only use mbtowc() if nl_langinfo doesn't exist on the platform or fails to return a real result, which happens for some locales on Darwin. This commit does that as well.
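The "special call" is mbtowc(NULL, NULL, 0), which resets the function's internal conversion state. The sketch below probes the current LC_CTYPE locale by decoding the UTF-8 encoding of U+FFFD after such a reset. The function name is hypothetical, and the comparison against 0xFFFD assumes that wchar_t holds Unicode code points, which is common but not guaranteed by the C standard.

```c
#include <locale.h>
#include <stdlib.h>

/* Returns 1 if the current LC_CTYPE locale decodes the three-byte
 * UTF-8 sequence for U+FFFD into that code point, 0 otherwise. */
static int ctype_decodes_utf8(void)
{
    static const char replacement[] = "\xEF\xBF\xBD";   /* U+FFFD */
    wchar_t wc = 0;
    int len;

    /* Reset to the initial shift state first; without this, state
     * left over from an earlier conversion can affect the result. */
    (void)mbtowc(NULL, NULL, 0);

    len = mbtowc(&wc, replacement, sizeof replacement - 1);

    /* Leave the state clean for whoever calls mbtowc() next. */
    (void)mbtowc(NULL, NULL, 0);

    return len == 3 && wc == (wchar_t)0xFFFD;
}
```

Where nl_langinfo(CODESET) exists and gives a real answer, the commit prefers it, so a probe of this kind is the second choice.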