path: root/locale.c
Commit message (Author, Age, Files, Lines)
* locale.c: Silence porting messages (Karl Williamson, 2015-09-09, 1 file, -25/+20)
| | | | | | | | | | | This changes from using the standard C, generally unsafe, library functions to using Perl's safer alternatives. This code, only used in debugging, really doesn't need that safety, but I had forgotten that Perl makes it easy to add it, and it silences the warnings about using the C functions from t/porting/libperl.t. Why this warning didn't happen in smoking, I don't know. Spotted by Dave Mitchell.
* Add more -DL debugging info (Karl Williamson, 2015-09-08, 1 file, -1/+25)
| | | | | This adds more stuff that gets dumped when debugging locale handling. And it adds even more when the v modifier appears.
* Add code for debugging locale initialization (Karl Williamson, 2015-09-08, 1 file, -28/+171)
| | | | | | | | | | | This initialization is done before the processing of command line arguments, so that it has to be handled specially. This commit changes the initialization code to output debugging information if the environment variable PERL_DEBUG_LOCALE_INIT is set. I don't see the need to document this outside the source, as anyone who is using it would be reading the source anyway; it's of highly specialized use.
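The environment-variable switch described above can be sketched in C. This is a minimal illustration, not the code in locale.c: the helper names are hypothetical, and treating any set value (even an empty one) as "on" is an assumption of the sketch.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: true when the environment asks for debugging output.
 * Assumption: any set value, even an empty string, turns it on. */
bool sketch_debug_locale_init_requested(void)
{
    return getenv("PERL_DEBUG_LOCALE_INIT") != NULL;
}

/* Hypothetical stand-in for start-up work that runs before any
 * command-line switch has been parsed, so only the environment
 * can influence it. */
void sketch_init_locale(void)
{
    if (sketch_debug_locale_init_requested()) {
        fprintf(stderr, "locale init: debugging requested via environment\n");
    }
    /* ... the real initialization would follow here ... */
}
```

Because the check reads the environment rather than a command-line flag, it works even though argument processing has not happened yet.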
* locale.c: Add clarifying comments (Karl Williamson, 2015-09-08, 1 file, -2/+42)
|
* Safefree(NULL) reduction (Daniel Dragan, 2015-08-03, 1 file, -3/+0)
| | | | | | | | | | | | | | | | locale.c: - the pointers are always null at this point, see http://www.nntp.perl.org/group/perl.perl5.porters/2015/07/msg229533.html pp.c: - reduce scope of temp_buffer and svrecode, into an inner branch - in some permutations, either temp_buffer is never set to non-null, or svrecode, in permutations where it is known that the var hasn't been set yet, skip the freeing calls at the end, this doesn't eliminate all permutations with NULL being passed to Safefree and SvREFCNT_dec, but only some of them regcomp.c: - don't create a save stack entry to call Safefree(NULL), see ticket for this patch for some profiling stats
* locale.c: White-space, comment only (Karl Williamson, 2015-06-01, 1 file, -7/+9)
| | | | Add a comment, indent some nested #if's
* Replace common Emacs file-local variables with dir-locals (Dagfinn Ilmari Mannsåker, 2015-03-22, 1 file, -6/+0)
| | | | | | | | | | | | | | | | An empty cpan/.dir-locals.el stops Emacs using the core defaults for code imported from CPAN. Committer's work: To keep t/porting/cmp_version.t and t/porting/utils.t happy, $VERSION needed to be incremented in many files, including throughout dist/PathTools. perldelta entry for module updates. Add two Emacs control files to MANIFEST; re-sort MANIFEST. For: RT #124119.
* [perl #123814] replace grok_atou with grok_atoUV (Hugo van der Sanden, 2015-03-09, 1 file, -1/+4)
| | | | | | | | | | Some questions and loose ends: XXX gv.c:S_gv_magicalize - why are we using SSize_t for paren? XXX mg.c:Perl_magic_set - need appropriate error handling for $) XXX regcomp.c:S_reg - need to check if we do the right thing if parno was not grokked Perl_get_debug_opts should probably return something unsigned; not sure if that's something we can change.
* locale.c: Move statements properly within #if (Karl Williamson, 2015-03-07, 1 file, -4/+4)
| | | | | | The variables in these statements were undefined when compiled with ccflag -DNO_LOCALE, because the declarations are skipped then. Just move them a few lines up so they are within the same #if.
* locale.c: savepv() of getenv() (Karl Williamson, 2015-02-06, 1 file, -8/+19)
| | | | | | | See https://rt.perl.org/Public/Bug/Display.html?id=123748. This also changes a '0' into a FALSE when initializing a boolean, which I consider clearer.
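The ticket itself is not reproduced here, so the following is only a sketch of the general idea behind copying a getenv() result: the returned pointer may be invalidated by later changes to the environment, so a private copy is taken first. The function name is hypothetical, and plain malloc() stands in for Perl's savepv().

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of the variable's value,
 * or NULL if the variable is unset or memory runs out.
 * The caller owns the copy and must free() it. */
char *sketch_copy_env(const char *name)
{
    const char *value = getenv(name);
    if (value == NULL) {
        return NULL;              /* unset: nothing to copy */
    }
    size_t len = strlen(value) + 1;
    char *copy = malloc(len);
    if (copy != NULL) {
        memcpy(copy, value, len); /* includes the terminating NUL */
    }
    return copy;
}
```

The copy stays valid no matter what later happens to the environment, at the cost of one allocation that has to be freed.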
* locale.c: Fix comment (Karl Williamson, 2015-02-06, 1 file, -1/+1)
|
* Corrections to spelling and grammatical errors. (Lajos Veres, 2015-01-28, 1 file, -1/+1)
| | | | Extracted from patch submitted by Lajos Veres in RT #123693.
* locale.c: Add comment; move #if (Karl Williamson, 2015-01-13, 1 file, -3/+6)
| | | | | | A better comment is added. The #if is moved so that, in the rare compilation that doesn't use LC_CTYPE, no unused-variable warning is generated.
* Move unlikely executed macro to function (Karl Williamson, 2015-01-13, 1 file, -0/+23)
| | | | | | | | | | The bulk of this macro is extremely rarely executed, so it makes sense to optimize for space, as it is called from a fair number of places, and move as much as possible to a single function. For whatever it's worth, on my system with my typical compilation options, including -O0, the savings was 19640 bytes in regexec.o, 4528 in utf8.o, at a cost of 1488 in locale.o.
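The space saving comes from expanding only a cheap test at each call site and keeping the bulky work in one out-of-line function. A minimal sketch of that pattern follows; every name in it is hypothetical and none is the actual macro or function from the commit.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins: a rarely true condition and a call counter. */
bool sketch_rare_condition = false;
int sketch_slow_path_calls = 0;

/* Out of line: the bulky, rarely executed work exists once in the binary. */
void sketch_handle_rare_case(const char *where)
{
    sketch_slow_path_calls++;
    fprintf(stderr, "rare case reached from %s\n", where);
}

/* Inline at every call site: just one test and, rarely, one call. */
#define SKETCH_CHECK_RARE(where)                 \
    do {                                         \
        if (sketch_rare_condition) {             \
            sketch_handle_rare_case(where);      \
        }                                        \
    } while (0)
```

Each use of the macro now costs a compare and a branch in the caller, while the function body is paid for once, which matches the trade reported in the message (large savings in the callers, a small cost in locale.o).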
* locale.c: Fix memory leak. (Karl Williamson, 2015-01-13, 1 file, -0/+1)
| | | | | | | I spotted this in code review. I didn't add a test for it, because to expose the much more serious bug fixed by the previous commit, I had to temporarily change the C code to force these extremely unlikely-to-be-taken branches to execute.
* Fix breakage of 780fcc9 (Karl Williamson, 2014-12-29, 1 file, -2/+7)
| | | | | I got confused in writing this: the global needs to be cleared always, and set to NULL.
* Don't raise 'poorly supported' locale warning unnecessarily (Karl Williamson, 2014-12-29, 1 file, -11/+24)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 8c6180a91de91a1194f427fc639694f43a903a78 added a warning message for when Perl determines that the program's underlying locale just switched into is poorly supported. At the time it was thought that this would be an extremely rare occurrence. However, a bug in HP-UX - B.11.00/64 causes this message to be raised for the "C" locale. A workaround was done that silenced those. However, before it got fixed, this message would occur gobs of times executing the test suite. It was raised even if the script is not locale-aware, so that the underlying locale was completely irrelevant. There is a good prospect that someone using an older Asian locale as their default would get this message inappropriately, even if they don't use locales, or switch to a supported one before using them. This commit causes the message to be raised only if it actually is relevant. When not in the scope of 'use locale', the message is stored, not raised. Upon the first locale-dependent operation within a bad locale, the saved message is raised, and the storage cleared. I was able to do this without adding extra branching to the main-line non-locale execution code. This was done by adding regnodes which get jumped to by switch statements, and refactoring some existing C tests so they exclude non-locale right off the bat. These changes would have been necessary for another locale warning that I previously agreed to implement, and which is coming a few commits from now. I do not know of any way to add tests in the test suite for this. It is in fact rare for modern locales to have these issues. The way I tested this was to temporarily change the C code so that all locales are viewed as defective, and manually note that the warnings came out where expected, and only where expected. I chose not to try to output this warning on any POSIX functions called. 
I believe that all that are affected are deprecated or scheduled to be deprecated anyway. And POSIX is closer to the hardware of the machine. For convenience, I also don't output the message for some zero-length pattern matches. If something is going to be matched, the message will likely very soon be raised anyway.
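The store-then-raise behaviour described in this message can be sketched as a small state machine. This is an illustration under stated assumptions, not Perl's implementation: the struct and function names are hypothetical, and warnings simply go to stderr.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical slot for a warning that has been saved but not yet shown. */
typedef struct {
    char *pending_warning;   /* NULL when nothing is saved */
} sketch_locale_state;

/* Called when a poorly supported locale is entered. Inside locale-aware
 * scope the warning is raised at once; otherwise it is only stored. */
void sketch_note_bad_locale(sketch_locale_state *st, const char *msg,
                            bool in_locale_scope)
{
    free(st->pending_warning);          /* drop any older saved message */
    st->pending_warning = NULL;
    if (in_locale_scope) {
        fprintf(stderr, "%s\n", msg);
        return;
    }
    size_t len = strlen(msg) + 1;
    st->pending_warning = malloc(len);
    if (st->pending_warning != NULL) {
        memcpy(st->pending_warning, msg, len);
    }
}

/* Called at the first locale-dependent operation: raises the saved
 * warning, clears the storage, and reports whether anything was raised. */
bool sketch_flush_warning(sketch_locale_state *st)
{
    if (st->pending_warning == NULL) {
        return false;
    }
    fprintf(stderr, "%s\n", st->pending_warning);
    free(st->pending_warning);
    st->pending_warning = NULL;
    return true;
}
```

A program that never performs a locale-dependent operation never calls the flush function, so it never sees the warning, which is the point of the change.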
* Stop erroneous warnings for C locale (Karl Williamson, 2014-12-11, 1 file, -1/+10)
| | | | | | HP-UX - B.11.00/64 has a problem with the C locale that's only noticeable from newly added warnings flooding the logs. This adds a test to suppress them.
* Change core to use is_invariant_string() (Karl Williamson, 2014-11-26, 1 file, -5/+5)
| | | | | is_ascii_string's name has misled me in the past; the new name is clearer.
* locale.c: Account for setlocale using static storage (Karl Williamson, 2014-11-19, 1 file, -2/+9)
| | | | | | | | | On some systems, setlocale() uses static storage for the locale name it returns, so that a subsequent setlocale() overwrites it. Therefore, you must make a copy of the name if you want it to work after the next setlocale(). Thanks to Craig Berry for finding and diagnosing this problem.
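A minimal sketch of the copy-before-reuse rule, using only standard C. The helper name is hypothetical, and malloc() stands in for whatever allocator the real code uses.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of the current locale name for
 * `category`, or NULL on failure. The caller must free() the result. */
char *sketch_save_locale_name(int category)
{
    /* A NULL second argument only queries; it does not change the locale. */
    const char *current = setlocale(category, NULL);
    if (current == NULL) {
        return NULL;
    }
    size_t len = strlen(current) + 1;
    char *copy = malloc(len);
    if (copy != NULL) {
        memcpy(copy, current, len);   /* private copy survives later calls */
    }
    return copy;
}
```

A typical use would be to save the name, switch locales temporarily, restore with `setlocale(category, saved)`, and then free the copy; the pointer returned by the first query must not be relied on after the switch.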
* Reinstate "Raise warnings for poorly supported locales" (Karl Williamson, 2014-11-14, 1 file, -0/+82)
| | | | | This reverts commit 1244bd171b8d1fd4b6179e537f7b95c38bd8f099, thus reinstating commit 3d3a881c1b0eb9c855d257a2eea1f72666e30fbc.
* Revert "Raise warnings for poorly supported locales" (Karl Williamson, 2014-11-04, 1 file, -82/+0)
| | | | | | | This reverts commit 3d3a881c1b0eb9c855d257a2eea1f72666e30fbc. Win32 with a 1252 code page was failing blead. Revert until I have time to look at it.
* Raise warnings for poorly supported locales (Karl Williamson, 2014-11-04, 1 file, -0/+82)
| | | | | | | | | Perl only supports single-byte locales (except for UTF-8 ones), and has poor support for 7-bit locales that aren't supersets of ASCII (these should be exceedingly rare these days). This commit raises warnings in the new locale warning category when such a locale is entered.
* fix type incompatibilities between format strings/args (Lukas Mai, 2014-10-26, 1 file, -1/+1)
| | | | | | | | Building a debugging perl triggered warnings such as warning: format ‘%d’ expects argument of type ‘int’, but argument 3 has type ‘U32’ warning: field width specifier ‘*’ expects argument of type ‘int’, but argument 5 has type ‘long unsigned int’ warning: format ‘%x’ expects argument of type ‘unsigned int’, but argument 3 has type ‘wchar_t’
* EBCDIC doesn't have real UTF-8 locales. (Karl Williamson, 2014-10-21, 1 file, -0/+3)
| | | | | At least on the system that we have tested on. There are locales that say they are UTF-8, but they're not; they're EBCDIC 1047.
* Add and use macros for case-insensitive comparison (Karl Williamson, 2014-08-22, 1 file, -2/+2)
| | | | | | | | | | | | | | | | | | | | | | | | This adds to handy.h isALPHA_FOLD_EQ(c1,c2) which efficiently tests if c1 and c2 are the same character, case-insensitively. For example isALPHA_FOLD_EQ(c, 's') returns true if and only if <c> is 's' or 'S'. isALPHA_FOLD_NE() is also added by this commit. At least one of c1 and c2 must be known to be in [A-Za-z] or this macro doesn't work properly. (There is an assert for this in the macro in DEBUGGING builds). That is why the name includes "ALPHA", so you won't forget when using it. This functionality has been in regcomp.c for a while, under a different name. I had thought that the only reason to make it more generally available was potential speed gain, but recent gcc versions optimize to the same code, so I thought there wasn't any point to doing so. But I now think that using this makes things easier to read (and certainly shorter to type in). Once you grok what this macro does, it simplifies what you have to keep in your mind when reading logical expressions with multiple operands. That something can be either upper or lower case can be a distraction to understanding the larger point of the expression.
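The behaviour described for isALPHA_FOLD_EQ, including why one operand must be a letter, can be illustrated with a small stand-in macro. This is a sketch for ASCII platforms only; SKETCH_ALPHA_FOLD_EQ is a hypothetical name and is not claimed to match the definition in handy.h.

```c
#include <assert.h>
#include <stdbool.h>

/* On ASCII, upper- and lower-case letters differ only in the bit
 * ('A' ^ 'a'), so masking that bit off folds case for letters. */
#define SKETCH_CASE_BIT ('A' ^ 'a')

/* Hypothetical stand-in: true when c1 and c2 are equal ignoring the case
 * bit. Only meaningful if at least one operand is in [A-Za-z]; for
 * example '@' and '`' also differ only in that bit and would wrongly
 * compare equal. */
#define SKETCH_ALPHA_FOLD_EQ(c1, c2) \
    ((((c1) & ~SKETCH_CASE_BIT) == ((c2) & ~SKETCH_CASE_BIT)))

/* Example caller: accepts 's' or 'S' as a one-letter option. */
bool sketch_is_option_s(char c)
{
    assert(c != '\0');                    /* illustrative precondition */
    return SKETCH_ALPHA_FOLD_EQ(c, 's');
}
```

Written this way, a test such as `c == 's' || c == 'S'` collapses into one operand of a larger expression, which is the readability gain the message describes.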
* Add sync_locale() (Karl Williamson, 2014-08-14, 1 file, -0/+35)
| | | | | | This trivial function is to be used by XS code when it changes the program's locale. It hides the details from that code of what needs to be done, which could change in the future.
* locale.c: Clarify comment (Karl Williamson, 2014-08-13, 1 file, -1/+1)
|
* locale.c: Use PERL_UNUSED_RESULT (Karl Williamson, 2014-08-13, 1 file, -3/+1)
| | | | | The previous way to suppress messages wasn't working for all gcc versions. Spotted by Jarkko Hietaniemi.
* Use grok_atou instead of atoi. (Jarkko Hietaniemi, 2014-07-22, 1 file, -1/+2)
| | | | | Remaining atoi() uses include at least: ext/DynaLoader/dl_aix.xs, os2/os2.c, vms/vms.c
* locale.c: Improve some comments (Karl Williamson, 2014-07-12, 1 file, -3/+6)
|
* locale.c: Fix some unused code for potential future use (Karl Williamson, 2014-07-12, 1 file, -33/+46)
| | | | | | | | | | This code extends the heuristics used to determine if a locale is UTF-8 or not on older platforms. It has been #ifdef'd out because it only added a little value on dromedary. Now the previous commit has added new heuristics, and tests on dromedary show that this adds nothing to that. But I'm leaving it in the source in case it might ever prove useful. In order to test it, I compiled it and found some problems with the earlier version that this now fixes.
* locale.c: Add new heuristic for finding if locale is UTF-8 (Karl Williamson, 2014-07-12, 1 file, -1/+93)
| | | | | | | | On older platforms that conform to neither POSIX 2001 nor C99, heuristics are employed to try to determine if a locale is UTF-8 or not. This commit improves those heuristics by looking at names of the months and days of the week to see if they are UTF-8 or not. This is done if looking at the currency symbol failed to help.
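The core of such a heuristic is deciding whether a locale-supplied string, such as a month name, gives evidence of UTF-8. The sketch below shows one way to do that; it is not the code from locale.c. The function name is hypothetical, how the strings are obtained (for instance via strftime()) is left out, and the validator is deliberately simplified: it does not reject every overlong or surrogate encoding.

```c
#include <stdbool.h>

/* Hypothetical helper: true only if `s` contains at least one multi-byte
 * sequence and every such sequence is well formed. Pure ASCII returns
 * false because it proves nothing either way. */
bool sketch_has_utf8_evidence(const char *s)
{
    const unsigned char *p = (const unsigned char *) s;
    bool saw_multibyte = false;

    while (*p != '\0') {
        int continuation;

        if (*p < 0x80) {                 /* plain ASCII byte */
            p++;
            continue;
        } else if (*p >= 0xC2 && *p <= 0xDF) {
            continuation = 1;            /* 2-byte sequence */
        } else if (*p >= 0xE0 && *p <= 0xEF) {
            continuation = 2;            /* 3-byte sequence */
        } else if (*p >= 0xF0 && *p <= 0xF4) {
            continuation = 3;            /* 4-byte sequence */
        } else {
            return false;                /* byte cannot start a sequence */
        }
        p++;
        while (continuation-- > 0) {
            if ((*p & 0xC0) != 0x80) {   /* also catches an early NUL */
                return false;
            }
            p++;
        }
        saw_multibyte = true;
    }
    return saw_multibyte;
}
```

A name like "février" encoded as UTF-8 passes, the same name in Latin-1 fails, and an all-ASCII name such as "March" is inconclusive, which is why several month and day names would need to be examined.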
* locale.c: White-space only (Karl Williamson, 2014-07-12, 1 file, -12/+12)
| | | | | Indent and outdent blocks of code to conform to newly formed or removed braces
* locale.c: Refactor UTF8ness of currency symbol code (Karl Williamson, 2014-07-12, 1 file, -26/+24)
| | | | | | | | | On older platforms that are neither C99 nor POSIX 2001, locale.c uses the currency symbol to try to see if a locale is UTF-8 or not. This commit refactors it somewhat to make it cleaner, which also fixes several problems. The least issue was that it sometimes did a setlocale() unnecessarily. Others are that in some circumstances it called localeconv() and/or looked at the result while within the wrong locale.
* locale.c: Use ptr's value before freeing it, not after (Karl Williamson, 2014-07-12, 1 file, -1/+1)
| | | | This only affected runs with the -DL parameter to perl set.
* locale.c: Use safer code practice (Karl Williamson, 2014-07-12, 1 file, -5/+6)
| | | | | | | The interior-most function can return NULL. Currently savepv() which is the next outer function handles this correctly, as does the next outer function, but it is dangerous to rely on that behavior. So we test for NULL before calling functions on a NULL ptr.
* locale.c: Skip compiling fallback code on modern platforms (Karl Williamson, 2014-07-12, 1 file, -1/+5)
| | | | | | | | | In the function that determines if a POSIX locale is UTF-8 or not, if either nl_langinfo or MB_CUR_MAX is defined, it can reliably determine the answer. If they are not defined, it uses heuristics to figure things out as best it can. This code doesn't add value for those platforms where one of the two symbols is defined, so it can just be ifdef'd out.
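On a platform that has nl_langinfo(), the reliable check can be as small as the following sketch. It assumes a POSIX system; the function names are hypothetical, accepting both "UTF-8" and "UTF8" spellings is an assumption, and the MB_CUR_MAX route is not shown.

```c
#include <langinfo.h>
#include <stdbool.h>
#include <stddef.h>
#include <strings.h>   /* strcasecmp (POSIX) */

/* Hypothetical helper: does a codeset name denote UTF-8?
 * Assumption: the name is spelled "UTF-8" or "UTF8", in any case. */
bool sketch_codeset_name_is_utf8(const char *codeset)
{
    return codeset != NULL
        && (strcasecmp(codeset, "UTF-8") == 0
            || strcasecmp(codeset, "UTF8") == 0);
}

/* Asks the C library for the codeset of the current LC_CTYPE locale. */
bool sketch_current_ctype_is_utf8(void)
{
    return sketch_codeset_name_is_utf8(nl_langinfo(CODESET));
}
```

When this kind of direct query is available, guessing from currency symbols or locale names adds nothing, which is why the fallback code can be compiled out there.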
* locale.c: name should be last resort when deciding if locale is utf8 (Karl Williamson, 2014-07-12, 1 file, -73/+78)
| | | | | Checking whether the currency symbol is UTF-8 should come ahead of looking at the locale name.
* locale.c: Prepare for rearrangement of code blocks (Karl Williamson, 2014-07-12, 1 file, -7/+6)
| | | | | | | | This section of code just returned generally. This commit changes it so that it drops off the end if it can't determine if the current locale is UTF-8 or not, so that additional tests can be added later. The function defaults to not UTF-8 if this drops off the end, so there should be no functionality change.
* locale.c: Fix misplaced parenthesis (Karl Williamson, 2014-07-10, 1 file, -2/+2)
| | | | | | Commit a39edc4c877304d4075679b1d8de1904671a9c37 got a parenthesis misplaced so it wasn't really looking at the next character, like it was supposed to be doing
* locale.c: White-space only (Karl Williamson, 2014-07-09, 1 file, -8/+9)
| | | | Outdent because the previous commit removed the enclosing block.
* locale.c: Remove conditionals. (Karl Williamson, 2014-07-09, 1 file, -11/+10)
| | | | | | | | | These two functions are supposed to normally be called through macro interfaces which check whether they actually should be called or not. That means the conditionals removed by this commit are redundant from the normal interface. By removing them, we allow the exceptional case where the code should be executed unconditionally, to happen, by just calling the functions directly, not using the macro interface.
* locale.c: Keep better track of C/non-C locale (Karl Williamson, 2014-07-09, 1 file, -2/+2)
| | | | | | | | | | Perl uses three interpreter-level (but private) variables to keep track of numeric locales. PL_numeric_name is the current underlying locale. PL_standard is a boolean to indicate if we are switched to the C (or POSIX) locale, and PL_local is a boolean to indicate if we are switched to the underlying one. The reason there are two booleans is if the underlying locale is C, both can be true at the same time. But the code that is being changed by this commit didn't realize this, and could unnecessarily set the booleans to FALSE. This could cause unnecessary switching of locales.
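The interaction of the name and the two booleans can be sketched as follows. The struct and functions are hypothetical stand-ins for the interpreter variables, not Perl's code; the point is only that when the underlying locale is itself C or POSIX, both flags stay true after either switch.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical mirror of the three tracking variables. */
typedef struct {
    const char *name;   /* the underlying numeric locale */
    bool standard;      /* currently switched to C/POSIX? */
    bool local;         /* currently switched to the underlying locale? */
} sketch_numeric_state;

/* True if the underlying locale is itself the C (or POSIX) locale. */
bool sketch_underlying_is_c(const sketch_numeric_state *st)
{
    return strcmp(st->name, "C") == 0 || strcmp(st->name, "POSIX") == 0;
}

/* Switch to C: we are also still "in" the underlying locale if it is C. */
void sketch_switch_to_standard(sketch_numeric_state *st)
{
    st->standard = true;
    st->local = sketch_underlying_is_c(st);
}

/* Switch to the underlying locale: also "standard" if that locale is C. */
void sketch_switch_to_underlying(sketch_numeric_state *st)
{
    st->local = true;
    st->standard = sketch_underlying_is_c(st);
}
```

Clearing the other flag unconditionally, as the old code could do, would make a later check think a real locale switch is needed even though nothing would change.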
* locale.c: Make a common idiom into a macro (Karl Williamson, 2014-07-09, 1 file, -10/+18)
|
* Avoid unused warnings from locale-less systems. (Jarkko Hietaniemi, 2014-07-03, 1 file, -0/+6)
| | | | | | From Brian Fraser: "Technically, any Perl compiled with -Accflags="-UUSE_LOCALE", or -Ui_locale -Ud_setlocale... realistically, for Android".
* Remove or downgrade unnecessary dVAR. (Jarkko Hietaniemi, 2014-06-25, 1 file, -11/+0)
| | | | | | | | You need to configure with g++ *and* -Accflags=-DPERL_GLOBAL_STRUCT or -Accflags=-DPERL_GLOBAL_STRUCT_PRIVATE to see any difference. (g++ does not do the "post-annotation" form of "unused".) The version code has some of these issues, reported upstream.
* Put back an #if-0-ed chunk 7053d92 removed. (Jarkko Hietaniemi, 2014-06-13, 1 file, -0/+75)
| | | | | | | The chunk is not MAD-related but instead locale stuff. I have no idea why that chunk got removed (I used a combination of unifdef(1) and editor). It's #if-0-ed, so no change of behavior either way, but let's keep the code for now, since it seems to have "historical significance".
* Remove MAD. (Jarkko Hietaniemi, 2014-06-13, 1 file, -75/+0)
| | | | | | MAD = Misc Attribute Decoration; unmaintained attempt at preserving the Perl parse tree more faithfully so that automatic conversion to Perl 6 would have been easier.
* locale.c: Fix uncomplemented 'if' test (Karl Williamson, 2014-06-07, 1 file, -1/+1)
| | | | | Somehow the ! in this if () got dropped, and there were no tests to catch it. Now both are remedied.