path: root/locale.c
Commit message (Author, Age; Files, Lines)
* Check for NULL locale in S_emulate_setlocale (Hugo van der Sanden, 2021-09-12; 1 file, -2/+3)
| | | | | gcc-11.2.0 correctly warns that it is called with NULL from Perl_switch_to_global_locale().
* gh17824: zero curlocales[] (Hugo van der Sanden, 2021-05-31; 1 file, -0/+3)
| | | | | Static analysis tools such as Coverity and clang report that we can otherwise end up reading uninitialized data, and inspection agrees.
* locale.c: Use memzero, instead of memset(0) (Karl Williamson, 2021-04-15; 1 file, -1/+1)
| | | | Clearer to use the more direct operation
* locale.c: Clarifying comments (Karl Williamson, 2021-04-15; 1 file, -27/+44)
|
* locale.c: Use %z modifier instead of cast (Karl Williamson, 2021-03-20; 1 file, -2/+2)
| | | | It's better to use a %z modifier than to cast the operand.
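A minimal sketch of the idea, not the actual locale.c change: the `%z` length modifier tells the formatter that the operand is a `size_t`, so no cast is needed. The helper name `format_len` is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write "len=<n>" into `out`.
 * %zu matches size_t directly, so `len` is passed without a cast. */
static int format_len(char *out, size_t out_size, size_t len)
{
    assert(out != NULL && out_size > 0);
    return snprintf(out, out_size, "len=%zu", len);
}
```

Calling `format_len(buf, sizeof buf, strlen("locale"))` fills `buf` with `len=6`.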
* locale.c: Add a branch prediction (Karl Williamson, 2021-03-12; 1 file, -1/+1)
|
* locale.c: Silence compiler warning (Karl Williamson, 2021-02-11; 1 file, -0/+5)
| | | | | | Later gcc compilers warn on these intentional fall throughs in a switch(). Adding FALLTHROUGH would make things a lot less clear, so turn off the warnings around the whole switch.
* locale.c: Remove dead code (Karl Williamson, 2021-02-09; 1 file, -131/+0)
| | | | | | This code was left in for a potential guide for the future. Now that I'm getting round to implementing it, I'm going in a different direction.
* style: Detabify indentation of the C code maintained by the core. (Michael G. Schwern, 2021-01-17; 1 file, -44/+44)
| | | | | | | | | | | This just detabifies to get rid of the mixed tab/space indentation. Applying consistent indentation and dealing with other tabs are another issue. Done with `expand -i`. * vutil.* left alone, it's part of version. * Left regen managed files alone for now.
* locale.c: Work around a z/OS limitation/feature (Karl Williamson, 2020-12-12; 1 file, -1/+13)
| | | | | | | | | | | | | | | Without per-thread locales, a multi-thread application is inherently unsafe. IBM solves that by allowing you to set up the locale any way you want, but after you've created a thread, all future locale changes are ignored, and return failure. But Perl itself changes the locale in a couple of cases. Recent changes have surfaced this issue in one case, leading to a panic. And this commit works around it, so that messages will be displayed in the locale in effect before the threads were created. The remaining case requires further investigation. Nothing in our suite is failing.
* locale.c: Remove some unnecessary mutex locks (Karl Williamson, 2020-12-08; 1 file, -5/+0)
| | | | | | | | | These aren't necessary as the called function has its own lock until done copying into the local structure. And these were breaking blead on Windows, as they are no longer defined. The smoke I ran included more commits beyond the breaking one, so I didn't catch it.
* locale.c: Unlock mutex before croaking (Karl Williamson, 2020-12-08; 1 file, -0/+2)
| | | | | These cases aren't supposed to happen, but unlock the mutex first; we could get into deadlock in trying to output the death message.
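The ordering described above can be sketched with plain pthreads. This is an illustration under assumptions, not Perl's code: `locale_mutex`, `report_failure` and `guarded_operation` are hypothetical names, and the reporter returns instead of terminating so the behaviour is easy to observe.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t locale_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical reporter: like a death-message routine, it takes the
 * same lock while it builds its output. */
static void report_failure(const char *msg)
{
    assert(msg != NULL);
    pthread_mutex_lock(&locale_mutex);
    fprintf(stderr, "panic: %s\n", msg);
    pthread_mutex_unlock(&locale_mutex);
}

/* Hypothetical operation: returns 0 on success, -1 on the
 * "not supposed to happen" path. */
static int guarded_operation(int simulate_failure)
{
    pthread_mutex_lock(&locale_mutex);
    if (simulate_failure) {
        /* Release the lock first. Reporting while still holding it
         * can deadlock, because report_failure() locks it again. */
        pthread_mutex_unlock(&locale_mutex);
        report_failure("unexpected state");
        return -1;
    }
    pthread_mutex_unlock(&locale_mutex);
    return 0;
}
```

After a failing call the mutex is free again, so a later `guarded_operation(0)` still succeeds.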
* Name individual locale locks (Karl Williamson, 2020-12-08; 1 file, -17/+15)
| | | | | | | | | These locks for different functions all use the same underlying mutex; but that may not always be the case. By creating separate names used only when we think they will be necessary, the compiler will complain if the conditions in the code that actually use them are the same. Doing this showed a misspelling in an #ifdef, fixed in 9289d4dc7a3d24b20c6e25045e687321ee3e8faf
* Change name of mutex macro. (Karl Williamson, 2020-12-08; 1 file, -7/+7)
| | | | | This macro is for localeconv(); the new name is clearer as to the meaning, and this preps for further changes.
* locale.c: Add debugging info to panic message (Karl Williamson, 2020-12-08; 1 file, -2/+2)
|
* duplocale() is part of Posix 2008 locales (Karl Williamson, 2020-12-08; 1 file, -4/+2)
| | | | | Thus if we know we have the Posix versions, we have duplocale(), and hence don't need to test separately for it.
* locale.c: Fix typo in #ifdef (Karl Williamson, 2020-12-04; 1 file, -1/+1)
| | | | | This misspelling led to the code assuming that the platform didn't have a feature that, if used, would result in faster execution.
* locale.c: Move comment to better place (Karl Williamson, 2020-11-26; 1 file, -4/+5)
|
* autodoc.pl: Specify scn for single-purpose files (Karl Williamson, 2020-11-06; 1 file, -3/+0)
| | | | | | | | Many of the files in perl are for one thing only, and hence their embedded documentation will be for that one thing. By creating a hash here of them, those files don't have to worry about what section that documentation goes under, and so it can be completely changed without affecting them.
* Fix typos (Samanta Navarro, 2020-10-03; 1 file, -1/+1)
| | | | | | | | | For: https://github.com/Perl/perl5/pull/18201 Committer: Samanta Navarro is now a Perl author. To keep 'make test_porting' happy: Increment $VERSION in several files. Regenerate uconfig.h via './perl -Ilib regen/uconfig_h.pl'.
* Reorganize perlapi (Karl Williamson, 2020-09-04; 1 file, -1/+1)
| | | | | This uses a new organization of sections that I came up with. I asked for comments on p5p, but there were none.
* Remove use of dVAR in core (Dagfinn Ilmari Mannsåker, 2020-07-20; 1 file, -5/+0)
| | | | | It only does anything under PERL_GLOBAL_STRUCT, which is gone. Keep the dNOOP definition for CPAN back-compat
* Remove PERL_GLOBAL_STRUCT (Dagfinn Ilmari Mannsåker, 2020-07-20; 1 file, -6/+2)
| | | | | | | | This was originally added for MinGW, which no longer needs it, and only still used by Symbian, which is now removed. This also leaves perlapi.[ch] empty, but we keep the header for CPAN backwards compatibility.
* Add z/OS locale categories (Karl Williamson, 2020-07-17; 1 file, -1/+45)
| | | | | | | z/OS has two locale categories, LC_SYNTAX and LC_TOD, not found outside IBM products. This makes Perl know about them, so that a program can refer to them, but like other similar categories found on other OS's, nothing more is done with them.
* switch_category_locale_to_template: Fix use-after-free under -DLv (Dagfinn Ilmari Mannsåker, 2020-03-16; 1 file, -1/+1)
| | | | Coverity CID 288709
* Add thread safety to some environment accesses (Karl Williamson, 2020-03-11; 1 file, -26/+4)
| | | | | | | | | | | | | | | | | | The previous commit added a mutex specifically for protecting against simultaneous accesses of the environment. This commit changes the normal getenv, putenv, and clearenv functions to use it, to avoid races. This makes the code simpler in places where we've gotten burned and added stuff to avoid races. Other places where we haven't known we were getting burned could have existed until now. Now that comes automatically, and we can remove the special cases we earlier stumbled over. getenv() returns a pointer to static memory, which can be overwritten at any moment from another thread, or even another getenv from the same thread. This commit changes the accesses to be under control of a mutex, and in the case of getenv, a mortalized copy is created so that there is no possible race.
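A rough sketch of the pattern, assuming POSIX threads: read the environment while holding a mutex and hand back a private copy, so that a later `getenv`, `setenv` or `putenv` cannot overwrite what the caller holds. Perl itself returns a mortalized copy; this sketch substitutes a `malloc`ed string that the caller frees. `env_mutex` and `getenv_copy` are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

static pthread_mutex_t env_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical helper: return a heap copy of the variable's value,
 * or NULL if it is unset or allocation fails. The caller frees it. */
static char *getenv_copy(const char *name)
{
    char *copy = NULL;

    assert(name != NULL);
    pthread_mutex_lock(&env_mutex);
    const char *value = getenv(name);   /* points into shared storage */
    if (value != NULL) {
        size_t len = strlen(value) + 1;
        copy = malloc(len);
        if (copy != NULL)
            memcpy(copy, value, len);   /* copy while still locked */
    }
    pthread_mutex_unlock(&env_mutex);
    return copy;
}
```

The protection only holds if every writer of the environment takes the same mutex, which is what the commit arranges for the normal getenv, putenv and clearenv paths.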
* Fixup POSIX::mbtowc, wctomb (Karl Williamson, 2020-02-19; 1 file, -2/+8)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit enhances these functions so that on threaded perls, they use mbrtowc and wcrtomb when available, making them thread safe. The substitution isn't completely transparent, as no effort is made to hide any differences in errno setting upon error. And there may be slight differences in edge case behavior on some platforms. This commit also changes the behaviors so that they take a scalar parameter instead of a char *, and this might be 'undef' or not be forceable into a valid PV. If not a PV, the functions initialize the shift state. Previously the shift state was always reinitialized with every call, which meant these could not work on locales with shift states. In addition, there were several issues in mbtowc and wctomb that this commit fixes. mbtowc and wctomb, when used, are now run with a semaphore. This avoids races if called at the same time in another thread. The returned wide character from mbtowc() could well have been garbage. The final parameter to mbtowc is now optional, as passing an SV allows us to determine the length without the need for an extra parameter. It is now used only to restrict the parsing of the string to shorter than the actual length. wctomb would segfault if the string parameter was shared or hadn't been pre-allocated with a string of sufficient length to hold the result.
* POSIX::mblen() Make thread-safe; allow shift state control (Karl Williamson, 2020-02-19; 1 file, -0/+6)
| | | | | | | | | | | | | | | | | | | This commit changes the behavior so that it takes a scalar parameter instead of a char *, and thus might not be forceable into a valid PV. When not a PV, the shift state is reinitialized, like calling mblen with a NULL first parameter. Previously the shift state was always reinitialized with every call, which meant this could not work on locales with shift states. This commit also changes to use mbrlen() on threaded perls transparently (mostly), when available, to achieve thread-safe operation. It is not completely transparent because mbrlen (under the very rare stateful locales) returns a different value when it's resetting the shift state. It also may set errno differently upon errors, and no effort is made to hide that difference. Also mbrlen on some platforms can handle partial characters. [perl #133928] showed that someone was having trouble with shift states.
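For illustration only, here is how `mbrlen()` differs from `mblen()` at the C level: the shift state lives in a caller-owned `mbstate_t` rather than in hidden static storage, so it survives across calls and is not shared between threads. The helper names `reset_shift_state` and `count_chars` are hypothetical, and this is not the POSIX.xs implementation.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Put a caller-owned state into the initial shift state. */
static void reset_shift_state(mbstate_t *state)
{
    assert(state != NULL);
    memset(state, 0, sizeof *state);
}

/* Hypothetical helper: count the characters in the first n bytes of s,
 * carrying the shift state in *state between calls to mbrlen().
 * Returns -1 on an invalid or incomplete sequence. */
static long count_chars(const char *s, size_t n, mbstate_t *state)
{
    long count = 0;

    while (n > 0) {
        size_t len = mbrlen(s, n, state);
        if (len == (size_t)-1 || len == (size_t)-2)
            return -1;              /* invalid, or truncated character */
        if (len == 0)
            break;                  /* reached a NUL character */
        s += len;
        n -= len;
        count++;
    }
    return count;
}
```

Because the state is a parameter, a caller decides when to reset it, which is the control the commit exposes by treating a non-PV argument as a request to reinitialize.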
* locale.c: Use proper #ifdef to enable behavior (Karl Williamson, 2019-11-30; 1 file, -3/+3)
| | | | | | | This changes to use USE_POSIX_2008_LOCALE instead of HAS_POSIX_2008_LOCALE. Rarely do they differ, but someone may choose to configure their installation to not use these more modern functions, even if available, perhaps because they're buggy on that system.
* locale.c white space only (Karl Williamson, 2019-11-30; 1 file, -2/+2)
|
* PATCH: GH #17081: Workaround glibc bug with LC_MESSAGES (Karl Williamson, 2019-11-30; 1 file, -0/+21)
| Please see the ticket for a full explanation. This bug has been submitted to glibc, without any real action forthcoming so far.
|
| This invalidates the message cache each time the locale of LC_MESSAGES is changed, as glibc should be doing this when uselocale changes that, but glibc fails to do so.
|
| This patch is an extension to the one submitted by Niko Tyni++.
|
| I don't know how to test it, since a test would rely on several different locales in different languages being available, and that depends on what's installed on the platform. I suppose that one could go through the available locales, and try to find three with different wording for the same message. Doing so however would trigger the bug, and at the end, if we didn't get three that differed, we wouldn't know if it is because of the bug, or that they just didn't exist on the system.
|
| However, below is a perl program that demonstrated the patch worked. You could adjust it to the available locales. The buggy code shows the same text for all locales. The fixed code shows three different languages.
|
|     use strict;
|     use Locale::gettext;
|     use POSIX;
|
|     $ENV{LANG} = 'C.UTF-8';
|     for my $lang (qw(fi_FI fr_FR en_US)) {
|         $ENV{LANGUAGE} = $lang;
|         setlocale(LC_MESSAGES, '');
|         my $d = Locale::gettext->domain("bash");
|         print $d->get('syntax error'), "\n";
|     }
* (perl #133981) fix my stupid mistake (Tony Cook, 2019-09-05; 1 file, -2/+2)
|
* (perl #133981) fix for Win32 setlocale() abort (Tony Cook, 2019-09-03; 1 file, -1/+60)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This appears to abort because the supplied locale string isn't validly encoded in the current code page, so we see the following steps: 1) an internal sizing call to mbstowcs_s() fails, but 2) the calling (CRT) code doesn't handle that, allocating a zero length buffer 3) mbstowcs_s() is called with a buffer and a zero size, causing the exception. Since it's the conversion that fails, perform our own conversion. Rather than using the current code page always use CP_UTF8, since this is perl's typical non-Latin1 encoding. Unfortunately we don't have the SVf_UTF8 flag at this point, so all we can do is assume UTF-8. This introduces a change in behaviour - previously locale names were interpreted in the current code page, but most locale names are ASCII, so it shouldn't matter. One issue is that the return value is freed on the next LEAVE, but all callers immediately use or copy the string.
* locale.c: Stop Coverity warning (Karl Williamson, 2019-08-06; 1 file, -5/+6)
| | | | | Coverity is right, so re-order these clauses. This code is executed only if some very strange error occurs.
* Fix "it it" typos (Dagfinn Ilmari Mannsåker, 2019-07-04; 1 file, -1/+1)
| | | | And regen affected files
* PATCH: [perl #134098] no locales + debugging = no compile (Karl Williamson, 2019-05-24; 1 file, -1/+1)
| | | | The wrong #define was being tested for
* locale.c: Fix '%s' directive argument is null (Karl Williamson, 2019-05-24; 1 file, -0/+1)
| | | | | This was just an oversight. The code doesn't get executed unless it's trying to panic
* locale.c: Add some comments (Karl Williamson, 2019-05-24; 1 file, -4/+7)
|
* locale.c: remove unnecessary cast (Jerome Duval, 2019-05-24; 1 file, -3/+1)
| | | | | | This was failing in gcc 2.95. The original commit added a cast, but we figured out that removing this other one that really served no purpose causes this compiler to work.
* s/safefree()/Safefree() in a few places (David Mitchell, 2019-04-17; 1 file, -2/+2)
| | | | | | Karl pointed out that a couple of my recent commits used (lower case) safefree() rather than Safefree(), the latter having extra debugging facilities.
* fix leak when $LANG unset (David Mitchell, 2019-04-16; 1 file, -11/+8)
| | | | | | | | | | | | | | | | The following leaked: LANG= perl -e1 because in S_emulate_setlocale(), it was 1) making a copy of $ENV{"LANG"}; 2) throwing that copy away and replacing it with "C" when it discovered that the string was empty. A little judicious reordering of that chunk of code makes the issue go away. Showed up as failures of lib/locale_threads.t under valgrind / ASan.
* fix locale leaks on utf8 strings (David Mitchell, 2019-04-16; 1 file, -0/+2)
| | | | | | | | | | | | | | | | | | | | | | For example the following leaked: require POSIX; import POSIX ':locale_h'; setlocale(&POSIX::LC_ALL, 'aa_DJ.iso88591') or die; use locale; my $ok = 'A' lt chr 0x100; Some code in Perl__mem_collxfrm() does a couple of for (j = 1; j < 256; j++) { ... } loops where for each chr(j) character it recursively calls itself, and records the index of the 'smallest' / 'largest' result. However, when updating cur_min_x / cur_max_x, it wasn't freeing the previous value. The symptoms were that valgrind / Address Sanitizer found fault with lib/locale.t
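The class of bug described above, replacing a tracked "smallest so far" allocation without releasing the old one, can be shown in a few lines. This is a generic sketch, not the Perl__mem_collxfrm() code: `dup_string` and `keep_smallest` are hypothetical helpers, and plain `strcmp` stands in for the comparison of transformed strings.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: heap-allocate a copy of s (NULL on failure). */
static char *dup_string(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

/* Hypothetical helper: keep whichever of *cur_min and candidate sorts
 * first. Takes ownership of candidate; *cur_min may start as NULL. */
static void keep_smallest(char **cur_min, char *candidate)
{
    assert(cur_min != NULL && candidate != NULL);
    if (*cur_min == NULL || strcmp(candidate, *cur_min) < 0) {
        free(*cur_min);         /* the fix: release the value being replaced */
        *cur_min = candidate;
    } else {
        free(candidate);        /* the loser is released as well */
    }
}
```

Without the `free(*cur_min)` line, every improvement found by the loop would orphan the previous allocation, which is the kind of leak valgrind and Address Sanitizer report.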
* fix locale.c under -DPERL_GLOBAL_STRUCT_PRIVATE (David Mitchell, 2019-04-02; 1 file, -0/+1)
|
* perlapi: Add weasel word to make stmt accurate (Karl Williamson, 2019-03-27; 1 file, -1/+1)
| | | | | | It is possible to have a single-threaded build use the thread-safe locale setting operations. Add a word to indicate it's not 100% the other way.
* PATCH: [perl #133959] Free BSD broken tests (Karl Williamson, 2019-03-27; 1 file, -1/+1)
| | | | | | | | | | | | | | | | Commit 70bd6bc82ba64c1d197d3ec823f43c4a454b2920 fixed a leak (likely due to a bug in glibc) by not duplicating the C locale object. However, that meant that there's only one copy running around. And freeing that will cause havoc, as it's supposed to be there until destruction. What appears to be happening is that the current locale object is freed upon thread destruction, and that could be this global one. But I don't understand why it's only happening on Free BSD and only on this version. But this commit fixes the problem there, and makes sense. Simply don't free this global object upon thread destruction. This commit also changes it so it doesn't get destroyed at destruction time, leaving it to the final PERL_SYS_TERM to free. I'm not sure, but I think this fixes any issues with embedded perls.
* locale.c: White-space, comment only (Karl Williamson, 2019-03-21; 1 file, -45/+60)
| | | | | Indent a block newly formed in the previous commit. Wrap some too-long lines
* locale.c: Don't try to recreate the LC_ALL C locale (Karl Williamson, 2019-03-21; 1 file, -0/+23)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | On threaded perls, we create a locale object for LC_ALL "C" early in the startup phase. When the user asks for that locale, we can just switch to it instead of trying to create a new one. Doing the creation worked, but ended up with a memory leak. My guess, and it's only a guess, is that it's a bug in glibc newlocale.c, in which it does an early return, not doing proper cleanup, when it discovers it can re-use an existing locale without needing to create a new one. The reason I think it's a glibc bug is that the sample one-liner sent to me PERL_DESTRUCT_LEVEL=2 valgrind --leak-check=full ./perl -DLv -Ilib -e'require POSIX;POSIX::setlocale(&POSIX::LC_ALL, "C");' 2>&1 | more produced a stack output of where the leaked memory had been allocated. I put a print immediately after that line, and prints at the points where things get freed. Every allocation was matched by an attempt to free it. But clearly at least one failed. freelocale() returns void, so can't be checked for failing. Anyway, it's better to try not to create a new locale when we already have an existing one, and doing so, as this commit does, causes the leak to go away. No tests are added, as there are plenty of similar tests already in the suite, and they all should have been leaking.
* Add, improve some debugging stmts for -DL (locales) (Karl Williamson, 2019-03-21; 1 file, -1/+5)
|
* Properly handle systems with crippled locales (Karl Williamson, 2019-03-04; 1 file, -2/+5)
| | | | | | | | | | | | | | Some systems fake their locales, so that they pretend to accept a locale change, but they either do nothing, making everything the C locale, or on some systems there is a second locale "C-UTF-8" that can be switched to. Configure probes have been added to find such systems, and this commit changes to use the results of these probes, so that we don't try looking for other locales (any names we came up with would be accepted as valid, but don't work, and tests were failing as a result). Anything running the musl library fits, as does OpenBSD and its kin, as they view locales as security risks. This commit allows us to take out some code that was looking for particular OS's.
* locale.c: Tighten turkish locale tests on C99 platforms (Karl Williamson, 2019-03-04; 1 file, -0/+12)
| | | | | C99 has wide character case changing. If those are available, use them to be surer we have a Turkic locale.