path: root/locale.c
Commit message (Author, Age; Files, Lines)
* Fix broken locale case-insensitive matching (Karl Williamson, 2013-12-03; 1 file, -5/+5)
| Commit 68067e4e501e2ae1c0fb44558b6aa5c0a80a4143 inadvertently broke regular expression /i matching under locale. The tests for this were defective, so the breakage was not caught. A later commit will fix the tests, but this commit restores the functionality.
| It also casts the input parameter to some functions to be U8 to make sure that optimizing compilers can omit bounds checks.
* fix -Wsign-compare in core (David Mitchell, 2013-11-29; 1 file, -2/+2)
| There were a few places that were doing
|     unsigned_var = cond ? signed_val : unsigned_val;
| or similar. Fixed by suitable casts etc.
| The four in utf8.c were fixed by assigning to an intermediate unsigned var; this has the happy side-effect of collapsing a large macro expansion, where toUPPER_LC() etc evaluate their arg multiple times.
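The kind of fix described above can be sketched in C as follows. This is only an illustration under assumptions: `pick_len` and its parameters are hypothetical names, not code from the Perl source, and the caller is assumed to guarantee that the signed value is non-negative.

```c
#include <stddef.h>

/* Hypothetical helper (not from the Perl source): choose between a signed
 * and an unsigned length without mixing signedness inside the ?: operator. */
static size_t pick_len(int use_signed, int signed_len, size_t unsigned_len)
{
    /* Before (compilers may warn under -Wsign-compare, because the two
     * arms of the conditional differ in signedness):
     *     size_t len = use_signed ? signed_len : unsigned_len;
     */

    /* After: convert the signed arm explicitly so both arms are size_t.
     * Assumption: signed_len is never negative here. */
    size_t len = use_signed ? (size_t)signed_len : unsigned_len;
    return len;
}
```

Assigning the result to an intermediate unsigned variable, as the commit mentions for utf8.c, also means a macro that evaluates its argument several times sees a plain variable rather than the whole conditional expression.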
* PATCH: [perl #119443] Blead won't compile on wince (Karl Williamson, 2013-08-23; 1 file, -4/+13)
| This commit adds #if's to cause locale handling code to compile on platforms that don't have full-featured locale handling. The commits mentioned in the ticket did not adequately cover these situations.
* locale.c: Rmv unused variable (Karl Williamson, 2013-08-12; 1 file, -1/+0)
|
* Assume UTF-8 locale if that string occurs anywhere in name (Karl Williamson, 2013-08-12; 1 file, -11/+23)
| When a platform doesn't have nl_langinfo(), heuristics are employed to see if a locale is UTF-8. The first heuristic is looking at the return value of setlocale(), which generally is the locale name. However, in actuality the return value is opaque and can't be relied on to signify the locale. Nevertheless, if it contains the string UTF-8 (ignoring case, and with the hyphen optional), it is a safe bet that the locale is indeed UTF-8.
| Prior to this patch, we only looked at the end of the name for "UTF-8". This patch makes it not have to be right-anchored. There are UTF-8 locales on our dromedary machine with UTF-8 in the middle of their names.
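A minimal sketch of the unanchored name heuristic, assuming a hypothetical helper called `name_suggests_utf8` (the real function in locale.c is not shown in this log and may differ):

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: return 1 if `name` contains "UTF-8" or "UTF8"
 * anywhere, ignoring case; return 0 otherwise. */
static int name_suggests_utf8(const char *name)
{
    const char *p;

    if (name == NULL)
        return 0;

    for (p = name; *p != '\0'; p++) {
        const char *q = p;

        /* Each check runs only if the previous byte matched, so we never
         * read past the terminating NUL. */
        if (toupper((unsigned char)q[0]) != 'U') continue;
        if (toupper((unsigned char)q[1]) != 'T') continue;
        if (toupper((unsigned char)q[2]) != 'F') continue;
        q += 3;
        if (*q == '-')      /* the hyphen is optional */
            q++;
        if (*q == '8')
            return 1;
    }
    return 0;
}
```

Under this sketch, `"fr_FR.UTF-8"`, `"en_US.utf8@euro"` and a name with the string in the middle all match, while `"de_DE.ISO8859-1"` does not.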
* locale.c: Add missing STATIC to fcn decl (Karl Williamson, 2013-07-19; 1 file, -1/+1)
|
* PATCH: [perl #38193] embedded perl always calls setlocale(LC_ALL,"") (Karl Williamson, 2013-07-09; 1 file, -8/+12)
| This commit causes the locale initialization to skip calling setlocale(foo, "") if the environment variable PERL_SKIP_LOCALE_INIT is set. Instead, the setup code calls setlocale(LC_ALL, NULL) (plus other similar calls for the subcategories) in order to find out what the current locale is.
| The original poster for this ticket has a workaround for it which involves using a modified copy of Perl core code. This patch defines the C preprocessor variable HAS_SKIP_LOCALE_INIT that can be used by XS writers to discover if the current Perl version needs the workaround or not.
| I was unable to come up with a test for this patch that did not involve building extensive infrastructure for testing embedded Perl. That does not seem worth it for such a trivial patch. I tested by hand.
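The difference between the two setlocale() calls can be sketched like this. The helper name `init_or_query_locale` and its `skip_init` parameter are assumptions made for illustration; they do not appear in the Perl source.

```c
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: either initialize the locale from the environment
 * or merely query the current one.
 *   skip_init != 0 : query only; the program's locale is left unchanged.
 *   skip_init == 0 : setlocale(LC_ALL, "") switches to the locale named by
 *                    the environment (LC_ALL, LANG, and so on). */
static const char *init_or_query_locale(int skip_init)
{
    if (skip_init)
        return setlocale(LC_ALL, NULL);   /* NULL means "just tell me" */
    return setlocale(LC_ALL, "");
}
```

An embedding application following the commit's description would derive the flag from the environment, for example `init_or_query_locale(getenv("PERL_SKIP_LOCALE_INIT") != NULL)` with `<stdlib.h>` included, so that a locale the host program has already set up is not overwritten.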
* PATCH: [perl #118197] Cope with non-ASCII decimal separators (Karl Williamson, 2013-07-07; 1 file, -0/+6)
| This patch causes the radix string to be examined upon a new numeric locale being set. If the string isn't ASCII, and the new locale is UTF-8, it turns on the UTF-8 flag in the scalar that holds the radix. When a floating point number is formatted in Perl_sv_vcatpvfn_flags(), and the flag is on, the result's flag will be set on too.
* locale.c: Further checks for utf8ness of a locale (Karl Williamson, 2013-07-05; 1 file, -0/+154)
| In reality, the return value of setlocale() is documented to be opaque, so using it to determine if a locale is UTF-8 or not may not work. It is a char*, which we treat as a name. We can safely assume that if the name contains UTF-8 (or slight variations thereof), it is a UTF-8 locale. But if the name doesn't contain that, it still could be one. In fact there are currently many locales on our dromedary machine that fall into this category. Similarly, something containing 8859 is not going to be UTF-8.
| This commit adds another test for cases where there is no nl_langinfo(), and the locale name isn't helpful. It looks at the currency symbol, which typically will be in the locale's script. If that is illegal UTF-8, we know for sure that the locale isn't UTF-8 (or is corrupted). If it is legal UTF-8 (but not ASCII) we can be pretty sure that the locale is UTF-8. If it is ASCII, we still don't know one way or the other, so we err on the side of it not being UTF-8.
| Originally, I was going to use the locale's error message strings, returned from strerror(), the source for $!, to check for this. These are supposed to be in terms of the LC_MESSAGES locale. Chances are vanishingly small that the locale is not UTF-8 if all the messages pass a utf8ness test, provided that the messages aren't just ASCII. However, on dromedary, the messages for many of the exotic locales haven't been translated, and are still in English, which doesn't help at all. I figure that this is quite likely to be the case generally, and the currency symbol is much more likely to have been translated. I left the code in though, commented out for possible future use.
| Note that this test will run only on systems that don't have nl_langinfo(). The test can also be turned off by setting a C compiler flag -DNO_LOCALE_MONETARY (and -DNO_LOCALE_MESSAGES for the commented-out part), corresponding to the way the other categories can be turned off (none of which is documented).
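The three-way decision on the currency symbol can be sketched as below. The enum, the function name `classify_bytes`, and the simplified validity check are all assumptions made for illustration: the check only looks at lead-byte ranges and continuation bytes, so it does not reject every overlong or surrogate encoding the way a full validator would.

```c
#include <stddef.h>

/* Hypothetical result type for the heuristic. */
enum utf8_guess {
    GUESS_NOT_UTF8,       /* illegal UTF-8 seen: locale is not UTF-8 */
    GUESS_UNKNOWN,        /* pure ASCII: no evidence either way      */
    GUESS_PROBABLY_UTF8   /* legal, non-ASCII UTF-8 seen             */
};

/* Classify a NUL-terminated byte string such as a currency symbol. */
static enum utf8_guess classify_bytes(const char *str)
{
    const unsigned char *s = (const unsigned char *)str;
    int saw_non_ascii = 0;

    while (*s != '\0') {
        size_t extra;   /* continuation bytes expected after the lead byte */

        if (*s < 0x80) { s++; continue; }            /* ASCII */

        if (*s >= 0xC2 && *s <= 0xDF)      extra = 1;
        else if (*s >= 0xE0 && *s <= 0xEF) extra = 2;
        else if (*s >= 0xF0 && *s <= 0xF4) extra = 3;
        else return GUESS_NOT_UTF8;                  /* invalid lead byte */

        s++;
        while (extra-- > 0) {
            /* A NUL fails this test too, so we never run off the end. */
            if ((*s & 0xC0) != 0x80)
                return GUESS_NOT_UTF8;
            s++;
        }
        saw_non_ascii = 1;
    }
    return saw_non_ascii ? GUESS_PROBABLY_UTF8 : GUESS_UNKNOWN;
}
```

In use, the string would come from `localeconv()->currency_symbol` after the monetary category has been set. A caller that must give a yes/no answer would treat `GUESS_UNKNOWN` as "not UTF-8", matching the commit's choice to err on that side.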
* locale.c: Extract out, fix, expand fcn to see if a locale is utf8 (Karl Williamson, 2013-07-05; 1 file, -40/+121)
| There was buggy code to see if the start-up locale is UTF-8. This commit extracts it into a separate function. The bugs involved looking at the name of the locale to see if that implies a UTF-8 locale. Prior to this commit, it looked at the beginning of the locale name, whereas in reality, it is at the end, as in "fr_FR.UTF8". Also, it didn't look for the documented Windows name for UTF-8 locales on those platforms.
| The function is expanded to have an input category to find the utf8ness of. Thus it now works on any non-LC_ALL category, not just LC_CTYPE. It is possible for categories to be in different locales, so that LC_CTYPE is in a UTF-8 locale, and LC_NUMERIC isn't. For the purposes of PERL_UNICODE, the most applicable category is LC_CTYPE, so that is the one used in its only current call.
* locale.c: Compare apples to apples (Karl Williamson, 2013-07-05; 1 file, -4/+9)
| Prior to this patch, one parameter to strNE would have been through a standardizing function, while the other had not. By standardizing both before doing the compare, we avoid false positives.
* perl.h, locale.c: White space only (Karl Williamson, 2013-07-05; 1 file, -8/+8)
| This indents some nested #if's to clarify the program structure.
* locale.c: Add comments (Karl Williamson, 2013-07-05; 1 file, -3/+4)
|
* update the editor hints for spaces, not tabs (Ricardo Signes, 2012-05-29; 1 file, -2/+2)
| This updates the editor hints in our files for Emacs and vim to request that tabs be inserted as spaces.
* Don't #include headers already included by perl.h (Nicholas Clark, 2011-09-15; 1 file, -4/+0)
| 097ee67dff1c60f2 didn't need to include <locale.h> in locale.c (then util.c) because it had been included by perl.h since 5.002 beta 1
| 3f270f98f9305540 missed removing the include of <unistd.h> from perl.c or perlio.c
| de8ca8af19546d49 changed perl.h to also include <sys/wait.h>, but didn't notice that it could therefore be removed from perl.c, pp_sys.c and util.c
* When probing strxfrm, consider a consistent return value of 0 as sane (Nicholas Clark, 2011-09-09; 1 file, -1/+1)
|
* Provide more information in the message for "strxfrm() gets absurd". (Nicholas Clark, 2011-09-09; 1 file, -1/+2)
| Prefix it with "panic", report the two lengths that caused the sanity test failure, and add the message to perldiag.pod.
* Convert some files from Latin-1 to UTF-8 (Keith Thompson, 2011-09-07; 1 file, -2/+2)
|
* Fix typos (spelling errors) in Perl sources. (Peter J. Acklam (via RT), 2011-01-07; 1 file, -1/+1)
| # New Ticket Created by (Peter J. Acklam)
| # Please include the string: [perl #81904]
| # in the subject line of all future correspondence about this issue.
| # <URL: http://rt.perl.org/rt3/Ticket/Display.html?id=81904 >
| Signed-off-by: Abigail <abigail@abigail.be>
* Change name of ibcmp to foldEQ (Karl Williamson, 2010-06-05; 1 file, -8/+8)
| As discussed on p5p, ibcmp has different semantics from other cmp functions in that it is a binary instead of ternary function. It is less confusing then to have a name that implies true/false.
| There are three functions affected: ibcmp, ibcmp_locale and ibcmp_utf8.
| ibcmp is actually equivalent to foldNE, but for the same reason that things like 'unless' and 'until' are cautioned against, I changed the functions to foldEQ, so that the existing names, like ibcmp_utf8, are defined as macros that are the complement of foldEQ.
| This patch also changes the one file where turning ibcmp into a macro causes problems. It changes it to use the new name. It also documents for the first time ibcmp, ibcmp_locale and their new names.
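The relationship between the old and new names can be sketched as follows. The `my_` prefixed names are hypothetical stand-ins written for this illustration; they are not the Perl core functions, and the ASCII-only folding via `tolower()` is an assumption.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-in for foldEQ: returns 1 (true) when the first `len`
 * bytes of a and b are equal ignoring case, else 0 (false). */
static int my_foldEQ(const char *a, const char *b, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        if (tolower((unsigned char)a[i]) != tolower((unsigned char)b[i]))
            return 0;
    }
    return 1;
}

/* The old name survives as a macro that is the complement of the new
 * function: true means "the strings differ", like a foldNE would. */
#define my_ibcmp(a, b, len) (!my_foldEQ((a), (b), (len)))
```

With these definitions, `my_foldEQ("Perl", "pERL", 4)` is true while `my_ibcmp("Perl", "pERL", 4)` is false, which shows why a name ending in EQ reads more naturally than one resembling strcmp-style three-way comparison.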
* delimcopy(), ibcmp(), ibcmp_locale(), instr(), ninstr() and rninstr() from util.c don't need the interpreter as well (Vincent Pit, 2009-08-27; 1 file, -8/+8)
|
* PATCH: Large omnibus patch to clean up the JRRT quotes (Tom Christiansen, 2008-11-02; 1 file, -7/+9)
| Message-ID: <25940.1225611819@chthon>
| Date: Sun, 02 Nov 2008 01:43:39 -0600
| p4raw-id: //depot/perl@34698
* Update copyright years. (Nicholas Clark, 2008-10-25; 1 file, -2/+2)
| p4raw-id: //depot/perl@34585
* assert() that every NN argument is not NULL. (Nicholas Clark, 2008-02-12; 1 file, -0/+7)
| Otherwise we have the ability to create landmines that will explode under someone in the future when they upgrade their compiler to one with better optimisation. We've already done this at least twice. (Yes, some of the assertions are after code that would already have SEGVd because it already dereferences a pointer, but they are put in to make it easier to automate checking that each and every case is covered.)
| Add a tool, checkARGS_ASSERT.pl, to check that every case is covered.
| p4raw-id: //depot/perl@33291
* Fix up copyright years for files modified in 2007. (Nicholas Clark, 2007-11-07; 1 file, -2/+2)
| p4raw-id: //depot/perl@32237
* strxfrm() returns a size_t, not a ssize_t. (Devin Heitmueller, 2007-04-26; 1 file, -2/+2)
| See:
| Subject: locale.c usage of strxfrm
| From: "Devin Heitmueller" <devin.heitmueller@gmail.com>
| Message-ID: <412bdbff0704201520i7aac0189n74f0cef5c5213f41@mail.gmail.com>
| p4raw-id: //depot/perl@31092
* Turn on UTF8 cache assertions with -Ca (Nicholas Clark, 2006-04-17; 1 file, -0/+2)
| p4raw-id: //depot/perl@27875
* locale.c: more Safefree() (Coverity finding) (Jarkko Hietaniemi, 2006-04-11; 1 file, -0/+6)
| Message-Id: <200604111908.k3BJ8ewn030950@kosh.hut.fi>
| Date: Tue, 11 Apr 2006 22:08:40 +0300 (EEST)
| p4raw-id: //depot/perl@27769
* Re: [PATCH] locale.c: Coverity finding (Jarkko Hietaniemi, 2006-04-09; 1 file, -0/+3)
| Message-ID: <4438B854.6040301@gmail.com>
| p4raw-id: //depot/perl@27750
* unused context warnings (Andy Lester, 2006-02-24; 1 file, -0/+1)
| Message-ID: <20060221062711.GA16160@petdance.com>
| p4raw-id: //depot/perl@27300
* Re: [PATCH] s/Null(gv|hv|sv)/NULL/g (Steven Schubiger, 2006-02-03; 1 file, -2/+2)
| Message-ID: <20060203152449.GI12591@accognoscere.homeunix.org>
| Date: Fri, 3 Feb 2006 16:24:49 +0100
| p4raw-id: //depot/perl@27065
* Re: [PATCH] s/Null(av|ch)/NULL/g (Steven Schubiger, 2006-02-02; 1 file, -6/+6)
| Message-ID: <20060202093849.GD12591@accognoscere.homeunix.org>
| p4raw-id: //depot/perl@27054
* sprinkle dVAR (Jarkko Hietaniemi, 2006-01-06; 1 file, -0/+7)
| Message-ID: <43BE7C4D.1010302@gmail.com>
| p4raw-id: //depot/perl@26675
* More copyright updates (Rafael Garcia-Suarez, 2006-01-04; 1 file, -1/+1)
| p4raw-id: //depot/perl@26652
* Make the new STR_WITH_LEN() affected compile under -Dusethreads. (Gisle Aas, 2006-01-04; 1 file, -8/+8)
| Can't use STR_WITH_LEN() as argument to a macro :-(
| p4raw-id: //depot/perl@26649
* Get rid of a few more hardcoded string lengths. (Gisle Aas, 2006-01-04; 1 file, -8/+8)
| p4raw-id: //depot/perl@26645
* More consting, and putting stuff in embed.fnc (Andy Lester, 2005-12-06; 1 file, -1/+1)
| Message-ID: <20051205194613.GB7791@petdance.com>
| p4raw-id: //depot/perl@26281
* Consting and localizing: Part LXVIII (Andy Lester, 2005-11-07; 1 file, -6/+5)
| Message-ID: <20051104211256.GA12651@petdance.com>
| p4raw-id: //depot/perl@26028
* init_i18nl14n is a mathom. (Nicholas Clark, 2005-10-30; 1 file, -7/+0)
| p4raw-id: //depot/perl@25898
* Re: janitorial work ? [patch] (Jim Cromie, 2005-07-08; 1 file, -1/+1)
| Message-ID: <42CC3CE9.5050606@divsol.com>
| (reverted all dual-lived modules since they must work with older perls too so must wait for a new Devel::PPPort)
| p4raw-id: //depot/perl@25101
* Don't check the pointer is non-NULL before calling Safefree() in little used code, code used only once per run (such as interpreter construction and destruction), and cases where the pointer nearly never is NULL. (Nicholas Clark, 2005-07-02; 1 file, -10/+5)
| Safefree does its own non-NULL check, and even that isn't strictly necessary as all conformant free()s accept a NULL pointer.
| p4raw-id: //depot/perl@25045
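The simplification rests on the C standard's guarantee that free(NULL) does nothing. A minimal sketch using plain free() in place of Perl's Safefree(); the struct and function names are hypothetical and chosen only for this example:

```c
#include <stdlib.h>

/* Hypothetical holder for two heap-allocated locale names. */
struct locale_names {
    char *ctype;
    char *collate;
};

static void free_locale_names(struct locale_names *n)
{
    /* Before:
     *     if (n->ctype)   free(n->ctype);
     *     if (n->collate) free(n->collate);
     * After: the guards are redundant, because free(NULL) is a no-op. */
    free(n->ctype);
    free(n->collate);

    /* Reset the pointers so calling this again is harmless. */
    n->ctype = NULL;
    n->collate = NULL;
}
```

The unguarded form is shorter and has one less branch at the call site, which is the trade the commit makes for code that runs rarely or where the pointer is almost never NULL.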
* Pre-YAPC consting fun (Andy Lester, 2005-06-23; 1 file, -22/+17)
| Message-ID: <20050623190423.GA13835@petdance.com>
| p4raw-id: //depot/perl@24965
* Include vim/emacs modelines in generated files to open them in read-only mode. (Rafael Garcia-Suarez, 2005-05-11; 1 file, -2/+2)
| Make vi modelines compatible with non-vim vi versions.
| p4raw-id: //depot/perl@24445
* Add editor boilerplates to all C files (Rafael Garcia-Suarez, 2005-05-10; 1 file, -0/+9)
| (except the generated ones)
| p4raw-id: //depot/perl@24440
* Symbian port of Perl (Jarkko Hietaniemi, 2005-04-21; 1 file, -1/+3)
| Message-ID: <B356D8F434D20B40A8CEDAEC305A1F2453D653@esebe105.NOE.Nokia.com>
| p4raw-id: //depot/perl@24271
* Consting five (Andy Lester, 2005-03-25; 1 file, -1/+2)
| Message-ID: <20050325231409.GB17660@petdance.com>
| [with modification - the extra argument to incpush was supposed to be being used]
| p4raw-id: //depot/perl@24081
* strEQ/strNE of 1 character strings seems better hand inlined, because it generates smaller object code (as well as being faster than a true function call) (Nicholas Clark, 2005-01-01; 1 file, -3/+5)
| p4raw-id: //depot/perl@23725
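Hand inlining a one-character string comparison might look like the sketch below. Both function names are hypothetical, and the choice of "C" as the string being tested is an assumption made for illustration rather than something stated in the commit message.

```c
#include <string.h>

/* Library-call version: strEQ(name, "C") in Perl terms, which expands to
 * a strcmp() call. */
static int is_C_locale_call(const char *name)
{
    return strcmp(name, "C") == 0;
}

/* Hand-inlined version: compare the one character and the terminating NUL
 * directly. The second test is evaluated only if the first byte is 'C',
 * so it never reads past the end of an empty string. */
static int is_C_locale_inlined(const char *name)
{
    return name[0] == 'C' && name[1] == '\0';
}
```

Both versions give the same answer for every NUL-terminated input; the inlined one simply avoids the function call, which is the benefit the commit describes.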
* Add comment to the top of most .c files explaining their purpose (Dave Mitchell, 2004-07-31; 1 file, -0/+5)
| p4raw-id: //depot/perl@23176
* Fix up Larry's copyright statements to my best knowledge. (Jarkko Hietaniemi, 2003-04-16; 1 file, -1/+2)
| (Lots of Perl 5 source code archaeology was involved.)
| Larry didn't make strangled noises when I showed him the patch, either :-)
| p4raw-id: //depot/perl@19242
* Update all copyrights to 2003, from Jarkko (Hugo van der Sanden, 2003-03-02; 1 file, -1/+1)
| p4raw-id: //depot/perl@18801