| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
| |
Commit 68067e4e501e2ae1c0fb44558b6aa5c0a80a4143 inadvertently
broke regular expression /i matching under locale. The tests for this
were defective, so the breakage was not caught. A later commit will fix
the tests, but this commit restores the functionality.
It also casts the input parameter to some functions to be U8 to make
sure that optimizing compilers can omit bounds checks
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There were a few places that were doing
unsigned_var = cond ? signed_val : unsigned_val;
or similar. Fixed by suitable casts etc.
The four in utf8.c were fixed by assigning to an intermediate
unsigned var; this has the happy side-effect of collapsing
a large macro expansion, where toUPPER_LC() etc evaluate their arg
multiple times.
|
|
|
|
|
|
| |
This commit adds #if's to cause locale handling code to compile on
platforms that don't have full-featured locale handling. The commits
mentioned in the ticket did not adequately cover these situations.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a platform doesn't have nl_langinfo(), heuristics are employed
to see if a locale is UTF-8 . The first heuristic is looking at the
return value of setlocale(), which generally is the locale name.
However, in actuality the return value is opaque and can't be relied on
to signify the locale. Nevertheless if it contains the string UTF-8
(ignoring case, and with the hyphen optional), it is a safe bet that the
locale is indeed UTF-8. Prior to this patch, we only looked at the end
of the name for "UTF-8". This patch makes it not have to be
right-anchored. There are UTF-8 locales on our dromedary machine with
UTF-8 in the middle of their names.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit causes the locale initialization to skip calling
setlocal(foo, "") if the environment variable PERL_SKIP_LOCALE_INIT is
set. Instead, the setup code calls setlocale(LC_ALL, NULL) (plus other
similar calls for the subcategories) in order to find out what the
current locale is.
The original poster for this ticket has a workaround for it which
involves using a modified copy of Perl core code. This patch defines
the C preprocessor variable HAS_SKIP_LOCALE_INIT that can be used by XS
writers to discover if the current Perl version needs the workaround or
not.
I was unable to come up with a test for this patch that did not involve
building extensive infrastructure for testing embedded Perl. That does
not seem worth it for such a trivial patch. I tested by hand.
|
|
|
|
|
|
|
|
| |
This patch causes the radix string to be examined upon a new numeric
locale being set. If the string isn't ASCII, and the new locale is
UTF-8, it turns on the UTF-8 flag in the scalar that holds the radix.
When a floating point number is formatted in Perl_sv_vcatpvfn_flags(),
and the flag is on, the result's flag will be set on too.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In reality, the return value of setlocale() is documented to be opaque,
so using it to determine if a locale is UTF-8 or not may not work. It
is a char*, which we treat as a name. We can safely assume that if the
name contains UTF-8 (or slight variations thereof), that it is a UTF-8
locale. But if the name doesn't contain that, it still could be one.
In fact there are currently many locales on our dromedary machine that
fall into this category. Similarly, something containing 8859 is not
going to be UTF-8.
This commit adds another test for cases where there is no nl_langinfo(),
and the locale name isn't helpful. It looks at the currency symbol,
which typically will be in the locale's script. If that is illegal
UTF-8, we know for sure that the locale isn't UTF-8 (or is corrupted).
If it is legal UTF-8 (but not ASCII) we can be pretty sure that the
locale is UTF-8. If it is ASCII, we still don't know one way or the
other, so we err on it not being UTF-8.
Originally, I was going to use the locale's error message strings,
returned from strerror(), the source for $!, to check for this.
These are supposed to be in terms of the LC_MESSAGES locale. Chances
are vanishingly small that the locale is not UTF-8 if all the messages
pass a utf8ness test, provided that the messages aren't just ASCII.
However, on dromedary, the messages for many of the exotic locales
haven't been translated, and are still in English, which doesn't help at
all. I figure that this is quite likely to be the case generally, and
the currency symbol is much more likely to have been translated.
I left the code in though, commented out for possible future use.
Note that this test will run only on systems that don't have
nl_langinfo(). The test can also be turned off by setting a C compiler
flag -DNO_LOCALE_MONETARY, (and -DNO_LOCALE_MESSAGES for the
commented-out part), corresponding to the way the other categories can
be turned off (none of which is documented).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There was buggy code to see if the start-up locale is UTF-8. This
commit extracts it into a separate function.
The bugs involved looking at the name of the locale to see if that
implies a UTF-8 name. Prior to this commit, it looked at the
beginning of the locale name, whereas in reality, it is at the end, as
in "fr_FR.UTF8".
Also, it didn't look for the documented Windows name for UTF-8 locales
on those platforms.
The function is expanded to have an input category to find the utf8ness
of. Thus it now works on any non-LC_ALL category, not just LC_CTYPE.
It is possible for categories to be in different locales, so that
LC_CTYPE is in a UTF-8 locale, and LC_NUMERIC isn't. For the purposes
of PERL_UNICODE, the most applicable category is LC_CTYPE, so that is
the one used in its currently only call.
|
|
|
|
|
|
| |
Prior to this patch, one parameter to strNE would have been through a
standardizing function, while the other had not. By standardizing both
before doing the compare, we avoid false positives.
|
|
|
|
| |
This indents some nested #if's to clarify the program structure.
|
| |
|
|
|
|
|
| |
This updates the editor hints in our files for Emacs and vim to request
that tabs be inserted as spaces.
|
|
|
|
|
|
|
|
|
| |
097ee67dff1c60f2 didn't need to include <locale.h> in locale.c (then
util.c) because it had been included by perl.h since 5.002 beta 1
3f270f98f9305540 missed removing the include of <unistd.h> from perl.c
or perlio.c
de8ca8af19546d49 changed perl.h to also include <sys/wait.h>, but didn't
notice that it code therefore be removed from perl.c, pp_sys.c and util.c
|
| |
|
|
|
|
|
| |
Prefix it with "panic", report the two lengths that caused the sanity test
failure, and add the message to perldiag.pod.
|
| |
|
|
|
|
|
|
|
|
|
| |
# New Ticket Created by (Peter J. Acklam)
# Please include the string: [perl #81904]
# in the subject line of all future correspondence about this issue.
# <URL: http://rt.perl.org/rt3/Ticket/Display.html?id=81904 >
Signed-off-by: Abigail <abigail@abigail.be>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As discussed on p5p, ibcmp has different semantics from other cmp
functions in that it is a binary instead of ternary function. It is
less confusing then to have a name that implies true/false.
There are three functions affected: ibcmp, ibcmp_locale and ibcmp_utf8.
ibcmp is actually equivalent to foldNE, but for the same reason that things
like 'unless' and 'until' are cautioned against, I changed the functions
to foldEQ, so that the existing names, like ibcmp_utf8 are defined as
macros as being the complement of foldEQ.
This patch also changes the one file where turning ibcmp into a macro
causes problems. It changes it to use the new name. It also documents
for the first time ibcmp, ibcmp_locale and their new names.
|
|
|
|
| |
util.c don't need the interpreter as well
|
|
|
|
|
|
| |
Message-ID: <25940.1225611819@chthon>
Date: Sun, 02 Nov 2008 01:43:39 -0600
p4raw-id: //depot/perl@34698
|
|
|
| |
p4raw-id: //depot/perl@34585
|
|
|
|
|
|
|
|
|
|
|
|
| |
ability to create landmines that will explode under someone in the
future when they upgrade their compiler to one with better
optimisation. We've already done this at least twice.
(Yes, some of the assertions are after code that would already have
SEGVd because it already deferences a pointer, but they are put in
to make it easier to automate checking that each and every case is
covered.)
Add a tool, checkARGS_ASSERT.pl, to check that every case is covered.
p4raw-id: //depot/perl@33291
|
|
|
| |
p4raw-id: //depot/perl@32237
|
|
|
|
|
|
|
| |
Subject: locale.c usage of strxfrm
From: "Devin Heitmueller" <devin.heitmueller@gmail.com>
Message-ID: <412bdbff0704201520i7aac0189n74f0cef5c5213f41@mail.gmail.com>
p4raw-id: //depot/perl@31092
|
|
|
| |
p4raw-id: //depot/perl@27875
|
|
|
|
|
|
| |
Message-Id: <200604111908.k3BJ8ewn030950@kosh.hut.fi>
Date: Tue, 11 Apr 2006 22:08:40 +0300 (EEST)
p4raw-id: //depot/perl@27769
|
|
|
|
|
| |
Message-ID: <4438B854.6040301@gmail.com>
p4raw-id: //depot/perl@27750
|
|
|
|
|
| |
Message-ID: <20060221062711.GA16160@petdance.com>
p4raw-id: //depot/perl@27300
|
|
|
|
|
|
| |
Message-ID: <20060203152449.GI12591@accognoscere.homeunix.org>
Date: Fri, 3 Feb 2006 16:24:49 +0100
p4raw-id: //depot/perl@27065
|
|
|
|
|
| |
Message-ID: <20060202093849.GD12591@accognoscere.homeunix.org>
p4raw-id: //depot/perl@27054
|
|
|
|
|
| |
Message-ID: <43BE7C4D.1010302@gmail.com>
p4raw-id: //depot/perl@26675
|
|
|
| |
p4raw-id: //depot/perl@26652
|
|
|
|
|
| |
Can't use STR_WITH_LEN() as argument to a macro :-(
p4raw-id: //depot/perl@26649
|
|
|
| |
p4raw-id: //depot/perl@26645
|
|
|
|
|
| |
Message-ID: <20051205194613.GB7791@petdance.com>
p4raw-id: //depot/perl@26281
|
|
|
|
|
| |
Message-ID: <20051104211256.GA12651@petdance.com>
p4raw-id: //depot/perl@26028
|
|
|
| |
p4raw-id: //depot/perl@25898
|
|
|
|
|
|
|
|
| |
Message-ID: <42CC3CE9.5050606@divsol.com>
(reverted all dual-lived modules since they must work with older
perls too so must wait for a new Devel::PPPort)
p4raw-id: //depot/perl@25101
|
|
|
|
|
|
|
|
|
| |
little used code, code used only once per run (such as interpreter
construction and destruction), and cases where the pointer nearly
never is NULL. Safefree does its own non-NULL check, and even that
isn't strictly necessary as all conformant free()s accept a NULL
pointer.
p4raw-id: //depot/perl@25045
|
|
|
|
|
| |
Message-ID: <20050623190423.GA13835@petdance.com>
p4raw-id: //depot/perl@24965
|
|
|
|
|
|
| |
in read-only mode. Make vi modelines compatible with non-vim
vi versions.
p4raw-id: //depot/perl@24445
|
|
|
|
|
| |
(except the generated ones)
p4raw-id: //depot/perl@24440
|
|
|
|
|
| |
Message-ID: <B356D8F434D20B40A8CEDAEC305A1F2453D653@esebe105.NOE.Nokia.com>
p4raw-id: //depot/perl@24271
|
|
|
|
|
|
|
|
| |
Message-ID: <20050325231409.GB17660@petdance.com>
[with modification - the extra argument to incpush was supposed to
be being used]
p4raw-id: //depot/perl@24081
|
|
|
|
|
|
| |
because it generates smaller object code (as well as being
faster than a true function call)
p4raw-id: //depot/perl@23725
|
|
|
| |
p4raw-id: //depot/perl@23176
|
|
|
|
|
|
|
| |
(Lots of Perl 5 source code archaeology was involved.)
Larry didn't make strangled noises when I showed him
the patch, either :-)
p4raw-id: //depot/perl@19242
|
|
|
| |
p4raw-id: //depot/perl@18801
|