| Commit message | Author | Age | Files | Lines |
| |
This switches from the standard, generally unsafe, C library functions
to Perl's safer alternatives. This code, used only in debugging, doesn't
really need that safety, but I had forgotten that Perl makes it easy to
add, and it silences the warnings from t/porting/libperl.t about using
the C functions. Why this warning didn't show up in smoking, I don't
know.
Spotted by Dave Mitchell.
| |
This adds more stuff that gets dumped when debugging locale handling.
And it adds even more when the v modifier appears.
| |
This initialization is done before the processing of command line
arguments, so it has to be handled specially. This commit changes
the initialization code to output debugging information if the
environment variable PERL_DEBUG_LOCALE_INIT is set.
I don't see the need to document this outside the source, as anyone who
is using it would be reading the source anyway; it's of highly
specialized use.
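A minimal sketch of the idea, not the actual locale.c code: because no
command line switch has been parsed yet, the only available signal is the
environment. The helper names below are hypothetical, and treating an
empty value as "set" is an assumption.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: decide from the raw getenv() result.  Any value,
 * even an empty string, counts as "set" in this sketch. */
static bool debug_requested_from_value(const char *value)
{
    return value != NULL;
}

/* Hypothetical helper: consult the environment before any command line
 * argument has been looked at. */
static bool locale_init_debug_requested(void)
{
    return debug_requested_from_value(getenv("PERL_DEBUG_LOCALE_INIT"));
}
```

Initialization code would call locale_init_debug_requested() once and
emit its extra output only when the result is true.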
| |
| |
locale.c:
- the pointers are always NULL at this point, see
http://www.nntp.perl.org/group/perl.perl5.porters/2015/07/msg229533.html
pp.c:
- reduce the scope of temp_buffer and svrecode to an inner branch
- in some permutations, either temp_buffer or svrecode is never set to
non-NULL; where it is known that the variable hasn't been set yet, skip
the freeing calls at the end. This eliminates only some, not all, of the
permutations in which NULL is passed to Safefree and SvREFCNT_dec
regcomp.c:
- don't create a save stack entry to call Safefree(NULL); see the ticket
for this patch for some profiling stats
| |
Add a comment, indent some nested #if's
| |
An empty cpan/.dir-locals.el stops Emacs using the core defaults for
code imported from CPAN.
Committer's work:
To keep t/porting/cmp_version.t and t/porting/utils.t happy, $VERSION needed
to be incremented in many files, including throughout dist/PathTools.
perldelta entry for module updates.
Add two Emacs control files to MANIFEST; re-sort MANIFEST.
For: RT #124119.
| |
Some questions and loose ends:
XXX gv.c:S_gv_magicalize - why are we using SSize_t for paren?
XXX mg.c:Perl_magic_set - need appropriate error handling for $)
XXX regcomp.c:S_reg - need to check if we do the right thing if parno
was not grokked
Perl_get_debug_opts should probably return something unsigned; not sure
if that's something we can change.
| |
The variables in these statements were undefined when compiled with
ccflag -DNO_LOCALE, because the declarations are skipped then. Just
move the statements a few lines up so they are within the same #if.
| |
See https://rt.perl.org/Public/Bug/Display.html?id=123748.
This also changes a '0' into a FALSE when initializing a boolean, which
I consider clearer.
| |
| |
Extracted from patch submitted by Lajos Veres in RT #123693.
| |
A better comment is added. The #if is moved so that the rare
compilation that doesn't use LC_CTYPE generates no unused variable
warning.
| |
The bulk of this macro is extremely rarely executed, so it makes sense
to optimize for space, as it is called from a fair number of places, and
move as much as possible to a single function.
For whatever it's worth, on my system with my typical compilation
options, including -O0, the savings was 19640 bytes in regexec.o, 4528
in utf8.o, at a cost of 1488 in locale.o.
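The space trade-off described above can be sketched generically. This is
an illustration only: the macro and function names are hypothetical and
are not the ones changed by this commit. Each expansion of the macro
costs a test and a call, while the rarely executed bulk exists once.

```c
#include <stdio.h>

static unsigned long slow_path_calls = 0;

/* Out-of-line slow path: its code appears once in the object file, no
 * matter how many call sites use the macro below. */
void report_unexpected_state(const char *where)
{
    slow_path_calls++;
    fprintf(stderr, "unexpected state in %s\n", where);
}

/* Each expansion is only a cheap test plus a function call. */
#define CHECK_STATE(ok, where)                          \
    do {                                                \
        if (!(ok)) report_unexpected_state(where);      \
    } while (0)
```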
| |
I spotted this in code review. I didn't add a test for it, because to
expose the much more serious bug fixed by the previous commit, I had to
temporarily change the C code to force these extremely
unlikely-to-be-taken branches to execute.
| |
I got confused in writing this: the global needs to be cleared always,
and set to NULL.
| |
Commit 8c6180a91de91a1194f427fc639694f43a903a78 added a warning message
for when Perl determines that the locale the program has just switched
into is poorly supported. At the time it was thought that this
would be an extremely rare occurrence. However, a bug in HP-UX -
B.11.00/64 causes this message to be raised for the "C" locale. A
workaround was added that silenced those. However, before it got fixed,
this message would occur gobs of times when executing the test suite. It
was raised even if the script was not locale-aware, so that the
underlying locale was completely irrelevant. There is a good prospect
that someone using an older Asian locale as their default would get this
message inappropriately, even if they don't use locales, or switch to a
supported one before using them.
This commit causes the message to be raised only if it actually is
relevant. When not in the scope of 'use locale', the message is stored,
not raised. Upon the first locale-dependent operation within a bad
locale, the saved message is raised, and the storage cleared. I was
able to do this without adding extra branching to the main-line
non-locale execution code. This was done by adding regnodes which get
jumped to by switch statements, and refactoring some existing C tests so
they exclude non-locale right off the bat.
These changes would have been necessary for another locale warning that
I previously agreed to implement, and which is coming a few commits from
now.
I do not know of any way to add tests in the test suite for this. It is
in fact rare for modern locales to have these issues. The way I tested
this was to temporarily change the C code so that all locales are viewed
as defective, and manually note that the warnings came out where
expected, and only where expected.
I chose not to try to output this warning on any POSIX functions called.
I believe that all that are affected are deprecated or scheduled to be
deprecated anyway. And POSIX is closer to the hardware of the machine.
For convenience, I also don't output the message for some zero-length
pattern matches. If something is going to be matched, the message will
likely very soon be raised anyway.
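The store-then-raise behavior described in this message can be sketched
as follows. This is a simplified model, not the code from this commit:
all names are hypothetical, and raising immediately when already inside
locale scope is an assumption of the sketch.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static char *pending_warning = NULL;    /* saved message, or NULL */
static unsigned warnings_raised = 0;

static void raise_warning(const char *msg)
{
    warnings_raised++;
    fprintf(stderr, "%s\n", msg);
}

/* Called when a problematic locale is entered. */
void note_problematic_locale(const char *msg, bool in_locale_scope)
{
    /* Any older saved message is stale: always clear it first. */
    free(pending_warning);
    pending_warning = NULL;

    if (in_locale_scope) {
        raise_warning(msg);
        return;
    }

    /* Outside locale scope: keep a private copy for later. */
    size_t len = strlen(msg) + 1;
    pending_warning = malloc(len);
    if (pending_warning != NULL)
        memcpy(pending_warning, msg, len);
}

/* Called before the first locale-dependent operation. */
void before_locale_dependent_op(void)
{
    if (pending_warning == NULL)
        return;
    raise_warning(pending_warning);
    free(pending_warning);
    pending_warning = NULL;
}
```

A program that never performs a locale-dependent operation never sees the
message, which is the point of deferring it.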
| |
HP-UX - B.11.00/64 has a problem with the C locale that's only
noticeable from newly added warnings flooding the logs. This adds a
test to suppress them.
| |
is_ascii_string's name has misled me in the past; the new name is
clearer.
| |
On some systems, setlocale() uses static storage for the locale name it
returns, so that a subsequent setlocale() overwrites it.
Therefore, you must make a copy of the name if you want it to work after
the next setlocale().
Thanks to Craig Berry for finding and diagnosing this problem.
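A minimal sketch of the copy-before-reuse pattern, using plain malloc()
where Perl itself would use its own allocation routines; the helper name
is hypothetical.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Return a heap copy of the current locale name for 'category', or NULL
 * on failure.  The caller owns the result and must free() it. */
char *copy_current_locale_name(int category)
{
    /* Query only; the returned buffer may be static storage that the
     * next setlocale() call overwrites. */
    const char *current = setlocale(category, NULL);
    if (current == NULL)
        return NULL;

    size_t len = strlen(current) + 1;
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, current, len);
    return copy;
}
```

Typical use is to save the name, switch to another locale for some
operation, then pass the saved copy back to setlocale() and free it.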
| |
This reverts commit 1244bd171b8d1fd4b6179e537f7b95c38bd8f099,
thus reinstating commit 3d3a881c1b0eb9c855d257a2eea1f72666e30fbc.
| |
This reverts commit 3d3a881c1b0eb9c855d257a2eea1f72666e30fbc.
Win32 with a 1252 code page was failing blead. Revert until I have time
to look at it.
| |
Perl only supports single-byte locales (except for UTF-8 ones), and has
poor support for 7-bit locales that aren't supersets of ASCII (these
should be exceedingly rare these days).
This commit raises warnings in the new locale warning category when
such a locale is entered.
| |
Building a debugging perl triggered warnings such as
warning: format ‘%d’ expects argument of type ‘int’, but argument 3 has type ‘U32’
warning: field width specifier ‘*’ expects argument of type ‘int’, but argument 5 has type ‘long unsigned int’
warning: format ‘%x’ expects argument of type ‘unsigned int’, but argument 3 has type ‘wchar_t’
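One generic way to silence warnings of this kind is to cast each
argument to the exact type its conversion expects. The sketch below is
standalone C with a hypothetical function name; it does not show the
actual change made by this commit.

```c
#include <stdint.h>
#include <stdio.h>
#include <wchar.h>

/* Format three values whose natural types do not match their printf
 * conversions, casting each one explicitly. */
int format_debug_line(char *buf, size_t size,
                      uint32_t count, unsigned long width, wchar_t wc)
{
    return snprintf(buf, size, "%d|%*s|%x",
                    (int) count,         /* %d expects int */
                    (int) width, "ab",   /* '*' width expects int */
                    (unsigned int) wc);  /* %x expects unsigned int */
}
```

The casts assume the values fit in the target types, which is reasonable
for small counts and field widths in debugging output.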
| |
At least on the system that we have tested on. There are locales that
say they are UTF-8, but they're not; they're EBCDIC 1047.
| |
This adds to handy.h isALPHA_FOLD_EQ(c1,c2) which efficiently tests if
c1 and c2 are the same character, case-insensitively. For example
isALPHA_FOLD_EQ(c, 's') returns true if and only if <c> is 's' or 'S'.
isALPHA_FOLD_NE() is also added by this commit.
At least one of c1 and c2 must be known to be in [A-Za-z] or this macro
doesn't work properly. (There is an assert for this in the macro in
DEBUGGING builds). That is why the name includes "ALPHA", so you won't
forget when using it.
This functionality has been in regcomp.c for a while, under a different
name. I had thought that the only reason to make it more generally
available was potential speed gain, but recent gcc versions optimize to
the same code, so I thought there wasn't any point to doing so.
But I now think that using this makes things easier to read (and
certainly shorter to type in). Once you grok what this macro does, it
simplifies what you have to keep in your mind when reading logical
expressions with multiple operands. That something can be either upper
or lower case can be a distraction to understanding the larger point of
the expression.
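A standalone sketch of the technique, assuming an ASCII platform where
upper and lower case letters differ in exactly one bit. The function
names are hypothetical stand-ins for the handy.h macros, written as
functions so the arguments are evaluated only once.

```c
#include <assert.h>
#include <stdbool.h>

#define CASE_BIT ('A' ^ 'a')    /* 0x20 on ASCII platforms */

static bool is_ascii_alpha(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}

/* True if c1 and c2 are the same letter, ignoring case.  At least one
 * argument must be an ASCII letter; otherwise unrelated characters such
 * as '[' and '{' would compare equal. */
static bool alpha_fold_eq(unsigned char c1, unsigned char c2)
{
    assert(is_ascii_alpha(c1) || is_ascii_alpha(c2));
    return (c1 & ~CASE_BIT) == (c2 & ~CASE_BIT);
}

static bool alpha_fold_ne(unsigned char c1, unsigned char c2)
{
    return !alpha_fold_eq(c1, c2);
}
```

With this, a test such as (c == 's' || c == 'S') shrinks to
alpha_fold_eq(c, 's').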
| |
This trivial function is to be used by XS code when it changes the
program's locale. It hides the details from that code of what needs to
be done, which could change in the future.
| |
| |
The previous way to suppress messages wasn't working for all gcc
versions. Spotted by Jarkko Hietaniemi.
| |
Remaining atoi() uses include at least:
ext/DynaLoader/dl_aix.xs, os2/os2.c, vms/vms.c
| |
| |
This code extends the heuristics used to determine if a locale is UTF-8
or not on older platforms. It has been #ifdef'd out because it only
added a little value on dromedary. Now the previous commit has added
new heuristics, and tests on dromedary show that this adds nothing to
that. But I'm leaving it in the source in case it might ever prove
useful. In order to test it, I compiled it and found some problems with
the earlier version that this now fixes.
| |
On older platforms that conform to neither POSIX 2001 nor C99,
heuristics are employed to try to determine if a locale is UTF-8 or not.
This commit improves those heuristics by looking at the names of the
months and days of the week to see if they are UTF-8 or not. This is
done if looking at the currency symbol failed to help.
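The core of such a heuristic is a test of whether a locale-supplied
string (a month name, a day name, a currency symbol) is well-formed
UTF-8 and contains at least one non-ASCII character; pure ASCII gives no
evidence either way. The sketch below is a simplified structural check
with a hypothetical name, not Perl's own UTF-8 validation, and it does
not reject every overlong or surrogate encoding.

```c
#include <stdbool.h>
#include <stddef.h>

/* True if 's' is structurally valid UTF-8 and has at least one
 * multi-byte sequence. */
static bool looks_like_non_ascii_utf8(const char *s)
{
    const unsigned char *p = (const unsigned char *) s;
    bool saw_multibyte = false;

    while (*p != '\0') {
        size_t continuation_bytes;

        if (*p < 0x80) {                /* plain ASCII: no evidence */
            p++;
            continue;
        }
        else if (*p >= 0xC2 && *p <= 0xDF) continuation_bytes = 1;
        else if (*p >= 0xE0 && *p <= 0xEF) continuation_bytes = 2;
        else if (*p >= 0xF0 && *p <= 0xF4) continuation_bytes = 3;
        else return false;              /* not a valid start byte */

        p++;
        for (; continuation_bytes > 0; continuation_bytes--, p++) {
            /* Also stops safely at a premature terminating NUL. */
            if ((*p & 0xC0) != 0x80)
                return false;
        }
        saw_multibyte = true;
    }
    return saw_multibyte;
}
```

For example, the German month name "März" passes when encoded as UTF-8
but fails when encoded as Latin-1, where the lone 0xE4 byte is not
followed by continuation bytes.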
| |
Indent and outdent blocks of code to conform to newly formed or removed
braces
| |
On older platforms that are neither C99 nor POSIX 2001, locale.c uses
the currency symbol to try to see if a locale is UTF-8 or not. This
commit refactors that code somewhat to make it cleaner, which also fixes
several problems. The least of these was that it sometimes did a
setlocale() unnecessarily. Others are that in some circumstances it
called localeconv() and/or looked at the result while within the wrong
locale.
| |
This only affected runs with the -DL parameter to perl set.
| |
The interior-most function can return NULL. Currently savepv(), which is
the next outer function, handles this correctly, as does the function
outside that, but it is dangerous to rely on that behavior. So we test
for NULL before passing the pointer to those functions.
| |
In the function that determines if a POSIX locale is UTF-8 or not, if
either nl_langinfo or MB_CUR_MAX is defined, it can reliably determine
the answer. If neither is defined, it uses heuristics to figure
things out as best it can. The heuristic code doesn't add value on
platforms where one of the two symbols is defined, so it can just be
ifdef'd out there.
| |
Checking whether the currency symbol is UTF-8 should come ahead of
looking at the locale name.
| |
This section of code just returned generally. This commit changes it
so that it drops off the end if it can't determine if the current locale
is UTF-8 or not, so that additional tests can be added later. The
function defaults to not UTF-8 if this drops off the end, so there
should be no functionality change.
| |
Commit a39edc4c877304d4075679b1d8de1904671a9c37 got a parenthesis
misplaced, so it wasn't really looking at the next character as it was
supposed to.
| |
Outdent because the previous commit removed the enclosing block.
| |
These two functions are supposed to normally be called through macro
interfaces which check whether they actually should be called or not.
That means the conditionals removed by this commit are redundant from
the normal interface. By removing them, we allow the exceptional case
where the code should be executed unconditionally, to happen, by just
calling the functions directly, not using the macro interface.
| |
Perl uses three interpreter-level (but private) variables to keep track
of numeric locales. PL_numeric_name is the current underlying locale.
PL_numeric_standard is a boolean to indicate if we are switched to the C
(or POSIX) locale, and PL_numeric_local is a boolean to indicate if we
are switched to the underlying one. The reason there are two booleans is
that if the underlying locale is C, both can be true at the same time.
But the code that is being changed by this commit didn't realize this,
and could unnecessarily set the booleans to FALSE. This could cause
unnecessary switching of locales.
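The bookkeeping can be modeled with a small sketch. The struct and
function names are hypothetical, and a counter stands in for the real
setlocale() calls; the point is that when the underlying locale is C,
both flags stay true and no switch is ever needed.

```c
#include <stdbool.h>
#include <string.h>

struct numeric_state {
    const char *name;   /* underlying numeric locale */
    bool standard;      /* currently equivalent to C/POSIX */
    bool local;         /* currently equivalent to the underlying locale */
    unsigned switches;  /* how many real locale switches were needed */
};

static bool is_c_locale(const char *name)
{
    return strcmp(name, "C") == 0 || strcmp(name, "POSIX") == 0;
}

void numeric_init(struct numeric_state *st, const char *underlying)
{
    st->name = underlying;
    st->local = true;
    st->standard = is_c_locale(underlying);  /* both true for C */
    st->switches = 0;
}

void numeric_set_standard(struct numeric_state *st)
{
    if (st->standard)
        return;                 /* already there: no switch */
    st->switches++;             /* real code would call setlocale() */
    st->standard = true;
    st->local = is_c_locale(st->name);
}

void numeric_set_local(struct numeric_state *st)
{
    if (st->local)
        return;                 /* already there: no switch */
    st->switches++;             /* real code would call setlocale() */
    st->local = true;
    st->standard = is_c_locale(st->name);
}
```

Clearing either flag to false when the underlying locale is C would make
the next call count a switch that is not actually required.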
| |
| |
From Brian Fraser: "Technically, any Perl compiled with
-Accflags="-UUSE_LOCALE", or -Ui_locale -Ud_setlocale...
realistically, for Android".
| |
You need to configure with g++ *and* -Accflags=-DPERL_GLOBAL_STRUCT
or -Accflags=-DPERL_GLOBAL_STRUCT_PRIVATE to see any difference.
(g++ does not do the "post-annotation" form of "unused".)
The version code has some of these issues, reported upstream.
| |
The chunk is not MAD-related but instead locale stuff. I have no idea
why that chunk got removed (I used a combination of unifdef(1) and editor).
It's #if-0-ed, so no change of behavior either way, but let's keep
the code for now, since it seems to have "historical significance".
| |
MAD = Misc Attribute Decoration; unmaintained attempt at preserving
the Perl parse tree more faithfully so that automatic conversion to
Perl 6 would have been easier.
| |
Somehow the ! in this if () got dropped, and there were no tests to
catch it. Now both are remedied.