| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As noted in the previous commit, some library functions now keep
per-thread state. So far the only ones we care about are libc
locale-changing ones.
When perl changes threads by swapping out tTHX, those library functions
need to be informed about the new value so that they remain in sync with
what perl thinks the locale should be.
This commit creates a function to do this, and changes the
thread-changing macros to also call this as part of the change.
For POSIX 2008, the function just calls uselocale() using the
per-interpreter object introduced previously.
For Windows, this commit adds a per-interpreter string of the current
LC_ALL, and the function calls setlocale on that. We keep the same
string for POSIX 2008 implementations that lack querylocale(), so this
commit just enables that variable on Windows as well. The code is
already in place to free the memory the string occupies when done.
The commit also creates a mechanism to skip this during thread
destruction. A thread in its death throes doesn't need to have accurate
locale information, and the information needed to map from thread to
what libc needs to know gets destroyed as part of those throes, while
relics of the thread remain. I couldn't find a way to accurately know
if we are dealing with a relic or not, so the solution I adopted was to
just not switch during destruction.
This commit completes fixing #20155.
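
The POSIX 2008 case described above can be sketched roughly as follows. This is only an illustration, not the code in locale.c: struct interp, current_interp, in_destruct and switch_context() are hypothetical names standing in for the interpreter structure, tTHX, the destruction flag and the thread-changing macros.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#ifdef __APPLE__
#  include <xlocale.h>
#endif

/* Hypothetical stand-in for a perl interpreter structure. */
struct interp {
    locale_t cur_locale_obj;  /* per-interpreter POSIX 2008 locale object */
    int in_destruct;          /* nonzero while the interpreter is dying */
};

/* Plays the role of tTHX in this sketch. */
static struct interp *current_interp = NULL;

/* Change the "current thread" and keep libc's per-thread locale in sync. */
static void
switch_context(struct interp *next)
{
    assert(next != NULL);
    current_interp = next;

    /* During destruction the mapping to the locale object may already be
     * gone, so do not touch libc's state at all. */
    if (next->in_destruct || next->cur_locale_obj == (locale_t) 0)
        return;

    uselocale(next->cur_locale_obj);
}
```

The design point is that the libc call sits in the same place as the pointer swap, so no caller can change contexts without also updating libc.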
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a step in solving #20155.
The POSIX 2008 locale API introduces per-thread locales. But the
previous global locale system is retained, probably for backward
compatibility.
The POSIX 2008 interface causes memory to be malloc'd that needs to be
freed. In order to do this, the caller must first stop using that
memory, by switching to another locale. perl accomplishes this during
termination by switching to the global locale, which is always available
and doesn't need to be freed.
Perl has long assumed that all that was needed to switch threads was to
change out tTHX. That's because that structure was intended to hold all
the information for a given thread. But it turns out that this doesn't
work when some library independently holds information about the
thread's state. And there are now some libraries that do that.
What was happening in this case was that perl thought that it was
sufficient to switch tTHX to change to a different thread in order to do
the freeing of memory, and then used the POSIX 2008 function to change
to the global locale so that the memory could be safely freed. But the
POSIX 2008 function doesn't care about tTHX, and actually was typically
operating on a different thread, and so changed that thread to the global
locale instead of the intended thread. Often that was the top-level
thread, thread 0. That caused whatever thread it was to no longer be in
the expected locale, and to no longer be thread-safe with regards to
locales.
This commit causes locale_term(), which has always been called from the
actual terminating thread that POSIX 2008 knows about, to change to the
global locale and free the memory.
It also creates a new per-interpreter variable that effectively maps the
tTHX thread to the associated POSIX 2008 memory. During
perl_destruct(), it frees the memory this variable points to, instead of
blindly assuming the memory to free is the current tTHX thread's.
This fixes the symptoms associated with #20155, but doesn't solve the
whole problem. In general, a library that has independent thread status
needs to be updated to the new thread when Perl changes threads using
tTHX. Future commits will do this.
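
The ordering constraint described above (stop using the locale object, then free it) can be sketched like this. thread_locale_term() is a hypothetical name, not perl's locale_term(), and the sketch assumes it runs on the thread that installed the object with uselocale().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#ifdef __APPLE__
#  include <xlocale.h>
#endif

/* Release a POSIX 2008 locale object owned by the calling thread.
 * Hypothetical helper; must be called from the thread that uses *obj. */
static void
thread_locale_term(locale_t *obj)
{
    assert(obj != NULL);

    if (*obj == (locale_t) 0 || *obj == LC_GLOBAL_LOCALE)
        return;                 /* nothing that needs freeing */

    /* uselocale() with a zero argument only queries this thread's locale.
     * If this thread is still using the object, move it to the global
     * locale, which is always available and never needs to be freed. */
    if (uselocale((locale_t) 0) == *obj)
        uselocale(LC_GLOBAL_LOCALE);

    freelocale(*obj);
    *obj = (locale_t) 0;        /* guard against a second free */
}
```

Because uselocale() acts on the calling thread only, running this from any other thread would switch the wrong thread to the global locale, which is the failure described in this commit message.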
|
|
|
|
|
| |
This prevents some unnecessary steps that the next commit would turn
into memory leaks.
|
|
|
|
|
| |
This is in preparation for it to be used in more instances in future
commits. It uses a symbol that won't be defined until those commits.
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some configurations require us to store the current locale for each
category. Prior to this commit, this was done in the array
PL_curlocales, with the entry for LC_ALL being in the highest element.
Future commits will need just the value for LC_ALL in some other
configurations, without needing the rest of the array. This commit
splits off the LC_ALL element into its own per-interpreter variable to
accommodate those. It always needed special handling anyway, beyond
that of the rest of the array elements.
|
|
|
|
|
|
| |
This just moves some code out of #ifdefs so that the compiler sees
it, decides it is always false, and almost certainly won't generate any
code for it, but stops warning.
|
|
|
|
|
|
| |
Other platforms declare the nl_item typedef an int, but this one makes
it a long. To portably output its value, cast it to a long and use the
%ld format.
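
A minimal sketch of the cast described above. format_nl_item() is a hypothetical helper, not a function in locale.c; it only shows the (long) cast paired with %ld.

```c
#include <assert.h>
#include <langinfo.h>
#include <stdio.h>

/* Write the numeric value of an nl_item into buf.  nl_item may be an int
 * on one platform and a long on another, so cast to long and print with
 * %ld to get a format that is correct for both. */
static int
format_nl_item(char *buf, size_t size, nl_item item)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%ld", (long) item);
}
```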
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Warning fixed:
locale.c:130:55: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 4 has type ‘int’ [-Wformat=]
130 | dSAVE_ERRNO; dTHX; PerlIO_printf(Perl_debug_log, "\n%s: %" LINE_Tf ": ", \
| ^~~~~~~~~
This warning might appear only on 32-bit builds.
|
|
|
|
|
|
| |
C99 allows declarations to be closer to their first use. This also
removes a redundant conditional that would set a variable to what it
already was initialized to.
|
|
|
|
|
| |
Adds a bit of safety, and makes it correspond to the other setlocale
returns we use.
|
|
|
|
| |
And move declarations closer to first use, as allowed in C99.
|
|
|
|
|
|
|
|
|
| |
This changes these functions to take the code page as input, instead of
being just UTF-8. Macros are created to call them with UTF-8.
I'm doing this because there is no loss of efficiency, and it is
somewhat jarring, given Perl terminology, to call a function with 'Byte'
in the name with a parameter with 'utf8' in the name.
|
|
|
|
|
|
|
| |
These are non-API, used in this file, and because of #ifdefs, not
accessible outside it, so there is no current need to make them publicly
available. If we were ever to need them to be accessible more widely,
they would not belong in this file.
|
|
|
|
|
|
| |
The wide setlocale function in Windows has been in the field since 5.32,
long enough that we won't be forced to discontinue its use. So we can
remove the never-used overrides, cleaning the code up slightly.
|
|
|
|
|
|
| |
This gets the trivial case out of the way, and can use plain setlocale,
as the locale string is non-existent, so doesn't need to handle
different character sets.
|
|
|
|
|
|
|
| |
The previous commit changed find_locale_from_environment() to work on
Windows, and took care to not make the function have side effects. But
in the only use of this function so far (and likely forever), those side
effects are fine. Changing to allow them simplifies things.
|
|
|
|
|
|
| |
There is code in locale.c to emulate POSIX 'setlocale(foo, "")'. And
there is separate code to emulate this on Windows. This commit
collapses them, ensuring the same algorithm is used on both systems.
|
|
|
|
|
|
| |
This changes this function a bit to make the next commit easier, which
will extend the function to being usable from Windows. This also moves
declarations closer to first use, as now allowed in C99.
|
|
|
|
|
| |
This is in preparation for this function to be used under more
circumstances.
|
|
|
|
| |
This makes the calls to it cleaner.
|
|
|
|
|
|
|
|
|
|
|
| |
Locale names are supposed to be opaque to the calling program. The
only requirement is that any name output by libc means the same as input
to that libc. And it makes sense: you might very well want to have a
locale name in your native language. This commit changes locale.c to
not impose any restrictions on the name proper. (It should be noted,
however, other Standards have come along that specify a particular
syntax using only ASCII. Perl needn't, and shouldn't, impose those
further restrictions.)
|
|
|
|
| |
This makes it easier to understand what's going on in threaded perls.
|
| |
|
| |
|
|
|
|
|
| |
A future commit will want the context for more than just DEBUGGING
builds.
|
|
|
|
|
|
| |
In reading this code, I realized that there were instances where the
functions didn't work properly. It is hard to test these, but a future
commit will do so.
|
|
|
|
|
| |
This makes some print statements less awkward, and is more flexible,
which will be used in future commits.
|
|
|
|
|
|
|
| |
Depending on Configuration and platform and details of the current
request, the value returned could be pointing to a system static buffer,
or be a temporary freeable upon LEAVE. This commit standardizes it to a
known per-interpreter buffer that can be properly freed at termination.
|
|
|
|
|
|
|
|
| |
This function is called to save a string to a buffer. Teach it to treat
as a no-op the string passed being the buffer itself. This generalizes
it to make it work properly under more circumstances; the commit also
removes the current case where the function call was explicitly avoided
under this circumstance.
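
A rough sketch of the self-copy case described above. save_to_buffer() here is a simplified, hypothetical version with an assumed signature; the real function in locale.c differs.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copy 'string' into the growable buffer *buf (capacity *buf_size) and
 * return the buffer, or NULL on failure.  If 'string' already IS the
 * buffer, there is nothing to do.  Only the exact-pointer case is
 * handled; a pointer into the middle of the buffer is not supported. */
static const char *
save_to_buffer(const char *string, char **buf, size_t *buf_size)
{
    assert(buf != NULL && buf_size != NULL);

    if (string == NULL)
        return NULL;

    if (string == *buf)         /* already saved: treat as a no-op */
        return *buf;

    size_t needed = strlen(string) + 1;
    if (needed > *buf_size) {
        char *bigger = realloc(*buf, needed);
        if (bigger == NULL)
            return NULL;
        *buf = bigger;
        *buf_size = needed;
    }

    memcpy(*buf, string, needed);
    return *buf;
}
```

With the check inside the function, callers no longer need their own guard against passing the buffer to itself.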
|
|
|
|
|
|
| |
At this point we have two variables which we just set equal. Change
here to use the synonym that doesn't require looking elsewhere to
understand what's going on.
|
|
|
|
|
|
|
|
|
| |
Coverity complains that the call to Safefree() now at line 5098
could be called with an uninitialized value for curlocales[i],
which appears to be possible if the trial_locales loop just below
this change fails to find a locale.
Fixes CID 184451
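
The general pattern behind this kind of fix can be sketched as below, assuming the remedy is to initialize every slot to NULL before anything can fail. probe_and_cleanup() and CATEGORY_COUNT are hypothetical; plain free() stands in for Safefree().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define CATEGORY_COUNT 3    /* hypothetical number of locale categories */

/* Fill an array of locale-name copies when 'succeed' is true, then free
 * every slot.  Returns how many slots were filled. */
static int
probe_and_cleanup(int succeed)
{
    /* Initialize every element, so the cleanup loop is safe even when
     * the fill loop never runs or stops early. */
    char *curlocales[CATEGORY_COUNT] = { NULL };
    int filled = 0;

    if (succeed) {
        for (int i = 0; i < CATEGORY_COUNT; i++) {
            curlocales[i] = malloc(2);
            if (curlocales[i] == NULL)
                break;
            strcpy(curlocales[i], "C");
            filled++;
        }
    }

    assert(filled <= CATEGORY_COUNT);

    for (int i = 0; i < CATEGORY_COUNT; i++)
        free(curlocales[i]);    /* free(NULL) is a defined no-op */

    return filled;
}
```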
|
|
|
|
|
|
|
|
| |
S_less_dicey_bool_setlocale_r() is a short function that makes a
complete set of similar functions, but there is no current use of it.
So just #ifdef it out.
This resolves #20338
|
|
|
|
|
| |
setlocale_debug_string() variants now use Perl_form, a function I
didn't know existed when I originally wrote this code.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a new set of macros and functions to do locale changing and
querying for platforms where perl is compiled with threads, but the
platform doesn't have thread-safe locale handling.
All it does is:
1) The return of setlocale() is always safely saved in a per-thread
buffer, and
2) setlocale() is protected by a mutex from other threads which are
using perl's locale functions.
This isn't much, but it might be enough to get some programs to work on
such platforms which rarely change or query the locale.
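
The two points above can be sketched as follows. guarded_setlocale(), locale_mutex and saved_locale are hypothetical names, and the fixed-size buffer is a simplification; the sketch assumes pthreads and C11 _Thread_local are available.

```c
#include <assert.h>
#include <locale.h>
#include <pthread.h>
#include <string.h>

/* One process-wide mutex serializes every setlocale() call made through
 * this wrapper. */
static pthread_mutex_t locale_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Each thread gets its own copy of the most recent result. */
static _Thread_local char saved_locale[256];

/* Call setlocale() under the mutex and copy its result, which may live
 * in a static buffer shared by all threads, into per-thread storage
 * before releasing the mutex.  Returns NULL if setlocale() fails or if
 * the name does not fit in the buffer. */
static const char *
guarded_setlocale(int category, const char *locale)
{
    const char *result = NULL;

    int rc = pthread_mutex_lock(&locale_mutex);
    assert(rc == 0);
    (void) rc;

    const char *raw = setlocale(category, locale);
    if (raw != NULL && strlen(raw) < sizeof saved_locale) {
        strcpy(saved_locale, raw);
        result = saved_locale;
    }

    pthread_mutex_unlock(&locale_mutex);
    return result;
}
```

This protects only callers that go through the wrapper. Code that calls setlocale() directly, or library functions that read the locale, can still race with it.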
|
|
|
|
|
|
| |
This macro is used to surround raw setlocale() calls so that the return
value in a global static buffer can be saved without interference with
other threads.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
See https://github.com/Perl/perl5/issues/20155
The root cause of that problem is that under POSIX 2008, when a thread
terminates, it causes thread 0 (the controller) to change to the global
locale. Commit a7ff7ac caused perl to pay attention to the environment
variables in effect at startup for setting the global locale when using
the POSIX 2008 locale API. (Previously only the initial per-thread
locale was affected.)
This causes problems when the initial setting was for a locale that uses
a comma as the radix character, but thread 0 is set to a locale that
is expecting a dot as a radix character. Whenever another thread
terminated, thread 0 was silently changed to using the global locale,
and hence a comma. This caused parse errors.
The real solution is to fix thread 0 to remain in its chosen locale.
But that fix is not ready in time for 5.37.4, and it is deemed important
to get something working for this monthly development release.
This commit changes the initial global LC_NUMERIC locale to always be C,
hence uses a dot radix. The vast majority of code is expecting a dot.
This is not the ultimate fix, but it works around the immediate problem
at hand.
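
The effect of the workaround can be illustrated with plain setlocale() on the global locale. force_dot_radix() is a hypothetical helper, not perl's initialization code.

```c
#include <assert.h>
#include <locale.h>
#include <string.h>

/* Put the global LC_NUMERIC category into the "C" locale and report
 * whether the radix character is now a dot.  Returns 1 if so, else 0. */
static int
force_dot_radix(void)
{
    if (setlocale(LC_NUMERIC, "C") == NULL)
        return 0;

    struct lconv *lc = localeconv();
    assert(lc != NULL);

    return strcmp(lc->decimal_point, ".") == 0;
}
```

With LC_NUMERIC in "C", a thread that falls back to the global locale still sees a dot radix, whatever the environment variables requested for the other categories.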
The test case is courtesy @bram-perl
|
|
|
|
|
| |
These are the final (unless I missed something) cases where LC_ALL could
be referred to even if undefined on the system.
|
|
|
|
| |
Prior to this commit, wrong cpp directives did not guarantee this.
|
|
|
|
|
| |
STMT_START...END aren't required in DEBUG() calls, as that macro already
wraps its argument with those. So, they are just clutter here.
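
Why the extra wrapper is redundant can be seen in a small sketch. MY_DEBUG, debug_enabled and trace() are hypothetical; STMT_START and STMT_END are defined here as do and while (0), which is their usual effect, though perl's own definitions may differ in detail.

```c
#include <assert.h>

#define STMT_START do
#define STMT_END   while (0)

static int debug_enabled = 1;
static int debug_count = 0;

/* The macro wraps its argument itself, so it expands to exactly one
 * statement wherever it is used. */
#define MY_DEBUG(code) \
    STMT_START { if (debug_enabled) { code; } } STMT_END

static void
trace(int value)
{
    /* Safe in an unbraced if/else without any wrapper at the call site. */
    if (value > 0)
        MY_DEBUG(debug_count++);
    else
        MY_DEBUG(debug_count--);
}
```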
|
|
|
|
|
|
|
|
|
| |
Without this commit, Perl won't compile if -DUSE_NL_LOCALE_NAME is
specified to Configure. This is an undocumented feature that uses an
undocumented glibc feature that is effectively the querylocale() found
on Darwin and some other systems. POSIX 2017 has added a
querylocale-like function to the repertoire, and should eventually
supplant this option.
|
|
|
|
|
| |
The previous commit initialized this variable early in start up, so
we no longer ever need to check that it is non-NULL.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are currently 4 functions that do special handling when the locale
for their respective categories changes. One of these is for LC_ALL,
which has been, and continues to be, called at the end of initialization.
But the other three have changed in recent commits to handle specially
the trivial case of the locale being "C". These changes now avoid the
complexities required for the general case (that needs everything to be
set up at the time of the call).
They can thus be called early in the initialization process. This
avoids having to duplicate their logic in the initialization code,
which has led to some things being overlooked there. Now everything is
guaranteed to stay in sync.
|
|
|
|
|
| |
This probably doesn't matter, but it's better form to initialize it to a
sane value.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Merge commit e4bbbfe02b9e9aae521b164eba0e518ca478945f refactored this
function some. Most of the commits in that series dated to before when
we could assume C99. In re-reading the result, I saw some opportunities
to take advantage of C99, by, for example, moving declarations closer to
their use.
I also hadn't previously noticed that when changing to the C locale (a
frequent occurrence), various things that were being recalculated are
determinable at compile time. So this commit returns early under this
circumstance.
And an obsolete comment is removed.
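
The early return can be sketched as below. struct ctype_state and new_ctype_sketch() are hypothetical and far simpler than the real function. The general branch assumes the caller has already switched LC_CTYPE to the named locale.

```c
#include <assert.h>
#include <langinfo.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cache of facts about the current LC_CTYPE locale. */
struct ctype_state {
    int is_utf8;              /* nonzero if the codeset is UTF-8 */
    int max_bytes_per_char;   /* MB_CUR_MAX for the locale */
};

static void
new_ctype_sketch(struct ctype_state *st, const char *locale)
{
    assert(st != NULL && locale != NULL);

    /* The C/POSIX locale is single-byte and not UTF-8, so its values are
     * known in advance and libc need not be consulted. */
    if (strcmp(locale, "C") == 0 || strcmp(locale, "POSIX") == 0) {
        st->is_utf8 = 0;
        st->max_bytes_per_char = 1;
        return;
    }

    /* General case: ask libc about the locale currently in effect. */
    st->is_utf8 = strcmp(nl_langinfo(CODESET), "UTF-8") == 0;
    st->max_bytes_per_char = (int) MB_CUR_MAX;
}
```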
|
|
|
|
|
| |
On Configurations without LC_COLLATE, various unused warnings were
being generated.
|
|
|
|
|
| |
On Configurations without LC_CTYPE, various unused warnings were
being generated.
|
|
|
|
|
| |
This function is not used unless locales are enabled, so need not be
defined unless that is true.
|
|
|
|
|
| |
This function is not used unless LC_NUMERIC is enabled, so need not be
defined unless that is true.
|
|
|
|
|
|
|
|
|
|
| |
This fixes #20140
This static variable is used in just one or (unlikely) two places, and
only in some Configurations. Rather than add #ifdefs, or make a
PERL_UNUSED call somewhere, making it a #define fixes the issue without
taking up extra memory except in some dumb compilers under unlikely
Configurations.
|