| Commit message | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
Old glibc versions had a buggy modulo implementation for 64-bit
integers on 32-bit architectures. This was fixed in glibc 2.3,
released in 2002 (the version check in the code is overly cautious).
Removing the alternate PP function support is left for the next
commit, in case we need to resurrect it in future.
|
|
|
|
|
|
|
| |
check that porting/copyright.t is passing when
run with --now
../perl -I../lib porting/copyright.t --now
|
|
|
|
|
|
|
| |
This replaces strchr("list", c) calls throughout the core. They don't
work properly when 'c' is a NUL, returning the position of the
terminating NUL in "list" instead of failure. This could lead to
segfaults or even security issues.
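The pitfall described above can be reproduced in a few lines of C. This is only a sketch: char_in_list() is a hypothetical helper written for illustration, not the name or implementation used in the core.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper (not the core's replacement): reports whether c is
 * one of the characters in list, and treats NUL as "not in the list".
 * Plain strchr(list, '\0') would instead return a pointer to the
 * terminating NUL of list, which looks like a successful match. */
static bool char_in_list(const char *list, char c)
{
    assert(list != NULL);
    return c != '\0' && strchr(list, c) != NULL;
}
```

A caller that previously wrote strchr("list", c) as a truth test would write char_in_list("list", c) instead, and a NUL input then reports failure.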
|
|
|
|
|
| |
and the associated commits, at least until a way to make
wrap_op_checker() work is available.
|
|
|
|
| |
Fixes issue #14816
|
|
|
|
| |
This is for Devel::PPPort.
|
| |
This is part of fixing gh #17154
This scenario from the ticket
(https://github.com/Perl/perl5/issues/17154#issuecomment-558877358)
shows why this fix is necessary:
- the main interpreter initializes PL_AboveLatin1 to an SV it owns
- it loads threads::lite and creates a new thread/interpreter, which
  initializes PL_AboveLatin1 to an SV owned by the new interpreter
- the threads::lite child interpreter finishes, freeing all of its SVs;
  PL_AboveLatin1 is now invalid
- the main interpreter uses a regexp that relies on PL_AboveLatin1 and dies
  horribly
By making these interpreter level variables, this is avoided. There is
extra copying, but it is just the SV headers, as the real data is kept
as static C arrays.
|
|
|
|
|
|
|
| |
Currently, whether the OS-level signal handler function is declared as
1-arg or 3-arg depends on the configuration. Add explicit versions of
these functions, principally so that POSIX.xs can call whichever version of
the handler it wants regardless of configuration: see next commit.
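The difference between the two handler forms can be sketched with plain POSIX sigaction(). Everything named here (handler1, handler3, install_handler, last_form) is hypothetical and is not the core's handler code; the sketch only shows that the 1-arg and 3-arg forms are selected by SA_SIGINFO.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Records which handler form ran last: 1 or 3 (hypothetical bookkeeping). */
static volatile sig_atomic_t last_form = 0;

/* Classic 1-arg handler. */
static void handler1(int sig)
{
    (void)sig;
    last_form = 1;
}

/* 3-arg handler, used when SA_SIGINFO is set. */
static void handler3(int sig, siginfo_t *info, void *ctx)
{
    (void)sig; (void)info; (void)ctx;
    last_form = 3;
}

/* Installs the explicitly requested form; returns sigaction()'s result. */
static int install_handler(int sig, int three_arg)
{
    struct sigaction sa;

    assert(sig > 0);
    memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);
    if (three_arg) {
        sa.sa_sigaction = handler3;
        sa.sa_flags = SA_SIGINFO;
    } else {
        sa.sa_handler = handler1;
    }
    return sigaction(sig, &sa, NULL);
}
```

The caller picks the form explicitly, rather than relying on a build-time setting to decide which signature is in use.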
|
| |
|
|
|
|
| |
This is part of this branch of changes.
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
| |
This reverts commit 2773b4f50f991900e38d33daace2b9c6a0902c6a.
I haven't made much progress in resolving the problems this produces
downstream, so rather than leaving it broken, I'll revert it until
they can be solved.
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 857320cbf85e762add18885ae8a197b5e0c21b69,
re-instating the [perl #2754] fix, which was reverted in late
2017 to allow Module::Install based distributions to update or
re-work per [perl #132577].
# Conflicts:
# t/op/blocks.t
|
|
|
|
| |
and update eval_pv() to use it.
|
|
|
|
|
| |
This environment variable was previously only checked for on DEBUGGING
builds.
|
|
|
|
|
| |
Remove WinCE support as agreed in the thread starting here:
https://www.nntp.perl.org/group/perl.perl5.porters/2018/07/msg251683.html
|
|
|
|
| |
This would allow rethrowing object exceptions.
|
|
|
|
|
|
|
|
|
| |
This information is already in embed.fnc, and we know it compiles. Some
of this information is now out-of-date. Get rid of it.
There was one bit of information that was (apparently) wrong in
embed.fnc. The apidoc line asked that there be no usage example
generated for newXS. I added that flag to the embed.fnc entry.
|
| |
Commit 70bd6bc82ba64c1d197d3ec823f43c4a454b2920 fixed a leak (likely due
to a bug in glibc) by not duplicating the C locale object. However,
that meant that there's only one copy running around. And freeing that
will cause havoc, as it's supposed to be there until destruction. What
appears to be happening is that the current locale object is freed upon
thread destruction, and that could be this global one. But I don't
understand why it's only happening on FreeBSD and only on this version.
But this commit fixes the problem there, and makes sense. Simply don't
free this global object upon thread destruction.
This commit also changes it so it doesn't get destroyed at destruction
time, leaving it to the final PERL_SYS_TERM to free. I'm not sure, but
I think this fixes any issues with embedded perls.
|
|
|
|
|
|
|
|
|
| |
Normally by the time we reach perl_destruct(), PL_parser should be null
due to having its original (null) value restored by SAVEt_PARSER during
leaving scope (usually before run-time starts in fact). But if a thread
is created within a BEGIN block, the parser is duped, but the
SAVEt_PARSER savestack entry isn't. So PL_parser never gets cleaned up.
Clean it up in perl_destruct() instead. This is a bit of a hack.
|
| |
|
|
|
|
| |
This will be used in future commits
|
| |
|
|
|
|
| |
... as in other ifdefs within S_Internals_V(pTHX_ CV *cv).
|
|
|
|
| |
Spotted by Tux
|
|
|
|
| |
A space is needed in these formats to comply with C++11
|
|
|
|
| |
The former is designed to be compilable out.
|
|
|
|
|
| |
These variables are constant, once initialized, through the life of a
program, so having them be per instance is a waste of time and space.
|
|
|
|
|
|
|
|
|
|
| |
This changes the internal function grok_atoUV() to not require its input
to be NUL-terminated. That means the existing calls to it must be
changed to set the ending position before calling it, as some did
already.
This function is recommended for use in a couple of pods, but it wasn't
documented in perlintern. This commit does that as well.
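Parsing digits from a buffer that is delimited by an end pointer rather than a NUL can be sketched as follows. parse_uint_bounded() is a hypothetical stand-in written for illustration; it does not reproduce grok_atoUV()'s actual signature or its exact acceptance rules.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical sketch: parses the decimal digits in [s, end) into *out.
 * Returns false on empty input, any non-digit, or overflow.
 * It never reads *end, so the buffer need not be NUL-terminated. */
static bool parse_uint_bounded(const char *s, const char *end, uint64_t *out)
{
    uint64_t value = 0;

    assert(s != NULL && end != NULL && out != NULL);
    if (s >= end)
        return false;

    for (; s < end; s++) {
        unsigned digit;

        if (*s < '0' || *s > '9')
            return false;
        digit = (unsigned)(*s - '0');
        /* Reject before value * 10 + digit could exceed UINT64_MAX. */
        if (value > (UINT64_MAX - digit) / 10)
            return false;
        value = value * 10 + digit;
    }
    *out = value;
    return true;
}
```

The caller sets the end position before the call, which is the pattern the commit message describes for existing callers.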
|
| |
RT #133220
This commit partially reverts v5.27.6-180-g0301e89953.
That commit changed the return values of perl_parse() and perl_run()
so that an exit(0) wouldn't return 0 (which indicates a normal finish)
and would instead return 0x100, which indicates a non-normal return, but
with a value which, if used as an 8-bit process exit value on UNIX, is 0
modulo 256.
However, it turns out that perl_run() (via S_run_body()) does a my_exit(0)
rather than just running to completion. So it turns out that it's not
possible to distinguish between perl code finishing normally, and perl
code doing exit(0).
This broke code which embedded perl and expected perl_run() to return 0
on normal completion.
It may be possible to fix this by getting S_run_body() to not call
my_exit(0), but that's too unpredictable a change while we're at -RC1.
So just revert the new perl_run() 0x100 behaviour for now.
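The arithmetic behind the 0x100 value can be sketched in C. The constant and function names are hypothetical and only model the convention described above; they are not taken from the perl source.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical name for the value the reverted change returned on exit(0). */
#define STATUS_EXPLICIT_EXIT_ZERO 0x100

/* What a UNIX process would report if the status were passed to exit():
 * only the low 8 bits survive. Statuses are assumed non-negative here. */
static int as_process_exit_code(int status)
{
    assert(status >= 0);
    return status & 0xFF;
}

/* The check embedders were relying on: 0 means normal completion. */
static bool finished_normally(int status)
{
    return status == 0;
}
```

With these definitions, 0x100 maps to process exit code 0 yet fails the finished_normally() check, which is how embedding code that compared perl_run()'s return value against 0 broke.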
|
| |
These previously were statics in perl.c. A future commit would need
access to these from regcomp.c. We could create an access function in
perl.c so that regcomp.c could access them, or we could move them to
regcomp.c. But doing that would mean they would also be statics in
re_comp.c, and that would mean two copies.
So that means an access function is needed. Their use is really
unrelated to perl.c, which merely initializes them, so that could have
an access function instead. But the most logical place for their home
is utf8.c, which is described as for Unicode things, not just UTF-8
things.
So this commit moves these inversion lists to utf8.c, and creates an
initialization function called on perl startup from perl.c
|
|
|
|
|
|
|
|
|
|
| |
This commit switches to using the C data structures generated by the
previous commit to compute which characters fold to a given one. This is
used to find out what things should match under /i.
This now avoids the expensive start up cost of switching to perl
utf8_heavy.pl, loading a file from disk, and constructing a hash from
it.
|
|
|
|
|
| |
These were for when some of the Posix character classes were implemented
as swashes, which is no longer the case, so these can be removed.
|
|
|
|
|
|
|
|
| |
This commit makes the inversion lists for parsing character names global
instead of interpreter level, so they can be initialized once per process,
and no copies are created upon new thread instantiation. More
importantly, this is another instance where utf8_heavy.pl no longer
needs to be loaded, and the definition files read from disk.
|
|
|
|
|
|
|
| |
These read-only globals can be initialized in perl.c, which allows us to
remove runtime checks that they are initialized. This commit also takes
advantage of the fact that they are now always initialized to use them
as inversion lists, avoiding swash creation.
|
|
|
|
|
|
|
|
| |
The initialization time spent here is trivial, and this saves a copy of
these arrays on some systems. This is because there is only one perl.c,
and there is both regcomp.c and re_comp.c which would contain the
identical static const array. Some OS's won't remove the duplicate
copies.
|
|
|
|
|
| |
These are now constant through the life of the program, so don't need to
be duplicated at each new thread instantiation.
|
| |
Prior to this commit, if a program wanted to compute the case-change of
a character above 0xFF, the C code would switch to perl, loading
lib/utf8_heavy.pl, then read another file from disk, and then create a
hash. Future references would use the hash, but the start up cost is
quite large. There are five case change types, uc, lc, tc, fc, and
simple fc. Only the first encountered requires loading of utf8_heavy,
but each required switching to utf8_heavy, and reading the appropriate
file from disk.
This commit changes these functions to use compiled-in C data structures
(inversion maps) to represent the data. To look something up requires a
binary search instead of a hash lookup.
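As a rough illustration of the lookup, here is a binary search over a plain inversion list: a sorted array in which even-indexed elements start a range that is in the set and odd-indexed elements start a range that is not. This is a simplified sketch with a hypothetical function name. It tests set membership only, whereas the commit uses inversion maps, and it is not the core's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch: returns true if cp is in the set described by the
 * inversion list. The search counts how many elements are <= cp; an odd
 * count means the last boundary crossed opened an "in the set" range. */
static bool invlist_contains(const uint32_t *list, size_t len, uint32_t cp)
{
    size_t lo = 0, hi = len;

    assert(list != NULL || len == 0);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (list[mid] <= cp)
            lo = mid + 1;
        else
            hi = mid;
    }
    return (lo & 1) != 0;
}
```

For example, the list { 0x41, 0x5B, 0x61, 0x7B } describes A-Z and a-z; each lookup costs about log2(len) comparisons and needs no hash construction at startup.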
An individual hash lookup tends to be faster than a binary search, but
the differences are small for small sizes. I did some benchmarking some
years ago (see the commit message of 87367d5f9dc9bbf7db1a6cf87820cea76571bf1a),
and the results were that for fewer than 512 entries, the binary search was
just as fast as a hash, if not actually faster. Now, I've done some
more benchmarks on blead, using the tool benchmark.pl, which wasn't
available back then. The results below indicate that the differences
are minimal up through 2047 entries, which all Unicode properties are
well within.
A hash, PL_foldclosures, is still constructed at runtime for the case of
regular expression /i matching, and this could be generated at Perl
compile time, as a further enhancement for later. But reading a file
from disk is no longer required to do this.
======================= benchmarking results =======================
Key:
Ir Instruction read
Dr Data read
Dw Data write
COND conditional branches
IND indirect branches
_m branch predict miss
_m1 level 1 cache miss
_mm last cache (e.g. L3) miss
- indeterminate percentage (e.g. 1/0)
The numbers represent raw counts per loop iteration.
"\x{10000}" =~ qr/\p{CWKCF}/"
          swash  invlist  Ratio %
          fetch   search
         ------  -------  -------
    Ir   2259.0   2264.0     99.8
    Dr    665.0    664.0    100.2
    Dw    406.0    404.0    100.5
  COND    406.0    405.0    100.2
   IND     17.0     15.0    113.3
COND_m      8.0      8.0    100.0
 IND_m      4.0      4.0    100.0
 Ir_m1      8.9     17.0     52.4
 Dr_m1      4.5      3.4    132.4
 Dw_m1      1.9      1.2    158.3
 Ir_mm      0.0      0.0    100.0
 Dr_mm      0.0      0.0    100.0
 Dw_mm      0.0      0.0    100.0
These were constructed by using the file whose contents are below, which
uses the property in Unicode that currently has the largest number of
entries in its inversion list, > 1600. The test was run on blead -O2,
no debugging, no threads. Then the cut-off boundary was changed from
512 to 2047 for when we use a hash vs an inversion list, and the test
run again. This yields the difference between a hash fetch and an
inversion list binary search.
===================== The benchmark file is below ===============
no warnings 'once';

my @benchmarks;

push @benchmarks, 'swash' => {
    desc  => '"\x{10000}" =~ qr/\p{CWKCF}/"',
    setup => 'no warnings "once"; my $re = qr/\p{CWKCF}/; my $a = "\x{10000}";',
    code  => '$a =~ $re;',
};

\@benchmarks;
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 9fe4122e6defd7e9204ed6f2370d926d4c3b261b broke threaded builds
because it changed to free a global variable upon thread exit (I had
forgotten that it wasn't an interpreter variable).
I do not know why this passed before pushing; others have had trouble
reproducing it. But the same tests were failing for me now. The one
difference is that I had been using clang with address sanitizer
compiled in but turned off when I made that commit. Now I'm using g++.
Spotted by Dave Mitchell
|
|
|
|
|
| |
This stops potential memory leaks when using POSIX 2008 locale handling,
by freeing the current locale object and two special ones.
|
|
|
|
|
|
|
|
|
|
| |
These structures are read-only, use const C strings, and are truly
global, so no need to have them be interpreter level. This saves
duplicating and freeing them as threads come and go.
In doing this, I noticed that not every one was properly being
copied/deallocated, so this fixes some potential unreported bugs, and
leaks.
|
|
|
|
|
|
| |
PL_C_locale_obj is now only created on threaded builds on systems with
POSIX 2008. On unthreaded builds, we really should continue to use the
old tried and true library calls.
|
|
|
|
| |
init_i18nl10n(1) uses SAVEFREEPV before any ENTER is performed. Move it afterwards.
|
|
|
|
|
|
| |
This (large) commit allows locales to be used in threaded perls on
platforms that support it. This includes recent Windows and POSIX 2008
ones.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
khw could not find any modules on CPAN that correctly use the C library
function setlocale(). (The very few that do try, do not use it
correctly, looking at the return value incorrectly, so they are broken.)
This analysis does not include modules that call non-Perl libraries that
may call setlocale().
And, a future commit will render the setlocale() function useless in
some configurations on some platforms.
So this commit adds Perl_setlocale(), for XS code to call, and which is
always effective, but it should not be used to alter the locale except
on platforms where the predefined variable ${^SAFE_LOCALES} evaluates to
1.
This function is also what POSIX::setlocale() calls to do the real work.
|
|
|
|
|
| |
perl.h has a single #define which is the combination of several that
determines if this object should be created or not.
|
|
|
|
|
|
|
|
|
| |
On systems that have the POSIX 2008 operations, including
nl_langinfo_l(), this commit causes them to not have to actually change
the locale when determining what the decimal point character is.
The locale may have to change during the printing/reading of numbers,
but eventually we can use sprintf_l(), if available, to avoid that too.
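Querying the decimal point without switching the process locale can be sketched with the POSIX 2008 calls. radix_for_locale() is a hypothetical helper, not the function the core uses, and the sketch assumes a platform that provides newlocale() and nl_langinfo_l().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch: copies the decimal point string of the named locale
 * into buf without calling setlocale() or uselocale(), so the current
 * locale of the process and thread is left untouched.
 * Returns 0 on success, -1 on failure. */
static int radix_for_locale(const char *name, char *buf, size_t buflen)
{
    locale_t loc;
    const char *radix;
    int rc = -1;

    assert(name != NULL && buf != NULL && buflen > 0);

    /* Build a standalone locale object for the numeric category only. */
    loc = newlocale(LC_NUMERIC_MASK, name, (locale_t)0);
    if (loc == (locale_t)0)
        return -1;

    radix = nl_langinfo_l(RADIXCHAR, loc);
    if (radix != NULL && strlen(radix) < buflen) {
        strcpy(buf, radix);
        rc = 0;
    }
    freelocale(loc);
    return rc;
}
```

The result is copied out before freelocale() because the string returned by nl_langinfo_l() is not guaranteed to remain valid once the locale object is freed.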
|
|
|
|
|
| |
This is now done very early in the file, as it may be needed for
initializing the locale handling.
|