| Commit message (Collapse) | Author | Age | Files | Lines |
|
contents is short
It is a bit "noisy" to have comments that duplicate the conditions
when the original line with the condition is visible on the screen at
the same time. This patch changes the rules so we only add these comments
when the clause is 10 lines or more from its prior clause.
|
Vim's filetype declarations are case sensitive. The correct types for
Perl, C, and Pod are perl, c, and pod, respectively.
|
This defines a new magic hash C<%{^HOOK}> which is intended to be used for
hooking keywords. It is similar to %SIG in that the values it contains
are validated on set, and it is not allowed to store something in
C<%{^HOOK}> that isn't supposed to be there. Hooks are expected to be
coderefs (people can use currying if they really want to put an object
in there, the API is deliberately simple.)
The C<%{^HOOK}> hash is documented to have keys of the form
"${keyword}__${phase}" where $phase is either "before" or "after"
and in this initial release two hooks are supported,
"require__before" and "require__after":
The C<require__before> hook is called before require is executed,
including any @INC hooks that might be fired. It is called with the path
of the file being required, just as would be stored in %INC. The hook
may alter the filename by writing to $_[0] and it may return a coderef
to be executed *after* the require has completed, otherwise the return
is ignored. This coderef is also called with the path of the file which
was required, and it will be called regardless as to whether the require
(or its dependencies) die during execution. This mechanism makes it
trivial and safe to share state between the initial hook and the coderef
it returns.
The C<require__after> hook is similar to the C<require__before> hook,
except that it is called after the require completes
(successfully or not), and its return value is always ignored.
|
This updates the mode-line for most of our generated files so that
they include file type information so they will be properly syntax
highlighted on github.
This does not make any other functional changes to the files.
[Note: Commit message rewritten by Yves]
|
Note that this also tidies embed.fnc.
|
This is a step in solving #20155.
The POSIX 2008 locale API introduces per-thread locales. But the
previous global locale system is retained, probably for backward
compatibility.
The POSIX 2008 interface causes memory to be malloc'd that needs to be
freed. In order to do this, the caller must first stop using that
memory, by switching to another locale. perl accomplishes this during
termination by switching to the global locale, which is always available
and doesn't need to be freed.
Perl has long assumed that all that was needed to switch threads was to
change out tTHX. That's because that structure was intended to hold all
the information for a given thread. But it turns out that this doesn't
work when some library independently holds information about the
thread's state. And there are now some libraries that do that.
What was happening in this case was that perl thought that it was
sufficient to switch tTHX to change to a different thread in order to do
the freeing of memory, and then used the POSIX 2008 function to change
to the global locale so that the memory could be safely freed. But the
POSIX 2008 function doesn't care about tTHX, and actually was typically
operating on a different thread, and so changed that thread to the global
locale instead of the intended thread. Often that was the top-level
thread, thread 0. That caused whatever thread it was to no longer be in
the expected locale, and to no longer be thread-safe with regard to
locales.
This commit causes locale_term(), which has always been called from the
actual terminating thread that POSIX 2008 knows about, to change to the
global locale and free the memory.
It also creates a new per-interpreter variable that effectively maps the
tTHX thread to the associated POSIX 2008 memory. During
perl_destruct(), it frees the memory this variable points to, instead of
blindly assuming the memory to free is the current tTHX thread's.
This fixes the symptoms associated with #20155, but doesn't solve the
whole problem. In general, a library that has independent thread status
needs to be updated to the new thread when Perl changes threads using
tTHX. Future commits will do this.
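The ordering constraint described above (a thread must stop using a
locale object before that object is freed) can be illustrated with a
minimal C sketch. This is not perl's implementation: the names
enter_thread_locale and leave_thread_locale are made up for illustration,
and the sketch assumes a platform whose <locale.h> declares the POSIX 2008
functions.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <locale.h>

/* Hypothetical helper: create a per-thread "C" locale object and make it
 * the calling thread's current locale.
 * Returns the new object, or (locale_t)0 if it could not be created. */
locale_t enter_thread_locale(void)
{
    locale_t obj = newlocale(LC_ALL_MASK, "C", (locale_t)0);

    if (obj == (locale_t)0)
        return (locale_t)0;
    uselocale(obj);
    return obj;
}

/* Hypothetical helper: must be called on the thread that installed obj.
 * uselocale() only affects the calling thread, so calling this from any
 * other thread would switch the wrong thread to the global locale.
 * Returns 1 if the calling thread ends up in the global locale. */
int leave_thread_locale(locale_t obj)
{
    /* Stop using the object first: the global locale is always available
     * and never needs to be freed. */
    if (uselocale((locale_t)0) == obj)
        uselocale(LC_GLOBAL_LOCALE);

    /* Only now is it safe to release the per-thread object. */
    freelocale(obj);

    return uselocale((locale_t)0) == LC_GLOBAL_LOCALE;
}
```

A thread would call enter_thread_locale() when it starts and
leave_thread_locale() with the returned object just before it terminates.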
|
Some configurations require us to store the current locale for each
category. Prior to this commit, this was done in the array
PL_curlocales, with the entry for LC_ALL being in the highest element.
Future commits will need just the value for LC_ALL in some other
configurations, without needing the rest of the array. This commit
splits off the LC_ALL element into its own per-interpreter variable to
accommodate those. It always had to have special handling anyway beyond
the rest of the array elements.
|
This commit removes the separate mutex for locking locale-related
numeric operations on threaded perls, instead using the general locale
one. The previous commit made that a general semaphore, so it is now
suitable for this purpose as well.
This means that the locale can be locked for the duration of some
sprintf operations, longer than before this commit. But on most modern
platforms, thread-safe locales cause this lock to expand just to a
no-op; so there is no effect on these. And on the impacted platforms,
one is not supposed to be using locales and threads in combination, as
races can occur. This lock is used on those perls to keep Perl's
manipulation of LC_NUMERIC thread-safe. And for those there is also no
effect, as they already lock around those sprintf's.
|
Future commits will use this new capability, including in configurations
where no locale locking is currently necessary.
|
What these functions do has been subsumed by code introduced in previous
commits, and in a more straightforward manner.
Also removed in this commit is the cache that records which locales are
UTF-8 and which are not. This data is now cheaper to calculate when
needed, and there is now a single-entry cache, so I don't think the
complexity warrants keeping it.
It could be added back if necessary, split off from the remainder of
this commit.
|
Nested BEGIN blocks can cause us to segfault by exhausting
the C stack. For example:
perl -le'sub f { eval "BEGIN { f() }" } f()'
will segfault. This adds a new interpreter var PL_eval_begin_nest_depth
to keep track of how many layers of eval/BEGIN we have seen, and a new
reserved variable called ${^MAX_NESTED_EVAL_BEGIN_BLOCKS} which can be
used to raise or lower the limit. When set to 0 it blocks BEGIN entirely,
which might be useful from time to time.
This fixes https://github.com/Perl/perl5/issues/20176
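The technique of counting nesting depth and refusing to go past a limit
can be sketched in C as follows. This only illustrates the idea and is not
the code added to perl: nest_guard, nest_enter, nest_leave and recurse are
hypothetical names. A limit of 0 refuses every entry, mirroring the
behavior described above.

```c
/* Hypothetical depth guard: depth is the current nesting level and
 * max_depth is the configurable limit. */
typedef struct {
    unsigned depth;
    unsigned max_depth;
} nest_guard;

/* Try to enter one more level. Returns 1 on success, or 0 if the limit
 * has already been reached (always 0 when max_depth is 0). */
int nest_enter(nest_guard *g)
{
    if (g->depth >= g->max_depth)
        return 0;
    g->depth++;
    return 1;
}

/* Leave one level; harmless if called at depth 0. */
void nest_leave(nest_guard *g)
{
    if (g->depth > 0)
        g->depth--;
}

/* A recursion that would never stop on its own. With the guard it stops
 * as soon as the limit is hit and reports the depth it reached. */
unsigned recurse(nest_guard *g)
{
    unsigned reached;

    if (!nest_enter(g))
        return g->depth;        /* limit hit: fail instead of recursing */
    reached = recurse(g);
    nest_leave(g);
    return reached;
}
```

Checking the counter before each nested entry turns unbounded recursion
into an ordinary, recoverable failure at a known depth.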
|
When changing locales the new decimal point needs to be calculated.
This commit creates a new per-interpreter variable to save that
calculation, so it only has to be done when a new locale is set; prior
to this commit it was recalculated each time it was needed.
The calculation is still performed twice when the new locale is switched
into. But the redundant calculation will be removed in a couple of
commits hence.
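A minimal C sketch of caching the decimal point when the locale changes,
rather than recomputing it on every use, might look like this. It is an
assumption-laden illustration: perl keeps its value in a per-interpreter
variable, whereas this sketch uses a file-scope buffer for brevity, and
the names cached_radix, refresh_radix_cache, current_radix and
set_numeric_locale are hypothetical.

```c
#include <locale.h>
#include <string.h>

/* Hypothetical cache of the current decimal point string. */
static char cached_radix[16] = ".";

/* Recompute the cached decimal point from the current locale. Falls back
 * to "." if the locale reports nothing usable or the string is too long
 * for the buffer. */
void refresh_radix_cache(void)
{
    const char *dp = localeconv()->decimal_point;

    if (dp == NULL || *dp == '\0' || strlen(dp) >= sizeof cached_radix)
        dp = ".";
    strcpy(cached_radix, dp);
}

/* Cheap accessor: no locale query is made here. */
const char *current_radix(void)
{
    return cached_radix;
}

/* Change LC_NUMERIC and refresh the cache only when the change succeeds.
 * Returns what setlocale() returned (NULL on failure). */
const char *set_numeric_locale(const char *name)
{
    const char *result = setlocale(LC_NUMERIC, name);

    if (result != NULL)
        refresh_radix_cache();
    return result;
}
```

Code that formats numbers would then call current_radix() as often as it
likes, and the locale is only consulted inside set_numeric_locale().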
|
This is now used as a cache of length 1 to avoid having to look up the
UTF-8ness as often.
This commit also skips doing S_newctype() if the new boss is the same as
the old.
|
This env var can be used to trigger a repeatable run of a script which
calls C<srand()> with no arguments, either explicitly or implicitly
via use of C<rand()> prior to calling srand(). This is implemented in
such a way that calling C<srand()> with no arguments in forks or
subthreads (again explicitly or implicitly) will receive their own seed
but the seeds they receive will be repeatable.
This is intended for debugging and perl development performance testing,
and for running the test suite consistently. It is documented that the
exact seeds used to initialize the random state are unspecified, and
that they may change between releases or even builds. The only guarantee
provided is that the same perl executable will produce the same results
twice, all other things being equal. In practice and in core testing we
do expect consistency, but adding the tightest set of restrictions on
our commitments seemed sensible.
The env var is ignored when perl is run setuid or setgid similarly to
the C<PERL_INTERNAL_RAND_SEED> env var.
|
The POSIX 2008 API has an edge case in that the result of most of the
functions when called with a global (as opposed to a per-thread) locale
is undefined.
The duplocale() function is the exception which will create a per-thread
locale containing the values copied from the global one.
This commit just calls duplocale, if needed, and the caller need not
concern itself with this possibility.
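The duplocale() approach can be sketched in C as below. This is an
illustration only, not the code in the commit: usable_locale and its
must_free out-parameter are hypothetical, and the sketch assumes a
platform whose <locale.h> declares the POSIX 2008 functions.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <locale.h>

/* Hypothetical helper: return a locale object that may be passed to the
 * other POSIX 2008 functions. If the calling thread is using the global
 * locale, a per-thread copy is made with duplocale(), which is the one
 * function defined to accept LC_GLOBAL_LOCALE. *must_free is set to 1
 * when the caller owns the result and has to freelocale() it. */
locale_t usable_locale(int *must_free)
{
    locale_t cur = uselocale((locale_t)0);   /* query only, no change */

    if (cur == LC_GLOBAL_LOCALE) {
        locale_t copy = duplocale(LC_GLOBAL_LOCALE);

        *must_free = (copy != (locale_t)0);
        return copy;                         /* (locale_t)0 on failure */
    }

    *must_free = 0;
    return cur;
}
```

Callers then work with the returned object without needing to know
whether the thread started out in the global locale.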
|
This function is rewritten to handle LC_ALL, and to make it easier to
add new checks.
There is also a change, which I think is an improvement, in that everything
starting with a \n is trimmed, instead of just a trailing \n being
chomped.
A couple of calls to stdize_locale() are removed, as they are redundant,
because they are called only as a result of Perl_setlocale() being
called, and that ends up calling stdize_locale always, early on.
The call to savepv() is also moved in a couple of cases to after the result
is known to not be NULL.
I originally had such a new check in mind, but it turned out that doing
it here didn't solve the problem, so this commit has been amended
(before ever being pushed) to not include that.
|
A long-standing bug in Perl that has gone undetected is that the array
created when changing locales, which tells fc() and qr//i matching what
the folds are in the new locale, is global.
What this means is that any program only has one set of fold definitions
that apply to all threads within it, even if we claim that the locales
are thread-safe on the given platform. One possibility for this going
undetected so long is that no one is using locales on multi-threaded
systems much. Another possibility is that modern UTF-8 locales have the
same set of folds as any other one.
It is a simple matter to make the fold array per-thread instead of
per-process, and that solves the problem transparently to other code.
I discovered this while stress-testing locale handling under threads. That
test will be added in a future commit.
In order to keep from having a dTHX inside foldEQ_locale, it has to have
a pTHX_ parameter. This means that the other functions that can be
assigned to the same function pointer variables must have an identical
signature, which means adding pTHX_ to functions that don't require it.
The bodies of all these are known to the compiler, since they are all
in inline.h or in the same .c file as where they are called. Hence the
compiler can optimize out the unused parameter.
Two calls of STR_WITH_LEN also have to be changed because of C
preprocessor limitations; perhaps there is another way to do it that I'm
unfamiliar with.
|
The purpose of PL_origenviron is to preserve the earliest known value
of environ, which is a global. All interpreters should share it.
|
Save/restore PL_prevailing_version at SAVEHINTS time
Have PL_prevailing_version track the applied use VERSION currently in scope
|
Since the removal of PERL_OBJECT
(acfe0abcedaf592fb4b9cb69ce3468308ae99d91) PERL_IMPLICIT_CONTEXT and
MULTIPLICITY have been synonymous and they're being used interchangeably.
To simplify the code, this commit replaces all instances of
PERL_IMPLICIT_CONTEXT with MULTIPLICITY.
PERL_IMPLICIT_CONTEXT will stay defined for compatibility with XS
modules.
|
This fixes GH #18341
There are problems with getenv() on threaded perls which can lead to
incorrect results when compiled with PERL_MEM_LOG.
Commit 0b83dfe6dd9b0bda197566adec923f16b9a693cd fixed this for some
platforms, but as Tony Cook pointed out, there may be
standards-compliant platforms that it didn't fix.
The detailed comments outline the issues and (complicated) full solution.
|
Co-authored-by: Karl Williamson <khw@cpan.org>
|
This was originally added for MinGW, which no longer needs it, and
was still used only by Symbian, which is now removed.
This also leaves perlapi.[ch] empty, but we keep the header for CPAN
backwards compatibility.
|
This resolves #17774.
This ticket exists because the fixes in GH #17154 failed to get every case,
leaving this one outlier to be fixed by this commit.
The text in https://github.com/Perl/perl5/issues/17154 gives extensive
details as to the problem. But briefly, in an attempt to speed up
interpreter cloning, I moved certain SVs from interpreter level to
global level in e80a0113c4a8036dfb22aec44d0a9feb65d36fed (v5.27.11,
March 2018). This was doable, we thought, because the content of these
SVs is constant throughout the life of the program, so no need to copy
them when cloning a new interpreter or thread. However when an
interpreter exits, all its SVs get cleaned up, which caused these to
become garbage in applications where another interpreter remains
running. This circumstance is rare enough that the bug wasn't reported
until September 2019, #17154. I made an initial attempt to fix the
problem, and closed that ticket, but I overlooked one of the variables,
which was reported in #17774, which this commit addresses.
Effectively the behavior is reverted to the way it was before
e80a0113c4a8036dfb22aec44d0a9feb65d36fed.
|
This makes special-cased forms such as sort { $b <=> $a }
even faster.
Also, since this commit removes PL_sort_RealCmp, it fixes the
issue with nested sort calls mentioned in gh #16129
|
This commit enhances these functions so that on threaded perls, they use
mbrtowc and wcrtomb when available, making them thread safe. The
substitution isn't completely transparent, as no effort is made to hide
any differences in errno setting upon error. And there may be slight
differences in edge case behavior on some platforms.
This commit also changes the behaviors so that they take a scalar
parameter instead of a char *, and this might be 'undef' or not be
forceable into a valid PV. If not a PV, the functions initialize the
shift state. Previously the shift state was always reinitialized with
every call, which meant these could not work on locales with shift
states.
In addition, there were several issues in mbtowc and wctomb that this
commit fixes.
mbtowc and wctomb, when used, are now run with a semaphore. This avoids
races if called at the same time in another thread.
The returned wide character from mbtowc() could well have been garbage.
The final parameter to mbtowc is now optional, as passing an SV allows
us to determine the length without the need for an extra parameter. It
is now used only to restrict the parsing of the string to shorter than
the actual length.
wctomb would segfault if the string parameter was shared or hadn't
been pre-allocated with a string of sufficient length to hold the
result.
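The restartable interface mentioned above keeps its shift state in a
caller-supplied mbstate_t instead of hidden static storage, which is what
makes it usable from several threads. A minimal C sketch of decoding with
mbrtowc() follows. It is not the POSIX.xs code: decode_multibyte is a
hypothetical helper, and the behavior shown depends on the current
LC_CTYPE locale.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: decode the NUL-terminated multibyte string s into
 * out, which has room for cap wide characters. Returns the number of wide
 * characters stored, or (size_t)-1 if s contains an invalid or incomplete
 * sequence in the current locale. */
size_t decode_multibyte(const char *s, wchar_t *out, size_t cap)
{
    mbstate_t state;
    size_t remaining = strlen(s);
    size_t count = 0;

    /* An all-zero mbstate_t is the initial shift state. Keeping it in a
     * local variable means no other thread can disturb it. */
    memset(&state, 0, sizeof state);

    while (remaining > 0 && count < cap) {
        wchar_t wc;
        size_t used = mbrtowc(&wc, s, remaining, &state);

        if (used == (size_t)-1 || used == (size_t)-2)
            return (size_t)-1;      /* invalid or truncated sequence */
        if (used == 0)
            break;                  /* decoded a NUL character */

        out[count++] = wc;
        s += used;
        remaining -= used;
    }
    return count;
}
```

Because the state is carried across calls within the loop, the same code
also works in locales that have shift states.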
|
This commit changes the behavior so that it takes a scalar parameter
instead of a char *, and thus might not be forceable into a valid PV.
When not a PV, the shift state is reinitialized, like calling mblen with
a NULL first parameter. Previously the shift state was always
reinitialized with every call, which meant this could not work on
locales with shift states.
This commit also changes to use mbrlen() on threaded perls transparently
(mostly), when available, to achieve thread-safe operation. It is not
completely transparent because mbrlen (under the very rare stateful
locales) returns a different value when it's resetting the shift state.
It also may set errno differently upon errors, and no effort is made to
hide that difference. Also mbrlen on some platforms can handle partial
characters.
[perl #133928] showed that someone was having trouble with shift states.
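The difference between resetting the shift state on request and resetting
it on every call can be sketched in C with mbrlen(). This is an
illustration under stated assumptions rather than the actual
implementation: count_chars is a hypothetical helper, and passing NULL for
the string plays the role of the non-PV argument described above.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: count the characters in the multibyte string s
 * using the caller's shift state. If s is NULL, the state is reset to the
 * initial shift state and 0 is returned, in the spirit of calling mblen()
 * with a NULL first parameter. Returns (size_t)-1 if s contains an
 * invalid or incomplete sequence in the current locale. */
size_t count_chars(const char *s, mbstate_t *state)
{
    size_t remaining;
    size_t n = 0;

    if (s == NULL) {
        memset(state, 0, sizeof *state);    /* explicit reset only */
        return 0;
    }

    remaining = strlen(s);
    while (remaining > 0) {
        size_t len = mbrlen(s, remaining, state);

        if (len == (size_t)-1 || len == (size_t)-2)
            return (size_t)-1;
        if (len == 0)
            break;                          /* reached a NUL character */

        s += len;
        remaining -= len;
        n++;
    }
    return n;
}
```

The state is deliberately left alone between ordinary calls, so a string
that ends in a shifted state can be continued by the next call.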
|
and the associated commits, at least until a way to make
wrap_op_checker() work is available.
|
Fixes issue #14816
|
This is part of fixing gh #17154
This scenario from the ticket
(https://github.com/Perl/perl5/issues/17154#issuecomment-558877358)
shows why this fix is necessary:
main interpreter initializes PL_AboveLatin1 to an SV it owns
loads threads::lite and creates a new thread/interpreter which
initializes PL_AboveLatin1 to a SV owned by the new interpreter
threads::lite child interpreter finishes, freeing all of its SVs,
PL_AboveLatin1 is now invalid
main interpreter uses a regexp that relies on PL_AboveLatin1, dies
horribly.
By making these interpreter level variables, this is avoided. There is
extra copying, but it is just the SV headers, as the real data is kept
as static C arrays.
|
Currently, whether the OS-level signal handler function is declared as
1-arg or 3-arg depends on the configuration. Add explicit versions of
these functions, principally so that POSIX.xs can call whichever version of
the handler it wants regardless of configuration: see next commit.
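For readers unfamiliar with the two handler shapes, here is a minimal C
sketch of explicit 1-arg and 3-arg handlers installed through sigaction().
It is not perl's code: handler_1arg, handler_3arg, install_handler and
last_style are hypothetical names, and a POSIX system is assumed.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <string.h>

/* Records which handler style ran last: 1 or 3 (0 means none yet). */
volatile sig_atomic_t last_style = 0;

/* Classic 1-arg handler. */
void handler_1arg(int sig)
{
    (void)sig;
    last_style = 1;
}

/* 3-arg handler, which also receives siginfo and a context pointer. */
void handler_3arg(int sig, siginfo_t *info, void *context)
{
    (void)sig;
    (void)info;
    (void)context;
    last_style = 3;
}

/* Install whichever version the caller asks for, independent of any
 * build-time default. Returns 0 on success, -1 on error. */
int install_handler(int sig, int want_3arg)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);
    if (want_3arg) {
        sa.sa_sigaction = handler_3arg;
        sa.sa_flags = SA_SIGINFO;   /* selects the 3-arg form */
    } else {
        sa.sa_handler = handler_1arg;
    }
    return sigaction(sig, &sa, NULL);
}
```

Having both functions available under explicit names lets a caller pick
the form it needs at run time by setting or clearing SA_SIGINFO.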
|
With the revamping done in cc288b7a2732c37504039083ebb98241954636be, the
table of Unicode case folds that are more than a single character is no
longer used, so there is no need to generate it or to have it available.
|
Also references to the term.
|
This is part of this branch of changes.
|
These were missed by 059703b088f44d5665f67fba0b9d80cad89085fd.
|
I am starting to write a Unicode::Private_Use module which will allow
one to specify the Unicode properties of private use code points, thus
making them actually useful. This commit adds a hook to regcomp.c to
accommodate this module. The changes are pretty minimal. This way we
don't have to wait another release cycle to get it out there.
I don't want to document this interface, until it's proven.
|
The MY_CXT subsystem allows per-thread pseudo-static data storage.
Part of the implementation for this involves each XS module being
assigned a unique index in its my_cxt_index static var when first
loaded.
Because PERL_GLOBAL_STRUCT bans any static vars, under those builds
there is instead a table which maps the MY_CXT_KEY identifying string to
index.
Unfortunately, this table was allocated per-interpreter rather than
globally, meaning if multiple threads tried to load the same XS module,
crashes could ensue.
This manifested itself in failures in
ext/XS-APItest/t/keyword_plugin_threads.t
The fix is relatively straightforward: allocate PL_my_cxt_keys globally
rather than per-interpreter.
Also record the size of this struct in a new var, PL_my_cxt_keys_size,
rather than doing double duty on PL_my_cxt_size.
|
Fix the various Perl_PerlSock_dup2_cloexec() type functions so that
t/porting/libperl.t passes under -DPERL_GLOBAL_STRUCT_PRIVATE builds.
In these builds it is forbidden to have any static variables, but each
of these functions (via convoluted macros) has a static var called
'strategy' which records, for each function, whether a run-time probe
has been done to determine the best way of achieving close-exec
functionality, and the result.
Replace them all with 'global' vars: PL_strategy_dup2 etc.
NB these vars aren't thread-safe but it doesn't really matter, as the
worst that can happen is for a redundant probe or two to be done before
a suitable "don't probe any more" value is written to the var and seen
by all the threads.
|
A global hash has to be specially handled. The keys can't be shared,
and all the SVs stored into it must be in its thread. This commit adds
the hash, and initialization, and macros for context change, but doesn't
use them. The code to deal with this is entirely confined to regcomp.c.
|
This will be used in future commits
|
It currently is always set false, until later in this series of commits.
|
This will be used in a future commit.
|
The inversion list this refers to now includes the Latin 1 range, so the
name was misleading.
|
This variable's name was out-of-date and misleading. It is the name of
an inversion list that contains all the code points in the current
version of Unicode that participate in any way in a /i type of fold.
|
This table contains all the code points that are in any multi-character
fold (not the folded-from character, but what that character folds to).
It will be used in a future commit.
|
These variables are constant, once initialized, through the life of a
program, so having them be per instance is a waste of time and space.
|
Under /a pattern matching, the matches of the [:posix:] classes are
restricted to the ASCII range. Previously, in a time/space trade-off
that favored space, we created the list of matching characters at
pattern compilation time by ANDing the full-range Posix class with the
set of ASCII characters.
But now, the tables for just the ASCII-range classes are generated
anyway, so there's no need to do that compilation-time intersection.
This slightly simplifies the code.
|
This commit changes to use the C data structures generated by the
previous commit to compute what characters fold to a given one. This is
used to find out what things should match under /i.
This now avoids the expensive startup cost of switching to perl
utf8_heavy.pl, loading a file from disk, and constructing a hash from
it.
|
These were for when some of the Posix character classes were implemented
as swashes, which is no longer the case, so these can be removed.
|