|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Our checks on the define info we expose via Internals::V(), especially
the sorted part, did not really work properly, as they only checked
defines that are actually exposed in our standard builds. Many of the
defines exposed in this list are special cases that would not be enabled
in a normal build we test under CI, and indeed prior to this patch it
was possible for us to produce unsorted output if certain defines were
enabled.
This patch adds checks that read the actual code. They check that the
define and the string are the same, and that the strings would be output
in sorted order assuming every define was enabled.
There are two historical exceptions where the string we show and the
define used internally are different, but we work around these two cases
with a special-case hash.
|
|
|
|
| |
Makes it easier to debug long double issues.
|
|
|
|
|
|
|
| |
svtype is an enum with 18 values. SVTYPEMASK is 31. A picky compiler
(such as the one on HP-UX) will complain that casting 31 to an svtype is
an error. We have SvIS_FREED() to do this properly anyway.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This defines a new magic hash C<%{^HOOK}> which is intended to be used
for hooking keywords. It is similar to %SIG in that the values it
contains are validated on set, and it is not allowed to store something
in C<%{^HOOK}> that isn't supposed to be there. Hooks are expected to be
coderefs (people can use currying if they really want to put an object
in there; the API is deliberately simple).
The C<%{^HOOK}> hash is documented to have keys of the form
"${keyword}__${phase}", where $phase is either "before" or "after".
In this initial release two hooks are supported,
"require__before" and "require__after":
The C<require__before> hook is called before require is executed,
including any @INC hooks that might be fired. It is called with the path
of the file being required, just as it would be stored in %INC. The hook
may alter the filename by writing to $_[0], and it may return a coderef
to be executed *after* the require has completed; otherwise the return
is ignored. This coderef is also called with the path of the file which
was required, and it will be called regardless of whether the require
(or its dependencies) dies during execution. This mechanism makes it
trivial and safe to share state between the initial hook and the coderef
it returns.
The C<require__after> hook is similar to the C<require__before> hook,
except that it is called after the require completes (successfully or
not), and its return is always ignored.
|
|
|
|
|
|
|
| |
Explain in detail why the return arg on the stack from the eval can't be
NULL on successful compilation.
(Spotted by Hugo)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Like the previous commit, which did it for amagic_call() and call_sv(),
this commit makes the faked-up OP_ENTEREVAL be executed as part of the
runops loop rather than as a separate call. This is to allow shortly
fixing up for a reference-counted stack. (CALLRUNOPS() will reify the
stack if necessary, while the raw call to pp_entereval() won't fix up
the stack unless it's part of the runops loop too.)
However, this is a bit more complex than call_sv() etc. in that there is
a good reason for calling pp_entereval() separately. The faked-up
OP_ENTEREVAL has its op_next set to NULL - this is the op which would
normally be returned on failure of the eval compilation. By seeing
whether the returned value from pp_entereval() is NULL or not, eval_sv()
can tell whether compilation failed.
On the other hand, if pp_entereval() were made to be called as part of
the runops loop, then the runops loop *always* finishes with PL_op set
to NULL. So we could no longer distinguish between compile-failed and
compile-succeeded-and-eval-ran-to-completion.
This commit moves the entereval into the runops loop, but restores the
ability to distinguish in a slightly hacky way. It adds a new private
flag for OP_ENTEREVAL - OPpEVAL_EVALSV - which indicates to
pp_entereval() that it was called from eval_sv(). And of course
eval_sv() sets this flag on the OP_ENTEREVAL op it fakes up. If
pp_entereval() fails to compile, then if that flag is set, it pushes a
null pointer onto the argument stack before returning.
Thus by checking whether *PL_stack_sp is NULL or not on return from
CALLRUNOPS(), eval_sv() regains the ability to distinguish the two
cases.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These two functions do a slightly odd thing (which has been present
since 5.000) when calling out to a CV: they half fake up an OP_ENTERSUB,
then call pp_entersub() directly, and only then, if it returns a
non-null PL_op value, execute the rest of the ops of the sub within a
CALLRUNOPS() loop.
I can't find any particular reason for this. I guess it might make
calling XS subs infinitesimally faster by not having to invoke the
runops loop when only a single op is executed (the entersub), but it
hardly seems worth the effort.
Conversely, eliminating this special case makes the code cleaner, and it
will allow the workarounds planned to be added shortly (to the runops
loops for reference-counted stacks) to work uniformly. Without this
commit, pp_entersub() would get called before runops() has had a chance
to fix up the stack if necessary.
So this commit *fully* populates the fake OP_ENTERSUB (including type
and pp address), then causes pp_entersub() to be invoked implicitly from
the runops loop rather than explicitly.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In GH 20435 many typos in our C code were corrected. However, this pull
request was not applied to blead and developed merge conflicts. I
extracted diffs for the individual modified files and applied them with
'git apply', excepting four files where patch conflicts were reported.
Those files were:
handy.h
locale.c
regcomp.c
toke.c
We can handle these in a subsequent commit. Also, I had to run these two
programs to keep 'make test_porting' happy:
$ ./perl -Ilib regen/uconfig_h.pl
$ ./perl -Ilib regen/regcomp.pl regnodes.h
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In dd66b1d793 we added an assert to perl_run() that PL_restartop should
never be true when perl_run() is called after perl_parse(). Looked at
from the point of view of the internals, which call perl_parse() and
perl_run() exactly once, this made sense.
It turns out, however, that there is at least one XS module out there
that expects to be able to set PL_restartop and then call perl_run(). If
that works out for them then we shouldn't block it, as we aren't really
trying to say "perl_run() should never be called with PL_restartop set"
(at least this assert wasn't trying to say that); we are trying to
assert "between the top-level transition from perl_parse() to perl_run()
we shouldn't leak any PL_restartop".
One could argue the assert maybe should go at the end of perl_parse(),
but I chose to put it in Miniperl.pm, and thus into perlmain.c and
miniperlmain.c, as I am not certain that perl_parse() should never be
called with PL_restartop already set, and putting it in the main code
really does more closely reflect the intent of this assert anyway.
This was reported as Blead Breaks CPAN GitHub Issue #20557.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Changed:
* perl.c - Perl_moreswitches
* pp_sys.c - pp_sysread
* toke.c - Perl_scan_str
In each of the above functions, one instance of SvGROW on a new SVt_PV
can be swapped for the more efficient sv_grow_fresh. In two of the
instances, the calls used to create the SVt_PV have also been
streamlined.
There should not be any functional change as a result of this commit.
|
|
|
|
|
| |
Of course, it wasn't safe to free it before the previous patch,
but it is now.
|
|
|
|
|
|
|
|
| |
I also fixed const correctness for PL_splitstr: it was initialized to
" ", which required it to be const, but the value is only looked at
when PL_minus_F is true, so that wasn't necessary.
So initialize it to NULL, and duplicate the value under threads.
|
|
|
|
|
|
| |
This allows multiple ops to share the same underlying warning bit
vector. The vectors are not deduped; however, they are efficiently
copied and shared.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We have a weird bifurcation of the cop logic around threads. With
threads we use a char * cop_file member; without them we use a GV * and
replace cop_file with cop_filegv.
The GV * code refcounts filenames and more or less efficiently shares
the filename amongst many opcodes. Under threads, however, we were
simply copying the filenames into each opcode. This is because in
theory opcodes created in one thread can be destroyed in another. I say
in theory because as far as I know the core code does not actually do
this. But we have tests that construct a perl, clone it, and then
destroy the original, and expect the copy to work just fine; this means
that opcodes constructed in the main thread will be destroyed in the
cloned thread. This in turn means that you can't put SV-derived
structures into the op-tree under threads, which is why we cannot use
the GV * strategy under threads.
As such, this code adds a new struct/type, RCPV, which is a refcounted
string using shared memory. This is implemented in such a way that code
that previously used a char * can continue to do so, as the refcounting
data is located at a specific offset before the char * pointer itself.
This also allows the len data to be embedded "into" the PV, which
allows us to expose macros to access the length of what is in theory a
null terminated string.
struct rcpv {
    UV refcount;
    STRLEN len;
    char pv[1];
};
typedef struct rcpv RCPV;
The struct is sized appropriately on creation in rcpv_new() so that the
pv member contains the full string plus a null byte. rcpv_new() then
returns a pointer to the pv member of the struct. Thus the refcount and
length are embedded at a predictable offset in front of the char *,
which means we do not have to change any types for members using this.
We provide three operations: rcpv_new(), rcpv_copy() and rcpv_free(),
which roughly correspond with newSVpv(), SvREFCNT_inc() and
SvREFCNT_dec(), and a handful of macros as well. We also expose
SAVERCPVFREE, which is similar to SAVEGENERICSV but operates on pv's
constructed with rcpv_new().
Currently I have not restricted use of this logic to threaded perls. We
simply do not use it in unthreaded perls, but I see no reason we
couldn't normalize the code to use this in both cases, except that the
GV case is possibly more efficient.
Note that rcpv_new() does NOT use a hash table to dedup strings. Two
calls to rcpv_new() with the same arguments will produce two distinct
pointers with their own refcount data.
Refcounting the cop_file data was Tony Cook's idea.
|
|
|
|
|
|
| |
Various code related to set_and_free_cop_warnings was terribly mangled
as far as whitespace goes. This patch cleans it up so it is readable and
correctly indented. Whitespace only changes.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix GH Issue #20396, try_yyparse() breaks Attribute::Handlers.
The reduced test case is:
perl -e'CHECK { eval "]" }'
which should not assert or segfault.
In c304acb49 we made it so that when doeval_compile() is executed and it
calls yyparse() inside of an eval, any exceptions that occur during the
parse are trapped by try_yyparse() so that execution returns to
doeval_compile(). This was done so that post-eval compilation cleanup
logic could be handled similarly regardless of whether Perl_croak() was
called or not. However, the logic to set up PL_restartop was not
adjusted accordingly.
The opcode that calls doeval_compile() sets up eval context data before
it calls doeval_compile(). This data includes the "retop", which is
used to return control after the eval should it die, and is set to be
the evaling opcode's op_next. When Perl_die_unwind() is called it sets
PL_restartop to be the "retop" of the current eval frame, and then does
a longjmp, on the assumption it will end up inside of a "run loop
enabled jump environment", where it restarts the run loop based on the
value of PL_restartop, zeroing it afterwards.
After c304acb49, however, a die inside of try_yyparse returns control
via die_unwind back to try_yyparse, which ignores PL_restartop and
leaves it set. Code then goes through the "compilation failed" branch
and execution returns to PL_restartop /anyway/, as PL_op hasn't changed
and pp_entereval returns control to PL_op->op_next, which is what we
pushed into the eval context anyway for the PL_restartop.
The end result of this, however, is that PL_restartop remains set when
we enter perl_run() for the first time. perl_run() is a "run loop
enabled jump environment" which uses run_body() to do its business,
such that when PL_restartop is NULL it executes the just-compiled body
of the program, and when PL_restartop is not null it assumes it must be
in the eval handler of an eval from the main body and it should resume
there. The leaked PL_restartop is thus executed instead of the main
program body, and things go horribly wrong.
This patch changes it so that when try_yyparse traps an exception we
restore PL_restartop back to its old value, and likewise its partner
PL_restartjmpenv. This is fine, as they have been set to the values
from the beginning of the eval frame which we are part of, which is now
over.
|
|
|
|
| |
This can be helpful when debugging issues related to phaser blocks and
eval.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a step in solving #20155.
The POSIX 2008 locale API introduces per-thread locales. But the
previous global locale system is retained, probably for backward
compatibility.
The POSIX 2008 interface causes memory to be malloc'd that needs to be
freed. In order to do this, the caller must first stop using that
memory, by switching to another locale. perl accomplishes this during
termination by switching to the global locale, which is always available
and doesn't need to be freed.
Perl has long assumed that all that was needed to switch threads was to
change out tTHX. That's because that structure was intended to hold all
the information for a given thread. But it turns out that this doesn't
work when some library independently holds information about the
thread's state. And there are now some libraries that do that.
What was happening in this case was that perl thought that it was
sufficient to switch tTHX to change to a different thread in order to do
the freeing of memory, and then used the POSIX 2008 function to change
to the global locale so that the memory could be safely freed. But the
POSIX 2008 function doesn't care about tTHX, and actually was typically
operating on a different thread, and so changed that thread to the
global locale instead of the intended thread. Often that was the
top-level thread, thread 0. That caused whatever thread it was to no
longer be in the expected locale, and to no longer be thread-safe with
regard to locales.
This commit causes locale_term(), which has always been called from the
actual terminating thread that POSIX 2008 knows about, to change to the
global locale and free the memory.
It also creates a new per-interpreter variable that effectively maps the
tTHX thread to the associated POSIX 2008 memory. During perl_destruct(),
it frees the memory this variable points to, instead of blindly assuming
the memory to free is the current tTHX thread's.
This fixes the symptoms associated with #20155, but doesn't solve the
whole problem. In general, a library that has independent thread status
needs to be updated to the new thread when Perl changes threads using
tTHX. Future commits will do this.
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some configurations require us to store the current locale for each
category. Prior to this commit, this was done in the array
PL_curlocales, with the entry for LC_ALL being in the highest element.
Future commits will need just the value for LC_ALL in some other
configurations, without needing the rest of the array. This commit
splits off the LC_ALL element into its own per-interpreter variable to
accommodate those. It always had to have special handling anyway, beyond
the rest of the array elements.
|
|
|
|
|
| |
Rather than recalculate this combined conditional, do it once in
perl.h.
|
|
|
|
|
|
| |
This conditional dates from when the rest of the conditions used
'HAS_foo' (from config.h) instead of USE_foo, which takes more things
into account.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We try to keep parsing after many types of errors, up to a (current)
maximum of 10 errors. Continuing after a semantic error (like an
undeclared variable) can be helpful, for instance showing a set of
common errors, but continuing after a syntax error isn't helpful most
of the time, as the internal state of the parser can get confused and
is not reliably restored between attempts. This can sometimes produce
completely bizarre errors which just obscure the true error, and has
resulted in security tickets being filed in the past.
This patch makes the parser stop after the first syntax error, while
preserving the current behavior for other errors. An error is considered
a syntax error if the error message from our internals is the literal
text "syntax error". This may not be a complete list of true syntax
errors; we can iterate on that in the future.
This fixes the segfaults reported in Issues #17397 and #16944, and
likely fixes other "segfault due to compiler continuation after syntax
error" bugs that we have on record, which has been a recurring issue
over the years.
|
|
|
|
|
|
| |
See https://github.com/Perl/perl5/issues/16418
This should fix the issue with Win32.
|
|
|
|
|
|
|
| |
This reverts commit 857320cbf85e762add18885ae8a197b5e0c21b69.
There were a lot of conflicts due to whitespace changes in the
intervening time. I manually reviewed the differences and merged them.
|
|
|
|
|
|
|
|
|
|
|
| |
When changing locales the new decimal point needs to be calculated.
This commit creates a new per-interpreter variable to save that
calculation, so it only has to be done when a new locale is set; prior
to this commit it was recalculated each time it was needed.
The calculation is still performed twice when the new locale is switched
into. But the redundant calculation will be removed a couple of commits
hence.
|
|
|
|
|
|
|
|
| |
This is now used as a cache of length 1 to avoid having to look up the
UTF-8ness as often.
This commit also skips doing S_newctype() if the new locale is the same
as the old.
|
|
|
|
|
|
| |
This is also initialized in locale.c
Spotted by Tony Cook.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This env var can be used to trigger a repeatable run of a script which
calls C<srand()> with no arguments, either explicitly or implicitly via
use of C<rand()> prior to calling srand(). This is implemented in such
a way that calls to C<srand()> with no arguments in forks or subthreads
(again, explicit or implicit) will receive their own seeds, but the
seeds they receive will be repeatable.
This is intended for debugging, perl development performance testing,
and running the test suite consistently. It is documented that the
exact seeds used to initialize the random state are unspecified, and
that they may change between releases or even builds. The only
guarantee provided is that the same perl executable will produce the
same results twice, all other things being equal. In practice and in
core testing we do expect consistency, but adding the tightest set of
restrictions on our commitments seemed sensible.
The env var is ignored when perl is run setuid or setgid, similarly to
the C<PERL_INTERNAL_RAND_SEED> env var.
|
|
|
|
|
|
|
|
|
|
|
|
| |
The POSIX 2008 API has an edge case in that the result of most of the
functions, when called with a global (as opposed to a per-thread)
locale, is undefined.
The duplocale() function is the exception; it will create a per-thread
locale containing the values copied from the global one.
This commit just calls duplocale() if needed, so the caller need not
concern itself with this possibility.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This function is rewritten to handle LC_ALL, and to make it easier to
add new checks.
There is also a change, which I think is an improvement, in that
everything starting with a \n is trimmed, instead of just a trailing \n.
A couple of calls to stdize_locale() are removed, as they are redundant:
they are called only as a result of Perl_setlocale() being called, and
that always ends up calling stdize_locale() early on.
The call to savepv() is also moved in a couple of cases to after the
result is known to not be NULL.
I originally had such a new check in mind, but it turned out that doing
it here didn't solve the problem, so this commit has been amended
(before ever being pushed) to not include that.
|
|
|
|
|
|
|
|
|
|
| |
Our hash function is compiled into the code which chooses to use it,
and is not linked in from the perl executable. That means that the
hash-related settings can affect binary compatibility.
I proposed making the hash function linkable, but people said there was
no need. But that means that the hash function config DOES affect
binary compatibility.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Most of the hash-based defines are no longer supported, so remove them.
Also, at the same time, introduce a new, simpler way to track which
hash function we are using, and add info about whether SBOX32 is in
use.
This removes the need to keep the list of supported hash functions in
two places, hv_func.h and perl.c. Instead hv_func.h drives the whole
process and perl.c just does what it is told.
Previously the way to control SBOX32 was to use a define with a value,
but our perl -V output currently doesn't support that, so this adds two
new defines, PERL_HASH_USE_SBOX32 and PERL_HASH_NO_SBOX32, which map to
the older PERL_HASH_USE_SBOX32_ALSO flag define (integer 1/0). Both are
still supported; this just makes everything more consistent.
This also includes minor doc changes to INSTALL to mention -Accflags as
the way to set these defines during the Configure process.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The IO and memory terminations need to come after other things. Add a
comment so that future maintainers won't make the mistakes I did.
Also refactor so that AmigaOS doesn't have a separate list to get out
of sync.
I suspect that the Amiga termination should be moved to earlier in the
sequence, but absent any evidence, I'm leaving it unchanged.
VMS destruction was missing a bunch of things, and I didn't see any
reason to have special handling, so I changed it to just use the
standard, presuming the discrepancies were due to changes in the
standard not getting propagated to VMS.
The common definitions are also moved to perl.c, which is the only
place they are used (including on CPAN). This makes them available in
all circumstances. Otherwise, the #ifdef's for including the relevant
header files only include one, so there would be undefined macros.
|
|
|
|
| |
Follow-up to 8fb00d81c65144.
|
|
|
| |
A recent contribution (PR #19853) added a description for the `-g` option to Perl's usage output. Unfortunately, it included a stray period in the description that wasn't caught before merging. (Note that none of the other descriptions have any periods.) This very trivial pull request removes that stray period.
|
|
|
|
|
| |
This was meant to be a part of the previous commit, but somehow got
omitted.
|
|
|
|
|
|
|
|
|
| |
A platform shouldn't be required to use the POSIX 2008 locale handling
functions just because they are present. Perhaps they are buggy. So a
separate define for using them was introduced, USE_POSIX_2008_LOCALE.
But until this commit there were cases that looked at the underlying
availability of the functions, not whether the configuration called for
their use.
|
|
|
|
|
|
|
| |
C reserves symbols beginning with underscores for its own use. This
commit moves the underscore so it is trailing, which is legal. The
symbols changed here are many of the ones in handy.h that have
significant uses outside it.
|
| |
|
| |
|
|
|
|
|
| |
Commit b95d23342a119c6677aa5ad786ca7d002c98bef2 would not compile under
some options.
|
|
|
|
|
| |
The purpose of PL_origenviron is to preserve the earliest known value
of environ, which is a global. All interpreters should share it.
|
|
|
|
|
| |
environ is managed by the C runtime, while we are using the system APIs
directly.
|
|
|
|
|
|
|
| |
Now environ isn't owned by Perl and calling setenv/putenv in XS code
will no longer result in memory corruption.
Fixes #19399
|
|
|
|
|
|
| |
This allows us to overwrite the original environ when $0 is being set,
which means that enabling PERL_USE_SAFE_PUTENV will no longer decrease
the maximum length of $0 on some platforms.
|
| |
|
| |
|
| |
|