| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
| |
A suggested way of avoiding the warning on nv1 != nv2
by replacing it with (nv1 < nv2 || nv1 > nv2) has too many issues
with NaN. [perl #120538].
I haven't found any other way of selectively disabling the warning,
so for now I'm just reverting the whole commit.
This reverts commit c279c4550ce59702722d0921739b1a1b92701b0d.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The gcc option -Wfloat-equal warns when two floating-point numbers
are directly compared for equality or inequality, the idea being that
this is usually a logic error, and that you should be checking that the
values are instead very near to each other.
perl on the other hand has lots of reasons to do a direct comparison.
Add two macros, NV_eq_nowarn(a,b) and NV_ne_nowarn(a,b),
that do the same as (a == b) and (a != b), but without the warnings.
They achieve this by instead doing (a < b) || (a > b).
Under gcc at least, this is optimised into the same code as the direct
comparison.
There are three places that I've left untouched, because they are handling
NaNs, and that gets a bit tricky. In particular (nv != nv) is a test for a
NaN, and replacing it with (< || >) creates signalling NaNs (whereas ==
and != create quiet NaNs).
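
A minimal C sketch of why the NaN test cannot be rewritten this way. The macro names MY_NE_NOWARN and MY_EQ_NOWARN are hypothetical stand-ins, not the definitions from the commit, and the equality form is assumed to be the negation of the inequality form.

```c
#include <math.h>

/* Hypothetical stand-ins for the macros described above. */
#define MY_NE_NOWARN(a, b) (((a) < (b)) || ((a) > (b)))
#define MY_EQ_NOWARN(a, b) (!MY_NE_NOWARN(a, b))

/* Direct inequality: true when either operand is a NaN. */
int ne_direct(double a, double b) { return a != b; }

/* Ordered-comparison form: false when either operand is a NaN,
 * because both < and > are false for unordered operands. */
int ne_nowarn(double a, double b) { return MY_NE_NOWARN(a, b); }

/* The classic NaN self-test; it only works with the direct form. */
int is_nan_direct(double nv) { return ne_direct(nv, nv); }
int is_nan_nowarn(double nv) { return ne_nowarn(nv, nv); }
```

For ordinary values both forms agree. For a NaN, is_nan_direct returns 1 while is_nan_nowarn returns 0, so the two forms are not interchangeable in NaN-handling code.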
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The way CORE:: was handled in the lexer was convoluted.
CORE was treated initially as a keyword, with exceptions in the lexer
to make it behave correctly. If it turned out not to be followed
by ::, then the lexer would fall back to treating it as a bareword
or sub name.
Before even checking for a keyword, the lexer looks for :: and goes
to the bareword/sub code. But it made a special exception there
for CORE::.
In the end, treating CORE as a keyword recognized by the keyword()
function requires more special cases than simply special-casing CORE::
in toke.c.
This fixes the lexical CORE sub bug, while reducing the total number
of lines.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As of v5.16.0-81-g234df27, SvAMAGIC has only meant potentially
overloaded. When method changes occur, the flag is turned on. When
overloading tables are calculated (the first time overloading is used),
the flag is turned off if it turns out there is no overloading.
At the time I did that, I assumed that all uses of SvAMAGIC were to
avoid the inefficient code path for non-overloaded objects. What I
did not notice at the time was that SvAMAGIC is used in pp_bless to
determine whether an object used as a class name should be exempt from
‘Attempt to bless into a reference’.
Hence, the bizarre result:
$ ./perl -Ilib -e 'sub foo{} bless [], bless []'
$ ./perl -Ilib -e 'bless [], bless []'
Attempt to bless into a reference at -e line 1.
This commit makes both die consistently, as they did in 5.16.
|
|
|
|
|
| |
There is no reason tied variables (or otherwise magical variables like $/)
should be exempt from the ‘Attempt to bless into a reference’ error.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When PERL_DONT_CREATE_GVSV is defined, perl generally does not vivify
the scalar slot in every GV. But it hides that implementation detail
by vivifying it when *foo{SCALAR} is accessed.
undef(*foo) was one exception to this. It vivified the scalar in
the scalar slot regardless of whether PERL_DONT_CREATE_GVSV was
defined.
Until recently, it was not safe to remove this exception, because a
typeglob with no scalar could be a candidate for downgrading (see
gv.c:gv_try_downgrade), causing global pointers like PL_DBgv to point
to freed SVs. Recent commits have fixed all those cases, so this
is now safe.
|
|
|
|
| |
These get rid of some "possible loss of data" warnings
|
| |
|
| |
|
| |
|
|
|
|
|
|
| |
The kvaslice operator implements the %a[0,2,4] syntax, which
results in a list of index/value pairs. It is implemented for
consistency with the "key/value hash slice" operator.
|
|
|
|
|
|
| |
The kvhslice operator implements the %h{1,2,3,4} syntax, which
returns a list of key/value pairs rather than just values
(as regular slices do).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Based on Yves's random branch work.
This version makes the new random number visible to external modules,
for example, List::Util's XS shuffle() implementation.
I've also added a 64-bit implementation when HAS_QUAD is true; this
should be significantly faster, even on 32-bit CPUs. This is intended to
produce exactly the same sequence as the original implementation.
The original version of this commit retained the "freebsd" name from
Yves's original work for the function and data structure names. I've
removed "freebsd" from most function names so the name isn't an issue
if we choose to replace the implementation.
|
|
|
|
|
|
|
| |
This removes a macro not yet even in a development release, and splits
its calls into two classes: those where the input is a byte; and those
where it can be any unsigned integer. The byte implementation avoids a
function call on EBCDIC platforms.
|
|
|
|
|
|
|
| |
Commit ce0d59f changed AVs to use NULLs for nonexistent elements.
pp_splice needs to take that into account and avoid pushing NULLs on
to the stack.
|
|
|
|
| |
Make a ternary operation clearer
|
|
|
|
|
|
|
|
| |
Now that the Unicode tables are stored in native format, we shouldn't be
doing remapping.
Note that this assumes that the Latin1 casing tables are stored in
native order; not all of this has been done yet.
|
|
|
|
|
|
|
|
| |
The conversion from UTF-8 to code point should generally be to the
native code point. This adds a macro to do that, and converts the
core calls to the existing macro to use the new one instead. The old
macro is retained for possible backwards compatibility, though it
probably should be deprecated.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The value of pos() is stored as a byte offset. If it is stored on a
tied variable or a reference (or glob), then the stringification could
change, resulting in pos() now pointing to a different character
offset or pointing to the middle of a character:
$ ./perl -Ilib -le '$x = bless [], chr 256; pos $x=1; bless $x, a; print pos $x'
2
$ ./perl -Ilib -le '$x = bless [], chr 256; pos $x=1; bless $x, "\x{1000}"; print pos $x'
Malformed UTF-8 character (unexpected end of string) in match position at -e line 1.
0
So pos() should be stored as a character offset.
The regular expression engine expects byte offsets always, so allow it
to store bytes when possible (a pure non-magical string) but use
characters otherwise.
This does result in more complexity than I should like, but the
alternative (always storing a character offset) would slow down regular
expressions, which is a big no-no.
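
To illustrate why a byte offset is fragile when the underlying string can change, here is a small C sketch. It is not Perl's code; the helper names are hypothetical and the input is assumed to be well-formed UTF-8.

```c
#include <stddef.h>

/* Returns 1 if byte offset off falls on the start of a UTF-8 character
 * (or exactly at the end of the buffer), 0 if it points into the middle
 * of a character. Assumes s holds well-formed UTF-8. */
int on_char_boundary(const unsigned char *s, size_t len, size_t off)
{
    if (off >= len)
        return off == len;
    return (s[off] & 0xC0) != 0x80;   /* continuation bytes are 10xxxxxx */
}

/* Converts a byte offset to a character offset by counting the
 * character start bytes that precede it. */
size_t byte_to_char_offset(const unsigned char *s, size_t len, size_t off)
{
    size_t chars = 0;
    for (size_t i = 0; i < off && i < len; i++)
        if ((s[i] & 0xC0) != 0x80)
            chars++;
    return chars;
}
```

With the two-byte encoding of U+0100 followed by an ASCII letter, byte offset 1 is inside the first character, and byte offset 2 corresponds to character offset 1.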
|
|
|
|
|
|
|
| |
This is something that ce0d59fdd1c missed.
This will probably fix most of the modules mentioned in
ticket #119433.
|
|
|
|
|
|
|
|
|
|
| |
Make the array interface 64-bit safe by using SSize_t instead of I32
for array indices.
This is based on a patch by Chip Salzenberg.
This completes what the previous commit began when it changed
av_extend.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit fixes bug #7508 and provides the groundwork for fixing
several other bugs.
Elements of @_ are aliased to the arguments, so that \$_[0] within
sub foo will reference the same scalar as \$x if the sub is called
as foo($x).
&PL_sv_undef (the global read-only undef scalar returned by the
‘undef’ operator itself) was being used to represent nonexistent
array elements. So the pattern would be broken for foo(undef), where
\$_[0] would vivify a new $_[0] element, treating it as having been
nonexistent.
This also causes other problems with constants under ithreads
(#105906) and causes a pending fix for another bug (#118691) to
trigger this bug.
This commit changes the internals to use a null pointer to represent a
nonexistent element.
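
The difference between the two representations can be sketched in C as follows. The Scalar and Array types and the function names are hypothetical stand-ins for illustration only, not the perl internals.

```c
#include <stddef.h>

/* Hypothetical stand-in for an SV. */
typedef struct { int defined; int value; } Scalar;

/* One shared "undef" scalar, playing the role of &PL_sv_undef. */
static Scalar shared_undef = { 0, 0 };

typedef struct { Scalar **slots; size_t len; } Array;

Scalar *get_shared_undef(void) { return &shared_undef; }

/* Old scheme: the shared undef doubles as a "no element" marker, so a
 * slot that really holds the shared undef looks nonexistent. */
int exists_old(const Array *av, size_t i)
{
    return i < av->len
        && av->slots[i] != NULL
        && av->slots[i] != &shared_undef;
}

/* New scheme: only a null pointer means "no element". */
int exists_new(const Array *av, size_t i)
{
    return i < av->len && av->slots[i] != NULL;
}
```

Under the old check, a slot that was explicitly given the shared undef is indistinguishable from an empty slot; under the new check it counts as an existing element.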
This requires that Storable be changed to account for it. Also,
IPC::Open3 was relying on the bug. So this commit patches
both modules.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes a problem with undefined values returned from XSUBs not
producing such warnings.
The default typemap for XSUBs uses the target of the entersub call (in
the caller’s pad) to return the converted value, instead of having to
allocate a new SV for that.
So, for example, a function returning char* will cause that char* to
be assigned to the target via sv_setpv. Then the target is returned.
As a special case, a NULL return from a char*-returning function will
produce an undef return value. This undef return value was not
triggering an uninitialized warning.
All targets are marked PADTMP, and anything marked PADTMP is exempt
from uninitialized warnings in some code paths, but not others.
This goes all the way back to 91bba347, which suppressed the warning
with only a hint at why (something to do with bitwise ops warning
inappropriately). I think it was to make ~undef exempt. But a1afd104
stopped it from being exempt.
Only a few pieces of code were relying on this exemption, and it was
hiding bugs, too. The last few commits have addressed those, so kiss
this exemption good-bye!
pp_reverse had a workaround to force an uninit warning (since
1e21d011c), so remove the workaround to avoid a double uninit warning.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 91bba347 made targets exempt from uninit warnings. The
commit message only says it has something to do with fixing warnings on
bitwise ops.
Commit a1afd1046 worked around that by doing an extra
stringification before assigning to the target, to produce an uninit warning
on purpose.
As far as I can tell, the only purpose of 91bba347 was to avoid the
warning for ~undef. (I haven’t actually tried building the commit
before that to confirm.)
In any case, the uninit warning has been long tested for and is now
expected behaviour.
Since I am about to remove the uninit warning exemption for targets,
stop relying on that.
This speeds up the code slightly, as we avoid a double
stringification.
|
| |
|
|
|
|
|
| |
So that \(("$a")x2) will give two references to the same scalar, the
way that \(($a)x2) does.
|
|
|
|
|
|
| |
So that slices that reference the same value multiple times (such as
(...)[1,1]) will exhibit referential identity (\(...)[1,1] will return
two references to the same scalar).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is part of #116907, too. It also fixes #72924 as a side effect;
the next commit will explain.
The value of pos($foo) was being stored as an I32, not allowing values
above I32_MAX. Change it to SSize_t (the signed equivalent of size_t,
representing the maximum string length the OS/compiler supports).
This is accomplished by changing the size of the entry in the magic
struct, which is the simplest fix.
Other parts of the code base can benefit from this, too.
We actually cast the pos value to STRLEN (size_t) when reading
it, to allow *very* long strings. Only the value -1 is special,
meaning there is no pos. So the maximum supported offset is
2**(8*sizeof(size_t))-2, that is, the largest size_t value minus one.
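
The sentinel arrangement described above can be sketched in C. The type and function names here are hypothetical; ptrdiff_t merely stands in for a signed type such as SSize_t.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a signed length field. */
typedef ptrdiff_t stored_pos_t;

/* The single reserved value: "there is no pos". */
#define NO_POS ((stored_pos_t)-1)

int has_pos(stored_pos_t stored) { return stored != NO_POS; }

/* Reads the stored value as an unsigned offset, so that values with
 * the top bit set are treated as very large offsets. */
size_t pos_offset(stored_pos_t stored) { return (size_t)stored; }

/* Largest offset that does not collide with the sentinel. */
size_t max_pos_offset(void) { return SIZE_MAX - 1; }
```

Because only -1 is reserved, the stored value -2 reads back as the largest usable offset.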
The regexp engine itself still cannot handle large strings, so being
able to set pos to large values is useless right now. This is but one
piece in a larger puzzle.
Changing the size of mg->mg_len also requires that
Perl_hv_placeholders_p change its type. This function
should in fact not be in the API, since it exists
solely to implement the HvPLACEHOLDERS macro. See
<https://rt.perl.org/rt3/Ticket/Display.html?id=116907#txn-1237043>.
|
|
|
|
|
|
|
|
|
| |
When elements of @_ refer to nonexistent hash or array elements, then
the magic scalar in $_[0] delegates all set/get actions to the element
it represents, vivifying it if needed.
pos($_[0]), however, was not delegating the value to the element, but
storing it on the magical ‘deferred element’ scalar.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In a nutshell, very few PADHV and PADAV OPs are executed that have
the OPpLVAL_INTRO flag set. To wit, "my %h" does whereas "$h{foo}" and
similar (also "$h{foo} = 1") do not. Also, traditional lexicals greatly
outnumber state variables, so pessimize "state" slightly.
This was determined with a nifty new trick. With a Perl compiled with
-DPERL_TRACE_OPS, we get a summary of all executed op counts by type at
the end of the program execution. The above was figured out (naively) by
adding the following:
--- a/dump.c
+++ b/dump.c
@@ -2215,6 +2215,8 @@ Perl_runops_debug(pTHX)
do {
#ifdef PERL_TRACE_OPS
++PL_op_exec_cnt[PL_op->op_type];
+ if (PL_op->op_type == OP_PADHV && PL_op->op_private & OPpLVAL_INTRO)
+ ++PL_op_exec_cnt[OP_max+1];
#endif
if (PL_debug) {
if (PL_watchaddr && (*PL_watchaddr != PL_watchok))
Which adds a special case (OP_max+1) to the OP report. Dividing that
count by the total PADHV count gives a diminishingly small percentage.
|
| |
|
|
|
|
|
|
| |
Previous commit 8d455b9f99c1046e969462419b0eb5b8bf740a47 was a partial
lie. This commit fixes it up not to be a lie and to avoid mortalizing
the HV.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
pp_anonhash can be used to construct both bare hashes and hashrefs. In
the hashref case, it used to create an HV and mortalize it. Then it went
through the parameters on the stack and copied them into the hashref.
This can throw exceptions which would make the HV leak if it hadn't been
mortalized.
After potentially copying the arguments, pp_anonhash would then in the
hashREF case increment the refcount on the HV *again*, create an RV for
the HV, and mortalize that RV before pushing it on the stack.
Instead, we can get away with constructing the HV and only mortalizing
it if there's no mortalizable ref to clean up if there's an exception.
This should remove a fair fraction of work in the common case of empty {}
hash ref construction.
|
| |
|
|
|
|
|
|
| |
Perl_magic_methcall is not public API, so there is no
need to add another function and we can just change
the function's arguments.
|
|
|
|
|
|
|
|
|
| |
Can be used when it's known that the method name has no
package part, just the method name.
With the flag set, an SV with a precomputed hash value is used
and pp_method_named is called instead of pp_method.
Method lookup is faster.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This format string allows char*s to be interpolated with the
utf8ness and length specified as well, avoiding the need to create
extra SVs:
Perl_croak(aTHX_ "Couldn't twiggle the twoggle in \"%"UTF8f"\"",
UTF8fARG(is_utf8, len, s));
This is the second attempt.
I screwed up in commits 1c8b67b38f0a5 and b3e714770ee1 because
I didn’t really understand how varargs functions receive their
arguments.
They are like structs, in that different members can be different
sizes. So therefore both ends--the caller and the called--*must* get
the casts right, or the data will be corrupted.
The main mistake I made was to use %u in the format for the first
argument and then retrieve it as UV (a simple typo, I meant unsigned
int or U32--I don’t remember).
To be on the safe side, I added a UTF8fARG macro (after SVfARG), which
(unlike SVfARG) takes three arguments and casts them explicitly,
making it much harder to get this wrong at call sites.
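
The idea of casting at the call site so that caller and callee agree on types can be sketched in plain C. COUNTED_STR_ARG and copy_counted are hypothetical names for illustration; they are not the UTF8fARG macro or any perl function.

```c
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper macro: casts each argument to the exact type the
 * callee reads back with va_arg. */
#define COUNTED_STR_ARG(is_utf8, len, ptr) \
    (int)(is_utf8), (size_t)(len), (const char *)(ptr)

/* Reads (int, size_t, const char *) from the variable arguments and
 * copies at most len bytes into out, always NUL-terminating.
 * Returns the number of bytes copied. */
size_t copy_counted(char *out, size_t outsz, ...)
{
    va_list ap;
    va_start(ap, outsz);
    int is_utf8 = va_arg(ap, int);          /* must match the (int) cast */
    size_t len = va_arg(ap, size_t);        /* must match the (size_t) cast */
    const char *s = va_arg(ap, const char *);
    va_end(ap);

    (void)is_utf8;                          /* unused in this sketch */
    if (outsz == 0)
        return 0;
    if (len > outsz - 1)
        len = outsz - 1;
    memcpy(out, s, len);
    out[len] = '\0';
    return len;
}
```

A call such as copy_counted(buf, sizeof buf, COUNTED_STR_ARG(1, 3, "abcdef")) passes arguments of exactly the types that va_arg retrieves, whatever the types of the original expressions were.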
|
|
|
|
|
|
|
|
| |
This reverts commit acc19697c67fa63c10e07491b670a26c48f4175f.
This and the other UTF8f patch are causing significant problems on some
configurations on 32-bit platforms. We've decided to revert them until
they can be resubmitted after the kinks get ironed out.
|
| |
|
|
|
|
| |
This saves having to allocate as many SVs.
|
|
|
|
|
|
|
| |
The maximum number of bytes in the result of changing the case of a
single UTF-8 character is given by UTF8_MAXBYTES_CASE. In one of these
arrays, space is saved by using the proper #define; in the other there is
no change except on EBCDIC platforms.
|
|
|
|
|
|
| |
There is no way a null can get on to the stack here. The only piece
of code that puts nulls on the stack is pp_coreargs, which doesn’t do
that for functions with a _ prototype.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This extra protection was added in commit 44a8e56aa0 or d976ac8220
(I’m a little confused as to the history here).
At the time, av_make really could reallocate the stack. That changed
in commit efaf36747029.
Now there is no need for dORIGMARK and ORIGMARK. (ORIGMARK is more
expensive than MARK since it looks up PL_stack_base and recalculates
where we are on the stack, based on a recorded offset. MARK simply
stores a pointer to where we want to be.)
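
The trade-off between a pointer mark and an offset mark can be sketched in C with a hypothetical growable stack. The Stack type and function names are assumptions for illustration and do not mirror the perl macros.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical value stack that may be reallocated when it grows. */
typedef struct { int *base; size_t cap; } Stack;

/* Pointer-style mark: cheap to use, but dangling if the stack moves. */
int *mark_as_pointer(Stack *st, size_t index) { return st->base + index; }

/* Offset-style mark: must be added to the base again on every use, but
 * stays valid after the stack is reallocated. */
size_t mark_as_offset(Stack *st, int *p) { return (size_t)(p - st->base); }
int *resolve_offset(Stack *st, size_t off) { return st->base + off; }

/* Grows the stack; the base may move. Returns 0 on success, -1 on
 * allocation failure. */
int stack_grow(Stack *st, size_t newcap)
{
    int *p = realloc(st->base, newcap * sizeof *p);
    if (!p)
        return -1;
    st->base = p;
    st->cap = newcap;
    return 0;
}
```

When nothing between taking the mark and using it can reallocate the stack, the plain pointer is sufficient and avoids the extra base lookup.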
|
|
|
|
|
|
| |
If the current stash has been freed, bless() with one argument will
cause a crash when the object’s ‘stash’ is accessed. Simply
disallowing this is the easiest fix.
|
|
|
|
|
| |
open’s handle vivification could crash if the current stash was freed,
so check before passing a freed stash to gv_init.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It was only leaking when the env var did not already exist.
The code in question in pp.c:S_do_delete_local was calling hv_delete
to delete the element, which autovivifies, deletes, and returns a
magical mortal for magical hashes. It was assuming that if a value was
returned the element must have existed, so it was calling SvREFCNT_inc
on the returned mortal to de-mortalize it (since it has to be
restored). Whether the element had existed previously was already
recorded in a bool named ‘preeminent’ (strange name). This variable
should be checked before the SvREFCNT_inc.
I found the same bug in the array code path, potentially affecting
@- and @+, so I fixed it. But I didn’t write a test, as that would
involve a custom regexp engine.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The way that the regex engine knows that the match string is utf8 is
currently a complete mess. It's partially signalled by the utf8 flag of
the passed SV, but also by the RXf_MATCH_UTF8 flag in the regex itself,
and the value of PL_reg_match_utf8.
Currently all the callers of the engine (such as pp_match, pp_split etc)
initially use RX_MATCH_UTF8_set() before calling the engine. This sets both
the RXf_MATCH_UTF8 flag on the regex, and PL_reg_match_utf8.
Then the two entry points to the engine (regexec_flags() and
re_intuit_start()) initially repeat the RX_MATCH_UTF8_set()
themselves.
Remove the usage of RX_MATCH_UTF8_set() by the callers of the engine,
and instead just rely on the engine to do it.
Also, remove the "secret" setting of PL_reg_match_utf8 by
RX_MATCH_UTF8_set(), and do it explicitly.
This is a prelude to eliminating PL_reg_match_utf8.
|
|
|
|
|
|
| |
I think it's clearer to use Copy. When I wrote this custom macro, we
didn't have the infrastructure to generate a UTF-8 encoded string at
compile time.
|
|
|
|
|
| |
The previous commit added macros to do some case changing. This
commit uses them in the core, where appropriate.
|
| |
|