| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
It's been deprecated and a no-op since 5.10.
Move :unique test into it own file so it can be skipped separately
Merely parsing an unknown attribute fails, so the skip has to happen
at BEGIN time.
|
|
|
|
| |
This has been issuing a deprecation warning since perl 5.000.
|
| |
|
|
|
|
|
|
| |
This reverts commit 05f97de032fe95cabe8c9f6d6c0a5897b1616194.
Accidentally pushed while waiting for blead-unfreeze.
|
|
|
|
|
|
| |
This reverts commit a3bf60fbb1f05cd2c69d4ff0a2ef99537afdaba7.
Accidentally pushed work pending unfreeze.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds support for three new hash functions: StadtX, Zaphod32 and SBOX,
and reworks some of our hash internals infrastructure to do so.
SBOX is special in that it is designed to be used in conjuction with any
other hash function for hashing short strings very efficiently and very
securely. It features compile time options on how much memory and startup
time are traded off to control the length of keys that SBOX hashes.
This also adds support for caching the hash values of single byte characters
which can be used in conjuction with any other hash, including SBOX, although
SBOX itself is as fast as the lookup cache, so typically you wouldnt use both
at the same time.
This also *removes* support for Jenkins One-At-A-Time. It has served us
well, but it's day is done.
This patch adds three new files: zaphod32_hash.h, stadtx_hash.h,
sbox32_hash.h
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Prior to this patch we resized hashes when after inserting a key
the load factor of the hash reached 1 (load factor= keys / buckets).
This patch makes two subtle changes to this logic:
1. We split only after inserting a key into an utilized bucket,
2. and the maximum load factor exceeds 0.667
The intent and effect of this change is to increase our hash tables
efficiency. Reducing the maximum load factor 0.667 means that we should
have much less keys in collision overall, at the cost of some unutilized
space (2/3rds was chosen as it is easier to calculate than 0.7). On the
other hand, only splitting after a collision means in theory that we execute
the "final split" less often. Additionally, insertin a key into an unused
bucket increases the efficiency of the hash, without changing the worst
case.[1] In other words without increasing collisions we use the space
in our hashes more efficiently.
A side effect of this hash is that the size of a hash is more sensitive
to key insert order. A set of keys with some collisions might be one
size if those collisions were encountered early, or another if they were
encountered later. Assuming random distribution of hash values about
50% of hashes should be smaller than they would be without this rule.
The two changes complement each other, as changing the maximum load
factor decreases the chance of a collision, but changing to only split
after a collision means that we won't waste as much of that space we
might.
[1] Since I personally didnt find this obvious at first here is my
explanation:
The old behavior was that we doubled the number of buckets when the
number of keys in the hash matched that of buckets. So on inserting
the Kth key into a K bucket hash, we would double the number of
buckets. Thus the worse case prior to this patch was a hash
containing K-1 keys which all hash into a single bucket, and the post
split worst case behavior would be having K items in a single bucket
of a hash with 2*K buckets total.
The new behavior says that we double the size of the hash once inserting
an item into an occupied bucket and after doing so we exceeed the maximum
load factor (leave aside the change in maximum load factor in this patch).
If we insert into an occupied bucket (including the worse case bucket) then
we trigger a key split, and we have exactly the same cases as before.
If we insert into an empty bucket then we now have a worst case of K-1 items
in one bucket, and 1 item in another, in a hash with K buckets, thus the
worst case has not changed.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
RT #131098
The helpful "you may need to install" hint which 'require' sometimes
includes in its error message these days (split across multiple lines for
clarity):
$ perl -e'require Foo::Bar'
Can't locate Foo/Bar.pm in @INC
(you may need to install the Foo::Bar module)
(@INC contains: ... ) at ...
is a bit over-enthusiastic when the pathname hasn't actually been derived
from a module name:
$ perl -e'require "Foo.+/%#Bar.pm"'
Can't locate Foo.+%#Bar.pm in @INC
(you may need to install the Foo.+::%#Bar module)
(@INC contains: ... ) at ...
This commit changes things so that the hint message is only emitted if the
reverse-mapped module name is legal as a bareword:
$ perl -e'require "Foo.+/%#Bar.pm"'
Can't locate Foo.+%#Bar.pm in @INC
(@INC contains: ... ) at ...
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
RT #131098
5.8.0 introduced a change which as an inadvertent side-effect caused
this @INC-related require croak message:
Can't locate foo in @INC (@INC contains: ...) at ...
to be emitted even when foo is a non-searchable pathname (like /foo or
./foo) and @INC isn't used.
This commit reverts the error message in these cases to be the simple
Can't locate foo at ...
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
See [perl #130497]
GNU Autoconf depends on Perl, and will not work on Blead (and the
forthcoming Perl 5.26), due to a single unescaped '{', that has
previously been deprecated and is now fatal. A patch for it has been in
the Autoconf repository since early 2013, but there has not been a
release since before then.
Because this is depended on by so much code, and because it is simpler
than trying to revert to making the fatality merely deprecated, this
patch simply changes perl to not die when compiled with the exact
pattern that trips up Autoconf. Thus Autoconf can continue to work, but
any other patterns that use the now illegal construct will continue to
die. If other code uses the exact pattern, they too will not die, but
the deprecation message continues to get raised. The use of the left
brace in this particular pattern is not one where we envision using the
construct to mean something else, so a deprecation is suitable for the
foreseeable future.
|
|
|
|
|
|
| |
These are some cases which weren't picked up by the test suite
[ This is a subset of a set of fixes originally submitted by James as
2 commits - DAPM]
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently TestInit.pm adds '.' to @INC (except if running under taint).
Since *all* tests run from the perl core are invoked as
perl -MTestInit[=arg,arg,..] some/test.t
this means that all test scripts (including those under cpan/ etc) are
excuted with dot present, regardless of the settings of
$PERL_USE_UNSAFE_INC and -Ddefault_inc_excludes_dot.
This commit changes it so that:
1) TestInit.pm transparently passes though a trailing dot in @INC
if present (so it now honours $PERL_USE_UNSAFE_INC and
-Ddefault_inc_excludes_dot)
2) Adds a 'DOT' arg (e.g. -MTestInit=DOT) which unconditionally adds '.';
3) Updates t/TEST so that it (and t/harness which requires t/TEST)
have a whitelist of cpan/ modules which need '.'; test scripts for these
are invoked with -MTestInit=DOT.
4) Removes the $PERL_USE_UNSAFE_INC unsetting in t/TEST and t/harness;
now that environmant variable is passed unchanged to all perl processes
involved in running the test suite.
As of this commit, lots of tests will fail on a dotless perl build; the
next few commits will fix up any tests scripts and non cpan/ distributions
which relied on dot being present.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
RT #131083
Recent commits v5.25.10-81-gd69c430 and v5.25.10-82-g67dd6f3 added
out-of-range/overflow checks for the offset arg of vec(). However in
lvalue context, these croaks now happen before the SVt_PVLV was created,
rather than when its set magic was called. This means that something like
sub f { $x = $_[0] }
f(vec($s, -1, 8))
now croaks even though the out-of-range value never ended up getting used
in lvalue context.
This commit fixes things by, in pp_vec(), rather than croaking, just set
flag bits in LvFLAGS() to indicate that the offset is -Ve / out-of-range.
Then in Perl_magic_getvec(), return 0 if these flags are set, and in
Perl_magic_setvec() croak with a suitable error.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The warning
do "%s" failed, '.' is no longer in @INC
was added in the previous devel release; this commit enhances it to say
do "%s" failed, '.' is no longer in @INC; did you mean do "./%s"
and updates the relevant docs.
See http://nntp.perl.org/group/perl.perl5.porters/243788.
|
| |
|
|
|
|
|
| |
It intereferes with tests of @INC contents, and all core tests must work
without it.
|
|
|
|
| |
follow-up to the previous commit's reverting of base.pm @INC changes.
|
| |
|
|
|
|
|
|
|
| |
RT #131033
A recently added test checked for a memory wrap condition, which won't
happen if memory wrap checking is disabled.
|
|
|
|
|
|
| |
By running t/porting/customized.t --regen. This should have been done as
final part of CPAN-sync for IO-Compress et al in commit
5173674b1cb46b59301b559929904bc67fa15056 on Mar 10 2017.
|
|
|
|
|
| |
Other tests are run from t/ and so they add ".." to @INC, but
this test runs from the main dir, so it needs to add ".".
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
RT #130915
In something like
vec($str, $bignum, 16)
(i.e. where $str is treated as a series of 16-bit words), Perl_do_vecget()
and Perl_do_vecset() end up doing calculations equivalent to:
$start = $bignum*2;
$end = $start + 2;
Currently both these calculations can wrap if $bignum is near the maximum
value of a STRLEN (the previous commit already fixed cases for $bignum >
max(STRLEN)).
So this commit makes them check for potential overflow before doing such
calculations.
It also takes account of the fact that the previous commit changed the
type of offset from signed to unsigned.
Finally, it also adds some tests to t/op/vec.t for where the 'word'
overlaps the end of the string, for example
$x = vec("ab", 0, 64)
should behave the same as:
$x = vec("ab\0\0\0\0\0\0", 0, 64)
This uses a separate code path, and I couldn't see any tests for it.
This commit is based on an earlier proposed fix by Aaron Crane.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
... and fix up its caller, pp_vec().
This is part of a fix for RT #130915.
pp_vec() is responsible for extracting out the offset and size from SVs on
the stack, and then calling do_vecget() with those values. (Sometimes the
call is done indirectly by storing the offset in the LvTARGOFF() field of
a SVt_PVLV, then later Perl_magic_getvec() passes the LvTARGOFF() value to
do_vecget().)
Now SvCUR, SvLEN and LvTARGOFF are all of type STRLEN (a.k.a Size_t),
while the offset arg of do_vecget() is of type SSize_t (i.e. there's a
signed/unsigned mismatch). It makes more sense to make the arg of type
STRLEN. So that is what this commit does.
At the same time this commit fixes up pp_vec() to handle all the
possibilities where the offset value can't fit into a STRLEN, returning 0
or croaking accordingly, so that do_vecget() is never called with a
truncated or wrapped offset.
The next commit will fix up the internals of do_vecget() and do_vecset(),
which have to worry about offset*(2^n) wrapping or being > SvCUR().
This commit is based on an earlier proposed fix by Aaron Crane.
|
|
|
|
|
|
| |
For -DPERL_GLOBAL_STRUCT_PRIVATE builds, it checks that there aren't any
global symbols. Make it display the symbols if it finds any. It already
does so for bss; this commit adds data and common diag()s.
|
| |
|
|
|
|
| |
the message and warning category may need adjustment
|
|
|
|
|
|
|
| |
and provide a makefile option of the same name to control it. It is set to
'define' by default, but can be commented out or set to 'undef' in the
usual manner to switch it off and return to the legacy default behaviour
of including '.' at the end of @INC.
|
| |
|
|
|
|
|
| |
d7186add added a runperl() test that breaks command line length limits for
VMS. Switch to fresh_perl() instead, so the prog is put in a file for us.
|
| |
|
|
|
|
|
|
| |
To keep t/porting/test_bootstrap.t happy, we need to declare the new test file
as an exception in that it says 'require test.pl' which tests in t/comp/ are
normally not permitted to do.
|
|
|
|
|
| |
Sometimes it's useful to have test.pl around, but it seems inappropriate
to pollute the existing t/comp/parser.t with that.
|
|
|
|
| |
Originally noted as a scoping issue by Andy Lester.
|
|
|
|
|
|
|
| |
If someone is running on a slow system, and they want fold_grind to
complete, they can now set an environment variable based on the relative
slowness of their system, that will be factored in to the length of the
timer.
|
| |
|
|
|
|
| |
(the previous commit forgot to)
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
My recent commit v5.25.9-89-g43dbb3c added an assertion to the effect that
in
@{ local $a[0]{b}[1] } = 1;
the 'local' could only appear at the end of a block and so asserted that the
next op should be OP_LEAVE. However, this:
@{1, local $a[0]{b}[1] } = 1;
inserts an OP_LIST before the OP_LEAVE.
Improve the assert to cope with any number of OP_NULL or OP_LISTs before
the OP_LEAVE.
|
|
|
|
|
|
|
|
|
|
| |
RT #130703
My recent commit v5.25.9-77-g90c3aa0 attempted to simplify buffer growth
in pp_formline(), but missed the operators which append data to
PL_formtarget *without* doing 'goto append'. These ops either append a
fieldsize's worth of bytes, or a \n (FF_NEWLINE). So grow by fieldsize
whenever we fetch something new, and for each FF_NEWLINE.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In something like
"a1b2c3d4..." =~ /(?:(\w)(\d))*..../
A WHILEM state is pushed for each iteration of the '*'. Part of this
state saving includes the previous indices for each of the captures within
the body of the thing being iterated over. So we save the following sets of
values for $1,$2:
()()
(a)(1)
(b)(2)
(c)(3)
(d)(4)
Then if at any point we backtrack, we can undo one or more iterations and
restore the older values of $1,$2.
However, when the match is non-greedy, as in A*?B, then on failure of B
and backtracking we attempt *more* A's rather than removing some already
matched A's. So there's never any need to save all the current paren state
for each iteration.
This eliminates a lot of per-iteration overhead for minimal WHILEMs and
makes the following run about 25% faster:
$s = ("a" x 1000);
$s =~ /^(?:(.)(.))*?[XY]/ for 1..10_000;
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
RT #126697
In a regex, after executing a (?{...}) code block, if we fail and
backtrack over the codeblock, we're supposed to unwind the savestack, so
that for any example any local()s within the code block are undone.
It turns out that a backtracking state isn't pushed for (?{...}), only
for postponed evals ( i.e. (??{...})). This means that it relies on one
of the earlier backtracking states to clear the savestack on its behalf.
This can't always be relied upon, and the ticket above contains code where
this falls down; in particular:
'ABC' =~ m{
\A
(?:
(?: AB | A | BC )
(?{
local $count = $count + 1;
print "! count=$count; ; pos=${\pos}\n";
})
)*
\z
}x
Here we end up relying on TRIE_next to do the cleaning up, but TRIE_next
doesn't, since there's nothing it would be responsible for that needs
cleaning up.
The solution to this is to push a backtrack state for every (?{...}) as
well as every (??{...}). The sole job of that state is to do a
LEAVE_SCOPE(ST.lastcp).
The existing backtrack state EVAL_AB has been renamed EVAL_postponed_AB
to make it clear it's only used on postponed /(??{A})B/ regexes, and a new
state has been added, EVAL_B, which is only called when backtracking after
failing something in the B in /(?{...})B/.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
RT #129881 heap-buffer-overflow Perl_pad_sv
In some circumstances involving a pattern which has embedded code blocks
from more than one source, e.g.
my $r = qr{(?{1;}){2}X};
"" =~ /$r|(?{1;})/;
the wrong PL_comppad could be active while doing a LEAVE_SCOPE() or on
exit from the pattern.
This was mainly due to the big context stack changes in 5.24.0 - in
particular, since POP_MULTICALL() now does CX_LEAVE_SCOPE(cx) *before*
restoring PL_comppad, the (correct) unwinding of any SAVECOMPPAD's was
being followed by C<PL_comppad = cx->blk_sub.prevcomppad>, which wasn't
necessarily a sensible value.
To fix this, record the value of PL_savestack_ix at entry to S_regmatch(),
and set the cx->blk_oldsaveix of the MULTICALL to this value when pushed.
On exit from S_regmatch, we either POP_MULTICALL which will do a
LEAVE_SCOPE(cx->blk_oldsaveix), or in the absense of any EVAL, do the
explicit but equivalent LEAVE_SCOPE(orig_savestack_ix).
Note that this is a change in behaviour to S_regmatch() - formerly it
wouldn't necessarily clear the savestack completely back the point of
entry - that would get left to do by its caller, S_regtry(), or indirectly
by Perl_regexec_flags(). This shouldn't make any practical difference, but
is tidier and less likely to introduce bugs later.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Regular expression patterns are parsed by the lexer/toker, and then
compiled by the regex compiler. It is foolish to try to compile one
that the parser has rejected as syntactically bad; assumptions may be
violated and segfaults ensue. This commit abandons all parsing
immediately if a pattern had errors in it. A better solution would be
to flag this pattern as not to be compiled, and continue parsing other
things so as to find the most errors in a single attempt, but I don't
think it's worth the extra effort.
Making this change caused some misleading error messages in the test
suite to be replaced by better ones.
|
| |
|
|
|
|
|
| |
The previous commits hardening toke.c against malformed UTF-8 input have
allowed this test case to be cut down substantially
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previous commits have tightened up the checking of UTF-8 for
well-formedness in the input program or string eval. This is done in
lex_next_chunk and lex_start. But it doesn't handle the case of
use utf8; foo
because 'foo' is checked while UTF-8 is still off. This solves that
problem by noticing when utf8 is turned on, and then rechecking at the
next opportunity.
See thread beginning at
http://nntp.perl.org/group/perl.perl5.porters/242916
This fixes [perl #130675]. A test will be added in a future commit
This catches some errors earlier than they used to be and aborts. so
some tests in the suite had to be split into multiple parts.
|
|
|
|
| |
This reverts commit bfdc8cd3d5a81ab176f7d530d2e692897463c97d.
|