| Commit message | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
EPOC was a family of operating systems developed by Psion for mobile
devices. It was the predecessor of Symbian.
The port was last updated in April 2002.
|
| |
|
| |
|
|
|
|
| |
SvIsCOW returns a flag which will turn into 0 if truncated to 8 bits.
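The truncation hazard can be reproduced in plain C. This is a minimal sketch using a hypothetical flag value above bit 7 (DEMO_COW_FLAG is not Perl's real flag value); it shows why the result has to be collapsed to 0 or 1 before it is stored in an 8-bit type.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag living above the low 8 bits; not Perl's real value. */
#define DEMO_COW_FLAG 0x10000000u

/* Returns the raw masked flag, as a flag-test macro typically would. */
uint32_t demo_is_cow_raw(uint32_t flags)
{
    return flags & DEMO_COW_FLAG;
}

/* Buggy: storing the raw flag in 8 bits discards the only set bit. */
uint8_t demo_store_truncated(uint32_t flags)
{
    return (uint8_t)demo_is_cow_raw(flags);
}

/* Safe: collapse to 0 or 1 first, then narrow. */
uint8_t demo_store_normalised(uint32_t flags)
{
    return (uint8_t)(demo_is_cow_raw(flags) != 0);
}
```

With the flag set, demo_store_truncated() yields 0 while demo_store_normalised() yields 1.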
|
| |
This patch does the following:
*) Introduces multiple new hash functions to choose from at build
time. This includes Murmur-32, SDBM, DJB2, SipHash, SuperFast, and
One-at-a-time. Currently this is handled by munging hv.h. Configure
support hopefully to follow.
*) Changes the default hash to Murmur hash which is faster than the
old default One-at-a-time.
*) Rips out the old HvREHASH mechanism and replaces it with a
per-process random hash seed.
*) Changes the old PL_hash_seed from an interpreter value to a
global variable. This means it does not have to be copied during
interpreter setup or cloning.
*) Changes the format of the PERL_HASH_SEED variable to a hex
string so that hash seeds longer than fit in an integer are possible.
*) Changes the return of Hash::Util::hash_seed() from a number to a
string. This is to accommodate hash functions which have more bits than
can be fit in an integer.
*) Adds new functions to Hash::Util to improve introspection of hashes
-) hash_value() - returns an integer hash value for a given string.
-) bucket_info() - returns basic hash bucket utilization info
-) bucket_stats() - returns more hash bucket utilization info
-) bucket_array() - which keys are in which buckets in a hash
More details on the new hash functions can be found below:
Murmur Hash: (v3) from Google, see
http://code.google.com/p/smhasher/wiki/MurmurHash3
Superfast Hash: From Paul Hsieh.
http://www.azillionmonkeys.com/qed/hash.html
DJB2: a hash function from Daniel Bernstein
http://www.cse.yorku.ca/~oz/hash.html
SDBM: the hash function used by sdbm.
http://www.cse.yorku.ca/~oz/hash.html
SipHash: by Jean-Philippe Aumasson and Daniel J. Bernstein.
https://www.131002.net/siphash/
They have all been converted into Perl's ugly macro format.
I have not done any rigorous testing to make sure this conversion
is correct. They seem to function as expected however.
All of them use the random hash seed.
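As an illustration of what "using the seed" can mean, here is a minimal sketch of Bob Jenkins' one-at-a-time hash started from a caller-supplied seed. How the actual macros in hv.h mix the seed in is not shown in this message, so the seeding scheme and the name demo_oaat_hash are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bob Jenkins' one-at-a-time hash, started from a caller-supplied seed
 * instead of zero. The seeding scheme is an assumption for illustration. */
uint32_t demo_oaat_hash(const unsigned char *key, size_t len, uint32_t seed)
{
    uint32_t hash = seed;
    for (size_t i = 0; i < len; i++) {
        hash += key[i];
        hash += hash << 10;
        hash ^= hash >> 6;
    }
    /* Final avalanche steps. */
    hash += hash << 3;
    hash ^= hash >> 11;
    hash += hash << 15;
    return hash;
}
```

Every step is an invertible transformation of the 32-bit state, so for a fixed key two different seeds always give two different hash values.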
You can force the use of a given function by defining one of
PERL_HASH_FUNC_MURMUR
PERL_HASH_FUNC_SUPERFAST
PERL_HASH_FUNC_DJB2
PERL_HASH_FUNC_SDBM
PERL_HASH_FUNC_ONE_AT_A_TIME
Setting the environment variable PERL_HASH_SEED_DEBUG to 1 will make
perl output the current seed (changed to hex) and the hash function
it has been built with.
Setting the environment variable PERL_HASH_SEED to a hex value will
cause that value to be used as the seed. Any missing bits of the seed
will be set to 0. The bits are filled in from left to right, not
the traditional right to left so setting it to FE results in a seed
value of "FE000000" not "000000FE".
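The left-to-right fill can be sketched as follows. This is not the code from the patch: the function names are hypothetical, and stopping at the first non-hex character or when the buffer is full is an assumption about edge cases the message does not describe.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Value of one hex digit; the caller guarantees isxdigit(c). */
int demo_hex_digit(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    return tolower(c) - 'a' + 10;
}

/* Fill seed[0..seed_len) from a hex string, most significant digit first.
 * Missing digits leave the remaining bits at 0; parsing stops at the first
 * non-hex character or once the buffer is full. Returns digits consumed. */
size_t demo_parse_hex_seed(const char *hex, unsigned char *seed, size_t seed_len)
{
    size_t ndigits = 0;
    memset(seed, 0, seed_len);
    while (ndigits < seed_len * 2 && isxdigit((unsigned char)hex[ndigits])) {
        int v = demo_hex_digit((unsigned char)hex[ndigits]);
        /* Even digits fill the high nibble, odd digits the low nibble. */
        seed[ndigits / 2] |= (unsigned char)((ndigits % 2 == 0) ? v << 4 : v);
        ndigits++;
    }
    return ndigits;
}
```

Parsing "FE" into a 4-byte buffer gives the bytes FE 00 00 00, matching the "FE000000" behaviour described above.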
Note that we do the hash seed initialization in perl_construct().
Doing it via perl_alloc() (via init_tls) causes problems under
threaded builds as the buffers used for reentrant srand48 functions
are not allocated. See also the p5p mail "Hash improvements blocker:
portable random code that doesnt depend on a functional interpreter",
Message-ID:
<CANgJU+X+wNayjsNOpKRqYHnEy_+B9UH_2irRA5O3ZmcYGAAZFQ@mail.gmail.com>
|
|
|
|
|
|
|
|
| |
When cloning stacks (e.g. for fake fork), the stack is cloned
by copying the stack AV pointed to by PL_curstackinfo; but the
AvFILL on that AV may not be up to date, resulting in the top N
items of the stack not being cloned. Fix by saving PL_stack_sp
back into AvFILL(PL_curstack) before cloning.
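The failure mode can be modelled outside the interpreter. In this sketch DemoStack is a hypothetical stand-in for the stack AV: `fill` plays the role of AvFILL, and `sp` the role of the live stack pointer (here pointing one past the top element, which is a simplification).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of a stack backed by an array. */
typedef struct {
    int      *items;
    ptrdiff_t fill;   /* last index the array knows about, -1 if empty */
    int      *sp;     /* live stack pointer, one past the top element */
    size_t    cap;
} DemoStack;

/* Write the live stack pointer back into the array's fill count. */
void demo_sync_fill(DemoStack *s)
{
    s->fill = (s->sp - s->items) - 1;
}

/* Clone only what `fill` says is in use, as an array copy would. */
DemoStack demo_clone(const DemoStack *src)
{
    DemoStack dst;
    dst.cap = src->cap;
    dst.items = calloc(src->cap, sizeof *dst.items);
    assert(dst.items != NULL);
    dst.fill = src->fill;
    memcpy(dst.items, src->items, (size_t)(src->fill + 1) * sizeof *dst.items);
    dst.sp = dst.items + dst.fill + 1;
    return dst;
}
```

Cloning with a stale `fill` silently drops the top items; calling demo_sync_fill() first makes the clone complete.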
|
| |
As discussed in ticket #114820, instead of using READONLY+FAKE to mark
a copy-on-write string, we should make it a separate flag.
There are many modules in CPAN (and 1 in core, Compress::Raw::Zlib)
that assume that SvREADONLY means read-only. Only one CPAN module,
POSIX::pselect, will definitely be broken by this. Others may need to
be tweaked. But I believe this is for the better.
It causes all tests except ext/Devel-Peek/t/Peek.t (which needs a tiny
tweak still) to pass under PERL_OLD_COPY_ON_WRITE, which is a
prerequisite for any new COW scheme that creates COWs under the same
circumstances.
|
| |
Remove the context/pTHX from Perl_croak_no_modify and Perl_croak_xs_usage.
croak_no_modify now takes no parameters (and has always been
noreturn), and on some compilers will now be optimized to a conditional
jump. For Perl_croak_xs_usage one push asm opcode is removed at the caller.
For both funcs, their footprint in their callers (which probably are hot
code) is smaller, which means a tiny bit more room in the cache. My text
section went from 0xC1A2F to 0xC198F after applying this. Also see
http://www.nntp.perl.org/group/perl.perl5.porters/2012/11/msg195233.html .
|
|
|
|
|
| |
t/TEST may appear in upper or lower case and with or without a
trailing dot depending on various Unix compatibility settings.
|
|
|
|
|
|
|
| |
[perl #115602]
MULTICALL sets a local var, cx, to point to the current context stack
frame. When a function is called, the context stack might be realloc()ed,
in which case cx would point to freed memory.
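The hazard is the usual one with a pointer cached across a call that may grow the underlying array. The sketch below shows one common remedy, remembering an index and re-deriving the pointer afterwards; it is not claimed to be how this commit fixes MULTICALL, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct { int depth; } DemoFrame;

typedef struct {
    DemoFrame *frames;
    size_t     count;
    size_t     cap;
} DemoCxStack;

/* Push a frame, growing the array when full; growth may move `frames`. */
void demo_push(DemoCxStack *st, int depth)
{
    if (st->count == st->cap) {
        size_t new_cap = st->cap ? st->cap * 2 : 1;
        DemoFrame *moved = realloc(st->frames, new_cap * sizeof *moved);
        assert(moved != NULL);
        st->frames = moved;
        st->cap = new_cap;
    }
    st->frames[st->count++].depth = depth;
}

/* Remember the frame by index and look the pointer up again after the
 * callback, rather than caching a DemoFrame* across it. */
int demo_depth_after_call(DemoCxStack *st, size_t ix,
                          void (*callback)(DemoCxStack *))
{
    callback(st);                 /* may realloc st->frames */
    return st->frames[ix].depth;  /* re-derived from the current base */
}

/* Example callback that forces several reallocations. */
void demo_grow_a_lot(DemoCxStack *st)
{
    for (int i = 0; i < 100; i++)
        demo_push(st, 1000 + i);
}
```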
|
|
|
|
|
| |
The code that strips hints from a nextstate couldn't handle
the 'next' pointer being '-', e.g. v:>,<,% ->-
|
|
|
|
|
| |
Some Unicode settings in a smoke environment cause the open hints to
be output on nextstate lines.
|
| |
Given something like
my ($a,$b); my $c; my $d;
then after having detected that we can create a padrange op for $a,$b,
extend it to include $c,$d too.
Together with the previous commit that consolidates adjacent padrange
ops, this means that any contiguous sequence of void 'my' declarations
that starts with a list (i.e. my ($x,..) rather than my $x) will
all be compressed into a single padrange op. For example
my ($a,$b);
my @c;
my %d;
my ($e,@f);
becomes the two ops
padrange[$a;$b;@c;%d;$e;@f]
nextstate
The restriction on the first 'my' being a list is that we only ever
convert pushmarks into padranges, to keep things manageable (both for
compiling and for Deparse). This simply means that
my $x; my ($a,$b); my @c; my %d; my ($e,@f)
becomes
padsv[$x]
nextstate
padrange[$a;$b;@c;%d;$e;@f]
nextstate
|
| |
In something like
my ($a,$b);
my ($c,$d);
when converting $c,$d into a padrange op, check first whether we're
immediately preceded by a similar padrange (and nextstate) op,
and if so re-use the existing padrange op (by increasing the count).
Also, skip the first nextstate and only use the second nextstate.
So
pushmark;
padsv[$a]; padsv[$b]; list;
nextstate 1;
pushmark;
padsv[$c]; padsv[$d]; list;
nextstate 2;
becomes
padrange[$a,$b]
nextstate 1;
pushmark;
padsv[$c]; padsv[$d]; list;
nextstate 2;
which then becomes
padrange[$a,$b,$c,$d];
nextstate 2;
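The consolidation step can be sketched on a flat array of ops. The real peephole optimiser walks a linked op chain, so the array representation, the DemoOp type and the function name are all assumptions made to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { DEMO_PADRANGE, DEMO_NEXTSTATE, DEMO_OTHER } DemoOpKind;

typedef struct {
    DemoOpKind kind;
    int base;    /* first pad slot covered (PADRANGE only) */
    int count;   /* number of consecutive slots covered (PADRANGE only) */
} DemoOp;

/* Collapse every PADRANGE, NEXTSTATE, PADRANGE triple whose second range
 * starts right after the first into one widened PADRANGE, dropping the
 * first NEXTSTATE. Compacts ops[] in place and returns the new length. */
size_t demo_merge_padranges(DemoOp *ops, size_t n)
{
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        if (out >= 2 && ops[i].kind == DEMO_PADRANGE
            && ops[out - 1].kind == DEMO_NEXTSTATE
            && ops[out - 2].kind == DEMO_PADRANGE
            && ops[out - 2].base + ops[out - 2].count == ops[i].base) {
            ops[out - 2].count += ops[i].count;  /* widen the earlier range */
            out--;                               /* forget the first nextstate */
            continue;
        }
        ops[out++] = ops[i];
    }
    return out;
}
```

Feeding it padrange[1,2]; nextstate; padrange[3,2]; nextstate leaves a single padrange covering four slots followed by the second nextstate.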
|
| |
In a construct like
my ($x,$y) = @_
the pushmark/padsv/padsv is already optimised into a single padrange
op. This commit makes the OPf_SPECIAL flag on the padrange op indicate
that in addition, @_ should be pushed onto the stack, skipping an
additional pushmark/gv[*_]/rv2av combination.
So in total (including the earlier padrange work), the above construct
goes from being
3 <0> pushmark s
4 <$> gv(*_) s
5 <1> rv2av[t3] lK/1
6 <0> pushmark sRM*/128
7 <0> padsv[$x:1,2] lRM*/LVINTRO
8 <0> padsv[$y:1,2] lRM*/LVINTRO
9 <2> aassign[t4] vKS
to
3 <0> padrange[$x:1,2; $y:1,2] l*/LVINTRO,2 ->4
4 <2> aassign[t4] vKS
|
| |
This single op can, in some circumstances, replace the sequence of a
pushmark followed by one or more padsv/padav/padhv ops, and possibly
a trailing 'list' op, but only where the targs of the pad ops form
a continuous range.
This is generally more efficient, but is particularly so in the case
of void-context my declarations, such as:
my ($a,@b);
Formerly this would be executed as the following set of ops:
pushmark pushes a new mark
padsv[$a] pushes $a, does a SAVEt_CLEARSV
padav[@b] pushes all the flattened elements (i.e. none) of @b,
does a SAVEt_CLEARSV
list pops the mark, and pops all stack elements except the last
nextstate pops the remaining stack element
It's now:
padrange[$a..@b] does two SAVEt_CLEARSV's
nextstate nothing needing doing to the stack
Note that in the case above, this commit changes user-visible behaviour in
pathological cases; in particular, it has always been possible to modify a
lexical var *before* the my is executed, using goto or closure tricks.
So in principle someone could tie an array, then could notice that FETCH
is no longer being called, e.g.
f();
my ($s, @a); # this no longer triggers two FETCHES
sub f {
tie @a, ...;
push @a, 1,2;
}
But I think we can live with that.
Note also that having a padrange operator will allow us shortly to have
a corresponding SAVEt_CLEARPADRANGE save type, that will replace multiple
individual SAVEt_CLEARSV's.
|
|
|
|
|
|
|
| |
This allows you to alter the read-only "view" of an optree, by making
particular B::*OP methods on particular op nodes return customised values.
Intended to be used by B::Deparse to "undo" optimisations, thus making it
easier to add new optree optimisations without breaking Deparse.
|
| |
|
| |
By defining NO_TAINT_SUPPORT, all the various checks that perl does for
tainting become no-ops. It's not an entirely complete change: it doesn't
attempt to remove the taint-related interpreter variables, but instead
virtually eliminates access to them.
Why, you ask? Because it appears to speed up perl's run-time
significantly by avoiding various "are we running under taint" checks
and the like.
This change is not in a state to go into blead yet. The actual way I
implemented it might raise some (valid) objections. Basically, I
replaced all uses of the global taint variables (but not PL_taint_warn!)
with an extra layer of get/set macros (TAINT_get/TAINTING_get).
Furthermore, the change is not complete:
- PL_taint_warn would likely deserve the same treatment.
- Obviously, tests fail. We have tests for -t/-T
- Right now, I added a Perl warn() on startup when -t/-T are detected
but the perl was not compiled to support it. It might be argued that it
should be silently ignored! Needs some thinking.
- Code quality concerns - needs review.
- Configure support required.
- Needs thinking: How does this tie in with CPAN XS modules that use
PL_taint and friends? It's easy to backport the new macros via PPPort,
but that doesn't magically change all code out there. Might be
harmless, though, because whenever you're running under
NO_TAINT_SUPPORT, any check of PL_taint/etc is going to come up false.
Thus, the only CPAN code that SHOULD be adversely affected is code
that changes taint state.
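The extra layer of get/set macros can be sketched like this. The variable names, the DEMO_ prefix and the set macros' exact shape are assumptions; only the idea (accessors that fold to constants when support is compiled out) is taken from the description above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the interpreter's global taint variables. */
bool demo_tainting = false;   /* is taint mode enabled at all? */
bool demo_tainted  = false;   /* is the current value tainted? */

#ifdef DEMO_NO_TAINT_SUPPORT
   /* Accessors fold to constants, so `if (DEMO_TAINTING_get)` is dead code. */
#  define DEMO_TAINTING_get      0
#  define DEMO_TAINTING_set(v)   ((void)(v))
#  define DEMO_TAINT_get         0
#  define DEMO_TAINT_set(v)      ((void)(v))
#else
#  define DEMO_TAINTING_get      (demo_tainting)
#  define DEMO_TAINTING_set(v)   (demo_tainting = (v))
#  define DEMO_TAINT_get         (demo_tainted)
#  define DEMO_TAINT_set(v)      (demo_tainted = (v))
#endif

/* A typical call site: only consult the taint state when tainting is on. */
int demo_value_is_unsafe(void)
{
    return DEMO_TAINTING_get && DEMO_TAINT_get;
}
```

Built with -DDEMO_NO_TAINT_SUPPORT, demo_value_is_unsafe() always returns 0 and the compiler can drop the check entirely; code that tries to change the taint state becomes a no-op, which is the CPAN concern raised above.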
|
|
|
|
| |
(Suggested by Tony Cook.)
|
| |
|
|
|
|
|
|
|
| |
These assorted statically allocated variables were in RW memory in the perl
image. Move them to RO memory so they are sharable between different
Perl processes by the OS. The lack of consting in Win32 Dynaloader traces
to commit 0a753a76406. S_Internals_V traces to commit 4a5df386486.
|
| |
Since the xpvlv and regexp structs conflict, we have to find somewhere
else to put the regexp struct.
I was going to sneak it in SvPVX, allocating a buffer large
enough to fit the regexp struct followed by the string, and have
SvPVX - sizeof(regexp) point to the struct. But that would make all
regexp flag-checking macros fatter, and those are used in hot code.
So I came up with another method. Regexp stringification is not
speed-critical. So we can move the regexp stringification out of
re->sv_u and put it in the regexp struct. Then the regexp struct
itself can be pointed to by re->sv_u. So SVt_REGEXPs will have
re->sv_any and re->sv_u pointing to the same spot. PVLVs can then
have sv->sv_any point to the xpvlv body as usual, but have sv->sv_u
point to a regexp struct. All regexp member access can go through
sv_u instead of sv_any, which will be no slower than before.
Regular expressions will no longer be SvPOK, so we give sv_2pv special
logic for regexps. We don’t need to make the regexp struct
larger, as SvLEN is currently always 0 iff mother_re is set. So we
can replace the SvLEN field with the pv.
SvFAKE is never used without SvPOK or SvSCREAM also set. So we can
use that to identify regexps.
|
|
|
|
|
|
| |
This fixes up a couple of test files to work under 5.14.x.
Lots more needs fixing up to make the whole distribution work
under 5.14.x, but I've lost the will for now.
|
| |
The previous commit moved all B::*OP methods capable of using direct field
offsets into next(). This commit moves the remaining B::*OP methods onto
it too (apart from oplist(), which returns a list rather than a single
item).
This simplifies the code, reduces the object size, and will also make it
easier to add an overlay facility, which will be coming soon.
|
| |
The code for B::OP::next() actually implements all B::*OP::* methods
that work by directly returning a field at a known offset in the OP
structure. Methods that can't do direct access usually have their own
body, rather than sharing with next().
However, whether a method can do direct field access is often dependent on
threading and/or perl version; so the same method is sometimes implemented
by next(), and sometimes by one or more individual method bodies. This is
all very confusing.
This commit takes all methods that *may* be implemented within next(),
and makes them always implemented by next(), using a table of data that
describes each method's offset, or -1 if it needs special handling.
This makes it a lot easier to see what's going on, and will also make it
easier to add an overlay facility, which will be coming soon.
The following commit will consolidate the remaining B::*OP methods within
next().
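The table-driven approach can be sketched in standalone C. The DemoOp structure, its field names and the method names are hypothetical; the point is only the shape of the table, with a byte offset per method and -1 for methods that need special handling.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical op structure; field names are for illustration only. */
typedef struct DemoOp {
    struct DemoOp *next;
    struct DemoOp *sibling;
    unsigned long  targ;
    unsigned long  flags;
} DemoOp;

typedef struct {
    const char *name;
    long        offset;   /* byte offset of the field, or -1 if special */
} DemoMethod;

static const DemoMethod demo_methods[] = {
    { "targ",  (long)offsetof(DemoOp, targ)  },
    { "flags", (long)offsetof(DemoOp, flags) },
    { "name",  -1 },      /* not a plain field: needs its own code path */
};

/* Look a method up by name and read an unsigned long field at its offset.
 * Returns 1 and stores the value on success, 0 if the method is unknown
 * or is marked as needing special handling. */
int demo_get_field(const DemoOp *op, const char *method, unsigned long *out)
{
    size_t n = sizeof demo_methods / sizeof demo_methods[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(demo_methods[i].name, method) != 0)
            continue;
        if (demo_methods[i].offset < 0)
            return 0;
        memcpy(out, (const char *)op + demo_methods[i].offset, sizeof *out);
        return 1;
    }
    return 0;
}
```

One function body then serves every direct-access method, and adding a method is a one-line change to the table.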
|
|
|
|
|
|
|
|
|
| |
Expunge all conditional code that supports 5.6.x through 5.9.x,
making 5.10.0 the oldest release notionally supported.
This simplifies things considerably.
See p5p thread starting at
Message-ID: <20121018122941.GE1908@iabyn.com>
|
| |
The modules and tests under ext/B are notionally supposed to be
portable to older perl versions; in practice, extensive bit-rot
has occurred; often attempts have been made to add version-specific
code, which haven't actually been tested against older perl versions.
This commit does the minimum necessary to get the tests under ext/B
working with 5.16.0 and 5.16.1, threaded and unthreaded. It makes no
assertions as to whether it will work with the rest of the 5.16.x test
suite.
The side effects of this fix-up are:
* a facility has been added to OptreeCheck.pm (the test module that
checks the Concise output of various constructs) that allows
version-specific matching, e.g.:
# 4 <$> const(PV "junk") s* < 5.017002
# 4 <$> const(PV "junk") s*/FOLD >=5.017002
* OptreeCheck.pm's skip mechanism was found to be broken: checkOptree()
allows you to specify skipping, but only skipped one test, even though
a single call to checkOptree() could generate multiple lines of test
output.
|
| |
|
|
|
|
| |
Hash seed randomization causes these tests to fail occasionally.
|
| |
Various pieces of code were creating an SV and then assigning to it
from a value that might be magical. If the source scalar is magical,
it could die when magic is called, leaking the scalar that would have
been assigned to.
So we call get-magic before creating the new scalar, and then use a
non-magical assignment.
Also, anonhash and anonlist were doing nothing to protect the
aggregate if an argument should die on FETCH, resulting in a leak.
|
| |
|
| |
Commits 667763bdbf and e9a8753af fixed bugs involving buffer
reallocations during encode and decode. But what was not taken into
account was that the COW flags could still be left on even when buffer
reallocations were accounted for. This could result in SvPV_set and
SvLEN_set(sv,0) being called on an SV with the COW flags still on,
so SvPVX would be treated as a key inside a shared_he, resulting in
assertion failures.
|
| |
Commit 667763bdbf was not good enough.
If the buffer passed to an encode method is reallocated, it may be
smaller than the size (bufsiz) stored inside the encoding layer. So
we need to extend the buffer in that case and make sure the buffer
pointer is not pointing to freed memory.
The test as modified by this commit causes malloc errors on stderr
when I try it without the encoding.xs changes.
|
| |
|
| |
Using XSANY in addition to a struct of strings, saved 650 bytes (.rdata
and .text combined, 32bit/MS VC2K3/O1) from the previous implementation of
Win32CORE. Instead of encoding pointers or relative pointer sized offsets
to string literals, use unsigned chars. Instead of creating new XSUB C
function stubs, one per forwarded sub, use the ALIAS/XSANY feature and
have only 1 XSUB which has many names. If a length aware version of newXS
is ever added to perl, the sub names' lengths are already available. See
also commit eff5b9d539e for something similar to this commit.
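The space saving comes from replacing one pointer per name with one byte per name. A minimal sketch of that layout follows; the three names are just examples and the real table in Win32CORE is not reproduced here. demo_name_for_alias() stands in for the single shared XSUB body that is told which alias it was invoked as.

```c
#include <assert.h>
#include <string.h>

/* All forwarded sub names packed into one literal, NUL-separated.
 * The names are examples only. */
static const char demo_names[] =
    "GetCwd\0"       /* offset 0,  length 6  */
    "SetCwd\0"       /* offset 7,  length 6  */
    "GetTickCount";  /* offset 14, length 12 */

/* One byte per entry instead of one pointer per entry. */
static const unsigned char demo_offsets[] = { 0, 7, 14 };
static const unsigned char demo_lengths[] = { 6, 6, 12 };

enum { DEMO_NAME_COUNT = sizeof demo_offsets / sizeof demo_offsets[0] };

/* Recover the name for alias index ix from the packed table. */
const char *demo_name_for_alias(int ix)
{
    assert(ix >= 0 && ix < DEMO_NAME_COUNT);
    return demo_names + demo_offsets[ix];
}
```

Because the lengths are stored alongside the offsets, a length-aware registration function could use them without calling strlen().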
|
| |
$baz could be aliased to a package variable, so we do need to
reconcatenate for every iteration of s///g. For s/// without /g, only one
more op will be executed, so the speed difference is negligible.
The only cases we can optimise in terms of skipping the evaluation of
the ops on the rhs (by eliminating the substconst op) are s//constant/
and s//$single_variable/. Anything more complicated causes bugs.
A recent commit made s/foo/$bar/g re-stringify $bar for each iteration
(though without having to reevaluate the ops that return $bar). So we
no longer have to special-case match vars at compile time.
This means that s/foo/bar$baz/g will be slower (and less buggy), but
s/foo/$1/g will be faster.
This also caused an existing taint bug in pp_subst to surface. If
get-magic turns off taint on a replacement string, it should not be
considered tainted. So the taint check on the replacement should come
*after* the stringification. This applies to the constant replacement
optimisation. pp_substcont was already doing this correctly.
|
|
|
|
|
|
|
|
| |
XS code doing sv_mortalcopy(sv) will expect to get a true copy, and
not a COW ‘copy’.
So make sv_mortalcopy a wrapper around the new sv_mortalcopy_flags
that passes it SV_DO_COW_SVSETSV, which is defined as 0 for XS code.
|
| |
|
| |
I was trying to figure out why Encode’s perlio.t was sometimes failing
under PERL_OLD_COPY_ON_WRITE (depending on the number of comments in
the source code, or meteorological conditions).
I noticed that PerlIO::encoding assumes that the buffer passed to
the encode method will come back SvPOKp. (It accesses SvCUR without
checking any flags.)
That means it can come back as a typeglob, reference, or undefined,
and PerlIO::encoding won’t care. This can result in crashes.
Assigning $_[1] = *foo inside an encode method is not a smart thing to do,
but it shouldn’t crash.
PerlIO::encoding was also assuming that SvPVX would not change between
calls to encode. It is very easy to reallocate it. This means the
internal buffer used by the encoding layer (which is owned by the
SV buffer passed to the encode method) can be freed and still
subsequently written to, which is not good.
This commit makes PerlIO::encoding force stringification of the value
returned. If it does not match its internal buffer pointers, it
resets them based on the buffer SV.
This probably makes Encode pass its tests under
PERL_OLD_COPY_ON_WRITE, but I have yet to confirm it. Encoding
modules are expected to write to the buffer ($_[1] = '') in certain
cases. If COW is enabled, that would cause the buffer’s SvPVX to
point to the same string as the rhs, which would explain why the lack
of accounting for SvPVX changes caused test failures under
PERL_OLD_COPY_ON_WRITE.
|
|
|
|
|
| |
Though this is harmless, since the test script is short-lived, it is
good to fix it in case this test is copied by module authors.
|
|
|
|
| |
Add documentation to attributes.pm for :shared and :unique, and bump version.
|
|
|
|
|
| |
my subs do not work yet. I am not sure what the API
should be.
|
| |
Historically the regex engine has assumed that any string passed to it
will have a trailing null char. This isn't normally an issue in perl code,
since perl strings *are* null terminated; but it could cause problems with
strings returned by XS code, or with someone calling the regex engine
directly from XS, with strend not pointing at a null char.
The engine currently relies on there being a null char in the following
ways.
First, when at the end of string, the main loop of regmatch() still reads
in the 'next' character (i.e. the character following the end of string)
even if it doesn't make any use of it. This precludes using memory mapped
files as strings for example, since the read off the end would SEGV.
Second, the matching algorithm often required the trailing character to be
\0 to work correctly: the test for 'EOF' was "if next char is null *and*
locinput >= PL_regeol, then stop". So a random non-null trailing char
could cause an overshoot.
Third, some match ops require the trailing char to be null to operate
correctly; for example, \b applied at the end of the string only happens
to work because the trailing char (\0) happens to match \W.
Also, some utf8 ops will try to extract the code point at the end, which
can result in multiple bytes past the end of string being read, and
possible problems if they don't correspond to well-formed utf8.
The main fix is in S_regmatch, where the 'read next char' code has been
updated to set it to a special value, NEXTCHR_EOS instead, if we would be
reading past the end of the string.
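The guarded read can be sketched as follows. This is not the code from S_regmatch: the function names are hypothetical and the sentinel's numeric value is an assumption; any value outside the range of a byte would serve.

```c
#include <assert.h>
#include <stddef.h>

/* Sentinel meaning "no next character"; any value outside 0..255 works.
 * The value used here is an assumption for illustration. */
#define DEMO_NEXTCHR_EOS (-1)

/* Return the byte at pos, or the sentinel when pos is at or past end,
 * so the byte after the string is never read. */
int demo_next_char(const char *pos, const char *end)
{
    return (pos < end) ? (unsigned char)*pos : DEMO_NEXTCHR_EOS;
}

/* Count bytes up to the first `stop` byte, relying only on the end
 * pointer, never on a trailing NUL. */
size_t demo_span_until(const char *start, const char *end, char stop)
{
    const char *pos = start;
    int nextchr;
    while ((nextchr = demo_next_char(pos, end)) != DEMO_NEXTCHR_EOS
           && nextchr != (unsigned char)stop)
        pos++;
    return (size_t)(pos - start);
}
```

With this shape the end-of-string test no longer depends on what byte happens to follow the buffer, so a non-null trailing byte cannot cause an overshoot.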
Lots of other random bits in the regex engine needed to be fixed up too.
To track these down, I temporarily hacked regexec_flags() to make a copy
of the string but without trailing \0, then ran all the t/re/*.t tests
under valgrind to flush out all buffer overruns. So I think I've removed
most of the bad code, but by no means all of it. The code within the
various functions in regexec.c is far too complex to be able to visually
audit the code with any confidence.
|
|
|
|
|
| |
MPE/iX was a business-oriented minicomputer operating system made by
Hewlett-Packard. Support from HP terminated at the end of 2010.
|
| |
|
|
|
|
|
|
|
|
|
| |
If a pattern passed to File::Glob consists of a space-separated list
of patterns, the stack will only be extended by doglob() enough for
the list returned by each subpattern. So iterate() needs to extend
the stack before copying the list of files from an AV to the stack.
This fixes a regression introduced in 5.16.0.
|
| |
|
| |
|
|
|
|
|
|
| |
This will be used for cloning a ‘my’ sub on scope entry.
I was going to use pp_padcv for this, but it would end up having a
top-level if/else.
|