| Commit message | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
/[[:upper:]]/i and /[[:lower:]]/i should match the Unicode property
\p{Cased}. This commit introduces a pseudo-POSIX class, internally named
'cased', to represent this. This class isn't specifiable by the user,
except by using either /[[:upper:]]/i or /[[:lower:]]/i. Debug
output will say ':cased:'.
When parsing a regex, either of :lower: or :upper: under /i is changed
into :cased:, which already-existing logic can then handle just like any
other class.
This commit fixes the regression introduced in
3018b823898645e44b8c37c70ac5c6302b031381, as well as the fact that these
have never worked under 'use locale'. The next commit will un-TODO the
tests for these things.
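A minimal illustration of the intended semantics (a hypothetical
snippet, assuming a perl with this fix):
    # Under /i, [[:upper:]] and [[:lower:]] each match \p{Cased}, so
    # the case of the target character no longer matters:
    print "a" =~ /[[:upper:]]/i ? "match\n" : "no match\n";   # match
    print "A" =~ /[[:lower:]]/i ? "match\n" : "no match\n";   # match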
|
|
|
|
|
|
|
| |
This also changes isIDCONT_utf8() to use the Perl definition, which
excludes any \W characters (the Unicode definition includes a few of
these). Tests are also added. These macros remain undocumented for
now.
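A hedged illustration of the difference (MIDDLE DOT, U+00B7, is
ID_Continue in Unicode but \W in Perl, so Perl's IDCONT excludes it):
    printf "ID_Continue: %s\n", "\x{B7}" =~ /\p{ID_Continue}/ ? "yes" : "no";  # yes
    printf "Perl \\w:     %s\n", "\x{B7}" =~ /\w/ ? "yes" : "no";              # no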
|
|
|
|
|
| |
Previous commits have placed some inversion list pointers into arrays.
This commit extends that to another group of inversion lists.
|
|
|
|
|
| |
An earlier commit placed some inversion list pointers into an array.
This commit extends that to another group of inversion lists.
|
|
|
|
|
|
| |
This patch creates an array pointing to the inversion lists that cover
the Latin-1 ranges for Posix character classes, and uses it instead of
the individual variables previously referred to.
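For context, an inversion list is simply a sorted array of code points
at which membership in the set flips. A sketch of the lookup idea in
Perl (not the core's actual C code; the Latin-1 uppercase boundaries
here are written out by hand):
    sub in_invlist {
        my ($invlist, $cp) = @_;
        my ($lo, $hi) = (0, scalar @$invlist);
        while ($lo < $hi) {    # binary search: count elements <= $cp
            my $mid = int(($lo + $hi) / 2);
            $invlist->[$mid] <= $cp ? ($lo = $mid + 1) : ($hi = $mid);
        }
        return $lo % 2;        # odd count => inside an "on" range
    }
    # Latin-1 uppercase: [A-Z], [0xC0-0xD6], [0xD8-0xDE]
    my @upper_latin1 = (0x41, 0x5B, 0xC0, 0xD7, 0xD8, 0xDF);
    print in_invlist(\@upper_latin1, ord 'Q') ? "in\n" : "out\n";   # in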
|
| |
|
|
|
|
|
|
| |
This slightly refactors the code that checks for Korean precomposed
syllables in \X. It eliminates the PL_ variable formerly used to keep
track of things.
|
|
|
|
| |
As of the previous commit, nothing is using it.
|
|
|
|
|
|
|
| |
We think this is meant to stand for C's alphanumeric, i.e., what is
matched by POSIX [:alnum:]. There were no functions nor a dedicated
swash available for accessing it. Future commits will want to use
these.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
PL_sawampersand actually causes bugs (e.g., perl #4289), because the
behaviour changes. eval '$&' after a match will produce different
results depending on whether $& was seen before the match.
Using copy-on-write for the pre-match copy (preceding patches do that)
alleviates the slowdown caused by mentioning $&. The copy doesn’t
happen unless the string is modified after the match. It’s now a
post-match copy. So we no longer need to do things differently
depending on whether $& has been seen.
PL_sawampersand is now #defined to be equal to what it would be if
every program began with $',$&,$`.
I left the PL_sawampersand code in place, in case this commit proves
immature. Running Configure with -Accflags=PERL_SAWAMPERSAND will
reënable the PL_sawampersand mechanism.
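The bug class being eliminated, as a hedged two-liner (historically
the eval's result depended on whether $& appeared anywhere in the
program at compile time):
    "hello" =~ /ell/;
    print eval '$&', "\n";   # should print "ell" regardless of whether
                             # $& was mentioned elsewhere in the program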
|
|
|
|
|
|
|
|
|
|
|
|
| |
These variables have been unused in the Perl core since
commit 4c88d5e0740d796bf5064336d280bba72897f385.
The variables are undocumented. The only real use of any of these I
found in CPAN is at
https://metacpan.org/source/ABERGMAN/Devel-GC-Helper-0.25/Helper.xs#L1
The uses there appear to be in a list of known Perl variables. Since
the module was published, more than a few new variables have been added,
making this code obsolete anyway.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch does the following:
*) Introduces multiple new hash functions to choose from at build
time. These include Murmur-32, SDBM, DJB2, SipHash, SuperFast, and
One-at-a-time. Currently this is handled by munging hv.h. Configure
support hopefully to follow.
*) Changes the default hash to Murmur hash, which is faster than the
old default, One-at-a-time.
*) Rips out the old HvREHASH mechanism and replaces it with a
per-process random hash seed.
*) Changes the old PL_hash_seed from an interpreter value to a
global variable. This means it does not have to be copied during
interpreter setup or cloning.
*) Changes the format of the PERL_HASH_SEED variable to a hex
string, so that hash seeds longer than will fit in an integer are
possible.
*) Changes the return of Hash::Util::hash_seed() from a number to a
string. This is to accommodate hash functions whose seeds have more
bits than can fit in an integer.
*) Adds new functions to Hash::Util to improve introspection of hashes:
-) hash_value() - returns an integer hash value for a given string.
-) bucket_info() - returns basic hash bucket utilization info.
-) bucket_stats() - returns more detailed hash bucket utilization info.
-) bucket_array() - returns which keys are in which buckets of a hash.
More details on the new hash functions can be found below:
Murmur Hash (v3): from Google; see
http://code.google.com/p/smhasher/wiki/MurmurHash3
SuperFast Hash: from Paul Hsieh.
http://www.azillionmonkeys.com/qed/hash.html
DJB2: a hash function from Daniel Bernstein.
http://www.cse.yorku.ca/~oz/hash.html
SDBM: the hash function from sdbm.
http://www.cse.yorku.ca/~oz/hash.html
SipHash: by Jean-Philippe Aumasson and Daniel J. Bernstein.
https://www.131002.net/siphash/
They have all been converted into Perl's ugly macro format.
I have not done any rigorous testing to make sure this conversion
is correct. They seem to function as expected, however.
All of them use the random hash seed.
You can force the use of a given function by defining one of
PERL_HASH_FUNC_MURMUR
PERL_HASH_FUNC_SUPERFAST
PERL_HASH_FUNC_DJB2
PERL_HASH_FUNC_SDBM
PERL_HASH_FUNC_ONE_AT_A_TIME
Setting the environment variable PERL_HASH_SEED_DEBUG to 1 will make
perl output the current seed (rendered in hex) and the hash function
it has been built with.
Setting the environment variable PERL_HASH_SEED to a hex value will
cause that value to be used as the seed. Any missing bits of the seed
will be set to 0. The bits are filled in from left to right, not
the traditional right to left, so setting it to FE results in a seed
value of "FE000000", not "000000FE".
Note that we do the hash seed initialization in perl_construct().
Doing it via perl_alloc() (via init_tls) causes problems under
threaded builds, as the buffers used for the reentrant srand48
functions are not yet allocated. See also the p5p mail "Hash
improvements blocker: portable random code that doesnt depend on a
functional interpreter", Message-ID:
<CANgJU+X+wNayjsNOpKRqYHnEy_+B9UH_2irRA5O3ZmcYGAAZFQ@mail.gmail.com>
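A hedged sketch of the new Hash::Util introspection interface (using
the function names listed above; exact output is build- and
seed-dependent):
    use Hash::Util qw(hash_seed hash_value);
    # hash_seed() now returns a string, so render it as hex for display;
    # run with PERL_HASH_SEED_DEBUG=1 to have perl print it at startup.
    printf "seed: %s\n", unpack "H*", hash_seed();
    # hash_value() returns an integer hash value for a given string:
    printf "hash of 'perl': %u\n", hash_value("perl");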
|
|
|
|
|
|
|
|
|
|
| |
This completes the process of allowing users to define their own aliases
for \N{} in any language they choose. Names have some validation
applied so that they can't, for example, begin with something that is a
digit in some Unicode script. Tests and documentation are included in
this patch. The loop in toke.c that does the validation for
user-supplied translators is revamped, and the messages that are output
when there is an error are fixed to work with UTF-8.
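A hedged sketch of the user-facing feature this completes (the alias
name here is arbitrary and may be in any language):
    use charnames ":full", ":alias" => {
        "MON_CARACTERE" => "LATIN SMALL LETTER E WITH ACUTE",
    };
    print "\N{MON_CARACTERE}\n";   # prints an e-acute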
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I added pad IDs so that a pad could record which pad it closes over,
to avoid problems with closures closing over the wrong pad, resulting
in crashes or bizarre copies. These pad IDs were shared between
clones of the same pad.
In commit 9ef8d56, for efficiency I made clones of the same closure
share the same pad name list.
It has just occurred to me that each padlist containing the same pad
name list also has the same pad ID, so we can just use the pad name
list itself as the ID.
This makes padlists 32 bits smaller and eliminates PL_pad_generation
from the interpreter struct.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The design for handling characters that fold to multiple characters when
the former are encountered in a bracketed character class is defective.
The ticket reads, "If a bracketed character class includes a character
that has a multi-char fold, and it also includes the first character of
that fold, the multi-char fold will never be matched; just the first
character of the fold.". Thus, in the class /[\0-\xff]/i, \xDF will
never be matched, because its fold is 'ss', the first character of
which, 's', is also in the class.
The reason the design is defective is that it doesn't allow for
backtracking and trying the other options.
This commit solves this by effectively rewriting the above to be
/ (?: \xdf | [\0-\xde\xe0-\xff] ) /xi. And so the backtracking gets
handled automatically by the regex engine.
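The case from the ticket, as a hedged check (assuming a perl with this
fix):
    # \xDF folds to 'ss', so the class should now match the full fold:
    print "ss"   =~ /^[\0-\xff]$/i ? "match\n" : "no match\n";   # match
    print "\xDF" =~ /^[\0-\xff]$/i ? "match\n" : "no match\n";   # match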
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
$^L is neither a magical variable, nor a normal one (like $;) but
it's just a little bit special :)
This patch removes PL_formfeed - IMHO, an extra gv_fetchpv per page
when using formats isn't going to cause a noticeable speed regression.
I suppose that removing the intrpvar.h hunk from the patch is enough
to keep binary compatibility - unless someone used PL_formfeed from
an XS module.
[with regen.pl run as noted by the author, and an additional change to
perl.c to remove the reference to PL_formfeed added soon after this patch
was sent]
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The rules for determining whether an above-Latin1 code point matches
are now saved in a macro generated from a trie by
regen/regcharclass.pl, and pp.c now uses this macro to test these
cases. This allows removal of a wrapper subroutine, and also removes
the need for dynamically loading a swash at run-time.
This macro is about as big as I'm comfortable compiling in, but it
saves the building of a hash that can grow over time, and removes a
subroutine and interpreter variables. Indeed, performance benchmarks
show that it is about the same speed as a hash, but it does not require
loading the rules in from disk the first time it is used.
|
|
|
|
|
|
|
|
|
|
| |
A previous commit has caused macros to be generated that will match
Unicode code points of interest to the \X algorithm. This patch uses
them. This speeds up modern Korean processing by 15%.
Together with other recent commits, the throughput of modern Korean
under \X has more than doubled, and is now comparable to other
languages (which have themselves increased by 35%).
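For reference, the Korean-specific work is about sequences like this
(a decomposed L + V + T jamo sequence forms one grapheme cluster):
    my @clusters = "\x{1100}\x{1161}\x{11A8}" =~ /\X/g;
    print scalar @clusters, "\n";   # 1 - one syllable, three code points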
|
|
|
|
|
|
|
|
|
| |
Prior to this commit, 98.4% of Unicode code points that went through \X
had to be looked up to see if they begin a grapheme cluster, then looked
up again to find that they didn't require special handling. This commit
refactors things so only one look-up is required for those 98.4%. It
changes the table generated by mktables to accomplish this; hence the
table's name, and references to it, are changed to correspond.
|
|
|
|
|
|
|
|
|
|
|
| |
This changes code to be able to handle Unicode 6.2, while continuing to
handle all previous releases.
The major change was a new definition of \X, which adds a property to
its calculation. Unfortunately \X is hard-coded into regexec.c, and so
has to be revised whenever there is a change of this magnitude in
Unicode, which fortunately isn't all that often. I refactored the code
in mktables to make it easier the next time there is a change like this
one.
|
|
|
|
|
|
| |
In looking at \X handling, I noticed that this function, which is
intended for use there, actually isn't used. This function may someday
be useful, so I'm leaving the source in.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
CVs close over their outer CVs. So, when you write:
my $x = 52;
sub foo {
    sub bar {
        sub baz {
            $x
        }
    }
}
baz’s CvOUTSIDE pointer points to bar, bar’s CvOUTSIDE points to foo,
and foo’s to the main cv.
When the inner reference to $x is looked up, the CvOUTSIDE chain is
followed, and each sub’s pad is looked at to see if it has an $x.
(This happens at compile time.)
It can happen that bar is undefined and then redefined:
undef &bar;
eval 'sub bar { my $x = 34 }';
After this, baz will still refer to the main cv’s $x (52), but, if baz
had ‘eval '$x'’ instead of just $x, it would see the new bar’s $x.
(It’s not really a new bar, as its refaddr is the same, but it has a
new body.)
This particular case is harmless, and is obscure enough that we could
define it any way we want, and it could still be considered correct.
The real problem happens when CVs are cloned.
When a CV is cloned, its name pad already contains the offsets into
the parent pad where the values are to be found. If the outer CV
has been undefined and redefined, those pad offsets can be completely
bogus.
Normally, a CV cannot be cloned except when its outer CV is running.
And the outer CV cannot have been undefined without also throwing
away the op that would have cloned the prototype.
But formats can be cloned when the outer CV is not running. So it
is possible for cloned formats to close over bogus entries in a new
parent pad.
In this example, \$x gives us an array ref. It shows ARRAY(0xbaff1ed)
instead of SCALAR(0xdeafbee):
sub foo {
    my $x;
    format =
@
($x,warn \$x)[0]
.
}
undef &foo;
eval 'sub foo { my @x; write }';
foo
__END__
And if the offset that the format’s pad closes over is beyond the end
of the parent’s new pad, we can even get a crash, as in this case:
eval
'sub foo {' .
'{my ($a,$b,$c,$d,$e,$f,$g,$h,$i,$j,$k,$l,$m,$n,$o,$p,$q,$r,$s,$t,$u)}'x999
. q|
my $x;
format =
@
($x,warn \$x)[0]
.
}
|;
undef &foo;
eval 'sub foo { my @x; my $x = 34; write }';
foo();
__END__
So now, instead of using CvROOT to identify clones of
CvOUTSIDE(format), we use the padlist ID. Padlists don’t
actually have an ID, so we give them one. Any time a sub is cloned,
the new padlist gets the same ID as the old. The format needs to
remember what its outer sub’s padlist ID was, so we put that in the
padlist struct, too.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Input text to be matched under /i is placed in EXACTFish nodes. The
current limit on such text is 255 bytes per node. Even if we raised
that limit, it would always be finite. If the input text is longer than
this, it is split across 2 or more nodes. A problem occurs when that
split occurs within a potential multi-character fold. For example, if
the final character that fits in a node is 'f', and the next character
is 'i', the pair should be matchable by LATIN SMALL LIGATURE FI, but
because Perl isn't structured to find multi-char folds that cross node
boundaries, we will miss it.
The solution presented here isn't optimal. What we do is try to prevent
all EXACTFish nodes from ending in a character that could be at the
beginning or middle of a multi-char fold. That prevents the problem.
But in actuality, the problem only occurs if the input text actually is
a multi-char fold, which happens much less frequently. For example, we
try not to end a full node with an 'f', but the problem doesn't
actually occur unless the adjacent following node begins with an 'i'
(or one of the other characters with which 'f' participates in a fold).
That is, this patch splits when it doesn't need to.
At the point of execution for this patch, we only know that the final
character that fits in the node is that 'f'. The next character remains
unparsed, and could be in any number of forms, a literal 'i', or a hex,
octal, or named character constant, or it may need to be decoded (from
'use encoding'). So look-ahead is not really viable.
So finding if a real multi-character fold is involved would have to be
done later in the process, when we have full knowledge of the nodes, at
the places where join_exact() is now called, and would require inserting
a new node(s) in the middle of existing ones.
This solution seems reasonable instead.
It does not yet address named character constants (\N{}), which
currently bypass the code added here.
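The motivating case, as a hedged check:
    # U+FB01 (LATIN SMALL LIGATURE FI) folds to "fi"; both directions
    # should match even when 'f' and 'i' straddle a node boundary:
    print "\x{FB01}" =~ /fi/i       ? "ok 1\n" : "not ok 1\n";
    print "fi"       =~ /\x{FB01}/i ? "ok 2\n" : "not ok 2\n";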
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit eliminates the old slab allocator. It had bugs in it, in
that ops would not be cleaned up properly after syntax errors. So why
not fix it? Well, the new slab allocator *is* the old one fixed.
Now that this is gone, we don’t have to worry as much about ops
leaking when errors occur, because it won’t happen any more.
Recent commits eliminated the only reason to hang on to it:
PERL_DEBUG_READONLY_OPS required it.
|
|
|
|
|
|
|
|
|
|
| |
These macros have never worked outside the Latin1 range, so this
extends them to work there.
There are no tests I could find for things in handy.h, except that many
of them are called all over the place during the normal course of
events. This commit adds a new file for such testing, containing for
now only a few tests for the isBLANK macros.
|
|
|
|
|
|
| |
This used to be the mechanism to determine whether "use re 'eval'" needed
to be in scope; but now that we make a clear distinction between literal
and runtime code blocks, it's no longer needed.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before this commit, a pointer to the cop’s stash was stored in
cop->cop_stash under non-threaded perls, and the name and name length
were stored in cop->cop_stashpv and cop->cop_stashlen under ithreads.
Consequently, eval "__PACKAGE__" would end up returning the
wrong package name under threads if the current package had been
assigned over.
This commit changes the way cops store their stash under threads. Now
it is an offset (cop->cop_stashoff) into the new PL_stashpad array
(just a mallocked block), which holds pointers to all stashes that
have code compiled in them.
I didn’t use the lexical pads, because CopSTASH(cop) won’t work unless
PL_curpad is holding the right pad. And things start to get very
hairy in pp_caller, since the correct pad isn’t anywhere easily
accessible on the context stack (oldcomppad actually referring to the
current comppad). The approach I’ve followed uses far less code, too.
In addition to fixing the bug, this also saves memory. Instead of
allocating a separate PV for every single statement (to hold the stash
name), now all lines of code in a package can share the same stashpad
slot. So, on a 32-bit OS X, that’s 16 bytes less memory per COP for
short package names. Since stashoff is the same size as stashpv,
there is no difference there. Each package now needs just 4 bytes in
the stashpad for storing a pointer.
For speed’s sake PL_stashpadix stores the index of the last-used
stashpad offset. So only when switching packages is there a linear
search through the stashpad.
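A minimal illustration of the invariant this restores under ithreads
(the package name is hypothetical):
    package Some::Package;
    print eval '__PACKAGE__', "\n";   # always "Some::Package", even if
                                      # the stash has been assigned over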
|
|
|
|
|
|
|
|
|
| |
The core is not using it any more. Every CPAN module that increments
it also does newXS, which triggers mro_method_changed_in, which is
sufficient; so nothing will break.
So, to keep those modules compiling, PL_amagic_generation is now an
alias to PL_na outside the core.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently we cache the UID/GID and effective UID/GID similarly to how
we used to cache getpid() before v5.14.0-251-g0e21945. Remove this
magical behavior in favor of always calling getuid(), getgid()
etc. This resolves RT #96208.
A minimal testcase for this is the following by Leon Timmermans
attached to RT #96208:
eval { require 'syscall.ph'; 1 }
    or eval { require 'sys/syscall.ph'; 1 }
    or die $@;
if (syscall(&SYS_setuid, $ARGV[0] + 0 || 1000) >= 0 or die "$!") {
    printf "\$< = %d, getuid = %d\n", $<, syscall(&SYS_getuid);
}
I.e. if we call the sete?[ug]id() functions unbeknownst to perl the
$<, $>, $( and $) variables won't be updated. This results in the same
sort of issues we had with $$ before v5.14.0-251-g0e21945, and
getppid() before my v5.15.7-407-gd7c042c patch.
I'm completely eliminating the PL_egid, PL_euid, PL_gid and PL_uid
variables as part of this patch, this will break some CPAN modules,
but it'll be really easy before the v5.16.0 final to reinstate
them. I'd like to remove them to see what breaks, and how easy it is
to fix it.
These variables are not part of the public API; the modules using
them could either use the Perl_gete?[ug]id() functions instead, or are
working around the bug I'm fixing with this commit.
The new PL_delaymagic_(egid|euid|gid|uid) variables I'm adding are
*only* intended to be used internally in the interpreter to facilitate
the delaymagic in Perl_pp_sassign. There's probably some way not to
export these to programs that embed perl, but I haven't found out how
to do that.
|
|
|
|
|
|
|
|
|
|
| |
As described in the pod changes in this commit, this changes quotemeta()
to consistently quote non-ASCII characters when used under
unicode_strings. The behavior is changed for these and UTF-8 encoded
strings to more closely align with Unicode's recommendations.
The end result is that we *could* at some future point start using
characters other than the 12 we do now as metacharacters.
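A hedged illustration of the new behavior:
    use feature 'unicode_strings';
    # Non-ASCII is now consistently quoted, regardless of the string's
    # internal encoding:
    print quotemeta("caf\x{E9}"), "\n";   # the e-acute comes out escaped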
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Under POSIX threads the getpid() and getppid() functions return the
same values across multiple threads, i.e. threads don't have their own
PIDs. This is not the case under the obsolete LinuxThreads, where each
thread has a different PID, so getpid() and getppid() will return
different values across threads.
Ever since the first perl 5.0 we've returned POSIX-consistent
semantics for $$, until v5.14.0-251-g0e21945 when the getpid() cache
was removed. In 5.8.1 Rafael added further explicit POSIX emulation in
perl-5.8.0-133-g4d76a34 [1] by explicitly caching getppid(), so that
multiple threads would always return the same value.
I don't think all this effort to emulate POSIX semantics is worth it. I
think $$ and getppid() are OS-level functions that should always
return the same as their C equivalents. I shouldn't have to use a
module like Linux::Pid to get the OS version of the return values.
This is pretty much a complete non-issue in practice these days.
LinuxThreads was a Linux 2.4 thread implementation that nobody
maintains anymore [2]; all modern Linux distros use NPTL threads, which
don't suffer from this discrepancy. Debian GNU/kFreeBSD does use
LinuxThreads in the 6.0 release, but they too will be moving away from
it in future releases, and really, nobody uses Debian GNU/kFreeBSD
anyway.
This caching makes it unnecessarily tedious to fork an embedded Perl
interpreter. When someone constructs an embedded perl interpreter
and forks their application, the fork(2) system call isn't going to
run Perl_pp_fork(), and thus the return values of $$ and getppid()
don't reflect the current process. See [3] for a bug in uWSGI
related to this, and Perl::AfterFork on the CPAN for XS code that you
need to run after forking a PerlInterpreter unbeknownst to perl.
We've already been failing the tests in t/op/getpid.t on those Linux
systems that nobody apparently uses; the Debian GNU/kFreeBSD users did
notice and filed #96270. This patch fixes that failure by changing the
tests to expect different behavior under LinuxThreads; I've tested
that this works on my Debian GNU/kFreeBSD 6.0.4 virtual machine.
If this change is found to be unacceptable (i.e. we want to continue
to emulate POSIX thread semantics for the sake of LinuxThreads) we
also need to revert v5.14.0-251-g0e21945, because currently we're only
emulating POSIX semantics for getppid(), not getpid(). But I don't
think we should do that, both v5.14.0-251-g0e21945 and this commit are
awesome.
This commit includes a change to embedvar.h made by "make
regen_headers".
1. http://www.nntp.perl.org/group/perl.perl5.porters/2002/08/msg64603.html
2. http://pauillac.inria.fr/~xleroy/linuxthreads/
3. http://projects.unbit.it/uwsgi/ticket/85
|
|
|
|
|
| |
Commit 24caacbccae7b938deecdcc3f13dd66c9c6a684e removed all uses of this
variable, but failed to remove the variable itself.
|
|
|
|
|
|
| |
Same for [[:upper:]] and \p{Upper}. These were instead matching all of
[[:alpha:]] or \p{Alpha}. What /\p{Lower}/i and /\p{Upper}/i actually
match is \p{Cased}, and so that is what these should match too.
|
|
|
|
|
| |
This function provides a convenient and thread-safe way for modules to
hook op checking.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These will be used in regcomp.c to replace the existing bit-wise
handling of these, enabling subsequent optimizations.
These are compiled-in, and hence affect the memory footprint of every
program, including those that don't use Unicode. The lists that aren't
tiny are therefore currently restricted to only the Latin1 range;
anything needed beyond that will have to be read in at execution time,
just as before.
The design allows for easy conversion from Latin1 to use the full
Unicode range, should it be deemed desirable for some or all of these.
|
|
|
|
|
|
| |
This creates three simple compile-time inversion lists from the data
that has been generated in a previous commit, and uses two of them.
Three PL_ variables are used to store them.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit makes CORE::glob bypass glob overrides.
A side effect of the fix is that, with the default glob implementation,
undefining *CORE::GLOBAL::glob no longer results in an ‘undefined
subroutine’ error.
Another side effect is that compilation of a glob op no longer assumes
that the loading of File::Glob will create the *CORE::GLOB::glob
typeglob. ‘++$INC{"File/Glob.pm"}; sub File::Glob::csh_glob; eval '<*>';’
used to crash.
This is accomplished using a mechanism similar to lock() and
threads::shared. There is a new PL_globhook interpreter variable
that pp_glob calls when there is no override present. Thus,
File::Glob (which is supposed to be transparent, as it *is* the
built-in implementation) no longer interferes with the user mechanism
for overriding glob.
This removes one tier from the five or so hacks that constitute glob’s
implementation, and which work together to make it one of the buggiest
and most inconsistent areas of Perl.
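A hedged sketch of the two paths after this commit:
    BEGIN { *CORE::GLOBAL::glob = sub { "from override" } }
    print <*>, "\n";                 # goes through the user override
    print CORE::glob("*.c"), "\n";   # bypasses it, running the built-in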
|
| |
|
|
|
|
|
|
|
|
|
| |
The Unicode stability policy guarantees that no code points will ever be
added to the control characters beyond those already in that class.
All such characters are in the Latin1 range, so the Perl core
already knows which ones they are, and there is no need to go out to
disk and create a swash for these.
|
|
|
|
|
|
| |
These three properties are restricted to being true only for ASCII
characters. That information is compiled into Perl, so no need to
create swashes for them.
|
|
|
|
|
| |
This information is trivially computed via the macro; there is no need
to go out to disk and store a swash for this.
|
|
|
|
|
|
|
| |
For the default (non-multiplicity) configuration, PERLVAR*() macros now
directly expand their arguments to tokens such as C<PL_defgv>, instead of
expanding to C<PL_Idefgv>. This removes over 350 lines from F<embedvar.h>,
which defined macros to map from C<PL_Idefgv> to C<PL_defgv> and so forth.
|
|
|
|
|
| |
This allows more than one C<study> to be active at the same time.
It eliminates PL_screamfirst, PL_lastscream, PL_maxscream.
|
|
|
|
|
| |
Effectively, PL_screamnext is now PL_screamfirst + 256. The actual interpreter
variable PL_screamnext is eliminated.
|
|
|
|
|
| |
They exist solely to ensure that Perl_runops_standard and Perl_runops_debug
are linked in - nothing assigns to either variable, and nothing reads them.
|
|
|
|
| |
Make them const U16 - they should have been const from the start.
|
|
|
|
|
| |
Rename PL_interp_size_5_10_0 to PL_interp_size_5_16_0, as it is only intended to
track interpreter size within (forwards) binary compatible maintenance branches.
|
|
|
|
|
| |
To get the initialisation to work, the location of #include patchlevel.h needs
to be moved.
|
|
|
|
|
|
|
|
| |
On OS/2, keep it in perlvars.h, as it's not const there. makedef.pl doesn't
pay attention to C pre-processor symbols, so it will always see the declaration
in perlvars.h, and add the symbol to the linker file, so no need to mention
sh_path in globvar.sym. Add special case logic in regen/embed.pl to make the
embedvar.h macros for PL_sh_path defined only on OS/2.
|
|
|
|
|
|
|
|
|
|
|
| |
They were converted in perl.h from const char[] to #define in 31fb120917c4f65d,
then re-instated as const char[], but in perlvars.h, in 3fe35a814d0a98f4.
There's no need for compile-time constants to jump through the hoops of
perlvars.h, even for Symbian, as the various "EXTCONST" variables already in
perl.h demonstrate.
These were the only 3 users of the PERLVARISC macro, so eliminate that,
and all related code.
|