| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
| |
These variables are constant, once initialized, through the life of a
program, so having them be per instance is a waste of time and space
|
|
|
|
|
|
|
|
|
|
| |
This commit changes to use the C data structures generated by the
previous commit to compute what characters fold to a given one. This is
used to find out what things should match under /i.
This now avoids the expensive start up cost of switching to perl
utf8_heavy.pl, loading a file from disk, and constructing a hash from
it.
|
|
|
|
|
| |
These were for when some of the Posix character classes were implemented
as swashes, which is no longer the case, so these can be removed.
|
|
|
|
|
|
|
|
| |
This commit makes the inversion lists for parsing character name global
instead of interpreter level, so can be initialized once per process,
and no copies are created upon new thread instantiation. More
importantly, this is another instance where utf8_heavy.pl no longer
needs to be loaded, and the definition files read from disk.
|
|
|
|
|
| |
These are now constant through the life of the program, so don't need to
be duplicated at each new thread instantiation.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Prior to this commit, if a program wanted to compute the case-change of
a character above 0xFF, the C code would switch to perl, loading
lib/utf8heavy.pl and then read another file from disk, and then create a
hash. Future references would use the hash, but the start up cost is
quite large. There are five case change types, uc, lc, tc, fc, and
simple fc. Only the first encountered requires loading of utf8_heavy,
but each required switching to utf8_heavy, and reading the appropriate
file from disk.
This commit changes these functions to use compiled-in C data structures
(inversion maps) to represent the data. To look something up requires a
binary search instead of a hash lookup.
An individual hash lookup tends to be faster than a binary search, but
the differences are small for small sizes. I did some benchmarking some
years ago, (commit message 87367d5f9dc9bbf7db1a6cf87820cea76571bf1a) and
the results were that for fewer than 512 entries, the binary search was
just as fast as a hash, if not actually faster. Now, I've done some
more benchmarks on blead, using the tool benchmark.pl, which wasn't
available back then. The results below indicate that the differences
are minimal up through 2047 entries, which all Unicode properties are
well within.
A hash, PL_foldclosures, is still constructed at runtime for the case of
regular expression /i matching, and this could be generated at Perl
compile time, as a further enhancement for later. But reading a file
from disk is no longer required to do this.
======================= benchmarking results =======================
Key:
Ir Instruction read
Dr Data read
Dw Data write
COND conditional branches
IND indirect branches
_m branch predict miss
_m1 level 1 cache miss
_mm last cache (e.g. L3) miss
- indeterminate percentage (e.g. 1/0)
The numbers represent raw counts per loop iteration.
"\x{10000}" =~ qr/\p{CWKCF}/"
swash invlist Ratio %
fetch search
------ ------- -------
Ir 2259.0 2264.0 99.8
Dr 665.0 664.0 100.2
Dw 406.0 404.0 100.5
COND 406.0 405.0 100.2
IND 17.0 15.0 113.3
COND_m 8.0 8.0 100.0
IND_m 4.0 4.0 100.0
Ir_m1 8.9 17.0 52.4
Dr_m1 4.5 3.4 132.4
Dw_m1 1.9 1.2 158.3
Ir_mm 0.0 0.0 100.0
Dr_mm 0.0 0.0 100.0
Dw_mm 0.0 0.0 100.0
These were constructed by using the file whose contents are below, which
uses the property in Unicode that currently has the largest number of
entries in its inversion list, > 1600. The test was run on blead -O2,
no debugging, no threads. Then the cut-off boundary was changed from
512 to 2047 for when we use a hash vs an inversion list, and the test
run again. This yields the difference between a hash fetch and an
inversion list binary search
===================== The benchmark file is below ===============
no warnings 'once';
my @benchmarks;
push @benchmarks, 'swash' => {
desc => '"\x{10000}" =~ qr/\p{CWKCF}/"',
setup => 'no warnings "once"; my $re = qr/\p{CWKCF}/; my $a =
"\x{10000}";',
code => '$a =~ $re;',
};
\@benchmarks;
|
|
|
|
|
|
|
| |
This adds an #ifdef around this variable, so that it isn't defined
unless used.
Spotted by Daniel Dragan.
|
|
|
|
|
|
|
|
|
|
| |
These structures are read-only, use const C strings, and are truly
global, so no need to have them be interpreter level. This saves
duplicating and freeing them as threads come and go.
In doing this, I noticed that not every one was properly being
copied/deallocated, so this fixes some potential unreported bugs, and
leaks.
|
|
|
|
|
|
| |
This (large) commit allows locales to be used in threaded perls on
platforms that support it. This includes recent Windows and Posix 2008
ones.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It is possible for operations on threaded perls which don't 'use locale'
to still change the locale. This happens when calling
POSIX::localeconv() and I18N::Langinfo(), and in earlier perls, it can
happen for other operations when perl has been initialized with the
environment causing the various locale categories to not have a uniform
locale.
This commit causes the areas where the locale for this category should
predictably be in one or the other state to be a critical section where
another thread can't interrupt and change it. This is a separate
mutex, so that only these particular operations will be held up.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
khw could not find any modules on CPAN that correctly use the C library
function setlocale(). (The very few that do try, do not use it
correctly, looking at the return value incorrectly, so they are broken.)
This analysis does not include modules that call non-Perl libaries that
may call setlocale().
And, a future commit will render the setlocale() function useless in
some configurations on some platforms.
So this commit adds Perl_setlocale(), for XS code to call, and which is
always effective, but it should not be used to alter the locale except
on platforms where the predefined variable ${^SAFE_LOCALES} evaluates to
1.
This function is also what POSIX::setlocale() calls to do the real work.
|
|
|
|
|
| |
perl.h has a single #define which is the combination of several that
determines if this object should be created or not.
|
|
|
|
|
|
|
|
|
| |
On systems that have the POSIX 2008 operations, including
nl_langinfo_l(), this commit causes them to not have to actually change
the locale when determining what the decimal point character is.
The locale may have to change during the printing/reading of numbers,
but eventually we can use sprintf_l(), if available, to avoid that too.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some locales are UTF-8, some are not. Knowledge of this is needed in
various circumstances. This commit saves the results of the last
several lookups so they don't have to be recalculated each time.
The full generality of POSIX locales is such that you can have error
messages be displayed in one locale, say Spanish, while other things are
in French. To accommodate this generality, the program can loop through
all the locale categories finding the UTF8ness of the locale it points
to. However, in almost all instances, people are going to be in either
French or in Spanish, and not in some combination. Suppose it is a
French UTF-8 locale for all categories. This new cache will know that
the French locale is UTF-8, and the queries for all but the first
category can return that immediately.
This simple cache avoids the overhead of hashes.
This also fixes a bug I realized exists in threaded perls, but haven't
reproduced. We do not support locales in such perls, and the user must
not change the locale or 'use locale'. But perl itself could change the
locale behind the scenes, leading to segfaults or incorrect results.
One such instance is the determination of UTF8ness. But this only could
happen if the full generality of locales is used so that the categories
are not all in the same locale. This could only happen (if the user
doesn't change locales) if the environment is such that the perl program
is started up so that the categories are in such a state. This commit
fixes this potential bug by caching the UTF8ness of each category at
startup, before any threads are instantiated, and so checking for it
later just looks it up in the cache, without perl changing the locale.
|
|
|
|
|
|
|
|
|
|
| |
The LC_NUMERIC locale category is kept so that generally the decimal
point (radix) is a dot. For some (mostly) output purposes, it needs to
be swapped into the program's current underlying locale so that a
non-dot can be printed.
This commit changes things so that if the current underlying locale uses
a decimal point, the swap doesn't happen, as it's not needed.
|
|
|
|
| |
This somehow became unused or never got used; I didn't do the research.
|
|
|
|
| |
As explained in the docs, this helps detect spoofing attacks.
|
|
|
|
|
|
|
|
|
|
| |
Bits of exec code were putting the constructed commands into globals
PL_Argv and PL_Cmd, which could then be clobbered by reentrancy.
These are only global in order to manage their freeing, but that's
better managed by using the scope stack. So replace them with automatic
variables, with ENTER/SAVEFREEPV/LEAVE to free the memory. Also copy
the strings acquired from SVs, to avoid magic clobbering the buffers of
SVs already read. Fixes [perl #129888].
|
|
|
|
|
|
| |
The real purpose of this internal variable is to give the name of the
locale that is the underlying one for the C program. Various macros
already indicate that. This furthers the process.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 2bfbbbaf9ef1783ba914ff9e9270e877fbbb6aba changed things so -Dr
output could be changed through an environment variable to truncate
the output differently than the default.
For most purposes, the default is good enough, but for someone trying to
debug the regcomp internals, sometimes one wants to see more than is
output by default.
That commit did not catch all the places. This one changes the handling
so that any place that use the previous default maximum now uses the
environment variable (if set) instead.
|
|
|
|
| |
However, we do preserve it outside PERL_CORE for the use of XS authors.
|
|
|
|
|
| |
and use it to initialize hash randomization and to innoculate against
quadratic behaviour in pp_sort
|
|
|
|
|
|
| |
This is designed to generally replace nl_langinfo() in XS code. It is
thread-safer, hides the quirks of perl's LC_NUMERIC handling, and can be
used on systems lacking nl_langinfo.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Ensure that PL_sv_yes, PL_sv_undef, PL_sv_no and PL_sv_zero are allocated
adjacently in memory.
This allows the SvIMMORTAL() test to be more efficient, and will (in the
next commit) allow SvTRUE() to be more efficient.
In MULTIPLICITY builds the constraint is already met by virtue of them
being adjacent items in the interpreter struct. For non-MULTIPLICITY
builds, they were just 4 global vars with no guarantees of where
they would be allocated. For this case, PL_sv_undef are deleted
as global vars and replaced with a new global var PL_sv_immortals[4],
with
#define PL_sv_yes (PL_sv_immortals[0])
etc in their place.
|
|
|
|
|
|
|
|
|
|
| |
it's like PL_sv_no, except that its string value is "0" rather than "".
It can be used for example where pp function wants to push a zero return
value on the stack. The next commit will start to use it.
Also update the SvIMMORTAL() to be more efficient: it now checks whether
the SV's address is in a range rather than individually checking against
&PL_sv_undef, &PL_sv_no etc.
|
|
|
|
|
|
|
|
|
|
| |
Give Perl_nextargv its own statbuf and pass a pointer to it into
Perl_do_open_raw and thence S_openn_cleanup when needed.
Also reduce the scope of the existing statbuf in Perl_nextargv to make
it clear it's distinct from the one populated by do_open_raw.
Fix perldelta entry for PL_statbuf removal
|
| |
|
|
|
|
| |
This will be used in a future commit.
|
|
|
|
|
|
| |
These macros are being replaced by a safe version; they now generate a
deprecation message at each call site upon the first use there in each
program run.
|
|
|
|
|
|
|
| |
This variable really means the character that replaces any embedded NULs
when doing collation. Change the name accordingly. (Embedded NULs must
be replaced because the libc function strxfrm is used, and it operates
on C strings which have no embedded NULs.)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I thought this bug was in FREEBSD, but when I went to gather the info
needed to report it to the vendor, it turned out to be a mistake I had
made.
The problem is basically doubly encoding into UTF-8. In order to save
CPU time, in a UTF-8 locale, I had stored a string as UTF-8 encoded.
This string is to be inserted into a larger string. What I neglected to
consider in this situation is that not all strings in such locales need
be in UTF-8. The UTF-8 encoded insert could get added to a non-UTF-8
string, and the result later was switched to UTF-8, so the inserted
string's bytes were individually converted to UTF-8, effectively a
second time. This is a problem only if the inserted string is different
when encoded in UTF-8 than not, and for this particular usage, on most
platforms it was UTF-8 invariant, so did not show up, except on those
platforms where it was variant.
The solution is to store the replacement as a code point, and encode it
as UTF-8 only if necessary, once. This actually simplifies the code.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
FC didn't like my previous patch for this issue, so here is the
one he likes better. With tests and etc. :-)
The basic problem is that code like this: /(?{ s!!! })/ can trigger
infinite recursion on the C stack (not the normal perl stack) when the
last successful pattern in scope is itself. Since the C stack overflows
this manifests as an untrappable error/segfault, which then kills perl.
We avoid the segfault by simply forbidding the use of the empty pattern
when it would resolve to the currently executing pattern.
I imagine with a bit of effort someone can trigger the original SEGV,
unlike my original fix which forbade use of the empty pattern in a
regex code block. So if someone actually reports such a bug we might
have to revert to the older approach of prohibiting this.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now that that PADOFFSET is signed, make
PL_comppad_name_fill
PL_comppad_name_floor
PL_padix
PL_constpadix
PL_padix_floor
PL_min_intro_pending
PL_max_intro_pending
be of type PADOFFSET rather than I32, to match the rest of the pad
interface.
At the same time, change various I32 local vars in pad.c functions to be
PADOFFSET.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We have an interpreter variable using memory, PL_maxo, which is
defined to be the same as MAXO, a #defined constant. As far as I can
tell, it is never used in lvalue context, in core or on CPAN, except
for the initialisation in intrpvar.h.
It can simply be removed and replaced with a macro defined as equiva-
lent to MAXO.
It was added in this commit:
commit 84ea024ac9cdf20f21223e686dddea82d5eceb4f
Author: Perl 5 Porters <perl5-porters.nicoh.com>
Date: Tue Jan 2 23:21:55 1996 +0000
perl 5.002beta1h patch: perl.h
5.002beta1 attempted some memory optimizations, but unfortunately
they can result in a memory leak problem. This can be
avoided by #define STRANGE_MALLOC. I do that here until
consensus is reached on a better strategy for handling the
memory optimizations.
Include maxo for the maximum number of operations (needed
for the Safe extension).
But apparently it is not needed for the Safe extension (tests pass
without it).
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On some platforms, the libc strxfrm() works reasonably well on UTF-8
locales, giving a default collation ordering. It will assume that every
string passed to it is in UTF-8. This commit changes Perl to make sure
that strxfrm's expectations are met.
Likewise under a non-UTF-8 locale, strxfrm is expecting a non-UTF-8
string. And this commit makes sure of that as well.
So, simply meeting strxfrm's expectations allows Perl to start
supporting default collation in UTF-8 locales, and fixes it to work on
single-byte locales with UTF-8 input. (Unicode::Collate provides
tailorable functionality and is portable to platforms where strxfrm
isn't as intelligent, but is a much more heavy-weight solution that may
not be needed for particular applications.)
There is a problem in non-UTF-8 locales if the passed string contains
code points representable only in UTF-8. This commit causes them to be
changed, before being passed to strxfrm, into the highest collating
character in the locale that doesn't require UTF-8. They then will sort
the same as that character, which means after all other characters in
the locale but that one. In strings that don't have that character,
this will generally provide exactly correct operation. There still is a
problem, if that character, in the given locale, combines with adjacent
characters to form a specially weighted sequence. Then, the change of
these above-255 code points into that character can skew the results.
See the commit message for 6696cfa7cc3a0e1e0eab29a11ac131e6f5a3469e for
more on this. But it is really an illegal situation to have above-255
code points in a single-byte locale, so this behavior is a reasonable
degradation when given illegal input. If two transformed strings
compare exactly equal, Perl already uses the un-transformed versions to
break ties, and there, these faked-up strings will collate so the
above-255 code points sort after everything else, and in code point
order amongst themselves.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It's kind of guess work deciding how big a buffer to give to strxfrm().
If you give it too small a one, it will fail. Prior to this commit, the
buffer size was doubled and then strxfrm() was called again, looping
until it worked, or we used too much memory.
Each time a new locale is made, we try to minimize the necessity of
doing this by calculating numbers 'm' and 'b' that can be plugged into
the equation
mx + b
where 'x' is the size of the string passed to strxfrm(). strxfrm() is
roughly linear with respect to its input's length, so this generally
works without us having to do many loops to get a large enough size.
But on many systems, strxfrm(), in failing, returns how much space you
should have given it. On such systems, we can just use that number on
the 2nd try and not have to keep guessing. This commit changes to do
that.
But on other systems this doesn't work. So the original method is
retained if we determine that there are problems with strxfrm(), either
from previous experience, or because using the size returned from the
first trial didn't work
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
One of the problems in implementing Perl is that the C library routines
forbid embedded NUL characters, which Perl accepts. This is true for
the case of strxfrm() which handles collation under locale.
The best solution as far as functionality goes, would be for Perl to
write its own strxfrm replacement which would handle the specific needs
of Perl. But that is not going to happen because of the huge complexity
in handling it across many platforms. We would have to know the
location and format of the locale definition files for every such
platform. Some might follow POSIX guidelines, some might not.
strxfrm creates a transformation of its input into a new string
consisting of weight bytes. In the typical but general case, a 3
character NUL-terminated input string 'A B C 00' (spaces added for
readability) gets transformed into something like:
A¹ B¹ C¹ 01 A² B² C² 01 A³ B³ C³ 00
where the superscripted characters are weights for the corresponding
input characters. Superscript 1 represents (essentially) the primary
sorting key; 2, the secondary, etc, for as many levels as the locale
definition gives. The 01 byte is likely to be the separator between
levels, but not necessarily, and there could be some other mechanisms
used on various platforms.
To handle embedded NULs, the simplest thing would be to just remove them
before passing in to strxfrm(). Then they would be entirely ignored,
which might not be what you want. You might want them to have some
weight at the tertiary level, for example. It also causes problems
because strxfrm is very context sensitive. The locale definition can
define weights for specific sequences of any length (and the weights can
be multi-byte), and by removing a NUL, two characters now become
adjacent that weren't in the input, and they could now form one of those
special sequences and thus throw things off.
Another way to handle NULs, that seemingly ignores them, but actually
doesn't, is the mechanism in use prior to this commit. The input string
is split at the NULs, and the substrings are independently passed to
strxfrm, and the results concatenated together. This doesn't work
either. In our example 'A B C 00', suppose B is a NUL, and should have
some weight at the tertiary level. What we want is:
A¹ C¹ 01 A² C² 01 A³ B³ C³ 00
But that's not at all what you get. Instead it is:
A¹ 01 A² 01 A³ C¹ 01 C² 01 C³ 00
The primary weight of C comes immediately after the teriary weight of A,
but more importantly, a NUL, instead of being ignored at the primary
levels, is significant at all levels, so that "a\0c" would sort before
"ab".
Still another possibility is to replace the NUL with some other
character before passing it to strxfrm. That was my original plan, to
replace each NUL with the character that this code determines has the
lowest collation order for the current locale. On strings that don't
contain that character, the results would be as good as it gets for that
locale. That character is likely to be ignored at higher weight levels,
but have some small non-ignored weight at the lowest ones. And
hopefully the character would rarely be encountered in practice. When
it does happen, it and NUL would sort identically; hardly the end of the
world. If the entire strings sorted identically, the NUL-containing one
would come out before the other one, since the original Perl strings are
used as a tie breaker. However, testing showed a problem with this. If
that other character is part of a sequence that has special weighting,
the results won't be correct. With gcc, U+00B4 ACUTE ACCENT is the
lowest collating character in many UTF-8 locales. It combines in
Romanian and Vietnamese with some other characters to change weights,
and hence changing NULs into U+B4 screws things up.
What I finally have come to is to do is a modification of this final
approach, where the possible NUL replacements are limited to just
characters that are controls in the locale. NULs are replaced by the
lowest collating control. It would really be a defective locale if this
control combined with some other character to form a special sequence.
Often the character will be a 01, START OF HEADING. In the very
unlikely case that there are absolutely no controls in the locale, 01 is
used, because we have to replace it with something.
The code added by this commit is mostly utf8-ready. A few commits from
now will make Perl properly work with UTF-8 (if the platform supports
it). But until that time, this isn't a full implementation; it only
looks for the lowest-sorting control that is invariant, where the
the UTF8ness doesn't matter. The added tests are marked as TODO until
then.
|
|
|
|
| |
This will be used in future commits
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The regex engine when displaying debugging info, say under -Dr, will elide
data in order to keep the output from getting too long. For example,
the number of code points in all of Unicode matched by \w is quite
large, and so when displaying a pattern that matches this, only the
first some number of them are printed, and the rest are truncated,
represented by "...".
Sometimes, one wants to see more than what the
compiled-into-the-engine-max shows. This commit creates code to read
this environment variable to override the default max lengths. This
changes the lengths for everything to the input number, even if they
have different compiled maximums in the absence of this variable.
I'm not currently documenting this variable, as I don't think it works
properly under threads, and we may want to alter the behavior in various
ways as a result of gaining experience with using it.
|
|
|
|
| |
Its name implies that it's the top allocated element; in fact it's top+1.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds the final Unicode boundary type previously missing from core
Perl: the LineBreak one. This feature is already available in the
Unicode::LineBreak module, but I've been told that there are portability
and some other issues with that module. What's added here is a
light-weight version that is lacking the customizable features of the
module.
This implements the default Line Breaking algorithm, but with the
customizations that Unicode is expecting everybody to add, as their
test file tests for them. In other words, this passes Unicode's fairly
extensive furnished tests, but wouldn't if it didn't include certain
customizations specified by Unicode beyond the basic algorithm.
The implementation uses a look-up table of the characters surrounding a
boundary to see if it is a suitable place to break a line. In a few
cases, context needs to be taken into account, so there is code in
addition to the lookup table to handle those.
This should meet the needs for line breaking of many applications,
without having to load the module.
The algorithm is somewhat independent of the Unicode version, just like
the other boundary types. Only if new rules are added, or existing ones
modified is there need to go in and change this code. Otherwise,
running regen/mk_invlists.pl should be sufficient when a new Unicode
release is done to keep it up-to-date, again like the other Unicode
boundary types.
|
|
|
|
| |
Saves memory in interp struct.
|
|
|
|
|
|
| |
As ilmari++ points out, the fix didn't work on builds without
PERL_IMPLICIT_CONTEXT (including non-threaded, non-multiplicity) or
PERL_DEBUG_READONLY_COW.
|
|
|
|
|
|
|
|
|
|
| |
There are two version numbers in intrpvar.h that have been repeatedly but
confusingly bumped by an older version of Porting/bump-perl-version. Now
that the porting script ignores intrpvar.h, it's better to restore the
version numbers to the way they were when they were originally added. The
comment for PERL_LAST_5_18_0_INTERP_MEMBER was introduced in commit
d399cf59bde32e412ae99791ae46a871c7337b42, and the comments for PL_timesbuf
was introduced in 25983af42cdcf2dc1fea6717dac7aac441b6301d.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Niko Tyni of Debian points out that the size of the interpreter structure
differs between plain and -DDEBUGGING builds, and that this breaks binary
compatibility of XS modules between such builds. Making the definition of
PL_memory_debug_header unconditional on PERL_TRACK_MEMPOOL (which itself is
defined only on debug builds) eliminates this needless incompatibility.
There is some confusion about whether plain and debug builds are expected to
be compatible. Commit 1e8125c621275d18c74bc8dae3bfc3c03929fe1e (July 2010)
refers in passing to "binary incompatible perls with the same API version
(i.e. the same perl version configured with and without DEBUGGING)". But
f2b88940d815760ad254d35a0ee1eb2ed8ce7762 (November 2009) says explicitly
that "-DDEBUGGING and not need to be binary compatible with each other", and
I think this explicit statement is a better example to follow.
Further, this compatibility is clearly useful for our downstream packagers
(as reported by Niko), and for any users who'd like to be able to use a
debug build for tracking down problems (including those encountered while
using modules with XS parts).
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
| |
In various places we test for both (PL_tainting && PL_tainted).
Since if tainting isn't enabled PL_tainted should never get set, it's
more efficient to just test for (TAINT_get).
We ensure that PL_tainted doesn't actually get set when !PL_tainting
by changing some "setting" macros from PL_tainted = TRUE to
PL_tainted = PL_tainting.
|