| Commit message (Collapse) | Author | Age | Files | Lines |
| |
We have a weird bifurcation of the cop logic around threads. With
threads we use a char * cop_file member, without it we use a GV * and
replace cop_file with cop_filegv.
The GV * code refcounts filenames and more or less efficiently shares
the filename amongst many opcodes. However under threads we were
simply copying the filenames into each opcode. This is because in
theory opcodes created in one thread can be destroyed in another. I say
in theory because as far as I know the core code does not actually do
this. But we have tests that you can construct a perl, clone it, and
then destroy the original, and have the copy work just fine. This means
that opcodes constructed in the main thread will be destroyed in the
cloned thread. This in turn means that you can't put SV derived
structures into the op-tree under threads, which is why we cannot use
the GV * strategy under threads.
As such this code adds a new struct/type RCPV, which is a refcounted
string using shared memory. This is implemented in such a way that code
that previously used a char * can continue to do so, as the refcounting
data is located at a specific offset before the char * pointer itself.
This also allows the len data to be embedded "into" the PV, which allows
us to expose macros to access the length of what is in theory a null
terminated string.
    struct rcpv {
        UV refcount;
        STRLEN len;
        char pv[1];
    };
    typedef struct rcpv RCPV;
The struct is sized appropriately on creation in rcpv_new() so that the
pv member contains the full string plus a null byte. It then returns a
pointer to the pv member of the struct. Thus the refcount and length are
embedded at a predictable offset in front of the char *, which means we
do not have to change any types for members using this.
We provide three operations: rcpv_new(), rcpv_copy() and rcpv_free(),
which roughly correspond with newSVpv(), SvREFCNT_inc(), SvREFCNT_dec(),
and a handful of macros as well. We also expose SAVERCPVFREE which is
similar to SAVEGENERICSV but operates on pv's constructed with
rcpv_new().
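The layout described above can be illustrated with a minimal standalone sketch. It is not the core implementation: the demo_ names are hypothetical, size_t stands in for UV and STRLEN, a C99 flexible array member replaces pv[1], the stored length excludes the trailing NUL (an assumption), and the refcount is not protected against concurrent access.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct rcpv; not the core definition. */
struct demo_rcpv {
    size_t refcount;
    size_t len;     /* assumption: length without the trailing NUL */
    char   pv[];    /* the commit uses pv[1]; a flexible array is used here */
};

/* Step back from the char * handed to callers to the header before it. */
static struct demo_rcpv *demo_rcpv_from_pv(const char *pv) {
    assert(pv != NULL);
    return (struct demo_rcpv *)(void *)((char *)pv - offsetof(struct demo_rcpv, pv));
}

/* Allocate header plus string in one block and return only the string part. */
char *demo_rcpv_new(const char *src, size_t len) {
    struct demo_rcpv *r = malloc(offsetof(struct demo_rcpv, pv) + len + 1);
    if (r == NULL)
        return NULL;
    r->refcount = 1;
    r->len = len;
    memcpy(r->pv, src, len);
    r->pv[len] = '\0';
    return r->pv;
}

/* Take another reference; the same pointer is returned, nothing is copied. */
char *demo_rcpv_copy(char *pv) {
    demo_rcpv_from_pv(pv)->refcount++;
    return pv;
}

/* Drop one reference and release the block when the last one goes away. */
void demo_rcpv_free(char *pv) {
    struct demo_rcpv *r = demo_rcpv_from_pv(pv);
    assert(r->refcount > 0);
    if (--r->refcount == 0)
        free(r);
}

size_t demo_rcpv_len(const char *pv)      { return demo_rcpv_from_pv(pv)->len; }
size_t demo_rcpv_refcount(const char *pv) { return demo_rcpv_from_pv(pv)->refcount; }
```

Because callers only ever see a plain char *, existing code that prints or compares the string keeps working, while code that knows about the header can read the length without calling strlen().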
Currently I have not restricted use of this logic to threaded perls. We
simply do not use it in unthreaded perls, but I see no reason we
couldn't normalize the code to use this in both cases, except possibly
that actually the GV case is more efficient.
Note that rcpv_new() does NOT use a hash table to dedup strings. Two
calls to rcpv_new() with the same arguments will produce two distinct
pointers with their own refcount data.
Refcounting the cop_file data was Tony Cook's idea.
| |
Per RFC 18, whenever `use feature 'module_true';` is enabled in a scope,
any file required with `require` has an implicit return value of true
and will not trigger the "did not return a true value" error condition.
This includes logic to use the OPf_SPECIAL flag for OP_RETURN listops to
indicate that the module_true feature is in effect when it executes.
This flag plays no role unless the OP_RETURN tail calls the pp_leaveeval
logic, so it doesn't affect normal sub returns.
| |
This converts INIT {} blocks from the Module::Install::DSL
namespace into BEGIN blocks. This works around the bug reported in
GH Issue #16300 (hopefully; this is not fully tested yet), which in turn
should allow us to close the bug in #2754.
See also PR: #20168 and Issue: #20161 both of which are blocked by
this.
| |
Don't just reuse the cSVOPx_sv() one any more, so we no longer have to
keep the same storage shape for SVOPs vs METHOPs.
| |
Several of these were missing:
cMETHOP, cMETHOPo, kMETHOP
Also, the field-accessing ones:
cMETHOP_meth cMETHOP_rclass cMETHOPo_meth cMETHOPo_rclass
This commit adds them all, and uses them to neaten other code where
appropriate.
| |
This commit clarifies in perlapi that these macros take a parameter that
is a CPP token, and that Devel::PPPort can't blindly generate test
cases for them.
| |
see it
Also defend it against side-effects in arguments, by allocating
temporary variables.
| |
execution order pointer, not a tree-structural one
| |
* Add feature, experimental warning, keyword
* Basic parsing
* Basic implementation as optree fragment
See also
https://github.com/Perl/perl5/issues/18504
| |
This just detabifies to get rid of the mixed tab/space indentation.
Applying consistent indentation and dealing with other tabs are another issue.
Done with `expand -i`.
* vutil.* left alone, it's part of version.
* Left regen managed files alone for now.
| |
This reverts commit 1d6cadf136bf2c85058a5359fb48b09b3ea9fe6f.
Due to cpan breakage: GH #18374 #18375 #18376
| |
so that they aren't accessible to XS code and won't be picked up by
autodoc
| |
This reverts commit a5d5855671af6956a8d1a13e419457afdffeb416.
It turns out that CPAN modules are using these values; whether
they should be using them or not, I don't know.
| |
These are internal only
| |
Many of the files in perl are for one thing only, and hence their
embedded documentation will be for that one thing. By creating a hash
here of them, those files don't have to worry about what section that
documentation goes under, and so it can be completely changed without
affecting them.
| |
This feature allows documentation destined for perlapi or perlintern to
be split into sections of related functions, no matter where the
documentation source is. Prior to this commit the line had to contain
the exact text of the title of the section. Now it can be a $variable
name that autodoc.pl expands to the title. It still has to be an exact
match for the variable in autodoc, but now, the expanded text can be
changed in autodoc alone, without other files needing to be updated at
the same time.
| |
Nothing in the test suite (nor apparently CPAN) had exercised this area
of the code, and so this flaw hadn't been discovered. But new code
about to be committed does.
| |
For: https://github.com/Perl/perl5/pull/18201
Committer: Samanta Navarro is now a Perl author.
To keep 'make test_porting' happy: Increment $VERSION in several files.
Regenerate uconfig.h via './perl -Ilib regen/uconfig_h.pl'.
| |
We already have a macro that expands to what this code does; it's
clearer to use it.
| |
This uses a new organization of sections that I came up with. I asked
for comments on p5p, but there were none.
| |
apidoc_section is slightly favored over head1, as it is known only to
autodoc, and can't be confused with real pod.
| |
I was never happy with this short form, and other people weren't either.
Now that most things are better expressed in terms of av_count, convert
the few remaining items that are clearer when referring to an index into
using the fully spelled out form
| |
Fixes #17871
The op slab allocator code made the assumption that since OP and
hence OPSLOT contain pointers, the base of each of those would be
an integral number of sizeof(pointer) (pointer units) from the
beginning of OPSLAB.
This assumption is non-portable, and broke calculating the location
of the slab based on the address of the op slot and the op slot offset
on m68k platforms.
To avoid that, this change now stores the opslot_offset as the
offset in pointer units from the beginning of opslab_slots rather
than from the beginning of the slab.
If alignment on a pointer boundary for OPs is required, the compiler
will align opslab_slots, and since we work in pointer units from there,
any allocated op slots will also be aligned.
If we assume PADOFFSET is no larger than a pointer and requires no
stricter alignment and structures in themselves have no stricter
alignment requirements then since we work in pointer units all core
OP structures should have sufficient alignment (if this isn't true,
then it's not a new problem, and not the problem I'm trying to solve
here.)
I haven't been able to test this on m68k hardware (the emulator I
tried to use can't maintain a network connection.)
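The offset scheme described in this message can be sketched as follows. This is a simplified illustration with hypothetical demo_ names and a made-up header, not the real OPSLAB or OPSLOT layout: each slot records its distance from the start of the slot area in pointer units, and the slab is recovered by stepping back to that area and then subtracting the area's offsetof() within the slab, so no assumption is made about the size of the slab header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_UNITS 64   /* hypothetical number of pointer units per slab */

/* Hypothetical slot header; the op body would follow it. */
struct demo_slot {
    uint16_t offset;    /* distance from slab->slots, in pointer units */
};

/* Hypothetical slab: a header of arbitrary size, then the slot area. */
struct demo_slab {
    struct demo_slab *next;
    uint16_t          flags;
    void             *slots[DEMO_UNITS];  /* the compiler aligns this for pointers */
};

struct demo_slab *demo_slab_new(void) {
    return calloc(1, sizeof(struct demo_slab));
}

/* Place a slot at the given pointer-unit offset inside the slot area. */
struct demo_slot *demo_slot_at(struct demo_slab *slab, uint16_t offset) {
    struct demo_slot *slot;
    assert(slab != NULL && offset < DEMO_UNITS);
    slot = (struct demo_slot *)(void *)&slab->slots[offset];
    slot->offset = offset;
    return slot;
}

/* Recover the owning slab from a slot using only the stored offset. */
struct demo_slab *demo_slab_of(struct demo_slot *slot) {
    void **first = (void **)(void *)slot - slot->offset;
    return (struct demo_slab *)(void *)((char *)first - offsetof(struct demo_slab, slots));
}
```

Measuring from the slot area rather than from the start of the slab is what removes the dependency on the header being a whole number of pointer units long.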
| |
alignment"
This reverts commit a760468c9355bafaee57e94f13705c0ea925d9ca.
This change is fragile, the next change avoids the need for such
manual padding.
| |
On m68k, the natural alignment is 16 bits, which causes the opslab_slots
member of struct opslab to be aligned at a 16-bit offset. Other 32-bit
and 64-bit architectures have a natural alignment of at least 32 bits, so
the offset is always guaranteed to be at least 32-bit-aligned.
Fix this by adding additional padding bytes before the opslab_slots
member, both when PERL_DEBUG_READONLY_OPS is defined and when it is not,
to ensure the offset of opslab_slots is always 32-bit-aligned.
On architectures which have a natural alignment of at least 32 bits,
the padding does not affect the alignment, offsets or struct size.
| |
Mostly in comments and docs, but some in diagnostic messages and one
case of 'or die die'.
| |
Previously, freed ops were stored as one singly linked list, and
a failed search for a free op to re-use could potentially search
that entire list, making freed op lookups O(number of freed ops),
or given that the number of freed ops is roughly proportional to
program size, making the total cost of freed op handling roughly
O((program size)**2). This was bad.
This change makes opslab_freed into an array of linked list heads,
one per op size. Since in a practical sense the number of op sizes
should remain small, and insertion is amortized O(1), this makes
freed op management now roughly O(program size).
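A minimal sketch of the bucketed free list, under stated assumptions: the demo_ names, the fixed cap on op sizes, and the choice to fall back to the next larger non-empty bucket are all illustrative and not taken from the commit. The point is that both insertion and lookup touch at most one list head per op size instead of walking every freed op.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_UNITS 16   /* hypothetical cap on op size, in pointer units */

/* Hypothetical freed op: only the link and its size matter here. */
struct demo_free_op {
    struct demo_free_op *next;
    size_t               units;
};

/* One list head per op size, indexed by size in pointer units. */
struct demo_freed {
    struct demo_free_op *heads[DEMO_MAX_UNITS];
};

/* O(1): push the freed op onto the list for its own size. */
void demo_freed_push(struct demo_freed *f, struct demo_free_op *op, size_t units) {
    assert(units < DEMO_MAX_UNITS);
    op->units = units;
    op->next = f->heads[units];
    f->heads[units] = op;
}

/* Bounded by the number of sizes: take the first op that is large enough. */
struct demo_free_op *demo_freed_pop(struct demo_freed *f, size_t units) {
    size_t i;
    assert(units < DEMO_MAX_UNITS);
    for (i = units; i < DEMO_MAX_UNITS; i++) {
        struct demo_free_op *op = f->heads[i];
        if (op != NULL) {
            f->heads[i] = op->next;
            return op;
        }
    }
    return NULL;   /* nothing suitable; the caller would allocate a fresh slot */
}
```

A failed lookup now costs at most DEMO_MAX_UNITS head checks, regardless of how many ops have been freed.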
fixes #17555
| |
The algorithm for dealing with Unicode property wildcards is to wrap the
user-supplied pattern with /miaa. We don't want the user to be able to
override the /m and /aa parts. Modifiers that are only specifiable as a
modifier in a qr or similar op (like /gc) can't be included in things
like (?gc). These normally incur a warning that they are ignored, but
the texts of those warnings are misleading when using wildcards, so I
chose to just make them illegal. Of course that could be changed to
having custom useful warning texts, but I didn't think it was worth it.
I also chose to forbid recursion of using nested \p{}, just from fear
that it might lead to issues down the road, and it really isn't useful
for this limited universe of strings to match against. Because
wildcards currently can't handle '}' inside them, only the single letter
\p,\P are valid anyway.
Similarly, I forbid the '*' quantifier to make it harder for the
constructed subpattern to take forever to make any progress and decide
to halt. Again, using it would be overkill on the universe of possible
match strings.
| |
This is in preparation for adding a new flag bit at the end in a future
commit. It could have been added in the unused space that the first of
these was moved to, but the new one is less important/used, so I thought
it best to come last. The reason to use unused space is to preserve
binary compatibility with the bits, and we don't care about that at this
point in the development cycle.
| |
This large commit removes the last use of swashes from core.
It replaces swashes by inversion maps. This data structure is already
in use for some Unicode properties, such as case changing.
The inversion map data structure leads to straightforward
implementation code, so I collapsed the two doop.c routines
do_trans_complex_utf8() and do_trans_simple_utf8() into one. A few
conditionals could be avoided in the loop if this function were split so
that one version didn't have to test for, e.g., squashing, but I suspect
these are in the noise in the loop, which has to deal with UTF-8
conversions. This should be faster than the previous implementation
anyway. I measured the differences some releases back, and inversion
maps were faster than the equivalent swash for up to 512 or 1024
different ranges. These numbers are unlikely to be exceeded in tr///
except possibly in machine-generated ones.
Inversion maps are capable of handling both UTF-8 and non-UTF-8 cases,
but I left in the existing non-UTF-8 implementation, which uses tables,
because I suspect it is faster. This means that there is extra code,
purely for runtime performance.
An inversion map is always created from the input, and then if the table
implementation is to be used, the table is easily derived from the map.
Prior to this commit, the table implementation was used in certain edge
cases involving code points above 255. Those cases are now handled by
the inversion map implementation, because it would have taken extra code
to detect them, and I didn't think it was worth it. That could be
changed if I am wrong.
Creating an inversion map for all inputs essentially normalizes them,
and then the same logic is usable for all. This fixes some false
negatives in the previous implementation. It also allows for detecting
if the actual transliteration can be done in place. Previously, the
code mostly punted on that detection for the UTF-8 case.
This also allows for accurate counting of the lengths of the two sides,
fixing some longstanding TODO warning tests.
A new flag is created, OPpTRANS_CAN_FORCE_UTF8, when the tr/// has a
below 256 character resolving to one that requires UTF-8. If this isn't
set, the code knows that a non-UTF-8 input won't become UTF-8 in the
process, and so can take short cuts. The bit representing this flag is
the same as OPpTRANS_FROM_UTF, which is no longer used. That name is
left in so that the dozen-ish modules in cpan that refer to it can still
compile. AFAICT none of them actually use the flag, as well they
shouldn't since it is private to the core.
Inversion maps are ideally suited for tr/// implementations. An issue
with them in general is that for some pathological data, they can become
fragmented requiring more space than you would expect, to represent the
underlying data. However, the typical tr/// would not have this issue,
requiring only very short inversion maps to represent; in some cases
shorter than the table implementation.
Inversion maps are also easier to deparse than swashes. A deparse TODO
was also fixed by this commit, and the code to deparse UTF-8 inputs is
simplified.
One could implement specialized data structures for specific types of
inputs. For example, a common tr/// form is a single range, like
tr/A-Z/a-z/. That could be implemented without a table and be quite
fast. An intermediate step would be to use the inversion map
implementation always when the transliteration is a single range, and
then special case length=1 maps at execution time.
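For readers unfamiliar with the data structure, here is a simplified sketch of an inversion map lookup. It is an illustration only: the demo_ names and the DEMO_UNCHANGED marker are hypothetical, and the core implementation has additional special values and handling that this omits. A sorted array holds the first code point of each range, a parallel array holds the image of that first code point, and a code point inside a range maps to the image plus its distance from the range start.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_UNCHANGED UINT32_MAX   /* hypothetical marker: range maps to itself */

/* Hypothetical inversion map: range i covers starts[i] .. starts[i+1]-1. */
struct demo_invmap {
    const uint32_t *starts;   /* sorted ascending; starts[0] must be 0 */
    const uint32_t *maps;     /* maps[i] is the image of starts[i] */
    size_t          n;        /* number of ranges */
};

/* Binary search for the range containing cp, then apply that range's map. */
uint32_t demo_invmap_lookup(const struct demo_invmap *m, uint32_t cp) {
    size_t lo = 0, hi = m->n;   /* invariant: starts[lo] <= cp, and cp < starts[hi] if hi < n */
    assert(m->n > 0 && m->starts[0] == 0);
    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;
        if (m->starts[mid] <= cp)
            lo = mid;
        else
            hi = mid;
    }
    if (m->maps[lo] == DEMO_UNCHANGED)
        return cp;
    return m->maps[lo] + (cp - m->starts[lo]);
}
```

Under these assumptions, tr/A-Z/a-z/ needs only three ranges: everything below 'A' is unchanged, 'A' through 'Z' maps onto 'a' through 'z', and everything from '[' upward is unchanged again.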
Thanks to Nicholas Rochemagne for his help on B
| |
These two flags will shortly become obsolete, replaced by ones with
different meanings. This commit makes the new ones the normal ones, and
makes the old names synonyms so that code that refers to them can
compile.
| |
This currently uses 0xfeedface as a marker for something that isn't a
legal value. But that could in fact become legal at some point. This
defines a value TR_OOB that can be guaranteed not to become legal.
| |
For legibility and maintainability
| |
This makes it more mnemonic. Also add an explanation in toke.c
| |
This is no longer used.
| |
Commit 0f9a6232f0af0895807ddd0afae2d5512aa91bf9 removed the #ifdef
PERL_OP_PARENT, but left the #define directives indented.
| |
For some reason I was storing the counts of sub signature parameters and
optional parameters as signed ints. Since these can never be negative,
change them to UV instead.
| |
This op is of class OP_UNOP_AUX. Ops of this class have an op_aux pointer
which typically points to a variable-length malloced array of IVs,
UVs, etc. However in the specific case of OP_ARGCHECK the data stored
in the aux struct is fixed. So this commit casts the aux pointer to a
struct containing the relevant fields (number of parameters etc), rather
than referring to them as aux[0], aux[1] etc. This makes the code more
readable.
Should be no functional changes.
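The readability gain can be shown with a small sketch. Everything here is hypothetical: the demo_ types, the two field names, and the argument-count check are illustrative stand-ins rather than the core definitions. The generic aux pointer is viewed through a struct with named fields instead of being indexed as aux[0], aux[1].

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical generic aux item, standing in for the usual array element. */
typedef union {
    intptr_t  iv;
    uintptr_t uv;
    void     *pv;
} demo_aux_item;

/* Hypothetical fixed layout for the argument-check data. */
struct demo_argcheck_aux {
    uintptr_t params;       /* total number of positional parameters */
    uintptr_t opt_params;   /* how many of those are optional */
};

/* Hypothetical op carrying a generic aux pointer. */
struct demo_unop_aux {
    demo_aux_item *aux;
};

/* Store the struct behind the generic pointer. */
void demo_set_argcheck(struct demo_unop_aux *op, struct demo_argcheck_aux *a) {
    op->aux = (demo_aux_item *)(void *)a;
}

/* View the generic pointer as the fixed struct again. */
const struct demo_argcheck_aux *demo_argcheck(const struct demo_unop_aux *op) {
    return (const struct demo_argcheck_aux *)(const void *)op->aux;
}

/* Named fields make the check read like its description. */
int demo_args_ok(const struct demo_unop_aux *op, uintptr_t argc) {
    const struct demo_argcheck_aux *a = demo_argcheck(op);
    assert(a->opt_params <= a->params);
    return argc >= a->params - a->opt_params && argc <= a->params;
}
```

The cast changes only how the pointer is interpreted, which is consistent with the message's claim that there should be no functional changes.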
| |
original merge commit: v5.31.3-198-gd2cd363728
reverted by: v5.31.4-0-g20ef288c53
The commit following this commit fixes the breakage, which means
the revert can be undone.
| |
This reverts commit d2cd363728088adada85312725ac9d96c29659be, reversing
changes made to 068b48acd4bdf9e7c69b87f4ba838bdff035053c.
This change breaks installing Test::Deep:
...
not ok 37 - Test 'isa eq' completed
ok 38 - Test 'isa eq' no premature diagnostication
...
|