| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
| |
|
| |
|
|
|
|
| |
execution order pointer, not a tree-structural one
|
|
|
|
|
|
|
|
|
| |
* Add feature, experimental warning, keyword
* Basic parsing
* Basic implementation as optree fragment
See also
https://github.com/Perl/perl5/issues/18504
|
|
|
|
|
|
|
|
|
|
|
| |
This just detabifies to get rid of the mixed tab/space indentation.
Applying consistent indentation and dealing with other tabs are another issue.
Done with `expand -i`.
* vutil.* left alone, it's part of version.
* Left regen managed files alone for now.
|
|
|
|
|
|
| |
This reverts commit 1d6cadf136bf2c85058a5359fb48b09b3ea9fe6f.
Due to cpan breakage: GH #18374 #18375 #18376
|
|
|
|
|
| |
so that they aren't accessible to XS code and won't be picked up by
autodoc
|
|
|
|
|
|
|
| |
This reverts commit a5d5855671af6956a8d1a13e419457afdffeb416.
It turns out that CPAN modules are using these values; whether
they should be using them or not, I don't know.
|
|
|
|
| |
These are internal only
|
|
|
|
|
|
|
|
| |
Many of the files in perl are for one thing only, and hence their
embedded documentation will be for that one thing. By creating a hash
here of them, those files don't have to worry about what section that
documentation goes under, and so it can be completely changed without
affecting them.
|
|
|
|
|
|
|
|
|
|
|
| |
This feature allows documentation destined for perlapi or perlintern to
be split into sections of related functions, no matter where the
documentation source is. Prior to this commit the line had to contain
the exact text of the title of the section. Now it can be a $variable
name that autodoc.pl expands to the title. It still has to be an exact
match for the variable in autodoc, but now, the expanded text can be
changed in autodoc alone, without other files needing to be updated at
the same time.
|
|
|
|
|
|
| |
Nothing in the test suite (nor apparently CPAN) had exercised this area
of the code, and so this flaw hadn't been discovered. But new code
about to be commited does.
|
|
|
|
|
|
|
|
|
| |
For: https://github.com/Perl/perl5/pull/18201
Committer: Samanta Navarro is now a Perl author.
To keep 'make test_porting' happy: Increment $VERSION in several files.
Regenerate uconfig.h via './perl -Ilib regen/uconfig_h.pl'.
|
|
|
|
|
| |
We already have a macro that expands to what this code does; it's
clearer to use it.
|
|
|
|
|
| |
This uses a new organization of sections that I came up with. I asked
for comments on p5p, but there were none.
|
|
|
|
|
| |
apidoc_section is slightly favored over head1, as it is known only to
autodoc, and can't be confused with real pod.
|
|
|
|
|
|
|
| |
I was never happy with this short form, and other people weren't either.
Now that most things are better expressed in terms of av_count, convert
the few remaining items that are clearer when referring to an index into
using the fully spelled out form
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes #17871
The op slab allocator code made the assumption that since OP and
hence OPSLOT contain pointers, the base of each of those would be
an integral number of sizeof(pointer) (pointer units) from the
beginning of OPSLAB.
This assumption is non-portable, and broke calculating the location
of the slab based on the address of the op slot and the op slot offset
on m68k platforms.
To avoid that, this change now stores the opslot_offset as the
offset in pointer units from the beginning of opslab_slots rather
than from the beginning of the slab.
If alignment on a pointer boundary for OPs is required, the compiler
will align opslab_opslots and since we work in pointer units from there,
any allocated op slots will also be aligned.
If we assume PADOFFSET is no larger than a pointer and requires no
stricter alignment and structures in themselves have no stricter
alignment requirements then since we work in pointer units all core
OP structures should have sufficient alignment (if this isn't true,
then it's not a new problem, and not the problem I'm trying to solve
here.)
I haven't been able to test this on m68k hardware (the emulator I
tried to use can't maintain a network connection.)
|
|
|
|
|
|
|
|
|
| |
alignment"
This reverts commit a760468c9355bafaee57e94f13705c0ea925d9ca.
This change is fragile, the next change avoids the need for such
manual padding.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On m68k, the natural alignment is 16 bits which causes the opslab_opslot
member of struct opslab to be aligned at a 16-bit offset. Other 32-bit
and 64-bit architectures have a natural alignment of at least 32 bits, so
the offset is always guaranteed to be at least 32-bit-aligned.
Fix this by adding additional padding bytes before the opslab_opslot
member, both for cases when PERL_DEBUG_READONLY_OPS defined and not
defined to ensure the offset of oplab_slots is always 32-bit-aligned.
On architectures which have a natural alignment of at least 32 bits,
the padding does not affect the alignment, offsets or struct size.
|
|
|
|
|
| |
Mostly in comments and docs, but some in diagnostic messages and one
case of 'or die die'.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
previously freed ops were stored as one singly linked list, and
a failed search for a free op to re-use could potentially search
that entire list, making freed op lookups O(number of freed ops),
or given that the number of freed ops is roughly proportional to
program size, making the total cost of freed op handling roughly
O((program size)**2). This was bad.
This change makes opslab_freed into an array of linked list heads,
one per op size. Since in a practical sense the number of op sizes
should remain small, and insertion is amortized O(1), this makes
freed op management now roughly O(program size).
fixes #17555
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The algorithm for dealing with Unicode property wildcards is to wrap the
user-supplied pattern with /miaa. We don't want the user to be able to
override the /m and /aa parts. Modifiers that are only specifiable as a
modifier in a qr or similar op (like /gc) can't be included in things
like (?gc). These normally incur a warning that they are ignored, but
the texts of those warnings are misleading when using wildcards, so I
chose to just make them illegal. Of course that could be changed to
having custom useful warning texts, but I didn't think it was worth it.
I also chose to forbid recursion of using nested \p{}, just from fear
that it might lead to issues down the road, and it really isn't useful
for this limited universe of strings to match against. Because
wildcards currently can't handle '}' inside them, only the single letter
\p,\P are valid anyway.
Similarly, I forbid the '*' quantifier to make it harder for the
constructed subpattern to take forever to make any progress and decide
to halt. Again, using it would be overkill on the universe of possible
match strings.
|
|
|
|
|
|
|
|
|
| |
This is in preparation for adding a new flag bit at the end in a future
commit. It could have been added in the unused space that the first of
these was moved to, but the new one is less important/used, so I thought
it best to come last. The reason to use unused space is to preserve
binary compatibility with the bits, and we don't care about that at this
point in the development cycle.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This large commit removes the last use of swashes from core.
It replaces swashes by inversion maps. This data structure is already
in use for some Unicode properties, such as case changing.
The inversion map data structure leads to straight forward
implementation code, so I collapsed the two doop.c routines
do_trans_complex_utf8() and do_trans_simple_utf8() into one. A few
conditionals could be avoided in the loop if this function were split so
that one version didn't have to test for, e.g., squashing, but I suspect
these are in the noise in the loop, which has to deal with UTF-8
conversions. This should be faster than the previous implementation
anyway. I measured the differences some releases back, and inversion
maps were faster than the equivalent swash for up to 512 or 1024
different ranges. These numbers are unlikely to be exceeded in tr///
except possibly in machine-generated ones.
Inversion maps are capable of handling both UTF-8 and non-UTF-8 cases,
but I left in the existing non-UTF-8 implementation, which uses tables,
because I suspect it is faster. This means that there is extra code,
purely for runtime performance.
An inversion map is always created from the input, and then if the table
implementation is to be used, the table is easily derived from the map.
Prior to this commit, the table implementation was used in certain edge
cases involving code points above 255. Those cases are now handled by
the inversion map implementation, because it would have taken extra code
to detect them, and I didn't think it was worth it. That could be
changed if I am wrong.
Creating an inversion map for all inputs essentially normalizes them,
and then the same logic is usable for all. This fixes some false
negatives in the previous implementation. It also allows for detecting
if the actual transliteration can be done in place. Previously, the
code mostly punted on that detection for the UTF-8 case.
This also allows for accurate counting of the lengths of the two sides,
fixing some longstanding TODO warning tests.
A new flag is created, OPpTRANS_CAN_FORCE_UTF8, when the tr/// has a
below 256 character resolving to one that requires UTF-8. If this isn't
set, the code knows that a non-UTF-8 input won't become UTF-8 in the
process, and so can take short cuts. The bit representing this flag is
the same as OPpTRANS_FROM_UTF, which is no longer used. That name is
left in so that the dozen-ish modules in cpan that refer to it can still
compile. AFAICT none of them actually use the flag, as well they
shouldn't since it is private to the core.
Inversion maps are ideally suited for tr/// implementations. An issue
with them in general is that for some pathological data, they can become
fragmented requiring more space than you would expect, to represent the
underlying data. However, the typical tr/// would not have this issue,
requiring only very short inversion maps to represent; in some cases
shorter than the table implementation.
Inversion maps are also easier to deparse than swashes. A deparse TODO
was also fixed by this commit, and the code to deparse UTF-8 inputs is
simplified.
One could implement specialized data structures for specific types of
inputs. For example, a common tr/// form is a single range, like
tr/A-Z/a-z/. That could be implemented without a table and be quite
fast. An intermediate step would be to use the inversion map
implementation always when the transliteration is a single range, and
then special case length=1 maps at execution time.
Thanks to Nicholas Rochemagne for his help on B
|
| |
|
|
|
|
|
|
|
| |
These two flags will shortly become obsolete, replaced by ones with
different meanings. This flag makes the new ones the normal ones, and
makes the old names synonyms so that code that refers to them can
compile.
|
|
|
|
|
|
| |
This currently uses 0xfeedface as a marker for something that isn't a
legal value. But that could in fact become legal at same point. This
defines a value TR_OOB that can be guaranteed not to become legal.
|
|
|
|
| |
For legibility and maintainability
|
|
|
|
| |
This makes it more mnemonic. Also add an explanation in toke.c
|
|
|
|
| |
This is no longer used.
|
| |
|
|
|
|
|
| |
Commit 0f9a6232f0af0895807ddd0afae2d5512aa91bf9 removed the #ifdef
PERL_OP_PARENT, but left the #define directives indented.
|
|
|
|
|
|
| |
For some reason I was storing the counts of sub signature parameters and
optional parameters as signed ints. Since these can never be negative,
change them to UV instead.
|
|
|
|
|
|
|
|
|
|
|
|
| |
This op is of class OP_UNOP_AUX, Ops of this class have an op_aux pointer
which typically points to a variable-length malloced array of IVs,
UVs, etc. However in the specific case of OP_ARGCHECK the data stored
in the aux struct is fixed. So this commit casts the aux pointer to a
struct containing the relevant fields (number of parameters etc), rather
than referring to them as aux[0], aux[1] etc. This makes the code more
readable.
Should be no functional changes.
|
|
|
|
|
|
|
|
| |
original merge commit: v5.31.3-198-gd2cd363728
reverted by: v5.31.4-0-g20ef288c53
The commit following this commit fixes the breakage, which that means
the revert can be undone.
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit d2cd363728088adada85312725ac9d96c29659be, reversing
changes made to 068b48acd4bdf9e7c69b87f4ba838bdff035053c.
This change breaks installing Test::Deep:
...
not ok 37 - Test 'isa eq' completed
ok 38 - Test 'isa eq' no premature diagnostication
...
|
|
|
|
|
|
| |
This function makes use of PL_curstackinfo->si_cxsubix to avoid the
overhead of a call to block_gimme() when the context of the op is
unknown.
|
|
|
|
| |
This is because they take a preprocessor token as an argument
|
|
|
|
|
|
|
|
|
| |
Currently, each allocated opslot has a pointer to the opslot that was
allocated immediately above it. Replace this with a U16 opslot_size field
giving the size of the opslot. The next opslot can then be found by
adding slot->opslot_size * sizeof(void*) to slot.
This saves space.
|
| |
|
|
|
|
|
|
|
| |
Currently a OPSLAB maintains a pointer to the lowest allocated OPSLOT
within the slab (slots are allocated downwards). Replace this pointer
with a U16 indicating how many pointer-sized words are free below the
lowest allocated slot.
|
|
|
|
|
|
|
|
| |
Currently this struct only has the opslab_size field on debugging
builds. Change it so that this field is always present. This will make
it easier to not need a fake partial OPSLOT at the end of the slab with
a NULL opslot_next field, which will in turn simplify converting
opslot_next into U16 size field shortly.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Each OPSLOT allocated within an OPSLAB contains a pointer, opslot_slab,
which points back to the first (head) slab of the slab chain (i.e. not
necessarily to the slab which the op is contained in).
This commit changes the pointer to be a 16-bit offset from the start of
the current slab, and adds a pointer at the start of each slab which
points back to the head slab.
The mapping from an op to the head slab is now a two-step process: use
the op's slot's opslot_offset field to find the start of the current
slab, then use that slab's new opslab_head pointer to find the head
slab.
The advantage of this is that it reduces the storage per op. (It
probably doesn't make any practical difference yet, due to alignment
issues, but that will will be sorted shortly in this branch.)
|
|
|
|
|
|
|
|
|
| |
This breaks hooking the filetest ops' check function by modules like
bareword::filehandles. Instead use the OP_IS_FILETEST() macro to decide
check for filetest ops. Also add an OP_IS_STAT() macro for when we want
to check for (l)stat as well as the filetest ops.
c.f. https://rt.cpan.org/Ticket/Display.html?id=127073
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This macro is defined as
(PL_op->op_flags & OPf_STACKED)
and indicates, for ops which support it, that the mutator-variant of the
op is present (e.g. $x += 1).
This macro was mainly used as an arg for the old-style overloading
macros (tryAMAGICbin()) which were eliminated several years ago.
This commit removes its vestigial usage, and instead tests OPf_STACKED
directly at each location, along with adding a comment about the
significance of the flag.
This removes one item of obfuscation from the overloading code.
There is one potentially functional change in this commit:
Perl_try_amagic_bin() was sometimes testing for OPf_STACKED without
first checking that it had been called with the AMGf_assign flag (which
indicates that this op supports a mutator variant). With this commit, it
now checks first, so this is theoretically a bug fix. In practice that
section of code was never reached without AMGf_assign always being set
anyway.
|