| |
Adds syntax `defer { BLOCK }` to create a deferred block: code whose
execution is postponed until the enclosing scope exits. This syntax is guarded by
use feature 'defer';
Adds a new opcode, `OP_PUSHDEFER`, which is a LOGOP whose `op_other` field
gives the start of an optree to be deferred until scope exit. That op
pointer will be stored on the save stack and invoked as part of scope
unwind.
Included is support for `B::Deparse` to deparse the optree back into
syntax.
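
As a rough illustration of the save-stack idea described above, here is a minimal standalone sketch in C. It is not the perl implementation: `save_stack`, `push_defer`, `scope_enter` and `scope_leave` are hypothetical names, and a plain callback stands in for the deferred optree. It only shows entries being recorded while a scope runs and invoked, most recent first, when the scope unwinds.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for "start of a deferred optree": a callback. */
typedef void (*deferred_fn)(void *arg);

typedef struct { deferred_fn fn; void *arg; } save_entry;

#define SAVE_STACK_MAX 16
typedef struct { save_entry entries[SAVE_STACK_MAX]; size_t top; } save_stack;

/* Entering a scope: remember how tall the save stack was. */
static size_t scope_enter(const save_stack *ss) { return ss->top; }

/* Analogue of the new op: record the deferred code, do not run it yet. */
static void push_defer(save_stack *ss, deferred_fn fn, void *arg)
{
    assert(ss->top < SAVE_STACK_MAX);
    ss->entries[ss->top].fn = fn;
    ss->entries[ss->top].arg = arg;
    ss->top++;
}

/* Scope unwind: pop back to the mark, invoking entries last-in first-out. */
static void scope_leave(save_stack *ss, size_t mark)
{
    while (ss->top > mark) {
        save_entry e = ss->entries[--ss->top];
        e.fn(e.arg);
    }
}

/* Small trace buffer so the order of execution can be observed. */
typedef struct { char buf[8]; size_t len; } trace;
typedef struct { trace *t; char c; } note;

static void record(void *arg)
{
    note *n = (note *)arg;
    n->t->buf[n->t->len++] = n->c;
}

/* One scope with two deferred blocks and an ordinary statement. */
static void demo_scope(trace *t)
{
    save_stack ss = { .top = 0 };
    size_t mark = scope_enter(&ss);
    note a = { t, 'A' }, b = { t, 'B' };

    push_defer(&ss, record, &a);   /* like: defer { A } */
    push_defer(&ss, record, &b);   /* like: defer { B } */
    t->buf[t->len++] = 'x';        /* the scope's normal body runs first */

    scope_leave(&ss, mark);        /* deferred code runs here: B, then A */
}
```

In this sketch the ordinary statement runs first and the deferred entries run afterwards in reverse order of registration, which is the behaviour one would expect from a stack that is unwound at scope exit.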
|
|
|
|
|
| |
This commit allows this function to be called with NULL parameters when
the result of these is not needed.
|
|
|
|
|
|
|
|
|
| |
This adds a new function for changing the case of an input code point.
The difference between this and the existing function is that the new
one returns an array of UVs instead of a combination of the first code
point and UTF-8 of the whole thing, a somewhat awkward API that made
more sense when we used swashes. That function is retained for now, at
least, but most of the work is done in the new function.
|
| |
|
| |
|
|
|
|
|
|
| |
Instead of destroying the input by first swapping the bytes, this calls
a base function with the order to use. The non-reverse function is
changed to call the base function with the non-reversed order.
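
The pattern described above can be sketched in C as follows. The commit text here does not name the functions involved, so everything in the sketch is an assumption: it supposes 16-bit units stored as byte pairs, and `units_from_bytes`, `units_in_order` and `units_reversed` are hypothetical names. The point is only that the input is read in the requested order rather than being byte-swapped in place first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical base routine: the caller says which byte of each pair is the
 * high-order one (high) and which is the low-order one (low). The input
 * buffer is const and is never modified. */
static size_t units_from_bytes(const unsigned char *in, size_t nbytes,
                               size_t high, size_t low, uint16_t *out)
{
    size_t n = 0;

    assert(nbytes % 2 == 0);
    assert((high == 0 && low == 1) || (high == 1 && low == 0));

    for (size_t i = 0; i < nbytes; i += 2)
        out[n++] = (uint16_t)((in[i + high] << 8) | in[i + low]);
    return n;
}

/* Non-reversed wrapper: high byte comes first in each pair. */
static size_t units_in_order(const unsigned char *in, size_t nbytes,
                             uint16_t *out)
{
    return units_from_bytes(in, nbytes, 0, 1, out);
}

/* Reversed wrapper: same base routine, opposite byte order. */
static size_t units_reversed(const unsigned char *in, size_t nbytes,
                             uint16_t *out)
{
    return units_from_bytes(in, nbytes, 1, 0, out);
}
```

Both wrappers are one-liners, so the decoding logic lives in a single place and the caller's buffer is left untouched.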
|
|
|
|
|
|
|
|
|
| |
Now that the DFA is used by the only callers to this to eliminate the
need to check for e.g., wrong continuation bytes, this function can be
refactored to use a switch statement, which makes it clearer, shorter,
and faster.
The name is changed to indicate its private nature.
|
|
|
|
| |
This makes it use the fast DFA for this functionality.
|
|
|
|
|
|
|
|
| |
The DFA macro for determining if a sequence is valid UTF-8 was
deliberately made general enough to accommodate this use-case, in which
only a partial character is acceptable. Change the code to use the DFA.
The helper function's name is changed to indicate it is private.
|
|
|
|
| |
The new name is more mnemonic
|
|
|
|
|
|
|
| |
The previous commit for EBCDIC paved the way for moving some checks for
a code point being for Perl extended UTF-8 out of places where they
cannot succeed. The resultant simplifications more than compensate for
the two extra case statements added by this commit.
|
|
|
|
| |
This will make more sense of the next commit
|
|
|
|
|
|
|
|
|
| |
This specialized functionality is used to check the validity of Perl's
extended-length UTF-8, which has some idiosyncratic characteristics compared with
the shorter sequences. This means this function doesn't have to
consider those differences. It will be used in the next commit to avoid
some work, and to eventually enable is_utf8_char_helper() to be
simplified.
|
|
|
|
|
|
|
|
| |
One of these functions is now only called from the other, and there is
significant overlap in their logic.
This commit refactors them into one resulting function, which is half
the code, and more straightforward.
|
|
|
|
|
|
|
|
|
| |
I've always been uncomfortable with the input constraints this function
had. Now that it has been refactored into using a switch(), new cases
for full generality can be added without affecting performance, and
some conditionals removed before calling it.
The function is renamed to reflect its greater generality.
|
|
|
|
|
| |
This changes only portions of the capitalization, and the new version is
more in keeping with other function names.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The existing code to determine the position of the most significant 1
bit in a word is extracted from variant_byte_number(), and generalized
to use the deBruijn method previously added that works on any bit in the
word, rather than the existing method which looks just at the msb of
each byte. The code is moved to a new function in preparation for being
called from other places.
A U32 version is created, and on 64 bit platforms, a second, parallel,
version taking a U64 argument is also created. This is because future
commits may care about the word size differences.
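
A minimal sketch of the most-significant-bit technique, assuming the usual approach of first reducing the word to its lone highest set bit and then using a de Bruijn multiply and table lookup. The names `msbit_pos32_sketch` and `msb_table32` are hypothetical, the widely published 32-bit constant 0x077CB531 is used, and the lookup table is built at run time rather than hard-coded; none of this is taken from the actual perl source.

```c
#include <assert.h>
#include <stdint.h>

#define DEBRUIJN32 UINT32_C(0x077CB531)   /* widely published constant */

static unsigned char msb_table32[32];
static int msb_table32_ready = 0;

/* Build the lookup table: each single-bit word maps to a unique 5-bit index. */
static void msb_table32_init(void)
{
    for (unsigned i = 0; i < 32; i++) {
        uint32_t lone = UINT32_C(1) << i;
        msb_table32[(uint32_t)(lone * DEBRUIJN32) >> 27] = (unsigned char)i;
    }
    msb_table32_ready = 1;
}

/* Position (0 = least significant) of the highest set bit.
 * A word with no set bits is not a valid input. */
static unsigned msbit_pos32_sketch(uint32_t word)
{
    assert(word != 0);
    if (!msb_table32_ready)
        msb_table32_init();

    /* Smear the highest set bit into every lower position. */
    word |= word >> 1;
    word |= word >> 2;
    word |= word >> 4;
    word |= word >> 8;
    word |= word >> 16;

    /* Keep only the highest bit, then look its position up. */
    word -= word >> 1;
    return msb_table32[(uint32_t)(word * DEBRUIJN32) >> 27];
}
```

Because the word is reduced to a single set bit before the lookup, the same table serves for any bit position, not just the top bit of each byte.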
|
|
|
|
|
|
|
|
|
|
| |
The existing code to determine the position of the least significant 1
bit in a word is extracted from variant_byte_number() and moved to a new
function in preparation for being called from other places.
A U32 version is created, and on 64 bit platforms, a second, parallel,
version taking a U64 argument is also created. This is because future
commits may care about the word size differences.
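
A minimal sketch of the least-significant-bit counterpart, under the same assumptions: the names are hypothetical, the widely published 32-bit de Bruijn constant 0x077CB531 is used, and the table is generated at run time. The 64-bit variant here simply splits the word into halves; the real 64-bit code may instead use a 64-bit de Bruijn sequence.

```c
#include <assert.h>
#include <stdint.h>

#define DEBRUIJN32 UINT32_C(0x077CB531)   /* widely published constant */

static unsigned char lsb_table32[32];
static int lsb_table32_ready = 0;

/* Build the lookup table: each single-bit word maps to a unique 5-bit index. */
static void lsb_table32_init(void)
{
    for (unsigned i = 0; i < 32; i++) {
        uint32_t lone = UINT32_C(1) << i;
        lsb_table32[(uint32_t)(lone * DEBRUIJN32) >> 27] = (unsigned char)i;
    }
    lsb_table32_ready = 1;
}

/* Position (0 = least significant) of the lowest set bit.
 * A word with no set bits is not a valid input. */
static unsigned lsbit_pos32_sketch(uint32_t word)
{
    assert(word != 0);
    if (!lsb_table32_ready)
        lsb_table32_init();

    /* word & -word isolates the lowest set bit. */
    uint32_t lone = word & (uint32_t)(~word + 1);
    return lsb_table32[(uint32_t)(lone * DEBRUIJN32) >> 27];
}

/* Parallel 64-bit version, here implemented by examining each half. */
static unsigned lsbit_pos64_sketch(uint64_t word)
{
    uint32_t low = (uint32_t)word;

    assert(word != 0);
    return low ? lsbit_pos32_sketch(low)
               : 32 + lsbit_pos32_sketch((uint32_t)(word >> 32));
}
```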
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This will prove useful in future commits on platforms that have 64 bit
capability.
The deBruijn sequence used here, taken from the internet, differs
from the 32 bit one in how it treats a word with no set bits. But passing
such a word is considered undefined behavior, so that difference is immaterial.
Apparently figuring this out uses brute force methods, and so I decided
to live with this difference, rather than to expend the time needed to
bring them into sync.
|
|
|
|
|
|
| |
This moves the code from regcomp.c to inline.h that calculates the
position of the lone set bit in a U32. This is in preparation for use
by other call sites.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since the removal of PERL_OBJECT
(acfe0abcedaf592fb4b9cb69ce3468308ae99d91) PERL_IMPLICIT_CONTEXT and
MULTIPLICITY have been synonymous and they're being used interchangeably.
To simplify the code, this commit replaces all instances of
PERL_IMPLICIT_CONTEXT with MULTIPLICITY.
PERL_IMPLICIT_CONTEXT will stay defined for compatibility with XS
modules.
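
One plausible shape for the compatibility definition mentioned above is sketched below. This is an assumption about how such a shim could look, not a quotation from the perl headers; `legacy_name_visible` is a hypothetical helper that stands in for older XS code still testing the old name.

```c
/* Stand-in for a build configured with multiple interpreters. */
#define MULTIPLICITY

/* Assumed compatibility shim: core code tests only MULTIPLICITY, but the
 * legacy name stays defined for external code that still checks it. */
#if defined(MULTIPLICITY) && !defined(PERL_IMPLICIT_CONTEXT)
#  define PERL_IMPLICIT_CONTEXT
#endif

/* Hypothetical older XS-style code that tests the legacy macro. */
static int legacy_name_visible(void)
{
#ifdef PERL_IMPLICIT_CONTEXT
    return 1;
#else
    return 0;
#endif
}
```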
|
| |
|
|
|
|
|
| |
S_regclass() is unwieldy. This commit splits it into two nearly equal
size parts. More could be done.
|
| |
|
|
|
|
| |
In particular, if the length is beyond the end, it should not be stored as the end.
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The try change added code to pp_return to skip past try contexts
when looking for the sub/sort/eval context to return from.
This was only needed because cx_pusheval() sets si_cxsubix to the
current frame, and try uses that function to push its context. That
value is then used by the dopopto_cursub() macro to shortcut
walking the context stack.
Since we don't need to treat try as a sub for return, list vs array
checks or lvalue sub checks, don't set si_cxsubix on try.
|
| |
|
|
|
|
|
| |
This used to be called from utf8.c, but no longer; no need to make it
other than static. This allows the compiler to better optimize.
|
|
|
|
|
| |
This was the consensus in
http://nntp.perl.org/group/perl.perl5.porters/258489
|
|
|
|
|
|
|
|
| |
This change has been planned for a long time, bringing Perl into parity
with similar languages, but it took many deprecation cycles to be able
to reach the point where it could safely go in.
This fixes GH #18264
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit copies portions of new_regcurly(), which has been around
since 5.28, into plain regcurly(), as a baby step in preparation for
converting entirely to the new one. These functions are used for
parsing {m,n} quantifiers. Future commits will add capabilities not
available using the old version.
The commit adds an optional parameter, to return to the caller
information it gleans during parsing.
regpiece() is changed by this commit to use this information, instead of
itself reparsing the input. Part of the reason for this commit is that
changes are planned soon to what is legal syntax. With this commit in
place, those changes only have to be done once.
This commit also extracts into a function the calculation of the
quantifier bounds. This allows the logic for that to be done in one
place instead of two.
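
The shape of this change can be sketched in standalone C. The code below is not regcurly() itself: `curly_info`, `looks_like_curly` and `curly_bounds` are hypothetical names, and the sketch assumes only the forms `{m}`, `{m,}` and `{m,n}` are legal. It shows a recognizer with an optional out-parameter (NULL when the caller does not need the details) and a single helper that turns the recorded positions into bounds.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define QUANT_UNBOUNDED (-1L)   /* hypothetical marker for "no upper limit" */

/* What the recognizer learned while scanning, for callers that want it. */
typedef struct {
    const char *min_start;   /* first digit of the minimum */
    const char *max_start;   /* first digit of the maximum, or NULL */
    bool        has_comma;   /* true for {m,} and {m,n} */
    const char *close;       /* the closing '}' */
} curly_info;

/* Returns true if s starts with a quantifier; info may be NULL. */
static bool looks_like_curly(const char *s, curly_info *info)
{
    const char *p = s;
    const char *min_start, *max_start = NULL;
    bool has_comma = false;

    if (*p++ != '{')
        return false;

    min_start = p;
    while (isdigit((unsigned char)*p))
        p++;
    if (p == min_start)          /* this sketch requires a minimum */
        return false;

    if (*p == ',') {
        has_comma = true;
        p++;
        if (isdigit((unsigned char)*p))
            max_start = p;
        while (isdigit((unsigned char)*p))
            p++;
    }
    if (*p != '}')
        return false;

    if (info) {                  /* only fill in what the caller asked for */
        info->min_start = min_start;
        info->max_start = max_start;
        info->has_comma = has_comma;
        info->close = p;
    }
    return true;
}

/* The bounds calculation, kept in one place for every caller. */
static void curly_bounds(const curly_info *info, long *min, long *max)
{
    *min = strtol(info->min_start, NULL, 10);
    if (info->max_start)
        *max = strtol(info->max_start, NULL, 10);
    else
        *max = info->has_comma ? QUANT_UNBOUNDED : *min;
}
```

A caller that only needs a yes/no answer passes NULL, while a caller like the quantifier parser passes a `curly_info` and then calls `curly_bounds` instead of rescanning the text.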
|
|
|
|
| |
This disables use of bareword filehandles except for the built-in handles
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
| |
The purpose of this function is to raise a parse warning, which is not
something code outside core should be doing.
|
| |
|
|
|
|
|
| |
These appear to be helper functions for various API functions; there are
no uses of them on CPAN
|
|
|
|
|
| |
There are documented macros that one is supposed to use instead for this
functionality.
|
|
|
|
|
| |
This is a helper function used by such things as SSGROW; there is no
CPAN usage
|