| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
The set of tricky fold characters needs to be expanded to include the ones
that fold to the same characters as the original set. This isn't because the
new ones have a length issue; it's that they get left out of comparisons
because of the special regnodes generated for the tricky ones.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ANYOFV handles multi-char folds in ANYOF nodes, and it turns
out it is a superset of what FOLDCHAR does, which never got fully
implemented in regexec.c, whereas ANYOFV is. FOLDCHAR may be the better
way to go in the long-term, as it takes less space and is faster, but
this gives us the functionality today, with no extra work.
FOLDCHAR had been generated only when the character in question was a
literal in the input stream; it was never generated for the probably more
common use of \N{} or \x. Earlier in the 5.13 series those were changed
from not doing anything special to using ANYOFV. It turns out that the
code that does this is in a part of the code that gets executed anyway,
so simply removing the special FOLDCHAR code causes execution to drop
down to it.
I'm thinking at the moment that for 5.16, ANYOFV should be removed in
favor of branches, using the technique of recursion that has recently
been added to \N{}. That would enable easier trie generation and
simplify things in regexec and the optimizer.
|
| |
|
|
|
|
|
|
| |
The tricky folds have only worked in one direction. This handles the
other: when it sees something the tricky fold folds to, it converts that
to the tricky fold op.
|
|
|
|
|
| |
This is in preparation for future commits. The declarations don't
depend on the two code lines.
|
|
|
|
| |
This failed under some circumstances.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This routine now calls reg() recursively after converting the parse
to something the rest of the code understands. This eliminates
duplicated code, and allows for uniform treatment of code points, as
things were getting out of sync. It also eliminates the restriction on
how many characters a named sequence can expand to.
toke now converts its input (which is in Unicode terms) to native on
EBCDIC platforms, so the rest of the code can continue to ignore
that.
The restriction on the number of characters a named sequence can
expand to is hereby removed, because reg() handles that.
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
| |
Also, change the skip added in 2feceb76bc07c897 to a todo skip.
|
| |
|
| |
|
| |
|
|
|
|
| |
Only the Acknowledgements section still needs filling in.
|
| |
|
| |
|
|
|
|
|
|
|
| |
Retain the call to XSLoader::load() at BEGIN time, as we want the constants
loaded before the compiler meets OPf_KIDS below, as the combination of having
the constant stay a Proxy Constant Subroutine and its value being inlined
saves a little over .5K.
|
|
|
|
|
|
|
| |
Typeglob aliasing saves just about 1.25K, because fewer internal structures are
created. In the general case the behaviour of the two differs, but as the
only package variables of these names are subroutines, and we are within our
own namespace, there is no difference here.
|
|
|
|
|
|
| |
local($_) will now strip all magic from $_, so that it is always safe
to localize $_, regardless of what kind of special (or tied) variable it
may have been aliased to.
|
| |
|
| |
|
|
|
|
|
|
|
| |
F(ull) case foldings are not handled all that well in Perl. It turns out
that a number of them have S foldings as well. In all cases, what
matches in S is supposed to also match in F, but Perl doesn't always
know that; this adds that information.
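
A minimal sketch of the S/F relationship described above, in Python rather than in the regex engine itself. The one-entry SIMPLE_FOLD table and the helper names are assumptions made for illustration (the entry is hand-copied from Unicode's CaseFolding.txt); Python's str.casefold() stands in for F(ull) folding.

```python
# Assumption: a one-entry stand-in for the S (simple) rows of CaseFolding.txt.
# U+1E9E LATIN CAPITAL LETTER SHARP S has S fold U+00DF and F fold "ss".
SIMPLE_FOLD = {"\u1E9E": "\u00DF"}


def simple_fold(ch):
    """Toy S fold: look the character up, else leave it unchanged."""
    return SIMPLE_FOLD.get(ch, ch)


def full_fold(s):
    """F fold, as implemented by Python's str.casefold()."""
    return s.casefold()


def matches(a, b, fold):
    """Two strings match caselessly if their folded forms are equal."""
    return fold(a) == fold(b)


capital_sharp_s = "\u1E9E"
small_sharp_s = "\u00DF"

# The pair matches under S, so it is supposed to match under F as well.
s_match = matches(capital_sharp_s, small_sharp_s, simple_fold)
f_match = matches(capital_sharp_s, small_sharp_s, full_fold)
print(s_match, f_match)
```

The toy table only knows about one character; a real implementation would need every code point that has both an S and an F entry.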
|
|
|
|
| |
Partial fix for RT #80626
|
|
|
|
|
|
| |
‘lvalue sub return values are now COW’ is not very clear.
I know 5.12.3 is already released, but at least for posterity’s
sake it’s nice to make this more descriptive.
|
| |
|
|
|
|
| |
These are CPAN tickets, not perl tickets.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[DELTA]
0.011 2011-03-19 20:48:39 America/New_York
[BUG FIXES]
- Made t/000_load.t less verbose under harness (RT#65507) [Dave Mitchell]
- Removed 'Errno' as an explicit prefix (it is a core module, but not
indexed by PAUSE, which might confuse some installers)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As indicated in the comments, this flag needs to be initialized to
1 or the optimizer loses the fact that something could match a
character that isn't in utf8 and whose bitmap bit isn't set. This
happens, for example, with Unicode properties.
Thus this fixes #77414. That ticket had been closed recently because
it went away due to another patch that caused the optimizer to be
bypassed in the cases tested for. But when that patch was reverted,
and cleaned-up, this bug came back. Now, I believe I have found the
root cause.
|
|
|
|
|
|
|
|
| |
For non-locale, \d, etc. are compiled in with the actual code points they
match, so the class portion of the synthetic start class node is
irrelevant, and should be initialized to zero to avoid confusion. But for
locale it is highly relevant, and should be initialized to all ones, to
indicate matching anything.
|
|
|
|
|
|
|
| |
When ORing two nodes together for the synthetic start class, and one
matches outside the 256-char bitmap, we currently don't know what it
matches. In some cases it could be some or all of those 256 characters.
If so, we have to assume it's all of them.
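
The rule above can be sketched as a toy model in Python. This is not the regcomp.c code: ToyClassNode, its unknown_outside flag, and or_nodes are hypothetical names, and the single flag is an assumption standing in for "matches something outside the bitmap that might also cover bitmap characters".

```python
from dataclasses import dataclass

ALL_256 = (1 << 256) - 1  # every bit of the 256-character bitmap set


@dataclass
class ToyClassNode:
    bitmap: int = 0                 # bit N set => character N can match
    unknown_outside: bool = False   # hypothetical flag, see lead-in


def bit(ch):
    """Bitmap bit for a single character below 256."""
    return 1 << ord(ch)


def or_nodes(a, b):
    """OR two nodes; widen to the full bitmap when the match set is unknown."""
    bitmap = a.bitmap | b.bitmap
    unknown = a.unknown_outside or b.unknown_outside
    if unknown:
        # We cannot tell which of the 256 characters are covered,
        # so assume all of them are.
        bitmap = ALL_256
    return ToyClassNode(bitmap, unknown)


digits = ToyClassNode(bitmap=sum(bit(c) for c in "0123456789"))
letter_a = ToyClassNode(bitmap=bit("a"))
some_property = ToyClassNode(unknown_outside=True)

plain = or_nodes(digits, letter_a)         # ordinary bitwise OR
widened = or_nodes(digits, some_property)  # pessimistic: everything matches
```

Widening loses precision, but for a start class a false "could match" only costs time, whereas a false "cannot match" would reject a valid match.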
|
|
|
|
|
| |
This is in prep for another commit which needs the flags to be
untouched for some tests.
|
|
|
|
|
| |
When my system was at 100%, the 2 seconds wasn't enough. I set it
to 10 seconds, which is the most common value used in other .t's.
|
|
|
|
|
| |
This macro sets all the bits of the class (for \w, etc.) for use during
initialization.
|
|
|
|
| |
It can't just be large enough to hold the Unicode subset.
|
|
|
|
|
|
| |
The comment said that there was no use doing this if lenp was NULL,
but there is, as it sees if there is a match or not and sets the
appropriate variable.
|
| |
|
|
|
|
|
|
|
|
|
| |
[DELTA]
Changes for 0.9103 Sun Mar 20 00:38:05 2011
================================================
* Fixed the logic not sending NA reports when
'perl' is expressed as a prereq
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This:
commit 0298d7b92741692bcf2e34c418a564332bb034e6:
Date: Tue May 31 10:40:01 2005 +0000
Avoid updating a variable in a loop.
Only calculate the number of links in a hash bucket chain if we really
need it.
p4raw-id: //depot/perl@24648
forgot to move a large comment to its new location; this new commit
fixes that.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I had about an hour of über confusion regarding smart matching in a
when, and when I finally clocked on to what the POD was telling me I
thought clarification would be in order. Many agreed :)
The chief change I would make is to use the word 'operands' instead of
'arguments' when referring to the ... and ..., ... && ... etc
sections; this was the major cause of my confusion. Second
clarification is that 'the test' in question is whether to use smart
matching, not the result of using smart matching!
Patch follows; please go ahead and amend as required :)
|
| |
|
|
|
|
|
|
|
|
|
| |
Perl_sighandler currently increments the savestack by 5
before running a signal handler, to avoid messing with a
partially completed SS push operation that's been interrupted.
This is irrelevant for safe signals, so make this action conditional on
unsafe signals only.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In Perl_sighandler, we currently increment PL_markstack_ptr and
PL_scopestack_ix.
This was added back in 1997 in the era of unsafe signals, to make them
slightly less unsafe. The idea presumably was to stop signal handlers
inadvertently corrupting the top element of each stack. However, given that
the normal method of pushing something onto those stacks is to increment
the pointer before pushing the value, I don't see how that can happen.
The downside of this is that an uninitialised or stale value can be left
in the 'hole' left on these stacks. When exiting from a signal handler via
exit(), these holes can be read and corruption occur, while stack
unwinding is taking place. The ordering of things means we can't use
SAVEDESTRUCTOR_X to undo the damage.
This commit leaves the 'PL_savestack_ix += 5', because in this case, with
unsafe signals, it *is* possible to interrupt halfway through a new set of
save data being pushed onto the stack, and it *is* possible for this to be
undone via SAVEDESTRUCTOR_X. (But it's still unsafe and half-baked.)
This fixes [perl #85206].
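
The 'hole' problem can be illustrated with a toy stack in Python. This is only a sketch of the mechanism, not the Perl_sighandler code: ToyStack, run_handler, and the skip_slot parameter are hypothetical, and "<stale>" stands in for whatever uninitialised or leftover value happens to be in memory.

```python
class ToyStack:
    """Toy pointer-indexed stack that pushes by pre-increment."""

    def __init__(self, size):
        self.slots = ["<stale>"] * size  # pretend memory holds leftovers
        self.top = -1                    # index of the current top element

    def push(self, value):
        self.top += 1                    # increment the pointer first ...
        self.slots[self.top] = value     # ... then store the value

    def unwind(self):
        """Pop everything, returning values in the order they are read."""
        seen = []
        while self.top >= 0:
            seen.append(self.slots[self.top])
            self.top -= 1
        return seen


def run_handler(stack, skip_slot):
    """Enter a pretend signal handler and push one frame."""
    if skip_slot:
        stack.top += 1  # old behaviour: bump the pointer, leaving a hole
    stack.push("handler frame")


old = ToyStack(8)
old.push("main frame")
run_handler(old, skip_slot=True)

new = ToyStack(8)
new.push("main frame")
run_handler(new, skip_slot=False)

# Unwinding from inside the handler (as exit() would) walks over the hole.
old_unwind = old.unwind()
new_unwind = new.unwind()
```

Because push() increments before storing, the interrupted code's top element is never overwritten in either case; the extra increment only adds a slot that unwinding later reads.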
|
|
|
|
|
| |
For threaded platforms, this reduces the object code size, and should slightly
reduce CPU usage.
|
|
|
|
|
| |
For threaded platforms, this reduces the object code size, and should slightly
reduce CPU usage.
|
| |
|