| Commit message | Author | Age | Files | Lines |
|
|
|
|
| |
Commit 4d68ffa0f7f345bc1ae6751744518ba4bc3859bd failed to get the
correct bug number in a comment
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are three pairs of characters that Perl recognizes as
metacharacters in regular expression patterns: {}, [], and (). These
can also be used to delimit patterns, as in:
    m{foo}
    s(foo)(bar)
Since they are metacharacters, they have special meaning to regular
expression patterns, and it turns out that you can't turn off that
special meaning by the normal means of preceding them with a backslash,
if you use them, paired, within a pattern delimited by them. For
example, in
    m{foo\{1,3\}}
the backslashes do not change the behavior, and this matches "f", "o"
followed by one to three more occurrences of "o".
Usages like this, where they are interpreted as metacharacters, are
exceedingly rare; we think there are none, for example, in all of CPAN.
Hence, this deprecation should affect very little code. It does give
notice, however, that any such code needs to change, which will in turn
allow us to change the behavior in future Perl versions so that the
backslashes do have an effect, and without fear that we are silently
breaking any existing code.
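
A minimal sketch, not taken from the commit, of two ways to match literal
braces that do not depend on how backslashes are treated inside brace
delimiters: use a non-bracketing delimiter, or put each brace in a
one-character bracketed class. The variable names are illustrative.

```perl
use strict;
use warnings;

my $text = 'foo{1,3}';

# With "/" as the delimiter, \{ and \} are ordinary escaped literals.
my $slash_ok = $text =~ m/foo\{1,3\}/ ? 1 : 0;

# Inside m{...}, a one-character class keeps each brace literal without
# relying on backslashes; the inner braces still pair up, so the
# delimiters stay balanced.
my $class_ok = $text =~ m{foo[{]1,3[}]} ? 1 : 0;

print "slash: $slash_ok, class: $class_ok\n";
```

Both forms match the literal string, so code written this way should be
unaffected by any future change to backslash handling.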
=head1 Performance Enhancements
|
|
|
|
|
|
|
|
|
| |
This reverts commit 2a53d3314d380af5ab5283758219417c6dfa36e9.
The commit was not reverted in its entirety, but the deprecation message
is gone. The deprecation caused too many problems. See thread
http://www.nntp.perl.org/group/perl.perl5.porters/2012/11/msg195425.html
(which lists previous threads).
|
|
|
|
|
|
|
|
|
|
| |
This recently added regex syntax imposes stricter rules on parsing than
normal. However, this did not include parsing \N{} constructs that
occur within it. This commit does that, making fatal the warnings that
come from \N{}.
I will add to perldiag the newly added messages along with the others
for (?[ ]) before 5.18 ships.
|
|
|
|
|
|
|
| |
Identify the OS version by capturing the first two parts of the M.m.p version
number.
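
As a rough illustration only (the commit itself may do this differently),
capturing the first two parts of an M.m.p version string could look like
the following. The variable names and the sample version are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical version string in M.m.p form.
my $osvers = '10.8.2';

# Capture only "major.minor"; the patch level is left unmatched.
my ($major_minor) = $osvers =~ /^(\d+\.\d+)/;

print "$major_minor\n";   # prints 10.8
```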
For RT #116262
|
|
|
|
|
|
|
|
| |
1. actually use the EISDIR string, rather than getting it and
not using it; this was a refactoring screw-up
2. don't hardcode the Win32 EACCES error, either, use the same
"$!" mechanism
|
|
|
|
|
|
|
| |
It was not enough to ensure the English value, as some platforms
use a different string entirely. Rather than goof around with
figuring them out, just get the known value by making an EISDIR
and stringifying it, then compare to that.
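
A minimal sketch of the idea, assuming the Errno module exports EISDIR on
the platform in question. The variable name is illustrative and not taken
from the test file.

```perl
use strict;
use warnings;
use Errno qw(EISDIR);

# Set errno to EISDIR and stringify it to obtain this platform's (and
# this locale's) message, rather than hardcoding "Is a directory".
my $eisdir_text = do { local $! = EISDIR; "$!" };

print "EISDIR reads as: $eisdir_text\n";
```

A test can then compare the error string it actually gets against
$eisdir_text instead of against a fixed English phrase.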
|
|
|
|
|
|
|
| |
Rhapsody was an Apple OS that later evolved into Darwin and Mac OS X. It was
initially only released to developers, but later became Mac OS X Server, with
releases in 1999 and 2000. It was obsoleted by Mac OS X 10.0, released in
March 2001.
|
|
|
|
|
|
| |
This is a time-honored tradition from such places as t/op. Tony
Cook alerted me to failures caused by this test on machines smoking
in non-English locales.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was discussed in thread
http://perl.markmail.org/thread/avtzvtpzemvg2ki2
but I never got around to this portion of the consensus, until now.
I did a cpan grep
http://grep.cpan.me/?q=%28^|[^\\]%29\\[0-7]{1%2C2}[8-9]&page=1
and eyeballing the results, saw three cases where this warning might
show up; one of which was for EBCDIC. The others looked to be false
positives, such as in .css files.
|
| |
|
|
|
|
|
| |
inline.h is a special header file that contains C functions, and hence
perhaps PERL_ARGS_ASSERTS.
|
|
|
|
|
|
|
| |
We've known that this is how Win32 behaves, as it was documented in
the ticket for which this is a fix. I don't think it's worth the
bother of ensuring we get EISDIR, as long as we don't just exit
silently!
|
|
|
|
| |
For: RT #61362
|
|
|
|
|
|
|
|
|
|
| |
This adds testing of (?[ ]), using the same tests, t/re/re_tests,
as are used by many of the regular expression .t files. Basically, it
converts the [bracketed] character classes in these tests to the new
syntax and verifies that they work there.
Some tests won't work in one or the other, so the capability to skip
a test depending on the .t file is added.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a fancier [bracketed] character class which allows set
operations, such as intersection and subtraction. The entry in perlre
for this commit details its operation.
Besides extending regular expressions to handle this functionality,
recommended by Unicode, the intent here is to do three things:
1) Intersection has been simulated by regexes using zero-width
look-around assertions, which are non-obvious. This allows replacing
those with a more powerful and clearer syntax; the compiled regexes
are smaller and faster. Everything is known at compile time.
2) Set operations have also been simulated by using user-defined Unicode
properties. These are globals, have security implications and
restricted names, and don't allow expressions as complex as this
new feature does.
3) I hope that this feature will come to be viewed as a "better"
bracketed character class. I took advantage of the fact that there
is no embedded base to have to be compatible with to forbid certain
iffy practices with the existing ones, while remaining mostly
backwards compatible. The main difference is that /x is always
enabled, so white space can be pretty much freely used with these,
but to specify a match on white space, it must be escaped. Things
that should have been illegal now are, such as \x{} and \x{abcdefghi}.
Things that look like a posix specifier but don't quite meet the
rules now give an error instead of silently compiling. e.g., [:digit]
is an error instead of the union of the characters that compose it.
I may have omitted things; perhaps it should be an error to have the
same letter occur twice, adjacent. Since this is experimental, we
can make such changes based on field feedback.
The intent is to keep this feature, since it is strongly recommended by
Unicode. The exact syntax is subject to change, so it is experimental.
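
A minimal sketch of the set-operation syntax next to the look-ahead idiom
it is meant to replace, assuming Perl 5.18 or later. The pattern and
variable names are illustrative and are not taken from the commit.

```perl
use strict;
use warnings;
# The construct was experimental when introduced; this silences that
# warning on perls where it still applies.
no warnings 'experimental::regex_sets';

# Lowercase ASCII consonants via set subtraction.
my $consonant = qr/^(?[ [a-z] - [aeiou] ])$/;

# The older look-ahead idiom that expresses the same set.
my $lookahead = qr/^(?![aeiou])[a-z]$/;

for my $ch (qw(b e z)) {
    my $new = $ch =~ $consonant ? 'yes' : 'no';
    my $old = $ch =~ $lookahead ? 'yes' : 'no';
    print "$ch: set=$new lookahead=$old\n";
}
```

Both patterns should agree on every input; the set form states the intent
directly instead of encoding it as an assertion followed by a class.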
|
|
|
|
|
| |
This is currently unused, but will have regclass() return an inversion
list instead of a node.
|
|
|
|
|
|
|
|
|
|
| |
This adds the capability, currently unused, of forbidding certain things
in [bracketed] character classes. Included are things that warn but
still compile, such as false ranges, [\d-\w], and unrecognized escapes.
Also forbidden are potentially ambiguous cases where \x (without braces)
isn't followed by exactly two hex digits, or \000 where the number of
octal digits isn't precisely three.
|
|
|
|
|
| |
This adds a parameter to regpposixcc() to enforce stricter rules on the
posix class syntax. It is currently unused.
|
|
|
|
|
|
| |
This mode croaks on any iffy constructs that currently compile. It is
not currently used; documentation of the error messages will be
delivered later.
|
|
|
|
|
|
|
|
|
| |
These functions advance the parse pointer for the caller. The regex
code has the infrastructure to output a marker as to where the error
was. This commit simply moves the parse pointer past all the legal
digits in the input, which are likely supposed to be part of the number,
which makes it likely that the missing right brace point is just past
those.
|
|
|
|
| |
This is easier to read.
|
|
|
|
|
| |
reg_mesg.t has some more infrastructure, so it is probably easier
to add new tests there.
|
|
|
|
| |
This is identical to the test two lines above.
|
|
|
|
|
|
|
| |
These lines all fail to compile, so matching doesn't happen, so it
doesn't matter at all what the target string to be matched against
is set to. It is misleading to put apparently meaningful stuff in that
string.
|
|
|
|
|
|
| |
This reorders some if elsif ... blocks so that skip is tested for and
done before actually trying the test. This only affected tests which
were supposed to generate compiler errors.
|
|
|
|
| |
XS::APItest isn't available under -Uusedl
|
|
|
|
|
| |
Also, add note() before tests 4 and 5 explaining rationale for addition of
parentheses to second arguments.
|
|
|
|
|
|
|
|
| |
If a description were to be added to these tests, in the absence of
parentheses the scalar prototype of CORE::not would enforce a scalar context
onto the balance of the statement, leading to apparently anomalous behavior,
viz., the descriptions would not be printed and test 5 would be reported to
FAIL.
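
The parsing difference can be seen outside the test file with a small
sketch. The variable names are illustrative, and the 'void' warning is
disabled only because the unparenthesized form discards its first operand.

```perl
use strict;
use warnings;
no warnings 'void';

my $value = 0;

# Without parentheses, "not" takes the rest of the list as a single
# expression: this is not($value, 'label'), which yields one element.
my @without = (not $value, 'label');

# With parentheses, "not" applies to $value only and 'label' survives
# as a second list element.
my @with = (not($value), 'label');

print scalar(@without), " vs ", scalar(@with), "\n";
```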
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
| |
Those which were still lacking descriptions were all testing that
exponentiation has precedence over negation.
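
For reference, the precedence being tested can be shown in one line
(an illustrative snippet, not one of the tests themselves):

```perl
use strict;
use warnings;

# ** binds tighter than unary minus, so this is -(2 ** 2).
my $result = -2 ** 2;

print "$result\n";   # prints -4
```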
|
| |
|
|
|
|
|
|
|
| |
A user-defined character name with trailing or multiple spaces in a row
is likely a typo, and hence likely won't match the other uses of
it. These names also won't work if we extend :loose to these. This
now generates a warning.
|
|
|
|
|
|
|
|
|
| |
The documentation says this is how it should behave, but only one of the
three paths in the code did it, and in fact there was a test to the
contrary.
I'm only adding a test for one of the two fixed paths, as the other one
appears to require a weird file name.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
[perl #115080]
m?...? is only supposed to match once, until reset. Normally this is done
by setting the PMf_USED flag on the PMOP. Under ithreads we can't modify
ops, so instead we indicate by setting the regex's SV to readonly. (This
is a bit of a hack: the flag should be associated with the PMOP, not the
regex).
This breaks with run-time regexes when the pattern gets recompiled; for
example:
    for my $c (qw(a b c)) {
        print "matched $c\n" if $c =~ m?^$c$?;
    }
outputs
    matched a
on unthreaded, but
    matched a
    matched b
    matched c
on threaded.
The re_eval jumbo fix made this more noticeable by sometimes recompiling
even when the pattern text hasn't changed (to make closures work ok).
The quick fix is to propagate the readonlyness of the old re to the new
re. (The proper fix would be to store the flag state in a pad slot
associated with the PMOP).
Needless to say, I've gone for the quick fix.
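
For readers unfamiliar with the match-once semantics, here is a minimal
sketch using a compile-time pattern, which is not affected by the bug
described above. The sub name and the word lists are illustrative.

```perl
use strict;
use warnings;

# Returns the items that match. The m?...? op matches at most once
# between calls to reset().
sub starts_with_a {
    return grep { m?^a? } @_;
}

my @first  = starts_with_a(qw(apple avocado));  # only the first hit counts
my @second = starts_with_a(qw(apricot));        # the op is already used up
reset;                                          # re-arm m?...? in this package
my @third  = starts_with_a(qw(apricot));

print scalar(@first), scalar(@second), scalar(@third), "\n";
```

The counts should come out the same on threaded and unthreaded builds.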
|
| |
|
| |
|
|
|
|
|
|
|
| |
Previous work had collapsed most of the cases of this switch. This
removes the entire switch, allowing more of the logic to be collapsed
into single code paths. Most of this commit is just moving things
around; the heavy lifting has been done in previous commits.
|
|
|
|
| |
The previous commit fixed these TODOs.
|
|
|
|
|
|
|
|
|
| |
Commit 3018b823898645e44b8c37c70ac5c6302b031381 added a regression for
the Posix classes [:upper:] and [:lower:] when matching
case-insensitively. If an above-Latin1 code point has been matched by
one of these classes at the time another regex is compiled which also
has the same class as the first one, and the second regex is /i, the
case-insensitivity is ignored.
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
The ANYOF_CLASS flag is used in ANYOF nodes (for [bracketed] and the
synthetic start class) only when matching something like \w, [:punct:]
etc., under /l (locale). It should not be set unless /l is specified.
However, it was always getting set for the synthetic start class. This
commit fixes that. The previous code was masking errors in which it was
being tested for unnecessarily, and for much of the 5.17 series, the
synthetic start class was always set to test for locale, which was a
waste of cpu when no locale was specified.
|
|
|
|
|
|
| |
* Support regcomp.c ckWARN and vWARN macros
* Update pod/perldiag.pod for fixes discovered with new checks
* Allow t/porting/diag.t to match printfs with flags more liberally
|
|
|
|
| |
I almost broke this, so adding a precautionary test.
|
|
|
|
|
|
|
|
|
| |
This reverts commit f6a6501216dee24e251d4482bd3a1f6daf4ac0da.
The fix seems wrong and is causing podcheck.t test failures, and (for my
system at least), reverting it removes those errors and doesn't create new
errors. Whatever was originally causing podcheck errors needs to be fixed,
rather than trying to mask it.
|