| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
This patch extracts regcurly from regcomp.c and converts it
to a static inline function in a new file dquote_static.c
that is now #included by regcomp.c and toke.c. This change
will require 'make regen'.
|
|
|
|
|
|
| |
VMS seems to have a 31 character limitation for external symbols. To be able to
fit into that, rename 'coerce_qwlist_to_paren_list' to
'munge_qwlist_to_paren_list'.
|
|
|
|
|
| |
ea25a9b2cf73948b1e8c5675de027e0ad13277bd broke MAD due to incorrect
usage of the token-forcing mechanism.
|
|
|
|
|
|
|
|
|
|
| |
This makes a qw(...) list literal a distinct token type for the
parser, where previously it was munged into a "(",THING,")" sequence.
The change means that qw(...) can't accidentally supply parens to parts
of the grammar that want real parens. Due to many bits of code taking
advantage of that by "foreach my $x qw(...) {}", this patch also includes
a hack to coerce qw(...) to the old-style parenthesised THING, emitting
a deprecation warning along the way.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
yyparse() becomes reentrant. The yacc stack and related resources
are allocated in yyparse(), rather than in lex_start(), and they are
localised to yyparse(), preserving their values from any outer parser.
yyparse() now takes a parameter which determines which production it
will parse at the top level. New API function parse_fullstmt() uses this
facility to parse just a single statement. The top-level single-statement
production that is used for this then messes with the parser's head so
that the parsing stops without seeing EOF, and any lookahead token seen
after the statement is pushed back to the lexer.
|
|
|
|
|
| |
Some literals (e.g. q'abc') don't set the UTF8 flag for pure ASCII literals.
Others (e.g. -abc) do. This should be consistent.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These are left from PERL_OBJECT, which was an implementation of
multiplicity using C++ objects. PERL_OBJECT was removed in 5.8, but the
macros seem to have been cargo-culted all over the core (including in
places where they would have been inappropriate originally). Since they
now do exactly nothing, it's cleaner to remove them.
I have left the definitions in perl.h, under #ifndef PERL_CORE, since
some CPAN XS code uses them (also often incorrectly). I have also left
STATIC alone, since it seems potentially more useful and is much more
ingrained.
The only appearance of these macros this patch doesn't touch is in
Devel-PPPort, because that's a CPAN module.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes the following problem:
$ perl -le' sub foo($) { print "foo" }; foo $_, exit'
foo
$ perl -le' sub foo(\$) { print "foo" }; foo $_, exit'
Too many arguments for main::foo at -e line 1, at EOF
Execution of -e aborted due to compilation errors.
for all those prototypes:
*
\sigil
\[...]
;$
;*
;\sigil
;\[...]
|
|
|
|
| |
The previous return value where NULL meant OK is outside-the-norm.
|
| |
|
|
|
|
| |
such or use &" in toke.c, so t/porting/diag.t can find it.
|
|
|
|
|
|
|
|
|
|
| |
This commit adds the new construct \o{} to express a character constant
by its octal ordinal value, along with ancillary tests and
documentation.
A function to handle this is added to util.c, and it is called from the
3 parsing places it could occur. The function is a candidate for
in-lining, though I doubt that it will ever be used frequently.
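As a rough illustration of what handling \o{} involves, here is a minimal sketch in C. The name parse_octal_escape and its return convention are hypothetical and are not the function that was added to util.c; overflow checking and diagnostics are left out.

```c
#include <stddef.h>

/* Hypothetical helper, NOT the real util.c function.
 * Parses "\o{DIGITS}" starting at s. On success stores the ordinal in
 * *out and returns the number of bytes consumed; returns 0 if the
 * construct is malformed. Overflow is not checked in this sketch. */
size_t parse_octal_escape(const char *s, unsigned long *out)
{
    const char *p = s;
    unsigned long value = 0;

    if (p[0] != '\\' || p[1] != 'o' || p[2] != '{')
        return 0;
    p += 3;
    if (*p == '}')
        return 0;                       /* empty braces */
    while (*p >= '0' && *p <= '7') {
        value = (value << 3) | (unsigned long)(*p - '0');
        p++;
    }
    if (*p != '}')
        return 0;                       /* non-octal digit or missing brace */
    *out = value;
    return (size_t)(p - s) + 1;         /* include the closing brace */
}
```

Returning the consumed length lets each of the parsing sites advance its own cursor, which is one plausible reason to share a single routine.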
|
|
|
|
| |
Signed-off-by: David Golden <dagolden@cpan.org>
|
| |
|
|
|
|
|
|
|
| |
Perl_lex_start() assumes that the SV passed to it is a well-behaved
string that it can do PVX() stuff to. If it's actually a ref to an
overloaded object, it can crash and burn. Fixed by creating a stringified
copy of the SV if necessary.
|
|
|
|
|
| |
Perl_sv_clear() understands the IOf_FAKE_DIRP flag, and when set won't treat
IoANY() as a pointer to a directory handle that needs closing.
|
|
|
|
|
| |
Pattern replacements need to have the deprecation added; the prior patch
on this ticket only changed m/a/keyword; this adds s/a/b/keyword.
|
| |
|
| |
|
|
|
|
|
|
|
| |
This patch raises a deprecation warning on constructs like
$result = $a =~ m/$foo/sand $bar;
which means
$result = $a =~ m/$foo/s and $bar;
|
|
|
|
| |
Fixes a (legitimate) compiler warning present since 6e1bad6cc227c8e8.
|
| |
|
|
|
|
| |
Resolves RT #72800.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This changes s/// so that it doesn't act destructively on its target.
Instead it returns the result of the substitution (or the original string if
there was no match).
In addition this patch:
* Adds a new warning when s///r happens in void context.
* Adds an error when you try to use s///r with !~
* Makes it so constant strings can be bound to s///r with =~
* Adds documentation.
* Adds some tests.
* Updates various debug code so it knows about the /r flag.
* Adds some new 'r' words to B::Deparse.
|
|
|
|
| |
This restores the change of 9bde8eb087a2c05d4c8b0394a59d28a09fe5f529.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was put in to ensure that defined %stash:: continued to return false after
the implementation of hashes was changed, such that stashes were always defined.
defined %stash:: is deprecated.
This reverts the tokeniser changes of adc51b978ed1b2e9d4512c9bfa80386ac917d05a,
76138434928a968a390c791aec92e5f00017d01d,
d6069db2e52f58ef65bf59f2fd453604270d2205 and part of
9bde8eb087a2c05d4c8b0394a59d28a09fe5f529, and updates the tests added with those
commits to reflect the restored (but as yet unreleased) behaviour.
I don't think that this should be merged to blead until after 5.12.0 ships,
with the enabled deprecation warnings on defined %hash, as it changes subtle
behaviour that all current released stable perls accept without warning.
|
|
|
|
|
|
| |
Package block syntax limits the scope of the package declaration to the
attached block. It's cleaner than requiring the declaration to come
inside the block.
|
|
|
|
|
|
|
| |
This reverts commit 6fb472bab4fadd0ae2ca9624b74596afab4fb8cb.
Zefram asked me to revert this as he's going to be doing something more
pluggable
|
|
|
|
|
|
|
| |
This reverts commit a7e260e62a5e47961e908363da32ef16f41301b2.
Zefram asked me to revert this as he's going to be doing something more
pluggable
|
|
|
|
|
|
|
| |
This reverts commit 1183a10042af0734ee65e252f15bd820b7bbe686.
Zefram asked me to revert this as he's going to be doing something more
pluggable
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some improvements to the deprecation added in commit
6fb472bab4fadd0ae2ca9624b74596afab4fb8cb:
- warning message includes the word "deprecated"
- warning is in "syntax" category as well as "deprecated"
- more systematic tests
- dot detected more efficiently by incorporation into existing switch
- small doc rewording
- avoid the warning in t/op/taint.t
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
make regen is needed
This patch forbids non-ASCII characters following the "\c". It also
terminates for "\c{" with a message to contact p5p if there is a need to
keep its current definition. If the character following the "\c" causes
the result to not be a control character, a warning is issued. The
warning is currently in the 'deprecated' category, which is turned on by
default. This can easily be changed later.
This patch is the initial patch. It does not show the context in which
the problematic construct occurs. This can be added later.
It gathers the 3 occurrences of evaluating \c and puts them in one
common routine.
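A common routine of this kind could look roughly like the following sketch. The function name, the status codes, and the ASCII-only assumption are all hypothetical; the mapping used is the conventional one for ASCII (upper-case the character, then flip bit 6).

```c
/* Hypothetical status codes for this sketch. */
enum ctrl_status {
    CTRL_OK,                /* result is a control character */
    CTRL_WARN_NOT_CONTROL,  /* result computed, but it is not a control */
    CTRL_ERR_NON_ASCII,     /* character after \c is not ASCII */
    CTRL_ERR_BRACE          /* "\c{" is rejected outright */
};

/* Hypothetical helper: evaluates the character that follows "\c".
 * Assumes an ASCII platform. *result is only set when no error occurs. */
enum ctrl_status eval_backslash_c(unsigned char following,
                                  unsigned char *result)
{
    if (following > 127)
        return CTRL_ERR_NON_ASCII;
    if (following == '{')
        return CTRL_ERR_BRACE;
    if (following >= 'a' && following <= 'z')
        following = (unsigned char)(following - ('a' - 'A'));
    *result = (unsigned char)(following ^ 64);
    if (*result < 32 || *result == 127)
        return CTRL_OK;
    return CTRL_WARN_NOT_CONTROL;
}
```

Keeping the three outcomes (error, warning, success) in one return value means every caller reports them the same way.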
|
|
|
|
|
|
|
|
| |
There is a small possibility of a memory leak in toke.c when there is a
deprecated character in the name in a \N{...} construct, and the Perl is
embedded or something like that so that memory isn't freed up when it
exits. This patch avoids the creation of a new scalar, and gives a
better error message besides.
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
bool b = (bool)some_int
doesn't necessarily do what you think. In some builds, bool is defined as
char, and that cast's behaviour is thus undefined. So this line in mg.c:
const bool was_temp = (bool)SvTEMP(sv);
was actually setting was_temp to false even when the SVs_TEMP flag was set.
Fix this by replacing all the (bool) casts with a new cBOOL() cast macro
that (hopefully) does the right thing.
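The failure mode can be reproduced in isolation. The sketch below assumes a build where the boolean type is a plain char typedef; my_bool, MY_TEMP_FLAG and MY_CBOOL are hypothetical stand-ins, not the definitions from the perl headers, and unsigned char is used so that the narrowing conversion is well defined.

```c
/* Assumption: "bool" is only a typedef for a char type on this build. */
typedef unsigned char my_bool;

/* Hypothetical flag value; any bit above the low 8 shows the problem. */
#define MY_TEMP_FLAG 0x00080000u

/* Normalise any nonzero value to 1 before it is narrowed. */
#define MY_CBOOL(x) ((x) ? (my_bool)1 : (my_bool)0)

/* Plain cast: only the low 8 bits survive, so the flag is lost. */
my_bool flag_by_cast(unsigned int flags)
{
    return (my_bool)(flags & MY_TEMP_FLAG);
}

/* Macro version: tests for nonzero first, so the flag is preserved. */
my_bool flag_by_cbool(unsigned int flags)
{
    return MY_CBOOL(flags & MY_TEMP_FLAG);
}
```

With a genuine C99 _Bool the plain cast would already normalise to 0 or 1; the macro matters only where bool is an ordinary integer type.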
|
|
|
|
|
|
|
| |
There's a small bug in lex_stuff_pvn() that causes spurious syntax errors
in an obscure situation. It happens if stuffing is performed on the
last line of a file, and the line ends with a statement that lacks its
terminating semicolon. Attached patch fixes and adds test.
|
|
|
|
| |
This reverts commit 06164d6c3ad67ed7ba18030ae378f46f482a29af.
|
|
|
|
|
|
|
| |
The commit was good, but we're in freeze for 5.12.0. I'd be happy to
see this hit blead again after 5.12.0 is tagged.
This reverts commit 675ac12c19e6fe00eff6e604a7d637bf621997ef.
|
| |
|
|
|
|
|
|
|
|
| |
This reverts commit f71d6157c7933c0d3df645f0411d97d7e2b66b2f.
Revert "Add new error "Can't use keyword '%s' as a label""
This reverts commit 28ccebc469d90664106fcc1cb73d7321c4b60716.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Prior to now just about anything has been legal for a character name in
\N{...}. This means that legal code was broken by having \N{3,4} for
example mean [^\n]{3,4}. Such code doesn't come from standard
charnames, but from legal custom translators.
This patch deprecates "unreasonable" names. handy.h is changed by the
addition of macros that, taken together, define the names we deem
reasonable: those that begin with an alphabetic character and continue
with alphanumerics and some punctuation.
toke.c is changed to parse each name and to raise a warning if any
problematic characters are found.
Some tests and diagnostic documentation are also included.
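The shape of such a check is sketched below. The function name is hypothetical, and the exact set of continuation punctuation (space, hyphen, parentheses, colon, underscore) is an assumption made for illustration; the authoritative set is whatever the handy.h macros define.

```c
#include <ctype.h>

/* Hypothetical check, not the handy.h macros.
 * Returns 1 if name starts with an alphabetic character and continues
 * with alphanumerics or a small set of punctuation, otherwise 0.
 * The punctuation set here is an assumption for illustration only. */
int charname_is_reasonable(const char *name)
{
    const unsigned char *p = (const unsigned char *)name;

    if (!isalpha(*p))
        return 0;                       /* also rejects the empty name */
    for (p++; *p != '\0'; p++) {
        if (isalnum(*p))
            continue;
        if (*p == ' ' || *p == '-' || *p == '(' || *p == ')'
            || *p == ':' || *p == '_')
            continue;
        return 0;                       /* problematic character found */
    }
    return 1;
}
```

Under this rule a name such as "3,4" fails on its first character, which is what keeps it available as a quantifier.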
|
|
|
|
|
|
| |
Commits c3acb9e0760135dfd888c0ee1b415777d784aabc, 867fa1e2da145229b4db2c6e8d5b51700c15f114
and f0e67a1d29102aa9905aecf2b0f98449697d5af3 added or changed functions that now require a
dVAR declaration to compile with -DPERL_GLOBAL_STRUCT.
|
| |
|
|
|
|
|
|
| |
It was decided that this should be a fatal error instead of a warning.
Also some comments were updated.
|
| |
make regen embed.fnc
needs to be run on this patch.
This patch fixes Bugs #56444 and #62056.
Hopefully we have finally gotten this right. The parser used to handle
all the escaped constants, expanding \x2e to its single byte equivalent.
The problem is that for regexp patterns, this is a '.', which is a
metacharacter and has special meaning that \x2e does not. So things
were changed so that the parser didn't expand things in patterns. But
this causes problems for \N{NAME}, when the pattern doesn't get
evaluated until runtime, as for example when it has a scalar reference
in it, like qr/$foo\N{NAME}/. We want the value for \N{NAME} that was
in effect at the point during the parsing phase that this regex was
encountered in, but we don't actually look at it until runtime, when
these bug reports show that it is gone. The solution is for the
tokenizer to parse \N{NAME}, but to compile it into an intermediate
value that won't ever be considered a metacharacter. We have chosen to
compile NAME to its equivalent code point value, and express it in the
already existing \N{U+...} form. This indicates to the regex compiler
that the original input was a named character and retains the value it
had at that point in the parse.
This means that \N{U+...} now always must imply Unicode semantics for
the string or pattern it appeared in. Previously there was an
inconsistency, where effectively \N{NAME} implied Unicode semantics, but
\N{U+...} did not necessarily. So now, any string or pattern that has
either of these forms is utf8 upgraded.
A complication is that a charnames handler can return a sequence of
multiple characters instead of just one. To deal with this case, the
tokenizer will generate a constant of the form \N{U+c1.c2.c3...}, where
c1 etc are the individual characters. Perhaps this will be made a
public interface someday, but I decided to not expose it externally as
far as possible for now in case we find reason to change it. It is
possible to defeat this by passing it in a single quoted string to the
regex compiler, so the documentation will be changed to discourage that.
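To make the intermediate form concrete, the sketch below builds such a constant from a list of code points. The function name, the buffer-based interface, and the use of upper-case hex digits are assumptions for illustration; only the overall \N{U+c1.c2...} shape comes from the description above.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical formatter, not the tokenizer's code.
 * Writes "\N{U+c1.c2...}" for the n code points in cps into buf.
 * Returns the number of characters written (excluding the NUL), or -1
 * if buf is too small. Upper-case hex is an assumption of this sketch. */
int format_named_sequence(const unsigned long *cps, size_t n,
                          char *buf, size_t size)
{
    size_t used;
    size_t i;
    int w = snprintf(buf, size, "\\N{U+");

    if (w < 0 || (size_t)w >= size)
        return -1;
    used = (size_t)w;
    for (i = 0; i < n; i++) {
        /* A dot separates each code point from the previous one. */
        w = snprintf(buf + used, size - used, "%s%lX",
                     i ? "." : "", cps[i]);
        if (w < 0 || (size_t)w >= size - used)
            return -1;
        used += (size_t)w;
    }
    w = snprintf(buf + used, size - used, "}");
    if (w < 0 || (size_t)w >= size - used)
        return -1;
    used += (size_t)w;
    return (int)used;
}
```

For example, the two code points 0x41 and 0x301 would come out as \N{U+41.301}.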
A further complication is that \N can have an additional meaning: to
match a non-newline. This means that the two meanings have to be
disambiguated.
embed.fnc was changed to make public the function regcurly() in
regcomp.c so that it could be referred to in toke.c to see if the ... in
\N{...} is a legal quantifier like {2,}. This is used in the
disambiguation.
toke.c was changed to update some out-dated relevant comments.
It now parses \N in patterns. If it determines that it isn't a named
sequence, it passes it through unchanged. This happens when there is no
brace after the \N, or no closing brace, or if the braces enclose a
legal quantifier. Previously there has been essentially no restriction
on what can come between the braces so that a custom translator can
accept virtually anything. Now, legal quantifiers are assumed to mean
that the \N is a "match non-newline that quantity of times".
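The disambiguation rules above can be sketched as follows. Both function names are hypothetical; looks_like_quantifier only approximates what regcurly() checks (digits, an optional comma, optional further digits, all inside braces), and the classifier encodes the three pass-through cases listed in the text.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical approximation of the regcurly() test: returns 1 if s
 * starts with a brace-enclosed quantifier such as {3}, {2,} or {3,4}. */
int looks_like_quantifier(const char *s)
{
    if (*s++ != '{')
        return 0;
    if (!isdigit((unsigned char)*s))
        return 0;
    while (isdigit((unsigned char)*s))
        s++;
    if (*s == ',') {
        s++;
        while (isdigit((unsigned char)*s))
            s++;
    }
    return *s == '}';
}

enum n_kind { N_NON_NEWLINE, N_NAMED };

/* Hypothetical classifier: "after" points just past the "\N".
 * No brace, no closing brace, or a legal quantifier in the braces all
 * mean "match a non-newline"; anything else is a named character. */
enum n_kind classify_backslash_N(const char *after)
{
    if (*after != '{')
        return N_NON_NEWLINE;
    if (strchr(after, '}') == NULL)
        return N_NON_NEWLINE;
    if (looks_like_quantifier(after))
        return N_NON_NEWLINE;
    return N_NAMED;
}
```

The order of the checks mirrors the prose: the cheap structural tests come first, and the quantifier test decides the remaining ambiguous case.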
I removed the #ifdef'd out code that had been left in, in case pack U
reverted to earlier behavior. I did this because it complicated things,
and because the change to pack U has been in long enough and shown that
it is correct so it's not likely to be reverted.
\N meaning a named character is handled differently depending on whether
this is a pattern or not. In all cases, the output will be upgraded to
utf8 because a named character implies Unicode semantics. If not a
pattern, the \N is parsed into a utf8 string, as before. Otherwise it
will be parsed into the intermediate \N{U+...} form. If the original
was already a valid \N{U+...} constant, it is passed through unchanged.
I now check that the sequence returned by the charnames handler is not
malformed, which was lacking before.
The code in regcomp.c which dealt with interfacing with the charnames
handler has been removed. All the values should be determined by the
time regcomp.c gets involved. The affected subroutine is necessarily
restructured.
An EXACT-type node is generated for the character sequence. Such a node
has a capacity of 255 bytes, and so it is possible to overflow it. This
wasn't checked for before, but now it is: a warning is issued and the
overflowing characters are discarded.
|
|
|
|
|
|
| |
VERSION;" statements
Fixes [perl #72432]
|