path: root/toke.c
Commit message | Author | Age | Files | Lines
* Extract regcurly as a static inline function. | Andy Dougherty | 2010-09-22 | 1 | -0/+1
    This patch extracts regcurly from regcomp.c and converts it to a static inline function in a new file dquote_static.c that is now #included by regcomp.c and toke.c. This change will require 'make regen'.
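A static inline function kept in a shared file lets both regcomp.c and toke.c use the same check without exporting a symbol. The following is a minimal sketch of such a helper, assuming it only has to recognise quantifiers of the form {n}, {n,} and {n,m}. The name `looks_like_quantifier` is hypothetical and this is not the code from dquote_static.c.

```c
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical stand-in for regcurly(); not the Perl source.
 * Returns true if s points at a quantifier: {n}, {n,} or {n,m}. */
static inline bool looks_like_quantifier(const char *s)
{
    if (*s++ != '{')
        return false;
    if (!isdigit((unsigned char)*s))
        return false;               /* at least one digit is required */
    while (isdigit((unsigned char)*s))
        s++;
    if (*s == ',') {                /* optional upper bound */
        s++;
        while (isdigit((unsigned char)*s))
            s++;
    }
    return *s == '}';
}
```

With this sketch, `looks_like_quantifier("{2,}")` is true, while a brace group holding a character name is rejected.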
* Shorten external symbol name for VMS | Florian Ragwitz | 2010-09-11 | 1 | -2/+2
    VMS seems to have a 31 character limitation for external symbols. To be able to fit into that, rename 'coerce_qwlist_to_paren_list' to 'munge_qwlist_to_paren_list'.
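The length arithmetic behind the rename can be checked with a small sketch. It assumes that the exported symbol carries a `Perl_` prefix, which the commit message does not state; the helper name `fits_vms_symbol_limit` is hypothetical.

```c
#include <string.h>

/* Limit quoted in the commit message above. */
#define VMS_MAX_EXTERNAL_SYMBOL 31

/* Returns 1 if the symbol name fits within the limit, 0 otherwise. */
static int fits_vms_symbol_limit(const char *name)
{
    return strlen(name) <= VMS_MAX_EXTERNAL_SYMBOL;
}
```

Under that assumption, `Perl_coerce_qwlist_to_paren_list` is 32 characters long and fails the check, while `Perl_munge_qwlist_to_paren_list` is exactly 31 and passes.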
* fix MAD breakage caused by qw patch | Zefram | 2010-09-11 | 1 | -0/+1
    ea25a9b2cf73948b1e8c5675de027e0ad13277bd broke MAD due to incorrect usage of the token-forcing mechanism.
* make qw(...) first-class syntax | Zefram | 2010-09-08 | 1 | -10/+24
    This makes a qw(...) list literal a distinct token type for the parser, where previously it was munged into a "(",THING,")" sequence. The change means that qw(...) can't accidentally supply parens to parts of the grammar that want real parens.

    Due to many bits of code taking advantage of that by "foreach my $x qw(...) {}", this patch also includes a hack to coerce qw(...) to the old-style parenthesised THING, emitting a deprecation warning along the way.
* function interface to parse Perl statement | Zefram | 2010-09-06 | 1 | -13/+59
    yyparse() becomes reentrant. The yacc stack and related resources are allocated in yyparse(), rather than in lex_start(), and they are localised to yyparse(), preserving their values from any outer parser. yyparse() now takes a parameter which determines which production it will parse at the top level.

    New API function parse_fullstmt() uses this facility to parse just a single statement. The top-level single-statement production that is used for this then messes with the parser's head so that the parsing stops without seeing EOF, and any lookahead token seen after the statement is pushed back to the lexer.
* Avoid needless use of UTF8=1 format [RT#56336] | Eric Brine | 2010-08-31 | 1 | -13/+6
    Some literals (e.g. q'abc') don't set the UTF8 flag for pure ASCII literals. Others (e.g. -abc) do. This should be consistent.
* Remove CALL_FPTR and CPERLscope. | Ben Morrow | 2010-08-20 | 1 | -1/+1
    These are left from PERL_OBJECT, which was an implementation of multiplicity using C++ objects. PERL_OBJECT was removed in 5.8, but the macros seem to have been cargo-culted all over the core (including in places where they would have been inappropriate originally). Since they now do exactly nothing, it's cleaner to remove them.

    I have left the definitions in perl.h, under #ifndef PERL_CORE, since some CPAN XS code uses them (also often incorrectly). I have also left STATIC alone, since it seems potentially more useful and is much more ingrained.

    The only appearance of these macros this patch doesn't touch is in Devel-PPPort, because that's a CPAN module.
* [perl #75904] \$ prototype does not make a unary function | Father Chrysostomos | 2010-08-11 | 1 | -2/+19
    This fixes this problem:

        $ perl -le' sub foo($) { print "foo" }; foo $_, exit'
        foo
        $ perl -le' sub foo(\$) { print "foo" }; foo $_, exit'
        Too many arguments for main::foo at -e line 1, at EOF
        Execution of -e aborted due to compilation errors.

    for all those prototypes:

        *
        \sigil
        \[...]
        ;$
        ;*
        ;\sigil
        ;\[...]
* Change function signature of grok_bslash_o | Karl Williamson | 2010-07-27 | 1 | -2/+3
    The previous return value where NULL meant OK is outside-the-norm.
* Correct comment in toke.c | Karl Williamson | 2010-07-27 | 1 | -1/+1
* Normalize formatting of "Ambiguous call resolved as CORE::%s(), qualify as such or use &" in toke.c, so t/porting/diag.t can find it. | James Mastros | 2010-07-26 | 1 | -2/+3
* Add \o{} escape | Karl Williamson | 2010-07-17 | 1 | -0/+14
    This commit adds the new construct \o{} to express a character constant by its octal ordinal value, along with ancillary tests and documentation.

    A function to handle this is added to util.c, and it is called from the 3 parsing places it could occur. The function is a candidate for in-lining, though I doubt that it will ever be used frequently.
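As an illustration of what handling \o{} involves, here is a minimal sketch of a parser for the text that follows the backslash. The function name `parse_bslash_o` and its return convention are hypothetical; this is not the function added to util.c, and it ignores overflow of very long digit strings.

```c
#include <stddef.h>

/* Hypothetical helper, not the util.c function.
 * Parses text of the form "o{17}" (the part after the backslash) and
 * stores the ordinal in *out. Returns the number of characters consumed,
 * or 0 if the input is malformed. */
static size_t parse_bslash_o(const char *s, unsigned long *out)
{
    const char *p = s;
    unsigned long value = 0;

    if (*p++ != 'o' || *p++ != '{')
        return 0;
    if (*p < '0' || *p > '7')
        return 0;                   /* need at least one octal digit */
    while (*p >= '0' && *p <= '7') {
        value = value * 8 + (unsigned long)(*p - '0');
        p++;
    }
    if (*p != '}')
        return 0;                   /* non-octal character or missing brace */
    *out = value;
    return (size_t)(p + 1 - s);
}
```

For the input `o{101}` the sketch consumes six characters and yields 65, the ordinal of "A".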
* Code for allowing uppercase X/B in hexadecimal/binary numbers (#76296). | Bo Lindbergh | 2010-07-06 | 1 | -2/+2
    Signed-off-by: David Golden <dagolden@cpan.org>
* In Perl_lex_start(), use newSVpvn_flags() to reduce source and object size. | Nicholas Clark | 2010-07-05 | 1 | -2/+2
* eval $overloaded can crash | David Mitchell | 2010-07-03 | 1 | -2/+3
    Perl_lex_start() assumes that the SV passed to it is a well-behaved string that it can do PVX() stuff to. If it's actually a ref to an overloaded object, it can crash and burn. Fixed by creating a stringified copy of the SV if necessary.
* In Perl_filter_del(), no need to NULL IoANY(datasv). | Nicholas Clark | 2010-06-30 | 1 | -2/+0
    Perl_sv_clear() understands the IOf_FAKE_DIRP flag, and when set won't treat IoANY() as a pointer to a directory handle that needs closing.
* Deprecate no space after s/a/b/ and keyword | Karl Williamson | 2010-06-28 | 1 | -1/+7
    Pattern replacements need to have the deprecation added; the prior patch on this ticket only changed m/a/keyword; this adds the s/a/b/keyword
* Add clarifying comment to toke.c | Karl Williamson | 2010-06-28 | 1 | -1/+2
* RT 75902: Add prototypes for tie() and untie() to allow overloading | Father Chrysostomos | 2010-06-25 | 1 | -3/+3
* Deprecate no space between pattern, following word | Karl Williamson | 2010-06-18 | 1 | -0/+6
    This patch raises a deprecated warning on constructs like

        $result = $a =~ m/$foo/sand $bar;

    which means

        $result = $a =~ m/$foo/s and $bar;
* Parameters for * in *printf must be int - add a cast to ensure this. | Nicholas Clark | 2010-06-10 | 1 | -1/+2
    Fixes a (legitimate) compiler warning present since 6e1bad6cc227c8e8.
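The rule in question comes from C itself: a `*` width or precision in a printf-family format consumes an `int` argument, so passing a `size_t` or pointer difference without a cast is a type mismatch. A minimal sketch, with a hypothetical helper name and assuming the length fits in an `int`:

```c
#include <stdio.h>

/* Writes at most len bytes of src into buf using "%.*s".
 * The precision consumed by '*' must be an int, so the size_t length
 * is cast explicitly (this sketch assumes len fits in an int). */
static int format_prefix(char *buf, size_t bufsize,
                         const char *src, size_t len)
{
    return snprintf(buf, bufsize, "%.*s", (int)len, src);
}
```

Calling `format_prefix(buf, sizeof buf, "tokenizer", 5)` writes the first five characters of the string into `buf`.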
* Eliminate some newSV(0)s by merging the SV allocation with first modification. | Nicholas Clark | 2010-05-30 | 1 | -10/+6
* Fix the regexp in t/porting/args_assert.t, and add 3 missing macros. | Nicholas Clark | 2010-05-29 | 1 | -0/+4
    Resolves RT #72800.
* Add s///r (non-destructive substitution). | David Caldwell | 2010-05-22 | 1 | -4/+5
    This changes s/// so that it doesn't act destructively on its target. Instead it returns the result of the substitution (or the original string if there was no match).

    In addition this patch:

      * Adds a new warning when s///r happens in void context.
      * Adds an error when you try to use s///r with !~
      * Makes it so constant strings can be bound to s///r with =~
      * Adds documentation.
      * Adds some tests.
      * Updates various debug code so it knows about the /r flag.
      * Adds some new 'r' words to B::Deparse.
* Re-instate the use of gv_stashpvn_flags(), and the correct non-boolean argument. | Nicholas Clark | 2010-05-21 | 1 | -4/+5
    This restores the change of 9bde8eb087a2c05d4c8b0394a59d28a09fe5f529.
* Remove the tokeniser hack that prevents compile-time vivification of %stash:: | Nicholas Clark | 2010-05-21 | 1 | -22/+4
    This was put in to ensure that defined %stash:: continued to return false after the implementation of hashes was changed, such that stashes were always defined.

    defined %stash:: is deprecated.

    This reverts the tokeniser changes of adc51b978ed1b2e9d4512c9bfa80386ac917d05a, 76138434928a968a390c791aec92e5f00017d01d, d6069db2e52f58ef65bf59f2fd453604270d2205 and part of 9bde8eb087a2c05d4c8b0394a59d28a09fe5f529, and updates the tests added with those commits to reflect the restored (but as yet unreleased) behaviour.

    I don't think that this should be merged to blead until after 5.12.0 ships, with the enabled deprecation warnings on defined %hash, as it changes subtle behaviour that all current released stable perls accept without warning.
* support "package Foo { ... }" | Zefram | 2010-05-20 | 1 | -2/+5
    Package block syntax limits the scope of the package declaration to the attached block. It's cleaner than requiring the declaration to come inside the block.
* Revert "New deprecation warning: Dot after %s literal is concatenation" | Jesse Vincent | 2010-05-05 | 1 | -9/+0
    This reverts commit 6fb472bab4fadd0ae2ca9624b74596afab4fb8cb.

    Zefram asked me to revert this as he's going to be doing something more pluggable
* Revert "Deprecation warnings should always be mandatory since 5.12.0" | Jesse Vincent | 2010-05-05 | 1 | -2/+2
    This reverts commit a7e260e62a5e47961e908363da32ef16f41301b2.

    Zefram asked me to revert this as he's going to be doing something more pluggable
* Revert "tweak "0x123.456" deprecation" | Jesse Vincent | 2010-05-05 | 1 | -12/+9
    This reverts commit 1183a10042af0734ee65e252f15bd820b7bbe686.

    Zefram asked me to revert this as he's going to be doing something more pluggable
* If we're going to introduce an @@ array, we'll want to be able to parse $#@ too | Rafael Garcia-Suarez | 2010-05-05 | 1 | -1/+1
* tweak "0x123.456" deprecation | Zefram | 2010-05-03 | 1 | -9/+12
    Some improvements to the deprecation added in commit 6fb472bab4fadd0ae2ca9624b74596afab4fb8cb:

      - warning message includes the word "deprecated"
      - warning is in "syntax" category as well as "deprecated"
      - more systematic tests
      - dot detected more efficiently by incorporation into existing switch
      - small doc rewording
      - avoid the warning in t/op/taint.t
* remove Perl_pmflag | Robin Barker | 2010-04-26 | 1 | -13/+0
* Deal with "\c{", and its kin | Karl Williamson | 2010-04-26 | 1 | -6/+1
    make regen is needed

    This patch forbids non-ascii following the "\c". It also terminates for "\c{" with a message to contact p5p if there is need for continuing its current definition. And if the character following the "\c" causes the result to not be a control character, a warning is issued. This is currently 'deprecated', which by default is turned on. This can easily be changed later.

    This patch is the initial patch. It does not do any fancy showing the context where the problematic construct occurs. This can be added later.

    It gathers the 3 occurrences of evaluating \c and puts them in one common routine.
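To see why "\c{" is a problem case, it helps to compute what "\c" yields for a given character. The sketch below assumes the usual ASCII rule (uppercase the character, then flip bit 6); the function names are hypothetical and this is not the common routine the patch introduces.

```c
#include <ctype.h>
#include <stdbool.h>

/* Maps the character after "\c" to its result under the assumed ASCII
 * rule: uppercase it, then toggle the 0x40 bit. */
static unsigned char ctrl_of(unsigned char c)
{
    return (unsigned char)(toupper(c) ^ 64);
}

/* True if the result is an ASCII control character (0..31 or DEL). */
static bool is_control_result(unsigned char c)
{
    unsigned char r = ctrl_of(c);
    return r < 32 || r == 127;
}
```

Under this rule "\cA" and "\ca" both give 1, "\c[" gives ESC, and "\c?" gives DEL, but "\c{" gives ";", which is not a control character at all.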
* PATCH: memory leak introduced in 5.12.0 | Karl Williamson | 2010-04-25 | 1 | -7/+4
    There is a small possibility of a memory leak in toke.c when there is a deprecated character in the name in a \N{...} construct, and the Perl is embedded or something like that so that memory isn't freed up when it exits. This patch avoids the creation of a new scalar, and gives a better error message besides.
* consting in lex_stuff_pvn | Robin Barker | 2010-04-23 | 1 | -4/+4
* Deprecation warnings should always be mandatory since 5.12.0 | Rafael Garcia-Suarez | 2010-04-23 | 1 | -2/+2
* New deprecation warning: Dot after %s literal is concatenation | James Mastros | 2010-04-23 | 1 | -0/+9
* use cBOOL for bool casts | David Mitchell | 2010-04-15 | 1 | -1/+1
    bool b = (bool)some_int doesn't necessarily do what you think. In some builds, bool is defined as char, and that cast's behaviour is thus undefined. So this line in mg.c:

        const bool was_temp = (bool)SvTEMP(sv);

    was actually setting was_temp to false even when the SVs_TEMP flag was set.

    Fix this by replacing all the (bool) casts with a new cBOOL() cast macro that (hopefully) does the right thing.
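The failure mode can be reproduced in isolation. The sketch below uses `unsigned char` as a stand-in for a char-sized bool so that the narrowing is well defined, and a hypothetical flag value above the low 8 bits; `TO_BOOL` illustrates the idea of a cBOOL-style macro and is not copied from the Perl headers.

```c
/* Stand-in for a build where bool is only as wide as a char. */
typedef unsigned char narrow_bool;

/* Hypothetical flag value that lives above the low 8 bits. */
#define FLAG_TEMP 0x00080000UL

/* A plain cast keeps only the low 8 bits, so the flag is lost. */
static narrow_bool cast_to_bool(unsigned long flags)
{
    return (narrow_bool)(flags & FLAG_TEMP);
}

/* Sketch of a cBOOL-style macro: normalise to 0 or 1 before narrowing. */
#define TO_BOOL(x) ((x) ? (narrow_bool)1 : (narrow_bool)0)

static narrow_bool test_to_bool(unsigned long flags)
{
    return TO_BOOL(flags & FLAG_TEMP);
}
```

With the flag set, `cast_to_bool` returns 0 because none of the low 8 bits are set, whereas `test_to_bool` returns 1.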
* [perl #74006] 5.12.0-RC stuffing bug | Zefram | 2010-04-14 | 1 | -0/+5
    There's a small bug in lex_stuff_pvn() that causes spurious syntax errors in an obscure situation. It happens if stuffing is performed on the last line of a file, and the line ends with a statement that lacks its terminating semicolon. Attached patch fixes and adds test.
* Revert "Revert "* Fixed typo in toke.c docs, identified by Zefram"" | Jesse Vincent | 2010-04-13 | 1 | -1/+1
    This reverts commit 06164d6c3ad67ed7ba18030ae378f46f482a29af.
* Revert "* Fixed typo in toke.c docs, identified by Zefram" | Jesse Vincent | 2010-04-12 | 1 | -1/+1
    The commit was good, but we're in freeze for 5.12.0. I'd be happy to see this hit blead again after 5.12.0 is tagged.

    This reverts commit 675ac12c19e6fe00eff6e604a7d637bf621997ef.
* * Fixed typo in toke.c docs, identified by Zefram | brian d foy | 2010-04-11 | 1 | -1/+1
* Revert "Forbid labels with keyword names" | Jan Dubois | 2010-03-02 | 1 | -2/+0
    This reverts commit f71d6157c7933c0d3df645f0411d97d7e2b66b2f.

    Revert "Add new error "Can't use keyword '%s' as a label""

    This reverts commit 28ccebc469d90664106fcc1cb73d7321c4b60716.
* PATCH: deprecation warnings for unreasonable charnames | Karl Williamson | 2010-02-20 | 1 | -1/+64
    Prior to now just about anything has been legal for a character name in \N{...}. This means that legal code was broken by having \N{3,4} for example mean [^\n]{3,4}. Such code doesn't come from standard charnames, but from legal custom translators.

    This patch deprecates "unreasonable" names. handy.h is changed by the addition of macros that taken together define the names we deem reasonable, namely alpha beginning with alphanumerics and some punctuations as continuations.

    toke.c is changed to parse each name and to raise a warning if any problematic characters are found.

    Some tests and diagnostic documentation are also included.
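The "reasonable name" rule can be sketched as a small validator. The commit message says only "some punctuations" are allowed as continuation characters; the set used below (space, hyphen, parentheses, colon) is an assumption for illustration, and the function name is hypothetical rather than one of the handy.h macros.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical validator: a name must begin with an alphabetic character
 * and continue with alphanumerics or a small punctuation set.
 * The punctuation set " -():" is an assumption, not taken from handy.h. */
static bool is_reasonable_charname(const char *name)
{
    const char *p = name;

    if (!isalpha((unsigned char)*p))
        return false;               /* also rejects the empty string */
    for (p++; *p != '\0'; p++) {
        if (!isalnum((unsigned char)*p) && strchr(" -():", *p) == NULL)
            return false;
    }
    return true;
}
```

A name such as "LATIN SMALL LETTER A" passes, while "3,4" fails on its first character, which is the kind of name the deprecation targets.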
* Add some missing dVAR's | Marcus Holland-Moritz | 2010-02-20 | 1 | -0/+2
    Commits c3acb9e0760135dfd888c0ee1b415777d784aabc, 867fa1e2da145229b4db2c6e8d5b51700c15f114 and f0e67a1d29102aa9905aecf2b0f98449697d5af3 added or changed functions that now require a dVAR declaration to compile with -DPERL_GLOBAL_STRUCT.
* Avoid returning an undefined SV* | Rafael Garcia-Suarez | 2010-02-19 | 1 | -1/+2
* Make a missing right brace on \N{ fatal | Karl Williamson | 2010-02-19 | 1 | -24/+9
    It was decided that this should be a fatal error instead of a warning.

    Also some comments were updated.
* PATCH: [perl #56444] delayed interpolation of \N{...} | Karl Williamson | 2010-02-19 | 1 | -85/+299
    make regen embed.fnc needs to be run on this patch.

    This patch fixes Bugs #56444 and #62056.

    Hopefully we have finally gotten this right.

    The parser used to handle all the escaped constants, expanding \x2e to its single byte equivalent. The problem is that for regexp patterns, this is a '.', which is a metacharacter and has special meaning that \x2e does not. So things were changed so that the parser didn't expand things in patterns. But this causes problems for \N{NAME}, when the pattern doesn't get evaluated until runtime, as for example when it has a scalar reference in it, like qr/$foo\N{NAME}/. We want the value for \N{NAME} that was in effect at the point during the parsing phase that this regex was encountered in, but we don't actually look at it until runtime, when these bug reports show that it is gone.

    The solution is for the tokenizer to parse \N{NAME}, but to compile it into an intermediate value that won't ever be considered a metacharacter. We have chosen to compile NAME to its equivalent code point value, and express it in the already existing \N{U+...} form. This indicates to the regex compiler that the original input was a named character and retains the value it had at that point in the parse.

    This means that \N{U+...} now always must imply Unicode semantics for the string or pattern it appeared in. Previously there was an inconsistency, where effectively \N{NAME} implied Unicode semantics, but \N{U+...} did not necessarily. So now, any string or pattern that has either of these forms is utf8 upgraded.

    A complication is that a charnames handler can return a sequence of multiple characters instead of just one. To deal with this case, the tokenizer will generate a constant of the form \N{U+c1.c2.c2...}, where c1 etc are the individual characters. Perhaps this will be made a public interface someday, but I decided to not expose it externally as far as possible for now in case we find reason to change it. It is possible to defeat this by passing it in a single quoted string to the regex compiler, so the documentation will be changed to discourage that.

    A further complication is that \N can have an additional meaning: to match a non-newline. This means that the two meanings have to be disambiguated. embed.fnc was changed to make public the function regcurly() in regcomp.c so that it could be referred to in toke.c to see if the ... in \N{...} is a legal quantifier like {2,}. This is used in the disambiguation.

    toke.c was changed to update some out-dated relevant comments. It now parses \N in patterns. If it determines that it isn't a named sequence, it passes it through unchanged. This happens when there is no brace after the \N, or no closing brace, or if the braces enclose a legal quantifier. Previously there has been essentially no restriction on what can come between the braces so that a custom translator can accept virtually anything. Now, legal quantifiers are assumed to mean that the \N is a "match non-newline that quantity of times".

    I removed the #ifdef'd out code that had been left in in case pack U reverted to earlier behavior. I did this because it complicated things, and because the change to pack U has been in long enough and shown that it is correct so it's not likely to be reverted.

    \N meaning a named character is handled differently depending on whether this is a pattern or not. In all cases, the output will be upgraded to utf8 because a named character implies Unicode semantics. If not a pattern, the \N is parsed into a utf8 string, as before. Otherwise it will be parsed into the intermediate \N{U+...} form. If the original was already a valid \N{U+...} constant, it is passed through unchanged.

    I now check that the sequence returned by the charnames handler is not malformed, which was lacking before.

    The code in regcomp.c which dealt with interfacing with the charnames handler has been removed. All the values should be determined by the time regcomp.c gets involved. The affected subroutine is necessarily restructured.

    An EXACT-type node is generated for the character sequence. Such a node has a capacity of 255 bytes, and so it is possible to overflow it. This wasn't checked for before, but now it is, and a warning issued and the overflowing characters are discarded.
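The intermediate form described in this commit message, a dot-separated list of code points inside \N{U+...}, can be illustrated with a short formatter. This is a sketch only: the function name is hypothetical, and writing the code points as uppercase hexadecimal without padding is an assumption, not something the message specifies.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical formatter: writes "\N{U+c1.c2...}" for the given code
 * points into buf. Returns the length written, or -1 if buf is too
 * small or n is 0. Uppercase unpadded hex is an assumption. */
static int format_named_seq(char *buf, size_t bufsize,
                            const unsigned long *cps, size_t n)
{
    size_t used = 0;
    size_t i;
    int w;

    if (n == 0)
        return -1;
    for (i = 0; i < n; i++) {
        /* First code point gets the opening text, later ones a dot. */
        w = snprintf(buf + used, bufsize - used, "%s%lX",
                     i == 0 ? "\\N{U+" : ".", cps[i]);
        if (w < 0 || (size_t)w >= bufsize - used)
            return -1;
        used += (size_t)w;
    }
    w = snprintf(buf + used, bufsize - used, "}");
    if (w < 0 || (size_t)w >= bufsize - used)
        return -1;
    return (int)(used + (size_t)w);
}
```

For the two code points 0x41 and 0x300 the sketch produces the 12-character text `\N{U+41.300}`, which contains nothing a regex compiler would mistake for a metacharacter outside the braces.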
* Allow arbitrary whitespace between NAME and VERSION in "package NAME VERSION;" statements | Jesse Vincent | 2010-02-03 | 1 | -0/+1
    Fixes [perl #72432]