path: root/toke.c
Commit message | Author | Age | Files | Lines
* Don't allow non-graphemes as pattern delimiters | Karl Williamson | 2018-06-25 | 1 | -23/+13
|   This has been deprecated, and scheduled for removal in 5.30.
* toke.c: Move some code into called function | Karl Williamson | 2018-06-25 | 1 | -10/+2
|   It makes more sense for this code to be in the function called, rather than separated out.
* PATCH: [perl #133074] 5.26.1: some coverity fixes | Marc-Philip | 2018-04-08 | 1 | -4/+4
|   We have some Coverity code scans here. They found this uninitialized variable in pp.c and the integer overrun in toke.c. Although these may be false positives (no reasonable control path gets there), it's good to quiet the scanner here so that the real problems are easier to see.
* Use charnames inversion lists | Karl Williamson | 2018-03-31 | 1 | -16/+11
|   This commit makes the inversion lists for parsing character names global instead of interpreter level, so they can be initialized once per process, and no copies are created upon new thread instantiation. More importantly, this is another instance where utf8_heavy.pl no longer needs to be loaded, and the definition files read from disk.
* fix line numbers in multi-line s/// | David Mitchell | 2018-03-07 | 1 | -1/+1
|   my commit v5.25.6-230-g6432a58, "Eliminate SVrepl_EVAL and SvEVALED()", introduced a regression: __LINE__ no longer took account of multiple lines in the s///. Now fixed.
|
|   Spotted by Abigail.
* detect sub attributes following a signature | David Mitchell | 2018-03-02 | 1 | -10/+33
|   RT #132760
|
|   A recent commit (v5.27.7-212-g894f226) moved subroutine attributes back before the subroutine's signature: e.g.
|
|       sub foo :prototype($$) ($a, $b) { ... } # 5.18 and 5.28 +
|       sub foo ($a, $b) :prototype($$) { ... } # 5.20 .. 5.26
|
|   This change means that any code still using an attribute following the signature is going to trigger a syntax error. However, the error, followed by error recovery and further warnings and errors, is very unfriendly and gives no indication of the root cause. This commit introduces a new error, "Subroutine attributes must come before the signature".
|
|   For example, List::Lazy, the subject of the ticket, failed to compile tests, with output like:
|
|       Array found where operator expected at blib/lib/List/Lazy.pm line 43, near "$$@)"
|           (Missing operator before @)?)
|       "my" variable $step masks earlier declaration in same statement at blib/lib/List/Lazy.pm line 44.
|       syntax error at blib/lib/List/Lazy.pm line 36, near ") :"
|       Global symbol "$generator" requires explicit package name (did you forget to declare "my $generator"?) at blib/lib/List/Lazy.pm line 38.
|       Global symbol "$state" requires explicit package name (did you forget to declare "my $state"?) at blib/lib/List/Lazy.pm line 39.
|       Global symbol "$min" requires explicit package name (did you forget to declare "my $min"?) at blib/lib/List/Lazy.pm line 43.
|       Global symbol "$max" requires explicit package name (did you forget to declare "my $max"?) at blib/lib/List/Lazy.pm line 43.
|       Global symbol "$step" requires explicit package name (did you forget to declare "my $step"?) at blib/lib/List/Lazy.pm line 43.
|       Invalid separator character '{' in attribute list at blib/lib/List/Lazy.pm line 44, near "$step : sub "
|       Global symbol "$step" requires explicit package name (did you forget to declare "my $step"?) at blib/lib/List/Lazy.pm line 44.
|
|   But following this commit, it now just outputs:
|
|       Subroutine attributes must come before the signature at blib/lib/List/Lazy.pm line 36.
|       Compilation failed in require at t/append.t line 5.
|       BEGIN failed--compilation aborted at t/append.t line 5.
|
|   It works by:
|
|   1) adding a boolean flag (sig_seen) to the parser state to indicate that a signature has been parsed;
|
|   2) at the end of parsing a signature, PL_expect is set to XATTRBLOCK rather than XBLOCK.
|
|   Then if something looking like one or more attributes is encountered by the lexer immediately afterwards, it scans it as if it were an attribute, but then if sig_seen is true, it croaks.
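The two steps above can be sketched in a few lines of standalone C. This is only an illustration of the idea, not the code in toke.c: the `sk_` types and functions are hypothetical stand-ins for the parser state and PL_expect, and the sketch reports the error as soon as it sees the ':' instead of first scanning the attribute list as the real lexer is described as doing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the lexer's expectation states. */
typedef enum { SK_XBLOCK, SK_XATTRBLOCK } sk_expect;

typedef struct {
    bool      sig_seen;  /* step 1: set once a signature has been parsed */
    sk_expect expect;    /* what the lexer may legally see next */
} sk_parser;

/* Step 2: at the end of a signature, attributes "might follow" as far as
 * the lexer is concerned, so that they can be recognised and rejected. */
static void sk_end_signature(sk_parser *p)
{
    assert(p != NULL);
    p->sig_seen = true;
    p->expect   = SK_XATTRBLOCK;
}

/* Returns NULL if the next character is acceptable, else an error message. */
static const char *sk_check_next(const sk_parser *p, char next)
{
    assert(p != NULL);
    if (p->expect == SK_XATTRBLOCK && next == ':' && p->sig_seen)
        return "Subroutine attributes must come before the signature";
    return NULL;
}
```

Before a signature has been seen, a ':' in the XATTRBLOCK state is accepted as the start of an attribute list; after `sk_end_signature()` the same character yields the single, targeted error.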
* subtly change meaning of XATTRBLOCK, XATTRTERM | David Mitchell | 2018-03-02 | 1 | -8/+5
|   Currently they tell the toker that the next thing will be attributes, followed by an XBLOCK or XTERMBLOCK respectively. This commit subtly changes their meanings so that they indicate that attributes legally *might* follow. This makes the code which initially sets them slightly simpler (no need to check whether the next char is ':'), and the code elsewhere in yylex() which handles XATTR* only triggers if the next char is ':' anyway.
|
|   Doing it this way will shortly make it simpler to detect an attribute illegally following a signature.
* parse subs and signature subs separately | David Mitchell | 2018-03-02 | 1 | -3/+12
|   Currently the toker returns a SUB or ANONSUB token at the beginning of a sub (or BEGIN etc). Change it so that in the scope of 'use feature "signatures"', it returns a SIGSUB / ANON_SIGSUB token instead. Then in perly.y, duplicate the 2 rules containing SUB / ANONSUB to cope with these two new tokens.
|
|   The net effect of this is to use different rules in the parser for parsing subs when signatures are in scope. Since the two sets of rules have just been cut and pasted, there should be no functional changes yet, but that will change shortly.
* add Perl_init_named_cv() function | David Mitchell | 2018-03-02 | 1 | -0/+33
|   This moves a block of code out from perly.y into its own function, because it will shortly be needed in more than one place. Should be no functional changes.
* (perl #125351) abort parsing if parse errors happen in a sub lex | Tony Cook | 2018-02-06 | 1 | -0/+18
|   We've had a few reports of segmentation faults and other misbehaviour when sub-parsing, such as within interpolated expressions, fails.
|
|   This change aborts compilation if anything complex enough to not be parsed by the lexer is compiled in a sub-parse *and* an error occurs within the sub-parse.
|
|   An earlier version of this patch failed on simpler expressions, which caused many test failures, which this version doesn't (which may just mean we need more tests...)
* toke.c: Remove unnecessary macro calls | Karl Williamson | 2018-01-30 | 1 | -2/+0
|   These macros were to shift the LC_NUMERIC state into using a dot for the radix character. When I wrote this code, I assumed that parsing should be using just the dot. Since then, I have discovered that this wraps other uses where the dot is not correct, so remove it.
* Allow space for NUL in UTF-8 array decls | Karl Williamson | 2018-01-22 | 1 | -1/+1
|   In grepping the source, I noticed that several arrays that are for holding UTF-8 characters did not allow space for a trailing NUL. This commit adds that.
* Revert "Revert "make PerlIO handle FD_CLOEXEC"" | Zefram | 2018-01-18 | 1 | -5/+0
|   This reverts commit 523d71b314dc75bd212794cc8392eab8267ea744, reinstating commit 2cdf406af42834c46ef407517daab0734f7066fc. Reversion is not the way to address the porting problem that motivated that reversion.
* Revert "make PerlIO handle FD_CLOEXEC" | Abigail | 2018-01-18 | 1 | -0/+5
|   This reverts commit 2cdf406af42834c46ef407517daab0734f7066fc.
|
|   The reason for the revert is that with this commit, perl fails to compile on darwin (or at least, on some versions of it):
|
|       ./miniperl -Ilib make_ext.pl lib/auto/DB_File/DB_File.bundle MAKE="/Applications/Xcode.app/Contents/Developer/usr/bin/make" LIBPERL_A=libperl.a LINKTYPE=dynamic
|       Parsing config.in...
|       Looks Good.
|       dyld: lazy symbol binding failed: Symbol not found: _mkostemp
|         Referenced from: /private/tmp/perl/cpan/DB_File/../../miniperl
|         Expected in: flat namespace
|       dyld: Symbol not found: _mkostemp
|         Referenced from: /private/tmp/perl/cpan/DB_File/../../miniperl
|         Expected in: flat namespace
|       Unsuccessful Makefile.PL(cpan/DB_File): code=5 at make_ext.pl line 518.
|       make: *** [lib/auto/DB_File/DB_File.bundle] Error 2
* revert smartmatch to 5.27.6 behaviour | Zefram | 2017-12-29 | 1 | -7/+12
|   The pumpking has determined that the CPAN breakage caused by changing smartmatch [perl #132594] is too great for the smartmatch changes to stay in for 5.28.
|
|   This reverts most of the merge in commit da4e040f42421764ef069371d77c008e6b801f45. All core behaviour and documentation is reverted. The removal of use of smartmatch from a couple of tests (that aren't testing smartmatch) remains. Customisation of a couple of CPAN modules to make them portable across smartmatch types remains. A small bugfix in scope.c also remains.
* make PerlIO handle FD_CLOEXEC | Zefram | 2017-12-22 | 1 | -5/+0
|   Move handling of close-on-exec flag for PerlIO handles into PerlIO itself. Where PerlIO opens new file descriptors, have them opened in O_CLOEXEC mode where possible.
* factor out remaining fcntl F_SETFD calls | Zefram | 2017-12-22 | 1 | -5/+2
|
* merge branch zefram/dumb_match | Zefram | 2017-12-17 | 1 | -12/+7
|\
| * add "whereis" | Zefram | 2017-12-06 | 1 | -2/+4
| |   "whereis" is like "whereso" except that it performs an implicit smartmatch.
| * change "when" keyword to "whereso" | Zefram | 2017-12-05 | 1 | -4/+4
| |
| * remove useless "break" mechanism | Zefram | 2017-11-29 | 1 | -3/+0
| |
| * remove useless "default" mechanism | Zefram | 2017-11-28 | 1 | -4/+0
| |
* | semicolon-friendly diagnostic control | Zefram | 2017-12-16 | 1 | -6/+6
| |   New macros {GCC,CLANG}_DIAG_{IGNORE,RESTORE}_{DECL,STMT}, which take a following semicolon. It is necessary to use the _DECL or _STMT version as appropriate to the context. Fixes [perl #130726].
* | assert min identifier length in S_pending_ident() | Zefram | 2017-12-08 | 1 | -0/+1
| |
* | fix oct/bin fp fractions in non-HEXFP_UQUAD builds | Zefram | 2017-12-06 | 1 | -1/+1
| |   The code for binaryish floating point literals, on builds where we're not confident of being able to fit a significand into an integer type, had built-in knowledge that the radix is 16, after the radix point. This gave erroneous values for octal and binary literals on those builds. This was shown up by the tests added in commit 58be57636a42d6c6fd404c48c4e1cb87870182df. Correct it to use the actual radix.
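To see why a hard-coded radix of 16 gives wrong fractions for octal and binary literals, here is a minimal sketch of accumulating the digits after the radix point. It is not the scan_num() code; `fr_digit` and `fr_fraction` are hypothetical names, and the approach (a running scale divided by the radix) is just one straightforward way to do it.

```c
#include <assert.h>

/* Value of one digit character, or -1 if it is not a hex digit. */
static int fr_digit(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Sum the digits that follow the radix point, scaling each one by the
 * actual radix (2, 8 or 16) rather than assuming 16. Stops at the first
 * character that is not a valid digit for that radix. */
static double fr_fraction(const char *digits, int radix)
{
    double value = 0.0;
    double scale;

    assert(radix == 2 || radix == 8 || radix == 16);
    scale = 1.0 / radix;

    for (; *digits != '\0'; digits++) {
        int d = fr_digit(*digits);
        if (d < 0 || d >= radix)
            break;
        value += d * scale;
        scale /= radix;
    }
    return value;
}
```

With the actual radix, the fraction digits "1" in binary, "4" in octal and "8" in hexadecimal all come out as 0.5; scaling the binary "1" by 1/16 instead would give 0.0625.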
* | limit digits based on radix for oct/bin fp | Tony Cook | 2017-12-06 | 1 | -2/+4
| |   All hexadecimal digits were being permitted in octal and binary floating point literals. (That octal and binary literals are permitted at all might be an accidental result of permitting hexadecimal?) Restrict which digits are permitted, in accordance with the radix.
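The restriction described here amounts to checking each digit's value against the radix. A small sketch, with hypothetical helper names (`rd_digit_value`, `rd_digit_ok`) that do not come from toke.c:

```c
#include <assert.h>
#include <stdbool.h>

/* Numeric value of a digit character, or -1 if it is not a hex digit. */
static int rd_digit_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* A digit is acceptable only if its value is below the radix, so '8' is
 * rejected in octal and '2' is rejected in binary. */
static bool rd_digit_ok(char c, int radix)
{
    int v = rd_digit_value(c);

    assert(radix >= 2 && radix <= 16);
    return v >= 0 && v < radix;
}
```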
* | avoid negative shift in scan_num() | Zefram | 2017-12-06 | 1 | -1/+1
| |   Lengthy binaryish floating point literals used to perform illegal bit shifts. Ignore digits that are past the end of the significand at an earlier stage to avoid this. Code fix by Tony C. Fixes [perl #131894].
* | assert legality of bitshifts in scan_num() | Zefram | 2017-12-06 | 1 | -0/+4
| |   [perl #131894] found some negative-exponent shifting going on here. Make the illegality more accessible by asserting.
* | Make Bad name error less unhelpful | Father Chrysostomos | 2017-12-04 | 1 | -8/+18
| |   This was brought up in perl #132485. Because ‘Bad name after...’ is a croak, it suppresses the more helpful hints like ‘Might be a runaway multi-line string’, in such cases as:
| |
| |       use Moose;
| |
| |       has erdef => (
| |         isa => 'Int',
| |         is => 'ro,
| |         default => sub { 1 }
| |       );
| |
| |       has cxxc => (
| |         isa => 'Int',
| |         is => 'ro',
| |         default => sub { 1 }
| |       );
| |
| |   We can allay this infelicity by emitting the ‘Missing operator before bareword’ before the Bad name croak, so in the example above we end up with:
| |
| |       Bareword found where operator expected at - line 10, near "isa => 'Int"
| |         (Might be a runaway multi-line '' string starting on line 5)
| |           (Do you need to predeclare isa?)
| |       Bad name after Int' at - line 10.
| |
| |   rather than just:
| |
| |       Bad name after Int' at - line 10.
* | Initialize the variable. | Jarkko Hietaniemi | 2017-11-28 | 1 | -2/+2
| |
* | toke.c: Don’t leak memory | Father Chrysostomos | 2017-11-26 | 1 | -0/+1
| |
* | [perl #132485] Warn about "$foo'bar" | Father Chrysostomos | 2017-11-26 | 1 | -5/+31
| |
* | toke.c: Comment typo | Father Chrysostomos | 2017-11-26 | 1 | -1/+1
| |
* | toke.c: Convert to use is_utf8_non_invariant_string | Karl Williamson | 2017-11-26 | 1 | -4/+3
| |
* | toke.c lex_stuff_pvn(): Use fcn, not handrolled code | Karl Williamson | 2017-11-23 | 1 | -7/+1
|/
|   Use the inline function that accomplishes the same thing as this hand-rolled code. The inline function should generate the same thing on ASCII platforms, but be faster on EBCDIC ones.
* localise $@ around source filters | Zefram | 2017-11-13 | 1 | -1/+6
|   $@ could be clobbered by source filters, screwing up the reporting of errors in the filtered source. Prevent this by localising $@ around each call to a source filter. Fixes [perl #38920].
* avoid redundant initialisation around Newxz() | Zefram | 2017-11-13 | 1 | -2/+4
|   Reduce Newxz() to Newx() where all relevant parts of the memory are being explicitly initialised, and don't explicitly zero memory that was already zeroed. [perl #36078]
* prevent invalid memory access in S_check_uni (RT #132433) | Lukas Mai | 2017-11-12 | 1 | -1/+1
|
* add wrap_keyword_plugin function (RT #132413) | Lukas Mai | 2017-11-11 | 1 | -0/+73
|
* parse yada-yada only as a statement | Zefram | 2017-11-10 | 1 | -1/+1
|   Commit f5727a1c71878a34f6255eb1a506c0b21af7d36f tried to make yada-yada be parsed consistently as a term expression, but actually things are more complicated than that. The tokeniser didn't accept yada-yada in the right contexts to make it usable as an expression, and changing that would require decisions on resolving ambiguities between yada-yada and flip-flop. It's also documented as being a statement rather than an expression, though with some incorrect information about ambiguities. Overall it looks more like the intent was for yada-yada to be a statement. This commit makes it grammatically treated as such, and also fixes up the dubious parts of the documentation. [perl #132150]
* toke.c: Add comment | Karl Williamson | 2017-11-08 | 1 | -0/+2
|
* Dest buffer needs to be bigger for utf16_to_utf8() | Karl Williamson | 2017-11-08 | 1 | -1/+4
|   These undocumented functions require the destination buffer to have the worst case size. However that size (previously listed as 3/2 * input) is wrong for EBCDIC. Correct the comments, and the single use of these in core.
|
|   These functions do not have a way to avoid overflowing, which strikes me as wrong.
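For context, the 3/2 figure comes from the ASCII-platform case, where a 2-byte UTF-16 code unit can expand to at most 3 bytes of UTF-8. The sketch below is a standalone big-endian UTF-16 to UTF-8 converter for ASCII platforms only; it is not perl's utf16_to_utf8(), and unlike the functions discussed above it takes a destination length so that it can refuse to overflow. All names in it are hypothetical, and it says nothing about the correct bound on EBCDIC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Worst-case UTF-8 size for `bytelen` bytes of UTF-16 on an ASCII
 * platform: each 2-byte code unit may become 3 bytes. */
#define U16_WORST_CASE(bytelen) ((bytelen) / 2 * 3)

#define U16_FAIL ((size_t)-1)

/* Convert big-endian UTF-16 to UTF-8. Returns the number of bytes written,
 * or U16_FAIL if the input is malformed or dst is too small. */
static size_t u16be_to_u8(const uint8_t *src, size_t srclen,
                          uint8_t *dst, size_t dstlen)
{
    size_t i = 0, o = 0;

    assert(src != NULL && dst != NULL);
    if (srclen % 2 != 0)
        return U16_FAIL;

    while (i < srclen) {
        uint32_t cp = ((uint32_t)src[i] << 8) | src[i + 1];
        size_t need;
        i += 2;

        if (cp >= 0xD800 && cp <= 0xDBFF) {           /* high surrogate */
            uint32_t lo;
            if (i >= srclen)
                return U16_FAIL;
            lo = ((uint32_t)src[i] << 8) | src[i + 1];
            if (lo < 0xDC00 || lo > 0xDFFF)
                return U16_FAIL;
            i += 2;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {    /* stray low surrogate */
            return U16_FAIL;
        }

        need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (dstlen - o < need)
            return U16_FAIL;                          /* would overflow dst */

        if (need == 1) {
            dst[o++] = (uint8_t)cp;
        } else if (need == 2) {
            dst[o++] = (uint8_t)(0xC0 | (cp >> 6));
            dst[o++] = (uint8_t)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            dst[o++] = (uint8_t)(0xE0 | (cp >> 12));
            dst[o++] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
            dst[o++] = (uint8_t)(0x80 | (cp & 0x3F));
        } else {
            dst[o++] = (uint8_t)(0xF0 | (cp >> 18));
            dst[o++] = (uint8_t)(0x80 | ((cp >> 12) & 0x3F));
            dst[o++] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
            dst[o++] = (uint8_t)(0x80 | (cp & 0x3F));
        }
    }
    return o;
}
```

A single code unit such as U+20AC hits the worst case exactly (2 bytes in, 3 bytes out), while a surrogate pair stays at 4 bytes in and 4 bytes out.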
* restore error message for unterminated strings | Lukas Mai | 2017-11-08 | 1 | -7/+13
|   The previous strchr/memchr changes inadvertently broke the error message for perl -e '"'. Instead of
|
|       Can't find string terminator '"' anywhere before EOF
|
|   it became
|
|       Can't find string terminator """ anywhere before EOF
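The two messages differ only in which quote character surrounds the terminator. A plausible sketch of that choice, assuming the rule is "use single quotes when the terminator itself contains a double quote" (the `tm_format` helper is hypothetical and is not the code in toke.c):

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Format the "unterminated string" message, picking a quote character
 * that does not clash with the terminator being reported. */
static void tm_format(char *out, size_t outlen, const char *term)
{
    char q;

    assert(out != NULL && outlen > 0 && term != NULL);
    q = strchr(term, '"') != NULL ? '\'' : '"';
    snprintf(out, outlen,
             "Can't find string terminator %c%s%c anywhere before EOF",
             q, term, q);
}
```

If the search for the double quote wrongly reports "not found", the terminator `"` ends up wrapped in double quotes, which is the `"""` form shown above.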
* toke.c: Fix wrong use of memrchr | Karl Williamson | 2017-11-07 | 1 | -1/+1
|   This was a replacement of strchr(), so should not have used the find-right-most memrchr. This was spotted by Christian Hansen. I don't know what the implications are, but thought I should get a fix in immediately.
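The difference matters whenever the byte occurs more than once: strchr() and memchr() return the first occurrence, while a memrchr-style search returns the last. Because memrchr() is not part of ISO C, the sketch below uses a small hand-written rightmost search (`mr_rightmost`, a hypothetical name) to show the contrast.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return a pointer to the LAST occurrence of byte c in the len bytes at s,
 * or NULL if it does not occur. */
static const char *mr_rightmost(const char *s, int c, size_t len)
{
    const char *p;

    assert(s != NULL);
    p = s + len;
    while (p > s) {
        --p;
        if ((unsigned char)*p == (unsigned char)c)
            return p;
    }
    return NULL;
}
```

For the five bytes `a:b:c`, memchr() finds the ':' at offset 1 and the rightmost search finds the one at offset 3, so swapping one for the other silently changes which delimiter is used.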
* toke.c: use my_memrchr helper for portability [round 2] | Nicolas R | 2017-11-06 | 1 | -2/+2
|   compilation broken on darwin using clang
* toke.c: Convert some strchr to memchr | Karl Williamson | 2017-11-06 | 1 | -14/+20
|   This allows things to work properly in the face of embedded NULs. See the branch merge message for more information.
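A short illustration of why the conversion matters: strchr() stops at the first NUL, so it cannot see anything after an embedded NUL, whereas memchr() searches an explicit number of bytes. The `en_find` wrapper is a hypothetical helper used only for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Find byte c anywhere in the len bytes starting at s, embedded NULs
 * included. */
static const char *en_find(const char *s, size_t len, char c)
{
    assert(s != NULL);
    return (const char *)memchr(s, (unsigned char)c, len);
}
```

Given the six data bytes `a b \0 c : d`, strchr() reports that there is no ':' at all, while the length-based search finds it at offset 4.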
* toke.c use memBEGINs with prev commit | Karl Williamson | 2017-11-06 | 1 | -2/+4
|
* toke.c: Add limit parameter to 3 static functions | Karl Williamson | 2017-11-06 | 1 | -26/+31
|   This will make it possible to handle embedded NULs in the next commits.
* dquote.c: Use memchr() instead of strchr() | Karl Williamson | 2017-11-06 | 1 | -2/+4
|   This allows \x and \o to work properly in the face of embedded NULs. A limit parameter is added to each function, and that is passed to memchr (which replaces strchr). See the branch merge message for more information.
* Add memBEGINPs() to core and use it | Karl Williamson | 2017-11-06 | 1 | -9/+16
|   This macro is like memBEGINs(), but the 'P' signifies we want a proper substring, meaning that the 2nd string parameter must not be the entire first parameter.
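The distinction between a prefix and a proper prefix can be sketched with two small functions. These are not the perl macros themselves (whose exact definitions are not shown here); `pb_begins` and `pb_begins_proper` are hypothetical names that mirror the behaviour described in the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if the len bytes at s start with the NUL-terminated prefix. */
static bool pb_begins(const char *s, size_t len, const char *prefix)
{
    size_t plen;

    assert(s != NULL && prefix != NULL);
    plen = strlen(prefix);
    return len >= plen && memcmp(s, prefix, plen) == 0;
}

/* As above, but the buffer must be strictly longer than the prefix, i.e.
 * the prefix must not be the whole of s. */
static bool pb_begins_proper(const char *s, size_t len, const char *prefix)
{
    size_t plen;

    assert(s != NULL && prefix != NULL);
    plen = strlen(prefix);
    return len > plen && memcmp(s, prefix, plen) == 0;
}
```

The proper variant is useful when the caller is about to look at what follows the prefix and so needs at least one more byte to exist.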