path: root/toke.c
Commit message (Author, Age, Files, Lines)
* Make source filters work in evalbytes (Father Chrysostomos, 2011-11-06, 1 file, -20/+88)
| When a filter is added, the current buffer is hung on the end of the
| filters array, and a new substring of it becomes the current buffer.
* Add evalbytes function (Father Chrysostomos, 2011-11-06, 1 file, -0/+4)
| This function evaluates its argument as a byte string, regardless of
| the internal encoding. It croaks if the string contains characters
| outside the byte range. Hence evalbytes(" use utf8; '\xc4\x80' ")
| will return "\x{100}", even if the original string had the UTF8 flag
| on, and evalbytes(" '\xc4\x80' ") will return "\xc4\x80".
|
| This has the side effect of fixing the deparsing of CORE::break under
| ‘use feature’ when there is an override.
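The byte-range rule in the evalbytes entry above can be sketched in C. This is an illustration only, not code from toke.c: downgrade_to_bytes and decode_utf8_2 are hypothetical names, and the sketch assumes the string is handed over as an array of code points. The first helper refuses any code point above 0xFF; the second shows why the byte pair C4 80, once read as UTF-8, yields U+0100.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy code points into a byte buffer.
 * Returns 0 on success, -1 if any code point is outside the byte range
 * (the case in which evalbytes is described as croaking). */
static int downgrade_to_bytes(const uint32_t *cps, size_t n, unsigned char *out)
{
    for (size_t i = 0; i < n; i++) {
        if (cps[i] > 0xFF)
            return -1;
        out[i] = (unsigned char)cps[i];
    }
    return 0;
}

/* Hypothetical helper: decode one two-byte UTF-8 sequence.
 * Assumes b0 is a valid two-byte lead byte and b1 a continuation byte. */
static uint32_t decode_utf8_2(unsigned char b0, unsigned char b1)
{
    return ((uint32_t)(b0 & 0x1F) << 6) | (uint32_t)(b1 & 0x3F);
}
```

With the bytes C4 80 the downgrade succeeds, and decoding the pair gives 0x100, which matches the "\x{100}" result quoted in the entry.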
* Forbid source filters in Unicode evals (Father Chrysostomos, 2011-11-06, 1 file, -0/+3)
| Source filters have always been byte-level filters. Therefore they
| don’t make sense on Unicode strings, unless we are planning to add
| new APIs to support it. Until then, croak.
* eval STRING UTF8 cleanup. (Brian Fraser, 2011-11-06, 1 file, -2/+3)
| (modified by the committer only to apply when the unicode_eval
| feature is enabled)
* Fix CORE::glob (Father Chrysostomos, 2011-10-26, 1 file, -2/+6)
| This commit makes CORE::glob bypass glob overrides.
|
| A side effect of the fix is that, with the default glob
| implementation, undefining *CORE::GLOBAL::glob no longer results in
| an ‘undefined subroutine’ error.
|
| Another side effect is that compilation of a glob op no longer
| assumes that the loading of File::Glob will create the
| *CORE::GLOBAL::glob typeglob.
| ‘++$INC{"File/Glob.pm"}; sub File::Glob::csh_glob; eval '<*>';’ used
| to crash.
|
| This is accomplished using a mechanism similar to lock() and
| threads::shared. There is a new PL_globhook interpreter variable
| that pp_glob calls when there is no override present. Thus,
| File::Glob (which is supposed to be transparent, as it *is* the
| built-in implementation) no longer interferes with the user mechanism
| for overriding glob.
|
| This removes one tier from the five or so hacks that constitute
| glob’s implementation, and which work together to make it one of the
| buggiest and most inconsistent areas of Perl.
* Resolve XS AUTOLOAD-prototype conflict (Father Chrysostomos, 2011-10-09, 1 file, -3/+3)
| Did you know that a subroutine’s prototype can be modified with s///?
| Don’t look:
|
|     *AUTOLOAD = *Internals'SvREFCNT;
|     my $f = "Just another ";
|     eval{main->$f};
|     print prototype AUTOLOAD;
|     $f =~ s/Just another /Perl hacker,\n/;
|     print prototype AUTOLOAD;
|
| You did look, didn’t you? You must admit that’s creepy.
|
| The problem goes back to this:
|
|     commit adb5a9ae91a0bed93d396bb0abda99831f9e2e6f
|     Author: Doug MacEachern <dougm@covalent.net>
|     Date:   Sat Jan 6 01:30:05 2001 -0800
|
|         [patch] xsub AUTOLOAD fix/optimization
|         Message-ID: <Pine.LNX.4.10.10101060924280.24460-100000@mojo.covalent.net>
|
|         Allow AUTOLOAD to be an xsub and allow such xsubs to avoid use of $AUTOLOAD.
|
|         p4raw-id: //depot/perl@8362
|
| which includes this:
|
|     + if (CvXSUB(cv)) {
|     +     /* rather than lookup/init $AUTOLOAD here
|     +      * only to have the XSUB do another lookup for $AUTOLOAD
|     +      * and split that value on the last '::',
|     +      * pass along the same data via some unused fields in the CV
|     +      */
|     +     CvSTASH(cv) = stash;
|     +     SvPVX(cv) = (char *)name; /* cast to loose constness warning */
|     +     SvCUR(cv) = len;
|     +     return gv;
|     + }
|
| That ‘unused’ field is not unused. It’s where the prototype is
| stored. So, not only is it clobbering the prototype, it’s also
| leaking it by assigning over the top of SvPVX. Furthermore, it’s
| blindly assigning someone else’s string, which could be freed before
| it’s even used.
|
| Since it has been documented for a long time that SvPVX contains the
| name of the AUTOLOADed sub, and since the use of SvPVX for prototypes
| is documented nowhere, we have to preserve the former. So this commit
| makes the prototype and the sub name share the same buffer, in a
| manner resembling that which CvFILE used before I changed it with
| bad4ae38.
|
| There are two new internal macros, CvPROTO and CvPROTOLEN for
| retrieving the prototype.
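The idea of letting a name and a prototype share one buffer, without clobbering, leaking, or borrowing someone else's string, can be sketched as follows. The layout here (name, NUL, prototype, NUL in a single owned allocation) and every identifier in the block are assumptions made for illustration; the sketch does not claim to match what CvPROTO and CvPROTOLEN do internally.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical container: one owned allocation holding
 * name bytes, '\0', prototype bytes, '\0'. */
typedef struct {
    char  *buf;
    size_t name_len;
    size_t proto_len;
} shared_buf;

/* Copy both strings into a fresh allocation, then release the old one.
 * Copying (rather than storing the caller's pointers) avoids holding a
 * string that its owner may free; freeing the old buffer avoids the leak.
 * Returns 0 on success, -1 if allocation fails. */
static int shared_buf_set(shared_buf *sb,
                          const char *name, size_t name_len,
                          const char *proto, size_t proto_len)
{
    char *p = malloc(name_len + 1 + proto_len + 1);
    if (p == NULL)
        return -1;
    if (name_len)
        memcpy(p, name, name_len);
    p[name_len] = '\0';
    if (proto_len)
        memcpy(p + name_len + 1, proto, proto_len);
    p[name_len + 1 + proto_len] = '\0';

    free(sb->buf);              /* old contents are no longer needed */
    sb->buf = p;
    sb->name_len = name_len;
    sb->proto_len = proto_len;
    return 0;
}

/* The name sits at the start of the buffer. */
static const char *shared_buf_name(const shared_buf *sb)
{
    return sb->buf;
}

/* The prototype sits just past the name's terminating NUL. */
static const char *shared_buf_proto(const shared_buf *sb)
{
    return sb->buf + sb->name_len + 1;
}
```

Because the new allocation is filled before the old one is freed, the prototype can be carried over from the existing buffer when only the name changes.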
* Cast to signed before negating, to avoid compiler warnings (Brian Fraser, 2011-10-06, 1 file, -1/+1)
|
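The entry above does not show the expression that was changed, so the following is only a generic sketch of the technique it names. Negating an unsigned value keeps the result unsigned, which some compilers warn about; casting to a signed type first gives the intended negative number. The helper name negative_offset is hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: turn an unsigned length into a negative offset.
 * Writing -len would stay unsigned and wrap to a huge value, so the
 * value is cast to a signed type first and negated afterwards.
 * Assumes len fits in ptrdiff_t. */
static ptrdiff_t negative_offset(size_t len)
{
    return -(ptrdiff_t)len;
}
```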
* toke.c, ext/attributes/attributes.xs: Make attributes UTF-8 clean. (Brian Fraser, 2011-10-06, 1 file, -1/+1)
|
* Modify S_pending_ident to use sv_catpvn_flags (Father Chrysostomos, 2011-10-06, 1 file, -1/+1)
| with the new SV_CAT* constants, since that’s faster than creating an
| SV to pass to sv_catsv.
* toke.c, op.c, sv.c: Prototype parsing and checking are nul-and-UTF8 clean. (Brian Fraser, 2011-10-06, 1 file, -8/+14)
| This means that eval "sub foo ($;\0whoops) { say @_ }" will correctly
| include \0whoops in the CV's prototype (while complaining about
| illegal characters), and that
|
|     use utf8;
|     BEGIN { $::{"foo"} = "\$\0L\351on" }
|     BEGIN { eval "sub foo (\$\0L\x{c3}\x{a9}on) {};"; }
|
| will not warn about a mismatched prototype.
* toke.c: Some simple mending to get readline() working with UTF-8 filehandles (Brian Fraser, 2011-10-06, 1 file, -1/+1)
|
* toke.c: Take utf8 into account when creating DATA handle (Father Chrysostomos, 2011-10-06, 1 file, -3/+13)
| This is based on work from Brian Fraser, but differs from his
| original in that it does not require an intermediate SV.
* mro UTF8 cleanup. (Brian Fraser, 2011-10-06, 1 file, -4/+14)
| This patch also duplicates existing mro tests with copies that use
| Unicode in identifiers, to test the mro code.
|
| Since those tests trigger it, it also fixes a bug in the parsing of
| *{...}: If the first character inside the braces is a non-ASCII
| Unicode identifier character, the inside is now implicitly quoted if
| it is just an identifier (just as it is with ASCII identifiers),
| instead of being parsed as a bareword that would violate strict subs.
* toke.c: S_scan_inputsymbol, initial GV-related UTF8 cleanup (Brian Fraser, 2011-10-06, 1 file, -2/+2)
|
* toke.c: S_checkcomma, GV-related UTF8 cleanup (Brian Fraser, 2011-10-06, 1 file, -1/+1)
|
* toke.c: yylex, GV-related UTF8 cleanup (Brian Fraser, 2011-10-06, 1 file, -10/+18)
|
* toke.c: S_find_in_my_stash, GV-related UTF8 cleanup (Brian Fraser, 2011-10-06, 1 file, -3/+3)
|
* toke.c: S_intuit_method, GV-related UTF8 cleanup (Brian Fraser, 2011-10-06, 1 file, -3/+4)
|
* toke.c: S_intuit_more, GV-related UTF8 cleanup (Brian Fraser, 2011-10-06, 1 file, -1/+2)
|
* toke.c: S_force_ident, GV-related UTF8 cleanup (Brian Fraser, 2011-10-06, 1 file, -3/+4)
|
* gv.c: Initial gv_fetchpvn_flags and gv_stashpvn UTF8 cleanup (Brian Fraser, 2011-10-06, 1 file, -6/+8)
| Now that a glob can be initialized and fetched in UTF-8, the next
| commit will introduce some changes in toke.c to actually test this.
|
| Committer’s note: To keep tests passing I had to incorporate the
| toke.c:S_pending_ident changes in the same patch.
* Fix inability of lex_read_unichar to handle 80-FF under "no utf8;". (Eric Brine, 2011-09-20, 1 file, -1/+4)
| lex_peek_unichar is already correct.
* The Borland Chainsaw Massacre (Steve Hay, 2011-09-10, 1 file, -6/+0)
| Remove support for the Borland C++ compiler on Win32, as agreed here:
| http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2011-09/msg00034.html
* remove index offsetting ($[) (Zefram, 2011-09-09, 1 file, -8/+0)
| $[ remains as a variable. It no longer has compile-time magic.
| At runtime, it always reads as zero, accepts a write of zero, but
| dies on writing any other value.
* remove unused variables and assignments (Robin Barker, 2011-09-08, 1 file, -2/+1)
| This also silences some compiler warnings.
|
| I do not understand the code in toke.c, but the change aligns the
| code with other uses of FUN0OP; it produces no warnings and does not
| break any test.
* [perl #95546] Allow () after __FILE__, etc. (Father Chrysostomos, 2011-08-12, 1 file, -9/+15)
| This commit makes the __FILE__, __LINE__ and __PACKAGE__ tokens parse
| the same way as nullary functions. It adds two extra rules to
| perly.y to allow the op to be created in toke.c, instead of directly
| inside the parser.
* Passing the flag to the pad functions in toke.c (Brian Fraser, 2011-07-12, 1 file, -4/+6)
|
* APIify pad functions (Zefram, 2011-07-12, 1 file, -2/+2)
| Move several pad functions into the core API. Document the pad
| functions more consistently for perlapi. Fix the interface issues
| around delimitation of lexical variable names, providing _pvn, _pvs,
| _pv, and _sv forms of pad_add_name and pad_findmy.
* Stop having one of the following qw() warnings hide the other: (Eric Brine, 2011-07-03, 1 file, -6/+7)
| - Possible attempt to separate words with commas
| - Possible attempt to put comments in qw() list
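One way to keep two diagnostics from masking each other is to track them as independent flags during a single scan. The sketch below is a simplified assumption about the shape of such a check, not the code in toke.c: the flag names and qw_warnings are hypothetical, and it treats any ',' or '#' in the list body as a trigger.

```c
/* Hypothetical flag values, one bit per diagnostic. */
enum {
    QW_WARN_COMMA   = 1,  /* "Possible attempt to separate words with commas" */
    QW_WARN_COMMENT = 2   /* "Possible attempt to put comments in qw() list" */
};

/* Scan the body of a qw() list once and report every diagnostic that
 * applies. Each condition sets its own bit, so finding a comma does not
 * stop the scan from also noticing a '#', and vice versa. */
static unsigned qw_warnings(const char *body)
{
    unsigned flags = 0;
    for (const char *p = body; *p != '\0'; p++) {
        if (*p == ',')
            flags |= QW_WARN_COMMA;
        else if (*p == '#')
            flags |= QW_WARN_COMMENT;
    }
    return flags;
}
```

A caller would then emit one warning per set bit, so a list such as "a, b #c" produces both messages.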
* Allow ‘continue;’ without feature.pm (Father Chrysostomos, 2011-06-14, 1 file, -12/+1)
| Since there is no conflict between ‘continue;’ and a user-defined
| subroutine (it’s a syntax error, as ‘continue’ is already a keyword),
| there is no need to require the ‘switch’ feature to be enabled for
| this keyword.
|
| This actually simplifies the implementation.
* [perl #90130] Allow CORE::* without feature.pm (Father Chrysostomos, 2011-06-11, 1 file, -4/+7)
| This commit allows feature.pm-enabled keywords to work with CORE::*
| even outside the scope of ‘use feature’.
* [perl #88776] Signedness warning in toke.c (David Mitchell, 2011-06-06, 1 file, -3/+3)
| fix a warning introduced by 6d51015587940c2032a6533d886163f69ca028f9
* scan_heredoc could reallocate PL_parser->linestr's PV (David Leadbeater, 2011-05-18, 1 file, -0/+1)
| Since f0e67a1 it has been possible for the freed buffer to be read
| from when parsing a heredoc. This adds a call to lex_grow_linestr to
| grow the buffer and ensure the pointers in PL_parser are updated.
|
| The bug is pretty hard to reproduce, hence no test. I'm able to
| reproduce it with the following:
|
|     perl -Meverywhere=re,debug -MParams::Util -e1
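The hazard described in the scan_heredoc entry is the general one of growing a buffer while other pointers still point into it. A minimal sketch of the usual remedy follows; line_buf and line_buf_grow are hypothetical names, and the struct is a stand-in rather than the real parser state. Offsets are recorded before the reallocation and the pointers are rebuilt from them afterwards.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a parser's line buffer: the allocation
 * plus two pointers that point into it. buf must already be allocated. */
typedef struct {
    char  *buf;
    size_t cap;
    char  *cursor;  /* current read position inside buf */
    char  *end;     /* one past the last valid byte inside buf */
} line_buf;

/* Grow the buffer to at least new_cap bytes and re-derive every pointer
 * into it. realloc may move the block, so the old cursor and end values
 * must not be used after the call; only the saved offsets are.
 * Returns 0 on success, -1 if allocation fails (state left unchanged). */
static int line_buf_grow(line_buf *lb, size_t new_cap)
{
    if (new_cap <= lb->cap)
        return 0;

    size_t cursor_off = (size_t)(lb->cursor - lb->buf);
    size_t end_off    = (size_t)(lb->end - lb->buf);

    char *p = realloc(lb->buf, new_cap);
    if (p == NULL)
        return -1;

    lb->buf    = p;
    lb->cap    = new_cap;
    lb->cursor = p + cursor_off;
    lb->end    = p + end_off;
    return 0;
}
```

Routing every resize through one function like this is what keeps stray pointers from being left behind, which is the role the entry attributes to lex_grow_linestr.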
* [perl #88420] BOM support on Windows broken in 5.13.11 (Jan Dubois, 2011-04-13, 1 file, -1/+7)
| When Perl reads the script in text mode, then the tell() position on
| the script handle may include stripped carriage return characters.
| Therefore the file position after reading the first line of the
| script may be one larger than the length of the input buffer.
* PATCH: partial [perl #86972]: Allow /aia (Karl Williamson, 2011-04-10, 1 file, -9/+16)
| This allows a second /a modifier to not have to be contiguous with
| the first. This patch changes only the part in toke.c where the
| modifiers are in suffix form.
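The suffix-modifier rules mentioned in this log (only one of /a, /d, /l, /u per regular expression, with a second /a allowed and not required to be adjacent to the first) can be modelled with a small validator. This is a sketch under assumptions: check_charset_mods is a hypothetical name, the limit of two /a modifiers and the rejection of a repeated /d, /l or /u are assumed here rather than stated in the entries, and other modifier letters are simply ignored.

```c
/* Hypothetical validator for the character-set modifiers in a regex
 * suffix such as "aia" or "lx".
 * Returns 0 if the combination is acceptable, -1 if it conflicts. */
static int check_charset_mods(const char *mods)
{
    int  a_count = 0;  /* how many 'a' modifiers seen so far */
    char other   = 0;  /* the 'd', 'l' or 'u' seen so far, if any */

    for (const char *p = mods; *p != '\0'; p++) {
        switch (*p) {
        case 'a':
            if (other != 0)
                return -1;          /* /a cannot combine with /d /l /u */
            if (++a_count > 2)
                return -1;          /* assumed cap: at most two /a */
            break;
        case 'd':
        case 'l':
        case 'u':
            if (a_count > 0 || other != 0)
                return -1;          /* only one charset modifier allowed */
            other = *p;
            break;
        default:
            break;                  /* i, m, s, x, ... are not checked here */
        }
    }
    return 0;
}
```

Because the count is kept across the whole suffix, "aia" is accepted in the same way as "aai", which is the relaxation the /aia entry describes.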
* [perl #87064] eval no longer shares filters (Father Chrysostomos, 2011-04-03, 1 file, -3/+10)
| Before this commit:
|
|     commit f07ec6dd59215a56bc1159449a9631be7a02a94d
|     Author: Zefram <zefram@fysh.org>
|     Date:   Wed Oct 13 19:05:19 2010 +0100
|
|         remove filter inheritance option from lex_start
|
|         The only uses of lex_start that had the new_filter parameter
|         false, to make the new lexer context share source filters
|         with the previous lexer context, were uses with rsfp null,
|         which therefore never invoked source filters. Inheriting
|         source filters from a logically unrelated file seems like a
|         silly idea anyway.
|
| string evals could inherit the same source filter space as the
| currently compiling code. Despite what the quoted commit message
| says, sharing source filters allows filters to be inherited in both
| directions: A source filter created when the eval is being compiled
| also applies to the file with which it is sharing its space.
|
| There are at least 20 CPAN distributions relying on this behaviour
| (or, rather, what could be considered a Test::More bug). So this
| commit restores the source-filter-sharing capability. It does not
| change the current API or make public the API for sharing source
| filters, as this is supposed to be a temporary stop-gap measure for
| 5.14.
* fix compiler warning in toke.c (David Mitchell, 2011-03-26, 1 file, -1/+1)
| The third arg to newSVOP must be non-null, and the macro expansion
| for SvREFCNT_inc can give a null value sometimes. So replace it with
| SvREFCNT_inc_NN and everyone's happy.
* reg_namedseq: Restructure so doesn't duplicate code (Karl Williamson, 2011-03-20, 1 file, -6/+20)
| This routine now calls reg() recursively after converting the parse
| to something the rest of the code understands. This eliminates
| duplicated code, and allows for uniform treatment of code points, as
| things were getting out of sync. It also eliminates the restriction
| on how many characters a named sequence can expand to.
|
| toke now converts its input (which is in Unicode terms) to native on
| EBCDIC platforms, so the rest of the code can continue to ignore
| that.
|
| The restriction on the number of characters a named sequence can
| expand to is hereby removed, because reg() handles that.
* toke.c: Raise error for multiple regexp mods (Karl Williamson, 2011-03-01, 1 file, -4/+40)
| When the new regular expression modifiers being allowed in
| suffix-form were added on a very tight schedule, it was with the
| understanding that the error checking that only one can occur per
| regular expression would be added later. This accomplishes that.
* [perl #79442] A #line "F" in a string eval doesn't update *{"_<F"} (Father Chrysostomos, 2011-02-27, 1 file, -5/+23)
| There are two problems here.
|
| String evals do not populate @{"_<..."} arrays the way parsed streams
| do. The lines are all put into the array before the parse. Commit
| e66cf94 added the code to copy (actually alias, but whatever) the
| lines into the new array when a #line directive is encountered.
| Interestingly, the following commit (8a5ee59) wrapped that code in an
| #ifndef USE_ITHREADS block, because ‘[c]hange 25409 [e66cf94] wasn’t
| necessary for threaded perls’. It seems it *was* necessary for
| threaded perls after all, because the lines are simply not copied.
|
| In non-threaded perls it wasn’t working properly either. The array
| in the new glob was the same array as the old (aliased), so the line
| numbers would be off if the #line directive contained a line number
| that differed.
|
| This commit does three things:
| • It removes the #ifndef.
| • It checks whether the line number has changed and aliases the
|   individual elements of the array.
| • The breakpoints hash is not copied if the line number differs, as
|   setting a breakpoint on (eval 1):1 (warn 1) in
|       eval qq{warn 1;\n#line 1 "foo"\nwarn 2;}
|   should not also set a breakpoint on foo:1 (warn 2).
* toke.c: Specialized /le message is only for substitutes (Karl Williamson, 2011-02-21, 1 file, -0/+8)
| m//le has to be the lexical comparison, so use the generic
| deprecation for that case
* move brace to same line as conditional (Karl Williamson, 2011-02-21, 1 file, -2/+1)
|
* toke.c: fix comment (Karl Williamson, 2011-02-21, 1 file, -2/+3)
|
* Allow suffix form for /a /d /l /u (Karl Williamson, 2011-02-19, 1 file, -2/+58)
| This patch contains the code changes for doing this, but not most of
| the pod changes, nor the new .t tests required. There were already
| tests in place to make sure that this didn't break backcompat.
* toke.c: Don't take the address of a register (Karl Williamson, 2011-02-19, 1 file, -1/+1)
| I discovered after I pushed 858a358bdd94da8251cdb2210d9bec7c1bbe7464
| that I had forgotten to 'git add' changes before committing.
* toke.c: Move suffix re mods message to one place (Karl Williamson, 2011-02-19, 1 file, -27/+30)
| This involves a slight refactoring of the routine that handles
| parsing for the mods
* toke.c: silence win32 compiler warning (Karl Williamson, 2011-02-15, 1 file, -1/+1)
|
* perldiag: retitle Ambiguous use of %c{%s%s} (Father Chrysostomos, 2011-02-13, 1 file, -0/+1)
| This is not very helpful:
|
|     =item Ambiguous use of %c{%s%s} resolved to %c%s%s
|
| especially since it is functionally identical to the previous entry:
|
|     =item Ambiguous use of %c{%s} resolved to %c%s
|
| Not only can diagnostics.pm never find it, but it is hard for human
| beings to understand what the difference is at first glance, too.
|
| So filling in the second and fourth %s’s with the two possible values
| slays a twain of avians with one piece of petrified matter.
* Silence win32 smoke compiler warning (Karl Williamson, 2011-02-13, 1 file, -2/+2)
|
* Fix up \cX for 5.14 (Karl Williamson, 2011-02-09, 1 file, -1/+1)
| Throughout 5.13 there was temporary code to deprecate and forbid
| certain values of X following a \c in qq strings. This patch fixes
| this to the final 5.14 semantics. These are:
|
| 1) a utf8 non-ASCII character will croak. This is the same behavior
|    as pre-5.13, but it gives a correct error message, rather than
|    the malformed utf8 message previously.
| 2) \c{ and \cX where X is above ASCII will generate a deprecated
|    message. The intent is to remove these capabilities in 5.16. The
|    original agreement was to croak on above ASCII, but that does
|    violate our stability policy, so I'm deprecating it instead.
| 3) A non-deprecated warning is generated for all other \cX; this is
|    the same as throughout the 5.13 series.
|
| I did not have the tuits to use \c{} as I had planned in 5.14, but
| \N{} can be used instead.
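For readers unfamiliar with how \cX picks its character, the ASCII mapping can be sketched as: uppercase X, then XOR with 64. The function below illustrates that arithmetic only; ctrl_char is a hypothetical name, the sketch models none of the warnings or deprecations listed in the entry, and it simply returns -1 for anything outside ASCII.

```c
#include <ctype.h>

/* Sketch of the ASCII \cX mapping: fold X to uppercase, then flip
 * bit 6 (XOR 64). Input outside 0..127 is not modelled and yields -1. */
static int ctrl_char(int x)
{
    if (x < 0 || x > 127)
        return -1;
    return toupper(x) ^ 64;
}
```

Under this mapping \cA and \ca both give character 1 and \c[ gives 27 (escape), while \c{ lands on ';', an ordinary printable character, which helps explain why that form was singled out for deprecation.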