path: root/op.c
Commit message | Author | Age | Files | Lines
...
* Fix tr/// compilation on VMS | Karl Williamson | 2019-11-08 | 1 | -1/+1
| 64-bits on that platform require a long long, and 1UL isn't. I should have copied more carefully the similar code in utf8.h (reported to me privately by Craig Berry)
* Silence some compiler warnings | Karl Williamson | 2019-11-07 | 1 | -2/+2
| These were introduced in the tr/// changes in the series merged in 240494d6992696a7a350217c131e1d5dc1444a0c
* op.c: Remove no-longer used function | Karl Williamson | 2019-11-06 | 1 | -21/+0
|
* Reimplement tr/// without swashes | Karl Williamson | 2019-11-06 | 1 | -349/+961
| This large commit removes the last use of swashes from core. It replaces swashes by inversion maps. This data structure is already in use for some Unicode properties, such as case changing.
|
| The inversion map data structure leads to straightforward implementation code, so I collapsed the two doop.c routines do_trans_complex_utf8() and do_trans_simple_utf8() into one. A few conditionals could be avoided in the loop if this function were split so that one version didn't have to test for, e.g., squashing, but I suspect these are in the noise in the loop, which has to deal with UTF-8 conversions. This should be faster than the previous implementation anyway. I measured the differences some releases back, and inversion maps were faster than the equivalent swash for up to 512 or 1024 different ranges. These numbers are unlikely to be exceeded in tr/// except possibly in machine-generated ones.
|
| Inversion maps are capable of handling both UTF-8 and non-UTF-8 cases, but I left in the existing non-UTF-8 implementation, which uses tables, because I suspect it is faster. This means that there is extra code, purely for runtime performance. An inversion map is always created from the input, and then if the table implementation is to be used, the table is easily derived from the map.
|
| Prior to this commit, the table implementation was used in certain edge cases involving code points above 255. Those cases are now handled by the inversion map implementation, because it would have taken extra code to detect them, and I didn't think it was worth it. That could be changed if I am wrong.
|
| Creating an inversion map for all inputs essentially normalizes them, and then the same logic is usable for all. This fixes some false negatives in the previous implementation. It also allows for detecting if the actual transliteration can be done in place. Previously, the code mostly punted on that detection for the UTF-8 case. This also allows for accurate counting of the lengths of the two sides, fixing some longstanding TODO warning tests.
|
| A new flag is created, OPpTRANS_CAN_FORCE_UTF8, when the tr/// has a below 256 character resolving to one that requires UTF-8. If this isn't set, the code knows that a non-UTF-8 input won't become UTF-8 in the process, and so can take short cuts. The bit representing this flag is the same as OPpTRANS_FROM_UTF, which is no longer used. That name is left in so that the dozen-ish modules in CPAN that refer to it can still compile. AFAICT none of them actually use the flag, as well they shouldn't since it is private to the core.
|
| Inversion maps are ideally suited for tr/// implementations. An issue with them in general is that for some pathological data, they can become fragmented requiring more space than you would expect, to represent the underlying data. However, the typical tr/// would not have this issue, requiring only very short inversion maps to represent; in some cases shorter than the table implementation.
|
| Inversion maps are also easier to deparse than swashes. A deparse TODO was also fixed by this commit, and the code to deparse UTF-8 inputs is simplified.
|
| One could implement specialized data structures for specific types of inputs. For example, a common tr/// form is a single range, like tr/A-Z/a-z/. That could be implemented without a table and be quite fast. An intermediate step would be to use the inversion map implementation always when the transliteration is a single range, and then special case length=1 maps at execution time.
|
| Thanks to Nicholas Rochemagne for his help on B
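To make the inversion map idea in the commit above concrete, here is a minimal sketch in C. It is not perl's implementation: the names `InvMap`, `invmap_lookup`, and `MAP_IDENTITY` are hypothetical, and the layout (a sorted array of range starts plus a parallel array of mapped values) is an assumption about one simple way to represent such a map.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel: code points in such a range are left unchanged. */
#define MAP_IDENTITY UINT32_MAX

/* starts[] is sorted ascending and starts[0] == 0, so every code point
 * falls in exactly one range [starts[i], starts[i+1]). */
typedef struct {
    const uint32_t *starts;  /* first code point of each range */
    const uint32_t *maps;    /* what starts[i] maps to, or MAP_IDENTITY */
    size_t len;              /* number of ranges */
} InvMap;

/* Binary search for the last range whose start is <= cp. */
static size_t invmap_search(const InvMap *m, uint32_t cp)
{
    size_t lo = 0, hi = m->len;

    assert(m->len > 0 && m->starts[0] == 0);
    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;
        if (m->starts[mid] <= cp)
            lo = mid;
        else
            hi = mid;
    }
    return lo;
}

/* Map one code point: keep its offset within the range it falls in. */
static uint32_t invmap_lookup(const InvMap *m, uint32_t cp)
{
    size_t i = invmap_search(m, cp);

    if (m->maps[i] == MAP_IDENTITY)
        return cp;
    return m->maps[i] + (cp - m->starts[i]);
}

/* tr/A-Z/a-z/ needs only three ranges in this representation. */
static const uint32_t tr_starts[] = { 0, 'A', 'Z' + 1 };
static const uint32_t tr_maps[]   = { MAP_IDENTITY, 'a', MAP_IDENTITY };
static const InvMap tr_upper_to_lower = { tr_starts, tr_maps, 3 };
```

With this layout, `invmap_lookup(&tr_upper_to_lower, 'Q')` lands in the second range and returns `'a' + ('Q' - 'A')`. A byte-indexed table for the non-UTF-8 case could be filled by calling the lookup for each of the 256 byte values, which is one way to read "the table is easily derived from the map".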
* op.c: Add debugging dump function | Karl Williamson | 2019-11-06 | 1 | -0/+41
| This function dumps out an inversion map
* op.c: Simplify expression. | Karl Williamson | 2019-11-06 | 1 | -3/+2
| This also makes sure 'struct_size' has the correct value in it for any future uses.
* op.c, doop.c Use mnemonics instead of numeric values | Karl Williamson | 2019-11-06 | 1 | -15/+24
| For legibility and maintainability
* Change macro name in tr/// code | Karl Williamson | 2019-11-06 | 1 | -10/+10
| This makes it more mnemonic. Also add an explanation in toke.c
* op.c: Comments only | Karl Williamson | 2019-11-06 | 1 | -3/+5
| Indent for clarity, and add a comment
* doop.c, op.c: White-space only | Karl Williamson | 2019-11-06 | 1 | -99/+100
| Remove trailing blanks and outdent a doubly indented block
* op.c: Indent some code | Karl Williamson | 2019-11-06 | 1 | -102/+102
| This is in preparation for a future commit which will surround this with an 'if'.
* enforce strict for barewords in multiconcat | Tony Cook | 2019-11-04 | 1 | -0/+2
| gh #17254
* Faster feature checks | Tony Cook | 2019-10-30 | 1 | -2/+4
| Perform only a bit check instead of a much more expensive hash lookup to test features.
|
| For now I've just added a U32 to the cop structure to store the bits, if we need more we could either add more bits directly, or make it a pointer.
|
| We don't have the immediate need for a pointer that warnings do since we don't dynamically add new features during compilation/runtime.
|
| The changes to %^H are retained so that caller() can be used from perl code to check the features enabled at a given caller's scope.
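The bit check described above can be illustrated with a small C sketch. Everything here is hypothetical: `DemoCop`, the `DEMO_FEATURE_*` values, and the helper names are stand-ins chosen for illustration, not the names perl uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical feature bits; the real names and values live in perl's headers. */
#define DEMO_FEATURE_SAY     (UINT32_C(1) << 0)
#define DEMO_FEATURE_STATE   (UINT32_C(1) << 1)
#define DEMO_FEATURE_SWITCH  (UINT32_C(1) << 2)

/* Stand-in for the per-statement structure that carries the 32-bit field. */
typedef struct {
    uint32_t feature_bits;
} DemoCop;

/* Testing a feature is a single AND, with no hash lookup involved. */
static int demo_feature_enabled(const DemoCop *cop, uint32_t feature)
{
    assert(cop != NULL);
    return (cop->feature_bits & feature) != 0;
}

/* Enabling a feature sets its bit; this happens at compile time only. */
static void demo_feature_enable(DemoCop *cop, uint32_t feature)
{
    assert(cop != NULL);
    cop->feature_bits |= feature;
}
```

A 32-bit field caps the sketch at 32 features, which matches the commit's remark that more bits or a pointer could be added later if needed.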
* Perl_Slab_Alloc(): tweak logging | David Mitchell | 2019-09-25 | 1 | -0/+4
| When looking for a suitable op-sized chunk of memory in a slab's free list, perl logs the search but doesn't log a successful match. Add such a log line to make analysis of the output of 'perl -DS' easier.
* sub foo($_) {...} - change error message | David Mitchell | 2019-09-23 | 1 | -6/+12
| When using one of the globals like $_ or @_ in a subroutine signature, the error message was misleading:
|
|     Can't use global $_ in "my"
|
| This commit changes it to:
|
|     Can't use global $_ in subroutine signature
* rpeep(): skip duplicate nextstates even with gaps | David Mitchell | 2019-09-23 | 1 | -2/+11
| rpeep() already optimises away consecutive nextstate ops. This commit makes it do this even if there are 'noop' ops between them like null, scope, lineseq.
|
| This has a specific utility for the next commit, which will reorganise the optree for subroutine signatures in a way which introduces a lineseq between two nextstates.
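The look-ahead described in the rpeep() commit can be sketched on a toy op chain. The types and names below (`PeepOp`, `PeepKind`, `peep_drop_duplicate_nextstates`) are hypothetical, and turning the earlier nextstate into a null op is an assumption made for this sketch; the log does not say how rpeep() removes it.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    PEEP_NEXTSTATE, PEEP_NULL, PEEP_SCOPE, PEEP_LINESEQ, PEEP_WORK
} PeepKind;

/* Toy op: only a kind and the execution-order successor. */
typedef struct PeepOp {
    PeepKind kind;
    struct PeepOp *next;
} PeepOp;

/* Ops that do nothing at run time and may sit between two nextstates. */
static int peep_is_noop(const PeepOp *o)
{
    return o->kind == PEEP_NULL || o->kind == PEEP_SCOPE
        || o->kind == PEEP_LINESEQ;
}

/* Null out every nextstate that is followed, ignoring no-ops, by another
 * nextstate. Returns how many were nulled. */
static size_t peep_drop_duplicate_nextstates(PeepOp *start)
{
    size_t dropped = 0;
    PeepOp *o;

    assert(start != NULL);
    for (o = start; o != NULL; o = o->next) {
        PeepOp *ahead;

        if (o->kind != PEEP_NEXTSTATE)
            continue;
        ahead = o->next;
        while (ahead != NULL && peep_is_noop(ahead))
            ahead = ahead->next;
        if (ahead != NULL && ahead->kind == PEEP_NEXTSTATE) {
            o->kind = PEEP_NULL;   /* the later nextstate supersedes it */
            dropped++;
        }
    }
    return dropped;
}
```

For a chain `nextstate, lineseq, nextstate, work`, the first nextstate is nulled and the second is kept, which is the "gap" case the commit adds.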
* Un-revert "[MERGE] add+use si_cxsubix field" | David Mitchell | 2019-09-23 | 1 | -1/+2
| original merge commit: v5.31.3-198-gd2cd363728
| reverted by: v5.31.4-0-g20ef288c53
|
| The commit following this commit fixes the breakage, which means the revert can be undone.
* Revert "[MERGE] add+use PL_curstackinfo->si_cxsubix field" [v5.31.4] | Max Maischein | 2019-09-20 | 1 | -2/+1
| This reverts commit d2cd363728088adada85312725ac9d96c29659be, reversing changes made to 068b48acd4bdf9e7c69b87f4ba838bdff035053c.
|
| This change breaks installing Test::Deep:
|
|     ...
|     not ok 37 - Test 'isa eq' completed
|     ok 38 - Test 'isa eq' no premature diagnostication
|     ...
* set VOID on OP_ENTER | David Mitchell | 2019-09-19 | 1 | -1/+2
| The OP_ENTER planted at the start of a program (and possibly elsewhere) gets left as UNKNOWN context rather than VOID context, due to op_scope() not honouring the current context. Fixing this makes things infinitesimally faster.
* perlapi: Properly document Perl_custom_op_xop() | Karl Williamson | 2019-09-15 | 1 | -1/+1
| It requires the prefix and a thread context parameter.
* fix size-miscalculation upgrading LISTOP TO LOOPOP | David Mitchell | 2019-08-09 | 1 | -1/+2
| RT #134344
|
| My recent commit v5.31.2-54-g8c47b5bce7 broke some CPAN modules because the code in Perl_newFOROP() wasn't accounting for the overhead in the opslot struct when deciding whether an allocated LISTOP was large enough to be upgraded in-place to a LOOPOP.
* Perl_opslab_force_free() adjust loop test | David Mitchell | 2019-08-05 | 1 | -1/+1
| Formerly, slots were allocated within a slab, but leaving the very top word in the slab as a NULL pointer which appeared as a fake slot so that a 'while (slot->opslot_next)' loop would stop. Since opslot_next has been eradicated and the NULL is no longer allocated, the loop condition for scanning all slots can be simplified slightly (with no change in functionality).
* OPSLOT: replace opslot_next with opslot_size | David Mitchell | 2019-08-05 | 1 | -5/+11
| Currently, each allocated opslot has a pointer to the opslot that was allocated immediately above it. Replace this with a U16 opslot_size field giving the size of the opslot. The next opslot can then be found by adding slot->opslot_size * sizeof(void*) to slot. This saves space.
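The size-based stepping described above can be sketched as follows. The structures are simplified and hypothetical (`DemoSlot`, `DemoSlab`, and the `demo_*` helpers are not perl's), and this sketch hands out slots upwards from the start of the storage for simplicity, whereas the log says perl allocates slots downwards within a slab.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified slot header: only the size, in pointer-sized words. */
typedef struct {
    uint16_t size_words;
} DemoSlot;

typedef struct {
    char  *base;         /* calloc'd storage, aligned for any object type */
    size_t total_words;
    size_t used_words;   /* words handed out so far */
} DemoSlab;

static int demo_slab_init(DemoSlab *slab, size_t total_words)
{
    slab->base = calloc(total_words, sizeof(void *));
    slab->total_words = total_words;
    slab->used_words = 0;
    return slab->base != NULL;
}

/* Carve a slot of size_words words out of the slab, or return NULL if full. */
static DemoSlot *demo_slot_alloc(DemoSlab *slab, uint16_t size_words)
{
    DemoSlot *slot;

    assert(size_words > 0);
    if (slab->total_words - slab->used_words < size_words)
        return NULL;
    slot = (DemoSlot *)(slab->base + slab->used_words * sizeof(void *));
    slot->size_words = size_words;
    slab->used_words += size_words;
    return slot;
}

/* The neighbouring slot is found from the size, not from a stored pointer. */
static DemoSlot *demo_next_slot(DemoSlot *slot)
{
    return (DemoSlot *)((char *)slot + slot->size_words * sizeof(void *));
}

/* Walk all allocated slots; the end of the used region stops the loop. */
static size_t demo_count_slots(const DemoSlab *slab)
{
    char *end = slab->base + slab->used_words * sizeof(void *);
    DemoSlot *slot = (DemoSlot *)slab->base;
    size_t n = 0;

    while ((char *)slot < end) {
        n++;
        slot = demo_next_slot(slot);
    }
    return n;
}
```

A 16-bit size replaces a full pointer per slot, and the walk needs no NULL terminator because the bound of the used region ends it.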
* opslabs: change opslab_first to opslab_free_space | David Mitchell | 2019-08-05 | 1 | -28/+32
| Currently an OPSLAB maintains a pointer to the lowest allocated OPSLOT within the slab (slots are allocated downwards). Replace this pointer with a U16 indicating how many pointer-sized words are free below the lowest allocated slot.
* OPSLAB: always have opslab_size field | David Mitchell | 2019-08-05 | 1 | -1/+2
| Currently this struct only has the opslab_size field on debugging builds. Change it so that this field is always present. This will make it easier to not need a fake partial OPSLOT at the end of the slab with a NULL opslot_next field, which will in turn simplify converting opslot_next into a U16 size field shortly.
* make opslot_slab an offset in current slab | David Mitchell | 2019-08-05 | 1 | -15/+26
| Each OPSLOT allocated within an OPSLAB contains a pointer, opslot_slab, which points back to the first (head) slab of the slab chain (i.e. not necessarily to the slab which the op is contained in).
|
| This commit changes the pointer to be a 16-bit offset from the start of the current slab, and adds a pointer at the start of each slab which points back to the head slab.
|
| The mapping from an op to the head slab is now a two-step process: use the op's slot's opslot_offset field to find the start of the current slab, then use that slab's new opslab_head pointer to find the head slab.
|
| The advantage of this is that it reduces the storage per op.
|
| (It probably doesn't make any practical difference yet, due to alignment issues, but that will be sorted shortly in this branch.)
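The two-step mapping from a slot to the head slab can be sketched like this. The layout is an assumption made for illustration: the slab header is a single pointer occupying word 0, offsets are counted in pointer-sized words, and all names (`DemoSlab`, `DemoSlot`, `demo_*`) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define DEMO_SLAB_WORDS 64

/* Slab header: one pointer back to the head slab, then slot storage. */
typedef struct DemoSlab {
    struct DemoSlab *head;
} DemoSlab;

/* Slot header: a 16-bit offset from the start of the containing slab,
 * measured in pointer-sized words. */
typedef struct {
    uint16_t offset_words;
} DemoSlot;

/* Allocate a slab; passing NULL makes the new slab its own head. */
static DemoSlab *demo_slab_new(DemoSlab *head)
{
    DemoSlab *slab = calloc(DEMO_SLAB_WORDS, sizeof(void *));

    if (slab != NULL)
        slab->head = (head != NULL) ? head : slab;
    return slab;
}

/* Place a slot word_index words into the slab and record that offset. */
static DemoSlot *demo_slot_at(DemoSlab *slab, uint16_t word_index)
{
    DemoSlot *slot;

    /* Word 0 holds the slab header, so slots start at word 1. */
    assert(word_index >= 1 && word_index < DEMO_SLAB_WORDS);
    slot = (DemoSlot *)((char *)slab + word_index * sizeof(void *));
    slot->offset_words = word_index;
    return slot;
}

/* Step 1: the slot's offset leads back to the start of its own slab. */
static DemoSlab *demo_slot_slab(DemoSlot *slot)
{
    return (DemoSlab *)((char *)slot - slot->offset_words * sizeof(void *));
}

/* Step 2: that slab's head pointer leads to the first slab in the chain. */
static DemoSlab *demo_slot_head(DemoSlot *slot)
{
    return demo_slot_slab(slot)->head;
}
```

The trade is one extra pointer per slab against a narrower field in every slot, which only pays off once the slot header shrinks below pointer alignment, as the commit's parenthetical notes.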
* Perl_Slab_Alloc(): rename 'slab' to 'head_slab' | David Mitchell | 2019-08-05 | 1 | -14/+16
| Rename this local var to better identify that it always points to the first slab in the slab chain, rather than to the current slab.
* Perl_op_lvalue_flags(): make mostly non-recursive | David Mitchell | 2019-06-24 | 1 | -25/+71
| Recursion is left in a few places where it is necessary to call itself with a different value for 'type'.
* Perl_op_lvalue_flags() add blank lines | David Mitchell | 2019-06-24 | 1 | -0/+7
| ... between switch cases for readability.
* Perl_op_lvalue_flags(): skip OPf_WANT_VOID ops. | David Mitchell | 2019-06-24 | 1 | -5/+5
| Currently this function asserts that its 'o' argument is non-VOID; later when recursing an OP_LIST, it skips any kids which are VOID. This commit changes it so that the assert becomes a return, and OP_LIST doesn't check whether its kids are VOID.
|
| Doing it this way makes it easier to shortly make Perl_op_lvalue_flags() non-recursive.
|
| The only functional difference is that on debugging builds, Perl_op_lvalue_flags() will no longer fail an assert if inadvertently called with a VOID op.
* Perl_op_lvalue_flags(): fixup documentation | David Mitchell | 2019-06-24 | 1 | -19/+25
| First, move the apidoc text for op_lvalue() to be directly above Perl_op_lvalue_flags() (it had wandered).
|
| Secondly, add a brief non-API note explaining what the extra 'flags' parameter does
* reindent op.c:S_lvref() | David Mitchell | 2019-06-24 | 1 | -119/+131
| ... after the previous commit wrapped most of it in a while loop. Also put a blank line after each switch case for readability.
* make op.c:S_lvref() non-recursive | David Mitchell | 2019-06-24 | 1 | -17/+27
|
* document what op.c:S_lvref() does | David Mitchell | 2019-06-24 | 1 | -0/+9
|
* op.c: S_lvref(): handle all kids on OP_NULL | David Mitchell | 2019-06-24 | 1 | -4/+7
| For an OP_NULL, this function formerly recursed into *all* its kids if it was an ex-list, otherwise only the first one.
|
| To simplify making this function non-recursive, make it so that it unconditionally recurses into all the kids.
|
| However for now, also add an assertion that a non ex-list OP_NULL will only have one child at most. If we find some code which violates this, then we can make a more informed decision as to whether non ex-list OP_NULL's should have all, or only their first child examined.
* Clarify purpose of S_looks_like_bool() | David Mitchell | 2019-06-24 | 1 | -1/+5
|
* make op.c:S_find_and_forget_pmops() non-recursive | David Mitchell | 2019-06-24 | 1 | -13/+27
| For every CV that's freed which has a shared optree (e.g. a closure or between threads), the whole optree is walked looking for PMOPs. Make that walk non-recursive. Contrived code that triggers a stack overflow:
|
|     {
|         my $outer;
|         my $e = 'sub { $outer && ' . join('&&', ('$x') x 100_000) . " }";
|         #print $e, "\n";
|         eval $e;
|     }
|
| Even after this commit, that code still SEGVs due to a separate stack blow in Perl_rpeep().
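Several commits in this log replace a recursive optree walk with a loop. A minimal sketch of that technique, assuming a tree whose nodes carry first-child, next-sibling, and parent links, is shown here. `DemoNode`, `is_match`, and the helper names are hypothetical; `is_match` stands in for "this op is a PMOP".

```c
#include <assert.h>
#include <stddef.h>

typedef struct DemoNode DemoNode;
struct DemoNode {
    DemoNode *first_child;
    DemoNode *next_sibling;
    DemoNode *parent;
    int is_match;   /* hypothetical flag for "this node is of interest" */
};

/* Link child in as the new first child of parent. */
static void demo_add_child(DemoNode *parent, DemoNode *child)
{
    assert(parent != NULL && child != NULL);
    child->parent = parent;
    child->next_sibling = parent->first_child;
    parent->first_child = child;
}

/* Count matching nodes in the subtree at root without recursion, so the
 * C stack depth stays constant however deep the tree is. */
static size_t demo_count_matches(DemoNode *root)
{
    DemoNode *n = root;
    size_t count = 0;

    while (n != NULL) {
        if (n->is_match)
            count++;

        if (n->first_child != NULL) {   /* descend first */
            n = n->first_child;
            continue;
        }
        /* Leaf: climb until a sibling exists or we are back at root. */
        while (n != root && n->next_sibling == NULL)
            n = n->parent;
        if (n == root)
            break;
        n = n->next_sibling;
    }
    return count;
}
```

The parent links take the place of the call stack: where the recursive version would return to its caller, the loop climbs to the parent and looks for the next sibling.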
* Perl_doref(): reindent | David Mitchell | 2019-06-24 | 1 | -93/+94
| Previous commit added a while loop.
* Perl_doref(): make non-recursive | David Mitchell | 2019-06-24 | 1 | -15/+40
| This stops the following code from SEGVing for example:
|
|     my $e = "\$r";
|     $e = "+do{$e}" for 1..70_000;
|     $e = "push \@{$e}, 1";
|     eval $e;
|
| Similarly with a long $a[0][0][0][0].....
|
| This commit causes a slight change in behaviour, in that scalar(o) is now only called once at the end of the top-level doref() call, rather than at the end of processing each child. This should make no functional difference, apart from speeding up compiling infinitesimally.
* document what Perl_doref does | David Mitchell | 2019-06-24 | 1 | -0/+13
|
* make op.c:S_aassign_scan() non-recursive | David Mitchell | 2019-06-24 | 1 | -38/+81
| With this commit and some previous ones, the following code no longer blows the stack:
|
|     my $e = "1";
|     $e = "do { \$x; $e}" for 1..100_000;
|     $e = "\@x = $e";
|     eval $e;
* make Perl_op_linklist() non-recursive | David Mitchell | 2019-06-24 | 1 | -22/+43
|
* Perl_op_linklist(): use OPf_KIDS flags | David Mitchell | 2019-06-24 | 1 | -4/+2
| This function just blindly assumes that cUNOPo->op_first is a valid indication that the op has at least one child. This is successful *most* of the time. Putting in an assertion caused t/op/lvref.t to fail.
|
| Instead, check the OPf_KIDS flag.
* Perl_scalarvoid(): add comment saying what it does | David Mitchell | 2019-06-24 | 1 | -0/+2
| It applies void context, which isn't all that obvious just from the name.
* op.c: S_search_const: remove recursion | David Mitchell | 2019-06-24 | 1 | -3/+7
| There are a couple of places where this function recurses, but they are both effectively tail recursion and can be easily eliminated.
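Eliminating tail recursion, as in the commit above, amounts to replacing each `return f(x)` with an update of the argument inside a loop. The sketch below shows both forms side by side on a toy op type. The names (`DemoOp`, `DemoKind`, `demo_search_const`) and the choice of which kinds recurse into which child are assumptions made for illustration, not a transcription of S_search_const().

```c
#include <assert.h>
#include <stddef.h>

typedef enum { DEMO_CONST, DEMO_NULL, DEMO_LEAVE, DEMO_OTHER } DemoKind;

typedef struct DemoOp {
    DemoKind kind;
    struct DemoOp *first;   /* first child, if any */
    struct DemoOp *last;    /* last child, if any */
} DemoOp;

/* Recursive form: each recursive call is the last thing the function does. */
static DemoOp *demo_search_const_rec(DemoOp *o)
{
    if (o == NULL)
        return NULL;
    switch (o->kind) {
    case DEMO_CONST: return o;
    case DEMO_NULL:  return demo_search_const_rec(o->first);
    case DEMO_LEAVE: return demo_search_const_rec(o->last);
    default:         return NULL;
    }
}

/* Loop form: each tail call becomes "update o and go round again". */
static DemoOp *demo_search_const(DemoOp *o)
{
    while (o != NULL) {
        switch (o->kind) {
        case DEMO_CONST: return o;
        case DEMO_NULL:  o = o->first; break;
        case DEMO_LEAVE: o = o->last;  break;
        default:         return NULL;
        }
    }
    return NULL;
}

/* Sanity check: both forms must find the same op. */
static void demo_check_equivalent(DemoOp *o)
{
    assert(demo_search_const(o) == demo_search_const_rec(o));
}
```

Because nothing happens after the recursive call returns, no per-call state needs to be kept, so the loop form is a mechanical rewrite that uses constant stack space.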
* op.c: add code comments to S_search_const() | David Mitchell | 2019-06-24 | 1 | -0/+9
| plus a few blank lines for readability.
* op.c: S_assignment_type(): make truly trinary | David Mitchell | 2019-06-24 | 1 | -5/+12
| Commit 4fec880468dad87517895b935b19a8d51e98b5a6 converted the static boolean function S_is_list_assignment() into a 3-valued function: S_assignment_type().
|
| However, much of the code body still did things like 'return TRUE'. Replace these with 'return ASSIGN_LIST' etc. These have the same physical values, so there's no functional change here. But it makes the code more consistent and readable.
* Perl_scalar() tail-call optimise | David Mitchell | 2019-06-24 | 1 | -4/+18
| The part of this function that scans the children of e.g.
|
|     $scalar = do { void; void; scalar }
|
| applying scalar context only to the last child: tail call optimise that call to Perl_scalar().
|
| It also adds some extra 'warnings' tests. An earlier attempt at this patch caused some unrelated tests to start emitting spurious 'useless in void context' messages, which are covered by the new tests.
|
| This also showed up that the current method for updating PL_curcop while descending optrees in Perl_scalar/scalarvoid/S_scalarseq is a bit broken. It gets updated every time a newstate op is seen, but haphazardly (and sometimes wrongly) restored to &PL_compiling when going back up the tree. One of the tests is TODO based on PL_curcop being wrong and so the 'no warnings "void"' leaking into an outer scope.
|
| This commit maintains the status quo.
* Perl_scalar(): doc and reorganise complex bool | David Mitchell | 2019-06-24 | 1 | -6/+22
| The if statement that scans children applying void context to all except the last child: 1) document what it does; 2) reorganise it (without changing its logical meaning) to make it simpler to understand, and to make the next commit easier.
* Perl_scalar(): indent block | David Mitchell | 2019-06-24 | 1 | -117/+117
| ... that has just been wrapped in a while loop.
|
| Whitespace-only change.