path: root/op.h
Commit message | Author | Age | Files | Lines
* replace "define\t" with "define " in most "normal" core files. | Yves Orton | 2023-04-29 | 1 | -15/+15
  The main exceptions are dist/, ext/, and Configure related files, which will be updated in a subsequent commit. Files in the cpan/ directory are also omitted as they are not owned by the core.

  '#define' has seven characters, so following it with a \t makes it look like '#define ' when it is not, which then frustrates attempts to find where a given define is.

  If you *know* then you do a git grep -P 'define\s+WHATEVER' but if you don't or you forget, you can get very confused trying to find where a given define is located.

  This fixes all such cases so they actually are 'define WHATEVER' instead.

  If this patch is getting in your way with blame analysis then view it with the -w option to blame.
* Document the meaning of the OPf_SPECIAL flag on the LOOPEX ops | Paul "LeoNerd" Evans | 2022-12-16 | 1 | -0/+2
* cop.h - add support for refcounted filenames in cops under threads | Yves Orton | 2022-11-01 | 1 | -0/+2
  We have a weird bifurcation of the cop logic around threads. With threads we use a char * cop_file member, without it we use a GV * and replace cop_file with cop_filegv.

  The GV * code refcounts filenames and more or less efficiently shares the filename amongst many opcodes. However under threads we were simply copying the filenames into each opcode. This is because in theory opcodes created in one thread can be destroyed in another. I say in theory because as far as I know the core code does not actually do this. But we have tests that you can construct a perl, clone it, and then destroy the original, and have the copy work just fine; this means that opcodes constructed in the main thread will be destroyed in the cloned thread. This in turn means that you can't put SV derived structures into the op-tree under threads. Which is why we can not use the GV * strategy under threads.

  As such this code adds a new struct/type RCPV, which is a refcounted string using shared memory. This is implemented in such a way that code that previously used a char * can continue to do so, as the refcounting data is located at a specific offset before the char * pointer itself. This also allows the len data to be embedded "into" the PV, which allows us to expose macros to access the length of what is in theory a null terminated string.

      struct rcpv {
          UV refcount;
          STRLEN len;
          char pv[1];
      };
      typedef struct rcpv RCPV;

  The struct is sized appropriately on creation in rcpv_new() so that the pv member contains the full string plus a null byte. It then returns a pointer to the pv member of the struct. Thus the refcount and length are embedded at a predictable offset in front of the char *, which means we do not have to change any types for members using this.

  We provide three operations: rcpv_new(), rcpv_copy() and rcpv_free(), which roughly correspond with newSVpv(), SvREFCNT_inc(), SvREFCNT_dec(), and a handful of macros as well. We also expose SAVERCPVFREE which is similar to SAVEGENERICSV but operates on pv's constructed with rcpv_new().

  Currently I have not restricted use of this logic to threaded perls. We simply do not use it in unthreaded perls, but I see no reason we couldn't normalize the code to use this in both cases, except possibly that actually the GV case is more efficient.

  Note that rcpv_new() does NOT use a hash table to dedup strings. Two calls to rcpv_new() with the same arguments will produce two distinct pointers with their own refcount data.

  Refcounting the cop_file data was Tony Cook's idea.
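The layout described in this commit message, a header stored just in front of the char * that callers see, can be sketched in standalone C. This is only an illustration under stated assumptions: the toy_rcpv_* names are hypothetical, size_t stands in for UV and STRLEN, and no locking is done, so it is not the core's rcpv_new() implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct rcpv; size_t replaces UV/STRLEN. */
typedef struct toy_rcpv {
    size_t refcount;
    size_t len;
    char   pv[1];   /* the string lives here; the struct is over-allocated */
} ToyRcpv;

/* Step back from the string pointer to the header in front of it. */
#define TOY_RCPV_FROM_PV(p) \
    ((ToyRcpv *)((char *)(p) - offsetof(ToyRcpv, pv)))

/* Allocate header + string + NUL, return a pointer to the string part. */
char *toy_rcpv_new(const char *s)
{
    size_t len = strlen(s);
    ToyRcpv *r = malloc(offsetof(ToyRcpv, pv) + len + 1);
    if (!r)
        return NULL;
    r->refcount = 1;
    r->len = len;
    memcpy((char *)r + offsetof(ToyRcpv, pv), s, len + 1);
    return (char *)r + offsetof(ToyRcpv, pv);
}

/* Share the string: bump the count and hand back the same pointer. */
char *toy_rcpv_copy(char *pv)
{
    TOY_RCPV_FROM_PV(pv)->refcount++;
    return pv;
}

/* Drop one reference; returns 1 if the storage was released. */
int toy_rcpv_free(char *pv)
{
    ToyRcpv *r = TOY_RCPV_FROM_PV(pv);
    assert(r->refcount > 0);
    if (--r->refcount == 0) {
        free(r);
        return 1;
    }
    return 0;
}

/* Length is read from the header, no strlen() needed. */
size_t toy_rcpv_len(const char *pv)
{
    return ((const ToyRcpv *)(pv - offsetof(ToyRcpv, pv)))->len;
}
```

Because the returned value is an ordinary char *, code that only reads the string needs no changes, which is the property the commit message relies on.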
* Enable `use feature 'module_true'` | chromatic | 2022-11-01 | 1 | -0/+1
  Per RFC 18, whenever `use feature 'module_true';` is enabled in a scope, any file required with `require` has an implicit return value of true and will not trigger the "did not return a true value" error condition.

  This includes logic to use the OPf_SPECIAL flag for OP_RETURN listops to indicate that the module_true feature is in effect when it executes. This flag plays no role unless the OP_RETURN tail calls the pp_leaveeval logic, so it doesn't affect normal sub returns.
* op.c - work around Module::Install::DSL issue | Yves Orton | 2022-09-03 | 1 | -0/+1
  This converts INIT {} blocks from the Module::Install::DSL namespace into BEGIN blocks. This works around the bug reported in GH Issue #16300. (Hopefully, not fully tested yet.) Which in turn should allow us to close the bug in #2754.

  See also PR: #20168 and Issue: #20161 both of which are blocked by this.
* Create a dedicated cMETHOPx_meth() macro | Paul "LeoNerd" Evans | 2022-08-03 | 1 | -4/+4
  Don't just reuse the cSVOPx_sv() one any more, so we no longer have to keep the same storage shape for SVOPs vs METHOPs.
* Define the remaining convenience cMETHOP* macros | Paul "LeoNerd" Evans | 2022-08-03 | 1 | -0/+9
  Several of these were missing: cMETHOP, cMETHOPo, kMETHOP

  Also, the field-accessing ones: cMETHOP_meth cMETHOP_rclass cMETHOPo_meth cMETHOPo_rclass

  This commit adds them all, and uses them to neaten other code where appropriate.
* op.h: Fixups for perlapi, Devel::PPPort | Karl Williamson | 2022-06-22 | 1 | -10/+10
  This commit clarifies in perlapi that these macros take a parameter that is a CPP token, and that Devel::PPPort can't blindly generate test cases for them.
* Move the handy OpTYPE_set() macro out of op.c into op.h where other code can see it | Paul "LeoNerd" Evans | 2022-06-20 | 1 | -0/+8
  Also defend it against side-effects in arguments, by allocating temporary variables.
* op.h: Add comment | Karl Williamson | 2022-05-07 | 1 | -1/+1
* op.h: define missing BASEOP fields (op_sibparent,op_targ) | Richard Leach | 2021-08-24 | 1 | -0/+6
* Rename G_ARRAY to G_LIST; provide back-compat when not(PERL_CORE) | Paul "LeoNerd" Evans | 2021-06-02 | 1 | -3/+3
* A totally new optree structure for try/catch involving three new optypes | Paul "LeoNerd" Evans | 2021-02-14 | 1 | -1/+0
* Add documentation comment to op.h to clarify that LOGOP's ->op_other is an execution-order pointer, not a tree-structural one | Paul "LeoNerd" Evans | 2021-02-08 | 1 | -0/+6
* Initial attempt at feature 'try' | Paul "LeoNerd" Evans | 2021-02-04 | 1 | -0/+1
  * Add feature, experimental warning, keyword
  * Basic parsing
  * Basic implementation as optree fragment

  See also https://github.com/Perl/perl5/issues/18504
* style: Detabify indentation of the C code maintained by the core. | Michael G. Schwern | 2021-01-17 | 1 | -98/+98
  This just detabifies to get rid of the mixed tab/space indentation. Applying consistent indentation and dealing with other tabs are another issue.

  Done with `expand -i`.

  * vutil.* left alone, it's part of version.
  * Left regen managed files alone for now.
* Revert "op.h: Restrict to core certain internal symbols" | Karl Williamson | 2020-12-02 | 1 | -27/+24
  This reverts commit 1d6cadf136bf2c85058a5359fb48b09b3ea9fe6f.

  Due to cpan breakage: GH #18374 #18375 #18376
* op.h: Restrict to core certain internal symbols | Karl Williamson | 2020-11-29 | 1 | -24/+27
  so that they aren't accessible to XS code and won't be picked up by autodoc
* Revert "op.h: Restrict scope of multiconcat symbols to core" | Karl Williamson | 2020-11-15 | 1 | -4/+0
  This reverts commit a5d5855671af6956a8d1a13e419457afdffeb416.

  It turns out that CPAN modules are using these values; whether they should be using them or not, I don't know.
* op.h: Restrict scope of multiconcat symbols to core | Karl Williamson | 2020-11-13 | 1 | -0/+4
  These are internal only
* autodoc.pl: Specify scn for single-purpose files | Karl Williamson | 2020-11-06 | 1 | -4/+0
  Many of the files in perl are for one thing only, and hence their embedded documentation will be for that one thing. By creating a hash here of them, those files don't have to worry about what section that documentation goes under, and so it can be completely changed without affecting them.
* autodoc.pl: Enhance apidoc_section feature | Karl Williamson | 2020-11-06 | 1 | -5/+5
  This feature allows documentation destined for perlapi or perlintern to be split into sections of related functions, no matter where the documentation source is. Prior to this commit the line had to contain the exact text of the title of the section. Now it can be a $variable name that autodoc.pl expands to the title.

  It still has to be an exact match for the variable in autodoc, but now, the expanded text can be changed in autodoc alone, without other files needing to be updated at the same time.
* Make some flags accessible from /ext | Karl Williamson | 2020-10-16 | 1 | -1/+1
  Nothing in the test suite (nor apparently CPAN) had exercised this area of the code, and so this flaw hadn't been discovered. But new code about to be committed does.
* Fix typos | Samanta Navarro | 2020-10-03 | 1 | -1/+1
  For: https://github.com/Perl/perl5/pull/18201

  Committer: Samanta Navarro is now a Perl author.

  To keep 'make test_porting' happy: Increment $VERSION in several files. Regenerate uconfig.h via './perl -Ilib regen/uconfig_h.pl'.
* Use macro instead of its expansion | Karl Williamson | 2020-09-09 | 1 | -1/+1
  We already have a macro that expands to what this code does; it's clearer to use it.
* Reorganize perlapi | Karl Williamson | 2020-09-04 | 1 | -1/+1
  This uses a new organization of sections that I came up with. I asked for comments on p5p, but there were none.
* Change some =head1 to apidoc_section lines | Karl Williamson | 2020-09-04 | 1 | -5/+5
  apidoc_section is slightly favored over head1, as it is known only to autodoc, and can't be confused with real pod.
* Use av_top_index() instead of av_tindex() | Karl Williamson | 2020-08-19 | 1 | -1/+1
  I was never happy with this short form, and other people weren't either. Now that most things are better expressed in terms of av_count, convert the few remaining items that are clearer when referring to an index into using the fully spelled out form.
* Note GIMME is deprecated | Karl Williamson | 2020-08-15 | 1 | -1/+1
* re-work opslab handling to avoid non-portable alignment assumptions | Tony Cook | 2020-07-30 | 1 | -2/+7
  Fixes #17871

  The op slab allocator code made the assumption that since OP and hence OPSLOT contain pointers, the base of each of those would be an integral number of sizeof(pointer) (pointer units) from the beginning of OPSLAB.

  This assumption is non-portable, and broke calculating the location of the slab based on the address of the op slot and the op slot offset on m68k platforms.

  To avoid that, this change now stores the opslot_offset as the offset in pointer units from the beginning of opslab_slots rather than from the beginning of the slab.

  If alignment on a pointer boundary for OPs is required, the compiler will align opslab_slots and since we work in pointer units from there, any allocated op slots will also be aligned.

  If we assume PADOFFSET is no larger than a pointer and requires no stricter alignment and structures in themselves have no stricter alignment requirements then since we work in pointer units all core OP structures should have sufficient alignment (if this isn't true, then it's not a new problem, and not the problem I'm trying to solve here.)

  I haven't been able to test this on m68k hardware (the emulator I tried to use can't maintain a network connection.)
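The offset scheme described in this message can be illustrated with a toy slab: each slot records its distance, in pointer units, from the start of the slot area, and the slab header is recovered with offsetof() instead of assuming the header is a whole number of pointer units long. ToySlab, ToySlot and the toy_* functions are hypothetical and much simpler than the real OPSLAB/OPSLOT code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for OPSLOT and OPSLAB. */
typedef struct toy_slot {
    uint16_t offset;    /* distance from slab->slots, in pointer units */
    void    *payload;   /* stands in for the op stored in the slot */
} ToySlot;

typedef struct toy_slab {
    size_t capacity;    /* pointer units available in slots[] */
    size_t used;        /* pointer units already handed out */
    void  *slots[];     /* the compiler aligns this for pointers */
} ToySlab;

/* Pointer units needed to hold one ToySlot, rounded up. */
#define TOY_SLOT_UNITS \
    ((sizeof(ToySlot) + sizeof(void *) - 1) / sizeof(void *))

ToySlab *toy_slab_new(size_t capacity)
{
    ToySlab *slab = malloc(sizeof(ToySlab) + capacity * sizeof(void *));
    if (slab) {
        slab->capacity = capacity;
        slab->used = 0;
    }
    return slab;
}

/* Carve the next slot out of slots[] and record where it sits. */
ToySlot *toy_slot_alloc(ToySlab *slab)
{
    if (slab->capacity - slab->used < TOY_SLOT_UNITS)
        return NULL;
    assert(slab->used <= UINT16_MAX);
    ToySlot *slot = (ToySlot *)&slab->slots[slab->used];
    slot->offset = (uint16_t)slab->used;
    slot->payload = NULL;
    slab->used += TOY_SLOT_UNITS;
    return slot;
}

/* Walk back to slots[0] in pointer units, then subtract the header size
 * that the compiler actually chose for this platform. */
ToySlab *toy_slot_slab(ToySlot *slot)
{
    void **first = (void **)slot - slot->offset;
    return (ToySlab *)((char *)first - offsetof(ToySlab, slots));
}
```

Measuring from the slot area means the header's size and padding never enter the pointer-unit arithmetic, which is the portability point the commit makes.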
* Revert "op.h: Add additional padding to struct opslab to ensure proper alignment" | Tony Cook | 2020-07-30 | 1 | -3/+0
  This reverts commit a760468c9355bafaee57e94f13705c0ea925d9ca.

  This change is fragile, the next change avoids the need for such manual padding.
* op.h: Add additional padding to struct opslab to ensure proper alignment | John Paul Adrian Glaubitz | 2020-06-20 | 1 | -0/+3
  On m68k, the natural alignment is 16 bits which causes the opslab_slots member of struct opslab to be aligned at a 16-bit offset. Other 32-bit and 64-bit architectures have a natural alignment of at least 32 bits, so the offset is always guaranteed to be at least 32-bit-aligned.

  Fix this by adding additional padding bytes before the opslab_slots member, both for cases when PERL_DEBUG_READONLY_OPS is defined and not defined to ensure the offset of opslab_slots is always 32-bit-aligned.

  On architectures which have a natural alignment of at least 32 bits, the padding does not affect the alignment, offsets or struct size.
* Fix a bunch of repeated-word typos | Dagfinn Ilmari Mannsåker | 2020-05-22 | 1 | -1/+1
  Mostly in comments and docs, but some in diagnostic messages and one case of 'or die die'.
* Remove spurious double spaces before open braces in core C code | Dagfinn Ilmari Mannsåker | 2020-04-13 | 1 | -1/+1
* op.h: remove double space in struct op_argcheck_aux declaration | Dagfinn Ilmari Mannsåker | 2020-04-13 | 1 | -1/+1
* make freed op re-use closer to O(1) | Tony Cook | 2020-03-02 | 1 | -1/+2
  Previously, freed ops were stored as one singly linked list, and a failed search for a free op to re-use could potentially search that entire list, making freed op lookups O(number of freed ops), or given that the number of freed ops is roughly proportional to program size, making the total cost of freed op handling roughly O((program size)**2). This was bad.

  This change makes opslab_freed into an array of linked list heads, one per op size. Since in a practical sense the number of op sizes should remain small, and insertion is amortized O(1), this makes freed op management now roughly O(program size).

  Fixes #17555
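The data structure change described here, one list head per slot size instead of a single list, can be sketched as follows. This is a simplified illustration with hypothetical names (ToyFreed, ToyFreeTable, TOY_MAX_UNITS) and a fixed-size head array; it only looks in the exact-size bucket, which may differ from what the core allocator does.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_UNITS 8   /* hypothetical cap on slot size, in pointer units */

/* A freed slot, reused as a list node. */
typedef struct toy_freed {
    struct toy_freed *next;
    size_t            units;
} ToyFreed;

/* heads[n] is the list of freed slots that are exactly n units long. */
typedef struct toy_free_table {
    ToyFreed *heads[TOY_MAX_UNITS + 1];
} ToyFreeTable;

/* O(1): push the slot onto the list for its size. */
void toy_freed_put(ToyFreeTable *t, ToyFreed *slot, size_t units)
{
    assert(units <= TOY_MAX_UNITS);
    slot->units = units;
    slot->next = t->heads[units];
    t->heads[units] = slot;
}

/* O(1): pop a slot of the requested size, or NULL if none is free.
 * A miss costs one array lookup instead of a walk over every freed slot. */
ToyFreed *toy_freed_get(ToyFreeTable *t, size_t units)
{
    assert(units <= TOY_MAX_UNITS);
    ToyFreed *slot = t->heads[units];
    if (slot)
        t->heads[units] = slot->next;
    return slot;
}
```

With a single list, every failed lookup visits all freed slots; with per-size heads, a failed lookup is a single NULL check.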
* Restrict features in wildcards | Karl Williamson | 2020-02-19 | 1 | -0/+4
  The algorithm for dealing with Unicode property wildcards is to wrap the user-supplied pattern with /miaa. We don't want the user to be able to override the /m and /aa parts.

  Modifiers that are only specifiable as a modifier in a qr or similar op (like /gc) can't be included in things like (?gc). These normally incur a warning that they are ignored, but the texts of those warnings are misleading when using wildcards, so I chose to just make them illegal. Of course that could be changed to having custom useful warning texts, but I didn't think it was worth it.

  I also chose to forbid recursion of using nested \p{}, just from fear that it might lead to issues down the road, and it really isn't useful for this limited universe of strings to match against. Because wildcards currently can't handle '}' inside them, only the single letter \p,\P are valid anyway.

  Similarly, I forbid the '*' quantifier to make it harder for the constructed subpattern to take forever to make any progress and decide to halt. Again, using it would be overkill on the universe of possible match strings.
* op.h: Move some flag bits down | Karl Williamson | 2020-02-19 | 1 | -14/+14
  This is in preparation for adding a new flag bit at the end in a future commit. It could have been added in the unused space that the first of these was moved to, but the new one is less important/used, so I thought it best to come last. The reason to use unused space is to preserve binary compatibility with the bits, and we don't care about that at this point in the development cycle.
* Reimplement tr/// without swashes | Karl Williamson | 2019-11-06 | 1 | -1/+1
  This large commit removes the last use of swashes from core.

  It replaces swashes by inversion maps. This data structure is already in use for some Unicode properties, such as case changing.

  The inversion map data structure leads to straightforward implementation code, so I collapsed the two doop.c routines do_trans_complex_utf8() and do_trans_simple_utf8() into one. A few conditionals could be avoided in the loop if this function were split so that one version didn't have to test for, e.g., squashing, but I suspect these are in the noise in the loop, which has to deal with UTF-8 conversions. This should be faster than the previous implementation anyway. I measured the differences some releases back, and inversion maps were faster than the equivalent swash for up to 512 or 1024 different ranges. These numbers are unlikely to be exceeded in tr/// except possibly in machine-generated ones.

  Inversion maps are capable of handling both UTF-8 and non-UTF-8 cases, but I left in the existing non-UTF-8 implementation, which uses tables, because I suspect it is faster. This means that there is extra code, purely for runtime performance.

  An inversion map is always created from the input, and then if the table implementation is to be used, the table is easily derived from the map. Prior to this commit, the table implementation was used in certain edge cases involving code points above 255. Those cases are now handled by the inversion map implementation, because it would have taken extra code to detect them, and I didn't think it was worth it. That could be changed if I am wrong.

  Creating an inversion map for all inputs essentially normalizes them, and then the same logic is usable for all. This fixes some false negatives in the previous implementation. It also allows for detecting if the actual transliteration can be done in place. Previously, the code mostly punted on that detection for the UTF-8 case.

  This also allows for accurate counting of the lengths of the two sides, fixing some longstanding TODO warning tests.

  A new flag is created, OPpTRANS_CAN_FORCE_UTF8, when the tr/// has a below 256 character resolving to one that requires UTF-8. If this isn't set, the code knows that a non-UTF-8 input won't become UTF-8 in the process, and so can take short cuts. The bit representing this flag is the same as OPpTRANS_FROM_UTF, which is no longer used. That name is left in so that the dozen-ish modules in cpan that refer to it can still compile. AFAICT none of them actually use the flag, as well they shouldn't since it is private to the core.

  Inversion maps are ideally suited for tr/// implementations. An issue with them in general is that for some pathological data, they can become fragmented requiring more space than you would expect, to represent the underlying data. However, the typical tr/// would not have this issue, requiring only very short inversion maps to represent; in some cases shorter than the table implementation.

  Inversion maps are also easier to deparse than swashes. A deparse TODO was also fixed by this commit, and the code to deparse UTF-8 inputs is simplified.

  One could implement specialized data structures for specific types of inputs. For example, a common tr/// form is a single range, like tr/A-Z/a-z/. That could be implemented without a table and be quite fast. An intermediate step would be to use the inversion map implementation always when the transliteration is a single range, and then special case length=1 maps at execution time.

  Thanks to Nicholas Rochemagne for his help on B
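For readers unfamiliar with the term, an inversion map is a sorted list of range start points plus a parallel array saying what each range maps to; a lookup is a binary search followed by an offset addition. The sketch below is a generic illustration with hypothetical names (ToyInvMap, TOY_IDENTITY, toy_invmap_lookup), not the representation or the special values the core uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical marker: code points in this range map to themselves. */
#define TOY_IDENTITY UINT32_MAX

typedef struct toy_invmap {
    size_t          len;     /* number of ranges */
    const uint32_t *starts;  /* sorted range starts; starts[0] must be 0 */
    const uint32_t *maps;    /* what starts[i] maps to, or TOY_IDENTITY */
} ToyInvMap;

/* Find the range containing cp by binary search, then apply its mapping.
 * Within a range, consecutive inputs map to consecutive outputs. */
uint32_t toy_invmap_lookup(const ToyInvMap *m, uint32_t cp)
{
    assert(m->len > 0 && m->starts[0] == 0);

    size_t lo = 0, hi = m->len;          /* invariant: starts[lo] <= cp */
    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;
        if (m->starts[mid] <= cp)
            lo = mid;
        else
            hi = mid;
    }

    if (m->maps[lo] == TOY_IDENTITY)
        return cp;
    return m->maps[lo] + (cp - m->starts[lo]);
}
```

Under this layout, tr/A-Z/a-z/ needs only three ranges: everything below 'A' (identity), 'A' through 'Z' (mapped starting at 'a'), and everything above 'Z' (identity).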
* op.h: Add synonyms for some tr/// values | Karl Williamson | 2019-11-06 | 1 | -0/+3
* Change names of some OPpTRANS flags | Karl Williamson | 2019-11-06 | 1 | -2/+3
  These two flags will shortly become obsolete, replaced by ones with different meanings. This commit makes the new ones the normal ones, and makes the old names synonyms so that code that refers to them can compile.
* doop.c: Change out-of-bounds value | Karl Williamson | 2019-11-06 | 1 | -0/+1
  This currently uses 0xfeedface as a marker for something that isn't a legal value. But that could in fact become legal at some point. This defines a value TR_OOB that can be guaranteed not to become legal.
* op.c, doop.c Use mnemonics instead of numeric values | Karl Williamson | 2019-11-06 | 1 | -0/+5
  For legibility and maintainability
* Change macro name in tr/// code | Karl Williamson | 2019-11-06 | 1 | -0/+3
  This makes it more mnemonic. Also add an explanation in toke.c
* op.h: Remove obsolete #define | Karl Williamson | 2019-11-03 | 1 | -4/+0
  This is no longer used.
* On OP_READLINE, OPf_SPECIAL is set for <<>>, clear for <>. | Nicholas Clark | 2019-11-02 | 1 | -0/+1
* Remove indentation of no-longer #ifdef-guarded #defines | Dagfinn Ilmari Mannsåker | 2019-10-17 | 1 | -7/+7
  Commit 0f9a6232f0af0895807ddd0afae2d5512aa91bf9 removed the #ifdef PERL_OP_PARENT, but left the #define directives indented.
* Signatures: change param count from IV to UV | David Mitchell | 2019-09-23 | 1 | -2/+2
  For some reason I was storing the counts of sub signature parameters and optional parameters as signed ints. Since these can never be negative, change them to UV instead.
* OP_ARGCHECK: use custom aux struct | David Mitchell | 2019-09-23 | 1 | -0/+8
  This op is of class OP_UNOP_AUX. Ops of this class have an op_aux pointer which typically points to a variable-length malloced array of IVs, UVs, etc.

  However in the specific case of OP_ARGCHECK the data stored in the aux struct is fixed. So this commit casts the aux pointer to a struct containing the relevant fields (number of parameters etc), rather than referring to them as aux[0], aux[1] etc.

  This makes the code more readable. Should be no functional changes.
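The readability gain from casting a generic aux pointer to a fixed struct can be shown with a small standalone sketch. The types and the toy_args_ok() helper are hypothetical; the field names only loosely follow the "number of parameters etc" wording above, and the check is a simplified argument-count test, not the core's pp code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed-layout aux data for an argument-count check. */
typedef struct toy_argcheck_aux {
    size_t params;      /* total number of declared parameters */
    size_t opt_params;  /* how many of those are optional */
    char   slurpy;      /* nonzero if a trailing array/hash soaks up extras */
} ToyArgcheckAux;

/* Hypothetical op carrying an untyped aux pointer. */
typedef struct toy_unop_aux {
    void *aux;
} ToyUnopAux;

/* Named fields replace anonymous aux[0], aux[1], ... indexing.
 * Returns 1 if argc is acceptable, 0 otherwise. */
int toy_args_ok(const ToyUnopAux *o, size_t argc)
{
    const ToyArgcheckAux *aux = (const ToyArgcheckAux *)o->aux;
    assert(aux->opt_params <= aux->params);

    int too_few  = argc < aux->params - aux->opt_params;
    int too_many = !aux->slurpy && argc > aux->params;
    return !(too_few || too_many);
}
```

The counts are unsigned in this sketch, in line with the neighbouring "change param count from IV to UV" commit, so the subtraction is guarded by the assertion.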
* Un-revert "[MERGE] add+use si_cxsubix field" | David Mitchell | 2019-09-23 | 1 | -1/+1
  original merge commit: v5.31.3-198-gd2cd363728
  reverted by: v5.31.4-0-g20ef288c53

  The commit following this commit fixes the breakage, which means the revert can be undone.