path: root/globvar.sym
Commit message (Author, Age, Files, Lines)
* regex engine - replace many attribute arrays with one (Yves Orton, 2022-08-06, 1 file, -4/+1)
| This replaces PL_regnode_arg_len, PL_regnode_arg_len_varies, PL_regnode_off_by_arg and PL_regnode_kind with a single PL_regnode_info array, an array of struct regnode_meta that holds the same data as struct fields. Since PL_regnode_name is only used in debugging builds of the regex engine, we keep it separate. If we add more debug properties, it might be good to create a PL_regnode_debug_info[] to hold that data instead.
|
| This means that adding a new property no longer requires modifying any secondary sources, just the struct definition and regen/regcomp.pl.
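The move from parallel arrays to one array of structs can be sketched as follows. This is only an illustration: the opcodes, field names, and values are hypothetical, and the real PL_regnode_info table is generated by regen/regcomp.pl.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical opcodes; the real list is generated, not hand-written. */
enum { DEMO_END = 0, DEMO_EXACT = 1, DEMO_STAR = 2, DEMO_OP_COUNT = 3 };

/* One record per opcode replaces four parallel arrays indexed by opcode.
 * The field names are assumptions chosen to mirror the old array names. */
struct demo_regnode_meta {
    uint8_t type;           /* formerly PL_regnode_kind[op] */
    uint8_t arg_len;        /* formerly PL_regnode_arg_len[op] */
    uint8_t arg_len_varies; /* formerly PL_regnode_arg_len_varies[op] */
    uint8_t off_by_arg;     /* formerly PL_regnode_off_by_arg[op] */
};

static const struct demo_regnode_meta demo_regnode_info[DEMO_OP_COUNT] = {
    [DEMO_END]   = { DEMO_END,   0, 0, 0 },
    [DEMO_EXACT] = { DEMO_EXACT, 0, 1, 0 },
    [DEMO_STAR]  = { DEMO_STAR,  0, 0, 0 },
};

/* Accessors keep call sites short and hide the table layout. */
static inline uint8_t demo_arg_len(unsigned op)
{
    assert(op < DEMO_OP_COUNT);
    return demo_regnode_info[op].arg_len;
}

static inline int demo_arg_len_varies(unsigned op)
{
    assert(op < DEMO_OP_COUNT);
    return demo_regnode_info[op].arg_len_varies != 0;
}
```

With this layout, adding a property means adding one struct field and one column in the generator, rather than a new exported array.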
* globvar.sym - sort PL_reg* (Yves Orton, 2022-08-03, 1 file, -2/+2)
|
* regex engine - Rename PL_reg_name to PL_regnode_name (Yves Orton, 2022-08-03, 1 file, -1/+1)
|
* regex engine - Rename PL_reg_off_by_arg to PL_regnode_off_by_arg (Yves Orton, 2022-08-03, 1 file, -1/+1)
|
* regex engine - Rename PL_regkind to PL_regnode_kind (Yves Orton, 2022-08-03, 1 file, -1/+1)
|
* regex engine - Rename PL_regargvaries to PL_regnode_arg_len_varies (Yves Orton, 2022-08-03, 1 file, -1/+1)
|
* regex engine - Rename PL_regarglen to PL_regnode_arg_len (Yves Orton, 2022-08-03, 1 file, -1/+1)
|
* regen/regcomp.pl - add PL_regargvaries (Yves Orton, 2022-08-03, 1 file, -0/+1)
|
* regex engine - Rename reg_off_by_arg to PL_reg_off_by_arg (Yves Orton, 2022-08-03, 1 file, -0/+1)
| This is in preparation for a future patch, so we can access PL_reg_off_by_arg() from an inline function in regexec.c.
* regen/regcomp.pl - Make regarglen available as PL_regarglen in regexec.c (Yves Orton, 2022-08-03, 1 file, -0/+1)
| In a follow-up patch we will use this data from regexec.c, which currently cannot see the variable.
|
| This changes a comment in regen/mk_invlists.pl, which necessitated rebuilding several files related to Unicode. Only the hashes associated with mk_invlists.pl were changed.
* Make fc(), qr//i thread-safe on participating platforms (Karl Williamson, 2022-06-11, 1 file, -1/+0)
| A long-standing bug in Perl that has gone undetected is that the array created when changing locales, which tells fc() and qr//i matching what the folds are in the new locale, is global. This means that any program has only one set of fold definitions that applies to all threads within it, even if we claim that locales are thread-safe on the given platform.
|
| One possibility for this going undetected so long is that no one is using locales on multi-threaded systems much. Another possibility is that modern UTF-8 locales have the same set of folds as any other one.
|
| It is a simple matter to make the fold array per-thread instead of per-process, and that solves the problem transparently to other code.
|
| I discovered this while stress-testing locale handling under threads. That test will be added in a future commit.
|
| In order to keep from having a dTHX inside foldEQ_locale, it has to have a pTHX_ parameter. This means that the other functions that can be assigned to the same function pointer variables must have an identical signature, which means adding pTHX_ to functions that don't require it. The bodies of all these are known to the compiler, since they are all in inline.h or in the same .c file as where they are called. Hence the compiler can optimize out the unused parameter.
|
| Two calls of STR_WITH_LEN also have to be changed because of C preprocessor limitations; perhaps there is another way to do it that I'm unfamiliar with.
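The signature constraint described above can be illustrated with a small standalone sketch. Everything here is hypothetical: `struct demo_interp` stands in for the per-thread interpreter context that pTHX_ passes, the fold table covers ASCII only, and the comparison is simplified relative to the real foldEQ_locale.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the per-thread context; each thread owns its fold table. */
struct demo_interp {
    unsigned char fold_locale[256];
};

/* All comparison functions share one signature so that any of them can be
 * stored in the same function pointer variable. */
typedef int (*demo_fold_eq_fn)(struct demo_interp *interp,
                               const char *a, const char *b, size_t len);

/* Does not need the context, but must still accept it. */
static int demo_fold_eq_exact(struct demo_interp *interp,
                              const char *a, const char *b, size_t len)
{
    (void)interp; /* unused; an optimizer can drop it when inlining */
    for (size_t i = 0; i < len; i++)
        if (a[i] != b[i])
            return 0;
    return 1;
}

/* Uses the per-thread fold table instead of a process-wide global. */
static int demo_fold_eq_locale(struct demo_interp *interp,
                               const char *a, const char *b, size_t len)
{
    assert(interp != NULL);
    for (size_t i = 0; i < len; i++) {
        unsigned char x = interp->fold_locale[(unsigned char)a[i]];
        unsigned char y = interp->fold_locale[(unsigned char)b[i]];
        if (x != y)
            return 0;
    }
    return 1;
}

/* Fills the table with ASCII-only folds (an assumption for the demo). */
static void demo_init_ascii_folds(struct demo_interp *interp)
{
    for (int c = 0; c < 256; c++)
        interp->fold_locale[c] =
            (unsigned char)((c >= 'A' && c <= 'Z') ? c - 'A' + 'a' : c);
}
```

Because the table lives in the context rather than in a global, two threads with different locales would each pass their own `demo_interp` and never see the other's folds.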
* Add 64bit single-1bit_pos() (Karl Williamson, 2021-07-30, 1 file, -0/+1)
| This will prove useful in future commits on platforms that have 64-bit capability. The deBruijn sequence used here, taken from the internet, differs from the 32-bit one in how it treats a word with no set bits. But this is considered undefined behavior, so that difference is immaterial. Apparently figuring this out uses brute-force methods, so I decided to live with this difference rather than expend the time needed to bring them into sync.
* Create and use single_1bit_pos32() (Karl Williamson, 2021-07-30, 1 file, -0/+1)
| This moves the code from regcomp.c to inline.h that calculates the position of the lone set bit in a U32. This is in preparation for use by other call sites.
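For readers unfamiliar with the technique, here is a minimal sketch of a de Bruijn lookup for the position of a lone set bit in a 32-bit word. It is not Perl's code: the names are hypothetical, the multiplier 0x077CB531 is the widely published 32-bit de Bruijn constant, and the table is filled at run time here for clarity, whereas a production version would more likely use a precomputed constant table.

```c
#include <assert.h>
#include <stdint.h>

/* Widely published 32-bit de Bruijn constant: every 5-bit window of its
 * bit pattern is distinct, so the top 5 bits of (1 << i) * constant
 * identify i uniquely. */
#define DEMO_DEBRUIJN32 UINT32_C(0x077CB531)

static uint8_t demo_pos_table[32];
static int demo_table_ready = 0;

/* Derives the lookup table from the constant itself. */
static void demo_init_table(void)
{
    for (unsigned i = 0; i < 32; i++) {
        uint32_t word = UINT32_C(1) << i;
        demo_pos_table[(uint32_t)(word * DEMO_DEBRUIJN32) >> 27] = (uint8_t)i;
    }
    demo_table_ready = 1;
}

/* Returns the bit position (0..31) of the single set bit in word.
 * A word with zero or several set bits is outside the contract. */
static unsigned demo_single_1bit_pos32(uint32_t word)
{
    assert(word != 0 && (word & (word - 1)) == 0);
    if (!demo_table_ready)
        demo_init_table();
    return demo_pos_table[(uint32_t)(word * DEMO_DEBRUIJN32) >> 27];
}
```

The lazy initialization in this sketch is not thread-safe; it is kept only so the example is self-contained.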
* regnodes.h: Add two convenience bit masks (Karl Williamson, 2020-10-16, 1 file, -0/+2)
| These categorize the many types of EXACT nodes, so that code can refer to a particular subset of such nodes without having to list all of them out. This simplifies some 'if' statements, and makes updating things easier.
* Remove PERL_GLOBAL_STRUCT (Dagfinn Ilmari Mannsåker, 2020-07-20, 1 file, -1/+0)
| This was originally added for MinGW, which no longer needs it, and was still used only by Symbian, which is now removed.
|
| This also leaves perlapi.[ch] empty, but we keep the header for CPAN backwards compatibility.
* Remove PL_freq (Karl Williamson, 2020-07-17, 1 file, -1/+0)
| This hasn't actually been used since commit 8922e4382e9c1488fdbe46a0f52493860dc897a6.
* utf8.c: Make global a warning msg text (Karl Williamson, 2020-01-23, 1 file, -0/+1)
| This is in preparation for it to be raised in other files.
* Revert "Move PL_check to the interp vars to fix threading issues" (Tony Cook, 2019-12-16, 1 file, -0/+1)
| and the associated commits, at least until a way to make wrap_op_checker() work is available.
* Move PL_check to the interp vars to fix threading issues (Stefan Seifert, 2019-12-12, 1 file, -1/+0)
| Fixes issue #14816
* foo_cloexec() under PERL_GLOBAL_STRUCT_PRIVATE (David Mitchell, 2019-02-19, 1 file, -0/+9)
| Fix the various Perl_PerlSock_dup2_cloexec() type functions so that t/porting/libperl.t passes under -DPERL_GLOBAL_STRUCT_PRIVATE builds.
|
| In these builds it is forbidden to have any static variables, but each of these functions (via convoluted macros) has a static var called 'strategy' which records, for each function, whether a run-time probe has been done to determine the best way of achieving close-exec functionality, and the result.
|
| Replace them all with 'global' vars: PL_strategy_dup2 etc.
|
| NB these vars aren't thread-safe but it doesn't really matter, as the worst that can happen is for a redundant probe or two to be done before a suitable "don't probe any more" value is written to the var and seen by all the threads.
* regen/warnings.pl: Fix undefined C behavior (Karl Williamson, 2019-01-05, 1 file, -0/+2)
| This fixes the compiler warning "performing pointer arithmetic on a null pointer has undefined behavior".
|
| There are several ways to fix this. This one was suggested by Tomasz Konojacki++. Instead of trying to point to addresses 1 and 2, two variables are created, and we point to them. const is cast away.
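The idea of replacing fake addresses with the addresses of real objects can be sketched like this. The names are hypothetical and the sketch leaves out everything except the sentinel comparison.

```c
#include <assert.h>
#include <stddef.h>

/* Two real objects whose only job is to have distinct addresses.
 * Before the fix, the sentinels were computed as null pointer plus 1
 * and plus 2, which is undefined behavior. */
static const size_t demo_warn_all_marker = 0;
static const size_t demo_warn_none_marker = 0;

/* const is cast away, as in the commit; nothing writes through these. */
#define DEMO_WARN_ALL  ((size_t *)&demo_warn_all_marker)
#define DEMO_WARN_NONE ((size_t *)&demo_warn_none_marker)

enum demo_warn_kind { DEMO_KIND_ALL, DEMO_KIND_NONE, DEMO_KIND_CUSTOM };

/* Classifies a warnings pointer by comparing it with the sentinels.
 * The sketch does not model a NULL "default" case. */
static enum demo_warn_kind demo_classify(const size_t *warnings)
{
    assert(warnings != NULL);
    if (warnings == DEMO_WARN_ALL)
        return DEMO_KIND_ALL;
    if (warnings == DEMO_WARN_NONE)
        return DEMO_KIND_NONE;
    return DEMO_KIND_CUSTOM;
}
```

Distinct objects are guaranteed distinct addresses, so the comparisons stay reliable without any arithmetic on a null pointer.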
* Make new EBCDIC tables global. (Craig A. Berry, 2018-07-08, 1 file, -0/+3)
| They are defined in the Perl library but referenced in the Perl executable, but the Perl executable can't see them unless they are exported by the library, and some linkers only make a symbol a public export if they've been told to explicitly.
* add PL_sv_zero (David Mitchell, 2017-07-27, 1 file, -0/+1)
| It's like PL_sv_no, except that its string value is "0" rather than "". It can be used, for example, where a pp function wants to push a zero return value on the stack. The next commit will start to use it.
|
| Also update SvIMMORTAL() to be more efficient: it now checks whether the SV's address is in a range rather than individually checking against &PL_sv_undef, &PL_sv_no etc.
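A range check of this kind works when the immortal values are stored contiguously. The sketch below assumes such a layout; the struct, the array name, and the slot order are hypothetical and only meant to show the comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for an SV; only its address matters here. */
struct demo_sv { unsigned flags; };

enum { DEMO_IMMORTAL_COUNT = 4 };

/* Assumed layout: all immortals live in one array (slot order is made up). */
static struct demo_sv demo_sv_immortals[DEMO_IMMORTAL_COUNT];

#define demo_sv_yes   (demo_sv_immortals[0])
#define demo_sv_undef (demo_sv_immortals[1])
#define demo_sv_no    (demo_sv_immortals[2])
#define demo_sv_zero  (demo_sv_immortals[3])

/* One range test instead of one equality test per immortal.
 * Addresses are compared as integers because relational comparison of
 * pointers into unrelated objects is not defined by the C standard. */
static int demo_sv_is_immortal(const struct demo_sv *sv)
{
    uintptr_t addr = (uintptr_t)sv;
    uintptr_t lo = (uintptr_t)&demo_sv_immortals[0];
    uintptr_t hi = (uintptr_t)(demo_sv_immortals + DEMO_IMMORTAL_COUNT);

    assert(sv != NULL);
    return addr >= lo && addr < hi;
}
```

Adding another immortal then costs one more array slot and no extra comparison.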
* PERL_GLOBAL_STRUCT_PRIVATE: fix PL_isa_DOES (David Mitchell, 2017-03-17, 1 file, -0/+1)
| I added the global string constant PL_isa_DOES recently. This caused t/porting/libperl.t to fail under -DPERL_GLOBAL_STRUCT_PRIVATE builds.
|
| This commit makes PL_isa_DOES be declared and defined in a similar way to other such global constants. This is pure cargo-culting - I have no real idea of the point of all the EXTCONST, INIT and globvar.sym stuff.
* Slightly shorten most regex patterns (Karl Williamson, 2015-09-08, 1 file, -0/+1)
| A compiled pattern requires a byte for each non-default modifier, like /i. Previously, the worst case was presumed in allocating the space (every modifier being non-default). Now, only the actual needed space is reserved.
* infnan: new logic for NV_INF and NV_NAN (Jarkko Hietaniemi, 2015-06-12, 1 file, -0/+2)
| The global consts PL_inf and PL_nan have a dual nature: the .nv has the NV, the .u8 has the bytes.
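The "dual nature" can be pictured as a union with a numeric view and a byte view of the same storage. The sketch below is an assumption about the general shape only: it uses `double` in place of NV and hypothetical names, and it initializes through the numeric member, whereas a real build might initialize through the bytes.

```c
#include <assert.h>
#include <float.h>
#include <math.h>
#include <stddef.h>
#include <string.h>

/* Two views of the same storage: raw bytes and the floating-point value. */
union demo_infnan {
    unsigned char u8[sizeof(double)];
    double nv;
};

/* Initialized through the numeric view for this demo. */
static const union demo_infnan demo_inf = { .nv = INFINITY };

/* Rebuilds a value from its byte representation via the byte view. */
static double demo_nv_from_bytes(const unsigned char *bytes)
{
    union demo_infnan v;
    assert(bytes != NULL);
    memcpy(v.u8, bytes, sizeof v.u8);
    return v.nv;
}

/* Positive infinity is the only double greater than DBL_MAX. */
static int demo_is_pos_inf(double x)
{
    return x > DBL_MAX;
}
```

Having the byte view available lets a build spell out the exact bit pattern where the compiler cannot produce the constant from an expression.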
* globvar.sym: include PL_ prefix in names (David Mitchell, 2014-09-22, 1 file, -74/+74)
| Prepending 'PL_' to each line in globvar.sym a) makes makedef.pl slightly simpler, and b) makes it easier to spot all usage of a particular var when you do 'git grep PL_foo'.
* Automate processing of op_private flags (David Mitchell, 2014-09-10, 1 file, -0/+5)
| Add a new config file, regen/op_private, which contains all the information about the flags and descriptions for the OP op_private field.
|
| Previously, the flags themselves were defined in op.h, accompanied by textual descriptions (sometimes inaccurate or incomplete). For display purposes, there were short labels for each flag found in Concise.pm, and another set of labels for Perl_do_op_dump() in dump.c. These two sets of labels differed from each other in spelling (e.g. REFC versus REFCOUNT), and differed in completeness and accuracy.
|
| With this commit, all the data to generate the defines and the labels is derived from a single source, and is generated automatically by 'make regen'. It also contains complete data on which bits are used for what by each op. So any attempt to add a new flag for a particular op where that bit is already in use will raise an error in make regen. This compares to the previous practice of reading the descriptions in op.h and hoping for the best.
|
| It also makes use of data in regen/opcodes: for example, regen/op_private specifies that all ops flagged as 'T' get the OPpTARGET_MY flag.
|
| Since the set of labels used by Concise and Perl_do_op_dump() differed, I've standardised on the Concise version. Thus this commit changes the output produced by Concise only marginally, while Perl_do_op_dump() is considerably different. As well as the change in labels (and missing labels), Perl_do_op_dump() formerly had a bug whereby any unrecognised bits would not be shown if there was at least one recognised bit. So while Concise displayed (and still does) "LVINTRO,2", Perl_do_op_dump() has changed:
|
|     - PRIVATE = (INTRO)
|     + PRIVATE = (LVINTRO,0x2)
|
| Concise has mainly changed in that a few op/bit combinations weren't being shown symbolically, and now are. I've avoided fixing the ones that would break tests; they'll be fixed up in the next few commits.
|
| A few new OPp* flags have been added:
|
|     OPpARG1_MASK
|     OPpARG2_MASK
|     OPpARG3_MASK
|     OPpARG4_MASK
|     OPpHINT_M_VMSISH_STATUS
|     OPpHINT_M_VMSISH_TIME
|     OPpHINT_STRICT_REFS
|
| The last three are analogues for existing HINT_* flags. The former four reflect that many ops use some of the lower few bits of op_private to indicate how many args the op expects. While (for now) this is still displayed as, e.g. "LVINTRO,2", the definitions in regen/op_private now fully account for which ops use which bits for the arg count.
|
| There is a new module, B::Op_private, which allows this new data to be accessed from Perl. For example,
|
|     use B::Op_private;
|     my $name  = $B::Op_private::bits{aelem}{7};   # OPpLVAL_INTRO
|     my $value = $B::Op_private::defines{$name};   # 128
|     my $label = $B::Op_private::labels{$name};    # LVINTRO
|
| There are several new constant PL_* tables. PL_op_private_valid[] specifies, for each op number, which bits are valid for that op. In a couple of commits' time, op_free() will use this on debugging builds to assert that no ops gained any private flags which we don't know about. In fact it was by using such a temporary assert repeatedly against the test suite that I tracked down most of the inconsistencies and errors in the current flag data.
|
| The other PL_op_private_* tables contain a compact representation of all the ops/bits/labels in a format suitable for Perl_do_op_dump() to decode Op_private.
|
| Overall, the perl binary is about 500 bytes smaller on my system.
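The kind of check that a per-op table of valid bits makes possible can be sketched as follows. The op numbers and the arg-count mask are hypothetical; only the value 128 (bit 7) for the LVINTRO flag on aelem comes from the commit message above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical op numbers for the demo. */
enum { DEMO_OP_NULL = 0, DEMO_OP_AELEM = 1, DEMO_OP_COUNT = 2 };

#define DEMO_OPpLVAL_INTRO 0x80u /* bit 7, value 128 as in the message */
#define DEMO_OPpARG2_MASK  0x03u /* assumed arg-count bits for the demo */

/* For each op number, the set of op_private bits that op may carry. */
static const uint8_t demo_op_private_valid[DEMO_OP_COUNT] = {
    [DEMO_OP_NULL]  = 0x00,
    [DEMO_OP_AELEM] = DEMO_OPpLVAL_INTRO | DEMO_OPpARG2_MASK,
};

/* Returns the bits set in op_private that the table does not allow for
 * this op; zero means every set bit is accounted for. */
static uint8_t demo_unknown_private_bits(unsigned op, uint8_t op_private)
{
    assert(op < DEMO_OP_COUNT);
    return (uint8_t)(op_private & (uint8_t)~demo_op_private_valid[op]);
}
```

A debugging build could assert that this function returns zero when an op is freed, which is the style of check the message says op_free() will gain.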
* Work properly under UTF-8 LC_CTYPE locales (Karl Williamson, 2014-01-27, 1 file, -0/+1)
| This large (sorry, I couldn't figure out how to meaningfully split it up) commit causes Perl to fully support LC_CTYPE operations (case changing, character classification) in UTF-8 locales.
|
| As a side effect it resolves [perl #56820].
|
| The basics are easy, but there were a lot of details, and one troublesome edge case discussed below.
|
| What essentially happens is that when the locale is changed to a UTF-8 one, a global variable is set TRUE (FALSE when changed to a non-UTF-8 locale). Within the scope of 'use locale', this variable is checked, and if TRUE, the code that Perl uses for non-locale behavior is used instead of the code for locale behavior. Since Perl's internal representation is UTF-8, we get UTF-8 behavior for a UTF-8 locale.
|
| More work had to be done for regular expressions. There are three cases.
|
| 1) The character classes \w, [[:punct:]] needed no extra work, as the changes fall out from the base work.
|
| 2) Strings that are to be matched case-insensitively. These form EXACTFL regops (nodes). Notice that if such a string contains only characters above-Latin1 that match only themselves, the node can be downgraded to an EXACT-only node, which presents better optimization possibilities, as we now have a fixed string known at compile time to be required to be in the target string to match. Similarly, if all characters in the string match only other above-Latin1 characters case-insensitively, the node can be downgraded to a regular EXACTFU node (match, folding, using Unicode, not locale, rules). The code changes for this could be done without accepting UTF-8 locales fully, but there were edge cases which needed to be handled differently if I stopped there, so I continued on.
|
| In an EXACTFL node, all such characters are now folded at compile time (just as before this commit), while the other characters whose folds are locale-dependent are left unfolded. This means that they have to be folded at execution time based on the locale in effect at the moment. Again, this isn't a change from before. The difference is that now some of the folds that need to be done at execution time (in regexec) are potentially multi-char. Some of the code in regexec was trivial to extend to account for this because of existing infrastructure, but the part dealing with regex quantifiers had to have more work.
|
| Also the code that joins EXACTish nodes together had to be expanded to account for the possibility of multi-character folds within locale handling. This was fairly easy, because it already has infrastructure to handle these under somewhat different circumstances.
|
| 3) In bracketed character classes, represented by ANYOF nodes, a new inversion list was created giving the characters that should be matched by this node when the runtime locale is UTF-8. The list is ignored except under that circumstance. To do this, I created a new ANYOF type which has an extra SV for the inversion list.
|
| The edge case that caused the most difficulty is folding involving the MICRO SIGN, U+00B5. It folds to the GREEK SMALL LETTER MU, as does the GREEK CAPITAL LETTER MU. The MICRO SIGN is the only 0-255 range character that folds to outside that range. The issue is that it doesn't naturally fall out that it will match the CAP MU. If we let the CAP MU fold to the small mu at compile time (which it can because both are above-Latin1 and so the fold is the same no matter what locale is in effect), it could appear that the regnode can be downgraded away from EXACTFL to EXACTFU, but doing so would cause the MICRO SIGN to not case-insensitively match the CAP MU. This could be special cased in regcomp and regexec, but I wanted to avoid that. Instead the mktables tables are set up to include the CAP MU as a character whose presence forbids the downgrading, so the special casing is in mktables, and not in the C code.
* add PL_reg_intflags_name to globvar.sym - used in debugging regex engine (Yves Orton, 2013-06-22, 1 file, -0/+1)
|
* Rename PL_interp_size_5_16_0 to PL_interp_size_5_18_0. (Nicholas Clark, 2013-02-19, 1 file, -1/+1)
|
* regcomp.c: generate folded for EXACTF and EXACTFU (Karl Williamson, 2011-10-17, 1 file, -0/+1)
| regcomp.c folds the string in these two nodes except in one case. Change that case to correspond with the predominant behavior. This enables future optimizations.
* Eliminate global.sym, as makedef.pl can generate it internally. (Nicholas Clark, 2011-08-25, 1 file, -1/+0)
| global.sym was a file listing the exported symbols, generated by regen/embed.pl from embed.fnc and regen/opcodes, which was only used by makedef.pl.
|
| Move the code that generates global.sym from regen/embed.pl to makedef.pl, and thereby eliminate the need to ship a 907-line generated file.
* Handle PL_sh_path better in globvar.sym and makedef.pl (Nicholas Clark, 2011-08-23, 1 file, -0/+1)
| PL_sh_path needs some form of special case because it is conditionally defined either in perlvars.h or perl.h, but globvar.sym mentions all symbols unconditionally, and under -DPERL_GLOBAL_STRUCT perlvars.h is parsed as an unconditional skip list.
* Add PL_valid_types_{IVX,NVX,PVX,RV,IV_set,NV_set} into globvar.sym. (Nicholas Clark, 2011-07-15, 1 file, -0/+6)
| f1fb874192252653 added these 6 new global variables, but omitted to add them to the list of exported symbols.
* Sort globvar.sym lexically. (Nicholas Clark, 2011-07-15, 1 file, -3/+3)
|
* Move PL_runops_{std,dbg} to perl.h, and make them const. (Nicholas Clark, 2011-06-12, 1 file, -0/+2)
| They exist solely to ensure that Perl_runops_standard and Perl_runops_debug are linked in - nothing assigns to either variable, and nothing reads them.
* Move PL_global_struct_size, PL_interp_size{,_5_16_0} to perl.h (Nicholas Clark, 2011-06-12, 1 file, -0/+3)
| Make them const U16 - they should have been const from the start.
* Move PL_{revision,version,subversion} to perl.h, making them const U8. (Nicholas Clark, 2011-06-12, 1 file, -0/+3)
| To get the initialisation to work, the location of #include patchlevel.h needs to be moved.
* Move PL_{No,Yes,hexdigit} from perlvars.h to perl.h, as all are const char[] (Nicholas Clark, 2011-06-12, 1 file, -0/+3)
| They were converted in perl.h from const char[] to #define in 31fb120917c4f65d, then re-instated as const char[], but in perlvars.h, in 3fe35a814d0a98f4. There's no need for compile-time constants to jump through the hoops of perlvars.h, even for Symbian, as the various "EXTCONST" variables already in perl.h demonstrate.
|
| These were the only 3 users of the PERLVARISC macro, so eliminate that, and all related code.
* Create a lookup table for magic vtables from magic type, PL_magic_data. (Nicholas Clark, 2011-06-11, 1 file, -0/+1)
| Use it to eliminate the large switch statement in Perl_sv_magic().
|
| As the table needs to be keyed on magic type, which is expressed as C character constants, the order depends on the compiler's character set. Frustratingly, EBCDIC variants don't agree on the code points for '~' and ']', which we use here. Instead of having (at least) 4 tables, get the local runtime to sort the table for us. Hence the regen script writes out the (unsorted) mg_raw.h, which generate_uudmap sorts to generate mg_data.h.
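Letting the local character set decide the order can be sketched with a table that is written unsorted and then sorted by character code on the machine that uses it. This is an illustration only: the names and entries are hypothetical, and the sketch sorts at program start, whereas the commit describes the sort happening in a build-time helper that emits a header.

```c
#include <assert.h>
#include <stdlib.h>

/* One entry per magic type: the type character and a table index. */
struct demo_magic_entry {
    char type;
    int vtable_index;
};

/* Written in no particular order; the right order depends on whether
 * the execution character set is ASCII or some EBCDIC variant. */
static struct demo_magic_entry demo_magic_table[] = {
    { '~', 0 },
    { ']', 1 },
    { 'E', 2 },
};

#define DEMO_MAGIC_COUNT \
    (sizeof demo_magic_table / sizeof demo_magic_table[0])

static int demo_table_sorted = 0;

/* Orders entries by the local code point of their type character. */
static int demo_cmp_type(const void *a, const void *b)
{
    unsigned char x = (unsigned char)((const struct demo_magic_entry *)a)->type;
    unsigned char y = (unsigned char)((const struct demo_magic_entry *)b)->type;
    return (x > y) - (x < y);
}

static void demo_sort_magic_table(void)
{
    qsort(demo_magic_table, DEMO_MAGIC_COUNT,
          sizeof demo_magic_table[0], demo_cmp_type);
    demo_table_sorted = 1;
}

/* Binary search by type; returns -1 for an unknown type. */
static int demo_magic_vtable_index(char type)
{
    struct demo_magic_entry key = { type, 0 };
    const struct demo_magic_entry *hit;

    assert(demo_table_sorted);
    hit = bsearch(&key, demo_magic_table, DEMO_MAGIC_COUNT,
                  sizeof demo_magic_table[0], demo_cmp_type);
    return hit ? hit->vtable_index : -1;
}
```

Because the comparison uses whatever code points the compiler assigned, the same source produces a correctly ordered table on any character set.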
* Provide the names of the magic vtables in PL_magic_vtable_names[]. (Nicholas Clark, 2011-06-11, 1 file, -0/+1)
| As it's a 1 to 1 mapping with the vtables in PL_magic_vtables[], refactor Perl_do_magic_dump() to index into it directly to find the name for an arbitrary mg_virtual, avoiding a long switch statement.
* Replace PL_vtbl_* with an array PL_magic_vtables. (Nicholas Clark, 2011-06-11, 1 file, -29/+1)
| Define each PL_vtbl_* name as a macro which expands to the correct array element. Using a single array instead of multiple named variables will allow the simplification of various pieces of code.
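The macro-per-element pattern, together with a parallel names array, can be sketched as below. All identifiers are hypothetical stand-ins, and the vtable struct is reduced to two function pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Index of each vtable within the single array. */
enum demo_want_vtbl {
    demo_want_vtbl_sv,
    demo_want_vtbl_env,
    demo_want_vtbl_pos,
    demo_magic_vtable_max
};

/* Reduced stand-in for a magic vtable. */
struct demo_mgvtbl {
    int (*get)(void *sv);
    int (*set)(void *sv);
};

/* One array replaces many separately named variables. */
static const struct demo_mgvtbl demo_magic_vtables[demo_magic_vtable_max] = {
    { NULL, NULL }, { NULL, NULL }, { NULL, NULL },
};

/* Names in the same order, so one index serves both arrays. */
static const char *const demo_magic_vtable_names[demo_magic_vtable_max] = {
    "sv", "env", "pos",
};

/* Old-style names survive as macros expanding to array elements. */
#define demo_vtbl_sv  (demo_magic_vtables[demo_want_vtbl_sv])
#define demo_vtbl_env (demo_magic_vtables[demo_want_vtbl_env])
#define demo_vtbl_pos (demo_magic_vtables[demo_want_vtbl_pos])

/* Finds a vtable's name by pointer arithmetic instead of a switch.
 * The pointer must point into demo_magic_vtables. */
static const char *demo_vtable_name(const struct demo_mgvtbl *vtbl)
{
    ptrdiff_t index = vtbl - demo_magic_vtables;

    assert(index >= 0 && index < demo_magic_vtable_max);
    return demo_magic_vtable_names[index];
}
```

Existing code that takes the address of `demo_vtbl_env` keeps compiling, while new code can loop over the array or index the names table directly.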
* Abolish PL_vtbl_sig. It's been all 0s since it was added in 5.0 alpha 2. (Nicholas Clark, 2011-06-11, 1 file, -1/+0)
| Magic with a NULL vtable is equivalent to magic with a vtable of all 0s. On CPAN, only Apache::Peek's code for 5.005 is referencing it.
* Restore building with -DPERL_GLOBAL_STRUCT, broken since 4dc941f7cb795735. (Nicholas Clark, 2011-05-22, 1 file, -0/+1)
| As PL_charclass is a constant, it doesn't need to be accessed via the global struct. It should be exported via globvar.sym, not PERLVARA() in perlvars.h.
|
| [With a PERLVARA() it all compiles perfectly, once C<dVAR>s are added where now needed, but the build loops forever because the (real) charclass array is never initialised]
* export PL_core_reg_engine so it's visible to the re module (Tony Cook, 2011-02-19, 1 file, -0/+1)
| Win32 builds have been broken since de1ac46b without this.
* Add fold_latin1 to the list of exported variable symbols (unbreaking Win32+gcc build) (Max Maischein, 2010-11-24, 1 file, -0/+1)
* Add ${^GLOBAL_PHASE} (Florian Ragwitz, 2010-11-14, 1 file, -0/+1)
| This exposes the current top-level interpreter phase to perl space.
* Add simple_bitmask and varies_bitmask to globvar.sym. (Nicholas Clark, 2010-05-28, 1 file, -0/+2)
| global.sym is generated; is there a way to automate globvar.sym?
* Complete the fix for Win32 link following commits 88e1f1a and 406ca27 (Steve Hay, 2009-11-06, 1 file, -0/+1)
|