path: root/perl.h
Commit message | Author | Age | Files | Lines
* Revert "PERL_SET_LOCALE_CONTEXT: Actually do something" | Karl Williamson | 2023-05-08 | 1 | -1/+1
    This reverts commit 9e254b0b5b145c9bfc3053e778e9f7fbb3760b45.
    Date: Wed Apr 5 12:26:26 2023 -0600

    This fixes GH #21040

    The reverted commit caused failures on platforms using the musl library, notably Alpine Linux. I came up with a fix for that, which instead broke Windows. In looking at that I realized the original fix is incomplete, and that things are too precarious to try to fix so close to 5.38.0. For example, I spent hours chasing a %p format that printed 0 for what turned out to be a non-NULL string pointer. I think it has to do with the fact that the failing code is in the middle of transitioning between threads, and the printing got confused as a result.

    The reverted commit was part of a series fixing #20155 and #20231. But the earlier part of the series succeeded in fixing those, without that commit, so reverting it should not cause things to break as a result.

    This whole issue has to do with locales and threading. Those still don't play well together. I have a series of well over 200 commits that address this situation, for applying in early 5.39. My point is that we are a long way from solving these kinds of issues; and they don't come up that much in the field because they just don't get used. The reverted commit would help if it worked properly, but it's not the only thing wrong by a long shot.
* PERL_SET_LOCALE_CONTEXT: Actually do something | Karl Williamson | 2023-04-18 | 1 | -1/+1
    This is a macro that does a quick check before calling a function to actually do the work. The sense of that check was reversed. The check is repeated in the function, but this time correctly. The bottom line was that if the function should be called, the macro failed to call it. If it shouldn't be called, the macro would call it, but the check in the function caused it to return without doing anything. Hence this whole thing was a no-op.

    However, I can't get things to fail without this patch. ISTR this was the result of a BBC, with another one likely affected, but I can't find them now.
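The reversed-guard failure described in this commit message can be sketched in a few lines of C. This is only an illustration under stated assumptions: `sync_context`, `SYNC_CONTEXT_BUGGY`, `SYNC_CONTEXT_FIXED`, and the two driver functions are hypothetical names, not the actual perl.h macro or the function it calls.

```c
#include <stddef.h>

/* Hypothetical stand-ins; none of these names come from perl.h. */
static int work_calls = 0;      /* counts how often real work happened */

/* The function repeats the check, this time with the correct sense. */
static void sync_context(const void *current, const void *wanted)
{
    if (current == wanted)
        return;                 /* already in sync: nothing to do */
    work_calls++;
}

/* Buggy guard: calls the function only when no work is needed. */
#define SYNC_CONTEXT_BUGGY(cur, want) \
    do { if ((cur) == (want)) sync_context((cur), (want)); } while (0)

/* Fixed guard: calls the function only when the contexts differ. */
#define SYNC_CONTEXT_FIXED(cur, want) \
    do { if ((cur) != (want)) sync_context((cur), (want)); } while (0)

/* Drive the buggy macro with a mismatched pair and a matched pair. */
static int run_buggy(void)
{
    int a = 0, b = 0;
    work_calls = 0;
    SYNC_CONTEXT_BUGGY(&a, &b);   /* differ: the macro skips the call */
    SYNC_CONTEXT_BUGGY(&a, &a);   /* same: called, but returns early */
    return work_calls;
}

/* Same inputs through the fixed macro. */
static int run_fixed(void)
{
    int a = 0, b = 0;
    work_calls = 0;
    SYNC_CONTEXT_FIXED(&a, &b);   /* differ: work is done once */
    SYNC_CONTEXT_FIXED(&a, &a);   /* same: no call at all */
    return work_calls;
}
```

With the buggy guard no work is ever done, which matches the "no-op" described in the message; with the guard's sense corrected the mismatched pair triggers exactly one unit of work.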
* t/porting/bincompat.t - test the code itself not just the output | Yves Orton | 2023-03-30 | 1 | -1/+1
    Our checks on the define info we expose via Internals::V(), especially the sorted part, did not really work properly as they only checked defines that are actually exposed in our standard builds. Many of the defines that are exposed in this list are special cases that would not be enabled in a normal build we test under CI, and indeed prior to this patch it was possible for us to produce unsorted output if certain defines were enabled.

    This patch adds checks that read the actual code. It checks that the define and the string are the same, and it checks that strings would be output in sorted order assuming every define was enabled.

    There are two historical exceptions where the string we show and the define used internally are different, but we work around these two cases with a special-case hash.
* perl.h - silence certain warnings on HPUX globally. | Yves Orton | 2023-03-29 | 1 | -0/+16
    There are two warning classes which account for a very large number of the warnings produced when building on HPUX Itanium. We know the cause of these warnings and we are ok with ignoring them.

    One set comes from our memory wrap checks, where we end up doing a comparison against constants in certain conditions. See the comments in handy.h line 2723 related to PERL_MALLOC_WRAP.

    The other set comes from our common "trick" of doing OO in C code with casting. This is the foundation of how we manage SV types and how we manage regular expression ops (regops). If this logic really was a problem then we would have lots of test failures and segfaults, and we do not, so we can silence them.
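For readers unfamiliar with the "OO in C with casting" trick mentioned above, here is a minimal textbook sketch. It is not how SVs or regops are actually laid out; the `shape`, `circle`, and `rect` types are invented for illustration, and this form keeps the common header as the first member so the casts are well defined.

```c
enum shape_kind { SHAPE_CIRCLE, SHAPE_RECT };

/* Common header: every concrete type starts with one of these. */
struct shape { enum shape_kind kind; };

struct circle { struct shape head; double radius; };
struct rect   { struct shape head; double width, height; };

/* Dispatch on the tag, then cast the generic pointer to the concrete type. */
static double shape_area(const struct shape *s)
{
    switch (s->kind) {
    case SHAPE_CIRCLE: {
        const struct circle *c = (const struct circle *)s;
        return 3.141592653589793 * c->radius * c->radius;
    }
    case SHAPE_RECT: {
        const struct rect *r = (const struct rect *)s;
        return r->width * r->height;
    }
    }
    return 0.0;   /* unknown tag */
}
```

Callers hold only a `struct shape *` and rely on the tag to pick the right cast. Compilers that warn about such casts cannot see that the tag guarantees the concrete type, which is the sort of diagnostic a project may decide to silence.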
* perl.h - remove redundant /*EMPTY*/ comment | Yves Orton | 2023-03-22 | 1 | -1/+1
    mauke: it was added by Andy Lester in 6f207bd3ddac24959aa7f00f2d7a66f116dcc7ed
    mauke: when he replaced '/*EMPTY*/;' statements by 'NOOP;'
    mauke: I would also remove the comment
* fix precedence issue with NOOP | Zefram | 2023-03-22 | 1 | -1/+1
    Parens are required or precedence issues can occur.
* regen/embed.pl - change _aDEPTH and _pDEPTH to not have a leading underbar | Yves Orton | 2023-03-19 | 1 | -7/+19
    The leading underbar is reserved by C. These defines are debugging-only "recursion" depth related counters injected into the function macro wrappers when a function is marked as 'W', much the same way that aTHX_ and pTHX_ are when building under threaded builds. The functions are expected to increment the depth parameter themselves.

    Note that "recursion" is quoted above because in practice currently they are only used by the regex engine when recursing virtually, and they do not relate to true C stack related recursion. (But they could be used for tracking C level recursion under debugging if someone needed it.)
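The injection of a debugging-only depth parameter can be sketched as follows. The `_DEMO` macro names, the `DEMO_DEBUGGING` switch, and the `walk` function are all hypothetical; the real definitions are generated by regen/embed.pl and may differ in type and spelling.

```c
/* Hypothetical switch standing in for a debugging build. */
#define DEMO_DEBUGGING 1

#if DEMO_DEBUGGING
#  define pDEPTH_DEMO , unsigned depth   /* extra parameter in the prototype */
#  define aDEPTH_DEMO , depth            /* extra argument at the call site  */
#else
#  define pDEPTH_DEMO
#  define aDEPTH_DEMO
#endif

static unsigned deepest = 0;   /* deepest level reached, for inspection */

/* The callee bumps its own depth counter, as the commit message describes. */
static void walk(unsigned remaining pDEPTH_DEMO)
{
#if DEMO_DEBUGGING
    depth++;
    if (depth > deepest)
        deepest = depth;
#endif
    if (remaining > 0)
        walk(remaining - 1 aDEPTH_DEMO);
}

/* Top-level entry: provides the initial `depth` the macros refer to. */
static unsigned run_walk(unsigned levels)
{
#if DEMO_DEBUGGING
    unsigned depth = 0;
#endif
    deepest = 0;
    walk(levels aDEPTH_DEMO);
    return deepest;
}
```

When `DEMO_DEBUGGING` is 0 both macros expand to nothing, so the same source compiles to a function without the extra parameter.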
* locale.c: Remove one use of nl_langinfo_l() | Karl Williamson | 2023-03-13 | 1 | -6/+8
    The limited POSIX guarantees of thread safety for nl_langinfo_l() aren't enough for our uses, and I was naive to think that a simple Configure probe could rule out all possible thread-safety issues that might exist in a libc call.

    I don't remember what the platforms were that falsely tested ok for the probe, but if it were necessary to find out, revert this patch, and start a smoke-me test.

    What that Configure probe did was find one particular point of non-safety. And it turns out various platforms pass that, but don't have a thread-safe nl_langinfo_l() generally.

    There are two calls to nl_langinfo_l() in the code. This commit removes one, where the major advantage of using nl_langinfo_l() over plain nl_langinfo() was efficiency. There still had to be an alternate implementation available that used plain nl_langinfo(). Since we can't guarantee that the _l implementation doesn't have bugs, simply remove it, and the existing alternative gets automatically used.

    The remaining use of nl_langinfo_l() is only when using glibc, and is disabled by default, requiring an explicit Configure parameter to enable. I have never seen a case where the glibc implementation failed to be thread-safe. This use may be enabled by default at some point, but not until early in a development cycle.
* perl.h - make sure macros using negation are paren wrapped | Yves Orton | 2023-03-07 | 1 | -3/+3
    Noticed by Zefram
* perldelta.pod - document $SIG{__DIE__} and compile error count bugfixes | Yves Orton | 2023-02-20 | 1 | -0/+2
    We have fixed bugs related to $SIG{__DIE__} being inconsistently triggered during eval, and we have fixed bugs with compilation inconsistently stopping after 10 errors.

    This patch also includes a micro-tweak to perl.h to allow the threshold to be sanely overridden in Configure.
* perl.h, pp_ctl.c - switch to standard way of terminating compilation | Yves Orton | 2023-02-20 | 1 | -3/+1
    I did not fully understand the use of yyquit() when I implemented the SYNTAX_ERROR related stuff. It is not needed, and switching to this makes eval compile error messages more consistent.
* perl.h: Make sure PERL_IMPLICIT_CONTEXT doesn't come back | Karl Williamson | 2023-02-10 | 1 | -4/+8
    This is an obsolete name, retained for back compat with CPAN. Make sure the core doesn't have it defined.
* snprintf() calls need to have proper radix | Karl Williamson | 2023-02-10 | 1 | -3/+10
    Calls to libc snprintf() were not updated when perl was fixed to change the radix character to the proper one based on whether or not 'use locale' is in effect. Perl-level code is unaffected, but core and XS code is.

    This commit wraps snprintf() calls with the macros designed for the purpose, long used for similar situations elsewhere in the code.

    Doing this requires the thread context. I achieved this in a few places by a dTHX, instead of assuming a caller would have the context already available, and adding a pTHX_ parameter. I tried doing it the other way, and got a few breakages in our test suite. Formatting already requires significant CPU time, so this addition should just be in the noise.

    This bug was found by new tests that will be added in a future commit.
* Create a specific SV type for object instances | Paul "LeoNerd" Evans | 2023-02-10 | 1 | -7/+8
* perl.h - break up * lined comment leaders and pod comments | Yves Orton | 2023-02-09 | 1 | -4/+9
    Having half of the comment have the * on the left side is confusing for humans and especially so for programs. Split the two styles into two comments.
* Correct typos as per GH 20435 | James E Keenan | 2022-12-29 | 1 | -1/+1
    In GH 20435 many typos in our C code were corrected. However, this pull request was not applied to blead and developed merge conflicts. I extracted diffs for the individual modified files and applied them with 'git apply', excepting four files where patch conflicts were reported. Those files were:

        handy.h
        locale.c
        regcomp.c
        toke.c

    We can handle these in a subsequent commit.

    Also, had to run these two programs to keep 'make test_porting' happy:

        $ ./perl -Ilib regen/uconfig_h.pl
        $ ./perl -Ilib regen/regcomp.pl regnodes.h
* Add HvNAMEfARG() macro | Paul "LeoNerd" Evans | 2022-12-24 | 1 | -0/+2
* sv.c - add support for HvNAMEf and HvNAMEf_QUOTEDPREFIX formats | Yves Orton | 2022-12-22 | 1 | -0/+2
    They are similar to SVf and SVf_QUOTEDPREFIX but take an HV * argument and use HvNAME() and related macros to extract the string. This is helpful as it makes constructing error messages from a stash (HV *) easier. It is the caller's responsibility to ensure that the HV is actually a stash.
* Define five new operator precedence levels | Paul "LeoNerd" Evans | 2022-12-16 | 1 | -6/+13
    Assignment operators (`=`) were missing, as were both the logical and the low-precedence shortcutting OR and AND operators (`&&`, `||`, `and`, `or`).

    Also renumbered them around somewhat to even out the spacing. This is fine during a development cycle.

    Also renamed the tokenizer/parser symbol names from "PLUG*OP" to "PLUGIN_*_OP" for better readability.
* Add comment to infix operator precedence enum about when we can/can't change the numbers | Paul "LeoNerd" Evans | 2022-12-16 | 1 | -1/+5
* regcomp.c - decompose into smaller files | Yves Orton | 2022-12-09 | 1 | -2/+2
    This splits a bunch of the subcomponents of the regex engine into smaller files.

        regcomp_debug.c
        regcomp_internal.h
        regcomp_invlist.c
        regcomp_study.c
        regcomp_trie.c

    The only real change, besides those to the build machinery needed to achieve the split, is the addition of some new defines which can be used in embed.fnc to control exports without having to enumerate /every/ regex engine file. For instance all of regcomp*.c defines PERL_IN_REGCOMP_ANY, and this is used in embed.fnc to manage exports.
* Define a PL_infix_plugin hook, of a similar style to PL_keyword_plugin | Paul "LeoNerd" Evans | 2022-12-08 | 1 | -0/+17
    Runs for identifier-named custom infix operators and sequences of non-identifier symbol characters.

    Defines multiple precedence levels for custom infix operators that fit alongside exponentiation, multiplication, addition, or relational comparison operators, as well as a "high" and "low" at either end.
* locale.c: Rewrite localeconv() handling | Karl Williamson | 2022-12-07 | 1 | -0/+6
    localeconv() returns a structure containing fields that are associated with two different categories: LC_NUMERIC and LC_MONETARY. Perl via POSIX::localeconv() returns a hash containing all the fields.

    Testing on Windows showed that if LC_CTYPE is not the same locale as LC_MONETARY for the monetary fields, or isn't the same as LC_NUMERIC for the numeric ones, mojibake can result.

    The solution to similar situations elsewhere in the code is to toggle LC_CTYPE into being the same locale as the one for the returned fields. But those situations only have a single locale that LC_CTYPE has to match, so it doesn't work here when LC_NUMERIC and LC_MONETARY are different locales. Unlike Schrödinger's cat, LC_CTYPE has to be one or the other, not both at the same time.

    The previous implementation did not consider this possibility, and wasn't easily changeable to work. Therefore, this rewrites a bunch of it.

    The solution used is to call localeconv() twice when the LC_NUMERIC locale and the LC_MONETARY locale don't match (with LC_CTYPE toggled to the corresponding one each time). (Only one call is made if the two categories have the same locale.) This one vs two complicated the code, but I thought it was worth it given that the one call is the most likely case.

    Another complication is that on platforms that lack nl_langinfo() (Windows, for example), localeconv() is used to emulate portions of it. Previously there was a separate function to handle this, using an SV() cast as an HV() to avoid using a hash that wasn't actually necessary. That proved to lead to extra duplicated code under the new scheme, so that function was collapsed into a single one and a real hash is used in all circumstances, but is only populated with the one or two fields needed for the emulation.

    The only part of this commit that I thought could be split off from the rest concerns the fact that localeconv()'s return is not thread-safe, and so must be copied to a safe place (the hash) while in a critical section, locking out all other threads. Before this commit, that copying was accompanied by determining if each string field needed to be marked as UTF-8. That determination isn't necessarily trivial, so should really not be in the critical section. This commit does that. And, with some effort, that part could have been split into a separate commit, but I didn't think it was worth the effort.
* locales: Add LC_NAME capabilities | Karl Williamson | 2022-12-06 | 1 | -2/+11
    LC_NAME is a GNU extension that Perl hadn't been aware of. The consequences were that it couldn't be set or queried in Perl (except by using LC_ALL to set everything).

    There are other GNU extensions that Perl has long known about; this was the only missing one.

    The values associated with this category are retrievable by the glibc call nl_langinfo(3) in XS code. The standard-specified items are retrievable from pure Perl via I18N::Langinfo, but it doesn't know about any of the non-standard ones, including the ones for this category.
* locale.c: Add mutex lock around _wsetlocale() call | Karl Williamson | 2022-12-05 | 1 | -0/+4
    The lock expands to nothing if unthreaded, or if thread-local storage is in effect. Otherwise it protects a global value from being clobbered by another thread.
* PERL_STRLEN_NEW_MIN - increase to multiple of pointer sizes | Richard Leach | 2022-11-21 | 1 | -4/+13
    Major malloc implementations, including the popular dlmalloc derivatives, all return chunks of memory that are a multiple of the platform's pointer size. Perl's traditional default string allocation of 10 bytes will almost certainly result in a larger allocation than requested. Consequently, the interpreter may try to Renew() an allocation to increase the PV buffer size when it does not actually need to do so.

    This commit increases the default string size to the nearest pointer multiple (12 bytes for 32-bit pointers, 16 bytes for 64-bit pointers). This is almost certainly unnecessarily small for 64-bit platforms, since most common malloc implementations seem to return 3*pointer size (i.e. 24 bytes) as the smallest allocation. However, 16 bytes was chosen to prevent an increase in memory usage on memory-constrained platforms which might have a smaller minimum memory allocation.
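The arithmetic behind the 12-byte and 16-byte figures can be checked with a small rounding helper. This is a sketch only: `round_up_to_ptr_multiple` is a hypothetical name, and the log does not show how perl.h itself derives PERL_STRLEN_NEW_MIN.

```c
#include <assert.h>
#include <stddef.h>

/* Round `want` up to the next multiple of `ptr_size`.
 * Hypothetical helper; not taken from perl.h. */
static size_t round_up_to_ptr_multiple(size_t want, size_t ptr_size)
{
    assert(ptr_size > 0);   /* a zero pointer size would divide by zero */
    return ((want + ptr_size - 1) / ptr_size) * ptr_size;
}
```

Calling it with the traditional 10-byte minimum gives 12 for a 4-byte pointer and 16 for an 8-byte pointer, the two values quoted in the commit message.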
* Extract minimum PV buffer/AV element size to common definitions | Richard Leach | 2022-11-21 | 1 | -0/+17
    In a nutshell, for a long time the minimum PV length (hardcoded in Perl_sv_grow) has been 10 bytes and the minimum AV array size (hardcoded in av_extend_guts) has been 4 elements.

    These numbers have been used elsewhere for consistency (e.g. Perl_sv_grow_fresh) in the past couple of development cycles. Having a standard definition, rather than hardcoding in multiple places, is more maintainable.

    This commit therefore introduces into perl.h:

        PERL_ARRAY_NEW_MIN_KEY
        PERL_STRLEN_NEW_MIN

    (Note: Subsequent commit(s) will actually change the values.)
* Figure out I32df, U32uf, etc. in Configure rather than in perl.h | TAKAI Kousuke | 2022-11-14 | 1 | -51/+0
    These macros were defined in perl.h using preprocessor conditionals, but determining whether I32 is "int" or "long" is pretty hard with the preprocessor when INTSIZE == LONGSIZE. The Configure script should know the exact underlying type of I32, so it should be able to determine more robustly whether %d or %ld shall be used to format an I32 value.

    Various pre-configured files, such as uconfig.h, are updated to align with this.
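Standard C solves the same problem for its fixed-width types with the `<inttypes.h>` format macros, which makes a convenient analogy for what I32df does. The sketch below uses `PRId32` and `int32_t` as stand-ins; it does not use Perl's I32 or I32df, and `format_i32` is a hypothetical helper.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>

/* Format a 32-bit signed value without guessing whether the underlying
 * type is int or long: the PRId32 macro expands to the right conversion
 * for this platform. */
static int format_i32(char *buf, size_t size, int32_t value)
{
    return snprintf(buf, size, "%" PRId32, value);
}
```

The design point is the same as in the commit: the party that knows the real underlying type (the C library here, Configure in Perl's case) supplies the format string, so user code never hardcodes %d or %ld.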
* cop.h - get rid of the STRLEN* stuff from cop_warnings | Yves Orton | 2022-11-02 | 1 | -2/+2
    With RCPV strings we can use the RCPV_LEN() macro, and make this logic a little less weird.
* Also add a STRLEN member to ANY | Paul "LeoNerd" Evans | 2022-11-01 | 1 | -0/+1
* Add Size_t and SSize_t members to ANY | Paul "LeoNerd" Evans | 2022-11-01 | 1 | -0/+2
* Convert tabs to whitespaces in union any {} definition | Paul "LeoNerd" Evans | 2022-11-01 | 1 | -9/+9
* Switch libc per-interpreter data when tTHX changes | Karl Williamson | 2022-10-18 | 1 | -2/+39
    As noted in the previous commit, some library functions now keep per-thread state. So far the only ones we care about are libc locale-changing ones.

    When perl changes threads by swapping out tTHX, those library functions need to be informed about the new value so that they remain in sync with what perl thinks the locale should be.

    This commit creates a function to do this, and changes the thread-changing macros to also call this as part of the change.

    For POSIX 2008, the function just calls uselocale() using the per-interpreter object introduced previously.

    For Windows, this commit adds a per-interpreter string of the current LC_ALL, and the function calls setlocale on that. We keep the same string for POSIX 2008 implementations that lack querylocale(), so this commit just enables that variable on Windows as well. The code is already in place to free the memory the string occupies when done.

    The commit also creates a mechanism to skip this during thread destruction. A thread in its death throes doesn't need to have accurate locale information, and the information needed to map from thread to what libc needs to know gets destroyed as part of those throes, while relics of the thread remain. I couldn't find a way to accurately know if we are dealing with a relic or not, so the solution I adopted was to just not switch during destruction.

    This commit completes fixing #20155.
* locale: Create special variable to hold current LC_ALL | Karl Williamson | 2022-10-18 | 1 | -0/+1
    Some configurations require us to store the current locale for each category. Prior to this commit, this was done in the array PL_curlocales, with the entry for LC_ALL being in the highest element.

    Future commits will need just the value for LC_ALL in some other configurations, without needing the rest of the array. This commit splits off the LC_ALL element into its own per-interpreter variable to accommodate those. It always had to have special handling anyway beyond the rest of the array elements.
* Clean up perl.h/makedef.pl common logic | Karl Williamson | 2022-10-18 | 1 | -27/+35
    This has gotten too twisty little mazy over time. Clean it up, add comments, and make sure the logic is the same on both.
* Don't #define USE_THREAD_SAFE_LOCALE unless threaded | Karl Williamson | 2022-10-18 | 1 | -2/+2
    If there aren't threads, yes, locales are trivially thread-safe, but the code that gets executed to make them so doesn't need to get compiled, and that is controlled by this #define.
* handy.h: Set macro to false if can't ever be true | Karl Williamson | 2022-10-10 | 1 | -2/+2
    It's unlikely that perl will be compiled without the LC_CTYPE locale category being enabled. But if it isn't, there is no sense in having per-interpreter variables for various conditions in it, and no sense having code that tests those variables.

    This commit changes a macro to always yield 'false' when this is disabled, adds a new similar macro, and changes some occurrences that test for a variable to use the macros instead of the variables. That way the compiler knows these conditions can never be true.
* perl.h: Rmv nested STMT_START...END | Karl Williamson | 2022-10-02 | 1 | -2/+2
    The outer pair is all that is necessary.
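As background on why a single STMT_START/STMT_END pair suffices, the idiom is commonly implemented as `do { ... } while (0)`, which turns a multi-statement macro body into exactly one statement. The sketch below re-creates that idiom under hypothetical `MY_` names rather than quoting perl.h's definitions.

```c
/* Illustrative re-creation of the do/while(0) idiom; hypothetical names. */
#define MY_STMT_START do
#define MY_STMT_END   while (0)

/* A multi-statement macro wrapped once. Nesting a second pair inside
 * would add nothing, since the outer pair already makes the whole body
 * a single statement. */
#define SWAP_INT(a, b) \
    MY_STMT_START { int tmp_ = (a); (a) = (b); (b) = tmp_; } MY_STMT_END

static int first_after_swap(int x, int y, int do_swap)
{
    if (do_swap)
        SWAP_INT(x, y);   /* one statement, so the else still pairs with this if */
    else
        x = -1;
    return x;
}
```

Without the wrapper, the braces plus the trailing semicolon at the call site would terminate the `if` early and leave the `else` without a matching `if`, which is a compile error.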
* Add mutexes for various libc calls | Karl Williamson | 2022-09-29 | 1 | -0/+17
    There are various system calls used by perl that need to be protected by a mutex in some configurations. This commit adds the ones not previously added, for use in future commits. Further details are in the merge commit message for this series of commits.
* perl.h: Finish implementing combo ENV/LOCALE mutexes | Karl Williamson | 2022-09-29 | 1 | -5/+161
    There are cases where an executing function is vulnerable to either the locale or environment being changed by another thread. This commit implements macros that use mutexes to protect these critical sections. There are two cases that exist: one where the functions only read; and one where they can also need exclusive control so that a competing thread can't overwrite the returned static buffer before it is safely copied.

    5.32 had a placeholder for these, but didn't actually implement it. Instead it locked just the ENV portion. On modern platforms with thread-safe locales, the locale portion is a no-op anyway, so things worked on them.

    This new commit extends that safety to other platforms. This has long been a vulnerability in Perl.
* perl.h: Move some statements | Karl Williamson | 2022-09-29 | 1 | -6/+6
    So they are closer to related statements
* Change ENV/LOCALE locking read macro names | Karl Williamson | 2022-09-29 | 1 | -2/+2
    The old name was confusing.
* Remove ENV_LOCALE_LOCK/UNLOCK macros | Karl Williamson | 2022-09-29 | 1 | -14/+5
    These are subsumed by gwENVr_LOCALEr_LOCK created in the previous commit.
* perl.h: Add #define for gwENVr_LOCALEr_UNLOCK | Karl Williamson | 2022-09-29 | 1 | -1/+25
    This is for functions that read the locale and environment and write to some global space.
* perlhacktips: Add section on writing safer macros | Karl Williamson | 2022-09-28 | 1 | -104/+21
    And remove the similar advice that applied only to STMT_START {} STMT_END
* perl.h: Remove now empty block | Karl Williamson | 2022-09-21 | 1 | -35/+0
    Previous commits have left this empty except for comments, and equivalent comments have also been added elsewhere
* perl.h: Move LOCALE_READ_LOCK #definition | Karl Williamson | 2022-09-21 | 1 | -7/+6
    To enable future simplifications
* perl.h: Move #defining SETLOCALE_LOCK | Karl Williamson | 2022-09-21 | 1 | -13/+8
    This simplifies slightly, and will allow further simplification
* Add POSIX_SETLOCALE_LOCK/UNLOCK | Karl Williamson | 2022-09-21 | 1 | -0/+18
    This macro is used to surround raw setlocale() calls so that the return value in a global static buffer can be saved without interference with other threads.
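The pattern this commit message describes, locking around a raw setlocale() call until its static return buffer has been copied, might look roughly like the following. This is a sketch assuming POSIX threads; `setlocale_mutex` and `query_locale_copy` are hypothetical names and not what POSIX_SETLOCALE_LOCK actually expands to.

```c
#include <locale.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical process-wide mutex guarding raw setlocale() calls. */
static pthread_mutex_t setlocale_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Query the current locale for `category` and return a heap copy,
 * or NULL on failure. The caller frees the result. */
static char *query_locale_copy(int category)
{
    char *copy = NULL;

    pthread_mutex_lock(&setlocale_mutex);
    /* setlocale() may return a pointer into a static buffer that a later
     * call, possibly from another thread, can overwrite. */
    const char *cur = setlocale(category, NULL);
    if (cur != NULL) {
        size_t len = strlen(cur) + 1;
        copy = malloc(len);
        if (copy != NULL)
            memcpy(copy, cur, len);   /* copy while still holding the lock */
    }
    pthread_mutex_unlock(&setlocale_mutex);

    return copy;
}
```

The lock only helps if every raw setlocale() caller in the process takes the same mutex, which is presumably why such a lock is packaged as a macro pair to wrap each call site.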
* perl.h: Fix typo | Karl Williamson | 2022-09-18 | 1 | -1/+1