path: root/perl.h
Commit message | Author | Age | Files | Lines
* perl.h: Make sure PERL_IMPLICIT_CONTEXT doesn't come back | Karl Williamson | 2023-02-10 | 1 | -4/+8
  This is an obsolete name, retained for back compat with CPAN. Make sure the core doesn't have it defined.
* snprintf() calls need to have proper radix | Karl Williamson | 2023-02-10 | 1 | -3/+10
  Calls to libc snprintf() were not updated when perl was fixed to change the radix character to the proper one based on whether or not 'use locale' is in effect. Perl-level code is unaffected, but core and XS code is.
  This commit wraps snprintf() calls with the macros designed for the purpose, long used for similar situations elsewhere in the code.
  Doing this requires the thread context. I achieved this in a few places by a dTHX, instead of assuming a caller would have the context already available and adding a pTHX_ parameter. I tried doing it the other way, and got a few breakages in our test suite. Formatting already requires significant CPU time, so this addition should just be in the noise.
  This bug was found by new tests that will be added in a future commit.
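The idea of wrapping snprintf() so that the radix character is controlled can be sketched in plain C. This is only an illustration under stated assumptions: `format_double_c_radix` is a hypothetical helper, not one of Perl's macros, and it simply forces the "C" LC_NUMERIC locale around the call instead of consulting 'use locale' or the thread context.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>

/* Hypothetical helper (not Perl's actual macro): format a double so that
 * the radix character is always '.', whatever LC_NUMERIC currently is. */
static int format_double_c_radix(char *buf, size_t size, double value)
{
    char saved[256];
    const char *current = setlocale(LC_NUMERIC, NULL);
    int written;

    assert(buf != NULL && size > 0);

    /* Copy the name first: the returned pointer may be overwritten by
     * the next setlocale() call. */
    snprintf(saved, sizeof saved, "%s", current ? current : "C");

    setlocale(LC_NUMERIC, "C");                 /* force the '.' radix    */
    written = snprintf(buf, size, "%.2f", value);
    setlocale(LC_NUMERIC, saved);               /* restore caller's state */

    return written;
}
```

A caller would pass its own buffer, for example `format_double_c_radix(buf, sizeof buf, 1.5)`. The sketch is not thread-safe on its own; the real code relies on the thread context and locale locking for that.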
* Create a specific SV type for object instances | Paul "LeoNerd" Evans | 2023-02-10 | 1 | -7/+8
* perl.h - break up * lined comment leaders and pod comments | Yves Orton | 2023-02-09 | 1 | -4/+9
  Having half of the comment have the * on the left side is confusing for humans and especially so for programs. Split the two styles into two comments.
* Correct typos as per GH 20435 | James E Keenan | 2022-12-29 | 1 | -1/+1
  In GH 20435 many typos in our C code were corrected. However, this pull request was not applied to blead and developed merge conflicts. I extracted diffs for the individual modified files and applied them with 'git apply', excepting four files where patch conflicts were reported. Those files were:
    handy.h
    locale.c
    regcomp.c
    toke.c
  We can handle these in a subsequent commit.
  Also, had to run these two programs to keep 'make test_porting' happy:
    $ ./perl -Ilib regen/uconfig_h.pl
    $ ./perl -Ilib regen/regcomp.pl regnodes.h
* Add HvNAMEfARG() macro | Paul "LeoNerd" Evans | 2022-12-24 | 1 | -0/+2
* sv.c - add support for HvNAMEf and HvNAMEf_QUOTEDPREFIX formats | Yves Orton | 2022-12-22 | 1 | -0/+2
  They are similar to SVf and SVf_QUOTEDPREFIX but take an HV * argument and use HvNAME() and related macros to extract the string. This is helpful as it makes constructing error messages from a stash (HV *) easier. It is the caller's responsibility to ensure that the HV is actually a stash.
* Define five new operator precedence levels | Paul "LeoNerd" Evans | 2022-12-16 | 1 | -6/+13
  Assignment operators (`==`) were missing, as were both the logical and the low-precedence shortcutting OR and AND operators (`&&`, `||`, `and`, `or`).
  Also renumbered them around somewhat to even out the spacing. This is fine during a development cycle.
  Also renamed the tokenizer/parser symbol names from "PLUG*OP" to "PLUGIN_*_OP" for better readability.
* Add comment to infix operator precedence enum about when we can/can't change the numbers | Paul "LeoNerd" Evans | 2022-12-16 | 1 | -1/+5
* regcomp.c - decompose into smaller files | Yves Orton | 2022-12-09 | 1 | -2/+2
  This splits a bunch of the subcomponents of the regex engine into smaller files:
    regcomp_debug.c
    regcomp_internal.h
    regcomp_invlist.c
    regcomp_study.c
    regcomp_trie.c
  The only real change, besides those to the build machinery needed to achieve the split, is to add some new defines which can be used in embed.fnc to control exports without having to enumerate /every/ regex engine file. For instance all of regcomp*.c defines PERL_IN_REGCOMP_ANY, and this is used in embed.fnc to manage exports.
* Define a PL_infix_plugin hook, of a similar style to PL_keyword_plugin | Paul "LeoNerd" Evans | 2022-12-08 | 1 | -0/+17
  Runs for identifier-named custom infix operators and sequences of non-identifier symbol characters.
  Defines multiple precedence levels for custom infix operators that fit alongside exponentiation, multiplication, addition, or relational comparison operators, as well as a "high" and "low" at either end.
* locale.c: Rewrite localeconv() handling | Karl Williamson | 2022-12-07 | 1 | -0/+6
  localeconv() returns a structure containing fields that are associated with two different categories: LC_NUMERIC and LC_MONETARY. Perl via POSIX::localeconv() returns a hash containing all the fields.
  Testing on Windows showed that if LC_CTYPE is not the same locale as LC_MONETARY for the monetary fields, or isn't the same as LC_NUMERIC for the numeric ones, mojibake can result.
  The solution to similar situations elsewhere in the code is to toggle LC_CTYPE into being the same locale as the one for the returned fields. But those situations only have a single locale that LC_CTYPE has to match, so it doesn't work here when LC_NUMERIC and LC_MONETARY are different locales. Unlike Schrödinger's cat, LC_CTYPE has to be one or the other, not both at the same time.
  The previous implementation did not consider this possibility, and wasn't easily changeable to work. Therefore, this rewrites a bunch of it.
  The solution used is to call localeconv() twice when the LC_NUMERIC locale and the LC_MONETARY locale don't match (with LC_CTYPE toggled to the corresponding one each time). (Only one call is made if the two categories have the same locale.) This one vs two complicated the code, but I thought it was worth it given that the one call is the most likely case.
  Another complication is that on platforms that lack nl_langinfo() (Windows, for example), localeconv() is used to emulate portions of it. Previously there was a separate function to handle this, using an SV() cast as an HV() to avoid using a hash that wasn't actually necessary. That proved to lead to extra duplicated code under the new scheme, so that function was collapsed into a single one and a real hash is used in all circumstances, but is only populated with the one or two fields needed for the emulation.
  The only part of this commit that I thought could be split off from the rest concerns the fact that localeconv()'s return is not thread-safe, and so must be copied to a safe place (the hash) while in a critical section, locking out all other threads. Before this commit, that copying was accompanied by determining if each string field needed to be marked as UTF-8. That determination isn't necessarily trivial, so should really not be in the critical section. This commit does that. And, with some effort, that part could have been split into a separate commit, but I didn't think it was worth the effort.
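The one-call-versus-two-calls approach described in this commit message can be sketched in standalone C. This is a simplified illustration, not the code in locale.c: `struct conv_fields`, `copy_locale_name` and `fetch_conv_fields` are hypothetical names, only two fields are copied, fixed-size buffers stand in for the hash, and the critical-section locking is omitted.

```c
#include <assert.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for one numeric and one monetary field. */
struct conv_fields {
    char decimal_point[16];
    char currency_symbol[16];
};

/* Copy a category's locale name; setlocale()'s buffer is not stable. */
static void copy_locale_name(char *dst, size_t size, int category)
{
    const char *name = setlocale(category, NULL);
    snprintf(dst, size, "%s", name ? name : "C");
}

static void fetch_conv_fields(struct conv_fields *out)
{
    char ctype[256], numeric[256], monetary[256];
    const struct lconv *lc;

    assert(out != NULL);
    copy_locale_name(ctype, sizeof ctype, LC_CTYPE);
    copy_locale_name(numeric, sizeof numeric, LC_NUMERIC);
    copy_locale_name(monetary, sizeof monetary, LC_MONETARY);

    /* First (and usually only) call: LC_CTYPE matches LC_NUMERIC. */
    setlocale(LC_CTYPE, numeric);
    lc = localeconv();
    snprintf(out->decimal_point, sizeof out->decimal_point,
             "%s", lc->decimal_point);

    /* Second call only when the two categories differ. */
    if (strcmp(numeric, monetary) != 0) {
        setlocale(LC_CTYPE, monetary);
        lc = localeconv();
    }
    snprintf(out->currency_symbol, sizeof out->currency_symbol,
             "%s", lc->currency_symbol);

    setlocale(LC_CTYPE, ctype);   /* restore the caller's LC_CTYPE */
}
```

Copying each field immediately after its localeconv() call matters because a later call may overwrite the returned structure.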
* locales: Add LC_NAME capabilities | Karl Williamson | 2022-12-06 | 1 | -2/+11
  LC_NAME is a GNU extension that Perl hadn't been aware of. The consequences were that it couldn't be set or queried in Perl (except by using LC_ALL to set everything).
  There are other GNU extensions that Perl has long known about; this was the only missing one.
  The values associated with this category are retrievable by the glibc call nl_langinfo(3) in XS code. The standard-specified items are retrievable from pure Perl via I18N::Langinfo, but it doesn't know about any of the non-standard ones, including the ones for this category.
* locale.c: Add mutex lock around _wsetlocale() call | Karl Williamson | 2022-12-05 | 1 | -0/+4
  The lock expands to nothing if unthreaded, or thread-local storage is in effect. But otherwise protects a global value from being clobbered by another thread.
* PERL_STRLEN_NEW_MIN - increase to multiple of pointer sizes | Richard Leach | 2022-11-21 | 1 | -4/+13
  Major malloc implementations, including the popular dlmalloc derivatives, all return chunks of memory that are a multiple of the platform's pointer size. Perl's traditional default string allocation of 10 bytes will almost certainly result in a larger allocation than requested. Consequently, the interpreter may try to Renew() an allocation to increase the PV buffer size when it does not actually need to do so.
  This commit increases the default string size to the nearest pointer multiple (12 bytes for 32-bit pointers, 16 bytes for 64-bit pointers). This is almost certainly unnecessarily small for 64-bit platforms, since most common malloc implementations seem to return 3*pointer size (i.e. 24 bytes) as the smallest allocation. However, 16 bytes was chosen to prevent an increase in memory usage in memory-constrained platforms which might have a smaller minimum memory allocation.
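The arithmetic behind the 12-byte and 16-byte figures is a round-up of the traditional 10 bytes to the next multiple of the pointer size. A minimal sketch follows; `OLD_STRLEN_MIN`, `round_up_to_multiple` and `new_strlen_min` are hypothetical names chosen for illustration, not the definitions in perl.h.

```c
#include <assert.h>
#include <stddef.h>

/* Traditional minimum PV length named in the commit message. */
#define OLD_STRLEN_MIN 10

/* Round n up to the nearest multiple of `multiple` (which must be > 0). */
static size_t round_up_to_multiple(size_t n, size_t multiple)
{
    assert(multiple > 0);
    return ((n + multiple - 1) / multiple) * multiple;
}

/* Hypothetical helper: the new minimum for a given pointer size in bytes. */
static size_t new_strlen_min(size_t pointer_size)
{
    return round_up_to_multiple(OLD_STRLEN_MIN, pointer_size);
}
```

With a 4-byte pointer this gives 12, and with an 8-byte pointer it gives 16, matching the values quoted above.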
* Extract minimum PV buffer/AV element size to common definitions | Richard Leach | 2022-11-21 | 1 | -0/+17
  In a nutshell, for a long time the minimum PV length (hardcoded in Perl_sv_grow) has been 10 bytes and the minimum AV array size (hardcoded in av_extend_guts) has been 4 elements.
  These numbers have been used elsewhere for consistency (e.g. Perl_sv_grow_fresh) in the past couple of development cycles. Having a standard definition, rather than hardcoding in multiple places, is more maintainable.
  This commit therefore introduces into perl.h:
    PERL_ARRAY_NEW_MIN_KEY
    PERL_STRLEN_NEW_MIN
  (Note: Subsequent commit(s) will actually change the values.)
* Figure out I32df, U32uf, etc. in Configure rather than in perl.h | TAKAI Kousuke | 2022-11-14 | 1 | -51/+0
  These macros were defined in perl.h using preprocessor conditionals, but determining whether I32 is "int" or "long" is pretty hard with the preprocessor when INTSIZE == LONGSIZE.
  The Configure script should know the exact underlying type of I32, so it should be able to determine whether %d or %ld shall be used to format an I32 value more robustly.
  Various pre-configured files, such as uconfig.h, are updated to align with this.
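The role of a format macro such as I32df can be illustrated with a standalone analogue. In this sketch `my_i32`, `MY_I32df` and `format_i32` are hypothetical names, and the conversion specifier comes from the C99 `PRId32` macro in `<inttypes.h>` rather than from Configure, so it shows the usage pattern only, not how Perl derives the value.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical stand-ins for I32 and I32df. */
typedef int32_t my_i32;
#define MY_I32df PRId32   /* expands to "d" or "ld" as the platform needs */

/* Format a 32-bit value without hardcoding %d or %ld at the call site. */
static int format_i32(char *buf, size_t size, my_i32 value)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%" MY_I32df, value);
}
```

The call site concatenates the macro into the format string, so the same source line is correct whether the underlying type is int or long.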
* cop.h - get rid of the STRLEN* stuff from cop_warnings | Yves Orton | 2022-11-02 | 1 | -2/+2
  With RCPV strings we can use the RCPV_LEN() macro, and make this logic a little less weird.
* Also add a STRLEN member to ANY | Paul "LeoNerd" Evans | 2022-11-01 | 1 | -0/+1
* Add Size_t and SSize_t members to ANY | Paul "LeoNerd" Evans | 2022-11-01 | 1 | -0/+2
* Convert tabs to whitespaces in union any {} definition | Paul "LeoNerd" Evans | 2022-11-01 | 1 | -9/+9
* Switch libc per-interpreter data when tTHX changes | Karl Williamson | 2022-10-18 | 1 | -2/+39
  As noted in the previous commit, some library functions now keep per-thread state. So far the only ones we care about are libc locale-changing ones.
  When perl changes threads by swapping out tTHX, those library functions need to be informed about the new value so that they remain in sync with what perl thinks the locale should be.
  This commit creates a function to do this, and changes the thread-changing macros to also call this as part of the change.
  For POSIX 2008, the function just calls uselocale() using the per-interpreter object introduced previously.
  For Windows, this commit adds a per-interpreter string of the current LC_ALL, and the function calls setlocale on that. We keep the same string for POSIX 2008 implementations that lack querylocale(), so this commit just enables that variable on Windows as well. The code is already in place to free the memory the string occupies when done.
  The commit also creates a mechanism to skip this during thread destruction. A thread in its death throes doesn't need to have accurate locale information, and the information needed to map from thread to what libc needs to know gets destroyed as part of those throes, while relics of the thread remain. I couldn't find a way to accurately know if we are dealing with a relic or not, so the solution I adopted was to just not switch during destruction.
  This commit completes fixing #20155.
* locale: Create special variable to hold current LC_ALL | Karl Williamson | 2022-10-18 | 1 | -0/+1
  Some configurations require us to store the current locale for each category. Prior to this commit, this was done in the array PL_curlocales, with the entry for LC_ALL being in the highest element.
  Future commits will need just the value for LC_ALL in some other configurations, without needing the rest of the array. This commit splits off the LC_ALL element into its own per-interpreter variable to accommodate those. It always had to have special handling anyway beyond the rest of the array elements.
* Clean up perl.h/makedef.pl common logic | Karl Williamson | 2022-10-18 | 1 | -27/+35
  This has gotten too twisty little mazy over time. Clean it up, add comments, and make sure the logic is the same on both.
* Don't #define USE_THREAD_SAFE_LOCALE unless threaded | Karl Williamson | 2022-10-18 | 1 | -2/+2
  If there aren't threads, yes, locales are trivially thread-safe, but the code that gets executed to make them so doesn't need to get compiled, and that is controlled by this #define.
* handy.h: Set macro to false if can't ever be true | Karl Williamson | 2022-10-10 | 1 | -2/+2
  It's unlikely that perl will be compiled without the LC_CTYPE locale category being enabled. But if it isn't, there is no sense in having per-interpreter variables for various conditions in it, and no sense having code that tests those variables.
  This commit changes a macro to always yield 'false' when this is disabled, adds a new similar macro, and changes some occurrences that test for a variable to use the macros instead of the variables. That way the compiler knows these conditions can never be true.
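The technique of making a test macro a compile-time constant when a feature is disabled can be sketched as follows. All names here (`DEMO_HAS_LC_CTYPE`, `DEMO_IN_UTF8_CTYPE_LOCALE`, `struct interp_state`, `utf8_aware_length`) are hypothetical and are not the macros changed in handy.h; the sketch only shows why a constant 'false' lets the compiler discard the dependent branch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-interpreter state. */
struct interp_state {
    bool in_utf8_ctype_locale;
};

/* DEMO_HAS_LC_CTYPE stands in for "the LC_CTYPE category is enabled".
 * When it is not defined, the macro is a constant false, so the compiler
 * can see that code guarded by it is unreachable. */
#ifdef DEMO_HAS_LC_CTYPE
#  define DEMO_IN_UTF8_CTYPE_LOCALE(s) ((s)->in_utf8_ctype_locale)
#else
#  define DEMO_IN_UTF8_CTYPE_LOCALE(s) ((void)(s), false)
#endif

/* Callers test the macro, never the variable directly. */
static int utf8_aware_length(const struct interp_state *state,
                             int byte_len, int char_len)
{
    if (DEMO_IN_UTF8_CTYPE_LOCALE(state))
        return char_len;
    return byte_len;
}
```

Because callers go through the macro, disabling the feature changes one definition instead of every test site.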
* perl.h: Rmv nested STMT_START...END | Karl Williamson | 2022-10-02 | 1 | -2/+2
  The outer pair is all that is necessary.
* Add mutexes for various libc calls | Karl Williamson | 2022-09-29 | 1 | -0/+17
  There are various system calls used by perl that need to be protected by a mutex in some configurations. This commit adds the ones not previously added, for use in future commits.
  Further details are in the merge commit message for this series of commits.
* perl.h: Finish implementing combo ENV/LOCALE mutexes | Karl Williamson | 2022-09-29 | 1 | -5/+161
  There are cases where an executing function is vulnerable to either the locale or environment being changed by another thread. This commit implements macros that use mutexes to protect these critical sections. There are two cases that exist: one where the functions only read; and one where they can also need exclusive control so that a competing thread can't overwrite the returned static buffer before it is safely copied.
  5.32 had a placeholder for these, but didn't actually implement it. Instead it locked just the ENV portion. On modern platforms with thread-safe locales, the locale portion is a no-op anyway, so things worked on them.
  This new commit extends that safety to other platforms. This has long been a vulnerability in Perl.
* perl.h: Move some statements | Karl Williamson | 2022-09-29 | 1 | -6/+6
  So they are closer to related statements.
* Change ENV/LOCALE locking read macro names | Karl Williamson | 2022-09-29 | 1 | -2/+2
  The old name was confusing.
* Remove ENV_LOCALE_LOCK/UNLOCK macros | Karl Williamson | 2022-09-29 | 1 | -14/+5
  These are subsumed by gwENVr_LOCALEr_LOCK created in the previous commit.
* perl.h: Add #define for gwENVr_LOCALEr_UNLOCK | Karl Williamson | 2022-09-29 | 1 | -1/+25
  This is for functions that read the locale and environment and write to some global space.
* perlhacktips: Add section on writing safer macros | Karl Williamson | 2022-09-28 | 1 | -104/+21
  And remove the similar advice that applied only to STMT_START {} STMT_END.
* perl.h: Remove now empty block | Karl Williamson | 2022-09-21 | 1 | -35/+0
  Previous commits have left this empty except for comments, and equivalent comments have also been added elsewhere.
* perl.h: Move LOCALE_READ_LOCK #definition | Karl Williamson | 2022-09-21 | 1 | -7/+6
  To enable future simplifications.
* perl.h: Move #defining SETLOCALE_LOCK | Karl Williamson | 2022-09-21 | 1 | -13/+8
  This simplifies slightly, and will allow further simplification.
* Add POSIX_SETLOCALE_LOCK/UNLOCK | Karl Williamson | 2022-09-21 | 1 | -0/+18
  This macro is used to surround raw setlocale() calls so that the return value in a global static buffer can be saved without interference from other threads.
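The pattern of holding a lock while the result of setlocale() is copied out of its static buffer can be sketched with a plain pthread mutex. This is an illustration under stated assumptions, not Perl's macro: `demo_setlocale_mutex` and `query_locale_copy` are hypothetical names, and the sketch assumes a POSIX threads platform.

```c
#include <assert.h>
#include <locale.h>
#include <pthread.h>
#include <stdio.h>

/* Hypothetical process-wide lock standing in for the locale mutex. */
static pthread_mutex_t demo_setlocale_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Query a category's locale name and copy it into the caller's buffer
 * while the lock is held, so that no other thread using this lock can
 * overwrite setlocale()'s static buffer before the copy completes.
 * Returns 1 on success, 0 on failure or truncation. */
static int query_locale_copy(int category, char *buf, size_t size)
{
    const char *name;
    int ok = 0;

    assert(buf != NULL && size > 0);

    pthread_mutex_lock(&demo_setlocale_mutex);
    name = setlocale(category, NULL);
    if (name != NULL) {
        int needed = snprintf(buf, size, "%s", name);
        ok = (needed >= 0 && (size_t)needed < size);
    }
    pthread_mutex_unlock(&demo_setlocale_mutex);

    return ok;
}
```

The protection only holds if every raw setlocale() call in the program goes through the same lock, which is the purpose of having a dedicated macro pair.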
* perl.h: Fix typo | Karl Williamson | 2022-09-18 | 1 | -1/+1
* perl.h: Rmv duplicate #define | Karl Williamson | 2022-09-10 | 1 | -5/+0
  LOCALE_LOCK has already been defined in all circumstances earlier in the file.
* Revert "perl.h: Move #defining SETLOCALE_LOCK" | Karl Williamson | 2022-09-10 | 1 | -8/+13
  This reverts commit d0b8b8e8a48798446382161f988e6081140578d6.
  I got ahead of myself. This commit was premature.
* perl.h: Move #defining SETLOCALE_LOCK | Karl Williamson | 2022-09-10 | 1 | -13/+8
  This simplifies slightly, and will allow further simplification.
* Move #include from locale.c to perl.h | Karl Williamson | 2022-09-10 | 1 | -0/+2
  Without this commit, Perl won't compile if -DUSE_NL_LOCALE_NAME is specified to Configure. This is an undocumented feature that uses an undocumented glibc feature that is effectively the querylocale() found on Darwin and some other systems.
  POSIX 2017 has added a querylocale-like function to the repertoire, and should eventually supplant this option.
* perl.h: Remove LOCALECONV_LOCK | Karl Williamson | 2022-09-09 | 1 | -16/+1
  This is needed in just one function, in locale.c, so move it there.
* perl.h: Remove NL_LANGINFO_LOCK | Karl Williamson | 2022-09-09 | 1 | -10/+1
  This is needed in precisely one place in the code, so move it to there.
* Redefine the POSIX.xs locale macros using prev commit | Karl Williamson | 2022-09-09 | 1 | -22/+10
  This commit uses the new macro introduced in the previous commit to define the internal locale mutex macros in POSIX.xs.
* Add locale macro to wrap global-memory-using functions | Karl Williamson | 2022-09-09 | 1 | -0/+27
  Some functions return a result in a global-to-the-program buffer, or they use global memory internally. Other threads must be kept from simultaneously using that function. This macro is to be used for all such ones dealing with locales.
  Ideally, there would be a separate mutex for each such buffer space. But these functions also have to lock the locale from changing during their execution, and there aren't that many such functions, and they actually are rarely executed. So a single lock will do.
  This will allow future commits to have more targeted locking for functions that don't affect the global locale.
* Use general locale mutex for numeric operations | Karl Williamson | 2022-09-09 | 1 | -85/+18
  This commit removes the separate mutex for locking locale-related numeric operations on threaded perls; instead using the general locale one. The previous commit made that a general semaphore, so now suitable for use for this purpose as well.
  This means that the locale can be locked for the duration of some sprintf operations, longer than before this commit. But on most modern platforms, thread-safe locales cause this lock to expand just to a no-op; so there is no effect on these. And on the impacted platforms, one is not supposed to be using locales and threads in combination, as races can occur. This lock is used on those perls to keep Perl's manipulation of LC_NUMERIC thread-safe. And for those there is also no effect, as they already lock around those sprintf's.
* Make the locale mutex a general semaphore | Karl Williamson | 2022-09-09 | 1 | -31/+74
  Future commits will use this new capability, and in Configurations where no locale locking is currently necessary.
* perl.h: Reorder cpp branches | Karl Williamson | 2022-09-09 | 1 | -3/+3
  Disposing of the trivial case first makes things easier to read.