delta/perl.git - github.com: perl/perl5.git

	Commit message (Collapse)	Author	Age	Files	Lines
*	regen/HeaderParser.pm - remove comments from elif/else/endif when the ↵	Yves Orton	2023-04-05	1	-1/+1
\| \| \| \| \| \| \| \| \|	contents is short it is a bit "noisy" to have comments that duplicate the conditions when the original line with the condition is visible on the screen at the same time. This patch changes the rules so we only add these comments when the clause is 10 lines or more from its prior
*	fix incorrect vi filetype declarations in generated files	Lukas Mai	2023-03-24	1	-1/+1
\| \| \| \| \|	Vim's filetype declarations are case sensitive. The correct types for Perl, C, and Pod are perl, c, and pod, respectively.
*	pp_ctl.c - add support for hooking require.	Yves Orton	2023-03-18	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This defines a new magic hash C<%{^HOOK}> which is intended to be used for hooking keywords. It is similar to %SIG in that the values it contains are validated on set, and it is not allowed to store something in C<%{^HOOK}> that isn't supposed to be there. Hooks are expected to be coderefs (people can use currying if they really want to put an object in there, the API is deliberately simple.) The C<%{^HOOK}> hash is documented to have keys of the form "${keyword}__${phase}" where $phase is either "before" or "after" and in this initial release two hooks are supported, "require__before" and "require__after": The C<require__before> hook is called before require is executed, including any @INC hooks that might be fired. It is called with the path of the file being required, just as would be stored in %INC. The hook may alter the filename by writing to $_[0] and it may return a coderef to be executed after the require has completed, otherwise the return is ignored. This coderef is also called with the path of the file which was required, and it will be called regardless as to whether the require (or its dependencies) die during execution. This mechanism makes it trivial and safe to share state between the initial hook and the coderef it returns. The C<require__after> hook is similar to the C<require__before> hook, except that it is called after the require completes (successfully or not), and its return value is always ignored.
*	generated files - update mode lines to specify file type	Elvin Aslanov	2023-02-19	1	-2/+2
\| \| \| \| \| \| \| \| \| \|	This updates the mode-line for most of our generated files so that they include file type information so they will be properly syntax highlighted on github. This does not make any other functional changes to the files. [Note: Commit message rewritten by Yves]
*	embed.h - make regen after recent changes	Yves Orton	2022-12-24	1	-349/+351
\| \| \| \|	Note this also tidies embed.fnc as well.
*	Some locale operations need to be done in proper thread	Karl Williamson	2022-10-18	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a step in solving #20155. The POSIX 2008 locale API introduces per-thread locales. But the previous global locale system is retained, probably for backward compatibility. The POSIX 2008 interface causes memory to be malloc'd that needs to be freed. In order to do this, the caller must first stop using that memory, by switching to another locale. perl accomplishes this during termination by switching to the global locale, which is always available and doesn't need to be freed. Perl has long assumed that all that was needed to switch threads was to change out tTHX. That's because that structure was intended to hold all the information for a given thread. But it turns out that this doesn't work when some library independently holds information about the thread's state. And there are now some libraries that do that. What was happening in this case was that perl thought that it was sufficient to switch tTHX to change to a different thread in order to do the freeing of memory, and then used the POSIX 2008 function to change to the global locale so that the memory could be safely freed. But the POSIX 2008 function doesn't care about tTHX, and actually was typically operating on a different thread, and so changed that thread to the global locale instead of the intended thread. Often that was the top-level thread, thread 0. That caused whatever thread it was to no longer be in the expected locale, and to no longer be thread-safe with regards to locales. This commit causes locale_term(), which has always been called from the actual terminating thread that POSIX 2008 knows about, to change to the global thread and free the memory. It also creates a new per-interpreter variable that effectively maps the tTHX thread to the associated POSIX 2008 memory. During perl_destruct(), it frees the memory this variable points to, instead of blindly assuming the memory to free is the current tTHX thread's. This fixes the symptoms associated with #20155, but doesn't solve the whole problem. In general, a library that has independent thread status needs to be updated to the new thread when Perl changes threads using tTHX. Future commits will do this.
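The ordering constraint described in the commit message above (stop using a per-thread locale before freeing it, and do so on the thread that owns it) can be sketched with the plain POSIX 2008 calls. This is a minimal illustration, not perl's code: the `demo_*` function names are hypothetical, and only `newlocale()`, `uselocale()`, `freelocale()` and `LC_GLOBAL_LOCALE` come from POSIX.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <locale.h>
#ifdef __APPLE__
#include <xlocale.h>
#endif

/* Hypothetical helper: create a locale object and install it for the
 * CALLING thread only. Returns (locale_t)0 on failure. */
static locale_t demo_enter_thread_locale(const char *name)
{
    locale_t loc = newlocale(LC_ALL_MASK, name, (locale_t)0);
    if (loc == (locale_t)0)
        return (locale_t)0;
    if (uselocale(loc) == (locale_t)0) {
        freelocale(loc);
        return (locale_t)0;
    }
    return loc;
}

/* Hypothetical helper: must run on the thread that installed `loc`,
 * because uselocale() only ever affects the calling thread. */
static int demo_leave_thread_locale(locale_t loc)
{
    /* The global locale is never freed. */
    assert(loc != LC_GLOBAL_LOCALE);

    /* Step 1: stop using the object by switching to the global locale. */
    if (uselocale(LC_GLOBAL_LOCALE) == (locale_t)0)
        return -1;

    /* Step 2: only now is it safe to release the memory. */
    freelocale(loc);
    return 0;
}
```

Calling `demo_leave_thread_locale()` from any other thread would switch that other thread to the global locale and leave the owner still pointing at freed memory, which is the failure mode the message describes.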
*	locale: Create special variable to hold current LC_ALL	Karl Williamson	2022-10-18	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Some configurations require us to store the current locale for each category. Prior to this commit, this was done in the array PL_curlocales, with the entry for LC_ALL being in the highest element. Future commits will need just the value for LC_ALL in some other configurations, without needing the rest of the array. This commit splits off the LC_ALL element into its own per-interpreter variable to accommodate those. It always had to have special handling anyway beyond the rest of the array elements,
*	Use general locale mutex for numeric operations	Karl Williamson	2022-09-09	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This commit removes the separate mutex for locking locale-related numeric operations on threaded perls; instead using the general locale one. The previous commit made that a general semaphore, so now suitable for use for this purpose as well. This means that the locale can be locked for the duration of some sprintf operations, longer than before this commit. But on most modern platforms, thread-safe locales cause this lock to expand just to a no-op; so there is no effect on these. And on the impacted platforms, one is not supposed to be using locales and threads in combination, as races can occur. This lock is used on those perls to keep Perl's manipulation of LC_NUMERIC thread-safe. And for those there is also no effect, as they already lock around those sprintf's.
*	Make the locale mutex a general semaphore	Karl Williamson	2022-09-09	1	-0/+1
\| \| \| \| \|	Future commits will use this new capability, and in Configurations where no locale locking is currently necessary.
*	locale.c: Rmv no longer used code; UTF8ness cache	Karl Williamson	2022-09-02	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \|	What these functions do has been subsumed by code introduced in previous commits, and in a more straightforward manner. Also removed in this commit is the cache of which locales are UTF-8 and which are not. This data is now cheaper to calculate when needed, and there is now a single entry cache, so I don't think the complexity warrants keeping it. It could be added back if necessary, split off from the remainder of this commit.
*	op.c - Restrict nested eval/BEGIN blocks to a user controllable maximum	Yves Orton	2022-09-02	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Nested BEGIN blocks can cause us to segfault by exhausting the C stack. Eg: perl -le'sub f { eval "BEGIN { f() }" } f()' will segfault. This adds a new interpreter var PL_eval_begin_nest_depth to keep track of how many layers of eval/BEGIN we have seen, and a new reserved variable called ${^MAX_NESTED_EVAL_BEGIN_BLOCKS} which can be used to raise or lower the limit. When set to 0 it blocks BEGIN entirely, which might be useful from time to time. This fixes https://github.com/Perl/perl5/issues/20176
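The general technique in the message above, a depth counter checked against an adjustable ceiling before recursing, might look like the following. This is a sketch of the idea only: `struct demo_interp` and `demo_nested_eval()` are hypothetical stand-ins and are not taken from op.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-interpreter state; names are illustrative only. */
struct demo_interp {
    unsigned depth;      /* how many nested blocks are currently active */
    unsigned max_depth;  /* adjustable ceiling; 0 forbids nesting entirely */
};

/* Simulate entering `levels` nested blocks.
 * Returns 0 on success, -1 if the ceiling would be exceeded. */
static int demo_nested_eval(struct demo_interp *in, unsigned levels)
{
    int rc;

    assert(in != NULL);
    if (levels == 0)
        return 0;

    /* Refuse before recursing, so the C stack is never exhausted. */
    if (in->depth >= in->max_depth)
        return -1;

    in->depth++;
    rc = demo_nested_eval(in, levels - 1);
    in->depth--;             /* always unwind the counter, even on failure */
    return rc;
}
```

With `max_depth` set to 0 the first call already fails, mirroring the "blocks BEGIN entirely" behaviour the message mentions.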
*	locale.c: Save underlying radix character	Karl Williamson	2022-09-01	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \|	When changing locales the new decimal point needs to be calculated. This commit creates a new per-interpreter variable to save that calculation, so it only has to be done when a new locale is set; prior to this commit it was recalculated each time it was needed. The calculation is still performed twice when the new locale is switched into. But the redundant calculation will be removed in a couple of commits hence.
*	locale.c: Cache the current LC_CTYPE locale name	Karl Williamson	2022-08-31	1	-0/+1
\| \| \| \| \| \| \| \|	This is now used as a cache of length 1 to avoid having to look up the UTF-8ness as often. This commit also skips doing S_newctype() if the new boss is the same as the old.
*	Add a new env var PERL_RAND_SEED	Yves Orton	2022-08-12	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This env var can be used to trigger a repeatable run of a script which calls C<srand()> with no arguments, either explicitly or implicitly via use of C<rand()> prior to calling srand(). This is implemented in such a way that calling C<srand()> with no arguments in forks or subthreads (again explicitly or implicitly) will receive their own seed but the seeds they receive will be repeatable. This is intended for debugging and perl development performance testing, and for running the test suite consistently. It is documented that the exact seeds used to initialize the random state are unspecified, and that they may change between releases or even builds. The only guarantee provided is that the same perl executable will produce the same results twice all other things being equal. In practice and in core testing we do expect consistency, but adding the tightest set of restrictions on our commitments seemed sensible. The env var is ignored when perl is run setuid or setgid similarly to the C<PERL_INTERNAL_RAND_SEED> env var.
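The behaviour described above (take the seed from an environment variable for repeatable runs, but ignore it when running setuid or setgid) can be sketched in plain C. Everything named `demo_*` or `DEMO_*` is hypothetical, and the xorshift32 generator is a stand-in chosen for brevity; it is not the generator perl uses.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

#define DEMO_SEED_ENV "DEMO_RAND_SEED"   /* hypothetical variable name */

/* Choose a seed: honour the environment variable unless the process
 * is running with elevated (setuid/setgid) privileges. */
static uint32_t demo_pick_seed(void)
{
    const char *s = getenv(DEMO_SEED_ENV);
    int elevated = (getuid() != geteuid()) || (getgid() != getegid());
    uint32_t seed;

    if (s != NULL && *s != '\0' && !elevated)
        seed = (uint32_t)strtoul(s, NULL, 10);
    else
        seed = (uint32_t)time(NULL) ^ (uint32_t)getpid();

    return seed != 0 ? seed : 1u;   /* xorshift32 needs a nonzero state */
}

/* xorshift32: a tiny stand-in PRNG, used here only for illustration. */
static uint32_t demo_next(uint32_t *state)
{
    uint32_t x = *state;

    assert(x != 0);
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}
```

Two processes started with the same value in the variable then draw identical sequences from `demo_next()`, which is the property that makes debugging runs repeatable.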
*	locale.c: Add fcn to hide edge case undefined behavior	Karl Williamson	2022-08-09	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \|	The POSIX 2008 API has an edge case in that the result of most of the functions when called with a global (as opposed to a per-thread) locale is undefined. The duplocale() function is the exception which will create a per-thread locale containing the values copied from the global one. This commit just calls duplocale, if needed, and the caller need not concern itself with this possibility
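A minimal sketch of the guard this message describes: if the calling thread is still on the global locale, copy it with `duplocale()` and switch to the copy, so later POSIX 2008 calls operate on a real per-thread object. The function name `demo_ensure_thread_locale()` is hypothetical; the libc calls are standard POSIX 2008.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <locale.h>
#ifdef __APPLE__
#include <xlocale.h>
#endif

/* Hypothetical helper: guarantee the calling thread has its own
 * locale object. Returns it, or (locale_t)0 on failure. */
static locale_t demo_ensure_thread_locale(void)
{
    /* Passing (locale_t)0 queries the current locale without changing it. */
    locale_t cur = uselocale((locale_t)0);
    locale_t copy;

    if (cur != LC_GLOBAL_LOCALE)
        return cur;                      /* already per-thread */

    /* duplocale() is defined for LC_GLOBAL_LOCALE: it returns a new
     * object holding a copy of the global settings. */
    copy = duplocale(LC_GLOBAL_LOCALE);
    if (copy == (locale_t)0)
        return (locale_t)0;
    assert(copy != LC_GLOBAL_LOCALE);

    if (uselocale(copy) == (locale_t)0) {
        freelocale(copy);
        return (locale_t)0;
    }
    return copy;
}
```

The caller owns the returned object and would eventually switch away from it and call `freelocale()`.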
*	locale.c: Generalize stdize_locale()	Karl Williamson	2022-08-09	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This function is rewritten to handle LC_ALL, and to make it easier to add new checks. There is also a change, which I think is an improvement, in that everything starting with a \n is trimmed, instead of just a trailing \n. A couple of calls to stdize_locale() are removed, as they are redundant, because they are called only as a result of Perl_setlocale() being called, and that ends up calling stdize_locale always, early on. The call to savepv() is also moved in a couple cases to after the result is known to not be NULL. I originally had such a new check in mind, but it turned out that doing it here didn't solve the problem, so this commit has been amended (before ever being pushed) to not include that. chomped.
*	Make fc(), qr//i thread-safe on participating platforms	Karl Williamson	2022-06-11	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A long standing bug in Perl that has gone undetected is that the array is global that is created when changing locales and tells fc() and qr//i matching what the folds are in the new locale. What this means is that any program only has one set of fold definitions that apply to all threads within it, even if we claim that the locales are thread-safe on the given platform. One possibility for this going undetected so long is that no one is using locales on multi-threaded systems much. Another possibility is that modern UTF-8 locales have the same set of folds as any other one. It is a simple matter to make the fold array per-thread instead of per-process, and that solves the problem transparently to other code. I discovered this stress-testing locale handling under threads. That test will be added in a future commit. In order to keep from having a dTHX inside foldEQ_locale, it has to have a pTHX_ parameter. This means that the other functions that function pointer variables get assigned to point to have to have an identical signature, which means adding pTHX_ to functions that don't require it. The bodies of all these are known to the compiler, since they are all in inline.h or in the same .c file as where they are called. Hence the compiler can optimize out the unused parameter. Two calls of STR_WITH_LEN also have to be changed because of C preprocessor limitations; perhaps there is another way to do it that I'm unfamiliar with.
*	make PL_origenviron global	Tomasz Konojacki	2022-05-29	1	-1/+0
\| \| \| \| \|	The purpose of PL_origenviron is to preserve the earliest known value of environ, which is a global. All interpreters should share it.
*	Add a PL_prevailing_version interpreter var	Paul "LeoNerd" Evans	2022-02-13	1	-0/+1
\| \| \| \| \| \|	Save/restore PL_prevailing_version at SAVEHINTS time. Have PL_prevailing_version track the applied use VERSION currently in scope.
*	replace all instances of PERL_IMPLICIT_CONTEXT with MULTIPLICITY	Tomasz Konojacki	2021-06-09	1	-23/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Since the removal of PERL_OBJECT (acfe0abcedaf592fb4b9cb69ce3468308ae99d91) PERL_IMPLICIT_CONTEXT and MULTIPLICITY have been synonymous and they're being used interchangeably. To simplify the code, this commit replaces all instances of PERL_IMPLICIT_CONTEXT with MULTIPLICITY. PERL_IMPLICIT_CONTEXT will stay defined for compatibility with XS modules.
*	Fix broken PERL_MEM_LOG under threads	Karl Williamson	2020-12-19	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	This fixes GH #18341. There are problems with getenv() on threaded perls which can lead to incorrect results when compiled with PERL_MEM_LOG. Commit 0b83dfe6dd9b0bda197566adec923f16b9a693cd fixed this for some platforms, but as Tony Cook pointed out, there may be standards-compliant platforms that that didn't fix. The detailed comments outline the issues and (complicated) full solution.
*	Remove obsolete FCRYPT ifdefs and associated PL_cryptseen (#17624)	Richard Leach	2020-07-30	1	-1/+0
\| \| \|	Co-authored-by: Karl Williamson <khw@cpan.org>
*	Remove PERL_GLOBAL_STRUCT	Dagfinn Ilmari Mannsåker	2020-07-20	1	-129/+0
\| \| \| \| \| \| \| \|	This was originally added for MinGW, which no longer needs it, and only still used by Symbian, which is now removed. This also leaves perlapi.[ch] empty, but we keep the header for CPAN backwards compatibility.
*	Make PL_utf8_foldclosures interpreter level	Karl Williamson	2020-06-02	1	-2/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This resolves #17774. This ticket is because the fixes in GH #17154 failed to get every case, leaving this one outlier to be fixed by this commit. The text in https://github.com/Perl/perl5/issues/17154 gives extensive details as to the problem. But briefly, in an attempt to speed up interpreter cloning, I moved certain SVs from interpreter level to global level in e80a0113c4a8036dfb22aec44d0a9feb65d36fed (v5.27.11, March 2018). This was doable, we thought, because the content of these SVs is constant throughout the life of the program, so no need to copy them when cloning a new interpreter or thread. However when an interpreter exits, all its SVs get cleaned up, which caused these to become garbage in applications where another interpreter remains running. This circumstance is rare enough that the bug wasn't reported until September 2019, #17154. I made an initial attempt to fix the problem, and closed that ticket, but I overlooked one of the variables, which was reported in #17774, which this commit addresses. Effectively the behavior is reverted to the way it was before e80a0113c4a8036dfb22aec44d0a9feb65d36fed.
*	Add mutex for accessing ENV	Karl Williamson	2020-03-11	1	-0/+2
\|
*	optimize sort by inlining comparison functions	Tomasz Konojacki	2020-03-09	1	-1/+0
\| \| \| \| \| \| \| \|	This makes special-cased forms such as sort { $b <=> $a } even faster. Also, since this commit removes PL_sort_RealCmp, it fixes the issue with nested sort calls mentioned in gh #16129
*	Fixup POSIX::mbtowc, wctomb	Karl Williamson	2020-02-19	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This commit enhances these functions so that on threaded perls, they use mbrtowc and wcrtomb when available, making them thread safe. The substitution isn't completely transparent, as no effort is made to hide any differences in errno setting upon error. And there may be slight differences in edge case behavior on some platforms. This commit also changes the behaviors so that they take a scalar parameter instead of a char *, and this might be 'undef' or not be forceable into a valid PV. If not a PV, the functions initialize the shift state. Previously the shift state was always reinitialized with every call, which meant these could not work on locales with shift states. In addition, there were several issues in mbtowc and wctomb that this commit fixes. mbtowc and wctomb, when used, are now run with a semaphore. This avoids races if called at the same time in another thread. The returned wide character from mbtowc() could well have been garbage. The final parameter to mbtowc is now optional, as passing an SV allows us to determine the length without the need for an extra parameter. It is now used only to restrict the parsing of the string to shorter than the actual length. wctomb would segfault if the string parameter was shared or hadn't been pre-allocated with a string of sufficient length to hold the result.
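The difference the message relies on is that `mbtowc()` keeps its shift state in hidden static storage, while `mbrtowc()` takes an explicit `mbstate_t` that the caller owns. A small sketch of decoding with caller-owned state follows; `demo_decode()` is a hypothetical helper, not the POSIX.xs implementation.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: decode at most `max` wide characters from the
 * NUL-terminated multibyte string `s` into `out`.
 * Returns the number decoded, or (size_t)-1 on invalid or truncated input. */
static size_t demo_decode(const char *s, wchar_t *out, size_t max)
{
    mbstate_t st;
    size_t n = 0;
    size_t left;

    assert(s != NULL && out != NULL);

    /* A zeroed mbstate_t is the initial shift state. Because it lives on
     * this call's stack, concurrent callers cannot disturb each other. */
    memset(&st, 0, sizeof st);

    left = strlen(s);
    while (left > 0 && n < max) {
        size_t used = mbrtowc(&out[n], s, left, &st);

        if (used == (size_t)-1 || used == (size_t)-2)
            return (size_t)-1;      /* invalid or incomplete sequence */
        if (used == 0)
            break;                  /* decoded a NUL character */
        s += used;
        left -= used;
        n++;
    }
    return n;
}
```

Keeping the `mbstate_t` alive across calls, instead of zeroing it each time, is what allows stateful encodings to work; zeroing it corresponds to reinitializing the shift state.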
*	POSIX::mblen() Make thread-safe; allow shift state control	Karl Williamson	2020-02-19	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This commit changes the behavior so that it takes a scalar parameter instead of a char *, and thus might not be forceable into a valid PV. When not a PV, the shift state is reinitialized, like calling mblen with a NULL first parameter. Previously the shift state was always reinitialized with every call, which meant this could not work on locales with shift states. This commit also changes to use mbrlen() on threaded perls transparently (mostly), when available, to achieve thread-safe operation. It is not completely transparent because mbrlen (under the very rare stateful locales) returns a different value when it's resetting the shift state. It also may set errno differently upon errors, and no effort is made to hide that difference. Also mbrlen on some platforms can handle partial characters. [perl #133928] showed that someone was having trouble with shift states.
*	Revert "Move PL_check to the interp vars to fix threading issues"	Tony Cook	2019-12-16	1	-1/+2
\| \| \| \| \|	and the associated commits, at least until a way to make wrap_op_checker() work is available.
*	Move PL_check to the interp vars to fix threading issues	Stefan Seifert	2019-12-12	1	-2/+1
\| \| \| \|	Fixes issue #14816
*	Move regex global variables to interpreter level	Karl Williamson	2019-11-26	1	-62/+31
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is part of fixing gh #17154. This scenario from the ticket (https://github.com/Perl/perl5/issues/17154#issuecomment-558877358) shows why this fix is necessary: (1) main interpreter initializes PL_AboveLatin1 to an SV it owns; (2) loads threads::lite and creates a new thread/interpreter which initializes PL_AboveLatin1 to an SV owned by the new interpreter; (3) threads::lite child interpreter finishes, freeing all of its SVs, PL_AboveLatin1 is now invalid; (4) main interpreter uses a regexp that relies on PL_AboveLatin1, dies horribly. By making these interpreter level variables, this is avoided. There is extra copying, but it is just the SV headers, as the real data is kept as static C arrays.
*	add explicit 1-arg and 3-arg sig handler functions	David Mitchell	2019-11-18	1	-0/+6
\| \| \| \| \| \| \|	Currently, whether the OS-level signal handler function is declared as 1-arg or 3-arg depends on the configuration. Add explicit versions of these functions, principally so that POSIX.xs can call which version of the handler it wants regardless of configuration: see next commit.
*	Remove generation and use of NonFinalFold table	Karl Williamson	2019-11-16	1	-2/+0
\| \| \| \| \| \|	With the revamping done in cc288b7a2732c37504039083ebb98241954636be, the table of Unicode case folds that are more than a single character is no longer used, so there is no need to generate it or to have it available.
*	Remove swashes from core	Karl Williamson	2019-11-06	1	-5/+0
\| \| \| \|	Also references to the term.
*	intrpvar.h: Add variable for use in tr///	Karl Williamson	2019-11-06	1	-0/+1
\| \| \| \|	This is part of this branch of changes.
*	Rmv more deprecated characlassify/case change macros	Karl Williamson	2019-10-31	1	-1/+0
\| \| \| \|	These were missed by 059703b088f44d5665f67fba0b9d80cad89085fd.
*	Add hook for Unicode private use override	Karl Williamson	2019-03-07	1	-0/+2
\| \| \| \| \| \| \| \| \| \|	I am starting to write a Unicode::Private_Use module which will allow one to specify the Unicode properties of private use code points, thus making them actually useful. This commit adds a hook to regcomp.c to accommodate this module. The changes are pretty minimal. This way we don't have to wait another release cycle to get it out there. I don't want to document this interface, until it's proven.
*	fix thread issue with PERL_GLOBAL_STRUCT	David Mitchell	2019-02-19	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The MY_CXT subsystem allows per-thread pseudo-static data storage. Part of the implementation for this involves each XS module being assigned a unique index in its my_cxt_index static var when first loaded. Because PERL_GLOBAL_STRUCT bans any static vars, under those builds there is instead a table which maps the MY_CXT_KEY identifying string to index. Unfortunately, this table was allocated per-interpreter rather than globally, meaning if multiple threads tried to load the same XS module, crashes could ensue. This manifested itself in failures in ext/XS-APItest/t/keyword_plugin_threads.t The fix is relatively straightforward: allocate PL_my_cxt_keys globally rather than per-interpreter. Also record the size of this struct in a new var, PL_my_cxt_keys_size, rather than doing double duty on PL_my_cxt_size.
*	foo_cloexec() under PERL_GLOBAL_STRUCT_PRIVATE	David Mitchell	2019-02-19	1	-0/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fix the various Perl_PerlSock_dup2_cloexec() type functions so that t/porting/libperl.t passes under -DPERL_GLOBAL_STRUCT_PRIVATE builds. In these builds it is forbidden to have any static variables, but each of these functions (via convoluted macros) has a static var called 'strategy' which records, for each function, whether a run-time probe has been done to determine the best way of achieving close-exec functionality, and the result. Replace them all with 'global' vars: PL_strategy_dup2 etc. NB these vars aren't thread-safe but it doesn't really matter, as the worst that can happen is for a redundant probe or two to be done before a suitable "don't probe any more" value is written to the var and seen by all the threads.
*	Add global hash to handle \p{user-defined}	Karl Williamson	2019-02-14	1	-0/+4
\| \| \| \| \| \| \|	A global hash has to be specially handled. The keys can't be shared, and all the SVs stored into it must be in its thread. This commit adds the hash, and initialization, and macros for context change, but doesn't use them. The code to deal with this is entirely confined to regcomp.c.
*	Add mutex for dealing with qr/\p{user-defined}/	Karl Williamson	2019-02-14	1	-0/+2
\| \| \| \|	This will be used in future commits
*	Add variable for if the current UTF-8 locale is Turkic	Karl Williamson	2019-02-05	1	-0/+1
\| \| \| \|	It currently is always set false, until later in this series of commits.
*	regen/mk_invlists.pl: Create new inversion list	Karl Williamson	2019-02-05	1	-0/+2
\| \| \| \|	This will be used in a future commit.
*	Change name of PL_NonL1NonFinalFold	Karl Williamson	2018-12-25	1	-2/+2
\| \| \| \| \|	The inversion list this refers to now includes the Latin 1 range, so the name was misleading.
*	Change name of PL_utf8_foldable variable	Karl Williamson	2018-12-25	1	-2/+2
\| \| \| \| \| \|	This variable's name was out-of-date and misleading. It is the name of an inversion list that contains all the code points in the current version of Unicode that participate in any way in a /i type of fold.
*	regen/mk_invlists.pl: Add new table	Karl Williamson	2018-12-07	1	-0/+2
\| \| \| \| \| \| \|	This table contains all the code points that are in any multi-character fold (not the folded-from character, but what that character folds to). It will be used in a future commit.
*	Make global two interpreter variables	Karl Williamson	2018-07-14	1	-2/+4
\| \| \| \| \|	These variables are constant, once initialized, through the life of a program, so having them be per instance is a waste of time and space
*	regcomp.c: Simplify	Karl Williamson	2018-06-25	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \|	Under /a pattern matching, the matches of the [:posix:] classes are restricted to the ASCII range. Previously, in a time/space trade-off that favored space, we created the list of matching characters at pattern compilation time by ANDing the full-range Posix class with the set of ASCII characters. But now, the tables for just the ASCII-range classes are generated anyway, so there's no need to do that compilation-time intersection. This slightly simplifies the code.
*	Use compiled-in C structure for inverted case folds	Karl Williamson	2018-03-31	1	-1/+2
\| \| \| \| \| \| \| \| \| \|	This commit changes to use the C data structures generated by the previous commit to compute what characters fold to a given one. This is used to find out what things should match under /i. This now avoids the expensive start up cost of switching to perl utf8_heavy.pl, loading a file from disk, and constructing a hash from it.
*	Remove obsolete variables	Karl Williamson	2018-03-31	1	-1/+0
\| \| \| \| \|	These were for when some of the Posix character classes were implemented as swashes, which is no longer the case, so these can be removed.