path: root/perl.c
Commit message (Author, Age; Files, Lines)
* pp_i_modulo(): remove workaround for ancient glibc bug (Dagfinn Ilmari Mannsåker, 2020-02-05; 1 file, -22/+0)
| Old glibc versions had a buggy modulo implementation for 64-bit integers on 32-bit architectures. This was fixed in glibc 2.3, released in 2002 (the version check in the code is overly cautious).
|
| Removing the alternate PP function support is left for the next commit, in case we need to resurrect it in future.
* Bump copyright to 2020 in perl.c and README. (Nicolas R, 2020-01-02; 1 file, -2/+2)
| Check that porting/copyright.t passes when run with --now:
|
|     ../perl -I../lib porting/copyright.t --now
* Add memCHRs() macro and use it (Karl Williamson, 2019-12-18; 1 file, -1/+1)
| This replaces strchr("list", c) calls throughout the core. They don't work properly when 'c' is a NUL, returning the position of the terminating NUL in "list" instead of failure. This could lead to segfaults or even security issues.
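The pitfall described in this commit message can be reproduced in a few lines of C. The sketch below is illustrative only: `IN_LIST` and `strchr_finds` are hypothetical names invented for this example, not the core's `memCHRs()` definition, and the bounded `memchr()` search is simply one plausible way to exclude the terminating NUL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not the core macro): nonzero if c is one of the
 * bytes in the string literal `list`. The "" list "" trick forces a
 * literal, and sizeof(list) - 1 excludes the terminating NUL from the
 * search. */
#define IN_LIST(list, c) \
    (memchr("" list "", (c), sizeof(list) - 1) != NULL)

/* The behaviour being replaced: strchr() considers the terminating NUL
 * to be part of the string, so searching for '\0' always "succeeds". */
int strchr_finds(const char *list, int c)
{
    assert(list != NULL);
    return strchr(list, c) != NULL;
}
```

With these definitions, `strchr_finds("abc", '\0')` is true even though NUL is not a member of the list, whereas `IN_LIST("abc", '\0')` is false.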
* Revert "Move PL_check to the interp vars to fix threading issues" (Tony Cook, 2019-12-16; 1 file, -1/+0)
| ... and the associated commits, at least until a way to make wrap_op_checker() work is available.
* Move PL_check to the interp vars to fix threading issues (Stefan Seifert, 2019-12-12; 1 file, -0/+1)
| Fixes issue #14816
* Note that G_RETHROW is documented (Karl Williamson, 2019-12-11; 1 file, -0/+1)
| This is for Devel::PPPort.
* Move regex global variables to interpreter level (Karl Williamson, 2019-11-26; 1 file, -0/+69)
| This is part of fixing gh #17154.
|
| This scenario from the ticket (https://github.com/Perl/perl5/issues/17154#issuecomment-558877358) shows why this fix is necessary:
|
|     main interpreter initializes PL_AboveLatin1 to an SV it owns
|     loads threads::lite and creates a new thread/interpreter, which initializes PL_AboveLatin1 to an SV owned by the new interpreter
|     threads::lite child interpreter finishes, freeing all of its SVs; PL_AboveLatin1 is now invalid
|     main interpreter uses a regexp that relies on PL_AboveLatin1, dies horribly
|
| By making these interpreter-level variables, this is avoided. There is extra copying, but it is just the SV headers, as the real data is kept as static C arrays.
* add explicit 1-arg and 3-arg sig handler functions (David Mitchell, 2019-11-18; 1 file, -1/+4)
| Currently, whether the OS-level signal handler function is declared as 1-arg or 3-arg depends on the configuration. Add explicit versions of these functions, principally so that POSIX.xs can call whichever version of the handler it wants regardless of configuration: see next commit.
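For readers unfamiliar with the 1-arg versus 3-arg distinction, it corresponds to the two handler forms that POSIX `sigaction()` supports. The following is a minimal sketch using only standard POSIX calls; the function and variable names are made up for this example and are not the handler names added to perl.c.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Hypothetical bookkeeping so a caller can see which handler ran. */
volatile sig_atomic_t last_1arg_sig = 0;
volatile sig_atomic_t last_3arg_sig = 0;

/* Classic 1-arg form: only the signal number is passed. */
void handler_1arg(int sig)
{
    last_1arg_sig = sig;
}

/* 3-arg form, selected with SA_SIGINFO: also receives siginfo and context. */
void handler_3arg(int sig, siginfo_t *info, void *uctx)
{
    (void)info;
    (void)uctx;
    last_3arg_sig = sig;
}

/* Install the 1-arg handler for sig; returns 0 on success. */
int install_1arg(int sig)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = handler_1arg;
    sigemptyset(&sa.sa_mask);
    return sigaction(sig, &sa, NULL);
}

/* Install the 3-arg handler for sig; returns 0 on success. */
int install_3arg(int sig)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = handler_3arg;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(sig, &sa, NULL);
}
```

Having both forms available as distinct functions lets a caller choose one explicitly instead of relying on which form the build configuration selected.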
* Add -Dy debugging of tr///, y/// (Karl Williamson, 2019-11-17; 1 file, -1/+2)
|
* intrpvar.h: Add variable for use in tr/// (Karl Williamson, 2019-11-06; 1 file, -0/+3)
| This is part of this branch of changes.
* perl.c: Remove obsolete comment (Karl Williamson, 2019-10-31; 1 file, -1/+0)
|
* Note that G_METHOD[_NAMED] are documented (Karl Williamson, 2019-09-02; 1 file, -0/+3)
|
* Document my_exit() (Karl Williamson, 2019-09-02; 1 file, -0/+9)
|
* Revert "Revert "postpone perl_parse() exit(0) bugfix"" (Tony Cook, 2019-08-07; 1 file, -8/+19)
| This reverts commit 2773b4f50f991900e38d33daace2b9c6a0902c6a.
|
| I haven't made much progress in resolving the problems this produces downstream, so rather than leaving it broken, I'll revert it until they can be solved.
* Revert "postpone perl_parse() exit(0) bugfix" (Tony Cook, 2019-07-09; 1 file, -19/+8)
| This reverts commit 857320cbf85e762add18885ae8a197b5e0c21b69, re-instating the [perl #2754] fix, which was reverted in late 2017 to allow Module::Install based distributions to update or re-work per [perl #132577].
|
| # Conflicts:
| #     t/op/blocks.t
* (perl #134177) add G_RETHROW flag to eval_sv() (Tony Cook, 2019-07-08; 1 file, -9/+16)
| and update eval_pv() to use it.
* honour $PERL_DESTRUCT_LEVEL on non-debug builds (David Mitchell, 2019-06-25; 1 file, -4/+0)
| This environment variable was previously only checked for on DEBUGGING builds.
* The Windows CE Chainsaw Massacre (Steve Hay, 2019-06-18; 1 file, -6/+0)
| Remove WinCE support as agreed in the thread starting here:
| https://www.nntp.perl.org/group/perl.perl5.porters/2018/07/msg251683.html
* In Perl_eval_pv rethrow error via croak_sv() (Pali, 2019-06-05; 1 file, -2/+1)
| This allows object exceptions to be rethrown.
* Remove redundant info on =for apidoc lines (Karl Williamson, 2019-05-30; 1 file, -15/+15)
| This information is already in embed.fnc, and we know it compiles. Some of this information is now out-of-date. Get rid of it.
|
| There was one bit of information that was (apparently) wrong in embed.fnc. The apidoc line asked that there be no usage example generated for newXS. I added that flag to the embed.fnc entry.
* PATCH: [perl #133959] Free BSD broken tests (Karl Williamson, 2019-03-27; 1 file, -1/+5)
| Commit 70bd6bc82ba64c1d197d3ec823f43c4a454b2920 fixed a leak (likely due to a bug in glibc) by not duplicating the C locale object. However, that meant that there's only one copy running around. And freeing that will cause havoc, as it's supposed to be there until destruction.
|
| What appears to be happening is that the current locale object is freed upon thread destruction, and that could be this global one. But I don't understand why it's only happening on FreeBSD and only on this version.
|
| But this commit fixes the problem there, and makes sense. Simply don't free this global object upon thread destruction.
|
| This commit also changes it so it doesn't get destroyed at destruction time, leaving it to the final PERL_SYS_TERM to free. I'm not sure, but I think this fixes any issues with embedded perls.
* fix leak in BEGIN { threads->new(...) } (David Mitchell, 2019-03-25; 1 file, -0/+15)
| Normally by the time we reach perl_destruct(), PL_parser should be null due to having its original (null) value restored by SAVEt_PARSER during leaving scope (usually before run-time starts in fact).
|
| But if a thread is created within a BEGIN block, the parser is duped, but the SAVEt_PARSER savestack entry isn't. So PL_parser never gets cleaned up. Clean it up in perl_destruct() instead. This is a bit of a hack.
* Add, improve some debugging stmts for -DL (locales) (Karl Williamson, 2019-03-21; 1 file, -0/+5)
|
* Add mutex for dealing with qr/\p{user-defined}/ (Karl Williamson, 2019-02-14; 1 file, -0/+1)
| This will be used in future commits.
* Bump copyright to 2019 in perl.c and README. (Abigail, 2019-01-20; 1 file, -2/+2)
|
* Use same mixture of hard-tabs and spaces for indent (James E Keenan, 2019-01-05; 1 file, -1/+1)
| ... as in other ifdefs within S_Internals_V(pTHX_ CV *cv).
* Add USE_THREAD_SAFE_LOCALE to non-bin-compat options list (Karl Williamson, 2018-11-27; 1 file, -0/+3)
| Spotted by Tux
* perl.c: Silence compiler warning (Karl Williamson, 2018-10-07; 1 file, -2/+2)
| A space is needed in these formats to comply with C++11.
* perl.c: Use TAINT_get, instead of PL_tainting. (Karl Williamson, 2018-08-05; 1 file, -1/+1)
| The former is designed to be compilable out.
* Make global two interpreter variables (Karl Williamson, 2018-07-14; 1 file, -6/+0)
| These variables are constant, once initialized, through the life of a program, so having them be per instance is a waste of time and space.
* grok_atoUV: allow non-C strings and document (Karl Williamson, 2018-06-25; 1 file, -1/+2)
| This changes the internal function grok_atoUV() to not require its input to be NUL-terminated. That means the existing calls to it must be changed to set the ending position before calling it, as some did already.
|
| This function is recommended for use in a couple of pods, but it wasn't documented in perlintern. This commit does that as well.
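The idea of parsing digits from a buffer delimited by an end pointer, rather than by a NUL, can be sketched as follows. This is not grok_atoUV() and does not reproduce its signature or its exact acceptance rules; `parse_uint_span` is a hypothetical stand-in written only to illustrate the technique.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration (not grok_atoUV): parse the decimal digits in
 * [s, end) into *out. No NUL terminator is needed because the caller
 * supplies the end position. Returns 1 on success; returns 0 for an empty
 * span, any non-digit byte, or overflow of uint64_t. */
int parse_uint_span(const char *s, const char *end, uint64_t *out)
{
    uint64_t val = 0;

    assert(out != NULL);
    if (s == NULL || end == NULL || s >= end)
        return 0;

    for (; s < end; s++) {
        unsigned digit;

        if (*s < '0' || *s > '9')
            return 0;
        digit = (unsigned)(*s - '0');

        /* val * 10 + digit would exceed UINT64_MAX. */
        if (val > (UINT64_MAX - digit) / 10)
            return 0;
        val = val * 10 + digit;
    }

    *out = val;
    return 1;
}
```

A caller holding the bytes `1`, `2`, `3`, `x` with no terminator can pass an end pointer three bytes in and get 123 back, which is the kind of call a NUL-terminated-only interface cannot serve without copying.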
* revert perl_run() 0 -> 256 return mapping (David Mitchell, 2018-05-28; 1 file, -10/+3)
| RT #133220
|
| This commit partially reverts v5.27.6-180-g0301e89953.
|
| That commit changed the return values of perl_parse() and perl_run() so that an exit(0) wouldn't return 0 (which indicates a normal finish) and instead return 0x100, which indicates a non-normal return, but with a value which, if used as an 8-bit process exit value on UNIX, has the modulo value of 0.
|
| However, it turns out that perl_run() (via S_run_body()) does a my_exit(0) rather than just running to completion. So it turns out that it's not possible to distinguish between perl code finishing normally, and perl code doing exit(0).
|
| This broke code which embedded perl and expected perl_run() to return 0 on normal completion.
|
| It may be possible to fix this by getting S_run_body() to not call my_exit(0), but that's too unpredictable a change while we're at -RC1. So just revert the new perl_run() 0x100 behaviour for now.
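The arithmetic behind the 0x100 value is small enough to show directly. The sketch below only illustrates the mapping described in the message; the helper names are hypothetical and nothing here is taken from perl.c.

```c
#include <assert.h>

/* Hypothetical names for illustration only. */
#define EXIT_ZERO_SENTINEL 0x100   /* "exit(0) was called", per the message */

/* What an 8-bit UNIX process exit status would see for a return value. */
int exit_byte(int ret)
{
    return ret & 0xFF;
}

/* Under the scheme being reverted, only a literal 0 meant "ran to the end". */
int is_normal_finish(int ret)
{
    return ret == 0;
}
```

Both 0 and 0x100 produce an exit byte of 0, but only 0 passes `is_normal_finish()`. An embedder that tests the return value against 0 therefore treats 0x100 as a failure, which is the kind of breakage the message reports.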
* Move inversion lists to utf8.c (Karl Williamson, 2018-04-20; 1 file, -41/+1)
| These previously were statics in perl.c. A future commit would need access to these from regcomp.c. We could create an access function in perl.c so that regcomp.c could access them, or we could move them to regcomp.c. But doing that means also they would be statics in re_comp.c, and that would mean two copies.
|
| So that means an access function is needed. Their use is really unrelated to perl.c, which merely initializes them, so that could have an access function instead. But the most logical place for their home is utf8.c, which is described as for Unicode things, not just UTF-8 things.
|
| So this commit moves these inversion lists to utf8.c, and creates an initialization function called on perl startup from perl.c.
* Use compiled-in C structure for inverted case folds (Karl Williamson, 2018-03-31; 1 file, -2/+1)
| This commit changes to use the C data structures generated by the previous commit to compute what characters fold to a given one. This is used to find out what things should match under /i.
|
| This now avoids the expensive start up cost of switching to perl utf8_heavy.pl, loading a file from disk, and constructing a hash from it.
* Remove obsolete variables (Karl Williamson, 2018-03-31; 1 file, -4/+0)
| These were for when some of the Posix character classes were implemented as swashes, which is no longer the case, so these can be removed.
* Use charnames inversion lists (Karl Williamson, 2018-03-31; 1 file, -0/+2)
| This commit makes the inversion lists for parsing character names global instead of interpreter level, so they can be initialized once per process, and no copies are created upon new thread instantiation.
|
| More importantly, this is another instance where utf8_heavy.pl no longer needs to be loaded, and the definition files read from disk.
* Move init of 2 inversion lists to perl.c (Karl Williamson, 2018-03-31; 1 file, -0/+2)
| These read-only globals can be initialized in perl.c, which allows us to remove runtime checks that they are initialized. This commit also takes advantage of the fact that they are now always initialized to use them as inversion lists, avoiding swash creation.
* Move some inversion list init to perl.c (Karl Williamson, 2018-03-26; 1 file, -1/+8)
| The initialization time spent here is trivial, and this saves a copy of these arrays on some systems. This is because there is only one perl.c, and there is both regcomp.c and re_comp.c which would contain the identical static const array. Some OSes won't remove the duplicate copies.
* Move case change invlists from interpreter to global (Karl Williamson, 2018-03-26; 1 file, -10/+0)
| These are now constant through the life of the program, so don't need to be duplicated at each new thread instantiation.
* Move UTF-8 case changing data into core (Karl Williamson, 2018-03-26; 1 file, -0/+8)
| Prior to this commit, if a program wanted to compute the case-change of a character above 0xFF, the C code would switch to perl, loading lib/utf8_heavy.pl and then read another file from disk, and then create a hash. Future references would use the hash, but the start up cost is quite large. There are five case change types, uc, lc, tc, fc, and simple fc. Only the first encountered requires loading of utf8_heavy, but each required switching to utf8_heavy, and reading the appropriate file from disk.
|
| This commit changes these functions to use compiled-in C data structures (inversion maps) to represent the data. To look something up requires a binary search instead of a hash lookup.
|
| An individual hash lookup tends to be faster than a binary search, but the differences are small for small sizes. I did some benchmarking some years ago (commit message 87367d5f9dc9bbf7db1a6cf87820cea76571bf1a), and the results were that for fewer than 512 entries, the binary search was just as fast as a hash, if not actually faster. Now, I've done some more benchmarks on blead, using the tool benchmark.pl, which wasn't available back then. The results below indicate that the differences are minimal up through 2047 entries, which all Unicode properties are well within.
|
| A hash, PL_foldclosures, is still constructed at runtime for the case of regular expression /i matching, and this could be generated at Perl compile time, as a further enhancement for later. But reading a file from disk is no longer required to do this.
|
| ======================= benchmarking results =======================
|
| Key:
|     Ir   Instruction read
|     Dr   Data read
|     Dw   Data write
|     COND conditional branches
|     IND  indirect branches
|     _m   branch predict miss
|     _m1  level 1 cache miss
|     _mm  last cache (e.g. L3) miss
|     -    indeterminate percentage (e.g. 1/0)
|
| The numbers represent raw counts per loop iteration.
|
| "\x{10000}" =~ qr/\p{CWKCF}/"
|
|           swash  invlist  Ratio %
|           fetch   search
|          ------  -------  -------
|     Ir   2259.0   2264.0     99.8
|     Dr    665.0    664.0    100.2
|     Dw    406.0    404.0    100.5
|   COND    406.0    405.0    100.2
|    IND     17.0     15.0    113.3
| COND_m      8.0      8.0    100.0
|  IND_m      4.0      4.0    100.0
|  Ir_m1      8.9     17.0     52.4
|  Dr_m1      4.5      3.4    132.4
|  Dw_m1      1.9      1.2    158.3
|  Ir_mm      0.0      0.0    100.0
|  Dr_mm      0.0      0.0    100.0
|  Dw_mm      0.0      0.0    100.0
|
| These were constructed by using the file whose contents are below, which uses the property in Unicode that currently has the largest number of entries in its inversion list, > 1600. The test was run on blead -O2, no debugging, no threads. Then the cut-off boundary was changed from 512 to 2047 for when we use a hash vs an inversion list, and the test run again. This yields the difference between a hash fetch and an inversion list binary search.
|
| ===================== The benchmark file is below ===============
|
| no warnings 'once';
|
| my @benchmarks;
|
| push @benchmarks, 'swash' => {
|     desc  => '"\x{10000}" =~ qr/\p{CWKCF}/"',
|     setup => 'no warnings "once"; my $re = qr/\p{CWKCF}/; my $a = "\x{10000}";',
|     code  => '$a =~ $re;',
| };
|
| \@benchmarks;
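The binary search that the message compares against a hash fetch can be sketched on a generic inversion list: a sorted array in which even-indexed elements begin ranges that are in the set and odd-indexed elements begin ranges that are not. The code below is a minimal sketch under that assumption; the function names and the sample list are invented for illustration and are not the core's implementation or data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sample list (hypothetical data): ASCII letters, i.e. [A-Z] and [a-z].
 * Even indices start "in" ranges, odd indices start "out" ranges. */
const uint32_t example_alpha[] = { 'A', 'Z' + 1, 'a', 'z' + 1 };

/* Return the index of the last element <= cp, or -1 if cp is below
 * the first element. Runs in O(log len). */
ptrdiff_t invlist_search(const uint32_t *list, size_t len, uint32_t cp)
{
    size_t lo = 0;      /* search window is [lo, hi) */
    size_t hi = len;

    assert(list != NULL || len == 0);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (list[mid] <= cp)
            lo = mid + 1;
        else
            hi = mid;
    }
    return (ptrdiff_t)lo - 1;
}

/* cp is in the set exactly when it falls in a range begun at an even index. */
int invlist_contains(const uint32_t *list, size_t len, uint32_t cp)
{
    ptrdiff_t i = invlist_search(list, len, cp);
    return i >= 0 && (i % 2) == 0;
}
```

For a list of about 1600 entries, like the one benchmarked above, this loop needs at most 11 iterations, which is consistent with the small differences in the instruction counts reported.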
* Fix locale failures introduced 5 hours ago (Karl Williamson, 2018-03-19; 1 file, -6/+0)
| Commit 9fe4122e6defd7e9204ed6f2370d926d4c3b261b broke threaded builds because it changed to free a global variable upon thread exit (I had forgotten that it wasn't an interpreter variable).
|
| I do not know why this passed before pushing; others have had trouble reproducing it. But the same tests were failing for me now. The one difference is that I had been using clang with address sanitizer compiled in but turned off when I made that commit. Now I'm using g++.
|
| Spotted by Dave Mitchell
* perl.c: Free some locale stuff on exit (Karl Williamson, 2018-03-19; 1 file, -9/+22)
| This stops potential memory leaks when using POSIX 2008 locale handling, by freeing the current locale object and two special ones.
* Make Unicode data structures global (Karl Williamson, 2018-03-14; 1 file, -25/+0)
| These structures are read-only, use const C strings, and are truly global, so no need to have them be interpreter level. This saves duplicating and freeing them as threads come and go.
|
| In doing this, I noticed that not every one was properly being copied/deallocated, so this fixes some potential unreported bugs, and leaks.
* Don't create locale object unless threaded (Karl Williamson, 2018-03-12; 1 file, -1/+1)
| PL_C_locale_obj is now only created on threaded builds on systems with POSIX 2008. On unthreaded builds, we really should continue to use the old tried and true library calls.
* move init_i18nl10n(1) to after the ENTER in perl_construct (Yves Orton, 2018-02-26; 1 file, -1/+1)
| init_i18nl10n(1) uses SAVEFREEPV, before any ENTER is performed. Move it afterwards.
* Add thread-safe locale handling (Karl Williamson, 2018-02-18; 1 file, -1/+8)
| This (large) commit allows locales to be used in threaded perls on platforms that support it. This includes recent Windows and POSIX 2008 ones.
* Add Perl_setlocale() (Karl Williamson, 2018-02-18; 1 file, -0/+5)
| khw could not find any modules on CPAN that correctly use the C library function setlocale(). (The very few that do try, do not use it correctly, looking at the return value incorrectly, so they are broken.) This analysis does not include modules that call non-Perl libraries that may call setlocale(). And, a future commit will render the setlocale() function useless in some configurations on some platforms.
|
| So this commit adds Perl_setlocale(), for XS code to call, and which is always effective, but it should not be used to alter the locale except on platforms where the predefined variable ${^SAFE_LOCALES} evaluates to 1.
|
| This function is also what POSIX::setlocale() calls to do the real work.
* Use proper #define to see if need PLnumeric underlying_obj (Karl Williamson, 2018-02-18; 1 file, -10/+6)
| perl.h has a single #define, the combination of several others, which determines whether this object should be created.
* Avoid changing locale when finding radix char (Karl Williamson, 2018-01-30; 1 file, -0/+13)
| On systems that have the POSIX 2008 operations, including nl_langinfo_l(), this commit causes them to not have to actually change the locale when determining what the decimal point character is.
|
| The locale may have to change during the printing/reading of numbers, but eventually we can use sprintf_l(), if available, to avoid that too.
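The POSIX 2008 technique mentioned here, querying a locale object instead of switching the process locale, looks roughly like the sketch below. It uses only the standard `newlocale()`, `nl_langinfo_l()` and `freelocale()` calls; the wrapper name `radix_for_locale` and its buffer-copy convention are assumptions of this example, not code from perl.c. On some platforms (macOS, for instance) `locale_t` functions are declared in `<xlocale.h>` rather than `<locale.h>`.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: copy the decimal point string of the named locale
 * into buf without calling setlocale(), so the global locale is untouched.
 * Returns 0 on success, -1 if the locale is unavailable or buf is too small. */
int radix_for_locale(const char *name, char *buf, size_t buflen)
{
    locale_t loc;
    const char *radix;
    size_t n;

    assert(name != NULL && buf != NULL);

    loc = newlocale(LC_NUMERIC_MASK, name, (locale_t)0);
    if (loc == (locale_t)0)
        return -1;

    /* Query the locale object directly. */
    radix = nl_langinfo_l(RADIXCHAR, loc);
    n = strlen(radix);
    if (n + 1 > buflen) {
        freelocale(loc);
        return -1;
    }

    /* Copy before freeing: the returned pointer may not outlive loc. */
    memcpy(buf, radix, n + 1);
    freelocale(loc);
    return 0;
}
```

Calling `radix_for_locale("C", buf, sizeof buf)` yields "." and leaves whatever locale the program had set in effect.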
* perl.c: Move initialization of inversion lists (Karl Williamson, 2018-01-30; 1 file, -22/+23)
| This is now done very early in the file, as it may be needed for initializing the locale handling.