path: root/ext
Commit message | Author | Age | Files | Lines
* Remove my_strftime8() | Karl Williamson | 2023-05-01 | 2 | -2/+2
  The comment from Tony Cook https://github.com/Perl/perl5/issues/20373#issuecomment-1524256091 made me realize that this function doesn't fully work. It was added as public API earlier in the 5.37 series, but we don't want it making it into a stable release. This commit renames it so that the original name will no longer work, but POSIX.xs can still use it by switching to the new name.
* ext/SDBM_File/ - replace "define\t" with "define " | Yves Orton | 2023-04-29 | 2 | -2/+2
  "#define\t" is annoying as it is 8 spaces wide, so it looks like "#define ", yet will not be found in a grep for "define foo" as the space is actually a tab.
* ext/File-Glob/ - replace "define\t" with "define " | Yves Orton | 2023-04-29 | 3 | -60/+60
  "#define\t" is annoying as it is 8 spaces wide, so it looks like "#define ", yet will not be found in a grep for "define foo" as the space is actually a tab.
* test infra - Under -DNO_TAINT_SUPPORT skip tests that use -T or -t | Yves Orton | 2023-04-02 | 1 | -3/+6
  This patch uses a collection of heuristics to skip test files which would die on a perl compiled with -DNO_TAINT_SUPPORT but without -DSILENT_NO_TAINT_SUPPORT.

  -DNO_TAINT_SUPPORT disables taint support in a "safe" way, such that if you try to use taint mode with the -t or -T options an exception will be thrown informing you that the perl you are using does not support taint. (The related setting -DSILENT_NO_TAINT_SUPPORT disables taint support but causes the -t and -T options to be silently ignored.)

  The error from using -t and -T is thrown very early in the process startup and there is no way to "gracefully" handle it and convert it into something else, for instance to skip a test file which contains it.

  This patch generally fixes our code to skip these tests.

  * Make t/TEST and t/harness check shebang lines and use filename checks to filter out tests that use -t or -T. Primarily this is the result of checking their shebang line, but some cpan/ files are excluded by name, either from a very short list of exclusions, or because their file name contains the word "taint". Non-cpan test files were fixed individually as noted below.
  * test.pl - make run_multiple_progs() skip test cases based on the switches that are part of the test definition. This function is used in a great many of our internal tests, so it fixes a lot of tests in one go.
  * XS-APItest/t/call.t, t/run/switchDX.t, lib/B/Deparse.t - Skip a small set of tests in each file.
* scope.c - add mortal_destructor_sv() and mortal_svfunc_x() | Yves Orton | 2023-03-18 | 3 | -1/+54
  The function SAVEDESTRUCTOR_X() (save_destructor_x) can be used to execute a C function at the end of the current pseudo-block. Prior to this patch there was no "mortal" equivalent that would execute at the end of the current statement. We offer a collection of functions which are intended to free SVs at either point in time, but only support callbacks at the end of the current pseudo-block.

  This patch adds two such functions, "mortal_destructor_sv" which can be used to trigger a perl code reference to execute at the end of the current statement, and "mortal_svfunc_x" which can be used to trigger an SVFUNC_t C function at the end of the current statement.

  Both functions differ from save_destructor_x() in that instead of supporting a void pointer argument they both require their argument to be some sort of SV pointer. The Perl callback function triggered by "mortal_destructor_sv" may be provided no arguments, a single argument or a list of arguments, depending on the type of argument provided to mortal_destructor_sv(): when the argument is a raw AV (with no SV ref wrapping it), then the contents of the AV are passed in as a list of arguments. When the argument is anything else but NULL, the argument is provided as a single argument, and when it is NULL the perl function is called with no arguments.

  Both functions are implemented on top of a mortal SV (unseen by the user) which has PERL_MAGIC_destruct magic associated with it, which triggers the destructor behavior when the SV is freed.

  Both functions are provided with macros to match the normal SAVExx() API, with MORTALDESTRUCTOR_SV() wrapping mortal_destructor_sv() and MORTALSVFUNC_X() wrapping mortal_svfunc_x().

  The heart of this logic was cribbed from Leon Timmermans' Variable-OnDestruct. See the code at: https://metacpan.org/dist/Variable-OnDestruct/source/lib/Variable/OnDestruct.xs#L6-17

  I am very grateful to him for his help on this. Any errors or omissions in this code are my fault, not his.
* Fix my_strftime() upper space limit | Karl Williamson | 2023-03-04 | 1 | -10/+28
  The comments said that a 100:1 expansion factor had long been sufficient. But it turns out that was wrong; there are locales with a higher ratio, that we just didn't notice were failing. This commit adds comments and ups the ratio to 2000:1.
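The idea of bounding the output buffer by a ratio of the format length can be sketched in standalone C. This is only an illustration under stated assumptions, not the code in my_strftime(): the helper name `strftime_grow` and its `max_ratio` parameter are hypothetical, and the growth policy (doubling) is chosen for brevity.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper (not Perl's my_strftime()): format `tm` with `fmt`,
 * growing the buffer until strftime() succeeds or the buffer has reached
 * max_ratio times the format length.  Returns a malloc'd string that the
 * caller frees, or NULL on failure. */
char *strftime_grow(const char *fmt, const struct tm *tm, size_t max_ratio)
{
    size_t fmtlen, limit, size;
    char *buf = NULL;

    assert(fmt != NULL && tm != NULL && max_ratio > 0);

    fmtlen = strlen(fmt);
    if (fmtlen == 0)
        return calloc(1, 1);            /* empty format, empty result */

    limit = (fmtlen + 1) * max_ratio;   /* upper bound on the buffer size */
    size = fmtlen + 64;                 /* first guess */
    if (size > limit)
        size = limit;

    for (;;) {
        char *grown = realloc(buf, size);
        if (grown == NULL) {
            free(buf);
            return NULL;
        }
        buf = grown;

        /* strftime() returns 0 when the result does not fit.  It also
         * returns 0 for a legitimately empty result (for example "%p" in
         * a locale without AM/PM strings); this sketch treats that case
         * as a failure once the limit is reached. */
        if (strftime(buf, size, fmt, tm) > 0)
            return buf;

        if (size >= limit)
            break;
        size = (size * 2 > limit) ? limit : size * 2;
    }

    free(buf);
    return NULL;
}
```

A caller would pass the ratio it is willing to tolerate, for example `strftime_grow("%A %B", &tm, 2000)`; a ratio that is too small shows up as a NULL return rather than a truncated string.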
* POSIX.pod: Remove obsolete C89 reference | Karl Williamson | 2023-03-01 | 1 | -1/+1
* POSIX.xs: Silence compiler warning | Karl Williamson | 2023-03-01 | 2 | -6/+3
  This happens only in the unlikely event that localeconv() isn't present on the system. This file uses two different ways to announce that a system facility is missing. This commit switches to the one not previously used here, and the warning goes away.
* ext/XS-APItest/t/magic.t: simplify recent test | David Mitchell | 2023-02-28 | 1 | -8/+13
  A test recently added to check reference count of a stored AV element had to account for extra reference counts from the temporary refs generated by

      if (\$a[0] == \$j) { ... }

  Instead, calculate this boolean value in a separate statement so the ref counts are easier to understand.
* XS::APItest::test_EXTEND(): fixups | David Mitchell | 2023-02-28 | 1 | -6/+14
  This XS function is for testing the stack-extending EXTEND() macro. This commit fixes two issues.

  1) it uses a nested 'sp' variable declaration in addition to the one declared implicitly, which is confusing. Use a separate variable called new_sp instead. This changes the logic slightly, since the EXTEND() macro messes implicitly with sp, including updating it after a realloc. We have to do that manually now with new_sp.

  2) The test function NULLs a couple of items near the top of the (potentially just extended) stack. Where the extend size is zero, it could be NULLing out one or two of the passed arguments on the stack. At the moment these values aren't used any more and are discarded on return; but it will get messy once the stack becomes reference-counted, so only NULL addresses if they're above PL_stack_sp.
* APITest.xs - silence build warning under 32 bit clang | Yves Orton | 2023-02-26 | 2 | -7/+7
  Silence build warning with 32 bit build under clang. Also bump version.

      APItest.xs:7614:24: warning: format specifies type 'unsigned int' but the argument has type 'U32' (aka 'unsigned long') [-Wformat]
              i, hash32, vectors_32[i]);
                 ^~~~~~
      ../../fakesdio.h:64:50: note: expanded from macro 'printf'
      #define printf(fmt,args...) PerlIO_stdoutf(fmt,##args)
                                                 ~~~   ^~~~
      APItest.xs:7614:32: warning: format specifies type 'unsigned int' but the argument has type 'U32' (aka 'unsigned long') [-Wformat]
              i, hash32, vectors_32[i]);
                         ^~~~~~~~~~~~~
      ../../fakesdio.h:64:50: note: expanded from macro 'printf'
      #define printf(fmt,args...) PerlIO_stdoutf(fmt,##args)
                                                 ~~~   ^~~~
      APItest.xs:7835:24: warning: format specifies type 'unsigned int' but the argument has type 'U32' (aka 'unsigned long') [-Wformat]
              i, hash32, vectors_32[i]);
                 ^~~~~~
      ../../fakesdio.h:64:50: note: expanded from macro 'printf'
      #define printf(fmt,args...) PerlIO_stdoutf(fmt,##args)
                                                 ~~~   ^~~~
      APItest.xs:7835:32: warning: format specifies type 'unsigned int' but the argument has type 'U32' (aka 'unsigned long') [-Wformat]
              i, hash32, vectors_32[i]);
                         ^~~~~~~~~~~~~
      ../../fakesdio.h:64:50: note: expanded from macro 'printf'
      #define printf(fmt,args...) PerlIO_stdoutf(fmt,##args)
                                                 ~~~   ^~~~
      4 warnings generated.
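The warning above arises because the format string promises an `unsigned int` while the 32-bit type is an `unsigned long` on that platform. The log does not say which remedy the commit chose; the sketch below shows one common, portable approach in plain C, using the `<inttypes.h>` format macros so the conversion always matches the argument type. The function name `format_hash32` is hypothetical and `uint32_t` stands in for Perl's `U32`.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write "<index>: <hash in hex>" into buf.
 * PRIx32 expands to the correct conversion for uint32_t whether that
 * type is 'unsigned int' or 'unsigned long' on the target platform.
 * Returns what snprintf() returns (the length that was needed). */
int format_hash32(char *buf, size_t size, int index, uint32_t hash32)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%d: %08" PRIx32, index, hash32);
}
```

The alternative is to keep a fixed conversion such as `%lx` and cast the argument to `unsigned long` at the call site; either way the format and the argument type have to agree.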
* mro.xs - silence maybe-uninitialized warning on gcc 12 | Yves Orton | 2023-02-19 | 2 | -10/+4
  Silence the following bogus warning:

      mro.xs:561:25: warning: ‘fq_subname_len’ may be used uninitialized [-Wmaybe-uninitialized]
        561 | subname_len = fq_subname_len - (subname - fq_subname);

  The code does not need to be structured the way it was, and we actually don't need to define fq_subname_len at all. So restructure the code and remove it and make gcc-12 shut up.

  Fixes GH Issue #20816
* Accept field VAR = EXPR on field vars | Paul "LeoNerd" Evans | 2023-02-10 | 1 | -1/+1
  Allows non-constant expressions with side effects. Evaluated during the constructor of each instance.
* Initial attack at pod/perlclass.pod | Paul "LeoNerd" Evans | 2023-02-10 | 1 | -4/+5
* Create a specific SV type for object instances | Paul "LeoNerd" Evans | 2023-02-10 | 2 | -1/+3
* Initial attack at basic 'class' feature | Paul "LeoNerd" Evans | 2023-02-10 | 1 | -1/+3
  Adds a new experimental warning, feature, keywords and enough parsing to implement basic classes with an empty `new` constructor method.

  * Inject a $self lexical into method bodies; populate it with the object instance, suitably shifted
  * Creates a new OP_METHSTART opcode to perform method setup
  * Define an aux flag to remark which stashes are classes
  * Basic implementation of fields.
  * Basic anonymous methods.
* File-Find - set up tempdir early and test that we can chdir to it. | Yves Orton | 2023-02-09 | 4 | -35/+93
  If for some reason we die very early in the test script the cleanup() function would get called before we had set up $test_root_dir or $test_temp_dir. This then led to further errors being generated by trying to chdir into an undefined directory. This patch ensures that the various setup steps work correctly and, if they do not, that we have some clear diagnostics about it.
* pod/perlmroapi.pod - document linear MRO function return type. | Paul "LeoNerd" Evans | 2023-02-08 | 2 | -2/+3
  Small docs clarification to point out that the linear MRO function returns a list of strings.
* File-Find/t - rework tempdir creation and cleanup | Yves Orton | 2023-01-31 | 2 | -25/+26
  Fixes #20734 - in a previous patch I missed the early cleanup call and the fact that it could result in a race condition. This hopefully resolves the problem.

  These test files are pretty crufty. It would be nice to see them split apart so that the "sanity" checks which expect to be run in t/ are executed in a separate test file from the checks which build a tree to traverse for testing. A perfect task for a new contributor.
* dump.c - dump new regexp fields properly | Yves Orton | 2023-01-23 | 2 | -78/+150
  Show the pointer values and their contents. Also show the "MOTHER_RE" at the *end* of the dump, as otherwise it can be quite hard to read.

  This patch also includes stripping out the versioned test adjustments for regexp related dumps. Devel-Peek is in ext/ so it won't be used on an older perl and we can just make it correct for the latest state.

  The test for the dump of a branch reset pattern also implicitly tests whether the branch reset pointer table logic is working correctly. In the process of writing this patch I discovered there was an off by one error. See 8111bf2fc3870f8146bb46652b66bd517e82b4dd for the fix.
* Export S_ISLNK and S_ISSOCK from POSIX.pm | David Cantrell | 2023-01-19 | 3 | -6/+6
  They are already available in Fcntl.pm, from which POSIX.pm already gets all the other S_IS* macros, so this just adds them to the list of imports (from Fcntl into POSIX) and exports (from POSIX).
* regexec.c - fix memory leak in EVAL. | Yves Orton | 2023-01-15 | 2 | -0/+46
  EVAL was calling regcppush twice per invocation, once before executing the callback and once after, but was not calling regcppop twice. So each time we would accumulate an extra "frame" of data. This is/was hidden somewhat by the way we eventually "blow" the stack, so the extra data was just thrown away at the end.

  This removes the second set of pushes so that the save stack stays a stable size as it unwinds from each failed eval.

  We also weren't cleaning up after a (?{...}) when we failed to match to its right. This unwinds the stack and restores the parens properly.

  This adds tests to check how the save stack grows during patterns using (?{ ... }) and (??{ ... }) and ensure that when we backtrack and re-execute the EVAL again it cleans up the stack as it goes.
* POSIX.pod: Clarify mbtowc(), wctomb() pod | Karl Williamson | 2023-01-12 | 1 | -6/+8
* File::Find: handle \ in symbolic links on Win32 | Tony Cook | 2023-01-10 | 1 | -1/+4
* regcomp.pl - fixup intflags debug data to handle gaps properly | Yves Orton | 2023-01-09 | 4 | -2/+43
  We were not handling gaps in the sequence properly, and effectively showing the wrong flag names or missing the last flag. Now we die if there are any collisions or if any of the PREGf defines set more than one bit. This also adds some crude tests to validate that intflags serialization is working properly.

  Note, extflags handles more complex scenarios and seems to handle this gracefully already, hence the reason I haven't touched it as well.

  This also tweaks a comment in lexical_debug.t which part of this was cribbed from.
* av.c - av_store() do the refcount dance around magic av's | Yves Orton | 2023-01-09 | 2 | -0/+87
  The API for av_store() says it is the caller's responsibility to call SvREFCNT_inc() on the stored item after the store is successful. However inside of av_store() we store the item in the C level array before we trigger magic. To a certain extent this is required because mg_set(av) needs to be able to see the newly stored item.

  But if the mg_set() or other magic associated with the av_store() operation fails, we would end up with a double free situation, as we will long jump up the stack above and out of the call to av_store(), freeing the mortal as we go (via Perl_croak()), but leaving the reference to the now freed pointer in the array. When the next SV is allocated the reference will be reused, and then we are in a double free scenario.

  I see comments in pp_aassign talking about defusing the temps stack for the parameters it is passing in, and things like this, which at first looked related. But that commentary doesn't seem that relevant to me, as this bug could happen any time a scalar owned by one data structure was copied into an array with set magic which could die.

  E.g., I can easily imagine XS code that expected code like this (assume it handles magic properly) to work:

      SV **svp = av_fetch(av1,0,1);
      if (av_store(av2,0,*svp))
          SvREFCNT_inc(*svp);

  but if av2 has set magic and it dies the end result will be that both av1 and av2 contain a visible reference to *svp, but its refcount will be 1. So I think this is a bug regardless of what pp_aassign does.

  This fixes https://github.com/Perl/perl5/issues/20675
* Fix mojibake in POSIX::strftime() | Karl Williamson | 2023-01-05 | 1 | -3/+0
  Some platforms require LC_CTYPE and LC_TIME to be the same. This toggles LC_CTYPE if necessary.
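Toggling LC_CTYPE around a strftime() call can be sketched in standard C as follows. This is a minimal illustration of the general technique, not the code in POSIX.xs: the names `copy_string` and `strftime_ctype_synced` are hypothetical, and the sketch relies on process-wide setlocale(), so it is not thread-safe.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Duplicate a string; setlocale() results may be overwritten by later
 * setlocale() calls, so they have to be copied before switching. */
static char *copy_string(const char *s)
{
    char *copy = malloc(strlen(s) + 1);
    if (copy != NULL)
        strcpy(copy, s);
    return copy;
}

/* Hypothetical wrapper: call strftime() with LC_CTYPE temporarily set to
 * the same locale as LC_TIME, then restore the previous LC_CTYPE. */
size_t strftime_ctype_synced(char *buf, size_t size,
                             const char *fmt, const struct tm *tm)
{
    char *time_locale = NULL;
    char *saved_ctype = NULL;
    const char *cur;
    size_t len;

    assert(buf != NULL && fmt != NULL && tm != NULL);

    cur = setlocale(LC_TIME, NULL);          /* query only */
    if (cur != NULL)
        time_locale = copy_string(cur);

    cur = setlocale(LC_CTYPE, NULL);         /* query only */
    if (time_locale != NULL && cur != NULL && strcmp(cur, time_locale) != 0) {
        saved_ctype = copy_string(cur);
        if (saved_ctype != NULL && setlocale(LC_CTYPE, time_locale) == NULL) {
            /* The switch failed, so there is nothing to restore. */
            free(saved_ctype);
            saved_ctype = NULL;
        }
    }

    len = strftime(buf, size, fmt, tm);

    if (saved_ctype != NULL) {
        setlocale(LC_CTYPE, saved_ctype);    /* put LC_CTYPE back */
        free(saved_ctype);
    }
    free(time_locale);
    return len;
}
```

When the two categories already agree, the wrapper does no switching at all, which keeps the common case cheap.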
* perlapi,POSIX.pod: Note problematic behavior with strftime() | Karl Williamson | 2023-01-05 | 1 | -0/+3
  Mojibake can currently occur unless care is taken.
* File-Find/t - use File::Temp to create a private test directory to prevent race conditions | Yves Orton | 2023-01-04 | 3 | -4/+18
  The tests have been reported as not being parallel safe. Running the tests twice at the same time reveals they don't use a process private directory, so in theory they could race with something else.
* File-Find/t/taint.t - do not use rel2abs after we have forced to Unix mode | Yves Orton | 2023-01-04 | 1 | -22/+13
  taint.t corrupts @INC by using rel2abs twice on its values, the second time after we have forced File::Spec to think we are on a *nix style platform. This breaks use statements that occur after the second rel2abs. Comparing with File-Find/t/find.t, which does something similar, we can see that find.t applies rel2abs to @INC only once. Presumably the second call was a rebase error of some sort.

  This reorganizes the use statements and ensures that we only use rel2abs() to change @INC prior to overriding it to use *nix semantics.
* XS-APItest/t/locale.t - deal with indented values properly | Yves Orton | 2022-12-30 | 1 | -1/+1
  The old code used a regex that would split on exactly one space, so if the data was changed to have more than one then it would get absorbed into the name that was parsed out of the header file, leading the code to test for things like "FOO ", which of course don't exist. Likely this could have caused other issues too, but the defines in practice are single symbols.
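The parsing pitfall described above, assuming exactly one separator character, is easy to reproduce in any language. The sketch below is a standalone C illustration (the test file itself is Perl, and `parse_define_name` is a hypothetical helper) that extracts the macro name from a `#define` line while tolerating runs of spaces or tabs on either side of the name.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy the macro name from a line of the form
 * "#define<whitespace>NAME<whitespace>..." into out.  Any amount of
 * whitespace is accepted, so the name never picks up stray padding.
 * Names longer than outsize - 1 are truncated.
 * Returns 1 if a name was found, 0 otherwise. */
int parse_define_name(const char *line, char *out, size_t outsize)
{
    static const char keyword[] = "#define";
    const size_t klen = sizeof keyword - 1;
    const char *p;
    size_t n = 0;

    assert(line != NULL && out != NULL && outsize > 0);

    if (strncmp(line, keyword, klen) != 0)
        return 0;
    p = line + klen;

    /* At least one whitespace character must follow the keyword. */
    if (!isspace((unsigned char) *p))
        return 0;
    while (isspace((unsigned char) *p))
        p++;

    /* Copy up to the next whitespace character or the end of the line. */
    while (*p != '\0' && !isspace((unsigned char) *p) && n + 1 < outsize)
        out[n++] = *p++;
    out[n] = '\0';

    return n > 0;
}
```

With this approach `"#define   FOO   1"` and `"#define\tFOO 1"` both yield `FOO`, with no trailing blank attached to the name.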
* POSIX::fmax - Correct the variable used in example | Andrew Ruthven | 2022-12-29 | 1 | -1/+1
  The variable in the example should be $max, not $min as we're finding the maximum value.

  Committer: Andrew Ruthven is now a Perl author.

  For: https://github.com/Perl/perl5/pull/20652
* ext/FileCache/t/ - fixup parallelism issues in tests | Yves Orton | 2022-12-26 | 6 | -16/+16
  Different test files used the same files for their tests, causing conflicts and test failures in parallel mode.
* sv.c - add support for HvNAMEf and HvNAMEf_QUOTEDPREFIX formats | Yves Orton | 2022-12-22 | 3 | -2/+33
  They are similar to SVf and SVf_QUOTEDPREFIX but take an HV * argument and use HvNAME() and related macros to extract the string. This is helpful as it makes constructing error messages from a stash (HV *) easier. It is the caller's responsibility to ensure that the HV is actually a stash.
* gv.c - rename amagic_find() to amagic_applies() | Yves Orton | 2022-12-22 | 3 | -1/+74
  The API for amagic_find() didn't make as much sense as we thought at first. Most people will be using this as a predicate, and don't care about the returned CV. So, to simplify things until we can really think through the required API, this switches it to return a bool and renames it to amagic_applies(), as in "which amagic_applies to this sv".
* Address GH #20571 | Karl Williamson | 2022-12-20 | 2 | -1/+13
  The blamed commit, 04de022, exposed a bug in the module itself. I will submit a PR to fix it.

  But this ticket did tell me that there was a problem with that commit. It returned a C language value, CHAR_MAX, which doesn't really have a corresponding concept in Perl. Instead we use -1 to indicate that a positive-valued variable is in some abnormal state. This commit changes to do that, and documents the changes, which should have been done in 04de022.
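The CHAR_MAX convention comes from the C `struct lconv`: numeric members hold CHAR_MAX when the value is not available in the current locale. The mapping to -1 described above can be sketched in plain C as below. The function names are hypothetical and this is not the POSIX.xs code, only an illustration of the translation.

```c
#include <assert.h>
#include <limits.h>
#include <locale.h>

/* Hypothetical helper: translate a numeric struct lconv member into a
 * value with no C-specific sentinel.  CHAR_MAX ("not available in this
 * locale") becomes -1; anything else is passed through unchanged. */
int lconv_numeric_value(char raw)
{
    return raw == CHAR_MAX ? -1 : (int) raw;
}

/* Example use: the number of fractional digits for local monetary
 * formatting in the current locale, or -1 if the locale has none. */
int current_frac_digits(void)
{
    const struct lconv *lc = localeconv();
    assert(lc != NULL);
    return lconv_numeric_value(lc->frac_digits);
}
```

In the "C" locale the standard requires these members to be CHAR_MAX, so `current_frac_digits()` returns -1 there, while a locale with two monetary decimal places would return 2.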
* Fix broken API: sync_locale() | Karl Williamson | 2022-12-20 | 1 | -6/+2
  This fixes GH #20565. Lack of tests allowed sync_locale() to get broken until CPAN testing showed it so.

  Basically, I blew it in 9f5a615be674d7663d3b4719849baa1ba3027f5b. Most egregiously, I forgot to turn back on, when sync_locale() is executed, the toggling for locales whose radix character isn't a dot. And this needs a way to tell the other code that it needs to recompute things at this time, since our records don't reflect what happened before the sync.
* Add testing global locale switching; Todo #20565 | Karl Williamson | 2022-12-20 | 2 | -0/+60
  API switch_to_global_locale() and sync_locale() weren't tested because I hadn't figured out a way to do so, but @dk showed me the way in his reproducing case for GH #20565.
* API-test:locale.t: Look for a comma radix locale | Karl Williamson | 2022-12-20 | 1 | -5/+7
  Prior to this commit there was code (whose results were ignored) looking for a locale with a non-dot radix. This can result in a UTF-8 radix, which may not display properly without the terminal and file handle being in sync. Almost all non-dot locales use a comma, which is represented the same in UTF-8 as not, so doesn't suffer from the display problem. So look specifically for a comma. The result is still unused, but the next commit will use it.
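Searching for a comma-radix locale can be illustrated in standard C by probing candidate locale names and inspecting `localeconv()->decimal_point`. The sketch below is not the test's Perl code; `find_comma_radix_locale` and the candidate list passed to it are hypothetical, and whether any candidate exists depends on the locales installed on the system.

```c
#include <assert.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return the index of the first locale in `names`
 * that can be selected for LC_NUMERIC and whose radix character is a
 * plain comma, or -1 if none qualifies.  The original LC_NUMERIC
 * locale is restored before returning. */
int find_comma_radix_locale(const char *const *names, size_t count)
{
    const char *cur = setlocale(LC_NUMERIC, NULL);
    char *saved;
    size_t i;
    int found = -1;

    assert(names != NULL || count == 0);

    if (cur == NULL)
        return -1;
    saved = malloc(strlen(cur) + 1);    /* copy: later calls may clobber it */
    if (saved == NULL)
        return -1;
    strcpy(saved, cur);

    for (i = 0; i < count && found < 0; i++) {
        if (setlocale(LC_NUMERIC, names[i]) != NULL
            && strcmp(localeconv()->decimal_point, ",") == 0)
            found = (int) i;
    }

    setlocale(LC_NUMERIC, saved);
    free(saved);
    return found;
}
```

Comparing against the single byte "," rather than merely "not a dot" mirrors the reasoning in the commit message: a comma is encoded identically with or without UTF-8, so it displays correctly regardless of the terminal's encoding.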
* Define OP_HELEMEXISTSOR, a handy LOGOP shortcut for HELEM existence tests | Paul "LeoNerd" Evans | 2022-12-19 | 1 | -1/+2
  This op is constructed using an OP_HELEM as the op_first and any scalar expression as the op_other.

  It is roughly equivalent to the following perl code:

      exists $hv{$key} ? $hv{$key} : OTHER

  except that the HV and the KEY expression are evaluated only once, and only one hv_* function is invoked to both test and obtain the value. It is therefore smaller and more efficient.

  Likewise, adding the OPpHELEMEXISTSOR_DELETE flag turns it into the equivalent of

      exists $hv{$key} ? delete $hv{$key} : OTHER
* File/Glob.xs: Idempotent setting of PL_opfreehook (fixes GH#20615) | Paul "LeoNerd" Evans | 2022-12-17 | 2 | -3/+5
* regcomp.c - decompose into smaller files | Yves Orton | 2022-12-09 | 1 | -25/+44
  This splits a bunch of the subcomponents of the regex engine into smaller files.

      regcomp_debug.c
      regcomp_internal.h
      regcomp_invlist.c
      regcomp_study.c
      regcomp_trie.c

  The only real change, besides the build machinery changes needed to achieve the split, is the addition of some new defines which can be used in embed.fnc to control exports without having to enumerate /every/ regex engine file. For instance all of regcomp*.c defines PERL_IN_REGCOMP_ANY, and this is used in embed.fnc to manage exports.
* utf8_hop forwards Change continuation start behavior | Karl Williamson | 2022-12-07 | 1 | -3/+4
  Prior to this commit, when hopping forwards, and the initial position to hop from is a continuation byte, it treats it and each such successive one as a single character until it gets to a start byte, and switches into normal mode.

  In contrast, in hopping backwards, all the consecutive continuation bytes are considered to be part of a single character (as they indeed are).

  Thus there is a discrepancy between forward/backwards hopping; and the forward version seems wrong to me.

  This commit removes the discrepancy. There is no change in behavior if the starting position is to the beginning of a character. All calls in the core except for the API test are of this form.

  But, if the initial position is in the middle of a character, it now moves to the beginning of the next character, subtracting just 1 from the count of characters to hop (instead of subtracting however many continuation bytes there are). This is how I would have expected it to work all along.

  Succinctly, getting to the next character now consumes one hop count, no matter the direction nor which byte in the character is the starting position.
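The new forward-hopping rule can be sketched in standalone C. This is an illustration of the behavior described above, not Perl's utf8_hop implementation: the function name `hop_forward` is hypothetical, the input is assumed to be well-formed UTF-8, and the sketch simply stops at `end` rather than reporting an error.

```c
#include <assert.h>
#include <stddef.h>

/* A UTF-8 continuation byte has the bit pattern 10xxxxxx. */
static int is_continuation(unsigned char c)
{
    return (c & 0xC0) == 0x80;
}

/* Hypothetical sketch: advance `hops` characters from s, never moving
 * past end.  If s starts in the middle of a character, skipping the rest
 * of that character costs exactly one hop, whichever byte s points at. */
const unsigned char *hop_forward(const unsigned char *s,
                                 const unsigned char *end, size_t hops)
{
    assert(s != NULL && end != NULL && s <= end);

    /* Starting on a continuation byte: move to the next character start
     * and charge a single hop for the partial character. */
    if (hops > 0 && s < end && is_continuation(*s)) {
        while (s < end && is_continuation(*s))
            s++;
        hops--;
    }

    /* Normal mode: one start byte plus its continuation bytes per hop. */
    while (hops > 0 && s < end) {
        s++;
        while (s < end && is_continuation(*s))
            s++;
        hops--;
    }
    return s;
}
```

For the four bytes `'a', 0xC3, 0xA9, 'b'` (the text "aéb"), hopping one character from the second byte of "é" lands on `'b'`, the same place reached by hopping two characters from the start of the buffer.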
* APItest: Generalize hop test for EBCDIC | Karl Williamson | 2022-12-07 | 1 | -3/+2
  And stop skipping it
* locales: Add LC_NAME capabilities | Karl Williamson | 2022-12-06 | 3 | -3/+3
  LC_NAME is a GNU extension that Perl hadn't been aware of. The consequences were that it couldn't be set or queried in Perl (except by using LC_ALL to set everything).

  There are other GNU extensions that Perl has long known about; this was the only missing one.

  The values associated with this category are retrievable by the glibc call nl_langinfo(3) in XS code. The standard-specified items are retrievable from pure Perl via I18N::Langinfo, but it doesn't know about any of the non-standard ones, including the ones for this category.
* pod2html: Group '*' and 'no*' switches in documentation | James E Keenan | 2022-12-05 | 1 | -41/+19
  As the next step toward unifying the documentation in bin/pod2html and lib/Pod/Html.pm, group the '*' and 'no*' command-line switches to the pod2html utility and then adjust the descriptions of the switches accordingly. There are 5 such instances (example: 'poderrors' and 'nopoderrors') already grouped in this manner in lib/Pod/Html.pm. This commit brings this grouping into bin/pod2html as well. No changes in functionality.

  Correct one editing error spotted by Tony Cook.

  For: https://github.com/Perl/perl5/pull/20581
* POSIX::localeconv: Return empty/special values | Karl Williamson | 2022-11-30 | 1 | -6/+3
  This function returns a hash allowing Perl access to the localeconv() data structure, with the keys being the structure's field names, and the values being their corresponding value in the current locale.

  Prior to this commit, it did not populate the hash with:
  1) any string-valued keys whose value is the empty string
  2) any numeric-valued keys whose value is the special value CHAR_MAX

  This is wrong. localeconv() should return a complete list of fields on the platform, regardless of their values. Someone may well wish to iterate over all the keys in the hash.

  CHAR_MAX just indicates special handling is required for that numeric field. And all string fields legally can be empty, except for the decimal point. For example, the symbol indicating a number is positive is empty in many locales.

  I couldn't find a reason in the history why these have been omitted.
* pod2html: sort command-line switches | James E Keenan | 2022-11-30 | 1 | -78/+77
  The pod2html utility is the gateway to the function Pod::Html::pod2html() exported from lib/Pod/Html.pm. It is installed in addition to the perl executable in 'make install' and is often packaged separately in software distributions.

  Over time, differences have crept in to the way each file documents the command-line switches. These should be rectified. As a first step toward more maintainable documentation, let's have bin/pod2html present those switches in alphabetical order (albeit interleaving the 'no*' switches) as lib/Pod/Html.pm already does. This will permit us to more easily compare their respective documentation sections.
* perl.c - move PL_restartop assert out of perl_run() | Yves Orton | 2022-11-30 | 1 | -2/+23
  In dd66b1d793 we added an assert to perl_run() that PL_restartop should never be true when perl_run() is called after perl_parse(). Looked at from the point of the internals, which calls perl_parse() and perl_run() exactly once, this made sense.

  It turns out however that there is at least one XS module out there that expects to be able to set PL_restartop and then call perl_run(). If that works out for them then we shouldn't block it, as we aren't really trying to say "perl_run() should never be called with PL_restartop set" (at least this assert wasn't trying to say that really), we are trying to assert "between the top level transition from perl_parse() to perl_run() we shouldn't leak any PL_restartop".

  One could argue the assert maybe should go at the end of perl_parse(), but I chose to put it in Miniperl.pm and thus into perlmain.c and miniperlmain.c as I am not certain that perl_parse() should never be called with PL_restartop set already, and putting it in the main code really does more closely reflect the intent of this assert anyway.

  This was reported as Blead Breaks CPAN Github Issue #20557.
* Fix POSIX::strxfrm() | Karl Williamson | 2022-11-29 | 3 | -22/+14
  This commit does two things. Most simply it extends strxfrm() to handle strings containing NUL characters. Previously the transformation stopped at the first NUL encountered.

  Second, it combines the implementation of this with the existing implementation used for the 'cmp' operator, eliminating existing discrepancies and preventing future ones.

  This function takes an SV containing a PV. The encoding of that PV is based on the LC_CTYPE locale. It really doesn't make sense to collate based off of the sequencing of a different locale, which prior to this commit it would do (but not for 'cmp') if the LC_COLLATE locale were different.

  As an example, consider the string:

      my $string = quotemeta join "", map { chr } (1..255);

  and with LC_CTYPE=8859-1 (Latin-1, used for several Western European languages), LC_COLLATE set to ja_JP.utf8. This doesn't make much sense, outside of specialty uses such as a lazy implementation of a Japanese/French dictionary, or for quoting snippets in one language in a document written in the other. ('lazy' because such text should really be changing locales to the language of the snippet currently being worked on.)

  Nevertheless Perl should do something as sensible as possible, and this commit changes POSIX::strxfrm() to use the method already in use by the code implementing 'cmp'.
Prior to this commit, POSIX::strxfrm($string) yielded on glibc 12.1: ^\3^\4^\5^\6^\a^\b^\t^\n^\13^\f^\r^\16^\17^\20^\21^\22^\23^\24^\25^\26^\27^\30^\31^\32^\e^\34^\35^\36^\37^ ^!^\"^#^\$^%^&^'^(^)^*^+^,^-^.^/^0^123456789:;^<^=^>^?^\@^A^BCDEFGHIJKLMNOPQRSTUVWXYZ[\\^]^^^_^`a^bcdefghijklmnopqrstuvwxyz{|^}^~^\177^\302\200^\302\201^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3^\3 These are effectively a sorting order, and it is not meant to be human understandable. But it is clear that most of the characters had the same weight of 3, so a libc sort would mark them as ties in sorting order. And after, ^\3^\4^\5^\6^\a^\b^\t^\n^\13^\f^\r^\16^\17^\20^\21^\22^\23^\24^\25^\26^\27^\30^\31^\32^\e^\34^\35^\36^\37^ 
^!^\"^#^\$^%^&^'^(^)^*^+^,^-^.^/^0^123456789:;^<^=^>^?^\@^A^BCDEFGHIJKLMNOPQRSTUVWXYZ[\\^]^^^_^`a^bcdefghijklmnopqrstuvwxyz{|^}^~^\177^\302\200^\302\201^\302\202^\302\203^\302\204^\302\205^\302\206^\302\207^\302\210^\302\211^\302\212^\302\213^\302\214^\302\215^\302\216^\302\217^\3\3^\3\3^\302\220^\302\221^\302\222^\302\223^\302\224^\302\225^\302\226^\302\227^\302\230^\302\231^\302\232^\302\233^\302\234^\302\235^\302\236^\302\237^\3\3^\341\257\211^\304\257^\304\260^\341\257\221^\3\3^\341\257\212^\304\266^\303\255^\341\257\216^\341\257\215^\3\3^\305\225^\3\3^\341\257\217^\341\257\203^\304\251^\304\234^\3\3^\3\3^\303\253^\3\3^\305\260^\3\3^\341\257\200^\3\3^\341\257\214^\3\3^\3\3^\3\3^\3\3^\341\257\213^\341\260\236^\341\260\235^\341\260\240^\341\260\246^\341\260\237^\341\260\245^\341\260\202^\341\260\252^\341\260\256^\341\260\255^\341\260\260^\341\260\257^\341\260\273^\341\260\272^\341\260\275^\341\260\274^\3\3^\341\261\213^\341\261\215^\341\261\214^\341\261\217^\341\261\223^\341\261\216^\304\235^\341\260\211^\341\261\236^\341\261\235^\341\261\240^\341\261\237^\341\261\255^\341\260\214^\341\260\232^\341\261\264^\341\261\263^\341\261\266^\341\261\274^\341\261\265^\341\261\273^\341\260\215^\341\262\200^\341\262\204^\341\262\203^\341\262\206^\341\262\205^\341\262\221^\341\262\220^\341\262\223^\341\262\222^\341\260\217^\341\262\240^\341\262\242^\341\262\241^\341\262\244^\341\262\250^\341\262\243^\304\236^\341\260\230^\341\262\263^\341\262\262^\341\262\265^\341\262\264^\341\263\202^\341\260\234^\341\263\203 which shows that most of the ties have been resolved, and hence the results are more sensible
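For background on why embedded NULs need special handling at all: the C library's strxfrm() operates on NUL-terminated strings, so anything after the first NUL is invisible to it. The sketch below shows the usual two-call idiom for plain strxfrm() in standard C. It is not Perl's implementation and does not reproduce the NUL handling added by the commit; the helper name `xfrm_dup` is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a malloc'd copy of the collation
 * transform of s under the current LC_COLLATE locale, or NULL on
 * failure.  Comparing two transforms with strcmp() orders them the same
 * way strcoll() orders the original strings. */
char *xfrm_dup(const char *s)
{
    size_t need, got;
    char *out;

    assert(s != NULL);

    /* First call: a zero-sized buffer asks only for the length needed. */
    need = strxfrm(NULL, s, 0) + 1;

    out = malloc(need);
    if (out == NULL)
        return NULL;

    /* Second call: perform the transformation for real. */
    got = strxfrm(out, s, need);
    if (got >= need) {      /* should not happen unless the locale changed */
        free(out);
        return NULL;
    }
    return out;
}
```

Because the input is a C string, `xfrm_dup("ab\0zz")` produces the same transform as `xfrm_dup("ab")`, which is the truncation behavior the commit message describes for the old POSIX::strxfrm().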