path: root/ext/XS-APItest
Commit message (Author, Age, Files, Lines)
* Revert "Remove some deprecated functions from mathoms.c" (Karl Williamson, 2018-07-19, 3 files, -5/+19)
| This reverts commit e6e9f5a198d7e054e6857a9c6e99a07d639f7f3c.
| I think it best to revert this until I'm ready to state in perldelta exactly the options for replacing uses of these functions.
* APItest: Add comprehensive UTF-8 validity tests (Karl Williamson, 2018-07-05, 1 file, -0/+269)
| These test all combinations of bytes that are at all likely to have any issues. They are run only when an environment variable is set to a particular obscure value, as they take a long time.
* Remove some deprecated functions from mathoms.c (Karl Williamson, 2018-06-28, 3 files, -19/+5)
| These have been deprecated since 5.18, and have security issues, as they can try to read beyond the end of the buffer.
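The over-read risk described above comes from decoding a character without knowing where the buffer ends. As a minimal sketch (assuming standard UTF-8 of at most 4 bytes, not Perl's extended form, and using hypothetical helper names that are not part of the Perl API), a caller that passes an end pointer can reject a truncated sequence before touching memory past the buffer:

```c
#include <assert.h>
#include <stddef.h>

/* Sequence length implied by a UTF-8 start byte (standard UTF-8, at most
 * 4 bytes); 0 for a byte that cannot start a sequence. */
static size_t utf8_expected_len(unsigned char first)
{
    if (first < 0x80) return 1;
    if (first >= 0xC2 && first <= 0xDF) return 2;
    if (first >= 0xE0 && first <= 0xEF) return 3;
    if (first >= 0xF0 && first <= 0xF4) return 4;
    return 0;
}

/* Hypothetical checker: returns the sequence length if the whole character
 * starting at s fits inside [s, end), otherwise 0. It only checks the
 * length; it does not validate the continuation bytes. */
static size_t utf8_char_fits(const unsigned char *s, const unsigned char *end)
{
    assert(s != NULL && end != NULL && s <= end);

    if (s == end)
        return 0;                        /* nothing to read */

    size_t need = utf8_expected_len(*s);
    if (need == 0 || (size_t)(end - s) < need)
        return 0;                        /* malformed start or truncated */
    return need;
}
```

A decoder without the `end` argument would have to trust the start byte and could read up to three bytes past a one-byte buffer.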
* grok_atoUV: allow non-C strings and document (Karl Williamson, 2018-06-25, 2 files, -2/+2)
| This changes the internal function grok_atoUV() to not require its input to be NUL-terminated. That means the existing calls to it must be changed to set the ending position before calling it, as some did already.
| This function is recommended for use in a couple of pods, but it wasn't documented in perlintern. This commit does that as well.
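To illustrate what "not requiring NUL termination" means for a number parser, here is a simplified sketch. `parse_decimal_range` is a hypothetical helper, not the real grok_atoUV(), and its acceptance rules are an assumption chosen for brevity; the point is only that the caller supplies the end position and the parser never reads `*end`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (NOT grok_atoUV): parses the decimal digits in
 * [start, end) without ever dereferencing end, so the input does not need
 * a trailing NUL. Returns false on empty input, a non-digit, or overflow. */
static bool parse_decimal_range(const char *start, const char *end,
                                uint64_t *out)
{
    assert(start != NULL && end != NULL && start <= end && out != NULL);

    if (start == end)
        return false;                    /* empty range */

    uint64_t value = 0;
    for (const char *p = start; p < end; p++) {
        if (*p < '0' || *p > '9')
            return false;                /* non-digit inside the range */
        unsigned digit = (unsigned)(*p - '0');
        if (value > (UINT64_MAX - digit) / 10)
            return false;                /* value * 10 + digit would overflow */
        value = value * 10 + digit;
    }
    *out = value;
    return true;
}
```

With this shape, a caller can parse a number embedded in a larger buffer simply by pointing `end` just past the last digit.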
* fix Mingw GCC C++ build errors PL_inf/PL_nan (Daniel Dragan, 2018-05-24, 2 files, -2/+2)
| Trying a USE_CPLUSPLUS=define build with dmake (USE_CPLUSPLUS is not implemented in GNUMakefile) causes the following error:
| ---------
| gcc -c -xc++ -I.\include -I. -I.. -DWIN32 -DPERLDLL -DPERL_CORE -s -O2 -fwrapv -fno-strict-aliasing -DPERL_EXTERNAL_GLOB -DPERL_IS_MINIPERL -omini\globals.o ..\globals.c
| In file included from ..\globals.c:32:0:
| ..\perl.h:6754:50: error: too many initializers for 'U8 [8] {aka unsigned char [8]}'
|  INFNAN_U8_NV_DECL PL_inf = { { LONGDBLINFBYTES } };
|  ^
| ..\perl.h:6790:50: error: too many initializers for 'U8 [8] {aka unsigned char [8]}'
|  INFNAN_U8_NV_DECL PL_nan = { { LONGDBLNANBYTES } };
|  ^
| dmake: Error code 129, while making 'mini\globals.o'
| ---------
| In plain C mode builds, this error was just a warning and nobody paid attention to it for a while:
| ---------
| gcc -c -I.\include -I. -I.. -DWIN32 -DPERLDLL -DPERL_CORE -s -O2 -fwrapv -fno-strict-aliasing -DPERL_EXTERNAL_GLOB -DPERL_IS_MINIPERL -omini\globals.o ..\globals.c
| In file included from ..\globals.c:32:0:
| ..\perl.h:5432:42: warning: excess elements in array initializer
|  #define INFNAN_U8_NV_DECL EXTCONST union { U8 u8[NVSIZE]; NV nv; }
|  ^
| ..\perl.h:6754:1: note: in expansion of macro 'INFNAN_U8_NV_DECL'
|  INFNAN_U8_NV_DECL PL_inf = { { LONGDBLINFBYTES } };
|  ^
| ..\perl.h:5432:42: warning: (near initialization for 'PL_inf.u8')
|  #define INFNAN_U8_NV_DECL EXTCONST union { U8 u8[NVSIZE]; NV nv; }
|  ^
| ..\perl.h:6754:1: note: in expansion of macro 'INFNAN_U8_NV_DECL'
|  INFNAN_U8_NV_DECL PL_inf = { { LONGDBLINFBYTES } };
|  ^
| ..\perl.h:5432:42: warning: excess elements in array initializer
|  #define INFNAN_U8_NV_DECL EXTCONST union { U8 u8[NVSIZE]; NV nv; }
|  ^
| -------------
| Now on a VC C++ build, LONGDBLINFBYTES is 8 bytes, as defined in the LONGDBLINFBYTES macro in config_H.vc, and for VC, "long double" is always typedefed by the CC to "double", so there was no warning. But on GCC, "long double" is the 80 bit/10 byte type, and in config_H.gc the 12 byte version of INF is inside the LONGDBLINFBYTES macro. Because the LONG_DOUBLESIZE define was previously "8" because of makefile logic regardless of CC,
|  # elif NVSIZE == LONG_DOUBLESIZE && defined(LONGDBLINFBYTES)
|  INFNAN_U8_NV_DECL PL_inf = { { LONGDBLINFBYTES } };
| was being hit on GCC, even though NVSIZE is 8 as it should be, but LONGDBLINFBYTES was 12. Hence the warning. I didn't research why this warning on GCC didn't cause test failures. Perhaps full perl recomputes the correct initializer in config_sh.PL and doesn't rely on what was in the miniperl initializer for PL_inf.
| To fix things, always emit the correct value for LONG_DOUBLESIZE and don't hardcode it at 8 for miniperl. 8 must stay for all VCs, and 12/16 is for GCC. Although GNUMakefile doesn't support a USE_CPLUSPLUS build option, it has provisions to support it one day. To keep things in sync, copy the miniperl config.h append changes from makefile.mk to GNUMakefile. Also collapse 2 shell cmd lines in "ifeq ($(USE_LONG_DOUBLE),define)" to reduce the number of proc launches of cmd.exe by the maketool (perf issue).
| Next C++ build issue:
| APItest.xs: In function 'void XS_XS__APItest__Backrefs_Comctl32Version(PerlInterpreter*, CV*)':
| APItest.xs:6806:37: error: cast from 'LPSTR {aka char*}' to 'WORD {aka short unsigned int}' loses precision [-fpermissive]
|  MAKEINTRESOURCE(VS_FILE_INFO));
|  ^
| dmake: Error code 129, while making 'APItest.o'
| VS_FILE_INFO is internally "RT_VERSION", which is MAKEINTRESOURCE(16). The output type of MAKEINTRESOURCE is a char *. GCC complains about casting that char * back down to a WORD (aka short). Put in a size_t used for pointer arithmetic to silence the error. Another option is to remove the outer MAKEINTRESOURCE in APItest.xs since RT_VERSION has MAKEINTRESOURCE internally, but that assumes implementation details of headers, so pick the option that depends less on header design.
* PATCH: [perl #133121] Fix crash in gv_fetchmeth_sv (Sergey Aleynikov, 2018-04-19, 1 file, -1/+8)
| S_gv_fetchmeth_internal supports its arguments being either an SV or a (name, len) pair. But when performing an ISA traversal to get a method from a parent class, it accounted only for the latter.
* fix SEGV in XS::APItest::Backrefs::Comctl32Version() (Daniel Dragan, 2018-04-17, 2 files, -2/+2)
| Really old Mingw GCCs (3.4.5 specifically) don't implement _alloca correctly, so switch to a simpler variation.
| This is a follow-on to perl #133084; see also the problems I had with alloca on very old GCCs in https://rt.cpan.org/Public/Bug/Display.html?id=80217
* Change enum names for new locale function parameters (Karl Williamson, 2018-03-12, 1 file, -2/+2)
| Earlier in the 5.27 series, I introduced Perl_langinfo which calls the system nl_langinfo() on platforms that have it, and emulates it otherwise. For each enum parameter 'foo', I made an equivalent parameter PERL_foo. I did this so that no conflicts would arise if any 'foo' were negative. This is less than ideal to have to rename the parameters. In looking further, I realized that perl has always excluded the possibility of negative values for 'foo', so my precaution is unnecessary. And before this new code is released is the time to fix up the interface.
* APItest/t/locale.t: Store hash return for readability (Karl Williamson, 2018-03-12, 1 file, -4/+4)
|
* APItest/t/locale.t: Sort some tests (Karl Williamson, 2018-03-12, 1 file, -6/+6)
| These now appear in the file in their logical order
* APItest/t/locale.t: Add detail to test names (Karl Williamson, 2018-02-25, 1 file, -4/+5)
| This detail can help in interpreting the results.
* APItest:locale.t Fix ALT_DIGITS test (Karl Williamson, 2018-02-22, 1 file, -1/+1)
| The value returned for ALT_DIGITS may vary by locale and platform. Change the test to only look for a result, as opposed to a particular result.
| This should fix https://rt.perl.org/Ticket/Display.html?id=132879 but I'll leave the ticket open until that is verified.
* Fix two bugs when calling &xsub when @_ has holes (Father Chrysostomos, 2018-02-18, 1 file, -0/+7)
| This fixes #132729 in the particular instance where an XSUB is called via ampersand syntax when @_ has ‘holes’, or nonexistent elements, as in:
|     @_ = ();
|     $_[1] = 1;
|     &xsub;
| This means that if the XSUB or something it calls unshifts @_, the first argument passed to the XSUB will now refer to $_[1], not $_[0]; i.e., as of this commit it is correctly shifted over. Previously, a ‘defelem’ was used, which is a magical scalar that remembers its index in the array, independent of whether the array was shifted.
| In addition, the old code failed to mortalize the defelem, so this commit fixes a memory leak with the new ‘non-elem’ mechanism (a specially-marked element stored in the array itself).
* APItest: Add U8* typemap, and use it (Karl Williamson, 2018-02-16, 2 files, -82/+83)
| This missing typemap has slowed me down on numerous occasions, as I keep forgetting it's missing
* APItest: Fix C++ compiles (Karl Williamson, 2018-02-09, 1 file, -4/+8)
| 0a9f8f95112ff40e13406b3e5aab49c01487f045 introduced failures on C++ compilations. This is a better patch, suggested by ilmari. The issue was in cases where the pointer size is 32 bits and the word size was 64, a (STRLEN) -1 returned as an error was getting turned into 0xFFFFFFFF instead of -1.
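The 0xFFFFFFFF symptom is ordinary zero-extension of an unsigned 32-bit value into a wider signed integer. The following sketch reproduces the effect with stand-in types (`len32_t` for a 32-bit STRLEN and `wide_int_t` for a 64-bit integer, both hypothetical names); it is an illustration of the conversion problem, not the patch that was actually applied.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins: a 32-bit unsigned length type and a 64-bit signed integer.
 * Both names are hypothetical and exist only for this sketch. */
typedef uint32_t len32_t;
typedef int64_t  wide_int_t;

static_assert(sizeof(wide_int_t) > sizeof(len32_t),
              "the sketch needs a wider destination type");

/* Plain widening zero-extends: (len32_t)-1 becomes 4294967295, so a check
 * for -1 on the wide value no longer fires. */
static wide_int_t widen_unsigned(len32_t len)
{
    return (wide_int_t)len;
}

/* Mapping the all-ones error sentinel explicitly keeps it at -1 after
 * widening; every other length is passed through unchanged. */
static wide_int_t widen_preserving_error(len32_t len)
{
    return len == (len32_t)-1 ? (wide_int_t)-1 : (wide_int_t)len;
}
```

The explicit sentinel test avoids relying on implementation-defined unsigned-to-signed conversion, at the cost of treating the largest length value as reserved.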
* APItest: Correct parameter sign (Karl Williamson, 2018-02-08, 1 file, -1/+1)
| Commit ae315a0a3c51e68887704d4907bb6a502a6d4e3f added tests for utf8_to_bytes(). For testing purposes, this parameter should be signed when it comes from perl space.
* Add uvchr_to_utf8_flags_msgs() (Karl Williamson, 2018-02-07, 2 files, -25/+98)
| This is prompted by Encode's needs. When called with the proper parameter, it returns any warnings instead of displaying them directly.
* APItest:t/utf8_warn_base.pl: Clarify some comments (Karl Williamson, 2018-02-07, 1 file, -5/+5)
|
* APItest:t/utf8_warn_base.pl: Move a variable outside sub() (Karl Williamson, 2018-02-07, 1 file, -11/+12)
| This is in preparation for a future commit which will want to refer to this variable independently.
* APItest:t/utf8_warn_base.pl; Fix 'ok' tests (Karl Williamson, 2018-02-07, 1 file, -3/+3)
| This was putting the condition for the ok in a string, which is always true, so the test always succeeded
* APItest: Add tests for utf8_to_bytes() (Karl Williamson, 2018-02-04, 2 files, -0/+86)
|
* APItest:t/utf8_setup.pl: Display printables as themselves (Karl Williamson, 2018-02-04, 1 file, -1/+5)
| Instead of the harder to read \xXX
* Perl_langinfo: Teach about YESSTR and NOSTR (Karl Williamson, 2018-01-30, 1 file, -0/+2)
| These are items that nl_langinfo() used to be required to return, but are considered obsolete. Nonetheless, this drop-in replacement for that function should know about them for backward compatibility.
* APItest/t/locale.t: Add some tests (Karl Williamson, 2018-01-30, 1 file, -19/+26)
| This makes sure that the entries for which the expected return value may legitimately vary from platform to platform get tested as returning something, skipping the test if the item isn't known on the platform.
| A couple of comments are also added.
* APItest/APItest.xs: Simplify mappings (Karl Williamson, 2018-01-30, 1 file, -7/+6)
| Instead of using SVs, use the underlying C type, so that the code here doesn't have to deal with the SV conversions
* APItest/t/utf8_warn_base.pl: White-space only (Karl Williamson, 2018-01-30, 1 file, -636/+636)
| This outdents a bunch of code to make it a shift width of 2 instead of 4 because the nesting was getting too deep, making the space available on a line too short.
* APItest/t/utf8_warn_base.pl: Improve diagnostics (Karl Williamson, 2018-01-30, 1 file, -6/+13)
|
* Add utf8n_to_uvchr_msgs() (Karl Williamson, 2018-01-30, 2 files, -34/+149)
| This UTF-8 to code point translator variant is to meet the needs of Encode, and provides XS authors with more general capability than the other decoders.
* Allow space for NUL in UTF-8 array decls (Karl Williamson, 2018-01-22, 2 files, -2/+2)
| In grepping the source, I noticed that several arrays that are for holding UTF-8 characters did not allow space for a trailing NUL. This commit adds that.
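The "+1 for the NUL" convention can be sketched as follows. This assumes standard UTF-8 with a 4-byte maximum (Perl's own maximum is larger), and both `EXAMPLE_UTF8_MAXBYTES` and `encode_utf8_nul` are hypothetical names used only for illustration. The buffer is declared one byte larger than the longest encoding so that the terminator always fits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: standard UTF-8, where a code point needs at most 4 bytes. */
#define EXAMPLE_UTF8_MAXBYTES 4

/* Hypothetical encoder: writes the UTF-8 form of cp followed by a NUL and
 * returns the number of bytes written, not counting the NUL. The caller's
 * array must hold EXAMPLE_UTF8_MAXBYTES + 1 bytes. Surrogates are not
 * rejected in this sketch. */
static size_t encode_utf8_nul(uint32_t cp,
                              unsigned char buf[EXAMPLE_UTF8_MAXBYTES + 1])
{
    assert(buf != NULL && cp <= 0x10FFFF);

    size_t n;
    if (cp < 0x80) {
        buf[0] = (unsigned char)cp;
        n = 1;
    } else if (cp < 0x800) {
        buf[0] = (unsigned char)(0xC0 | (cp >> 6));
        buf[1] = (unsigned char)(0x80 | (cp & 0x3F));
        n = 2;
    } else if (cp < 0x10000) {
        buf[0] = (unsigned char)(0xE0 | (cp >> 12));
        buf[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        buf[2] = (unsigned char)(0x80 | (cp & 0x3F));
        n = 3;
    } else {
        buf[0] = (unsigned char)(0xF0 | (cp >> 18));
        buf[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        buf[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        buf[3] = (unsigned char)(0x80 | (cp & 0x3F));
        n = 4;
    }
    buf[n] = '\0';   /* fits even for the longest encoding thanks to the +1 */
    return n;
}
```

Had the array been declared with only `EXAMPLE_UTF8_MAXBYTES` elements, the final store would overflow it for any 4-byte character.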
* Avoid some branches (Karl Williamson, 2018-01-15, 1 file, -1/+1)
| This replaces some looping with branchless code in two places: looking for the first UTF-8 variant byte in a string (which is used under several circumstances), and looking for an ASCII or non-ASCII character during pattern matching.
| Recent commits have changed these operations to do word-at-a-time lookups, essentially vectorizing the problem into 4 or 8 parallel probes. But when the word is found which contains the desired byte, until this commit, that word would be scanned byte-at-a-time in a loop. I found some bit hacks on the internet, which when stitched together, can find the first desired byte in the word without branching, while doing this while the word is still loaded, without having to load each byte.
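One way to locate the first high-bit byte inside an already-loaded word without a per-byte loop is sketched below. This is not necessarily the combination of bit hacks the commit uses; it is an assumed, portable variant for a 64-bit word in which byte 0 of the string is the least significant byte (`load_le64` arranges that regardless of host byte order). All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define HIGH_BITS UINT64_C(0x8080808080808080)
#define LOW_BITS  UINT64_C(0x0101010101010101)

/* Load 8 bytes so that s[0] ends up as the least significant byte,
 * whatever the host byte order. (This helper loops; only the search
 * below is branchless.) */
static uint64_t load_le64(const unsigned char *s)
{
    uint64_t w = 0;
    for (int i = 7; i >= 0; i--)
        w = (w << 8) | s[i];
    return w;
}

/* Index (0..7) of the first byte whose high bit is set.
 * Precondition: at least one byte in the word has its high bit set. */
static unsigned first_variant_in_word(uint64_t word)
{
    uint64_t marks = word & HIGH_BITS;       /* keep only the high bits */
    assert(marks != 0);

    uint64_t lowest = marks & (~marks + 1);  /* isolate lowest set bit:
                                                1 << (8k + 7) for byte k */
    uint64_t below = (lowest >> 7) - 1;      /* bytes 0..k-1 become 0xFF */

    /* Reduce each 0xFF byte to 0x01, then let the multiplication add
     * those k ones together in the top byte. */
    return (unsigned)(((below & LOW_BITS) * LOW_BITS) >> 56);
}
```

Apart from the debug-only `assert`, the search uses only masks, a subtraction, a multiplication and a shift, so the cost does not depend on where the byte sits in the word.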
* merge branch zefram/dumb_match (Zefram, 2017-12-17, 2 files, -19/+11)
|\
| * avoid gratuitous given/when in test (Zefram, 2017-11-28, 1 file, -14/+9)
| |
| * remove useless "default" mechanism (Zefram, 2017-11-28, 1 file, -3/+1)
| |
| * eviscerate smartmatch (Zefram, 2017-11-22, 1 file, -3/+2)
| | Regularise smartmatch's operand handling, by removing the implicit enreferencement and just supplying scalar context. Eviscerate its runtime behaviour, by removing all the matching rules other than rhs overloading. Overload smartmatching in the Regexp package to perform regexp matching.
| | There are consequential customisations to autodie, in two areas. Firstly, autodie::exception objects are matchers, but autodie has been advising smartmatching with the exception on the lhs. This has to change to the rhs, in both documentation and tests. Secondly, it uses smartmatching as part of its hint mechanism. Most of the hint examples, in documentation and tests, have to change to subroutines, to be portable across Perl versions.
| * regularise "when" (Zefram, 2017-11-21, 1 file, -2/+2)
| | Remove from "when" the implicit enreferencement of array/hash conditions and the implicit smartmatch of most conditions. Delete most of the documentation about behaviour of older versions of given/when, because explaining the now-old "when" behaviour would be excessively cumbersome and there's little compatibility to take advantage of. Delete the documentation about differences of given/when from the Perl 6 feature, because the differences are now even more extensive and it's too much difference to sensibly explain. Add tests of "when" in isolation.
* | Add variant_under_utf8_count() core function (Karl Williamson, 2017-12-11, 2 files, -0/+69)
| | This function takes a string that isn't encoded in UTF-8 (hence is assumed to be in Latin1), and counts how many of the bytes therein would change if it were to be translated into UTF-8. Each such byte would occupy two UTF-8 bytes.
| | This function is useful for calculating the expansion factor precisely when converting to UTF-8, so as to know how much to malloc.
| | This function uses a non-obvious method to do the calculations word-at-a-time, as opposed to the byte-at-a-time method used now, and hence should be much faster than the current methods.
| | The performance change in short string lengths is equivocal. Here is the result for a single character and a 64-bit word.
| |            bytes    words  Ratio %
| |         -------- -------- -------
| |     Ir     932.0    947.0    98.4
| |     Dr     325.0    325.0   100.0
| |     Dw     104.0    104.0   100.0
| |   COND     136.0    137.0    99.3
| |    IND      28.0     28.0   100.0
| | COND_m       1.0      0.0     Inf
| |  IND_m       6.0      6.0   100.0
| | There are some extra instructions executed and an extra branch to check for and handle the case where we can go word-by-word vs. not. But the one cache miss is removed.
| | The results are essentially the same until we get to being able to handle a full word. Some of the extra instructions are to ensure that if the input is not aligned on a word boundary, that performance doesn't suffer.
| | Here are the results for 8 bytes on a 64-bit system.
| |            bytes    words  Ratio %
| |         -------- -------- -------
| |     Ir     974.0    955.0   102.0
| |     Dr     332.0    325.0   102.2
| |     Dw     104.0    104.0   100.0
| |   COND     143.0    138.0   103.6
| |    IND      28.0     28.0   100.0
| | COND_m       1.0      0.0     Inf
| |  IND_m       6.0      6.0   100.0
| | Things keep improving as the strings get longer. Here's for 24 bytes.
| |            bytes    words  Ratio %
| |         -------- -------- -------
| |     Ir    1070.0    975.0   109.7
| |     Dr     348.0    327.0   106.4
| |     Dw     104.0    104.0   100.0
| |   COND     159.0    140.0   113.6
| |    IND      28.0     28.0   100.0
| | COND_m       2.0      0.0     Inf
| |  IND_m       6.0      6.0   100.0
| | And 96:
| |            bytes    words  Ratio %
| |         -------- -------- -------
| |     Ir    1502.0   1065.0   141.0
| |     Dr     420.0    336.0   125.0
| |     Dw     104.0    104.0   100.0
| |   COND     231.0    149.0   155.0
| |    IND      28.0     28.0   100.0
| | COND_m       2.0      1.0   200.0
| |  IND_m       6.0      6.0   100.0
| | And 10,000:
| |            bytes    words  Ratio %
| |         -------- -------- -------
| |     Ir   60926.0  13445.0   453.1
| |     Dr   10324.0   1574.0   655.9
| |     Dw     104.0    104.0   100.0
| |   COND   10135.0   1387.0   730.7
| |    IND      28.0     28.0   100.0
| | COND_m       2.0      1.0   200.0
| |  IND_m       6.0      6.0   100.0
| | I found this trick on the internet many years ago, but I can't seem to find it again to give them credit.
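A word-at-a-time count of this kind can be sketched as below. This is an assumed reconstruction of the general idea on an ASCII platform with 64-bit words, not the code from the commit: each byte's high bit is moved down to its low bit, and a single multiplication adds the eight resulting 0/1 values together. `count_variants` and `utf8_length_of_latin1` are hypothetical names, and `memcpy` is used for the loads instead of the alignment handling the message mentions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LOW_BITS UINT64_C(0x0101010101010101)

/* Counts the bytes in s[0..len) that have the high bit set, i.e. the
 * Latin1 bytes that would each become two bytes in UTF-8. */
static size_t count_variants(const unsigned char *s, size_t len)
{
    assert(s != NULL);

    size_t count = 0;
    size_t i = 0;

    /* Whole words: memcpy tolerates any alignment, and the sum does not
     * depend on byte order. */
    for (; i + sizeof(uint64_t) <= len; i += sizeof(uint64_t)) {
        uint64_t word;
        memcpy(&word, s + i, sizeof word);
        /* High bit of each byte -> low bit of that byte, then the
         * multiplication accumulates the eight bits in the top byte. */
        count += (size_t)((((word >> 7) & LOW_BITS) * LOW_BITS) >> 56);
    }

    /* Trailing bytes that do not fill a word. */
    for (; i < len; i++)
        count += s[i] >> 7;

    return count;
}

/* Exact size of the UTF-8 form: one extra byte per variant. */
static size_t utf8_length_of_latin1(const unsigned char *s, size_t len)
{
    return len + count_variants(s, len);
}
```

A caller converting Latin1 to UTF-8 could allocate `utf8_length_of_latin1(s, len) + 1` bytes up front rather than the worst-case `2 * len + 1`.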
* | is_utf8_invariant_string(): small speed optimization (Karl Williamson, 2017-12-11, 1 file, -1/+1)
| | This adds a few shifting, masking, and integer arithmetic operations to a conditional which in return makes sure that one branch is taken only when it is going to do some good, avoiding a conditional in it.
* | APItest/t/handy_base.pl: Avoid uninitialized warning (Karl Williamson, 2017-12-11, 1 file, -0/+1)
| | This .pl in /t is generally called from a test file in that directory, but if run by hand, this commit makes sure things are properly initialized
* | APItest.xs: shenanigans to avoid warnings (Father Chrysostomos, 2017-12-11, 1 file, -0/+6)
| | We have an unresolved issue that #include "fakesdio.h" causes one of the typemaps to make assignments between different pointer types, something we can’t fix straightforwardly with casts, since adding casts to the default typemap (which we are trying to test) may suppress real problems in production.
| | This is a temporary plaster till we figure out what to do.
* | Avoid newGVgen in default typemap (Father Chrysostomos, 2017-12-10, 1 file, -0/+14)
| | newGVgen leaks memory, because it vivifies a typeglob in the symbol table, without arranging for it to be deleted. A typemap is not an appropriate place to use it, since callers of newGVgen are responsible for seeing that the GV is freed, if they care.
| | This came up in #115814.
* | APItest/t/utf8.t: Simplify some tests (Karl Williamson, 2017-12-08, 2 files, -26/+64)
| | The complicated nested loops of tests this commit replaces don't need to be such. To test is_utf8_invariant_string, we just need to put a single variant in each position of a string that spans more than a full word (since we have full-word lookup now) and includes partial words on either side. We set those partial words up to be one byte each less than a full word.
| | The code needs to work on strings that don't start on a full word, and don't end on one, and this commit continues to do that.
* | APItest/t/utf8.t: Clarify a couple of test names (Karl Williamson, 2017-12-07, 1 file, -2/+2)
| |
* | put shadowing warnings in their own category (Zefram, 2017-12-06, 2 files, -2/+2)
| | | | | | | | As proposed in [perl #125330].
* | stop using &PL_sv_yes as no-op method (Zefram, 2017-12-05, 2 files, -2/+10)
| | Method lookup yields a fake method for ->import or ->unimport if there's no actual method, for historical reasons so that "use" doesn't barf if there's no import method. This fake method used to be &PL_sv_yes being used as a magic placeholder, recognised specially by pp_entersub. But &PL_sv_yes is a string, which we'd expect to serve as a symbolic CV ref. Change method lookup to yield an actual CV with a body in this case, and remove the special case from pp_entersub. This fixes the remaining part of [perl #126042].
* | APItest: Add ability to test API fcn utf8_length() (Karl Williamson, 2017-11-27, 1 file, -0/+7)
| |
* | APItest: Initialize parameter (Karl Williamson, 2017-11-27, 1 file, -1/+1)
| | | | | | | | This silences a compiler warning
* | Search for UTF-8 invariants by word (Karl Williamson, 2017-11-23, 3 files, -1/+38)
|/
| The functions is_utf8_invariant_string() and is_utf8_invariant_string_loc() are used in several places in the core and are part of the public API. This commit speeds them up significantly on ASCII (not EBCDIC) platforms, by changing to use word-at-a-time parsing instead of per-byte. (Per-byte is retained for any initial bytes to reach the next word boundary, and any final bytes that don't fill an entire word.)
| The following results were obtained parsing a long string on a 64-bit word machine:
|            byte    word
|          ------  ------
|     Ir   100.00  665.35
|     Dr   100.00  797.03
|     Dw   100.00  102.12
|   COND   100.00  799.27
|    IND   100.00   97.56
| COND_m   100.00  144.83
|  IND_m   100.00   75.00
|  Ir_m1   100.00  100.00
|  Dr_m1   100.00  100.02
|  Dw_m1   100.00  104.12
|  Ir_mm   100.00  100.00
|  Dr_mm   100.00  100.00
|  Dw_mm   100.00  100.00
| 100% is baseline; numbers larger than that are improvements. The COND measurement indicates, for example, that there are 1/8 as many conditional branches in the word-at-a-time version.
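The three-phase scan described above (leading bytes up to a word boundary, then whole words, then trailing bytes) can be sketched as follows. This is an illustrative version for ASCII platforms with a 64-bit word, under a hypothetical name; the loads go through `memcpy` to stay within C's aliasing rules, which is a choice made for this sketch rather than something the commit message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HIGH_BITS UINT64_C(0x8080808080808080)

/* Returns true if no byte in s[0..len) has its high bit set, i.e. the
 * string is the same whether or not it is encoded in UTF-8. */
static bool is_invariant_by_word(const unsigned char *s, size_t len)
{
    assert(s != NULL);

    const unsigned char *p = s;
    const unsigned char *end = s + len;

    /* 1. Per-byte until p reaches a word boundary. */
    while (p < end && ((uintptr_t)p % sizeof(uint64_t)) != 0) {
        if (*p & 0x80)
            return false;
        p++;
    }

    /* 2. Whole words: one test covers eight bytes. */
    while ((size_t)(end - p) >= sizeof(uint64_t)) {
        uint64_t word;
        memcpy(&word, p, sizeof word);
        if (word & HIGH_BITS)
            return false;
        p += sizeof word;
    }

    /* 3. Per-byte for the final bytes that do not fill a word. */
    while (p < end) {
        if (*p & 0x80)
            return false;
        p++;
    }
    return true;
}
```

A simple way to exercise all three phases is to start the string at an odd offset within a buffer and place a single high-bit byte at each position in turn, expecting `false` every time.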
* Suppress warning in XS-APItest’s sniscow.t (Father Chrysostomos, 2017-11-16, 1 file, -1/+1)
|
* test wrap_keyword_plugin (RT #132413) (Lukas Mai, 2017-11-11, 2 files, -2/+34)
|
* Replace multiple 'use vars' by 'our' in ext (Nicolas R, 2017-11-11, 2 files, -7/+5)
| Using the vars pragma is discouraged; it has been superseded by 'our' declarations, available in Perl v5.6.0 or later.
| This commit replaces uses of the 'vars' pragma with 'our' in the 'ext' directory.