| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
This reverts commit e6e9f5a198d7e054e6857a9c6e99a07d639f7f3c.
I think it best to revert this until I'm ready to state in perldelta
exactly the options for replacing uses of these functions.
|
|
|
|
|
|
| |
These tests cover all the combinations of bytes that are at all likely
to have any issues.
They are run only when an environment variable is set to a particular
obscure value, as they take a long time.
|
|
|
|
|
| |
These have been deprecated since 5.18, and have security issues, as they
can try to read beyond the end of the buffer.
|
|
|
|
|
|
|
|
|
|
| |
This changes the internal function grok_atoUV() to not require its input
to be NUL-terminated. That means the existing calls to it must be
changed to set the ending position before calling it, as some did
already.
A couple of pods recommend using this function, but it wasn't
documented in perlintern. This commit adds that documentation as well.
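The end-pointer convention described above can be illustrated outside the Perl core. The following is a minimal sketch of the pattern, where my_atouv is a hypothetical stand-in, not Perl's actual grok_atoUV: the caller supplies an explicit end position, so the input need not be NUL-terminated.

```c
#include <stdbool.h>
#include <stdint.h>

/* Parse an unsigned decimal value from [*s, send) without requiring
 * NUL termination.  On success, store the value in *valp and advance
 * *s past the digits consumed. */
static bool my_atouv(const char **s, const char *send, uint64_t *valp)
{
    const char *p = *s;
    uint64_t v = 0;

    if (p >= send || *p < '0' || *p > '9')
        return false;                       /* no leading digit */
    while (p < send && *p >= '0' && *p <= '9') {
        if (v > (UINT64_MAX - (uint64_t)(*p - '0')) / 10)
            return false;                   /* would overflow */
        v = v * 10 + (uint64_t)(*p - '0');
        p++;
    }
    *valp = v;
    *s = p;
    return true;
}
```

A caller that previously relied on a terminating NUL instead computes `send` from the known buffer length before calling.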
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Trying a USE_CPLUSPLUS=define build with dmake (USE_CPLUSPLUS not
implemented in GNUMakefile) causes the following error
---------
gcc -c -xc++ -I.\include -I. -I.. -DWIN32 -DPERLDLL -DPERL_CORE -s -O2 -fwrapv -
fno-strict-aliasing -DPERL_EXTERNAL_GLOB -DPERL_IS_MINIPERL -omini\globals.o ..
\globals.c
In file included from ..\globals.c:32:0:
..\perl.h:6754:50: error: too many initializers for 'U8 [8] {aka unsigned char [
8]}'
INFNAN_U8_NV_DECL PL_inf = { { LONGDBLINFBYTES } };
^
..\perl.h:6790:50: error: too many initializers for 'U8 [8] {aka unsigned char [
8]}'
INFNAN_U8_NV_DECL PL_nan = { { LONGDBLNANBYTES } };
^
dmake: Error code 129, while making 'mini\globals.o'
---------
In plain C builds this error was just a warning, and nobody paid
attention to it for a while
---------
gcc -c -I.\include -I. -I.. -DWIN32 -DPERLDLL -DPERL_CORE -s -O2 -fwrapv -fno-s
trict-aliasing -DPERL_EXTERNAL_GLOB -DPERL_IS_MINIPERL -omini\globals.o ..\glob
als.c
In file included from ..\globals.c:32:0:
..\perl.h:5432:42: warning: excess elements in array initializer
#define INFNAN_U8_NV_DECL EXTCONST union { U8 u8[NVSIZE]; NV nv; }
^
..\perl.h:6754:1: note: in expansion of macro 'INFNAN_U8_NV_DECL'
INFNAN_U8_NV_DECL PL_inf = { { LONGDBLINFBYTES } };
^
..\perl.h:5432:42: warning: (near initialization for 'PL_inf.u8')
#define INFNAN_U8_NV_DECL EXTCONST union { U8 u8[NVSIZE]; NV nv; }
^
..\perl.h:6754:1: note: in expansion of macro 'INFNAN_U8_NV_DECL'
INFNAN_U8_NV_DECL PL_inf = { { LONGDBLINFBYTES } };
^
..\perl.h:5432:42: warning: excess elements in array initializer
#define INFNAN_U8_NV_DECL EXTCONST union { U8 u8[NVSIZE]; NV nv; }
^
-------------
On a VC C++ build, LONGDBLINFBYTES is 8 bytes, as defined in
config_H.vc, and for VC, "long double" is always typedefed by the CC
to "double", so there was no warning. On GCC, however, "long double" is
the 80-bit/10-byte type, and config_H.gc puts the 12-byte version of
INF inside the LONGDBLINFBYTES macro. Because makefile logic previously
hard-coded the LONG_DOUBLESIZE define to "8" regardless of CC,
# elif NVSIZE == LONG_DOUBLESIZE && defined(LONGDBLINFBYTES)
INFNAN_U8_NV_DECL PL_inf = { { LONGDBLINFBYTES } };
was being hit on GCC, even though NVSIZE is 8 as it should be, but
LONGDBLINFBYTES was 12 bytes. Hence the warning. I didn't research why
this warning on GCC didn't cause test failures. Perhaps a full perl
recomputes the correct initializer in config_sh.PL and doesn't rely on
what was in the miniperl initializer for PL_inf.
To fix things, always emit the correct value for LONG_DOUBLESIZE and
don't hardcode it at 8 for miniperl. 8 must stay for all VCs, while
12/16 is for GCC. Although GNUMakefile doesn't support a USE_CPLUSPLUS
build option, it has provisions to support it one day. To keep things
in sync, copy the miniperl config.h append changes from makefile.mk to
GNUMakefile. Also collapse 2 shell cmd lines in
"ifeq ($(USE_LONG_DOUBLE),define)" to reduce the number of cmd.exe
process launches by the make tool (a perf issue).
Next C++ build issue.
APItest.xs: In function 'void XS_XS__APItest__Backrefs_Comctl32Version(Perl
Interpreter*, CV*)':
APItest.xs:6806:37: error: cast from 'LPSTR {aka char*}' to 'WORD {aka shor
t unsigned int}' loses precision [-fpermissive]
MAKEINTRESOURCE(VS_FILE_INFO));
^
dmake: Error code 129, while making 'APItest.o'
VS_FILE_INFO is internally "RT_VERSION", which is MAKEINTRESOURCE(16).
The output type of MAKEINTRESOURCE is a char *. GCC complains about
casting that char * back down to a WORD (aka short). Cast through
size_t, the type used for pointer arithmetic, to silence the error.
Another option is to remove the outer MAKEINTRESOURCE in APItest.xs,
since RT_VERSION has MAKEINTRESOURCE internally, but that assumes
implementation details of the headers, so pick the option with less
dependency on header design.
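The cast fix can be sketched without Windows headers. MY_MAKEINTRESOURCE and MY_RT_VERSION below are hypothetical stand-ins simulating the real macros, which encode a small integer resource ID as a char pointer:

```c
#include <stddef.h>

/* Stand-in for the Windows MAKEINTRESOURCE macro (simulated so this
 * sketch builds without <windows.h>): a small integer ID smuggled
 * through a char pointer. */
#define MY_MAKEINTRESOURCE(i) ((char *)(size_t)(i))
#define MY_RT_VERSION MY_MAKEINTRESOURCE(16)

/* A direct (unsigned short) cast of the pointer is what g++ rejects
 * as losing precision; widening to size_t first and then narrowing
 * expresses the intent and silences the diagnostic. */
static unsigned short resource_id(const char *p)
{
    return (unsigned short)(size_t)p;
}
```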
|
|
|
|
|
|
| |
S_gv_fetchmeth_internal supports its arguments being either an SV or
a (name, len) pair. But when performing an ISA traversal to get a
method from a parent class, it accounted only for the latter.
|
|
|
|
|
|
|
|
| |
Really old MinGW GCCs (3.4.5 specifically) don't implement _alloca
correctly; switch to a simpler variation.
A follow-on to perl #133084; see also the problems I had with alloca on
very old GCCs in https://rt.cpan.org/Public/Bug/Display.html?id=80217
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Earlier in the 5.27 series, I introduced Perl_langinfo which calls
the system nl_langinfo() on platforms that have it, and emulates it
otherwise. For each enum parameter 'foo', I made an equivalent
parameter PERL_foo. I did this so that no conflicts would arise if
any 'foo' were negative. Having to rename the parameters like this,
though, is less than ideal.
In looking further, I realized that perl has always excluded the
possibility of negative values for 'foo', so my precaution is
unnecessary. And before this new code is released is the time to fix up
the interface.
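For reference, here is a minimal POSIX sketch of the underlying nl_langinfo() interface that Perl_langinfo wraps. The item names (CODESET, RADIXCHAR) come straight from <langinfo.h>; this is plain POSIX usage, not Perl's emulation layer:

```c
#include <langinfo.h>
#include <locale.h>

/* Query the current locale's character-set name via nl_langinfo().
 * The items are small non-negative integer constants, which is why
 * no renamed PERL_-prefixed duplicates are actually needed. */
static const char *codeset_name(void)
{
    setlocale(LC_ALL, "C");          /* fix a known locale */
    return nl_langinfo(CODESET);
}
```

On platforms without nl_langinfo(), Perl_langinfo has to synthesize these values itself, which is where the emulation mentioned above comes in.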
|
| |
|
|
|
|
| |
These now appear in the file in their logical order
|
|
|
|
| |
This detail can help in interpreting the results.
|
|
|
|
|
|
|
|
|
| |
The value returned for ALT_DIGITS may vary by locale and platform.
Change the test to only look for a result, as opposed to a particular
result.
This should fix https://rt.perl.org/Ticket/Display.html?id=132879
but I'll leave the ticket open until that is verified.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes #132729 in the particular instance where an XSUB is
called via ampersand syntax when @_ has ‘holes’, or nonexistent
elements, as in:
@_ = ();
$_[1] = 1;
&xsub;
This means that if the XSUB or something it calls unshifts @_, the
first argument passed to the XSUB will now refer to $_[1], not $_[0];
i.e., as of this commit it is correctly shifted over. Previously, a
‘defelem’ was used, which is a magical scalar that remembers its index
in the array, independent of whether the array was shifted.
In addition, the old code failed to mortalize the defelem, so this
commit fixes a memory leak with the new ‘non-elem’ mechanism (a
specially-marked element stored in the array itself).
|
|
|
|
|
| |
This missing typemap has slowed me down on numerous occasions, as I
keep forgetting it's missing.
|
|
|
|
|
|
|
|
|
| |
0a9f8f95112ff40e13406b3e5aab49c01487f045 introduced failures on C++
compilations. This is a better patch, suggested by ilmari.
The issue was that in cases where the pointer size is 32 bits and the
word size is 64, a (STRLEN) -1 returned as an error was getting turned
into 0xFFFFFFFF instead of -1.
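The width mismatch can be demonstrated in isolation. My_STRLEN below is a hypothetical 32-bit stand-in for STRLEN on a platform whose wider integer type is 64 bits:

```c
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit STRLEN. */
typedef uint32_t My_STRLEN;

/* The conventional error sentinel: all bits set in the narrow type. */
static My_STRLEN fail_sentinel(void)
{
    return (My_STRLEN)-1;
}
```

Compared within the 32-bit type the sentinel matches -1 as expected, but once zero-extended into a 64-bit context it becomes 0xFFFFFFFF, which no longer equals (uint64_t)-1; that is the failure mode the commit message describes.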
|
|
|
|
|
|
| |
Commit ae315a0a3c51e68887704d4907bb6a502a6d4e3f added tests for
utf8_to_bytes(). For testing purposes, this parameter should be signed
when it comes from perl space.
|
|
|
|
|
| |
This is prompted by Encode's needs. When called with the proper
parameter, it returns any warnings instead of displaying them directly.
|
| |
|
|
|
|
|
| |
This is in preparation for a future commit which will want to refer to
this variable independently.
|
|
|
|
|
| |
This was putting the condition for the ok() in a string, which always
succeeds
|
| |
|
|
|
|
| |
Instead of the harder-to-read \xXX
|
|
|
|
|
|
| |
These are items that nl_langinfo() used to be required to return, but
are considered obsolete. Nonetheless, this drop-in replacement for that
function should know about them for backward compatibility.
|
|
|
|
|
|
|
|
| |
This makes sure that the entries for which the expected return value may
legitimately vary from platform to platform get tested as returning
something, skipping the test if the item isn't known on the platform.
A couple of comments are also added.
|
|
|
|
|
| |
Instead of using SVs, use the underlying C type, so the code here
doesn't have to deal with SV conversions
|
|
|
|
|
|
| |
This outdents a bunch of code to make it a shift width of 2 instead of 4
because the nesting was getting too deep, making the space available on
a line too short.
|
| |
|
|
|
|
|
|
| |
This UTF-8 to code point translator variant is to meet the needs of
Encode, and provides XS authors with more general capability than
the other decoders.
|
|
|
|
|
|
| |
In grepping the source, I noticed that several arrays intended to hold
UTF-8 characters did not allow space for a trailing NUL. This commit
adds that space.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This replaces some looping with branchless code in two places: looking
for the first UTF-8 variant byte in a string (which is used under
several circumstances), and looking for an ASCII or non-ASCII character
during pattern matching.
Recent commits have changed these operations to do word-at-a-time
lookups, essentially vectorizing the problem into 4 or 8 parallel
probes. But until this commit, when the word containing the desired
byte was found, that word would be scanned byte-at-a-time in a loop.
I found some bit hacks on the internet which, when stitched together,
can find the first desired byte in the word without branching, and do
so while the word is still loaded, without having to examine each byte
individually.
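The kind of bit hack involved can be sketched as follows. This is a simplified illustration, not Perl's actual code: it assumes little-endian byte order and GCC/Clang's __builtin_ctzll, and the caller must already know the word contains at least one variant byte (one with its high bit set, on ASCII platforms):

```c
#include <stdint.h>

/* Given a 64-bit word known to contain at least one byte >= 0x80,
 * return the index of the first such byte without a per-byte loop.
 * Little-endian: the lowest-addressed byte is the least significant. */
static unsigned first_variant_index(uint64_t word)
{
    /* keep only the high bit of each byte */
    uint64_t hi = word & UINT64_C(0x8080808080808080);
    /* the count of trailing zero bits, divided by 8, is the byte
     * index of the lowest flagged byte */
    return (unsigned)(__builtin_ctzll(hi) >> 3);
}
```

A big-endian build would use a leading-zero count instead; the point is that the answer comes out of a couple of ALU operations on the already-loaded word, with no branch per byte.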
|
|\ |
|
| | |
|
| | |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Regularise smartmatch's operand handling, by removing the implicit
enreferencement and just supplying scalar context. Eviscerate its runtime
behaviour, by removing all the matching rules other than rhs overloading.
Overload smartmatching in the Regexp package to perform regexp matching.
There are consequential customisations to autodie, in two areas. Firstly,
autodie::exception objects are matchers, but autodie has been advising
smartmatching with the exception on the lhs. This has to change to the
rhs, in both documentation and tests. Secondly, it uses smartmatching as
part of its hint mechanism. Most of the hint examples, in documentation
and tests, have to change to subroutines, to be portable across Perl
versions.
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Remove from "when" the implicit enreferencement of array/hash conditions
and the implicit smartmatch of most conditions. Delete most of the
documentation about behaviour of older versions of given/when, because
explaining the now-old "when" behaviour would be excessively cumbersome
and there's little compatibility to take advantage of. Delete the
documentation about differences of given/when from the Perl 6 feature,
because the differences are now even more extensive and it's too much
difference to sensibly explain. Add tests of "when" in isolation.
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This function takes a string that isn't encoded in UTF-8 (hence is
assumed to be in Latin1), and counts how many of the bytes therein
would change if it were to be translated into UTF-8. Each such byte
would occupy two UTF-8 bytes.
This function is useful for calculating the expansion factor precisely
when converting to UTF-8, so as to know how much to malloc.
This function uses a non-obvious method to do the calculations
word-at-a-time, as opposed to the previous byte-at-a-time approach, and
hence should be much faster.
The performance change in short string lengths is equivocal. Here is
the result for a single character and a 64-bit word.
bytes words Ratio %
-------- -------- -------
Ir 932.0 947.0 98.4
Dr 325.0 325.0 100.0
Dw 104.0 104.0 100.0
COND 136.0 137.0 99.3
IND 28.0 28.0 100.0
COND_m 1.0 0.0 Inf
IND_m 6.0 6.0 100.0
There are some extra instructions executed and an extra branch to check
for and handle the case where we can go word-by-word vs. not. But the
one cache miss is removed.
The results are essentially the same until we get to being able to
handle a full word. Some of the extra instructions are to ensure that
if the input is not aligned on a word boundary, that performance doesn't
suffer.
Here's the results for 8-bytes on a 64-bit system.
bytes words Ratio %
-------- -------- -------
Ir 974.0 955.0 102.0
Dr 332.0 325.0 102.2
Dw 104.0 104.0 100.0
COND 143.0 138.0 103.6
IND 28.0 28.0 100.0
COND_m 1.0 0.0 Inf
IND_m 6.0 6.0 100.0
Things keep improving as the strings get longer. Here's for 24 bytes.
bytes words Ratio %
-------- -------- -------
Ir 1070.0 975.0 109.7
Dr 348.0 327.0 106.4
Dw 104.0 104.0 100.0
COND 159.0 140.0 113.6
IND 28.0 28.0 100.0
COND_m 2.0 0.0 Inf
IND_m 6.0 6.0 100.0
And 96:
bytes words Ratio %
-------- -------- -------
Ir 1502.0 1065.0 141.0
Dr 420.0 336.0 125.0
Dw 104.0 104.0 100.0
COND 231.0 149.0 155.0
IND 28.0 28.0 100.0
COND_m 2.0 1.0 200.0
IND_m 6.0 6.0 100.0
And 10,000
bytes words Ratio %
-------- -------- -------
Ir 60926.0 13445.0 453.1
Dr 10324.0 1574.0 655.9
Dw 104.0 104.0 100.0
COND 10135.0 1387.0 730.7
IND 28.0 28.0 100.0
COND_m 2.0 1.0 200.0
IND_m 6.0 6.0 100.0
I found this trick on the internet many years ago, but I can't seem to
find it again to give its author credit.
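The core of the per-word calculation can be sketched like this. It is a simplified illustration, not Perl's actual function (which also handles unaligned head and tail bytes, as the commit message notes): in Latin-1, exactly the bytes >= 0x80 expand to two bytes in UTF-8, so the per-word count of bytes that would change is a popcount of the high bits. __builtin_popcountll is a GCC/Clang builtin:

```c
#include <stdint.h>

/* Count how many bytes of one 64-bit word would change (i.e. expand
 * to two bytes) if the Latin-1 string it came from were translated
 * to UTF-8: exactly the bytes with their high bit set. */
static unsigned variants_in_word(uint64_t word)
{
    return (unsigned)__builtin_popcountll(
        word & UINT64_C(0x8080808080808080));
}
```

Summing this over all full words (plus a byte loop for the ragged ends) gives the exact extra space needed, so the malloc for the UTF-8 copy can be sized precisely instead of pessimistically at 2x.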
|
| |
| |
| |
| |
| |
| | |
This adds a few shifting, masking, and integer arithmetic operations to
a conditional, which in turn makes sure that one branch is taken only
when it is going to do some good, avoiding a conditional inside it.
|
| |
| |
| |
| |
| |
| | |
This .pl file in /t is generally called from a test file in that
directory, but this commit makes sure things are properly initialized
if it is run by hand.
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
We have an unresolved issue that #include "fakesdio.h" causes one
of the typemaps to make assignments between different pointer types,
something we can’t fix straightforwardly with casts, since adding
casts to the default typemap (which we are trying to test) may
suppress real problems in production. This is a temporary plaster
till we figure out what to do.
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
newGVgen leaks memory, because it vivifies a typeglob in the symbol
table without arranging for it to be deleted. A typemap is not an
appropriate place to use it, since callers of newGVgen are responsible
for seeing that the GV is freed, if they care.
This came up in #115814.
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The complicated nested loops of tests this commit replaces aren't
necessary. To test utf8_is_invariant_string, we just need to put a single
variant in each position of a string that spans more than a full word
(since we have full-word lookup now) and includes partial words on either
side. We set those partial words up to be one byte each less than a
full word. The code needs to work on strings that don't start on a full
word, and don't end on one, and this commit continues to do that.
|
| | |
|
| |
| |
| |
| | |
As proposed in [perl #125330].
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Method lookup yields a fake method for ->import or ->unimport if there's
no actual method, for historical reasons so that "use" doesn't barf
if there's no import method. This fake method used to be &PL_sv_yes
being used as a magic placeholder, recognised specially by pp_entersub.
But &PL_sv_yes is a string, which we'd expect to serve as a symbolic
CV ref. Change method lookup to yield an actual CV with a body in this
case, and remove the special case from pp_entersub. This fixes the
remaining part of [perl #126042].
|
| | |
|
| |
| |
| |
| | |
This silences a compiler warning
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The functions is_utf8_invariant_string() and
is_utf8_invariant_string_loc() are used in several places in the core
and are part of the public API. This commit speeds them up
significantly on ASCII (not EBCDIC) platforms, by changing to use
word-at-a-time parsing instead of per-byte. (Per-byte is retained for
any initial bytes to reach the next word boundary, and any final bytes
that don't fill an entire word.)
The following results were obtained parsing a long string on a 64-bit
word machine:
byte word
------ ------
Ir 100.00 665.35
Dr 100.00 797.03
Dw 100.00 102.12
COND 100.00 799.27
IND 100.00 97.56
COND_m 100.00 144.83
IND_m 100.00 75.00
Ir_m1 100.00 100.00
Dr_m1 100.00 100.02
Dw_m1 100.00 104.12
Ir_mm 100.00 100.00
Dr_mm 100.00 100.00
Dw_mm 100.00 100.00
100% is baseline; numbers larger than that are improvements. The COND
measurement indicates, for example, that there are 1/8 as many
conditional branches in the word-at-a-time version.
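The head/word/tail structure described above can be sketched as follows. This is a minimal illustration of the technique, not Perl's actual is_utf8_invariant_string(): per-byte until the pointer reaches a word boundary, whole words through the middle, and per-byte for any leftover tail.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return true iff every byte of s[0..len) is UTF-8 invariant on an
 * ASCII platform, i.e. has its high bit clear. */
static bool my_is_invariant(const unsigned char *s, size_t len)
{
    const unsigned char *end = s + len;

    /* per-byte until the pointer is word-aligned */
    while (s < end && ((uintptr_t)s & (sizeof(uint64_t) - 1))) {
        if (*s & 0x80) return false;
        s++;
    }
    /* word-at-a-time through the aligned middle; memcpy avoids
     * strict-aliasing problems and compiles to a plain load */
    while ((size_t)(end - s) >= sizeof(uint64_t)) {
        uint64_t w;
        memcpy(&w, s, sizeof w);
        if (w & UINT64_C(0x8080808080808080)) return false;
        s += sizeof w;
    }
    /* per-byte tail */
    while (s < end) {
        if (*s & 0x80) return false;
        s++;
    }
    return true;
}
```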
|
| |
|
| |
|
|
|
|
|
|
|
|
| |
Using the vars pragma is discouraged; it has been superseded by 'our'
declarations, available in Perl v5.6.0 and later.
This commit replaces the usage of the 'vars' pragma with 'our' in the
'ext' directory.
|