| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
When yylex() attempts to report a UTF-8 encoding error, it
indirectly accesses PL_op. This would cause an access to freed
memory if the CV containing that op (and the op itself) had been
freed.
|
|
|
|
| |
Requires logging the output of "make test" with HARNESS_TIMER=1
|
| |
|
|
|
|
| |
Scan for it in all platforms.
|
|
|
|
|
|
|
|
|
|
|
| |
This bug is a result of 32-bit vs 64-bit words, and is a problem in the
test file and not the underlying code.
The blamed commit changed things so that if a UTF-8 sequence has
multiple malformations, a diagnostic is generated for each. Some of the
tests in utf8decode.t overflow on 32-bit words, but not on 64-bit ones.
The solution is to change the .t to also look for the extra overflow
warnings on 32-bit machines.
|
|
|
|
| |
Some compilers wrongly warn that this is used uninitialized.
|
| |
|
| |
|
|
|
|
|
|
| |
Commit 1c5665476f0d7250c7d93f82eab2b7cda1e6937f added explicit cast
for one of the clock_gettime() arguments, but darwin lacks clockid_t,
so update emulation layer to match function prototype too.
|
|
|
|
|
|
|
|
|
|
|
|
| |
These tests seem to often be outliers in execution time. In faster
modern machines the slowness is not noticeable, but in slower machines
these are excruciatingly slow.
In slow machines these tests may grind for hours, but that is not
useful information. We already know the machine is slow.
The uniprops.t could also use the watchdog, except that TestProp.pl
seems to be purposefully avoiding using test.pl.
|
|
|
|
|
|
| |
Also include required headers and report errors on failure.
(Inspired by afoken's post at <http://perlmonks.org/?node_id=1173959>.)
|
| |
|
|
|
|
| |
This complicated macro boils down to just one bit.
|
|
|
|
|
|
|
|
|
|
| |
This new function behaves like utf8n_to_uvchr(), but takes an extra
parameter that points to a U32 which will be set to 0 if no errors are
found; otherwise each error found will set a bit in it. This can be
used by the caller to figure out precisely what the error(s) is/are.
Previously, one would have to capture and parse the warning/error
messages raised. This can be used, for example, to customize the
messages to the expected end-user's knowledge level.
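As an illustration of the error-bitmask idea described above, here is a minimal standalone sketch. It does not use Perl's headers; the function name decode_with_errors() and the ERR_* flag names are hypothetical, invented for this example, and the decoder only handles sequences of up to 4 bytes without checking for surrogates or code points above the Unicode range.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits, one per kind of malformation. */
#define ERR_EMPTY      0x01u  /* zero-length input */
#define ERR_BAD_START  0x02u  /* first byte cannot start a sequence here */
#define ERR_SHORT      0x04u  /* input ends before the sequence does */
#define ERR_NON_CONT   0x08u  /* an expected continuation byte is missing */
#define ERR_OVERLONG   0x10u  /* not the shortest possible encoding */

#define REPLACEMENT 0xFFFDu

/* True if the leading bytes already show that the encoding is overlong,
   even when the sequence is incomplete. Requires len >= 1. */
static int starts_overlong(const uint8_t *s, size_t len)
{
    if (s[0] == 0xC0 || s[0] == 0xC1) return 1;
    if (len < 2 || (s[1] & 0xC0) != 0x80) return 0;
    return (s[0] == 0xE0 && s[1] < 0xA0) || (s[0] == 0xF0 && s[1] < 0x90);
}

/* Decode one code point. *errors is set to 0 if nothing is wrong;
   otherwise every problem found sets its own bit and the
   REPLACEMENT CHARACTER is returned. */
uint32_t decode_with_errors(const uint8_t *s, size_t len, uint32_t *errors)
{
    size_t expected, avail, i;
    uint32_t cp;

    *errors = 0;
    if (len == 0) { *errors = ERR_EMPTY; return REPLACEMENT; }
    if (s[0] < 0x80) return s[0];

    if (s[0] < 0xC0 || s[0] >= 0xF8) {
        *errors = ERR_BAD_START;
        return REPLACEMENT;
    }
    else if (s[0] < 0xE0) { expected = 2; cp = s[0] & 0x1F; }
    else if (s[0] < 0xF0) { expected = 3; cp = s[0] & 0x0F; }
    else                  { expected = 4; cp = s[0] & 0x07; }

    /* Record a truncated sequence but keep examining what is there. */
    avail = len < expected ? len : expected;
    if (len < expected) *errors |= ERR_SHORT;

    for (i = 1; i < avail; i++) {
        if ((s[i] & 0xC0) != 0x80) { *errors |= ERR_NON_CONT; break; }
        cp = (cp << 6) | (uint32_t)(s[i] & 0x3F);
    }

    if (starts_overlong(s, len)) *errors |= ERR_OVERLONG;

    return *errors ? REPLACEMENT : cp;
}
```

A caller can then test individual bits, for example (errors & ERR_OVERLONG), instead of capturing and parsing warning text. The two bytes E0 80 set both ERR_SHORT and ERR_OVERLONG in this sketch.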
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some UTF-8 sequences can have multiple malformations. For example, a
sequence can be the start of an overlong representation of a code point,
and still be incomplete. Until this commit what was generally done was
to stop looking when the first malformation was found. This was not
correct behavior, as that malformation may be allowed, while another
unallowed one went unnoticed. (But this did not actually create
security holes, as those allowed malformations replaced the input with a
REPLACEMENT CHARACTER.) This commit refactors the error handling of
this function to set a flag and keep going if a malformation is found
that doesn't preclude others. Then each is handled in a loop at the
end, warning if warranted. The result is that there is a warning for
each malformation for which warnings should be generated, and an error
return is made if any one is disallowed.
Overflow doesn't happen except for very high code points, well above the
Unicode range, and above fitting in 31 bits. Hence the latter 2
potential malformations are subsets of overflow, so only one warning is
output--the most dire.
This will speed up the normal case slightly, as the test for overflow is
pulled out of the loop, allowing the UV to overflow. Then a single test
after the loop is done to see if there was overflow or not.
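The idea of letting the accumulator overflow and testing once afterwards can be sketched as follows. This is an illustration under stated assumptions, not the code from this commit: the word is fixed at 32 bits, the input is assumed to be a complete, well-formed sequence in the extended scheme with start bytes up to 0xFE, and the names seq_len() and decode_unchecked() are hypothetical. The post-loop test relies on the fact that the largest 32-bit value is encoded as FE 83 BF BF BF BF BF.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of bytes implied by a start byte; 0 if this sketch does not
   handle it (continuation bytes and 0xFF). */
static size_t seq_len(uint8_t b)
{
    if (b < 0x80) return 1;
    if (b < 0xC0) return 0;
    if (b < 0xE0) return 2;
    if (b < 0xF0) return 3;
    if (b < 0xF8) return 4;
    if (b < 0xFC) return 5;
    if (b < 0xFE) return 6;
    if (b == 0xFE) return 7;
    return 0;
}

/* Decode into a 32-bit word. The loop contains no overflow test; the
   unsigned accumulator is simply allowed to wrap. One comparison after
   the loop reports whether wrapping happened. */
uint32_t decode_unchecked(const uint8_t *s, int *overflowed)
{
    size_t n = seq_len(s[0]);
    size_t i;
    uint32_t cp;

    *overflowed = 0;
    if (n == 0) return 0;        /* not handled by this sketch */
    if (n == 1) return s[0];

    cp = (uint32_t)(s[0] & (0xFF >> (n + 1)));  /* payload bits of start byte */
    for (i = 1; i < n; i++)
        cp = (uint32_t)((cp << 6) | (uint32_t)(s[i] & 0x3F));

    /* Only 7-byte sequences carry more than 32 payload bits, and they fit
       only if the first continuation byte is at most 0x83. */
    *overflowed = (n == 7 && s[1] > 0x83);
    return cp;
}
```

When *overflowed is set, the returned value is the wrapped accumulator and should be discarded by the caller.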
|
|
|
|
|
|
|
| |
And reflow to fit in 80 columns. This is in preparation for the next
commit which will enclose this new code with two more for loops.
Several lines that were missing semi-colons have these added (they were
at the end of nested blocks, so it wasn't an error).
|
|
|
|
|
|
| |
These two tests have overlong malformations in addition to the ones
purportedly being tested. Make them not overlong, so they are testing
just one thing.
|
|
|
|
|
|
| |
Under some circumstances we weren't validating that the generated
warnings are correct. This required reordering some 'if' tests, and
revised special casing of the overflow test.
|
| |
|
|
|
|
|
| |
This is in preparation for the same functionality to be used in a
new place in a future commit.
|
|
|
|
|
|
| |
These were recently added in 2b47960981adadbe81b9635d4ca7861c45ccdced.
This also removes the #undefs of these in preparation for them to be
used later in the file.
|
|
|
|
|
|
| |
These #defines give flag bits in a U32. This commit opens a gap that
will be filled in a future commit. A test file has to change to
correspond, as it duplicates the defines.
|
|
|
|
|
|
|
| |
There are many instances of this simple code to dump an array of trapped
warning messages. The problem is that they display better when joined
by "" rather than by a comma. Rather than change each instance to do
that, I changed each instance to a sub call and changed it there.
|
|
|
|
| |
for branch prediction
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I've long been unsatisfied with the information contained in the
error/warning messages raised when some input is malformed UTF-8, but
have been reluctant to change the text in case someone is relying on
it. One reason that someone might be parsing the messages is that there
has been no convenient way to otherwise pin down what the exact
malformation might be. A few commits from now will add a facility
to get the type of malformation unambiguously. This will be a better
mechanism to use for those rare modules that need to know what's the
exact malformation.
So, I will fix and issue pull requests for any module broken by this
commit.
The messages are changed by now dumping (in \xXY format) the bytes that
make up the malformed character, and extra details are added in most
cases.
Messages about overlongs now display the code point they evaluate to and
what the shortest UTF-8 sequence for generating that code point is.
Messages about overflowing now just display that it overflows, since the
entire byte sequence is now dumped. The previous message displayed just
the byte which was being processed where overflow was detected, but that
information is not at all meaningful.
|
|
|
|
| |
This text is generated in 2 places; consolidate into one place.
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
| |
Such lists were silently ignored. This fixes things so no pod commands
are silently ignored, and hence lists are now accepted in a function's
pod. This fixes the entry for 'vverify', whose =item list was not
getting picked up.
|
|
|
|
|
|
| |
These 2 macros were using =item instead of =for apidoc, so they silently
were not included in perlapi. This commit also changes their
references to links.
|
| |
|
| |
|
| |
|
|
|
|
| |
This reverts commit 26d58bfed57736ec1e1f1dfd579484f8b6fcccd7.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Perl_moreswitches processes a single switch, and returns a pointer
to the start of the next switch. It can return either
a pointer to the next flag itself:
#!perl -n -p
^ Can point here
Or, to the space before the next "arg":
#!perl -n -p
^ Can point here
(Where the next call to Perl_moreswitches will consume " -".)
In the case of -i[extension], the pointer is by default pointing at
the space after the end of the argument. The current code tries to
do the former, by unconditionally advancing the pointer, and then
advancing it again if it is on a '-'. But that is incorrect:
#!perl -i p
^ Will point here, but that isn't a flag
I could fix this by removing the unconditional s++, and having it
increment by 2 if *(s+1)=='-', but this work isn't actually
necessary - it's better to just leave it pointing at the space after
the argument.
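The two return conventions, and the effect of leaving the pointer on the space after -i[extension], can be sketched with a toy switch walker. This is not the Perl_moreswitches code; one_switch() and collect_switches() are hypothetical names, and the only switch given special treatment is -i.

```c
#include <stddef.h>

/* Process one switch starting at s and store its letter in *letter.
   s may point at the next flag letter itself, or at the space before
   the next argument, in which case " -" is consumed first.
   Returns a pointer to the rest, or NULL if there is no further switch. */
static const char *one_switch(const char *s, char *letter)
{
    if (*s == ' ') {
        while (*s == ' ') s++;
        if (*s != '-') return NULL;   /* e.g. " p" is not a switch */
        s++;
    }
    if (*s == '\0') return NULL;

    *letter = *s++;
    if (*letter == 'i') {
        /* Skip the optional extension and stop on the space after it,
           without stepping over that space. */
        while (*s != '\0' && *s != ' ') s++;
    }
    return s;
}

/* Collect the switch letters found in s into out (capacity cap >= 1).
   Returns the number of letters stored. */
size_t collect_switches(const char *s, char *out, size_t cap)
{
    size_t n = 0;
    char letter;

    if (*s == '-') s++;               /* dash of the first argument */
    while (n + 1 < cap && (s = one_switch(s, &letter)) != NULL)
        out[n++] = letter;
    out[n] = '\0';
    return n;
}
```

With this shape, "-i.bak -p" yields the letters i and p, while "-i p" yields only i, because the word after the space does not start with a dash and so is never treated as a flag.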
|
|
|
|
|
|
|
|
|
|
|
| |
tempfile() with template gives a file in the current directory.
tempfile() without template gives a file in /tmp. These may be on different
kinds of filesystems (in the instant case, the former was on ext2, the latter
on ext4) with different characteristics with respect to high-resolution
timing.
Originally reported at:
https://rt.cpan.org/Public/Bug/Display.html?id=116127#txn-1674171
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is saving about 500k when using SelectSaver
as most of the time Carp is not required.
before> perl -I. -e 'require q{lib/SelectSaver.pm}; print qx{grep VmRSS /proc/$$/status}'
VmRSS: 2920 kB
after> perl -I. -e 'require q{lib/SelectSaver.pm}; print qx{grep VmRSS /proc/$$/status}'
VmRSS: 2352 kB
Committer: Increment SelectSaver $VERSION. Add perldelta entry for SelectSaver.
For: RT # 129235
|
|
|
|
|
|
|
|
|
| |
Recent commit 418080dc73a4b9e525a76d6d3b5034ff616716b4 fixing a test in
this file that was failing only on EBCDIC platforms had an error. It
applied a correction to a test that didn't require it, causing it to
fail. This commit changes that to use a different method to detect
which tests to apply the correction to; as a result, some things can
now be determined earlier.
|
|
|
|
|
|
|
|
| |
This string literal contains U8's. It normally is only compiled on
EBCDIC, and when I tried it (by changing the #ifdef's around) on Linux
g++, it fails to compile. Apparently it does compile on z/OS, but the
logs don't show the result. But there are a bunch of failures there
involving this function, and this could explain them.
|
|
|
|
|
|
|
| |
If the target is utf8 and either the anchored or floating substrings
are not, we need to create utf8 copies to check against. The state
of the two substrings may not be the same, but we were only testing
whichever we planned to check first.
|
|\ |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
op.c: In function ‘OP* Perl_newASSIGNOP(PerlInterpreter*, I32, OP*, I32, OP*)’:
op.c:6605:15: error: jump to label ‘detach_split’ [-fpermissive]
detach_split:
^
op.c:6631:22: note: from here
goto detach_split;
^
op.c:6584:30: note: skips initialization of ‘PMOP* const pm’
PMOP * const pm = (PMOP*)right;
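The diagnostic above arises because C++ forbids a goto that jumps past the initialization of a variable that is still in scope at the label, whereas C permits it. One common way to satisfy both languages is to make sure the initialization happens before any jump target, as in the sketch below. The struct and function names are hypothetical stand-ins, and this is not necessarily how op.c was changed.

```c
#include <stddef.h>

/* Hypothetical stand-in for the op type named in the diagnostic. */
struct pmop { int targ; };

int assign_target(struct pmop *right, int is_split, int fallback)
{
    /* Initialized at the top of the function, so no goto below can
       bypass the initialization. */
    struct pmop *pm = right;

    if (is_split) {
        if (pm->targ == 0)
            pm->targ = fallback;
      detach:
        return pm->targ;
    }
    else {
        pm->targ = -1;
        goto detach;   /* jumps into the block, but skips no initializer */
    }
}
```

Had pm been declared and initialized inside the first block, above the label, the goto from the else branch would cross that initialization and a C++ compiler would reject it.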
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The code that prints '$i:1,2' in something like 'padsv[$i:1,2]':
extract it out into a separate function, then use it with split
to display the array name rather than just a target number in:
$ perl -MO=Concise -e'my @a = split()'
...
split(/" "/ => @a:1,2)[t2] vK/LVINTRO,RTIME,ASSIGN,LEX,IMPLIM ->6
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
RT #127999 Slowdown in split + list assign
The compile-time common-value detection mechanism for OP_ASSIGN
was getting OP_SPLIT wrong.
It was assuming that OP_SPLIT was always dangerous. In fact,
OP_SPLIT is usually completely safe, not passing through any of its
arguments, except where the assign in (@a = split()) has been optimised
away and the array attached directly to the OP_SPLIT op, or the ops that
produce the array have been appended as an extra child of the OP_SPLIT op
(OPf_STACKED).
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
There are currently two optimisations for when the results of a split
are assigned to an array.
For the first,
@array = split(...);
the aassign and padav/rv2av are optimised away, and pp_split() directly
assigns to the array attached to the split op (via op_pmtargetoff or
op_pmtargetgv).
For the second,
my @array = split(...);
local @array = split(...);
@{$expr} = split(...);
The aassign is optimised away, but the padav/rv2av is kept as an additional
arg to split. pp_split itself then uses the first arg popped off the stack
as the array (This was introduced by FC with v5.21.4-409-gef7999f).
This commit moves these two:
my @array = split(...);
local @array = split(...);
from the second case to the first case, by simply setting OPpLVAL_INTRO
on the OP_SPLIT, and making pp_split() do SAVECLEARSV() or save_ary()
as appropriate.
This makes my @a = split(...) a few percent faster.
|
| |
| |
| |
| |
| |
| |
| | |
Whitespace-only change.
This is a followup to the previous commit, which simplified that code
somewhat.
|