| Commit message | Author | Age | Files | Lines |
|
|
|
|
|
| |
This code only works by coincidence on ASCII platforms, because of the
way the underlying UTF-8 happens to be represented. It definitely
doesn't work on EBCDIC. Test before assuming it is UTF-8.
|
|
|
|
|
| |
It's just a little bit better to do the warning (which could be made
fatal) before setting something that's only needed later.
|
|
|
|
|
| |
By moving the setting of this to after two branches of a conditional
come together, it gets set always, instead of sometimes.
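
The pattern described above can be sketched as follows. This is a minimal illustration only: `parse_token`, `struct parse_result`, and the `done` field are hypothetical names and are not taken from the commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result record; all names here are illustrative. */
struct parse_result {
    size_t length;   /* number of bytes consumed */
    bool   is_long;  /* which branch handled the input */
    bool   done;     /* must be set on every path */
};

struct parse_result parse_token(const char *s)
{
    struct parse_result r = { 0, false, false };
    size_t n = strlen(s);

    if (n > 8) {
        r.is_long = true;
        r.length = 8;
    }
    else {
        r.length = n;
    }

    /* Set once, after both branches have rejoined, so that neither
     * path can forget to set it. */
    r.done = true;
    return r;
}
```

Placing the assignment after the join point means a later edit to either branch cannot accidentally drop it.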
|
|
|
|
| |
This simplifies things a bit.
|
|
|
|
|
|
| |
The data contained in this variable is a copy of const data stored
elsewhere. Instead of making a copy, simplify to just point to the
already-stored data.
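
A minimal sketch of the idea, assuming a constant string as the stored data. The names `greeting`, `length_via_copy`, and `length_via_pointer` are hypothetical and stand in for whatever the real variable held.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical constant data that already lives elsewhere. */
static const char greeting[] = "hello";

/* Before: duplicate the constant into a local buffer, then use the copy. */
size_t length_via_copy(void)
{
    char copy[sizeof greeting];
    memcpy(copy, greeting, sizeof greeting);
    return strlen(copy);
}

/* After: point at the already-stored data; no copy is needed because
 * the data is never modified. */
size_t length_via_pointer(void)
{
    const char *p = greeting;
    return strlen(p);
}
```

Both functions return the same value; the second avoids the extra storage and the copy.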
|
|
|
|
|
| |
This variable doesn't add anything. We can use other variables to
just as conveniently get at the information it contains.
|
|
|
|
|
| |
By the time control reaches the removed conditional, it has already been
checked and found to be true.
|
|
|
|
| |
It's better if the comment and code mesh.
|
| |
|
| |
|
|
|
|
|
| |
For: https://github.com/Perl/perl5/issues/19569, as reported by
kbulgrien.
|
|
|
|
|
|
|
|
| |
These pods have some very long lines that make sense to keep on a single
line, such as output from a program. That means that someone viewing
them will either enlarge their window to view them unbroken or all is
lost anyway; there may be a few lines that could be shortened, but there
is no real value in doing so.
|
| |
|
|
|
|
| |
To make it easier to read.
|
|
|
|
| |
So don't check it.
|
| |
|
| |
|
|
|
|
| |
This closes #18458
|
| |
|
| |
|
|
|
|
|
|
| |
setlocale() is a no-op on this system after the first thread is created,
which makes it an outlier among platforms. The tests assume otherwise,
and hence would fail.
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
| |
This test fails on EBCDIC systems, because it wants a non-ASCII
character, and the one it chose, E9, is ASCII on EBCDIC ('Z').
perlhacktips suggests B6 as a character to use in such tests, and this
commit changes to use that.
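
To illustrate why E9 is a poor choice, here is a small sketch that checks whether a byte falls in the EBCDIC uppercase-letter ranges. The function name `ebcdic_is_upper_letter` is hypothetical; the ranges are the standard EBCDIC layout (A-I at C1-C9, J-R at D1-D9, S-Z at E2-E9).

```c
#include <assert.h>
#include <stdbool.h>

/* In EBCDIC the uppercase letters occupy three separate ranges:
 * A-I at 0xC1-0xC9, J-R at 0xD1-0xD9, and S-Z at 0xE2-0xE9. */
bool ebcdic_is_upper_letter(unsigned char byte)
{
    return (byte >= 0xC1 && byte <= 0xC9)
        || (byte >= 0xD1 && byte <= 0xD9)
        || (byte >= 0xE2 && byte <= 0xE9);
}
```

With this check, 0xE9 is reported as a letter (it is the last byte of the S-Z range), while 0xB6 is not, which is consistent with the message above.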
|
| |
The prior commit shows what can happen when two branches do the same
thing: they can get out of sync.

Since this test file was originally written, the testing infrastructure
has improved so that there are functions that handle the gory details of
character set differences for you. This test file hadn't been updated
since it wasn't causing a problem, until now.

This commit changes to use the new infrastructure, and as a result one
branch gets removed from each of the two tests that varied depending on
character set.
|
|
|
|
|
| |
This file was recently changed, and the EBCDIC side of the change had a
typo.
|
|
|
|
|
|
| |
After 271c3af797, early bailout from the inner one of a pair of nested
lookbehinds would leave the desired match_end pointing at the wrong
place, so the outer lookbehind could give the wrong answer.
|
| |
scan_str() calls s=skipspace(s). It turns out that this function can
actually change the buffer 's' is pointing to, so that the original
'start' passed in to the function is obsolete. Just update it. This is
very much like the paradigm already in S_force_word().

This bug previously existed, but commit
32b87797e986f5d99836e16ea6b9d9ff5a56d3be increased the frequency of
occurrence from close to non-existent to relatively often. It only
happened when the string being delimited had some spaces before it, and
only if the buffer got moved. This depends on the position of the
construct in the file, and on the buffering of the reading of that
file, hence the symptoms had it occurring much more often using stdio
than PerlIO. (It could just as well have been the reverse, I suppose.)

The mentioned commit collapsed two different loops, one of which didn't
bother with a check it should have been doing. Without that check, the
likelihood of this being triggered was much less. (But illegal input
would get by.)

There is a nuance here, which resulted in the need for this commit to
also update the test file, from having two occurrences of an error on a
single line to just one. This is because, if the buffer moves, we reset
'start' to 's'. This makes 's' appear to be at the left edge of the
input when it really is just at the left edge of the buffer. The test
that failed used a combining character (I'll call it 'cc' for short)
after a space, to check that the code accurately catches the illegality
that you can't delimit a string with a character that doesn't stand on
its own, such as a cc. However when such a character comes at the
beginning of the input, there's nothing for it to combine with, and
Unicode says that is legal, so we do too. So this moving 'start' makes
something that is illegal look to be legal. I don't think this is a
problem because the code looks up the cc and discovers there is no
mirror for it, so it must also be the terminator for the string. If
this cc is just from a single typo in the input, there won't be a
matching terminator, and the compilation will abort. If the program
intended to use a cc as both fore and aft of a string, the terminating
occurrence of this cc will also be checked for validity, and it will
almost certainly be seen to be an illegal cc in this context, so again
the compilation will fail. That is indeed what is happening in
t/lib/warnings/toke. If the buffering were such that the terminating cc
also began a new buffer, it again would be viewed as at the edge and the
string would be parsed as being ok, when it really shouldn't have been.
Should this happen, I don't see a real problem. An attacker could craft
a string with the precise length to make this happen, but to do so they
would have to control the source code, and the war is already lost.
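
The stale-pointer problem described at the start of this message can be sketched in isolation. This is not the code from toke.c: `struct buf`, `skip_spaces`, `delimiter_after_spaces`, and `first_delimiter` are hypothetical stand-ins, and the helper always relocates the buffer (when allocation succeeds) to simulate a refill that moves it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical input buffer; 'data' is NUL-terminated. */
struct buf { char *data; size_t len; };

/* Hypothetical stand-in for skipspace(): skips spaces, and relocates
 * the buffer first to simulate a refill that moves it. Any pointer
 * into the old allocation is invalid once *moved is true. */
static char *skip_spaces(struct buf *b, char *s, bool *moved)
{
    size_t off = (size_t)(s - b->data);
    char *fresh = malloc(b->len + 1);

    *moved = false;
    if (fresh) {
        memcpy(fresh, b->data, b->len + 1);
        free(b->data);
        b->data = fresh;
        s = fresh + off;
        *moved = true;
    }
    while (*s == ' ')
        s++;
    return s;
}

/* Returns the first non-space character, keeping 'start' valid. */
char delimiter_after_spaces(struct buf *b)
{
    char *start = b->data;
    bool moved;
    char *s = skip_spaces(b, start, &moved);

    if (moved)
        start = s;   /* the old 'start' pointed into freed memory */

    /* Both pointers now refer to the live buffer. */
    assert(start >= b->data && start <= s);
    return *s;
}

/* Convenience wrapper: builds a buffer from 'text' and runs the scan. */
char first_delimiter(const char *text)
{
    struct buf b;
    char c;

    b.len = strlen(text);
    b.data = malloc(b.len + 1);
    if (!b.data)
        return '\0';
    memcpy(b.data, text, b.len + 1);
    c = delimiter_after_spaces(&b);
    free(b.data);
    return c;
}
```

Resetting `start` to `s` is the simplest repair, at the cost noted above: `s` then looks as if it sits at the left edge of the input when it is only at the left edge of the buffer.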
|
| |
|
| |
|
|
|
|
|
| |
This is in response to
https://github.com/Perl/perl5/pull/19558#issuecomment-1076659884
|
|
|
|
|
| |
Commit d1e771d8c533168553df9b2a858d967f707fc9fe broke EBCDIC builds by
doubly encoding some UTF-8 characters.
|
|
|
|
| |
Failing to check for max iterations caused an assertion failure.
|
| |
|
|
|
|
|
|
| |
Use 'F<>' for strings that are simply filenames.
As reported by Tux on #p5p.
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
| |
The function can be simplified by using the now-inlined newSV_type
function, directly using SvNV_set, and twiddling the required flags.
This cuts out any function call overhead, a switch statement leading
to a sv_upgrade(sv, SVt_NV) call, and a touch more bit-twiddling than
is necessary.
|
|
|
|
| |
rather than encoding.pm
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|