| Commit message | Author | Age | Files | Lines |
| |
(includes regen/opcode.pl)
| |
(includes regen/opcode.pl)
| |
This takes the few latest changes in the draft Unicode 12.1, ahead of
our freeze. None are substantive. No further non-substantive changes
will be added. In the unlikely event that a substantive change is
made, we will take it and potentially delay Perl 5.30.
| |
Currently Deparse fails to output a backslash, turning the result
into a multi-dimensional array lookup. This is a long-standing fault.
For now, mark it TODO, and remove the construct from uni/fold.t, which is
where I first spotted the issue by running 'TEST -deparse'.
| |
A variable needed to be updated for Unicode 12.1
| |
I realized that commit f9c1e7e9ed13a16099c8471c2030b93deb482571
works now, but future Unicode versions may add fractions that fool it.
This commit should handle any such event.
| |
Indent block newly formed in previous commit
| |
This turns out to be because Windows doesn't necessarily round to even
on floating point %e conversions. The solution is to add an extra entry
rounding up to odd when a fraction is precisely representable in binary.
So far, the only case where this occurs is 1/32.
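The tie described above can be illustrated with a short sketch. Python's `decimal` module is used here only as a neutral way to apply both rounding rules to the same exact value; it is not what the Perl test code does, and the helper name `round_sig` is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

def round_sig(value, digits, mode):
    """Round a float to `digits` significant digits using the given rule."""
    d = Decimal(value)                 # exact binary value of the float
    exp = d.adjusted()                 # exponent of the leading digit
    quantum = Decimal(1).scaleb(exp - digits + 1)
    return d.quantize(quantum, rounding=mode)

x = 1 / 32                             # 0.03125, exactly representable in binary
even = round_sig(x, 3, ROUND_HALF_EVEN)  # exact tie goes to the even digit
up = round_sig(x, 3, ROUND_HALF_UP)      # exact tie goes away from zero
print(even, up)
```

Only a value that is an exact tie, such as 1/32, can differ between the two rules; a fraction like 0.1, which is not exactly representable, rounds the same way under both.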
| |
This inadvertently was left on, slowing down the process a little
| |
Somehow I missed updating some files, with the result that a few official
12.0 final corrections did not make it into
906f46d96ca4ba2d1039d576954bc5a47868348c.
These are mostly tests and break property changes for a few characters.
| |
This supports this new feature.
| |
These debugging lines were left in by 21c34e9717d
| |
This renames a variable to more accurately reflect its content, and adds
a new one which has the old name but with an accurate content.
| |
Prior to this commit 'use utf8' loaded utf8_heavy.pl. But previous
commits in the 5.29 series mean it is not needed from the core unless a
tr/// is using UTF-8, a much less likely occurrence. So load it only on
demand.
| |
I don't know how this ever worked.
Previously, DB::sub() would hold a lock on $DB::DBGR for its entire
body, including the call to the subroutine being called.
This could cause problems in two cases:
a) on creation of a new thread, CLONE() is called in the context of
the new interpreter before the new thread is created. So you'd have a
sequence like:
    threads->new
    DB::sub for threads::new (lock $DBGR)
    call into threads::new which creates a new interpreter
    Cwd::CLONE() (in the new interpreter)
    DB::sub for Cwd::CLONE (in the new interpreter) (deadlock trying to lock $DBGR)
One workaround I tried for this was to prevent pp_entersub calling
DB::sub if we were cloning (by checking PL_ptr_table). This did
improve matters, but wasn't needed in the final patch.
Note that the recursive lock on $DBGR would have been fine if the new
code was executing in the same interpreter, since the locking code
simply bumps a reference count if the current interpreter already
holds the lock.
b) when the called subroutine blocks. For the test case this could
happen with the call to $thr->join. There would be a sequence like:
    (parent) $thr->join
    (parent) DB::sub for threads::join (lock $DBGR)
    (parent) call threads::join and block
    (child) try to call main::sub1
    (child) DB::sub for main::sub1 (deadlock trying to lock $DBGR)
This obviously isn't limited to threads::join; one thread could be
waiting for input, sleeping, or performing a complex calculation.
The solution I chose here was the obvious one - don't hold the lock
for the actual call.
This required some rearrangement of the code and removed some
duplication too.
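The shape of the fix can be sketched as follows. This is a Python stand-in for the idea, not the perl5db.pl code: `dbgr_lock` plays the role of the lock on $DB::DBGR, and `traced_call`, `call_log`, `sub1` and `start_and_join` are hypothetical names chosen for illustration. The lock is taken only around the bookkeeping, never around the call itself, so a child thread can enter its own traced call while the parent is blocked in a join.

```python
import threading

dbgr_lock = threading.RLock()      # stands in for the lock on $DB::DBGR
call_log = []                      # hypothetical debugger bookkeeping

def traced_call(func, *args):
    """Record the call under the lock, then run func without holding it."""
    with dbgr_lock:                # short critical section: bookkeeping only
        call_log.append(("enter", func.__name__))
    result = func(*args)           # may block; the lock is not held here
    with dbgr_lock:
        call_log.append(("leave", func.__name__))
    return result

def sub1():
    return 42

def start_and_join():
    """Parent side: start a child that makes its own traced call, then block."""
    out = []
    thr = threading.Thread(target=lambda: out.append(traced_call(sub1)))
    thr.start()
    thr.join(timeout=5)            # the child needs the lock while we wait
    return out

results = traced_call(start_and_join)
print(results, call_log)
```

Had `traced_call` kept the lock across `func(*args)`, the child would wait for the lock for as long as the parent sat in `join`, which is case b) above.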
| |
I am starting to write a Unicode::Private_Use module which will allow
one to specify the Unicode properties of private use code points, thus
making them actually useful. This commit adds a hook to regcomp.c to
accommodate this module. The changes are pretty minimal. This way we
don't have to wait another release cycle to get it out there.
I don't want to document this interface, until it's proven.
| |
Unicode 12.0 is finalized. Change to use it.
| |
Some systems fake their locales, so that they pretend to accept a locale
change, but they either do nothing, making everything the C locale, or
on some systems there is a second locale "C-UTF-8" that can be
switched to. Configure probes have been added to find such systems, and
this commit changes to use the results of these probes, so that we don't
try looking for other locales (any names we came up with would be
accepted as valid, but don't work, and tests were failing as a result).
Anything running the musl library fits, as do OpenBSD and its kin, as
they view locales as security risks. This commit allows us to take out
some code that was looking for particular OSes.
| |
These are in a generated structure.
| |
This removes the most obvious and easy things that are no longer needed
since regexes no longer use swashes at all.
tr/// continues, for the time being, to use swashes, so not all swash
handling is removable now. But tr/// doesn't use inversion lists, and
so a bunch of code is ripped out here. Other code could have been, but
I did only the relatively easy stuff. The rest can be ripped out all at
once when tr/// stops using swashes.
| |
In a Turkic locale, these are problematic because their mappings
cross the 255/256 boundary.
This change has the side effect of causing U+307 to be added to the
problematic list, and it normally really isn't problematic, because in
those locales where U+130 and U+131 are problematic, U+307 isn't used.
But applications could switch in and out of Turkic locales, so it's best
to leave it marked as problematic. The consequence of doing so is simply
slightly less optimized regex pattern code.
| |
RT #133789
In the path taken through pp_multiconcat() when one or more args have
side effects such as tying or overloading, multiconcat has to decide
whether to just return the result of all the concatting as-is, or to
first assign it to an expression or variable if the op includes an
implicit assign (such as $lex = x.y.z or $a[0] = x.y.z).
The code was getting this right for those two cases, and was also
getting it right for the append cases ($lex .= x.y.z and $a[0] .= x.y.z),
which don't need assigns. But for the bare case (x.y.z) it was assigning
to the op's targ as well as returning the value, hence leaking a
reference until destruction of the sub and its pad.
This commit stops the assign in that last case.
| |
These 2 Unicode-like property definitions used internally by the regular
expression compiler are moved by this commit from regen/mk_invlists.pl
to lib/unicore/mktables.
By placing all these in the same place, maintainers only have to learn
one bit of code, instead of two.
| |
For: RT # 133683
pod/perlmodlib.pod is a file generated by pod/perlmodlib.PL, which is
run by 'miniperl' during 'make'. That program parses the 'NAME' header
of .pod files and fragments of POD found in 'regen/opcode.pl'. The POD
for B::Op_private is one such fragment. Correcting superfluous
whitespace in that fragment did not suffice to prevent the downstream
formatting error reported in the RT -- an error visible with 'pod2text'
and 'pod2html' as well. We also had to make the regex which
perlmodlib.PL uses to parse the 'NAME' header more flexible.
| |
Adapt tests in various files to removal of these variables. Add
t/lib/croak/gv to test fatalizations of $# and $* -- tests therein
adapted from tests formerly in t/lib/warnings/gv.
Per: RT # 133583
| |
sigtrap defines a signal handler apparently intended to be called
under unsafe signals, since a) the code was written before safe
signals were implemented and b) it uses syswrite() for output and
avoids creating new SVs where it can.
Unfortunately syswrite() doesn't handle PerlIO layers, *and* with
syswrite() being disallowed for :utf8 handles, throws an exception.
This causes the sigtrap tests to fail if PERL_UNICODE is set and the
current locale is a UTF-8 locale.
I want to avoid allocating new SVs until the point where the code
originally did so, so the code now attempts a syswrite() under
eval, falling back to print, and then at the point where the original
code started allocating SVs uses PerlIO::get_layers() to check if
any layers might make a difference to the output.
| |
This includes removing the :utf8 logic from pp_syswrite. pp_sysread
retains it, since it's also used for read().
Tests that are specifically testing the behaviour against :utf8
handles have been removed (e.g. in lib/open.t); several other tests
that incidentally used those functions on :utf8 handles have been
adapted to use :raw handles instead (e.g. op/readline.t).
Test lib/sigtrap.t fails if STDERR is :utf8, in code from the
original 5.000 commit, which is intended to run in a signal handler.
| |
Committer: For porting tests: Update $VERSION in 4 files.
Run:
./perl -Ilib regen/mk_invlists.pl
./perl -Ilib regen/regcharclass.pl
| |
This removes arybase and all its surrounding machinery.