author | Nicholas Clark <nick@ccl4.org> | 2019-11-04 16:58:03 +0100 |
---|---|---|
committer | Nicholas Clark <nick@ccl4.org> | 2020-01-19 20:46:05 +0100 |
commit | f172023d734315edbea9c6ee707ca0f95dd6ae80 (patch) | |
tree | 399f4fdcabe28a1ff418bf85a89f07395f95d238 /META.json | |
parent | 7a992ccc8be4ce4c27268e1980edb4701f9948d9 (diff) | |
Loading IO is now threadsafe, avoiding the core bug reported as GH #14816.
Re-implement getline() and getlines() as XS code.
The underlying problem that we're trying to solve here is making
getline() and getlines() in IO::Handle respect the open pragma.
That bug was first addressed in Sept 2011 by commit 986a805c4b258067:
Make IO::Handle::getline(s) respect the open pragma
However, that fix introduced a more subtle bug, hence this reworking.
Including the entirety of the rest of that commit message because it
explains both the bug and the previous approach:
See <https://rt.cpan.org/Ticket/Display.html?id=66474>. Also, this
came up in <https://rt.perl.org/rt3/Ticket/Display.html?id=92728>.
The <> operator, when reading from the magic ARGV handle, automatically
opens the next file. Layers set by the lexical open pragma are
applied, if they are in scope at the point where <> is used.
This works almost all the time, because the common convention is:
    use open ":utf8";
    while(<>) {
        ...
    }
IO::Handle’s getline and getlines methods are Perl subroutines
that call <> themselves. But that happens within the scope of
IO/Handle.pm, so the caller’s I/O layer settings are ignored. That
means that these two expressions are not equivalent within a
‘use open’ scope:
    <>
    *ARGV->getline
The latter will open the next file with no layers applied.
This commit solves that by putting PL_check hooks in place in
IO::Handle before compiling the getline and getlines subroutines.
Those hooks cause every state op (nextstate, or dbstate under the
debugger) to have a custom pp function that saves the previous value
of PL_curcop, calls the default pp function, and then restores
PL_curcop.
That means that getline and getlines run with the caller’s compile-
time hints. Another way to see it is that getline and getlines’s own
lexical hints are never activated.
(A state op carries all the lexical pragmata. Every statement
has one. When any op executes, its ‘pp’ function is called.
pp_nextstate and pp_dbstate both set PL_curcop to the op itself. Any
code that checks hints looks at PL_curcop, which contains the current
run-time hints.)
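The save-and-restore trick described above can be modelled in a few lines of plain C. This is only a sketch under stated assumptions: `cop_t`, `cur_cop`, `default_nextstate`, `wrapped_nextstate` and `current_hints` are hypothetical stand-ins for the COP structure, PL_curcop, pp_nextstate, the custom pp function and hint-checking code respectively. None of it is the code that was in IO.xs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a state op: it only carries lexical hints. */
typedef struct cop { unsigned hints; } cop_t;

/* Models PL_curcop: the state op whose hints are currently in effect. */
static const cop_t *cur_cop = NULL;

/* Models the default pp_nextstate: the op makes itself current. */
static void default_nextstate(const cop_t *self) {
    cur_cop = self;
}

/* Models the custom pp function installed by the hooks: save the
 * previous current op, run the default function, then restore. */
static void wrapped_nextstate(const cop_t *self) {
    const cop_t *saved = cur_cop;
    default_nextstate(self);
    cur_cop = saved;   /* the callee's own hints never stay active */
}

/* Models any code that checks hints: it looks only at the current op. */
static unsigned current_hints(void) {
    assert(cur_cop != NULL);
    return cur_cop->hints;
}
```

In this model, if a caller's state op runs through `default_nextstate` and the callee's state ops run through `wrapped_nextstate`, then `current_hints()` keeps reporting the caller's hints throughout the callee.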
The problem with this approach is that the (current) design and implementation
of PL_check hooks is actually not threadsafe. There's one array (as a global),
which is used by all interpreters in the process. But as the code added to
IO.xs demonstrates, realistically it needs to be possible to change the hook
just for this interpreter.
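To make the sharing problem concrete, here is a minimal single-threaded model in plain C. It assumes a tiny op table (`OP_MAX`) and invented hook functions; `global_check` stands in for the process-wide PL_check array, and `interp_t` is a hypothetical per-interpreter table shown only for contrast. It is not the fix proposed in GH #14816.

```c
#include <assert.h>

enum { OP_MAX = 4 };                 /* hypothetical number of op types */
typedef int (*check_fn)(int op);

static int default_check(int op) { return op; }
static int custom_check(int op)  { return op + 100; }

/* Models the current design: one hook table for the whole process. */
static check_fn global_check[OP_MAX] = {
    default_check, default_check, default_check, default_check
};

/* Hypothetical alternative: every interpreter owns its hook table. */
typedef struct interp { check_fn check[OP_MAX]; } interp_t;

static void interp_init(interp_t *in) {
    for (int i = 0; i < OP_MAX; i++)
        in->check[i] = default_check;
}

/* Compiling an op through the shared table: every interpreter sees
 * whatever hook any other interpreter has installed. */
static int compile_with_global(int op) {
    assert(op >= 0 && op < OP_MAX);
    return global_check[op](op);
}

/* Compiling an op through an interpreter's own table. */
static int compile_with_local(const interp_t *in, int op) {
    assert(op >= 0 && op < OP_MAX);
    return in->check[op](op);
}
```

With the shared table, a hook installed by one interpreter while it compiles its own code is visible to every other interpreter compiling at the same moment. With one table per interpreter, installing `custom_check` in one `interp_t` leaves the others on `default_check`.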
GH #14816 has a fix for that bug for blead. However, it will be tricky (to
impossible) to backport to earlier perl versions.
Hence it's also worthwhile to change IO.xs to use a different approach to
solve the original bug. As described above, the bug is fixed by having the
readline OP (that implements getline() and getlines()) see the caller's
lexical state, not their "own". Unlike Perl subroutines, XS subroutines don't
have any lexical hints of their own. getline() and getlines() are very
simple, mostly parameter checking, ending with a single line that maps to
a single core OP, whose values are directly returned.
Hence "all" we need to do is re-implement the Perl code as XS. This might look
easy, but turns out to be trickier than expected. There isn't any API to be
called for the OP in question, pp_readline(). The body of the OP inspects
interpreter state, it directly calls pp_rv2gv() which also inspects state,
and then it tail calls Perl_do_readline(), which inspects state.
The easiest approach seems to be to set up enough state, and then call
pp_readline() directly. This leaves us very tightly coupled to the
internals, but so do all other approaches to try to tackle this bug.
The current implementation of PL_check (and possibly other arrays) still
needs to be addressed.
Diffstat (limited to 'META.json')
-rw-r--r-- | META.json | 1 |
1 files changed, 1 insertions, 0 deletions
@@ -87,6 +87,7 @@
  "dist/IO/t/io_dup.t",
  "dist/IO/t/io_file.t",
  "dist/IO/t/io_file_export.t",
+ "dist/IO/t/io_getline.t",
  "dist/IO/t/io_leak.t",
  "dist/IO/t/io_linenum.t",
  "dist/IO/t/io_multihomed.t",