summaryrefslogtreecommitdiff
path: root/META.json
diff options
context:
space:
mode:
authorNicholas Clark <nick@ccl4.org>2019-11-04 16:58:03 +0100
committerNicholas Clark <nick@ccl4.org>2020-01-19 20:46:05 +0100
commitf172023d734315edbea9c6ee707ca0f95dd6ae80 (patch)
tree399f4fdcabe28a1ff418bf85a89f07395f95d238 /META.json
parent7a992ccc8be4ce4c27268e1980edb4701f9948d9 (diff)
downloadperl-f172023d734315edbea9c6ee707ca0f95dd6ae80.tar.gz
Loading IO is now threadsafe, avoiding the core bug reported as GH #14816.
Re-implement getline() and getlines() as XS code. The underlying problem that we're trying to solve here is making getline() and getlines() in IO::Handle respect the open pragma. That bug was first addressed in Sept 2011 by commit 986a805c4b258067: Make IO::Handle::getline(s) respect the open pragma However, that fix introduced a more subtle bug, hence this reworking. Including the entirety of the rest of that commit message because it explains both the bug the previous approach: See <https://rt.cpan.org/Ticket/Display.html?id=66474>. Also, this came up in <https://rt.perl.org/rt3/Ticket/Display.html?id=92728>. The <> operator, when reading from the magic ARGV handle, automatic- ally opens the next file. Layers set by the lexical open pragma are applied, if they are in scope at the point where <> is used. This works almost all the time, because the common convention is: use open ":utf8"; while(<>) { ... } IO::Handle’s getline and getlines methods are Perl subroutines that call <> themselves. But that happens within the scope of IO/Handle.pm, so the caller’s I/O layer settings are ignored. That means that these two expressions are not equivalent within in a ‘use open’ scope: <> *ARGV->getline The latter will open the next file with no layers applied. This commit solves that by putting PL_check hooks in place in IO::Handle before compiling the getline and getlines subroutines. Those hooks cause every state op (nextstate, or dbstate under the debugger) to have a custom pp function that saves the previous value of PL_curcop, calls the default pp function, and then restores PL_curcop. That means that getline and getlines run with the caller’s compile- time hints. Another way to see it is that getline and getlines’s own lexical hints are never activated. (A state op carries all the lexical pragmata. Every statement has one. When any op executes, it’s ‘pp’ function is called. pp_nextstate and pp_dbstate both set PL_curcop to the op itself. Any code that checks hints looks at PL_curcop, which contains the current run-time hints.) The problem with this approach is that the (current) design and implementation of PL_check hooks is actually not threadsafe. There's one array (as a global), which is used by all interpreters in the process. But as the code added to IO.xs demonstrates, realistically it needs to be possible to change the hook just for this interpreter. GH #14816 has a fix for that bug for blead. However, it will be tricky (to impossible) to backport to earlier perl versions. Hence it's also worthwhile to change IO.xs to use a different approach to solve the original bug. As described above, the bug is fixed by having the readline OP (that implements getline() and getlines()) see the caller's lexical state, not their "own". Unlike Perl subroutines, XS subroutines don't have any lexical hints of their own. getline() and getlines() are very simple, mostly parameter checking, ending with a one line that maps to a single core OP, whose values are directly returned. Hence "all" we need to do re-implement the Perl code as XS. This might look easy, but turns out to be trickier than expected. There isn't any API to be called for the OP in question, pp_readline(). The body of the OP inspects interpreter state, it directly calls pp_rv2gv() which also inspects state, and then it tail calls Perl_do_readline(), which inspects state. The easiest approach seems to be to set up enough state, and then call pp_readline() directly. This leaves us very tightly coupled to the internals, but so do all other approaches to try to tackle this bug. The current implementation of PL_check (and possibly other arrays) still needs to be addressed.
Diffstat (limited to 'META.json')
-rw-r--r--META.json1
1 files changed, 1 insertions, 0 deletions
diff --git a/META.json b/META.json
index 7d45353024..72a455b323 100644
--- a/META.json
+++ b/META.json
@@ -87,6 +87,7 @@
"dist/IO/t/io_dup.t",
"dist/IO/t/io_file.t",
"dist/IO/t/io_file_export.t",
+ "dist/IO/t/io_getline.t",
"dist/IO/t/io_leak.t",
"dist/IO/t/io_linenum.t",
"dist/IO/t/io_multihomed.t",