From b3a2acfa0c0e4f8e48e1f6eb4d6fd143f293d2c6 Mon Sep 17 00:00:00 2001 From: Yves Orton Date: Mon, 3 Feb 2014 22:20:13 +0800 Subject: deal with assignment to $/ better, deprecate edge cases, and forbid others The actual behavior of $/ under various settings and how it is documented varies quite a bit. Clarify the documentation, and add various checks that are validated when setting $/. The gist of the problem was that the way that weirdo ref assignments were handled was mostly broken: * setting to a reference to an array, hash, or other higher level construct would behave similarly to setting it to a reference to a an integer, by numifying the ref and using it as an integer. This behavior was entirely undocumented. * setting to a reference to 0 or to -1 was *documented* in triggering "slurp" behavior, but actually did not. Instead it would set the separator to the stringified form of the ref, which would *appear* as slurp behavior due to the unlikelihood of a file actually containing a string which matched, however was less efficient, and if someone's luck were *terrible* might actually behave as a split. In the future we wish to support more sophisticated ways of setting the input record separator, possibly supporting things like: $/= [ "foo", "bar" ]; $/= qr/foo|bar/; Accordingly this patch *forbids* the use of a non scalar ref, and raises a fatal exception when one does so. Additionally it treats non-positive refs *exactly* the same as assigning undef, *including* ignoring the original value and setting $/ to undef. The means the implementation now matches the documentation. However since this might involve some crazy script changing in behavior (as one can't fetch back the original ref from $/) I have added a warning in category "deprecated" advising the user what has happened and recommending setting to "undef" explicitly. As far as I can tell this will only *break* code doing extremely dodgy things with $/. While putting together this patch I encountered numerous problems with porting tests. First off was porting/podcheck.t, which failed test without saying why or what to do, even under TEST_VERBOSE=1. Then when I did a regen to update the exceptions database and then used that information to try to fix the reported problems it seems that it does not work properly anyway. Specifically you aren't allowed to have a / in the interesting parts of a L<> reference. If you replace the / with an E<0x2f> then the link is valid POD, but podcheck.t then considers it a broken link. If you then replace the / in perdiag with E<0x2f> as well then porting/diag.t complains that you have an undocumented diagnostic! Accordingly I used the --regen option of podcheck.t to add exceptions to the exception database. I have no idea if the pod is correctly formatted or not. --- MANIFEST | 1 + mg.c | 16 ++++++++++++++-- pod/perldelta.pod | 26 +++++++++++++++++++++++++- pod/perldiag.pod | 24 ++++++++++++++++++++++++ pod/perlvar.pod | 7 ++++++- t/lib/warnings/9uninit | 7 +++---- t/lib/warnings/irs | 14 ++++++++++++++ t/porting/known_pod_issues.dat | 4 +++- 8 files changed, 90 insertions(+), 9 deletions(-) create mode 100644 t/lib/warnings/irs diff --git a/MANIFEST b/MANIFEST index e7d6b1d520..f20141db24 100644 --- a/MANIFEST +++ b/MANIFEST @@ -5023,6 +5023,7 @@ t/lib/warnings/doio Tests for doio.c for warnings.t t/lib/warnings/doop Tests for doop.c for warnings.t t/lib/warnings/gv Tests for gv.c for warnings.t t/lib/warnings/hv Tests for hv.c for warnings.t +t/lib/warnings/irs Tests for $/ for warnings.t t/lib/warnings/malloc Tests for malloc.c for warnings.t t/lib/warnings/mg Tests for mg.c for warnings.t t/lib/warnings/op Tests for op.c for warnings.t diff --git a/mg.c b/mg.c index a8032f50bf..6d04536a62 100644 --- a/mg.c +++ b/mg.c @@ -2749,8 +2749,20 @@ Perl_magic_set(pTHX_ SV *sv, MAGIC *mg) } break; case '/': - SvREFCNT_dec(PL_rs); - PL_rs = newSVsv(sv); + if (!SvROK(sv) || ( SvTYPE(SvRV(sv)) < SVt_PVAV && SvIV(SvRV(sv)) > 0 ) ) { + SvREFCNT_dec(PL_rs); + PL_rs = newSVsv(sv); + } else if (SvTYPE(SvRV(sv)) >= SVt_PVAV) { + Perl_croak(aTHX_ "Setting $/ to a %s reference is forbidden", sv_reftype(SvRV(sv),0)); + } else { + /* treat as undef */ + Perl_ck_warner(aTHX_ packWARN(WARN_DEPRECATED), + "Setting $/ to a reference to %s as a form of slurp is deprecated, treating as undef", + SvIV(SvRV(sv)) < 0 ? "a negative integer" : "zero" + ); + SvREFCNT_dec(PL_rs); + PL_rs= newSVsv(&PL_sv_undef); + } break; case '\\': SvREFCNT_dec(PL_ors_sv); diff --git a/pod/perldelta.pod b/pod/perldelta.pod index f5fb28e9d2..45ee8b4526 100644 --- a/pod/perldelta.pod +++ b/pod/perldelta.pod @@ -76,7 +76,23 @@ points is now tainted. This leads to more consistent tainting results. =head1 Deprecations -XXX Any deprecated features, syntax, modules etc. should be listed here. +=over 4 + +=item * + +Setting C<$/> to a reference to zero or a reference to a negative integer is +now deprecated, and will behave B as though it was set to C. +If you want slurp behavior set C<$/> to C explicitly. + +=item * + +Setting C<$/> to a reference to a non integer is now forbidden and will +throw an error. Perl has never documented what would happen in this +context and while it used to behave the same as setting C<$/> to +the address of the references in future it may behave differently, so we +have forbidden this usage. + +=back =head2 Module removals @@ -254,6 +270,10 @@ and New Warnings =item * +Added L< to a %s reference is forbidden|perldiag/"Setting $E<0x2f> to a %s reference is forbidden">> + +=item * + XXX L =back @@ -264,6 +284,10 @@ XXX L =item * +Added L< to a reference to %s as a form of slurp is deprecated, treating as undef|perldiag/"Setting $E<0x2f> to a reference to %s as a form of slurp is deprecated, treating as undef">> + +=item * + XXX L =back diff --git a/pod/perldiag.pod b/pod/perldiag.pod index 98eb026f2f..37acf2bac2 100644 --- a/pod/perldiag.pod +++ b/pod/perldiag.pod @@ -4905,6 +4905,30 @@ less. Please see the following for more information: You should also look at L. +=item Setting $/ to a reference to %s as a form of slurp is deprecated, treating as undef + +(W deprecated) You assigned a reference to a scalar to C<$/> where the +referenced item is not a positive integer. In older perls this B +to work the same as setting it to C but was in fact internally +different, less efficient and with very bad luck could have resulted in +your file being split by a stringified form of the reference. + +In Perl 5.19.9 this was changed so that it would be B the same as +setting C<$/> to undef, with the exception that this warning would be +thrown. + +You are recommended to change your code to set C<$/> to C +explicitly if you wish to slurp the file. In future versions of Perl +assigning a reference to will throw a fatal error. + +=item Setting $/ to a %s reference is forbidden + +(F) You tried to assign a reference to a non integer to C<$/>. In older +Perls this would have behaved similarly to setting it to a reference +to a positive integer, where the integer was the address of the reference. +As of Perl 5.19.9 this is a fatal error, to allow future versions of Perl +to use non integer refs for more interesting purposes. + =item setegid() not implemented (F) You tried to assign to C<$)>, and your operating system doesn't diff --git a/pod/perlvar.pod b/pod/perlvar.pod index dd1d0d693a..4dd4a549ff 100644 --- a/pod/perlvar.pod +++ b/pod/perlvar.pod @@ -1404,7 +1404,12 @@ not reading from a record-oriented file (or your OS doesn't have record-oriented files), then you'll likely get a full chunk of data with every read. If a record is larger than the record size you've set, you'll get the record back in pieces. Trying to set the record -size to zero or less will cause reading in the (rest of the) whole file. +size to zero or less is deprecated and will cause $/ to have the value +of "undef", which will cause reading in the (rest of the) whole file. + +As of 5.19.9 setting C<$/> to any other form of reference will throw a +fatal exception. This is in preparation for supporting new ways to set +C<$/> in the future. On VMS only, record reads bypass PerlIO layers and any associated buffering, so you must not mix record and non-record reads on the diff --git a/t/lib/warnings/9uninit b/t/lib/warnings/9uninit index ce06b6b8d2..d9e5b9bed7 100644 --- a/t/lib/warnings/9uninit +++ b/t/lib/warnings/9uninit @@ -409,10 +409,9 @@ chomp $x; chop $x; my $y; chomp ($x, $y); chop ($x, $y); EXPECT -Use of uninitialized value ${$/} in scalar chomp at - line 6. -Use of uninitialized value ${$/} in chomp at - line 8. -Use of uninitialized value ${$/} in chomp at - line 8. -Use of uninitialized value $y in chomp at - line 8. +Use of uninitialized value $m1 in scalar assignment at - line 4. +Use of uninitialized value $m1 in scalar assignment at - line 4. +Setting $/ to a reference to zero as a form of slurp is deprecated, treating as undef at - line 4. Use of uninitialized value $y in chop at - line 8. ######## use warnings 'uninitialized'; diff --git a/t/lib/warnings/irs b/t/lib/warnings/irs new file mode 100644 index 0000000000..9e1d3dea09 --- /dev/null +++ b/t/lib/warnings/irs @@ -0,0 +1,14 @@ +Test warnings related to $/ +__END__ +-w +# warnable code, warnings enabled via command line switch +$/ = \0; +EXPECT +Setting $/ to a reference to zero as a form of slurp is deprecated, treating as undef at - line 3. +######## +-w +# warnable code, warnings enabled via command line switch +$/ = \-1; +EXPECT +Setting $/ to a reference to a negative integer as a form of slurp is deprecated, treating as undef at - line 3. + diff --git a/t/porting/known_pod_issues.dat b/t/porting/known_pod_issues.dat index 69f79ff64c..bb0c7d7a3c 100644 --- a/t/porting/known_pod_issues.dat +++ b/t/porting/known_pod_issues.dat @@ -232,7 +232,7 @@ pod/perlcygwin.pod Verbatim line length including indents exceeds 79 by 25 pod/perldebguts.pod Verbatim line length including indents exceeds 79 by 34 pod/perldebtut.pod Verbatim line length including indents exceeds 79 by 22 pod/perldebug.pod Verbatim line length including indents exceeds 79 by 3 -pod/perldelta.pod Apparent broken link 1 +pod/perldelta.pod Apparent broken link 3 pod/perldsc.pod Verbatim line length including indents exceeds 79 by 4 pod/perldtrace.pod Verbatim line length including indents exceeds 79 by 26 pod/perlebcdic.pod Verbatim line length including indents exceeds 79 by 3 @@ -262,6 +262,7 @@ pod/perlos2.pod ? Should you be using L<...> instead of 2 pod/perlos2.pod Verbatim line length including indents exceeds 79 by 21 pod/perlos390.pod Verbatim line length including indents exceeds 79 by 11 pod/perlperf.pod Verbatim line length including indents exceeds 79 by 154 +pod/perlqnx.pod Verbatim line length including indents exceeds 79 by 1 pod/perlrun.pod Verbatim line length including indents exceeds 79 by 3 pod/perlsolaris.pod Verbatim line length including indents exceeds 79 by 14 pod/perlsource.pod ? Should you be using F<...> or maybe L<...> instead of 1 @@ -280,6 +281,7 @@ porting/todo.pod Verbatim line length including indents exceeds 79 by 7 utils/c2ph Verbatim line length including indents exceeds 79 by 44 lib/benchmark.pm Verbatim line length including indents exceeds 79 by 2 lib/config.pod ? Should you be using L<...> instead of -1 +lib/config.pod Verbatim line length including indents exceeds 79 by 5 lib/extutils/embed.pm Verbatim line length including indents exceeds 79 by 2 lib/perl5db.pl ? Should you be using L<...> instead of 1 lib/pod/text/overstrike.pm Verbatim line length including indents exceeds 79 by 1 -- cgit v1.2.1