From 412912b60bb061ff9d485db6de9e019158766942 Mon Sep 17 00:00:00 2001 From: Ricardo Signes Date: Wed, 15 Feb 2012 22:39:47 -0500 Subject: begin filling the 5.16.0 delta from 5.15.6 This is largely a copy and paste job. Once I copy and paste most things in, I will then start condensing them. This does *not* include the following sections from perl5156delta: * module updates * internals changes --- Porting/perl5160delta.pod | 549 ++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 535 insertions(+), 14 deletions(-) diff --git a/Porting/perl5160delta.pod b/Porting/perl5160delta.pod index bd76f7f97e..0b29c26697 100644 --- a/Porting/perl5160delta.pod +++ b/Porting/perl5160delta.pod @@ -19,6 +19,186 @@ XXX Any important notices here =head1 Core Enhancements +=head2 C + +The XS-callable function C, when presented with +malformed UTF-8 input, can read up to 12 bytes beyond the end of the +string. This cannot be fixed without changing its API. It is not +called from CPAN. The documentation now describes how to use it +safely. + +=head2 Other C functions, as well as C, etc. + +Most of the other XS-callable functions that take UTF-8 encoded input +implicitly assume that the UTF-8 is valid (not malformed) in regards to +buffer length. Do not do things such as change a character's case or +see if it is alphanumeric without first being sure that it is valid +UTF-8. This can be safely done for a whole string by using one of the +functions C, C, and +C. + +=head2 C> + +As of this release, version declarations like C now disable +all features before enabling the new feature bundle. 
This means that +the following holds true: + + use 5.016; + # 5.16 features enabled here + use 5.014; + # 5.16 features disabled here + +C and higher continue to enable strict, but explicit C and C now override the version declaration, even +when they come first: + + no strict; + use 5.012; + # no strict here + +There is a new ":default" feature bundle that represents the set of +features enabled before any version declaration or C has +been seen. Version declarations below 5.10 now enable the ":default" +feature set. This does not actually change the behaviour of C, because features added to the ":default" set are those that were +traditionally enabled by default, before they could be turned off. + +C<$[> is now disabled under C. It is part of the default +feature set and can be turned on or off explicitly with C. + +=head2 C + +The change to C in 5.15.2 has been reverted. It +now returns a stringified version object once more. + +=head2 C lvalue revamp + +When C is called in lvalue or potential lvalue context with two +or three arguments, a special lvalue scalar is returned that modifies +the original string (the first argument) when assigned to. + +Previously, the offsets (the second and third arguments) passed to +C would be converted immediately to match the string, negative +offsets being translated to positive and offsets beyond the end of the +string being truncated. + +Now, the offsets are recorded without modification in the special +lvalue scalar that is returned, and the original string is not even +looked at by C itself, but only when the returned lvalue is +read or modified. 
+ +These changes result in several incompatible changes and bug fixes: + +=over + +=item * + +If the original string changes length after the call to C but +before assignment to its return value, negative offsets will remember +their position from the end of the string, affecting code like this: + + my $string = "string"; + my $lvalue = \substr $string, -4, 2; + print $lvalue, "\n"; # prints "ri" + $string = "bailing twine"; + print $lvalue, "\n"; # prints "wi"; used to print "il" + +The same thing happens with an omitted third argument. The returned +lvalue will always extend to the end of the string, even if the string +becomes longer. + +=item * + +Tied (and otherwise magical) variables are no longer exempt from the +"Attempt to use reference as lvalue in substr" warning. + +=item * + +That warning now occurs when the returned lvalue is assigned to, not +when C itself is called. This only makes a difference if the +return value of C is referenced and assigned to later. + +=item * + +The order in which "uninitialized" warnings occur for arguments to +C has changed. + +=item * + +Passing a substring of a read-only value or a typeglob to a function +(potential lvalue context) no longer causes an immediate "Can't coerce" +or "Modification of a read-only value" error. That error only occurs +if and when the value passed is assigned to. + +The same thing happens with the "substr outside of string" error. If +the lvalue is only read, not written to, it is now just a warning, as +with rvalue C. + +=item * + +C assignments no longer call FETCH twice if the first argument +is a tied variable, just once. + +=back + +It was impossible to fix all the bugs without an incompatible change, +and the behaviour of negative offsets was never specified, so the +change was deemed acceptable. + +=head2 Return value of C + +C returns C in scalar context or an empty list in list +context when there is a run-time error. 
When C was passed a +string in list context and a syntax error occurred, it used to return a +list containing a single undefined element. Now it returns an empty +list in list context for all errors [perl #80630]. + +=head2 Anonymous handles + +Automatically generated file handles are now named __ANONIO__ when the +variable name cannot be determined, rather than $__ANONIO__. + +=head2 Last-accessed filehandle + +Perl has an internal variable that stores the last filehandle to be +accessed. It is used by C<$.> and by C and C without +arguments. + +It used to be possible to set this internal variable to a glob copy and +then modify that glob copy to be something other than a glob, and still +have the last-accessed filehandle associated with the variable after +assigning a glob to it again: + + my $foo = *STDOUT; # $foo is a glob copy + <$foo>; # $foo is now the last-accessed handle + $foo = 3; # no longer a glob + $foo = *STDERR; # still the last-accessed handle + +Now the C<$foo = 3> assignment unsets that internal variable, so there +is no last-accessed filehandle, just as if C<< <$foo> >> had never +happened. + +=head2 C<__SUB__> + +The new C<__SUB__> token, available under the "current_sub" feature +(see L) or C, returns a reference to the current +subroutine, making it easier to write recursive closures. + +=head2 New option for the debugger's B command + +The B command in the debugger, which toggles tracing mode, now +accepts a numeric argument that determines how many levels of +subroutine calls to trace. + +=head2 Return value of C + +The value returned by C on a tied variable is now the actual +scalar that holds the object to which the variable is tied. This +allows ties to be weakened with C. 
+ + =head2 More consistent C The C operator sometimes treats a string argument as a sequence of @@ -203,6 +383,35 @@ in your C section: =over 4 +=item * + +Perl 5.12.0 sped up the destruction of objects whose classes define +empty C methods (to prevent autoloading), by simply not +calling such empty methods. This release takes this optimisation a +step further, by not calling any C method that begins with a +C statement. This can be useful for destructors that are only +used for debugging: + + use constant DEBUG => 1; + sub DESTROY { return unless DEBUG; ... } + +Constant-folding will reduce the first statement to C if DEBUG +is set to 0, triggering this optimisation. + +=item * + +Assigning to a variable that holds a typeglob or copy-on-write scalar +is now much faster. Previously the typeglob would be stringified or +the copy-on-write scalar would be copied before being clobbered. + +=item * + +Assignment to C in void context is now more than twice its +previous speed. Instead of creating and returning a special lvalue +scalar that is then assigned to, C modifies the original string +itself. + + =item * C no longer calculates a value to return when called in void @@ -571,6 +780,19 @@ Perl. It is still a work in progress. =head2 Changes to Existing Documentation +=head3 L + +=over 4 + +=item * + +The example function for checking for taintedness contained a subtle +error. C<$@> needs to be localized to prevent its changing this +global's value outside the function. The preferred method to check for +this remains L. + +=back + =head3 L, L =over 4 @@ -911,6 +1133,33 @@ of C<$[> as a module. =over 4 +=item * + +Redefinition warnings for constant subroutines used to be mandatory, +even occurring under C. Now they respect the L +pragma. + +=item * + +The "Attempt to free non-existent shared string" has had the spelling +of "non-existent" corrected to "nonexistent". It was already listed +with the correct spelling in L. 
+ +=item * + +The 'Use of "foo" without parentheses is ambiguous' warning has been +extended to apply also to user-defined subroutines with a (;$) +prototype, and not just to built-in functions. + +=item * + +The error messages for using C and C outside of a +topicalizer have been standardised to match the messages for +C and loop controls. They now read 'Can't "default" outside +a topicalizer' and 'Can't "when" outside a topicalizer'. They both +used to be 'Can't use when() outside a topicalizer' [perl #91514]. + + =item * The uninitialized warning for C when C<$_> is implicit and undefined @@ -965,12 +1214,17 @@ XXX Describe change here =head1 Utility Changes -XXX Changes to installed programs such as F and F go -here. Most of these are built within the directories F and F. +=head3 L + +=over 4 + +=item * -[ List utility changes as a =head3 entry for each utility and =item -entries for each change -Use L with program names to get proper documentation linking. ] +L displays information about the internal record structure +of the zip file. It is not concerned with displaying any details of +the compressed data stored in the zip file. + +=back =head3 L @@ -980,9 +1234,9 @@ Use L with program names to get proper documentation linking. ] L used to generate code of the form - unless(defined(&FOO)) { - sub FOO () {42;} - } + unless(defined(&FOO)) { + sub FOO () {42;} + } But the subroutine is a compile-time declaration, and is hence unaffected by the condition. It has now been corrected to emit a string C @@ -996,6 +1250,11 @@ around the subroutine [perl #99368]. =item * +The -Dusesitecustomize and -Duserelocatableinc options now work +together properly. + +=item * + F has been modified for compatibility with GCC's B<-Werror> option, as used by some projects that include perl's header files (5.14.1). 
@@ -1088,33 +1347,53 @@ XXX =head2 Platform-Specific Notes +=head3 VMS + =over 4 -=item VMS +=item * Remove unnecessary includes, fix miscellaneous compiler warnings and close some unclosed comments on F. Remove sockadapt layer from the VMS build. -=item GNU/Hurd +=item * + +A link-time error on VMS versions without C support was +introduced in 5.15.1, but has now been corrected. + +=item * + +Explicit support for VMS versions prior to v7.0 and DEC C versions +prior to v6.0 has been removed. + +=item * + +Since Perl 5.10.1, the home-grown C wrapper has been unable to +distinguish between a directory name containing an underscore and an +otherwise-identical filename containing a dot in the same position +(e.g., t/test_pl as a directory and t/test.pl as a file). This problem +has been corrected. + +=back + +=head3 GNU/Hurd Numerous build and test failures on GNU/Hurd have been resolved with hints for building DBM modules, detection of the library search path, and enabling of large file support. -=item OpenVOS +=head3 OpenVOS Perl is now built with dynamic linking on OpenVOS, the minimum supported version of which is now Release 17.1.0. -=item SunOS +=head3 SunOS The CC workshop C++ compiler is now detected and used on systems that ship without cc. -=back - =head1 Internal Changes =over 4 @@ -1367,6 +1646,248 @@ fixed [perl #85026]. =item * +RT #78266: The regex engine has been leaking memory when accessing +named captures that weren't matched as part of a regex ever since 5.10 +when they were introduced, e.g. this would consume over a hundred MB of +memory: + + for (1..10_000_000) { + if ("foo" =~ /(foo|(?<capture>bar))?/) { + my $capture = $+{capture}; + } + } + system "ps -o rss $$"; + +=item * + +A constant subroutine assigned to a glob whose name contains a null +will no longer cause extra globs to pop into existence when the +constant is referenced under its new name. 
+ +=item * + +C was not treating C and C as equivalent when +such a sub was provided as the comparison routine. It used to croak on +C. + +=item * + +Subroutines from the C namespace are once more exempt from +redefinition warnings. This used to work in 5.005, but was broken in +5.6 for most subroutines. For subs created via XS that redefine +subroutines from the C package, this stopped working in 5.10. + +=item * + +New XSUBs now produce redefinition warnings if they overwrite existing +subs, as they did in 5.8.x. (The C logic was reversed in +5.10-14. Only subroutines from the C namespace would warn +when clobbered.) + +=item * + +Redefinition warnings triggered by the creation of XSUBs now respect +Unicode glob names, instead of using the internal representation. This +was missed in 5.15.4, partly because this warning was so hard to +trigger. (See the previous item.) + +=item * + +C used to use compile-time warning hints, instead of +run-time hints. The following code should never produce a redefinition +warning, but it used to, if C redefined an existing +subroutine: + + use warnings; + BEGIN { + no warnings; + some_XS_function_that_calls_new_CONSTSUB(); + } + +=item * + +Redefinition warnings for constant subroutines are on by default (what +are known as severe warnings in L). This was only the case +when it was a glob assignment or declaration of a Perl subroutine that +caused the warning. If the creation of XSUBs triggered the warning, it +was not a default warning. This has been corrected. + +=item * + +The internal check to see whether a redefinition warning should occur +used to emit "uninitialized" warnings in cases like this: + + use warnings "uninitialized"; + use constant {u => undef, v => undef}; + sub foo(){u} + sub foo(){v} + +=item * + +A bug fix in Perl 5.14 introduced a new bug, causing "uninitialized" +warnings to report the wrong variable if the operator in question had +two operands and one was C<%{...}> or C<@{...}>. 
This has been fixed +[perl #103766]. + +=item * + +C<< version->new("version") >> and C no longer +crash [perl #102586]. + +=item * + +C<$tied =~ y/a/b/>, C and C now call FETCH +just once when $tied holds a reference. + +=item * + +Four-argument C no longer produces its "Non-string passed as +bitmask" warning on tied or tainted variables that are strings. + +=item * + +C now always calls FETCH on the buffer passed to it if the +buffer is tied. It used to skip the call if the tied variable happened +to hold a typeglob. + +=item * + +C<< $tied .= <> >> now calls FETCH once on C<$tied>. It used to call +it multiple times if the last value assigned to or returned from the +tied variable was anything other than a string or typeglob. + +=item * + +The C keyword added in 5.15.5 was respecting C +declarations from the outer scope, when it should have been ignoring +them. + +=item * + +C no longer crashes, but produces an error message, when +the unwinding of the current subroutine's scope fires a destructor that +undefines the subroutine being "goneto" [perl #99850]. + +=item * + +Arithmetic assignment (C<$left += $right>) involving overloaded objects +that rely on the 'nomethod' override no longer segfaults when the left +operand is not overloaded. + +=item * + +Assigning C<__PACKAGE__> or any other shared hash key scalar to a stash +element no longer causes a double free. Regardless of this change, the +results of such assignments are still undefined. + +=item * + +Assigning C<__PACKAGE__> or another shared hash key string to a +variable no longer stops that variable from being tied if it happens to +be a PVMG or PVLV internally. + +=item * + +Creating a C sub no longer stops C<%+>, C<%-> and +C<%!> from working some of the time [perl #105024]. + +=item * + +When presented with malformed UTF-8 input, the XS-callable functions +C, C, and +C could read beyond the end of the input +string by up to 12 bytes. This no longer happens [perl #32080]. 
+However, currently, C still has this defect; see +L above. + +=item * + +Doing a substitution on a tied variable returning a copy-on-write +scalar used to cause an assertion failure or an "Attempt to free +nonexistent shared string" warning. + +=item * + +A change in perl 5.15.4 caused C to produce malloc errors and +a crash with Perl's own malloc, and possibly with other malloc +implementations, too [perl #104034]. + +=item * + +A bug fix in 5.15.5 could sometimes result in assertion failures under +debugging builds of perl for certain syntax errors in C, such as +C + +=item * + +The "c [line num]" debugger command was broken by other debugger +changes released in 5.15.3. This is now fixed. + +=item * + +Breakpoints were not properly restored after a debugger restart using +the "R" command. This was broken in 5.15.3. This is now fixed. + +=item * + +The debugger prompt did not display the current line. This was broken +in 5.15.3. This is now fixed. + +=item * + +Class method calls still suffered from the Unicode bug with Latin-1 +package names. This was missed in the Unicode package name cleanup in +5.15.4 [perl #105922]. + +=item * + +The debugger no longer tries to do C when dumping data +structures. + +=item * + +Calling C where $fh is a glob copy (e.g., after C<$fh = +*STDOUT>), assigning something other than a glob to $fh, and then +freeing $fh (e.g., by leaving the scope where it is defined) no longer +causes the internal variable used by C<$.> (C) to point +to a freed scalar that could be reused for some other glob, causing +C<$.> to use some unrelated filehandle [perl #97988]. + +=item * + +A regression in 5.14 caused these statements not to set the internal +variable that holds the handle used by C<$.>: + + my $fh = *STDOUT; + tell $fh; + eof $fh; + seek $fh, 0,0; + tell *$fh; + eof *$fh; + seek *$fh, 0,0; + readline *$fh; + +This is now fixed, but C still has the problem, and it +is not clear how to fix it [perl #106536]. 
+ +=item * + +Version comparisons, such as those that happen implicitly with C, no longer cause locale settings to change [perl #105784]. + +=item * + +F, which generates L, put path names in the +L file. This bug was introduced in 5.15.1. + +=item * + Perl now holds an extra reference count on the package that code is currently compiling in. This means that the following code no longer crashes [perl #101486]: -- cgit v1.2.1