author | Ricardo Signes <rjbs@cpan.org> | 2012-02-15 22:39:47 -0500 |
---|---|---|
committer | Ricardo Signes <rjbs@cpan.org> | 2012-02-15 22:50:34 -0500 |
commit | 412912b60bb061ff9d485db6de9e019158766942 (patch) | |
tree | 7a87bb31171900746a2dcc822f4766a538904de8 | |
parent | b325a3a2e3b11c481c03495d5f9088f9fd23640e (diff) | |
begin filling the 5.16.0 delta from 5.15.6
This is largely a copy and paste job. Once I copy and paste most
things in, I will then start condensing them.
This does *not* include the following sections from perl5156delta:
* module updates
* internals changes
-rw-r--r-- | Porting/perl5160delta.pod | 549 |
1 files changed, 535 insertions, 14 deletions
diff --git a/Porting/perl5160delta.pod b/Porting/perl5160delta.pod index bd76f7f97e..0b29c26697 100644 --- a/Porting/perl5160delta.pod +++ b/Porting/perl5160delta.pod @@ -19,6 +19,186 @@ XXX Any important notices here =head1 Core Enhancements +=head2 C<is_utf8_char()> + +The XS-callable function C<is_utf8_char()>, when presented with +malformed UTF-8 input, can read up to 12 bytes beyond the end of the +string. This cannot be fixed without changing its API. It is not +called from CPAN. The documentation now describes how to use it +safely. + +=head2 Other C<is_utf8_foo()> functions, as well as C<utf8_to_foo()>, etc. + +Most of the other XS-callable functions that take UTF-8 encoded input +implicitly assume that the UTF-8 is valid (not malformed) in regards to +buffer length. Do not do things such as change a character's case or +see if it is alphanumeric without first being sure that it is valid +UTF-8. This can be safely done for a whole string by using one of the +functions C<is_utf8_string()>, C<is_utf8_string_loc()>, and +C<is_utf8_string_loclen()>. + +=head2 C<use I<VERSION>> + +As of this release, version declarations like C<use v5.16> now disable +all features before enabling the new feature bundle. This means that +the following holds true: + + use 5.016; + # 5.16 features enabled here + use 5.014; + # 5.16 features disabled here + +C<use v5.12> and higher continue to enable strict, but explicit C<use +strict> and C<no strict> now override the version declaration, even +when they come first: + + no strict; + use 5.012; + # no strict here + +There is a new ":default" feature bundle that represents the set of +features enabled before any version declaration or C<use feature> has +been seen. Version declarations below 5.10 now enable the ":default" +feature set. This does not actually change the behaviour of C<use +v5.8>, because features added to the ":default" set are those that were +traditionally enabled by default, before they could be turned off. 
+ +C<$[> is now disabled under C<use v5.16>. It is part of the default +feature set and can be turned on or off explicitly with C<use feature +'array_base'>. + +=head2 C<UNIVERSAL::VERSION> + +The change to C<UNIVERSAL::VERSION> in 5.15.2 has been reverted. It +now returns a stringified version object once more. + +=head2 C<substr> lvalue revamp + +When C<substr> is called in lvalue or potential lvalue context with two +or three arguments, a special lvalue scalar is returned that modifies +the original string (the first argument) when assigned to. + +Previously, the offsets (the second and third arguments) passed to +C<substr> would be converted immediately to match the string, negative +offsets being translated to positive and offsets beyond the end of the +string being truncated. + +Now, the offsets are recorded without modification in the special +lvalue scalar that is returned, and the original string is not even +looked at by C<substr> itself, but only when the returned lvalue is +read or modified. + +These changes result in several incompatible changes and bug fixes: + +=over + +=item * + +If the original string changes length after the call to C<substr> but +before assignment to its return value, negative offsets will remember +their position from the end of the string, affecting code like this: + + my $string = "string"; + my $lvalue = \substr $string, -4, 2; + print $lvalue, "\n"; # prints "ri" + $string = "bailing twine"; + print $lvalue, "\n"; # prints "wi"; used to print "il" + +The same thing happens with an omitted third argument. The returned +lvalue will always extend to the end of the string, even if the string +becomes longer. + +=item * + +Tied (and otherwise magical) variables are no longer exempt from the +"Attempt to use reference as lvalue in substr" warning. + +=item * + +That warning now occurs when the returned lvalue is assigned to, not +when C<substr> itself is called. 
This only makes a difference if the +return value of C<substr> is referenced and assigned to later. + +=item * + +The order in which "uninitialized" warnings occur for arguments to +C<substr> has changed. + +=item * + +Passing a substring of a read-only value or a typeglob to a function +(potential lvalue context) no longer causes an immediate "Can't coerce" +or "Modification of a read-only value" error. That error only occurs +if and when the value passed is assigned to. + +The same thing happens with the "substr outside of string" error. If +the lvalue is only read, not written to, it is now just a warning, as +with rvalue C<substr>. + +=item * + +C<substr> assignments no longer call FETCH twice if the first argument +is a tied variable, just once. + +=back + +It was impossible to fix all the bugs without an incompatible change, +and the behaviour of negative offsets was never specified, so the +change was deemed acceptable. + +=head2 Return value of C<eval> + +C<eval> returns C<undef> in scalar context or an empty list in list +context when there is a run-time error. When C<eval> was passed a +string in list context and a syntax error occurred, it used to return a +list containing a single undefined element. Now it returns an empty +list in list context for all errors [perl #80630]. + +=head2 Anonymous handles + +Automatically generated file handles are now named __ANONIO__ when the +variable name cannot be determined, rather than $__ANONIO__. + +=head2 Last-accessed filehandle + +Perl has an internal variable that stores the last filehandle to be +accessed. It is used by C<$.> and by C<tell> and C<eof> without +arguments. 
+ +It used to be possible to set this internal variable to a glob copy and +then modify that glob copy to be something other than a glob, and still +have the last-accessed filehandle associated with the variable after +assigning a glob to it again: + + my $foo = *STDOUT; # $foo is a glob copy + <$foo>; # $foo is now the last-accessed handle + $foo = 3; # no longer a glob + $foo = *STDERR; # still the last-accessed handle + +Now the C<$foo = 3> assignment unsets that internal variable, so there +is no last-accessed filehandle, just as if C<< <$foo> >> had never +happened. + +=head2 C<__SUB__> + +The new C<__SUB__> token, available under the "current_sub" feature +(see L<feature>) or C<use v5.15>, returns a reference to the current +subroutine, making it easier to write recursive closures. + +=head2 New option for the debugger's B<t> command + +The B<t> command in the debugger, which toggles tracing mode, now +accepts a numeric argument that determines how many levels of +subroutine calls to trace. + +=head2 Return value of C<tied> + +The value returned by C<tied> on a tied variable is now the actual +scalar that holds the object to which the variable is tied. This +allows ties to be weakened with C<Scalar::Util::weaken(tied +$tied_variable)>. + + =head2 More consistent C<eval> The C<eval> operator sometimes treats a string argument as a sequence of @@ -205,6 +385,35 @@ in your C<TYPEMAP> section: =item * +Perl 5.12.0 sped up the destruction of objects whose classes define +empty C<DESTROY> methods (to prevent autoloading), by simply not +calling such empty methods. This release takes this optimisation a +step further, by not calling any C<DESTROY> method that begins with a +C<return> statement. This can be useful for destructors that are only +used for debugging: + + use constant DEBUG => 1; + sub DESTROY { return unless DEBUG; ... } + +Constant-folding will reduce the first statement to C<return;> if DEBUG +is set to 0, triggering this optimisation. 
+ +=item * + +Assigning to a variable that holds a typeglob or copy-on-write scalar +is now much faster. Previously the typeglob would be stringified or +the copy-on-write scalar would be copied before being clobbered. + +=item * + +Assignment to C<substr> in void context is now more than twice its +previous speed. Instead of creating and returning a special lvalue +scalar that is then assigned to, C<substr> modifies the original string +itself. + + +=item * + C<substr> no longer calculates a value to return when called in void context. @@ -571,6 +780,19 @@ Perl. It is still a work in progress. =head2 Changes to Existing Documentation +=head3 L<perlsec/Laundering and Detecting Tainted Data> + +=over 4 + +=item * + +The example function for checking for taintedness contained a subtle +error. C<$@> needs to be localized to prevent its changing this +global's value outside the function. The preferred method to check for +this remains L<Scalar::Util/tainted>. + +=back + =head3 L<perlfunc>, L<open> =over 4 @@ -913,6 +1135,33 @@ of C<$[> as a module. =item * +Redefinition warnings for constant subroutines used to be mandatory, +even occurring under C<no warnings>. Now they respect the L<warnings> +pragma. + +=item * + +The "Attempt to free non-existent shared string" has had the spelling +of "non-existent" corrected to "nonexistent". It was already listed +with the correct spelling in L<perldiag>. + +=item * + +The 'Use of "foo" without parentheses is ambiguous' warning has been +extended to apply also to user-defined subroutines with a (;$) +prototype, and not just to built-in functions. + +=item * + +The error messages for using C<default> and C<when> outside of a +topicalizer have been standardised to match the messages for +C<continue> and loop controls. They now read 'Can't "default" outside +a topicalizer' and 'Can't "when" outside a topicalizer'. They both +used to be 'Can't use when() outside a topicalizer' [perl #91514]. 
+ + +=item * + The uninitialized warning for C<y///r> when C<$_> is implicit and undefined now mentions the variable name, just like the non-/r variation of the operator. @@ -965,12 +1214,17 @@ XXX Describe change here =head1 Utility Changes -XXX Changes to installed programs such as F<perlbug> and F<xsubpp> go -here. Most of these are built within the directories F<utils> and F<x2p>. +=head3 L<zipdetails> + +=over 4 + +=item * -[ List utility changes as a =head3 entry for each utility and =item -entries for each change -Use L<XXX> with program names to get proper documentation linking. ] +L<zipdetails> displays information about the internal record structure +of the zip file. It is not concerned with displaying any details of +the compressed data stored in the zip file. + +=back =head3 L<h2ph> @@ -980,9 +1234,9 @@ Use L<XXX> with program names to get proper documentation linking. ] L<h2ph> used to generate code of the form - unless(defined(&FOO)) { - sub FOO () {42;} - } + unless(defined(&FOO)) { + sub FOO () {42;} + } But the subroutine is a compile-time declaration, and is hence unaffected by the condition. It has now been corrected to emit a string C<eval> @@ -996,6 +1250,11 @@ around the subroutine [perl #99368]. =item * +The -Dusesitecustomize and -Duserelocatableinc options now work +together properly. + +=item * + F<regexp.h> has been modified for compatibility with GCC's B<-Werror> option, as used by some projects that include perl's header files (5.14.1). @@ -1088,33 +1347,53 @@ XXX =head2 Platform-Specific Notes +=head3 VMS + =over 4 -=item VMS +=item * Remove unnecessary includes, fix miscellaneous compiler warnings and close some unclosed comments on F<vms/vms.c>. Remove sockadapt layer from the VMS build. -=item GNU/Hurd +=item * + +A link-time error on VMS versions without C<symlink> support was +introduced in 5.15.1, but has now been corrected. 
+ +=item * + +Explicit support for VMS versions prior to v7.0 and DEC C versions +prior to v6.0 has been removed. + +=item * + +Since Perl 5.10.1, the home-grown C<stat> wrapper has been unable to +distinguish between a directory name containing an underscore and an +otherwise-identical filename containing a dot in the same position +(e.g., t/test_pl as a directory and t/test.pl as a file). This problem +has been corrected. + +=back + +=head3 GNU/Hurd Numerous build and test failures on GNU/Hurd have been resolved with hints for building DBM modules, detection of the library search path, and enabling of large file support. -=item OpenVOS +=head3 OpenVOS Perl is now built with dynamic linking on OpenVOS, the minimum supported version of which is now Release 17.1.0. -=item SunOS +=head3 SunOS The CC workshop C++ compiler is now detected and used on systems that ship without cc. -=back - =head1 Internal Changes =over 4 @@ -1367,6 +1646,248 @@ fixed [perl #85026]. =item * +RT #78266: The regex engine has been leaking memory when accessing +named captures that weren't matched as part of a regex ever since 5.10 +when they were introduced, e.g. this would consume over a hundred MB of +memory: + + for (1..10_000_000) { + if ("foo" =~ /(foo|(?<capture>bar))?/) { + my $capture = $+{capture} + } + } + system "ps -o rss $$"' + +=item * + +A constant subroutine assigned to a glob whose name contains a null +will no longer cause extra globs to pop into existence when the +constant is referenced under its new name. + +=item * + +C<sort> was not treating C<sub {}> and C<sub {()}> as equivalent when +such a sub was provided as the comparison routine. It used to croak on +C<sub {()}>. + +=item * + +Subroutines from the C<autouse> namespace are once more exempt from +redefinition warnings. This used to work in 5.005, but was broken in +5.6 for most subroutines. For subs created via XS that redefine +subroutines from the C<autouse> package, this stopped working in 5.10. 
+ +=item * + +New XSUBs now produce redefinition warnings if they overwrite existing +subs, as they did in 5.8.x. (The C<autouse> logic was reversed in +5.10-14. Only subroutines from the C<autouse> namespace would warn +when clobbered.) + +=item * + +Redefinition warnings triggered by the creation of XSUBs now respect +Unicode glob names, instead of using the internal representation. This +was missed in 5.15.4, partly because this warning was so hard to +trigger. (See the previous item.) + +=item * + +C<newCONSTSUB> used to use compile-time warning hints, instead of +run-time hints. The following code should never produce a redefinition +warning, but it used to, if C<newCONSTSUB> redefined an existing +subroutine: + + use warnings; + BEGIN { + no warnings; + some_XS_function_that_calls_new_CONSTSUB(); + } + +=item * + +Redefinition warnings for constant subroutines are on by default (what +are known as severe warnings in L<perldiag>). This was only the case +when it was a glob assignment or declaration of a Perl subroutine that +caused the warning. If the creation of XSUBs triggered the warning, it +was not a default warning. This has been corrected. + +=item * + +The internal check to see whether a redefinition warning should occur +used to emit "uninitialized" warnings in cases like this: + + use warnings "uninitialized"; + use constant {u => undef, v => undef}; + sub foo(){u} + sub foo(){v} + +=item * + +A bug fix in Perl 5.14 introduced a new bug, causing "uninitialized" +warnings to report the wrong variable if the operator in question had +two operands and one was C<%{...}> or C<@{...}>. This has been fixed +[perl #103766]. + +=item * + +C<< version->new("version") >> and C<printf "%vd", "version"> no longer +crash [perl #102586]. + +=item * + +C<$tied =~ y/a/b/>, C<chop $tied> and C<chomp $tied> now call FETCH +just once when $tied holds a reference. + +=item * + +Four-argument C<select> now always calls FETCH on tied arguments. 
It +used to skip the call if the tied argument happened to hold C<undef> or +a typeglob. + +=item * + +Four-argument C<select> no longer produces its "Non-string passed as +bitmask" warning on tied or tainted variables that are strings. + +=item * + +C<sysread> now always calls FETCH on the buffer passed to it if the +buffer is tied. It used to skip the call if the tied variable happened +to hold a typeglob. + +=item * + +C<< $tied .= <> >> now calls FETCH once on C<$tied>. It used to call +it multiple times if the last value assigned to or returned from the +tied variable was anything other than a string or typeglob. + +=item * + +The C<evalbytes> keyword added in 5.15.5 was respecting C<use utf8> +declarations from the outer scope, when it should have been ignoring +them. + +=item * + +C<goto &func> no longer crashes, but produces an error message, when +the unwinding of the current subroutine's scope fires a destructor that +undefines the subroutine being "goneto" [perl #99850]. + +=item * + +Arithmetic assignment (C<$left += $right>) involving overloaded objects +that rely on the 'nomethod' override no longer segfault when the left +operand is not overloaded. + +=item * + +Assigning C<__PACKAGE__> or any other shared hash key scalar to a stash +element no longer causes a double free. Regardless of this change, the +results of such assignments are still undefined. + +=item * + +Assigning C<__PACKAGE__> or another shared hash key string to a +variable no longer stops that variable from being tied if it happens to +be a PVMG or PVLV internally. + +=item * + +Creating a C<UNIVERSAL::AUTOLOAD> sub no longer stops C<%+>, C<%-> and +C<%!> from working some of the time [perl #105024]. + +=item * + +When presented with malformed UTF-8 input, the XS-callable functions +C<is_utf8_string()>, C<is_utf8_string_loc()>, and +C<is_utf8_string_loclen()> could read beyond the end of the input +string by up to 12 bytes. This no longer happens. [perl #32080]. 
+However, currently, C<is_utf8_char()> still has this defect, see +L</is_utf8_char()> above. + +=item * + +Doing a substitution on a tied variable returning a copy-on-write +scalar used to cause an assertion failure or an "Attempt to free +nonexistent shared string" warning. + +=item * + +A change in perl 5.15.4 caused C<caller()> to produce malloc errors and +a crash with Perl's own malloc, and possibly with other malloc +implementations, too [perl #104034]. + +=item * + +A bug fix in 5.15.5 could sometimes result in assertion failures under +debugging builds of perl for certain syntax errors in C<eval>, such as +C<eval q|""!=!~//|> + +=item * + +The "c [line num]" debugger command was broken by other debugger +changes released in 5.15.3. This is now fixed. + +=item * + +Breakpoints were not properly restored after a debugger restart using +the "R" command. This was broken in 5.15.3. This is now fixed. + +=item * + +The debugger prompt did not display the current line. This was broken +in 5.15.3. This is now fixed. + +=item * + +Class method calls still suffered from the Unicode bug with Latin-1 +package names. This was missed in the Unicode package name cleanup in +5.15.4 [perl #105922]. + +=item * + +The debugger no longer tries to do C<local $_> when dumping data +structures. + +=item * + +Calling C<readline($fh)> where $fh is a glob copy (e.g., after C<$fh = +*STDOUT>), assigning something other than a glob to $fh, and then +freeing $fh (e.g., by leaving the scope where it is defined) no longer +causes the internal variable used by C<$.> (C<PL_last_in_gv>) to point +to a freed scalar, that could be reused for some other glob, causing +C<$.> to use some unrelated filehandle [perl #97988]. 
+ +=item * + +A regression in 5.14 caused these statements not to set the internal +variable that holds the handle used by C<$.>: + + my $fh = *STDOUT; + tell $fh; + eof $fh; + seek $fh, 0,0; + tell *$fh; + eof *$fh; + seek *$fh, 0,0; + readline *$fh; + +This is now fixed, but C<tell *{ *$fh }> still has the problem, and it +is not clear how to fix it [perl #106536]. + +=item * + +Version comparisons, such as those that happen implicitly with C<use +v5.43>, no longer cause locale settings to change [perl #105784]. + +=item * + +F<pod/buildtoc>, which generates L<perltoc>, put path names in the +L<perltoc> file. This bug was introduced in 5.15.1. + +=item * + Perl now holds an extra reference count on the package that code is currently compiling in. This means that the following code no longer crashes [perl #101486]: