From 4d99a7193c0f7ae68e0c1fc7b626e4ee3a8ce67b Mon Sep 17 00:00:00 2001 From: Ricardo Signes Date: Mon, 20 Feb 2012 19:08:16 -0500 Subject: omnibus perl5160delta editing mess Omnibus not because it does all the editing, but because I did it while riding the bus. The preliminary editing is about 10% done. Once things are better pruned and in their place, I will go through to edit the prose. --- Porting/perl5160delta.pod | 800 ++++++++++++++++++++++++---------------------- 1 file changed, 409 insertions(+), 391 deletions(-) diff --git a/Porting/perl5160delta.pod b/Porting/perl5160delta.pod index f897c98a62..a022b338a6 100644 --- a/Porting/perl5160delta.pod +++ b/Porting/perl5160delta.pod @@ -1,5 +1,40 @@ =encoding utf8 + XXX: This is here for my own reference while editing! -- rjbs, 2012-02-20 + + =head1 Notice + =head1 Core Enhancements + =head1 Security + =head1 Incompatible Changes + =head1 Deprecations + =head1 Performance Enhancements + =head1 Modules and Pragmata + =head2 New Modules and Pragmata + =head2 Updated Modules and Pragmata + =head2 Removed Modules and Pragmata + =head1 Documentation + =head2 New Documentation + =head2 Changes to Existing Documentation + =head1 Diagnostics + =head2 New Diagnostics + =head3 New Errors + =head3 New Warnings + =head2 Changes to Existing Diagnostics + =head1 Utility Changes + =head1 Configuration and Compilation + =head1 Testing + =head1 Platform Support + =head2 New Platforms + =head2 Discontinued Platforms + =head2 Platform-Specific Notes + =head1 Internal Changes + =head1 Selected Bug Fixes + =head1 Known Problems + =head1 Obituary + =head1 Acknowledgements + =head1 Reporting Bugs + =head1 SEE ALSO + =head1 NAME perl5160delta - what is new for perl v5.16.0 @@ -19,50 +54,6 @@ XXX Any important notices here =head1 Core Enhancements -=head2 C no longer needed for C<\N{I}> - -The C module is now automatically loaded when needed as if -the C<:full> and C<:short> options had been specified. See -L. 
-
-=head2 Improved performance for Unicode properties in regular expressions
-
-Matching a code point against a Unicode property is now done via a
-binary search instead of linear. This means for example that the worst
-case for a 1000 item property is 10 probes instead of 1000. This
-inefficiency has been compensated for in the past by permanently storing
-in a hash the results of a given probe plus the results for the adjacent
-64 code points, under the theory that near-by code points are likely to
-be searched for. A separate hash was used for each mention of a Unicode
-property in each regular expression. Thus, C
-would generate two hashes. Any probes in one instance would be unknown
-to the other, and the hashes could expand separately to be quite large
-if the regular expression were used on many different widely-separated
-code points. This can lead to running out of memory in extreme cases.
-Now, however, there is just one hash shared by all instances of a given
-property. This means that if C<\p{foo}> is matched against "A" in one
-regular expression in a thread, the result will be known immediately to
-all regular expressions, and the relentless march of using up memory is
-slowed considerably.
-
-=head2 C
-
-The XS-callable function C, when presented with
-malformed UTF-8 input, can read up to 12 bytes beyond the end of the
-string. This cannot be fixed without changing its API. It is not
-called from CPAN. The documentation now describes how to use it
-safely.
-
-=head2 Other C functions, as well as C, etc.
-
-Most of the other XS-callable functions that take UTF-8 encoded input
-implicitly assume that the UTF-8 is valid (not malformed) in regards to
-buffer length. Do not do things such as change a character's case or
-see if it is alphanumeric without first being sure that it is valid
-UTF-8. This can be safely done for a whole string by using one of the
-functions C, C, and
-C.
- =head2 C> As of this release, version declarations like C now disable @@ -70,9 +61,9 @@ all features before enabling the new feature bundle. This means that the following holds true: use 5.016; - # 5.16 features enabled here + # only 5.16 features enabled here use 5.014; - # 5.16 features disabled here + # only 5.14 features enabled here (not 5.16) C and higher continue to enable strict, but explicit C and C now override the version declaration, even @@ -93,12 +84,43 @@ C<$[> is now disabled under C. It is part of the default feature set and can be turned on or off explicitly with C. -=head2 C +=head2 C<__SUB__> + +The new C<__SUB__> token, available under the C feature +(see L) or C, returns a reference to the current +subroutine, making it easier to write recursive closures. + +=head2 New and Improved Built-ins -The change to C in 5.15.2 has been reverted. It -now returns a stringified version object once more. +=head3 Return value of C -=head2 C lvalue revamp +C returns C in scalar context or an empty list in list +context when there is a run-time error. When C was passed a +string in list context and a syntax error occurred, it used to return a +list containing a single undefined element. Now it returns an empty +list in list context for all errors [perl #80630]. + +=head3 More consistent C + +The C operator sometimes treats a string argument as a sequence of +characters and sometimes as a sequence of bytes, depending on the +internal encoding. The internal encoding is not supposed to make any +difference, but there is code that relies on this inconsistency. + +The new C and C features (enabled under C resolve this. The C feature causes C to treat the string always as Unicode. The C +features provides a function, itself called C, which +evaluates its argument always as a string of bytes. + +These features also fix oddities with source filters leaking to outer +dynamic scopes. + +See L for more detail. 
+ +=head3 C lvalue revamp + +=for comment Can this be compacted some? -- rjbs, 2012-02-20 When C is called in lvalue or potential lvalue context with two or three arguments, a special lvalue scalar is returned that modifies @@ -172,88 +194,22 @@ It was impossible to fix all the bugs without an incompatible change, and the behaviour of negative offsets was never specified, so the change was deemed acceptable. -=head2 Return value of C - -C returns C in scalar context or an empty list in list -context when there is a run-time error. When C was passed a -string in list context and a syntax error occurred, it used to return a -list containing a single undefined element. Now it returns an empty -list in list context for all errors [perl #80630]. - -=head2 Anonymous handles - -Automatically generated file handles are now named __ANONIO__ when the -variable name cannot be determined, rather than $__ANONIO__. - -=head2 Last-accessed filehandle - -Perl has an internal variable that stores the last filehandle to be -accessed. It is used by C<$.> and by C and C without -arguments. - -It used to be possible to set this internal variable to a glob copy and -then modify that glob copy to be something other than a glob, and still -have the last-accessed filehandle associated with the variable after -assigning a glob to it again: - - my $foo = *STDOUT; # $foo is a glob copy - <$foo>; # $foo is now the last-accessed handle - $foo = 3; # no longer a glob - $foo = *STDERR; # still the last-accessed handle - -Now the C<$foo = 3> assignment unsets that internal variable, so there -is no last-accessed filehandle, just as if C<< <$foo> >> had never -happened. - -=head2 C<__SUB__> - -The new C<__SUB__> token, available under the "current_sub" feature -(see L) or C, returns a reference to the current -subroutine, making it easier to write recursive closures. 
- -=head2 New option for the debugger's B command - -The B command in the debugger, which toggles tracing mode, now -accepts a numeric argument that determines how many levels of -subroutine calls to trace. - -=head2 Return value of C +=head3 Return value of C The value returned by C on a tied variable is now the actual scalar that holds the object to which the variable is tied. This allows ties to be weakened with C. +=head2 Unicode Support -=head2 More consistent C +=head3 C is no longer needed for C<\N{I}> -The C operator sometimes treats a string argument as a sequence of -characters and sometimes as a sequence of bytes, depending on the internal -encoding. The internal encoding is not supposed to make any difference, -but there is code that relies on this inconsistency. +When C<\N{I}> is encountered, the C module is now +automatically loaded when needed as if the C<:full> and C<:short> +options had been specified. See L for more information. -Under C and higher, the C and C -features resolve this. The C feature causes C -to treat the string always as Unicode. The C features provides -a function, itself called C, which evaluates its argument always -as a string of bytes. - -These features also fix oddities with source filters leaking to outer -dynamic scopes. - -See L for more detail. - -=head2 $^X converted to an absolute path on FreeBSD, OS X and Solaris - -C<$^X> is now converted to an absolute path on OS X, FreeBSD (without -needing F mounted) and Solaris 10 and 11. This augments the -previous approach of using F on Linux, FreeBSD and NetBSD -(in all cases, where mounted). - -This makes relocatable perl installations more useful on these platforms. -(See "Relocatable @INC" in F) - -=head2 Unicode Symbol Names +=head3 Unicode Symbol Names Perl now has proper support for Unicode in symbol names. 
It used to be that C<*{$foo}> would ignore the internal UTF8 flag and use the bytes of @@ -307,26 +263,261 @@ Subroutine prototypes Attributes -=item * +=item * + +Various warnings and error messages that mention variable names or values, +methods, etc. + +=back + +In addition, a parsing bug has been fixed that prevented C<*{é}> from +implicitly quoting the name, but instead interpreted it as C<*{+é}>, which +would cause a strict violation. + +C<*{"*a::b"}> automatically strips off the * if it is followed by an ASCII +letter. That has been extended to all Unicode identifier characters. + +C<$é> is now subject to "Used only once" warnings. It used to be exempt, +as it was treated as a punctuation variable. + +Also, single-character Unicode punctuation variables (like $‰) are now +supported [perl #69032]. They are also supported with C and C, +but that is a mistake that will be fixed before 5.16. + +=head2 The Unicode C property is now supported. + +New in Unicode 6.0, this is an improved C