author | Karl Williamson <khw@cpan.org> | 2019-05-16 10:10:33 -0600 |
---|---|---|
committer | Karl Williamson <khw@cpan.org> | 2019-05-16 10:17:36 -0600 |
commit | 3fc52dae25ac89b88b7c7eb8ee7d3675953495d2 (patch) | |
tree | b8b1e02b210048586476c717e87789d6614b2979 | |
parent | 3271cf246fc5e8e632a91eaeadd71a58c7eaebb8 (diff) | |
download | perl-3fc52dae25ac89b88b7c7eb8ee7d3675953495d2.tar.gz |
perldelta: Improvements
This includes some small wording changes, reordering by importance,
and collapsing near-duplicate entries.
-rw-r--r-- | pod/perldelta.pod | 116 |
1 files changed, 50 insertions, 66 deletions
diff --git a/pod/perldelta.pod b/pod/perldelta.pod
index eaa04d9df6..983ce88c10 100644
--- a/pod/perldelta.pod
+++ b/pod/perldelta.pod
@@ -19,6 +19,23 @@ L<[perl #133788]|https://rt.perl.org/Ticket/Display.html?id=133788>.
 
 =head1 Core Enhancements
 
+=head2 Limited variable length lookbehind in regular expression pattern matching is now experimentally supported
+
+Using a lookbehind assertion (like C<(?<=foo?)> or C<(?<!ba{1,9}r)> previously
+would generate an error and refuse to compile. Now it compiles (if the
+maximum lookbehind is at most 255 characters), but raises a warning in
+the new C<experimental::vlb> warnings category. This is to caution you
+that the precise behavior is subject to change based on feedback from
+use in the field.
+
+See L<perlre/(?<=pattern)> and L<perlre/(?<!pattern)>.
+
+=head2 The upper limit C<"n"> specifiable in a regular expression quantifier of the form C<"{m,n}"> has been doubled to 65534
+
+The meaning of an unbounded upper quantifier C<"{m,}"> remains unchanged.
+It matches 2**31 - 1 times on most platforms, and more on ones where a C
+language short variable is more than 4 bytes long.
+
 =head2 Unicode 12.1 is supported
 
 Because of a change in Unicode release cycles, Perl jumps from Unicode
@@ -41,23 +58,15 @@ as causing breaks: TAB, NO BREAK SPACE, and FIGURE SPACE (U+2007).
 We have decided to continue to use the previous Perl tailoring with
 regards to these.
 
-=head2 The upper limit C<"n"> specifiable in a regular expression
-quantifier of the form C<"{m,n}"> has been doubled to 65534
-
-The meaning of an unbounded upper quantifier C<"{m,}"> remains unchanged.
-It matches 2**31 - 1 times on most platforms, and more on ones where a C
-language short variable is more than 4 bytes long.
-
-
-=head2 Wildcards in Unicode property value specifications are now
-partially supported
+=head2 Wildcards in Unicode property value specifications are now partially supported
 
 You can now do something like this in a regular expression pattern
 
  qr! \p{nv= /(?x) \A [0-5] \z / }!
 
-which matches all Unicode code points which have numeric value is
-between 0 and 5 inclusive.
+which matches all Unicode code points whose numeric value is
+between 0 and 5 inclusive. So, it could match the Thai or Bengali
+digits whose numeric values are 0, 1, 2, 3, 4, or 5.
 
 This marks another step in implementing the regular expression features
 the Unicode Consortium suggests.
@@ -71,34 +80,6 @@ Previously it was an error to evaluate a named character C<\N{...}> within
 a single quoted regular expression pattern (whose evaluation is
 deferred from the normal place). This restriction is now removed.
 
-=head2 It is now possible to compile perl to always use thread-safe
-locale operations.
-
-Previously, these calls were only used when the perl was compiled to be
-multi-threaded. To always enable them, add
-
- -Accflags='-DUSE_THREAD_SAFE_LOCALE'
-
-to your F<Configure> flags.
-
-=head2 Limited variable length lookbehind in regular expression pattern matching
-is now experimentally supported
-
-Using a lookbehind assertion (like C<(?<=foo?)> or C<(?<!ba{1,9}r)> previously
-would generate an error and refuse to compile. Now it compiles (if the
-maximum lookbehind is at most 255 characters), but raises a warning in
-the new C<experimental::vlb> warnings category. This is to caution you
-that the precise behavior is subject to change based on feedback from
-use in the field.
-
-See L<perlre/(?<=pattern)> and L<perlre/(?<!pattern)>.
-
-=head2 Use faster method to convert to UTF-8
-
-There is a special inline function that's used when converting a single
-byte to UTF-8, that is faster than the more general one used prior to
-this commit.
-
 =head2 Turkic UTF-8 locales are now seamlessly supported
 
 Turkic languages have different casing rules than other languages for
@@ -109,6 +90,15 @@ rules for use with Turkic languages. Previously, Perl ignored these, but
 now, it uses them when it detects that it is operating under a Turkic
 UTF-8 locale.
 
+=head2 It is now possible to compile perl to always use thread-safe locale operations.
+
+Previously, these calls were only used when the perl was compiled to be
+multi-threaded. To always enable them, add
+
+ -Accflags='-DUSE_THREAD_SAFE_LOCALE'
+
+to your F<Configure> flags.
+
 =head2 Eliminate opASSIGN macro usage from core
 
 This macro is still defined but no longer used in core
@@ -120,11 +110,11 @@ possible regular expression debugging.
 
 =head1 Incompatible Changes
 
-=head2 Pattern delimiters must now be graphemes
-
-This usage has been deprecated and scheduled for removal in 5.30. See
-L<perldeprecation/Use of unassigned code point or non-standalone
-grapheme for a delimiter.>
+=head2 Assigning non-zero to C<$[> is fatal
+
+Setting L<< C<$[>|perlvar/$[ >> to a non-zero value has been deprecated since
+Perl 5.12 and now throws a fatal error.
+See L<<< perldeprecation/Assigning non-zero to C<< $[ >> is fatal >>>.
 
 =head2 Delimiters must now be graphemes
 
@@ -138,12 +128,6 @@ But to avoid breaking code unnecessarily, most instances that issued a
 deprecation warning, remain legal and now have a non-deprecation warning
 raised. See L<perldeprecation/Unescaped left braces in regular expressions>.
 
-=head2 Assigning non-zero to C<$[> is fatal
-
-Setting L<< C<$[>|perlvar/$[ >> to a non-zero value has been deprecated since
-Perl 5.12 and now throws a fatal error.
-See L<<< perldeprecation/Assigning non-zero to C<< $[ >> is fatal >>>.
-
 =head2 Previously deprecated sysread()/syswrite() on :utf8 handles is now fatal
 
 Calling sysread(), syswrite(), send() or recv() on a C<:utf8> handle,
@@ -206,8 +190,7 @@ malformed UTF-8. This protects agains potential security threats. This is
 considered a bug fix as well.
 L<[perl #131642]|https://rt.perl.org/Ticket/Display.html?id=131642>.
 
-=head2 Any set of digits in the Common script are legal in a script run
-of another script
+=head2 Any set of digits in the Common script are legal in a script run of another script
 
 There are several sets of digits in the Common script. C<[0-9]> is the
 most familiar. But there are also C<[\x{FF10}-\x{FF19}]> (FULLWIDTH
@@ -245,7 +228,7 @@ Translating from UTF-8 into the code point it represents now is done via a
 deterministic finite automaton, speeding it up. As a typical example,
 C<ord("\x7fff")> now requires 12% fewer instructions than before. The
 performance of checking that a sequence of bytes is valid UTF-8 is similarly
-improved, again by using a dfa.
+improved, again by using a DFA.
 
 =item *
 
@@ -282,11 +265,11 @@ Code optimizations in F<regcomp.c>, F<regcomp.h>, F<regexec.c>.
 =item *
 
 Regular expression pattern matching of things like C<qr/[^I<a>]/> is
-significantly sped up, where I<a> is any ASCII character. Which classes
-will get this speed up is complicated and depends on the underlying bit
-patterns of those characters, so differs between ASCII and EBCDIC
-platforms, but all case pairs, like C<qr/[Gg]/> are included, as is
-C<[^01]>.
+significantly sped up, where I<a> is any ASCII character. Other classes
+can get this speed up, but which ones is complicated and depends on the
+underlying bit patterns of those characters, so differs between ASCII
+and EBCDIC platforms, but all case pairs, like C<qr/[Gg]/> are included,
+as is C<[^01]>.
 
 =back
 
@@ -354,7 +337,7 @@ L<CPAN> has been upgraded from version 2.20 to 2.22.
 
 L<Data::Dumper> has been upgraded from version 2.170 to 2.174
 
-L<Data::Dumper> now avoids leak when C<croak>ing.
+L<Data::Dumper> now avoids leaking when C<croak>ing.
 
 =item *
 
@@ -370,7 +353,7 @@ L<Devel::Peek> has been upgraded from version 1.27 to 1.28.
 
 =item *
 
-L<Devel::PPPort> has been upgraded from version 3.40 to 3.51.
+L<Devel::PPPort> has been upgraded from version 3.40 to 3.52.
 
 =item *
 
@@ -862,7 +845,8 @@ L<[perl #133524]|https://rt.perl.org/Ticket/Display.html?id=133524>
 =item *
 
 Under C<< -Dr >> (or C<< use re 'Debug' >>) the compiled regex engine
-program is displayed. It used two different spellings for I<< infinity >>,
+program is displayed. It used to use two different spellings for I<<
+infinity >>,
 C<< INFINITY >>, and C<< INFTY >>. It now uses the latter exclusively,
 as that spelling has been around the longest.
 
@@ -889,9 +873,9 @@ L<[perl #133654]|https://rt.perl.org/Ticket/Display.html?id=133654>.
 
 =item *
 
-Normally the thread-safe functions are used only on threaded builds.
-It is now possible to force their use on unthreaded builds on systems
-that have them available, by including the
+Normally the thread-safe locale functions are used only on threaded
+builds. It is now possible to force their use on unthreaded builds on
+systems that have them available, by including the
 C<-Accflags='-DUSE_THREAD_SAFE_LOCALE'> option to F<Configure>.
 
 =item *
@@ -905,11 +889,11 @@ L<[perl #133760]|https://rt.perl.org/Ticket/Display.html?id=133760>.
 
 =item *
 
-Fix -DPERL_GLOBAL_STRUCT_PRIVATE build option.
+Multiple improvements and fixes for -DPERL_GLOBAL_STRUCT build option.
 
 =item *
 
-Multiple improvements and fixes for -DPERL_GLOBAL_STRUCT build option.
+Fix -DPERL_GLOBAL_STRUCT_PRIVATE build option.
 
 =back
 