=encoding utf8

=head1 NAME

perl5139delta - what is new for perl v5.13.9

=head1 DESCRIPTION

This document describes differences between the 5.13.8 release and
the 5.13.9 release.

If you are upgrading from an earlier release such as 5.13.7, first read
L<perl5138delta>, which describes differences between 5.13.7 and
5.13.8.

=head1 Core Enhancements

=head2 New regular expression modifier C</a>

The C</a> regular expression modifier restricts C<\s> to match precisely
the five characters C<[ \f\n\r\t]>, C<\d> to match precisely the 10
characters C<[0-9]>, C<\w> to match precisely the 63 characters
C<[A-Za-z0-9_]>, and the Posix (C<[[:posix:]]>) character classes to
match only the appropriate ASCII characters.  The complements, of
course, match everything but; and C<\b> and C<\B> are correspondingly
affected.  Otherwise, C</a> behaves like the C</u> modifier, in that
case-insensitive matching uses Unicode semantics; for example, "k" will
match the Unicode C<\N{KELVIN SIGN}> under C</i> matching, and code
points in the Latin1 range above ASCII will have Unicode semantics when
it comes to case-insensitive matching.  Like its cousins (C</u>, C</l>,
and C</d>), and in spite of the terminology, C</a> in 5.14 cannot
actually be used as a suffix at the end of a regular
expression (this restriction is planned to be lifted in 5.16).  It must
occur either as an infix modifier, such as C<(?a:...)> or C<(?a)...>,
or it can be turned on within the lexical scope of C<use re '/a'>.
Turning on C</a> turns off the other "character set" modifiers.
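
As a rough illustration of the infix form, the following sketch compares
C<\d> with and without C</a>, and checks that case-insensitive matching
keeps Unicode semantics.  The sample characters and variable names are
chosen for this example only.

```perl
use strict;
use warnings;

# Sample characters chosen for illustration.
my $arabic_one = "\x{661}";     # ARABIC-INDIC DIGIT ONE, a Unicode digit
my $kelvin     = "\x{212A}";    # KELVIN SIGN, case-folds to "k"

# Without /a, \d follows Unicode rules for this string.
my $unicode_digit = $arabic_one =~ /\d/ ? 1 : 0;

# With the infix (?a:...) form, \d is restricted to [0-9].
my $ascii_digit = $arabic_one =~ /(?a:\d)/ ? 1 : 0;

# Case-insensitive matching keeps Unicode semantics under /a.
my $kelvin_fold = $kelvin =~ /(?ai:k)/ ? 1 : 0;

print "unicode=$unicode_digit ascii=$ascii_digit kelvin=$kelvin_fold\n";
```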

=head2 Any unsigned value can be encoded as a character

With this release, Perl is adopting a model that any unsigned value can
be treated as a code point and encoded internally (as utf8) without
warnings -- not just the code points that are legal in Unicode.
However, unless utf8 warnings have been
explicitly lexically turned off, outputting or performing a
Unicode-defined operation (such as upper-casing) on such a code point
will generate a warning.  Attempting to input these using strict rules
(such as with the C<:encoding('UTF-8')> layer) will continue to fail.
Prior to this release the handling was very inconsistent, and incorrect
in places.  Also, the Unicode non-characters, some of which previously were
erroneously considered illegal in places by Perl, contrary to the Unicode
standard, are now always legal internally.  But inputting or outputting
them will work the same as for the non-legal Unicode code points, as the
Unicode standard says they are illegal for "open interchange".
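
A minimal sketch of the internal model, assuming only core C<chr>,
C<ord> and C<length>: a value above the Unicode range and a
non-character are each stored as a single character.  The code prints
only their lengths and ordinals, not the characters themselves, since
outputting them is what would warn.

```perl
use strict;
use warnings;

# 0x110000 is one past the last Unicode code point (U+10FFFF).
my $beyond  = chr(0x110000);
# U+FFFF is one of the Unicode non-characters.
my $nonchar = chr(0xFFFF);

# Both are held internally as single characters.
printf "length=%d ord=0x%X\n", length($beyond),  ord($beyond);
printf "length=%d ord=0x%X\n", length($nonchar), ord($nonchar);
```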

=head2 Regular expression debugging output improvement

Regular expression debugging output (turned on by C<use re 'debug';>) now
uses hexadecimal when escaping non-ASCII characters, instead of octal.

=head1 Security

=head2 Restrict \p{IsUserDefined} to In\w+ and Is\w+

L<perlunicode/"User-Defined Character Properties"> says that you can
create custom properties by defining subroutines whose names begin with
"In" or "Is".  However, Perl did not actually enforce that naming
restriction, so C<\p{foo::bar}> would call C<foo::bar()> if it existed.

This convention is now enforced.  Note that this broke a number of
existing tests for properties, since they did not always use an
Is/In prefix.
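
For reference, a user-defined property that follows the convention
might look like the sketch below.  The property name C<InAsciiVowel> is
hypothetical and exists only for this example.

```perl
use strict;
use warnings;

# Hypothetical property for illustration; the name starts with "In",
# as the naming convention requires.
sub InAsciiVowel {
    # One hexadecimal code point per line.
    return join '', map { sprintf "%04X\n", ord } qw(a e i o u);
}

my $is_vowel     = 'e' =~ /^\p{InAsciiVowel}$/ ? 1 : 0;
my $is_not_vowel = 'x' =~ /^\p{InAsciiVowel}$/ ? 1 : 0;

print "e: $is_vowel, x: $is_not_vowel\n";
```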

=head1 Incompatible Changes

=head2 All objects are destroyed

It used to be possible to prevent a destructor from being called during
global destruction by artificially increasing the reference count of an
object.

Now such objects I<will> be destroyed, as a result of a bug fix
L<[perl #81230]|http://rt.perl.org/rt3/Public/Bug/Display.html?id=81230>.

This has the potential to break some XS modules. (In fact, it breaks some.
See L</Known Problems>, below.)

=head1 Modules and Pragmata

=head2 New Modules and Pragmata

=over 4

=item *

C<CPAN::Meta::YAML> 0.003 has been added as a dual-life module.  It supports a
subset of YAML sufficient for reading and writing META.yml and MYMETA.yml files
included with CPAN distributions or generated by the module installation
toolchain. It should not be used for any other general YAML parsing or
generation task.

=item *

C<HTTP::Tiny> 0.009 has been added as a dual-life module.  It is a very
small, simple HTTP/1.1 client designed for simple GET requests and file
mirroring.  It has been added to enable CPAN.pm and CPANPLUS to
"bootstrap" HTTP access to CPAN using pure Perl without relying on external
binaries like F<curl> or F<wget>.

=item *

C<JSON::PP> 2.27103 has been added as a dual-life module, for the sake of
reading F<META.json> files in CPAN distributions.

=item *

C<Module::Metadata> 1.000003 has been added as a dual-life module.  It gathers
package and POD information from Perl module files.  It is a standalone module
based on Module::Build::ModuleInfo for use by other module installation
toolchain components.  Module::Build::ModuleInfo has been deprecated in
favor of this module.

=item *

C<Perl::OSType> 1.002 has been added as a dual-life module.  It maps Perl
operating system names (e.g. 'dragonfly' or 'MSWin32') to more generic types
with standardized names (e.g.  "Unix" or "Windows").  It has been refactored
out of Module::Build and ExtUtils::CBuilder and consolidates such mappings into
a single location for easier maintenance.

=back

=head2 Updated Modules and Pragmata

=over 4

=item *

C<Archive::Extract> has been upgraded from version 0.46 to 0.48

=item *

C<Archive::Tar> has been upgraded from version 1.74 to 1.76

=item *

C<CGI> has been upgraded from version 3.50 to 3.51

Further improvements have been made to guard against newline injections
in headers.

=item *

C<Compress::Raw::Bzip2> has been upgraded from version 2.031 to 2.033

=item *

C<Compress::Raw::Zlib> has been upgraded from version 2.030 to 2.033

=item *

C<CPAN> has been upgraded from version 1.94_62 to 1.94_63

=item *

C<CPANPLUS> has been upgraded from version 0.9010 to 0.9011

=item *

C<CPANPLUS::Dist::Build> has been upgraded from version 0.50 to 0.52

=item *

C<DB_File> has been upgraded from version 1.820 to 1.821

=item *

C<Encode> has been upgraded from version 2.40 to 2.42.
Now, all 66 Unicode non-characters are treated the same way U+FFFF has
always been treated; if it was disallowed, all 66 are disallowed; if it
warned, all 66 warn.

=item *

C<File::Fetch> has been upgraded from version 0.28 to 0.32

=item *

C<IO::Compress> has been upgraded from version 2.030 to 2.033

=item *

C<IPC::Cmd> has been upgraded from version 0.66 to 0.68

=item *

C<Log::Message> has been upgraded from version 0.02 to 0.04

=item *

C<Log::Message::Simple> has been upgraded from version 0.06 to 0.08

=item *

C<Module::Load::Conditional> has been upgraded from version 0.38 to 0.40

=item *

C<Object::Accessor> has been upgraded from version 0.36 to 0.38

=item *

C<Params::Check> has been upgraded from version 0.26 to 0.28

=item *

C<Pod::LaTeX> has been upgraded from version 0.58 to 0.59

=item *

C<Socket> has been updated with new affordances for IPv6,
including implementations of the C<Socket::getaddrinfo()> and
C<Socket::getnameinfo()> functions, along with related constants.

=item *

C<Term::UI> has been upgraded from version 0.20 to 0.24

=item *

C<Thread::Queue> has been upgraded from version 2.11 to 2.12.

=item *

C<Thread::Semaphore> has been upgraded from version 2.11 to 2.12.

=item *

C<threads> has been upgraded from version 1.81_03 to 1.82

=item *

C<threads::shared> has been upgraded from version 1.35 to 1.36

=item *

C<Time::Local> has been upgraded from version 1.1901_01 to 1.2000.

=item *

C<Unicode::Normalize> has been upgraded from version 1.07 to 1.10

=item *

C<version> has been upgraded from 0.86 to 0.88.

=item *

C<Win32> has been upgraded from version 0.41 to 0.44.

=back
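
The C<Socket> additions listed above can be exercised roughly as
follows.  This is a sketch only: the address and port are arbitrary
sample values, and C<AI_NUMERICHOST> is used so that no DNS lookup is
needed.

```perl
use strict;
use warnings;
use Socket qw(getaddrinfo getnameinfo
              AI_NUMERICHOST NI_NUMERICHOST NI_NUMERICSERV SOCK_STREAM);

# Resolve a numeric address; AI_NUMERICHOST avoids any DNS lookup.
my ($err, @results) = getaddrinfo('127.0.0.1', '80',
    { flags => AI_NUMERICHOST, socktype => SOCK_STREAM });
die "getaddrinfo failed: $err" if $err;

# Convert the packed address back to numeric host and service strings.
my ($nerr, $host, $service) =
    getnameinfo($results[0]{addr}, NI_NUMERICHOST | NI_NUMERICSERV);
die "getnameinfo failed: $nerr" if $nerr;

print "$host:$service\n";
```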

=head1 Documentation

=head2 Changes to Existing Documentation

=head3 All documentation

=over

=item *

Numerous POD warnings were fixed.

=item *

Many, many spelling errors and typographical mistakes were corrected throughout Perl's core.

=back

=head3 C<perlhack>

=over 4

=item *

C<perlhack> was extensively reorganized.

=back

=head3 C<perlfunc>

=over 4

=item *

It has now been documented that C<ord> returns 0 for an empty string.

=back
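
The documented C<ord> behaviour can be checked with a one-line sketch:

```perl
use strict;
use warnings;

# ord of an empty string is 0, as now documented in perlfunc.
my $empty_ord = ord('');
print "ord of empty string: $empty_ord\n";
```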

=head1 Diagnostics

The following additions or changes have been made to diagnostic output,
including warnings and fatal error messages.  For the complete list of
diagnostic messages, see L<perldiag>.

=head2 New Diagnostics

=over 4

=item *

Performing an operation requiring Unicode semantics (such as case-folding)
on a Unicode surrogate or a non-Unicode character now triggers a warning:
'Operation "%s" returns its argument for ...'.

=back

=head2 Changes to Existing Diagnostics

=over 4

=item *

Previously, if none of the C<gethostbyaddr>, C<gethostbyname> and
C<gethostent> functions were implemented on a given platform, they would
all die with the message 'Unsupported socket function "gethostent" called',
with analogous messages for C<getnet*> and C<getserv*>. This has been
corrected.

=back

=head1 Utility Changes

=head3 C<perlbug>

=over 4

=item *

C<perlbug> did not previously generate a From: header, potentially
resulting in dropped mail. Now it does include that header.

=back

=head3 C<buildtoc>

=over 4

=item *

F<pod/buildtoc> has been modernized and can now be used to test the
well-formedness of F<pod/perltoc.pod> automatically.

=back

=head1 Testing

=over 4

=item *

C<lib/File/DosGlob.t> has been modernized and now uses C<Test::More>.

=item *

A new test script, C<t/porting/filenames.t>, makes sure that filenames and
paths are reasonably portable.

=item *

C<t/porting/diag.t> is now several orders of magnitude faster.

=item *

C<t/porting/buildtoc.t> now tests that the documentation TOC file is current and well-formed.

=item *

C<t/base/while.t> now tests the basics of a while loop with minimal dependencies.

=item *

C<t/cmd/while.t> now uses F<test.pl> for better maintainability.

=item *

C<t/op/split.t> now tests calls to C<split> without any pattern specified.

=back



=head1 Platform Support

=head2 Discontinued Platforms

=over 4

=item Apollo DomainOS

The last vestiges of support for this platform have been excised from the
Perl distribution. It was officially discontinued in version 5.12.0. It had
not worked for years before that.

=item MacOS Classic

The last vestiges of support for this platform have been excised from the
Perl distribution. It was officially discontinued in an earlier version.

=back

=head2 Platform-Specific Notes

=over 4


=item Cygwin

=over

=item *

MakeMaker has been updated to build man pages on Cygwin.

=item *

Improved rebase behaviour

If a DLL is updated on Cygwin, the old imagebase address is reused.
This solves most rebase errors, especially when updating core DLLs.
See L<http://www.tishler.net/jason/software/rebase/rebase-2.4.2.README> for more information.

=item *

The standard C<cyg> DLL prefix is now supported; it is needed, for example, by FFIs.

=item *

Updated build hints file

=back


=item Solaris

DTrace is now supported on Solaris. There used to be build failures, but
these have been fixed
L<[perl #73630]|http://rt.perl.org/rt3/Public/Bug/Display.html?id=73630>.

=back

=head1 Internal Changes

=over 4

=item *

The opcode bodies for C<chop> and C<chomp> and for C<schop> and C<schomp> have
been merged. The implementation functions C<Perl_do_chop()> and
C<Perl_do_chomp()>, never part of the public API, have been merged and moved to
a static function in F<pp.c>. This shrinks the perl binary slightly, and should
not affect any code outside the core (unless it is relying on the order of side
effects when C<chomp> is passed a I<list> of values).

=item *

Some of the flags parameters to C<uvuni_to_utf8_flags()> and
C<utf8n_to_uvuni()> have changed.  This is a result of Perl now allowing
internal storage and manipulation of code points that are problematic
in some situations.  Hence, the default actions for these functions have
been complemented to allow these code points.  The new flags are
documented in L<perlapi>.  Code that requires the problematic code
points to be rejected needs to change to use these flags.  Some flag
names are retained for backward source compatibility, though they do
nothing, as they are now the default.  However the flags
C<UNICODE_ALLOW_FDD0>, C<UNICODE_ALLOW_FFFF>, C<UNICODE_ILLEGAL>, and
C<UNICODE_IS_ILLEGAL> have been removed, as they stem from a
fundamentally broken model of how the Unicode non-character code points
should be handled, which is now described in
L<perlunicode/Non-character code points>.  See also L</Selected Bug Fixes>.

=item *

Certain shared flags in the C<pmop.op_pmflags> and C<regexp.extflags>
structures have been removed.  These are: C<Rxf_Pmf_LOCALE>,
C<Rxf_Pmf_UNICODE>, and C<PMf_LOCALE>.  Instead, the information is now
stored in encoded form, with three static in-line functions for accessing it:
C<get_regex_charset()>, C<set_regex_charset()>, and C<get_regex_charset_name()>,
which are defined in the places where the original flags were.

=item *

A new option has been added to C<pv_escape> to dump all characters above
ASCII in hexadecimal. Before, one could get all characters as hexadecimal
or the Latin1 non-ASCII as octal.


=item *

The C<pp_*> prototypes are now generated in F<pp_proto.h>, and F<pp.sym>
has been removed.

The C<#define pp_foo Perl_pp_foo(pTHX)> macros have been eliminated, and
the 13 locations that relied on them have been updated.

F<regen/opcode.pl> now generates prototypes for the PP functions directly,
into F<pp_proto.h>.  It no longer writes F<pp.sym>, and F<regen/embed.pl>
no longer reads it, removing the only ordering dependency in the regen
scripts.  F<regen/opcode.pl> is now responsible for prototypes for C<pp_*>
functions.  (F<regen/embed.pl> remains responsible for C<ck_*> functions,
reading from F<regen/opcodes>.)

=back

=head1 Selected Bug Fixes

=over 4

=item *

The handling of Unicode non-characters has changed.
Previously they were mostly considered illegal, except that only one of
the 66 of them was known about in places.  The Unicode standard
considers them legal, but forbids the "open interchange" of them.
This is part of the change to allow the internal use of any code point
(see L</Core Enhancements>).  Together, these changes resolve
L<# 38722|https://rt.perl.org/rt3/Ticket/Display.html?id=38722>,
L<# 51918|http://rt.perl.org/rt3/Ticket/Display.html?id=51918>,
L<# 51936|http://rt.perl.org/rt3/Ticket/Display.html?id=51936>,
L<# 63446|http://rt.perl.org/rt3/Ticket/Display.html?id=63446>.

=item *

Sometimes magic (ties, tainted, etc.) attached to variables could cause an
object to last longer than it should, or cause a crash if a tied variable
were freed from within a tie method. These have been fixed
L<[perl #81230]|http://rt.perl.org/rt3/Public/Bug/Display.html?id=81230>.

=item *

Most I/O functions were not warning for unopened handles unless the
'closed' and 'unopened' warnings categories were both enabled. Now only
C<use warnings 'unopened'> is necessary to trigger these warnings (as was
always meant to be the case).

=item *

C<< E<lt>exprE<gt> >> always respects overloading now if the expression is
overloaded.

Due to the way that 'E<lt>E<gt> as glob' was parsed differently from
'E<lt>E<gt> as filehandle' from 5.6 onwards, something like C<< E<lt>$foo[0]E<gt> >> did
not handle overloading, even if C<$foo[0]> was an overloaded object. This
was contrary to the documentation for overload, and meant that C<< E<lt>E<gt> >>
could not be used as a general overloaded iterator operator.

=item *

Destructors on objects were not called during global destruction on objects
that were not referenced by any scalars. This could happen if an array
element were blessed (e.g., C<bless \$a[0]>) or if a closure referenced a
blessed variable (C<bless \my @a; sub foo { @a }>).

Now there is an extra pass during global destruction to fire destructors on
any objects that might be left after the usual passes that check for
objects referenced by scalars
L<[perl #36347]|http://rt.perl.org/rt3/Public/Bug/Display.html?id=36347>.

=item *

A long standing bug has now been fully fixed (partial fixes came in
earlier releases), in which some Latin-1 non-ASCII characters on
ASCII platforms would match both a character class and its complement,
such as U+00E2 being both in C<\w> and C<\W>, depending on the
UTF-8-ness of the regular expression pattern and target string.
Fixing this did expose some bugs in various modules and tests that
relied on the previous behavior of C<[[:alpha:]]> not ever matching
U+00FF, "LATIN SMALL LETTER Y WITH DIAERESIS", even when it should, in
Unicode mode; now it does match when appropriate.
L<[perl #60156]|http://rt.perl.org/rt3/Ticket/Display.html?id=60156>.

=back
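
The C<< <> >> overloading fix described above can be illustrated with a
small sketch.  The C<Countdown> class is hypothetical and exists only
for this example; it overloads C<< <> >> as an iterator and is then read
through an array element, the form that previously bypassed overloading.

```perl
use strict;
use warnings;

# Hypothetical iterator class, for illustration only.
package Countdown;

use overload '<>' => sub {
    my $self = shift;
    # Return the next value, or undef once the count is exhausted.
    return $self->{n} > 0 ? $self->{n}-- : undef;
};

sub new {
    my ($class, $n) = @_;
    return bless { n => $n }, $class;
}

package main;

my @iters = (Countdown->new(3));
my @seen;

# <$iters[0]> is parsed as a glob, but the overloaded '<>' is used.
while (defined(my $value = <$iters[0]>)) {
    push @seen, $value;
}

print "@seen\n";
```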

=head1 Known Problems

=over 4

=item *

The fix for [perl #81230] causes test failures for C<Tk> version 804.029.
This is still being investigated.

=back

=head1 Acknowledgements

Perl 5.13.9 represents approximately one month of development since
Perl 5.13.8 and contains approximately 48000 lines of changes across
809 files from 35 authors and committers:

Abigail, Ævar Arnfjörð Bjarmason, brian d foy, Chris 'BinGOs' Williams,
Craig A. Berry, David Golden, David Leadbeater, David Mitchell, Father
Chrysostomos, Florian Ragwitz, Gerard Goossen, H.Merijn Brand, Jan
Dubois, Jerry D. Hedden, Jesse Vincent, John Peacock, Karl Williamson,
Leon Timmermans, Michael Parker, Michael Stevens, Nicholas Clark,
Nuno Carvalho, Paul "LeoNerd" Evans, Peter J. Acklam, Peter Martini,
Rainer Tammer, Reini Urban, Renee Baecker, Ricardo Signes, Robin Barker,
Tony Cook, Vadim Konovalov, Vincent Pit, Zefram, and Zsbán Ambrus.

Many of the changes included in this version originated in the CPAN
modules included in Perl's core. We're grateful to the entire CPAN
community for helping Perl to flourish.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles
recently posted to the comp.lang.perl.misc newsgroup and the perl
bug database at http://rt.perl.org/perlbug/ .  There may also be
information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the L<perlbug>
program included with your release.  Be sure to trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug@perl.org to be
analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it
inappropriate to send to a publicly archived mailing list, then please send
it to perl5-security-report@perl.org. This points to a closed subscription
unarchived mailing list, which includes all the core committers, who will be able
to help assess the impact of issues, figure out a resolution, and help
co-ordinate the release of patches to mitigate or fix the problem across all
platforms on which Perl is supported. Please only use this address for
security issues in the Perl core, not for modules independently
distributed on CPAN.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details
on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut