1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
|
=head1 NAME
perldelta - what is new for perl v5.9.2
=head1 DESCRIPTION
This document describes differences between the 5.9.1 and the 5.9.2
development releases. See L<perl590delta> and L<perl591delta> for the
differences between 5.8.0 and 5.9.1.
=head1 Incompatible Changes
=head2 Packing and UTF-8 strings
The semantics of pack() and unpack() regarding UTF-8-encoded data has been
changed. Processing is now by default character per character instead of
byte per byte on the underlying encoding. Notably, code that used things
like C<pack("a*", $string)> to see through the encoding of string will now
simply get back the original $string. Packed strings can also get upgraded
during processing when you store upgraded characters. You can get the old
behaviour by using C<use bytes>.
To be consistent with pack(), the C<C0> in unpack() templates indicates
that the data is to be processed in character mode, i.e. character by
character; at the contrary, C<U0> in unpack() indicates UTF-8 mode, where
the packed string is processed in its UTF-8-encoded Unicode form on a byte
by byte basis. This is reversed with regard to perl 5.8.X.
Moreover, C<C0> and C<U0> can also be used in pack() templates to specify
respectively character and byte modes.
C<C0> and C<U0> in the middle of a pack or unpack format now switch to the
specified encoding mode, honoring parens grouping. Previously, parens were
ignored.
Also, there is a new pack() character format, C<W>, which is intended to
replace the old C<C>. C<C> is kept for unsigned chars coded as bytes in
the strings internal representation. C<W> represents unsigned (logical)
character values, which can be greater than 255. It is therefore more
robust when dealing with potentially UTF-8-encoded data (as C<C> will wrap
values outside the range 0..255, and not respect the string encoding).
In practice, that means that pack formats are now encoding-neutral, except
C<C>.
=head1 Core Enhancements
=head1 Modules and Pragmata
=head1 Utility Changes
=head1 Documentation
=head1 Performance Enhancements
=head1 Installation and Configuration Improvements
=head1 Selected Bug Fixes
=head1 New or Changed Diagnostics
=head1 Changed Internals
=head1 Known Problems
=head2 Platform Specific Problems
=head1 Reporting Bugs
If you find what you think is a bug, you might check the articles
recently posted to the comp.lang.perl.misc newsgroup and the perl
bug database at http://bugs.perl.org/ . There may also be
information at http://www.perl.org/ , the Perl Home Page.
If you believe you have an unreported bug, please run the B<perlbug>
program included with your release. Be sure to trim your bug down
to a tiny but sufficient test case. Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug@perl.org to be
analysed by the Perl porting team.
=head1 SEE ALSO
The F<Changes> file for exhaustive details on what changed.
The F<INSTALL> file for how to build Perl.
The F<README> file for general stuff.
The F<Artistic> and F<Copying> files for copyright information.
=cut
|