summaryrefslogtreecommitdiff
path: root/pod/perl595delta.pod
blob: cc7ea80f1afac57ef4a44193c75f9fa4e4bfbcea (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
=head1 NAME

perldelta - what is new for perl v5.9.5

=head1 DESCRIPTION

This document describes differences between the 5.9.4 and the 5.9.5
development releases. See L<perl590delta>, L<perl591delta>,
L<perl592delta>, L<perl593delta> and L<perl594delta> for the differences
between 5.8.0 and 5.9.4.

=head1 Incompatible Changes

=head2 Tainting and printf

When perl is run under taint mode, C<printf()> and C<sprintf()> will now
reject any tainted format argument. (Rafael Garcia-Suarez)

=head2 undef and signal handlers

Undefining or deleting a signal handler via C<undef $SIG{FOO}> is now
equivalent to setting it to C<'DEFAULT'>. (Rafael)

=head2 strictures and array/hash dereferencing in defined()

C<defined @$foo> and C<defined %$bar> are now subject to C<strict 'refs'>
(that is, C<$foo> and C<$bar> shall be proper references there.)
(Nicholas Clark)

(However, C<defined(@foo)> and C<defined(%bar)> are discouraged constructs
anyway.)

=head2 Removal of the bytecode compiler and of perlcc

C<perlcc>, the byteloader and the supporting modules (B::C, B::CC,
B::Bytecode, etc.) are no longer distributed with the perl sources. Those
experimental tools have never worked reliably, and, due to the lack of
volunteers to keep them in line with the perl interpreter developments, it
was decided to remove them instead of shipping a broken version of those.
The last version of those modules can be found with perl 5.9.4.

However the B compiler framework stays supported in the perl core, as with
the more useful modules it has permitted (among others, B::Deparse and
B::Concise).

=head2 Removal of the JPL

The JPL (Java-Perl Linguo) has been removed from the perl sources tarball.

=head1 Core Enhancements

=head2 Regular expressions

=over 4

=item Recursive Patterns

It is now possible to write recursive patterns without using the C<(??{})>
construct. This new way is more efficient, and in many cases easier to
read.

Each capturing parenthesis can now be treated as an independent pattern
that can be entered by using the C<(?PARNO)> syntax (C<PARNO> standing for
"parenthesis number"). For example, the following pattern will match
nested balanced angle brackets:

    /
     ^                      # start of line
     (                      # start capture buffer 1
	<                   #   match an opening angle bracket
	(?:                 #   match one of:
	    (?>             #     don't backtrack over the inside of this group
		[^<>]+      #       one or more non angle brackets
	    )               #     end non backtracking group
	|                   #     ... or ...
	    (?1)            #     recurse to bracket 1 and try it again
	)*                  #   0 or more times.
	>                   #   match a closing angle bracket
     )                      # end capture buffer one
     $                      # end of line
    /x

Note, users experienced with PCRE will find that the Perl implementation
of this feature differs from the PCRE one in that it is possible to
backtrack into a recursed pattern, whereas in PCRE the recursion is
atomic or "possessive" in nature. (Yves Orton)

=item Named Capture Buffers

It is now possible to name capturing parenthesis in a pattern and refer to
the captured contents by name. The naming syntax is C<< (?<NAME>....) >>.
It's possible to backreference to a named buffer with the C<< \k<NAME> >>
syntax. In code, the new magical hashes C<%+> and C<%-> can be used to
access the contents of the capture buffers.

Thus, to replace all doubled chars, one could write

    s/(?<letter>.)\k<letter>/$+{letter}/g

Only buffers with defined contents will be "visible" in the C<%+> hash, so
it's possible to do something like

    foreach my $name (keys %+) {
        print "content of buffer '$name' is $+{$name}\n";
    }

The C<%-> hash is a bit more complete, since it will contain array refs
holding values from all capture buffers similarly named, if there should
be many of them.

C<%+> and C<%-> are implemented as tied hashes through the new module
C<Tie::Hash::NamedCapture>.

Users exposed to the .NET regex engine will find that the perl
implementation differs in that the numerical ordering of the buffers
is sequential, and not "unnamed first, then named". Thus in the pattern

   /(A)(?<B>B)(C)(?<D>D)/

$1 will be 'A', $2 will be 'B', $3 will be 'C' and $4 will be 'D' and not
$1 is 'A', $2 is 'C' and $3 is 'B' and $4 is 'D' that a .NET programmer
would expect. This is considered a feature. :-) (Yves Orton)

=item Possessive Quantifiers

Perl now supports the "possessive quantifier" syntax of the "atomic match"
pattern. Basically a possessive quantifier matches as much as it can and never
gives any back. Thus it can be used to control backtracking. The syntax is
similar to non-greedy matching, except instead of using a '?' as the modifier
the '+' is used. Thus C<?+>, C<*+>, C<++>, C<{min,max}+> are now legal
quantifiers. (Yves Orton)

=item Backtracking control verbs

The regex engine now supports a number of special-purpose backtrack
control verbs: (*THEN), (*PRUNE), (*MARK), (*SKIP), (*COMMIT), (*FAIL)
and (*ACCEPT). See L<perlre> for their descriptions. (Yves Orton)

=item Relative backreferences

A new syntax C<\g{N}> or C<\gN> where "N" is a decimal integer allows a
safer form of back-reference notation as well as allowing relative
backreferences. This should make it easier to generate and embed patterns
that contain backreferences. See L<perlre/"Capture buffers">. (Yves Orton)

=item C<\K> escape

The functionality of Jeff Pinyan's module Regexp::Keep has been added to
the core. You can now use in regular expressions the special escape C<\K>
as a way to do something like floating length positive lookbehind. It is
also useful in substitutions like:

  s/(foo)bar/$1/g

that can now be converted to

  s/foo\Kbar//g

which is much more efficient. (Yves Orton)

=back

=head2 The C<_> prototype

A new prototype character has been added. C<_> is equivalent to C<$> (it
denotes a scalar), but defaults to C<$_> if the corresponding argument
isn't supplied. Due to the optional nature of the argument, you can only
use it at the end of a prototype, or before a semicolon.

This has a small incompatible consequence: the prototype() function has
been adjusted to return C<_> for some built-ins in appropriate cases (for
example, C<prototype('CORE::rmdir')>). (Rafael)

=head2 UNITCHECK blocks

C<UNITCHECK>, a new special code block has been introduced, in addition to
C<BEGIN>, C<CHECK>, C<INIT> and C<END>.

C<CHECK> and C<INIT> blocks, while useful for some specialized purposes,
are always executed at the transition between the compilation and the
execution of the main program, and thus are useless whenever code is
loaded at runtime. On the other hand, C<UNITCHECK> blocks are executed
just after the unit which defined them has been compiled. See L<perlmod>
for more information. (Alex Gough)

=head2 readpipe() is now overridable

The built-in function readpipe() is now overridable. Overriding it permits
also to override its operator counterpart, C<qx//> (a.k.a. C<``>). (Rafael)

=head2 UCD 5.0.0

The copy of the Unicode Character Database included in Perl 5.9 has
been updated to version 5.0.0.

=head2 Smart match

The smart match operator (C<~~>) is now available by default (you don't
need to enable it with C<use feature> any longer). (Michael G Schwern)

=head1 Modules and Pragmas

=head2 New Core Modules

=over 4

=item *

C<Locale::Maketext::Simple>, needed by CPANPLUS, is a simple wrapper around
C<Locale::Maketext::Lexicon>. Note that C<Locale::Maketext::Lexicon> isn't
included in the perl core; the behaviour of C<Locale::Maketext::Simple>
gracefully degrades when the later isn't present.

=item *

C<Params::Check> implements a generic input parsing/checking mechanism. It
is used by CPANPLUS.

=item *

C<Term::UI> simplifies the task to ask questions at a terminal prompt.

=item *

C<Object::Accessor> provides an interface to create per-object accessors.

=item *

C<Module::Pluggable> is a simple framework to create modules that accept
pluggable sub-modules.

=item *

C<Module::Load::Conditional> provides simple ways to query and possibly
load installed modules.

=item *

C<Time::Piece> provides an object oriented interface to time functions,
overriding the built-ins localtime() and gmtime().

=item *

C<IPC::Cmd> helps to find and run external commands, possibly
interactively.

=item *

C<File::Fetch> provide a simple generic file fetching mechanism.

=item *

C<Archive::Extract> is a generic archive extraction mechanism
for F<.tar> (plain, gziped or bzipped) or F<.zip> files.

=back

=head2 Module changes

=over 4

=item C<base>

The C<base> pragma now warns if a class tries to inherit from itself.
(Curtis "Ovid" Poe)

=item C<warnings>

The C<warnings> pragma doesn't load C<Carp> anymore. That means that code
that used C<Carp> routines without having loaded it at compile time might
need to be adjusted; typically, the following (faulty) code won't work
anymore, and will require parentheses to be added after the function name:

    use warnings;
    require Carp;
    Carp::confess "argh";

=item C<less>

C<less> now does something useful (or at least it tries to). In fact, it
has been turned into a lexical pragma. So, in your modules, you can now
test whether your users have requested to use less CPU, or less memory,
less magic, or maybe even less fat. See L<less> for more. (Joshua ben
Jore)

=item C<Attribute::Handlers>

C<Attribute::Handlers> can now report the caller's file and line number.
(David Feldman)

=item C<B::Lint>

C<B::Lint> is now based on C<Module::Pluggable>, and so can be extended
with plugins. (Joshua ben Jore)

=item C<B>

It's now possible to access the lexical pragma hints (C<%^H>) by using the
method B::COP::hints_hash(). It returns a C<B::RHE> object, which in turn
can be used to get a hash reference via the method B::RHE::HASH(). (Joshua
ben Jore)

=for p5p XXX document this in B.pm too

=back

=head1 Utility Changes

=head1 Documentation

=head1 Performance Enhancements

=head1 Installation and Configuration Improvements

=head2 C++ compatibility

Efforts have been made to make perl and the core XS modules compilable
with various C++ compilers (although the situation is not perfect with
some of the compilers on some of the platforms tested.)

=head2 Static build on Win32

It's now possible to build a C<perl-static.exe> that doesn't depend
on C<perl59.dll> on Win32. See the Win32 makefiles for details.
(Vadim Konovalov)

=head2 Ports

Perl has been reported to work on MidnightBSD.

=head1 Selected Bug Fixes

PerlIO::scalar will now prevent writing to read-only scalars. Moreover,
seek() is now supported with PerlIO::scalar-based filehandles, the
underlying string being zero-filled as needed. (Rafael, Jarkko Hietaniemi)

study() never worked for UTF-8 strings, but could lead to false results.
It's now a no-op on UTF-8 data. (Yves Orton)

The signals SIGILL, SIGBUS and SIGSEGV are now always delivered in an
"unsafe" manner (contrary to other signals, that are deferred until the
perl interpreter reaches a reasonably stable state; see
L<perlipc/"Deferred Signals (Safe Signals)">). (Rafael)

When a module or a file is loaded through an @INC-hook, and when this hook
has set a filename entry in %INC, __FILE__ is now set for this module
accordingly to the contents of that %INC entry. (Rafael)

The C<-w> and C<-t> switches can now be used together without messing
up what categories of warnings are activated or not. (Rafael)

=head1 New or Changed Diagnostics

=head1 Changed Internals

The anonymous hash and array constructors now take 1 op in the optree
instead of 3, now that pp_anonhash and pp_anonlist return a reference to
an hash/array when the op is flagged with OPf_SPECIAL (Nicholas Clark).

=for p5p XXX have we some docs on how to create regexp engine plugins, since that's now possible ? (perlreguts)

=for p5p XXX new BIND SV type, #29544, #29642

=head1 Known Problems

=head2 Platform Specific Problems

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles
recently posted to the comp.lang.perl.misc newsgroup and the perl
bug database at http://rt.perl.org/rt3/ .  There may also be
information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the B<perlbug>
program included with your release.  Be sure to trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug@perl.org to be
analysed by the Perl porting team.

=head1 SEE ALSO

The F<Changes> file for exhaustive details on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut