diff options
author | Chris 'BinGOs' Williams <chris@bingosnet.co.uk> | 2011-09-08 15:54:31 +0100 |
---|---|---|
committer | Chris 'BinGOs' Williams <chris@bingosnet.co.uk> | 2011-09-08 16:44:19 +0100 |
commit | a9feb6cb11fbf4270519aa7318b89f5becccd87c (patch) | |
tree | 4df7e40c425d009068a3712af2225a1d585597df /cpan | |
parent | 39afdc5aa55747b1a9225da0c4c89a1bd0c8489b (diff) | |
download | perl-a9feb6cb11fbf4270519aa7318b89f5becccd87c.tar.gz |
Update perlfaq to CPAN version 5.0150033
[DELTA]
5.0150033 Tue 30 Aug 2011 21:46:25 +0100
* Perl 6 changes (apeiron)
* Replace tabs with spaces (ranguard)
* Link to www.cpan.org/ports/ for binary (ranguard)
* Switch C<> over to L<> for any module on CPAN (ranguard)
* Change brace stype (apeiron)
* Extra dzil tests (rafl)
* Changes so more semantic, and better rendered HTML (ranguard)
* Point to repo for how to update perlfaq (ranguard)
Diffstat (limited to 'cpan')
-rw-r--r-- | cpan/perlfaq/lib/perlfaq.pm | 4 | ||||
-rw-r--r-- | cpan/perlfaq/lib/perlfaq.pod | 36 | ||||
-rw-r--r-- | cpan/perlfaq/lib/perlfaq1.pod | 40 | ||||
-rw-r--r-- | cpan/perlfaq/lib/perlfaq2.pod | 83 | ||||
-rw-r--r-- | cpan/perlfaq/lib/perlfaq3.pod | 227 | ||||
-rw-r--r-- | cpan/perlfaq/lib/perlfaq4.pod | 1832 | ||||
-rw-r--r-- | cpan/perlfaq/lib/perlfaq5.pod | 1049 | ||||
-rw-r--r-- | cpan/perlfaq/lib/perlfaq6.pod | 699 | ||||
-rw-r--r-- | cpan/perlfaq/lib/perlfaq7.pod | 558 | ||||
-rw-r--r-- | cpan/perlfaq/lib/perlfaq8.pod | 758 | ||||
-rw-r--r-- | cpan/perlfaq/lib/perlfaq9.pod | 437 |
11 files changed, 2849 insertions, 2874 deletions
diff --git a/cpan/perlfaq/lib/perlfaq.pm b/cpan/perlfaq/lib/perlfaq.pm index bca26a56d8..153db2c922 100644 --- a/cpan/perlfaq/lib/perlfaq.pm +++ b/cpan/perlfaq/lib/perlfaq.pm @@ -1,6 +1,6 @@ package perlfaq; -BEGIN { - $perlfaq::VERSION = '5.01500302'; +{ + $perlfaq::VERSION = '5.0150033'; } 0; # not is it supposed to be loaded diff --git a/cpan/perlfaq/lib/perlfaq.pod b/cpan/perlfaq/lib/perlfaq.pod index 929b03deaa..efae64e3ee 100644 --- a/cpan/perlfaq/lib/perlfaq.pod +++ b/cpan/perlfaq/lib/perlfaq.pod @@ -14,11 +14,11 @@ The perlfaq comes with the standard Perl distribution, so if you have Perl you should have the perlfaq. You should also have the C<perldoc> tool that lets you read the L<perlfaq>: - $ perldoc perlfaq + $ perldoc perlfaq or search the perlfaq question headings: - $ perldoc -q open + $ perldoc -q open See L<perldoc> for more information. @@ -39,30 +39,7 @@ https://github.com/tpf/perlfaq which is the latest live version that drives http://faq.perl.org/ and will be distributed with the next release of Perl 5. -You can mail corrections, additions, and suggestions to -C<< <perlfaq-workers AT perl DOT org> >>. The perlfaq volunteers use this -address to coordinate their efforts and track the perlfaq development. -They appreciate your contributions to the FAQ but do not have time to -provide individual help, so don't use this address to ask FAQs. - -The perlfaq server posts extracts of the perlfaq to that newsgroup -every 6 hours (or so), and the community of volunteers reviews and -updates the answers. If you'd like to help review and update the -answers, check out comp.lang.perl.misc. - -You can also fork the perl repository, make your changes, and send them -to Perl 5 Porters. See L<perlgit>. - -=head2 What will happen if you mail your Perl programming problems to the authors? - -The perlfaq-workers like to keep all traffic on the perlfaq-workers list -so that everyone can see the work being done (and the work that needs to -be done). The mailing list serves as an official record. If you email the -authors or maintainers directly, you'll probably get a reply asking you -to post to the mailing list. If you don't get a reply, it probably means -that the person never saw the message or didn't have time to deal with -it. Posting to the list allows the volunteers with time to deal with it -when others are busy. +=head2 What if my question isn't answered in the FAQ? If you have a question that isn't in the FAQ and you would like help with it, try the resources in L<perlfaq2>. @@ -70,10 +47,9 @@ it, try the resources in L<perlfaq2>. =head1 CREDITS Tom Christiansen wrote the original perlfaq then expanded it with the -help of Nat Torkington. The perlfaq-workers maintain current document -and the denizens of comp.lang.perl.misc regularly review and update the -FAQ. Several people have contributed answers, corrections, and comments, -and the perlfaq notes those contributions wherever appropriate. +help of Nat Torkington. brian d foy substantialy edited and expanded +the perlfaq. perlfaq-workers and others have also supplied feedback +and corrections over the years. =head1 AUTHOR AND COPYRIGHT diff --git a/cpan/perlfaq/lib/perlfaq1.pod b/cpan/perlfaq/lib/perlfaq1.pod index 643ff4671e..c351397821 100644 --- a/cpan/perlfaq/lib/perlfaq1.pod +++ b/cpan/perlfaq/lib/perlfaq1.pod @@ -107,11 +107,9 @@ as its whitewashed bones have fractured or eroded. =item * -There is no Perl 6 release scheduled, but it will be available when -it's ready. The joke is that it's scheduled for Christmas, but that we -just don't know which one. Stay tuned, but don't worry that you'll -have to change major versions of Perl; no one is going to take Perl 5 -away from you. +The current leading implementation of Perl 6, Rakudo, released a "useful, +usable, 'early adopter'" distribution of Perl 6 (called Rakudo Star) in July of +2010. Please see http://rakudo.org/ for more information. =item * @@ -129,8 +127,9 @@ minor release (i.e. perl5.9.x, where 9 is the minor release). (contributed by brian d foy) -In short, Perl 4 is the past, Perl 5 is the present, and Perl 6 is the -future. +In short, Perl 4 is the parent to both Perl 5 and Perl 6. Perl 5 is the older +sibling, and though they are different languages, someone who knows one will +spot many similarities in the other. The number after Perl (i.e. the 5 after Perl 5) is the major release of the perl interpreter as well as the version of the language. Each @@ -143,13 +142,16 @@ The current major release of Perl is Perl 5, and was first released in concept of references, complex data structures, and modules. The Perl 5 interpreter was a complete re-write of the previous perl sources. -Perl 6 is the next major version of Perl, although it's not intended to -replace Perl 5. It's still in development in both its syntax and -design. The work started in 2002 and is still ongoing. Some of the -most interesting features have shown up in the latest versions of Perl -5, and some Perl 5 modules allow you to use some Perl 6 syntax in your -programs. The current leading implementation of Perl 6 is Rakudo ( -http://rakudo.org ). +Perl 6 was originally described as the community's rewrite of Perl 5. +Development started in 2002; syntax and design work continue to this day. As the +language has evolved, it has become clear that it is a separate language, +incompatible with Perl 5 but in the same language family. Contrary to popular +belief, Perl 6 and Perl 5 peacefully coexist with one another. That said, Perl 6 +has proven to be a fascinating source of ideas for those using Perl 5 (the Moose +object system is a well-known example). There is overlap in the communities, and +this overlap fosters the tradition of sharing and borrowing that have been +instrumental to Perl's success. The current leading implementation of Perl 6 is +Rakudo, and you can learn more about it at http://rakudo.org. See L<perlhist> for a history of Perl revisions. @@ -165,13 +167,13 @@ If you want to learn more about Perl 6, or have a desire to help in the crusade to make Perl a better place then read the Perl 6 developers page at http://dev.perl.org/perl6/ and get involved. -Perl 6 is not scheduled for release yet, and Perl 5 will still be supported -for quite awhile after its release. Do not wait for Perl 6 to do whatever -you need to do. - "We're really serious about reinventing everything that needs reinventing." --Larry Wall +As Perl 6 is a reinvention of Perl, it is a language in the same lineage but +not compatible. The two are complementary, not mutually exclusive. Perl 6 is not +meant to replace Perl 5, and vice versa. + =head2 How stable is Perl? Production releases, which incorporate bug fixes and new functionality, @@ -323,7 +325,7 @@ to sign email and usenet messages starting in the late 1980s. He previously used the phrase with many subjects ("Just another x hacker,"), so to distinguish his JAPH, he started to write them as Perl programs: - print "Just another Perl hacker,"; + print "Just another Perl hacker,"; Other people picked up on this and started to write clever or obfuscated programs to produce the same output, spinning things quickly out of diff --git a/cpan/perlfaq/lib/perlfaq2.pod b/cpan/perlfaq/lib/perlfaq2.pod index 59cf6d6d08..cf778ee4a2 100644 --- a/cpan/perlfaq/lib/perlfaq2.pod +++ b/cpan/perlfaq/lib/perlfaq2.pod @@ -31,23 +31,7 @@ source release of perl). =head2 How can I get a binary version of perl? -(contributed by brian d foy) - -ActiveState: Windows, Linux, Mac OS X, Solaris, AIX and HP-UX - - http://www.activestate.com/ - -Sunfreeware.com: Solaris 2.5 to Solaris 10 (SPARC and x86) - - http://www.sunfreeware.com/ - -Strawberry Perl: Windows, Perl 5.8.8 and 5.10.0 - - http://www.strawberryperl.com - -IndigoPerl: Windows - - http://indigostar.com/ +See L<CPAN Ports|http://www.cpan.org/ports/> =head2 I don't have a C compiler. How can I build my own Perl interpreter? @@ -61,8 +45,8 @@ information on where to get such a binary version. You might look around the net for a pre-built binary of Perl (or a C compiler!) that meets your needs, though: -For Windows, Vanilla Perl ( http://vanillaperl.com/ ) and Strawberry Perl -( http://strawberryperl.com/ ) come with a +For Windows, L<Vanilla Perl|http://vanillaperl.com/> and +L<Strawberry Perl|http://strawberryperl.com/> come with a bundled C compiler. ActivePerl is a pre-compiled version of Perl ready-to-use. @@ -173,19 +157,25 @@ expressions, L<perlthrtut> for threads, L<perldebtut> for debugging, and L<perlxstut> for linking C and Perl together. There may be more by the time you read this. These URLs might also be useful: - http://perldoc.perl.org/ - http://www.perl.org/ - http://learn.perl.org/ +=over 4 + +=item * L<Perldoc|http://perldoc.perl.org/> + +=item * L<Perl.org|http://www.perl.org/> + +=item * L<Learn.perl.org|http://learn.perl.org/> + +=back =head2 What are the Perl newsgroups on Usenet? Where do I post questions? Several groups devoted to the Perl language are on Usenet: - comp.lang.perl.announce Moderated announcement group - comp.lang.perl.misc High traffic general Perl discussion - comp.lang.perl.moderated Moderated discussion group - comp.lang.perl.modules Use and development of Perl modules - comp.lang.perl.tk Using Tk (and X) from Perl + comp.lang.perl.announce Moderated announcement group + comp.lang.perl.misc High traffic general Perl discussion + comp.lang.perl.moderated Moderated discussion group + comp.lang.perl.modules Use and development of Perl modules + comp.lang.perl.tk Using Tk (and X) from Perl Some years ago, comp.lang.perl was divided into those groups, and comp.lang.perl itself officially removed. While that group may still @@ -258,13 +248,12 @@ http://www.ddj.com/ or brian d foy's index of online TPJ content =head2 What mailing lists are there for Perl? -Most of the major modules (C<Tk>, C<CGI>, C<libwww-perl>) have their own +Most of the major modules (L<Tk>, L<CGI>, L<libwww-perl>) have their own mailing lists. Consult the documentation that came with the module for subscription information. -A comprehensive list of Perl-related mailing lists can be found at: - - http://lists.perl.org/ +A comprehensive list of Perl-related mailing lists can be found at +http://lists.perl.org/ =head2 Where are the archives for comp.lang.perl.misc? @@ -313,17 +302,17 @@ information about your installation to include with your message, then sends the message to the right place. To determine if a module came with your version of Perl, you can -use the C<Module::CoreList> module. It has the information about +use the L<Module::CoreList> module. It has the information about the modules (with their versions) included with each release of Perl. -If C<Module::CoreList> is not installed on your system, check out +If L<Module::CoreList> is not installed on your system, check out http://perlpunks.de/corelist . Every CPAN module has a bug tracker set up in RT, http://rt.cpan.org . You can submit bugs to RT either through its web interface or by email. To email a bug report, send it to bug-E<lt>distribution-nameE<gt>@rt.cpan.org . For example, if you -wanted to report a bug in C<Business::ISBN>, you could send a message to +wanted to report a bug in L<Business::ISBN>, you could send a message to bug-Business-ISBN@rt.cpan.org . Some modules might have special reporting requirements, such as a @@ -332,32 +321,38 @@ module documentation too. =head2 What is perl.com? Perl Mongers? pm.org? perl.org? cpan.org? -Perl.com ( http://www.perl.com/ ) used to be part of the O'Reilly +L<Perl.com|http://www.perl.com/> used to be part of the O'Reilly Network, a subsidiary of O'Reilly Media. Although it retains most of the original content from its O'Reilly Network, it is now hosted by The Perl Foundation. The Perl Foundation is an advocacy organization for the Perl language -which maintains the web site ( http://www.perl.org/ ) as a general +which maintains the web site L<Perl.org|http://www.perl.org/> as a general advocacy site for the Perl language. It uses the domain to provide general support services to the Perl community, including the hosting of mailing lists, web sites, and other services. There are also many other sub-domains for special topics like learning Perl, Perl news, jobs in Perl, such as: - http://www.perl.org/ - http://learn.perl.org/ - http://jobs.perl.org/ - http://lists.perl.org/ +=over 4 + +=item * L<Perl.org|http://www.perl.org/> + +=item * L<Learn.perl.org|http://learn.perl.org/> + +=item * L<Jobs.perl.org|http://jobs.perl.org/> + +=item * L<Lists.perl.org|http://lists.perl.org/> + +=back Perl Mongers uses the pm.org domain for services related to Perl user groups, including the hosting of mailing lists and web sites. See the -Perl Mongers website ( http://www.pm.org/ ) for more information about +L<Perl Mongers website|http://www.pm.org/> for more information about joining, starting, or requesting services for a Perl user group. -CPAN, or the Comprehensive Perl Archive Network ( -http://www.cpan.org/ ), is a replicated, worldwide repository of Perl -software. +CPAN, or the L<Comprehensive Perl Archive Network|http://www.cpan.org/>, +is a replicated, worldwide repository of Perl software. See L<What is CPAN?|/"What modules and extensions are available for Perl? What is CPAN? What does CPANE<sol>srcE<sol>... mean?">. =head1 AUTHOR AND COPYRIGHT diff --git a/cpan/perlfaq/lib/perlfaq3.pod b/cpan/perlfaq/lib/perlfaq3.pod index 7ba64158e6..689523ca46 100644 --- a/cpan/perlfaq/lib/perlfaq3.pod +++ b/cpan/perlfaq/lib/perlfaq3.pod @@ -13,25 +13,58 @@ Have you looked at CPAN (see L<perlfaq2>)? The chances are that someone has already written a module that can solve your problem. Have you read the appropriate manpages? Here's a brief index: - Basics perldata, perlvar, perlsyn, perlop, perlsub - Execution perlrun, perldebug - Functions perlfunc - Objects perlref, perlmod, perlobj, perltie - Data Structures perlref, perllol, perldsc - Modules perlmod, perlmodlib, perlsub - Regexes perlre, perlfunc, perlop, perllocale - Moving to perl5 perltrap, perl - Linking w/C perlxstut, perlxs, perlcall, perlguts, perlembed - Various http://www.cpan.org/misc/olddoc/FMTEYEWTK.tgz - (not a man-page but still useful, a collection - of various essays on Perl techniques) +=over 4 + +=item Basics + +perldata, perlvar, perlsyn, perlop, perlsub + +=item Execution + +perlrun, perldebug + +=item Functions + +perlfunc + +=item Objects + +perlref, perlmod, perlobj, perltie + +=item Data Structures + +perlref, perllol, perldsc + +=item Modules + +perlmod, perlmodlib, perlsub + +=item Regexes + +perlre, perlfunc, perlop, perllocale + +=item Moving to perl5 + +perltrap, perl + +=item Linking with C + +perlxstut, perlxs, perlcall, perlguts, perlembed + +=item Various + +http://www.cpan.org/misc/olddoc/FMTEYEWTK.tgz +(not a man-page but still useful, a collection of various essays on +Perl techniques) + +=back A crude table of contents for the Perl manpage set is found in L<perltoc>. =head2 How can I use Perl interactively? The typical approach uses the Perl debugger, described in the -C<perldebug(1)> manpage, on an "empty" program, like this: +L<perldebug(1)> manpage, on an "empty" program, like this: perl -de 42 @@ -42,18 +75,17 @@ operations typically found in symbolic debuggers. =head2 Is there a Perl shell? -The C<psh> (Perl sh) is currently at version 1.8. The Perl Shell is a shell +The L<psh> (Perl sh) is currently at version 1.8. The Perl Shell is a shell that combines the interactive nature of a Unix shell with the power of Perl. The goal is a full-featured shell that behaves as expected for normal shell activity and uses Perl syntax and functionality for -control-flow statements and other things. You can get C<psh> at -http://sourceforge.net/projects/psh/ . +control-flow statements and other things. You can get L<psh> at +https://metacpan.org/release/psh . -C<Zoidberg> is a similar project and provides a shell written in perl, +L<Zoidberg> is a similar project and provides a shell written in perl, configured in perl and operated in perl. It is intended as a login shell and development environment. It can be found at -http://pardus-larus.student.utwente.nl/~pardus/projects/zoidberg/ -or your local CPAN mirror. +https://metacpan.org/release/Zoidberg. The C<Shell.pm> module (distributed with Perl) makes Perl try commands which aren't part of the Perl language as shell commands. C<perlsh> from @@ -64,54 +96,54 @@ be what you want. From the command line, you can use the C<cpan> command's C<-l> switch: - $ cpan -l + $ cpan -l You can also use C<cpan>'s C<-a> switch to create an autobundle file that C<CPAN.pm> understands and can use to re-install every module: - $ cpan -a + $ cpan -a -Inside a Perl program, you can use the C<ExtUtils::Installed> module to +Inside a Perl program, you can use the L<ExtUtils::Installed> module to show all installed distributions, although it can take awhile to do its magic. The standard library which comes with Perl just shows up -as "Perl" (although you can get those with C<Module::CoreList>). +as "Perl" (although you can get those with L<Module::CoreList>). - use ExtUtils::Installed; + use ExtUtils::Installed; - my $inst = ExtUtils::Installed->new(); - my @modules = $inst->modules(); + my $inst = ExtUtils::Installed->new(); + my @modules = $inst->modules(); If you want a list of all of the Perl module filenames, you -can use C<File::Find::Rule>: +can use L<File::Find::Rule>: - use File::Find::Rule; + use File::Find::Rule; - my @files = File::Find::Rule-> - extras({follow => 1})-> - file()-> - name( '*.pm' )-> - in( @INC ) - ; + my @files = File::Find::Rule-> + extras({follow => 1})-> + file()-> + name( '*.pm' )-> + in( @INC ) + ; If you do not have that module, you can do the same thing -with C<File::Find> which is part of the standard library: +with L<File::Find> which is part of the standard library: - use File::Find; - my @files; + use File::Find; + my @files; - find( - { - wanted => sub { - push @files, $File::Find::fullname - if -f $File::Find::fullname && /\.pm$/ - }, - follow => 1, - follow_skip => 2, - }, - @INC - ); + find( + { + wanted => sub { + push @files, $File::Find::fullname + if -f $File::Find::fullname && /\.pm$/ + }, + follow => 1, + follow_skip => 2, + }, + @INC + ); - print join "\n", @files; + print join "\n", @files; If you simply need to check quickly to see if a module is available, you can check for its documentation. If you can @@ -119,12 +151,12 @@ read the documentation the module is most likely installed. If you cannot read the documentation, the module might not have any (in rare cases): - $ perldoc Module::Name + $ perldoc Module::Name You can also try to include the module in a one-liner to see if perl finds it: - $ perl -MModule::Name -e1 + $ perl -MModule::Name -e1 =head2 How do I debug my Perl programs? @@ -136,28 +168,28 @@ on warnings and strictures, you can head off many problems before they get too big. You can find out more about these in L<strict> and L<warnings>. - #!/usr/bin/perl - use strict; - use warnings; + #!/usr/bin/perl + use strict; + use warnings; Beyond that, the simplest debugger is the C<print> function. Use it to look at values as you run your program: - print STDERR "The value is [$value]\n"; + print STDERR "The value is [$value]\n"; -The C<Data::Dumper> module can pretty-print Perl data structures: +The L<Data::Dumper> module can pretty-print Perl data structures: - use Data::Dumper qw( Dumper ); - print STDERR "The hash is " . Dumper( \%hash ) . "\n"; + use Data::Dumper qw( Dumper ); + print STDERR "The hash is " . Dumper( \%hash ) . "\n"; Perl comes with an interactive debugger, which you can start with the C<-d> switch. It's fully explained in L<perldebug>. -If you'd like a graphical user interface and you have C<Tk>, you can use +If you'd like a graphical user interface and you have L<Tk>, you can use C<ptkdb>. It's on CPAN and available for free. If you need something much more sophisticated and controllable, Leon -Brocard's C<Devel::ebug> (which you can call with the C<-D> switch as C<-Debug>) +Brocard's L<Devel::ebug> (which you can call with the C<-D> switch as C<-Debug>) gives you the programmatic hooks into everything you need to write your own (without too much pain and suffering). @@ -171,20 +203,20 @@ from Activestate (Windows and Mac OS X), or EPIC (most platforms). The C<Devel> namespace has several modules which you can use to profile your Perl programs. -The C<Devel::NYTProf> (New York Times Profiler) does both statement +The L<Devel::NYTProf> (New York Times Profiler) does both statement and subroutine profiling. It's available from CPAN and you also invoke it with the C<-d> switch: - perl -d:NYTProf some_perl.pl + perl -d:NYTProf some_perl.pl It creates a database of the profile information that you can turn into reports. The C<nytprofhtml> command turns the data into an HTML report -similar to the C<Devel::Cover> report: +similar to the L<Devel::Cover> report: - nytprofhtml + nytprofhtml CPAN has several other profilers that you can invoke in the same -fashion. You might also be interested in using the C<Benchmark> to +fashion. You might also be interested in using the L<Benchmark> to measure and compare code snippets. You can read more about profiling in I<Programming Perl>, chapter 20, @@ -209,18 +241,17 @@ http://www.stonehenge.com/merlyn/LinuxMag/col75.html . =head2 How do I cross-reference my Perl programs? -The C<B::Xref> module can be used to generate cross-reference reports +The L<B::Xref> module can be used to generate cross-reference reports for Perl programs. perl -MO=Xref[,OPTIONS] scriptname.plx =head2 Is there a pretty-printer (formatter) for Perl? -C<Perltidy> is a Perl script which indents and reformats Perl scripts -to make them easier to read by trying to follow the rules of the -L<perlstyle>. If you write Perl scripts, or spend much time reading -them, you will probably find it useful. It is available at -http://perltidy.sourceforge.net . +L<Perl::Tidy> comes with a perl script L<perltidy> which indents and +reformats Perl scripts to make them easier to read by trying to follow +the rules of the L<perlstyle>. If you write Perl, or spend much time reading +Perl, you will probably find it useful. Of course, if you simply follow the guidelines in L<perlstyle>, you shouldn't need to reformat. The habit of formatting your code @@ -651,22 +682,22 @@ toward this: Don't read an entire file into memory if you can process it line by line. Or more concretely, use a loop like this: - # - # Good Idea - # - while (<FILE>) { - # ... - } + # + # Good Idea + # + while (<FILE>) { + # ... + } instead of this: - # - # Bad Idea - # - @data = <FILE>; - foreach (@data) { - # ... - } + # + # Bad Idea + # + @data = <FILE>; + foreach (@data) { + # ... + } When the files you're processing are small, it doesn't much matter which way you do it, but it makes a huge difference when they start getting @@ -700,21 +731,21 @@ only makes one copy. Ditto for stringifying large arrays: - { - local $, = "\n"; - print @big_array; - } + { + local $, = "\n"; + print @big_array; + } is much more memory-efficient than either - print join "\n", @big_array; + print join "\n", @big_array; or - { - local $" = "\n"; - print "@big_array"; - } + { + local $" = "\n"; + print "@big_array"; + } =item * Pass by reference @@ -741,8 +772,8 @@ Yes. Perl's garbage collection system takes care of this so everything works out right. sub makeone { - my @a = ( 1 .. 10 ); - return \@a; + my @a = ( 1 .. 10 ); + return \@a; } for ( 1 .. 10 ) { @@ -931,7 +962,7 @@ Yes. Read L<perlrun> for more information. Some examples follow. # display reasonable manpath echo $PATH | perl -nl -072 -e ' - s![^/+]*$!man!&&-d&&!$s{$_}++&&push@m,$_;END{print"@m"}' + s![^/+]*$!man!&&-d&&!$s{$_}++&&push@m,$_;END{print"@m"}' OK, the last one was actually an Obfuscated Perl Contest entry. :-) @@ -989,7 +1020,7 @@ do I get 500 Errors" or "Why doesn't it run from the browser right when it runs fine on the command line", see the troubleshooting guides and references in L<perlfaq9> or in the CGI MetaFAQ: - http://www.perl.org/CGI_MetaFAQ.html + http://www.perl.org/CGI_MetaFAQ.html =head2 Where can I learn about object-oriented Perl programming? @@ -1043,7 +1074,7 @@ or (contributed by brian d foy) -The C<ExtUtils::MakeMaker> module, better known simply as "MakeMaker", +The L<ExtUtils::MakeMaker> module, better known simply as "MakeMaker", turns a Perl script, typically called C<Makefile.PL>, into a Makefile. The Unix tool C<make> uses this file to manage dependencies and actions to process and install a Perl distribution. diff --git a/cpan/perlfaq/lib/perlfaq4.pod b/cpan/perlfaq/lib/perlfaq4.pod index 40000f8905..6bca48e794 100644 --- a/cpan/perlfaq/lib/perlfaq4.pod +++ b/cpan/perlfaq/lib/perlfaq4.pod @@ -28,9 +28,9 @@ To limit the number of decimal places in your numbers, you can use the C<printf> or C<sprintf> function. See L<perlop/"Floating-point Arithmetic"> for more details. - printf "%.2f", 10/3; + printf "%.2f", 10/3; - my $number = sprintf "%.2f", 10/3; + my $number = sprintf "%.2f", 10/3; =head2 Why is int() broken? @@ -43,7 +43,7 @@ First, see the answer to "Why am I getting long decimals For example, this - print int(0.6/0.2-2), "\n"; + print int(0.6/0.2-2), "\n"; will in most computers print 0, not 1, because even such simple numbers as 0.6 and 0.2 cannot be presented exactly by floating-point @@ -59,38 +59,38 @@ converts as a decimal number. When Perl converts a string to a number, it ignores leading spaces and zeroes, then assumes the rest of the digits are in base 10: - my $string = '0644'; + my $string = '0644'; - print $string + 0; # prints 644 + print $string + 0; # prints 644 - print $string + 44; # prints 688, certainly not octal! + print $string + 44; # prints 688, certainly not octal! This problem usually involves one of the Perl built-ins that has the same name a Unix command that uses octal numbers as arguments on the command line. In this example, C<chmod> on the command line knows that its first argument is octal because that's what it does: - %prompt> chmod 644 file + %prompt> chmod 644 file If you want to use the same literal digits (644) in Perl, you have to tell Perl to treat them as octal numbers either by prefixing the digits with a C<0> or using C<oct>: - chmod( 0644, $file); # right, has leading zero - chmod( oct(644), $file ); # also correct + chmod( 0644, $file); # right, has leading zero + chmod( oct(644), $file ); # also correct The problem comes in when you take your numbers from something that Perl thinks is a string, such as a command line argument in C<@ARGV>: - chmod( $ARGV[0], $file); # wrong, even if "0644" + chmod( $ARGV[0], $file); # wrong, even if "0644" - chmod( oct($ARGV[0]), $file ); # correct, treat string as octal + chmod( oct($ARGV[0]), $file ); # correct, treat string as octal You can always check the value you're using by printing it in octal notation to ensure it matches what you think it should be. Print it in octal and decimal format: - printf "0%o %d", $number, $number; + printf "0%o %d", $number, $number; =head2 Does Perl have a round() function? What about ceil() and floor()? Trig functions? @@ -98,20 +98,20 @@ Remember that C<int()> merely truncates toward 0. For rounding to a certain number of digits, C<sprintf()> or C<printf()> is usually the easiest route. - printf("%.3f", 3.1415926535); # prints 3.142 + printf("%.3f", 3.1415926535); # prints 3.142 -The C<POSIX> module (part of the standard Perl distribution) +The Perltidy module (part of the standard Perl distribution) implements C<ceil()>, C<floor()>, and a number of other mathematical and trigonometric functions. - use POSIX; - $ceil = ceil(3.5); # 4 - $floor = floor(3.5); # 3 + use POSIX; + $ceil = ceil(3.5); # 4 + $floor = floor(3.5); # 3 -In 5.000 to 5.003 perls, trigonometry was done in the C<Math::Complex> -module. With 5.004, the C<Math::Trig> module (part of the standard Perl +In 5.000 to 5.003 perls, trigonometry was done in the L<Math::Complex> +module. With 5.004, the L<Math::Trig> module (part of the standard Perl distribution) implements the trigonometric functions. Internally it -uses the C<Math::Complex> module and some functions can break out from +uses the L<Math::Complex> module and some functions can break out from the real axis into the complex plane, for example the inverse sine of 2. @@ -124,10 +124,10 @@ need yourself. To see why, notice how you'll still have an issue on half-way-point alternation: - for ($i = 0; $i < 1.01; $i += 0.05) { printf "%.1f ",$i} + for ($i = 0; $i < 1.01; $i += 0.05) { printf "%.1f ",$i} - 0.0 0.1 0.1 0.2 0.2 0.2 0.3 0.3 0.4 0.4 0.5 0.5 0.6 0.7 0.7 - 0.8 0.8 0.9 0.9 1.0 1.0 + 0.0 0.1 0.1 0.2 0.2 0.2 0.3 0.3 0.4 0.4 0.5 0.5 0.6 0.7 0.7 + 0.8 0.8 0.9 0.9 1.0 1.0 Don't blame Perl. It's the same as in C. IEEE says we have to do this. Perl numbers whose absolute values are integers under 2**31 (on @@ -141,8 +141,8 @@ few examples of approaches to making common conversions between number representations. This is intended to be representational rather than exhaustive. -Some of the examples later in L<perlfaq4> use the C<Bit::Vector> -module from CPAN. The reason you might choose C<Bit::Vector> over the +Some of the examples later in L<perlfaq4> use the L<Bit::Vector> +module from CPAN. The reason you might choose L<Bit::Vector> over the perl built-in functions is that it works with numbers of ANY size, that it is optimized for speed on some operations, and for at least some programmers the notation might be familiar. @@ -153,119 +153,119 @@ some programmers the notation might be familiar. Using perl's built in conversion of C<0x> notation: - $dec = 0xDEADBEEF; + $dec = 0xDEADBEEF; Using the C<hex> function: - $dec = hex("DEADBEEF"); + $dec = hex("DEADBEEF"); Using C<pack>: - $dec = unpack("N", pack("H8", substr("0" x 8 . "DEADBEEF", -8))); + $dec = unpack("N", pack("H8", substr("0" x 8 . "DEADBEEF", -8))); Using the CPAN module C<Bit::Vector>: - use Bit::Vector; - $vec = Bit::Vector->new_Hex(32, "DEADBEEF"); - $dec = $vec->to_Dec(); + use Bit::Vector; + $vec = Bit::Vector->new_Hex(32, "DEADBEEF"); + $dec = $vec->to_Dec(); =item How do I convert from decimal to hexadecimal Using C<sprintf>: - $hex = sprintf("%X", 3735928559); # upper case A-F - $hex = sprintf("%x", 3735928559); # lower case a-f + $hex = sprintf("%X", 3735928559); # upper case A-F + $hex = sprintf("%x", 3735928559); # lower case a-f Using C<unpack>: - $hex = unpack("H*", pack("N", 3735928559)); + $hex = unpack("H*", pack("N", 3735928559)); -Using C<Bit::Vector>: +Using L<Bit::Vector>: - use Bit::Vector; - $vec = Bit::Vector->new_Dec(32, -559038737); - $hex = $vec->to_Hex(); + use Bit::Vector; + $vec = Bit::Vector->new_Dec(32, -559038737); + $hex = $vec->to_Hex(); -And C<Bit::Vector> supports odd bit counts: +And L<Bit::Vector> supports odd bit counts: - use Bit::Vector; - $vec = Bit::Vector->new_Dec(33, 3735928559); - $vec->Resize(32); # suppress leading 0 if unwanted - $hex = $vec->to_Hex(); + use Bit::Vector; + $vec = Bit::Vector->new_Dec(33, 3735928559); + $vec->Resize(32); # suppress leading 0 if unwanted + $hex = $vec->to_Hex(); =item How do I convert from octal to decimal Using Perl's built in conversion of numbers with leading zeros: - $dec = 033653337357; # note the leading 0! + $dec = 033653337357; # note the leading 0! Using the C<oct> function: - $dec = oct("33653337357"); + $dec = oct("33653337357"); -Using C<Bit::Vector>: +Using L<Bit::Vector>: - use Bit::Vector; - $vec = Bit::Vector->new(32); - $vec->Chunk_List_Store(3, split(//, reverse "33653337357")); - $dec = $vec->to_Dec(); + use Bit::Vector; + $vec = Bit::Vector->new(32); + $vec->Chunk_List_Store(3, split(//, reverse "33653337357")); + $dec = $vec->to_Dec(); =item How do I convert from decimal to octal Using C<sprintf>: - $oct = sprintf("%o", 3735928559); + $oct = sprintf("%o", 3735928559); -Using C<Bit::Vector>: +Using L<Bit::Vector>: - use Bit::Vector; - $vec = Bit::Vector->new_Dec(32, -559038737); - $oct = reverse join('', $vec->Chunk_List_Read(3)); + use Bit::Vector; + $vec = Bit::Vector->new_Dec(32, -559038737); + $oct = reverse join('', $vec->Chunk_List_Read(3)); =item How do I convert from binary to decimal Perl 5.6 lets you write binary numbers directly with the C<0b> notation: - $number = 0b10110110; + $number = 0b10110110; Using C<oct>: - my $input = "10110110"; - $decimal = oct( "0b$input" ); + my $input = "10110110"; + $decimal = oct( "0b$input" ); Using C<pack> and C<ord>: - $decimal = ord(pack('B8', '10110110')); + $decimal = ord(pack('B8', '10110110')); Using C<pack> and C<unpack> for larger strings: - $int = unpack("N", pack("B32", - substr("0" x 32 . "11110101011011011111011101111", -32))); - $dec = sprintf("%d", $int); + $int = unpack("N", pack("B32", + substr("0" x 32 . "11110101011011011111011101111", -32))); + $dec = sprintf("%d", $int); - # substr() is used to left-pad a 32-character string with zeros. + # substr() is used to left-pad a 32-character string with zeros. -Using C<Bit::Vector>: +Using L<Bit::Vector>: - $vec = Bit::Vector->new_Bin(32, "11011110101011011011111011101111"); - $dec = $vec->to_Dec(); + $vec = Bit::Vector->new_Bin(32, "11011110101011011011111011101111"); + $dec = $vec->to_Dec(); =item How do I convert from decimal to binary Using C<sprintf> (perl 5.6+): - $bin = sprintf("%b", 3735928559); + $bin = sprintf("%b", 3735928559); Using C<unpack>: - $bin = unpack("B*", pack("N", 3735928559)); + $bin = unpack("B*", pack("N", 3735928559)); -Using C<Bit::Vector>: +Using L<Bit::Vector>: - use Bit::Vector; - $vec = Bit::Vector->new_Dec(32, -559038737); - $bin = $vec->to_Bin(); + use Bit::Vector; + $vec = Bit::Vector->new_Dec(32, -559038737); + $bin = $vec->to_Bin(); The remaining transformations (e.g. hex -> oct, bin -> hex, etc.) are left as an exercise to the inclined reader. @@ -290,59 +290,59 @@ stringify the arguments explicitly (using C<""> or C<qq()>) or convert them to numbers explicitly (using C<0+$arg>). The rest arise because the programmer says: - if ("\020\020" & "\101\101") { - # ... - } + if ("\020\020" & "\101\101") { + # ... + } but a string consisting of two null bytes (the result of C<"\020\020" & "\101\101">) is not a false value in Perl. You need: - if ( ("\020\020" & "\101\101") !~ /[^\000]/) { - # ... - } + if ( ("\020\020" & "\101\101") !~ /[^\000]/) { + # ... + } =head2 How do I multiply matrices? -Use the C<Math::Matrix> or C<Math::MatrixReal> modules (available from CPAN) -or the C<PDL> extension (also available from CPAN). +Use the L<Math::Matrix> or L<Math::MatrixReal> modules (available from CPAN) +or the L<PDL> extension (also available from CPAN). =head2 How do I perform an operation on a series of integers? To call a function on each element in an array, and collect the results, use: - @results = map { my_func($_) } @array; + @results = map { my_func($_) } @array; For example: - @triple = map { 3 * $_ } @single; + @triple = map { 3 * $_ } @single; To call a function on each element of an array, but ignore the results: - foreach $iterator (@array) { - some_func($iterator); - } + foreach $iterator (@array) { + some_func($iterator); + } To call a function on each integer in a (small) range, you B<can> use: - @results = map { some_func($_) } (5 .. 25); + @results = map { some_func($_) } (5 .. 25); but you should be aware that the C<..> operator creates a list of all integers in the range. This can take a lot of memory for large ranges. Instead use: - @results = (); - for ($i=5; $i <= 500_005; $i++) { - push(@results, some_func($i)); - } + @results = (); + for ($i=5; $i <= 500_005; $i++) { + push(@results, some_func($i)); + } This situation has been fixed in Perl5.005. Use of C<..> in a C<for> loop will iterate over the range, without creating the entire range. - for my $i (5 .. 500_005) { - push(@results, some_func($i)); - } + for my $i (5 .. 500_005) { + push(@results, some_func($i)); + } will not create a list of 500,000 integers. @@ -355,7 +355,7 @@ Get the L<http://www.cpan.org/modules/by-module/Roman> module. If you're using a version of Perl before 5.004, you must call C<srand> once at the start of your program to seed the random number generator. - BEGIN { srand() if $] < 5.004 } + BEGIN { srand() if $] < 5.004 } 5.004 and later automatically call C<srand> at the beginning. Don't call C<srand> more than once--you make your numbers less random, @@ -370,7 +370,7 @@ who attempts to generate random numbers by deterministic means is, of course, living in a state of sin." If you want numbers that are more random than C<rand> with C<srand> -provides, you should also check out the C<Math::TrulyRandom> module from +provides, you should also check out the L<Math::TrulyRandom> module from CPAN. It uses the imperfections in your system's timer to generate random numbers, but this takes quite a while. If you want a better pseudorandom generator than comes with your operating system, look at @@ -389,19 +389,19 @@ from 0 to the difference between your I<X> and I<Y>. That is, to get a number between 10 and 15, inclusive, you want a random number between 0 and 5 that you can then add to 10. - my $number = 10 + int rand( 15-10+1 ); # ( 10,11,12,13,14, or 15 ) + my $number = 10 + int rand( 15-10+1 ); # ( 10,11,12,13,14, or 15 ) Hence you derive the following simple function to abstract that. It selects a random integer between the two given integers (inclusive), For example: C<random_int_between(50,120)>. - sub random_int_between { - my($min, $max) = @_; - # Assumes that the two arguments are integers themselves! - return $min if $min == $max; - ($min, $max) = ($max, $min) if $min > $max; - return $min + int rand(1 + $max - $min); - } + sub random_int_between { + my($min, $max) = @_; + # Assumes that the two arguments are integers themselves! + return $min if $min == $max; + ($min, $max) = ($max, $min) if $min > $max; + return $min + int rand(1 + $max - $min); + } =head1 Data: Dates @@ -411,48 +411,48 @@ The day of the year is in the list returned by the C<localtime> function. Without an argument C<localtime> uses the current time. - my $day_of_year = (localtime)[7]; + my $day_of_year = (localtime)[7]; -The C<POSIX> module can also format a date as the day of the year or +The L<POSIX> module can also format a date as the day of the year or week of the year. - use POSIX qw/strftime/; - my $day_of_year = strftime "%j", localtime; - my $week_of_year = strftime "%W", localtime; + use POSIX qw/strftime/; + my $day_of_year = strftime "%j", localtime; + my $week_of_year = strftime "%W", localtime; -To get the day of year for any date, use C<POSIX>'s C<mktime> to get +To get the day of year for any date, use L<POSIX>'s C<mktime> to get a time in epoch seconds for the argument to C<localtime>. - use POSIX qw/mktime strftime/; - my $week_of_year = strftime "%W", - localtime( mktime( 0, 0, 0, 18, 11, 87 ) ); + use POSIX qw/mktime strftime/; + my $week_of_year = strftime "%W", + localtime( mktime( 0, 0, 0, 18, 11, 87 ) ); -You can also use C<Time::Piece>, which comes with Perl and provides a +You can also use L<Time::Piece>, which comes with Perl and provides a C<localtime> that returns an object: - use Time::Piece; - my $day_of_year = localtime->yday; - my $week_of_year = localtime->week; + use Time::Piece; + my $day_of_year = localtime->yday; + my $week_of_year = localtime->week; -The C<Date::Calc> module provides two functions to calculate these, too: +The L<Date::Calc> module provides two functions to calculate these, too: - use Date::Calc; - my $day_of_year = Day_of_Year( 1987, 12, 18 ); - my $week_of_year = Week_of_Year( 1987, 12, 18 ); + use Date::Calc; + my $day_of_year = Day_of_Year( 1987, 12, 18 ); + my $week_of_year = Week_of_Year( 1987, 12, 18 ); =head2 How do I find the current century or millennium? Use the following simple functions: - sub get_century { - return int((((localtime(shift || time))[5] + 1999))/100); - } + sub get_century { + return int((((localtime(shift || time))[5] + 1999))/100); + } - sub get_millennium { - return 1+int((((localtime(shift || time))[5] + 1899))/1000); - } + sub get_millennium { + return 1+int((((localtime(shift || time))[5] + 1899))/1000); + } -On some systems, the C<POSIX> module's C<strftime()> function has been +On some systems, the L<POSIX> module's C<strftime()> function has been extended in a non-standard way to use a C<%C> format, which they sometimes claim is the "century". It isn't, because on most such systems, this is only the first two digits of the four-digit year, and @@ -466,71 +466,71 @@ millennium. You could just store all your dates as a number and then subtract. Life isn't always that simple though. -The C<Time::Piece> module, which comes with Perl, replaces C<localtime> +The L<Time::Piece> module, which comes with Perl, replaces L<localtime> with a version that returns an object. It also overloads the comparison operators so you can compare them directly: - use Time::Piece; - my $date1 = localtime( $some_time ); - my $date2 = localtime( $some_other_time ); + use Time::Piece; + my $date1 = localtime( $some_time ); + my $date2 = localtime( $some_other_time ); - if( $date1 < $date2 ) { - print "The date was in the past\n"; - } + if( $date1 < $date2 ) { + print "The date was in the past\n"; + } You can also get differences with a subtraction, which returns a -C<Time::Seconds> object: +L<Time::Seconds> object: - my $diff = $date1 - $date2; - print "The difference is ", $date_diff->days, " days\n"; + my $diff = $date1 - $date2; + print "The difference is ", $date_diff->days, " days\n"; -If you want to work with formatted dates, the C<Date::Manip>, -C<Date::Calc>, or C<DateTime> modules can help you. +If you want to work with formatted dates, the L<Date::Manip>, +L<Date::Calc>, or L<DateTime> modules can help you. =head2 How can I take a string and turn it into epoch seconds? If it's a regular enough string that it always has the same format, you can split it up and pass the parts to C<timelocal> in the standard -C<Time::Local> module. Otherwise, you should look into the C<Date::Calc>, -C<Date::Parse>, and C<Date::Manip> modules from CPAN. +L<Time::Local> module. Otherwise, you should look into the L<Date::Calc>, +L<Date::Parse>, and L<Date::Manip> modules from CPAN. =head2 How can I find the Julian Day? (contributed by brian d foy and Dave Cross) -You can use the C<Time::Piece> module, part of the Standard Library, +You can use the L<Time::Piece> module, part of the Standard Library, which can convert a date/time to a Julian Day: - $ perl -MTime::Piece -le 'print localtime->julian_day' - 2455607.7959375 + $ perl -MTime::Piece -le 'print localtime->julian_day' + 2455607.7959375 Or the modified Julian Day: - $ perl -MTime::Piece -le 'print localtime->mjd' - 55607.2961226851 + $ perl -MTime::Piece -le 'print localtime->mjd' + 55607.2961226851 Or even the day of the year (which is what some people think of as a Julian day): - $ perl -MTime::Piece -le 'print localtime->yday' - 45 + $ perl -MTime::Piece -le 'print localtime->yday' + 45 -You can also do the same things with the C<DateTime> module: +You can also do the same things with the L<DateTime> module: - $ perl -MDateTime -le'print DateTime->today->jd' - 2453401.5 - $ perl -MDateTime -le'print DateTime->today->mjd' - 53401 - $ perl -MDateTime -le'print DateTime->today->doy' - 31 + $ perl -MDateTime -le'print DateTime->today->jd' + 2453401.5 + $ perl -MDateTime -le'print DateTime->today->mjd' + 53401 + $ perl -MDateTime -le'print DateTime->today->doy' + 31 -You can use the C<Time::JulianDay> module available on CPAN. Ensure +You can use the L<Time::JulianDay> module available on CPAN. Ensure that you really want to find a Julian day, though, as many people have different ideas about Julian days (see http://www.hermetic.ch/cal_stud/jdn.htm for instance): - $ perl -MTime::JulianDay -le 'print local_julian_day( time )' - 55608 + $ perl -MTime::JulianDay -le 'print local_julian_day( time )' + 55608 =head2 How do I find yesterday's date? X<date> X<yesterday> X<DateTime> X<Date::Calc> X<Time::Local> @@ -540,24 +540,24 @@ X<timelocal> (contributed by brian d foy) To do it correctly, you can use one of the C<Date> modules since they -work with calendars instead of times. The C<DateTime> module makes it +work with calendars instead of times. The L<DateTime> module makes it simple, and give you the same time of day, only the day before, despite daylight saving time changes: - use DateTime; + use DateTime; - my $yesterday = DateTime->now->subtract( days => 1 ); + my $yesterday = DateTime->now->subtract( days => 1 ); - print "Yesterday was $yesterday\n"; + print "Yesterday was $yesterday\n"; -You can also use the C<Date::Calc> module using its C<Today_and_Now> +You can also use the L<Date::Calc> module using its C<Today_and_Now> function. - use Date::Calc qw( Today_and_Now Add_Delta_DHMS ); + use Date::Calc qw( Today_and_Now Add_Delta_DHMS ); - my @date_time = Add_Delta_DHMS( Today_and_Now(), -1, 0, 0, 0 ); + my @date_time = Add_Delta_DHMS( Today_and_Now(), -1, 0, 0, 0 ); - print "@date_time\n"; + print "@date_time\n"; Most people try to use the time rather than the calendar to figure out dates, but that assumes that days are twenty-four hours each. For @@ -565,31 +565,31 @@ most people, there are two days a year when they aren't: the switch to and from summer time throws this off. For example, the rest of the suggestions will be wrong sometimes: -Starting with Perl 5.10, C<Time::Piece> and C<Time::Seconds> are part +Starting with Perl 5.10, L<Time::Piece> and L<Time::Seconds> are part of the standard distribution, so you might think that you could do something like this: - use Time::Piece; - use Time::Seconds; + use Time::Piece; + use Time::Seconds; - my $yesterday = localtime() - ONE_DAY; # WRONG - print "Yesterday was $yesterday\n"; + my $yesterday = localtime() - ONE_DAY; # WRONG + print "Yesterday was $yesterday\n"; -The C<Time::Piece> module exports a new C<localtime> that returns an -object, and C<Time::Seconds> exports the C<ONE_DAY> constant that is a +The L<Time::Piece> module exports a new C<localtime> that returns an +object, and L<Time::Seconds> exports the C<ONE_DAY> constant that is a set number of seconds. This means that it always gives the time 24 hours ago, which is not always yesterday. This can cause problems around the end of daylight saving time when there's one day that is 25 hours long. -You have the same problem with C<Time::Local>, which will give the wrong +You have the same problem with L<Time::Local>, which will give the wrong answer for those same special cases: - # contributed by Gunnar Hjalmarsson - use Time::Local; - my $today = timelocal 0, 0, 12, ( localtime )[3..5]; - my ($d, $m, $y) = ( localtime $today-86400 )[3..5]; # WRONG - printf "Yesterday: %d-%02d-%02d\n", $y+1900, $m+1, $d; + # contributed by Gunnar Hjalmarsson + use Time::Local; + my $today = timelocal 0, 0, 12, ( localtime )[3..5]; + my ($d, $m, $y) = ( localtime $today-86400 )[3..5]; # WRONG + printf "Yesterday: %d-%02d-%02d\n", $y+1900, $m+1, $d; =head2 Does Perl have a Year 2000 or 2038 problem? Is Perl Y2K compliant? @@ -603,15 +603,15 @@ Starting with Perl 5.12, C<localtime> and C<gmtime> can handle dates past 03:14:08 January 19, 2038, when a 32-bit based time would overflow. You still might get a warning on a 32-bit C<perl>: - % perl5.12 -E 'say scalar localtime( 0x9FFF_FFFFFFFF )' - Integer overflow in hexadecimal number at -e line 1. - Wed Nov 1 19:42:39 5576711 + % perl5.12 -E 'say scalar localtime( 0x9FFF_FFFFFFFF )' + Integer overflow in hexadecimal number at -e line 1. + Wed Nov 1 19:42:39 5576711 On a 64-bit C<perl>, you can get even larger dates for those really long running projects: - % perl5.12 -E 'say scalar gmtime( 0x9FFF_FFFFFFFF )' - Thu Nov 2 00:42:39 5576711 + % perl5.12 -E 'say scalar gmtime( 0x9FFF_FFFFFFFF )' + Thu Nov 2 00:42:39 5576711 You're still out of luck if you need to keep track of decaying protons though. @@ -625,11 +625,11 @@ though. There are many ways to ensure that values are what you expect or want to accept. Besides the specific examples that we cover in the perlfaq, you can also look at the modules with "Assert" and "Validate" -in their names, along with other modules such as C<Regexp::Common>. +in their names, along with other modules such as L<Regexp::Common>. Some modules have validation for particular types of input, such -as C<Business::ISBN>, C<Business::CreditCard>, C<Email::Valid>, -and C<Data::Validate::IP>. +as L<Business::ISBN>, L<Business::CreditCard>, L<Email::Valid>, +and L<Data::Validate::IP>. =head2 How do I unescape a string? @@ -637,7 +637,7 @@ It depends just what you mean by "escape". URL escapes are dealt with in L<perlfaq9>. Shell escapes with the backslash (C<\>) character are removed with - s/\\(.)/$1/g; + s/\\(.)/$1/g; This won't expand C<"\n"> or C<"\t"> or any other special escapes. @@ -652,7 +652,7 @@ store the matched character in the back-reference C<\g1> and we use that to require that the same thing immediately follow it. We replace that part of the string with the character in C<$1>. - s/(.)\g1/$1/g; + s/(.)\g1/$1/g; We can also use the transliteration operator, C<tr///>. In this example, the search list side of our C<tr///> contains nothing, but @@ -663,8 +663,8 @@ replace the character with itself). However, the C<s> option squashes duplicated and consecutive characters in the string so a character does not show up next to itself - my $str = 'Haarlem'; # in the Netherlands - $str =~ tr///cs; # Now Harlem, like in New York + my $str = 'Haarlem'; # in the Netherlands + $str =~ tr///cs; # Now Harlem, like in New York =head2 How do I expand function calls in a string? @@ -676,7 +676,7 @@ function inside the braces used to dereference a reference. If we have more than one return value, we can construct and dereference an anonymous array. In this case, we call the function in list context. - print "The time values are @{ [localtime] }.\n"; + print "The time values are @{ [localtime] }.\n"; If we want to call the function in scalar context, we have to do a bit more work. We can really have any code we like inside the braces, so @@ -685,29 +685,29 @@ that is up to you, and you can use code inside the braces. Note that the use of parens creates a list context, so we need C<scalar> to force the scalar context on the function: - print "The time is ${\(scalar localtime)}.\n" + print "The time is ${\(scalar localtime)}.\n" - print "The time is ${ my $x = localtime; \$x }.\n"; + print "The time is ${ my $x = localtime; \$x }.\n"; If your function already returns a reference, you don't need to create the reference yourself. - sub timestamp { my $t = localtime; \$t } + sub timestamp { my $t = localtime; \$t } - print "The time is ${ timestamp() }.\n"; + print "The time is ${ timestamp() }.\n"; The C<Interpolation> module can also do a lot of magic for you. You can specify a variable name, in this case C<E>, to set up a tied hash that does the interpolation for you. It has several other methods to do this as well. - use Interpolation E => 'eval'; - print "The time values are $E{localtime()}.\n"; + use Interpolation E => 'eval'; + print "The time values are $E{localtime()}.\n"; In most cases, it is probably easier to simply use string concatenation, which also forces scalar context. - print "The time is " . localtime() . ".\n"; + print "The time is " . localtime() . ".\n"; =head2 How do I find matching/nesting anything? @@ -723,68 +723,68 @@ parser. If you are serious about writing a parser, there are a number of modules or oddities that will make your life a lot easier. There are -the CPAN modules C<Parse::RecDescent>, C<Parse::Yapp>, and -C<Text::Balanced>; and the C<byacc> program. Starting from perl 5.8 -the C<Text::Balanced> is part of the standard distribution. +the CPAN modules L<Parse::RecDescent>, L<Parse::Yapp>, and +L<Text::Balanced>; and the C<byacc> program. Starting from perl 5.8 +the L<Text::Balanced> is part of the standard distribution. One simple destructive, inside-out approach that you might try is to pull out the smallest nesting parts one at a time: - while (s/BEGIN((?:(?!BEGIN)(?!END).)*)END//gs) { - # do something with $1 - } + while (s/BEGIN((?:(?!BEGIN)(?!END).)*)END//gs) { + # do something with $1 + } A more complicated and sneaky approach is to make Perl's regular expression engine do it for you. This is courtesy Dean Inada, and rather has the nature of an Obfuscated Perl Contest entry, but it really does work: - # $_ contains the string to parse - # BEGIN and END are the opening and closing markers for the - # nested text. + # $_ contains the string to parse + # BEGIN and END are the opening and closing markers for the + # nested text. - @( = ('(',''); - @) = (')',''); - ($re=$_)=~s/((BEGIN)|(END)|.)/$)[!$3]\Q$1\E$([!$2]/gs; - @$ = (eval{/$re/},$@!~/unmatched/i); - print join("\n",@$[0..$#$]) if( $$[-1] ); + @( = ('(',''); + @) = (')',''); + ($re=$_)=~s/((BEGIN)|(END)|.)/$)[!$3]\Q$1\E$([!$2]/gs; + @$ = (eval{/$re/},$@!~/unmatched/i); + print join("\n",@$[0..$#$]) if( $$[-1] ); =head2 How do I reverse a string? Use C<reverse()> in scalar context, as documented in L<perlfunc/reverse>. - $reversed = reverse $string; + $reversed = reverse $string; =head2 How do I expand tabs in a string? You can do it yourself: - 1 while $string =~ s/\t+/' ' x (length($&) * 8 - length($`) % 8)/e; + 1 while $string =~ s/\t+/' ' x (length($&) * 8 - length($`) % 8)/e; -Or you can just use the C<Text::Tabs> module (part of the standard Perl +Or you can just use the L<Text::Tabs> module (part of the standard Perl distribution). - use Text::Tabs; - @expanded_lines = expand(@lines_with_tabs); + use Text::Tabs; + @expanded_lines = expand(@lines_with_tabs); =head2 How do I reformat a paragraph? -Use C<Text::Wrap> (part of the standard Perl distribution): +Use L<Text::Wrap> (part of the standard Perl distribution): - use Text::Wrap; - print wrap("\t", ' ', @paragraphs); + use Text::Wrap; + print wrap("\t", ' ', @paragraphs); -The paragraphs you give to C<Text::Wrap> should not contain embedded -newlines. C<Text::Wrap> doesn't justify the lines (flush-right). +The paragraphs you give to L<Text::Wrap> should not contain embedded +newlines. L<Text::Wrap> doesn't justify the lines (flush-right). -Or use the CPAN module C<Text::Autoformat>. Formatting files can be +Or use the CPAN module L<Text::Autoformat>. Formatting files can be easily done by making a shell alias, like so: - alias fmt="perl -i -MText::Autoformat -n0777 \ - -e 'print autoformat $_, {all=>1}' $*" + alias fmt="perl -i -MText::Autoformat -n0777 \ + -e 'print autoformat $_, {all=>1}' $*" -See the documentation for C<Text::Autoformat> to appreciate its many +See the documentation for L<Text::Autoformat> to appreciate its many capabilities. =head2 How can I access or change N characters of a string? @@ -794,17 +794,17 @@ To get the first character, for example, start at position 0 and grab the string of length 1. - $string = "Just another Perl Hacker"; - $first_char = substr( $string, 0, 1 ); # 'J' + $string = "Just another Perl Hacker"; + $first_char = substr( $string, 0, 1 ); # 'J' To change part of a string, you can use the optional fourth argument which is the replacement string. - substr( $string, 13, 4, "Perl 5.8.0" ); + substr( $string, 13, 4, "Perl 5.8.0" ); You can also use substr() as an lvalue. - substr( $string, 13, 4 ) = "Perl 5.8.0"; + substr( $string, 13, 4 ) = "Perl 5.8.0"; =head2 How do I change the Nth occurrence of something? @@ -813,29 +813,29 @@ to change the fifth occurrence of C<"whoever"> or C<"whomever"> into C<"whosoever"> or C<"whomsoever">, case insensitively. These all assume that $_ contains the string to be altered. - $count = 0; - s{((whom?)ever)}{ - ++$count == 5 # is it the 5th? - ? "${2}soever" # yes, swap - : $1 # renege and leave it there - }ige; + $count = 0; + s{((whom?)ever)}{ + ++$count == 5 # is it the 5th? + ? "${2}soever" # yes, swap + : $1 # renege and leave it there + }ige; In the more general case, you can use the C</g> modifier in a C<while> loop, keeping count of matches. - $WANT = 3; - $count = 0; - $_ = "One fish two fish red fish blue fish"; - while (/(\w+)\s+fish\b/gi) { - if (++$count == $WANT) { - print "The third fish is a $1 one.\n"; - } - } + $WANT = 3; + $count = 0; + $_ = "One fish two fish red fish blue fish"; + while (/(\w+)\s+fish\b/gi) { + if (++$count == $WANT) { + print "The third fish is a $1 one.\n"; + } + } That prints out: C<"The third fish is a red one."> You can also use a repetition count and repeated pattern like this: - /(?:\w+\s+fish\s+){2}(\w+)\s+fish/i; + /(?:\w+\s+fish\s+){2}(\w+)\s+fish/i; =head2 How can I count the number of occurrences of a substring within a string? @@ -843,9 +843,9 @@ There are a number of ways, with varying efficiency. If you want a count of a certain single character (X) within a string, you can use the C<tr///> function like so: - $string = "ThisXlineXhasXsomeXx'sXinXit"; - $count = ($string =~ tr/X//); - print "There are $count X characters in the string"; + $string = "ThisXlineXhasXsomeXx'sXinXit"; + $count = ($string =~ tr/X//); + print "There are $count X characters in the string"; This is fine if you are just looking for a single character. However, if you are trying to count multiple character substrings within a @@ -853,14 +853,14 @@ larger string, C<tr///> won't work. What you can do is wrap a while() loop around a global pattern match. For example, let's count negative integers: - $string = "-9 55 48 -2 23 -76 4 14 -44"; - while ($string =~ /-\d+/g) { $count++ } - print "There are $count negative numbers in the string"; + $string = "-9 55 48 -2 23 -76 4 14 -44"; + while ($string =~ /-\d+/g) { $count++ } + print "There are $count negative numbers in the string"; Another version uses a global match in list context, then assigns the result to a scalar, producing a count of the number of matches. - $count = () = $string =~ /-\d+/g; + $count = () = $string =~ /-\d+/g; =head2 How do I capitalize all the words on one line? X<Text::Autoformat> X<capitalize> X<case, title> X<case, sentence> @@ -870,20 +870,20 @@ X<Text::Autoformat> X<capitalize> X<case, title> X<case, sentence> Damian Conway's L<Text::Autoformat> handles all of the thinking for you. - use Text::Autoformat; - my $x = "Dr. Strangelove or: How I Learned to Stop ". - "Worrying and Love the Bomb"; + use Text::Autoformat; + my $x = "Dr. Strangelove or: How I Learned to Stop ". + "Worrying and Love the Bomb"; - print $x, "\n"; - for my $style (qw( sentence title highlight )) { - print autoformat($x, { case => $style }), "\n"; - } + print $x, "\n"; + for my $style (qw( sentence title highlight )) { + print autoformat($x, { case => $style }), "\n"; + } How do you want to capitalize those words? - FRED AND BARNEY'S LODGE # all uppercase - Fred And Barney's Lodge # title case - Fred and Barney's Lodge # highlight case + FRED AND BARNEY'S LODGE # all uppercase + Fred And Barney's Lodge # title case + Fred and Barney's Lodge # highlight case It's not as easy a problem as it looks. How many words do you think are in there? Wait for it... wait for it.... If you answered 5 @@ -891,52 +891,52 @@ you're right. Perl words are groups of C<\w+>, but that's not what you want to capitalize. How is Perl supposed to know not to capitalize that C<s> after the apostrophe? You could try a regular expression: - $string =~ s/ ( - (^\w) #at the beginning of the line - | # or - (\s\w) #preceded by whitespace - ) - /\U$1/xg; + $string =~ s/ ( + (^\w) #at the beginning of the line + | # or + (\s\w) #preceded by whitespace + ) + /\U$1/xg; - $string =~ s/([\w']+)/\u\L$1/g; + $string =~ s/([\w']+)/\u\L$1/g; Now, what if you don't want to capitalize that "and"? Just use L<Text::Autoformat> and get on with the next problem. :) =head2 How can I split a [character]-delimited string except when inside [character]? -Several modules can handle this sort of parsing--C<Text::Balanced>, -C<Text::CSV>, C<Text::CSV_XS>, and C<Text::ParseWords>, among others. +Several modules can handle this sort of parsing--L<Text::Balanced>, +L<Text::CSV>, L<Text::CSV_XS>, and L<Text::ParseWords>, among others. Take the example case of trying to split a string that is comma-separated into its different fields. You can't use C<split(/,/)> because you shouldn't split if the comma is inside quotes. For example, take a data line like this: - SAR001,"","Cimetrix, Inc","Bob Smith","CAM",N,8,1,0,7,"Error, Core Dumped" + SAR001,"","Cimetrix, Inc","Bob Smith","CAM",N,8,1,0,7,"Error, Core Dumped" Due to the restriction of the quotes, this is a fairly complex problem. Thankfully, we have Jeffrey Friedl, author of I<Mastering Regular Expressions>, to handle these for us. He suggests (assuming your string is contained in C<$text>): - @new = (); - push(@new, $+) while $text =~ m{ - "([^\"\\]*(?:\\.[^\"\\]*)*)",? # groups the phrase inside the quotes - | ([^,]+),? - | , - }gx; - push(@new, undef) if substr($text,-1,1) eq ','; + @new = (); + push(@new, $+) while $text =~ m{ + "([^\"\\]*(?:\\.[^\"\\]*)*)",? # groups the phrase inside the quotes + | ([^,]+),? + | , + }gx; + push(@new, undef) if substr($text,-1,1) eq ','; If you want to represent quotation marks inside a quotation-mark-delimited field, escape them with backslashes (eg, C<"like \"this\"">. -Alternatively, the C<Text::ParseWords> module (part of the standard +Alternatively, the L<Text::ParseWords> module (part of the standard Perl distribution) lets you say: - use Text::ParseWords; - @new = quotewords(",", 0, $text); + use Text::ParseWords; + @new = quotewords(",", 0, $text); =head2 How do I strip blank space from the beginning/end of a string? @@ -946,14 +946,14 @@ A substitution can do this for you. For a single line, you want to replace all the leading or trailing whitespace with nothing. You can do that with a pair of substitutions: - s/^\s+//; - s/\s+$//; + s/^\s+//; + s/\s+$//; You can also write that as a single substitution, although it turns out the combined statement is slower than the separate ones. That might not matter to you, though: - s/^\s+|\s+$//g; + s/^\s+|\s+$//g; In this regular expression, the alternation matches either at the beginning or the end of the string since the anchors have a lower @@ -965,10 +965,10 @@ the newline to the output, which has the added benefit of preserving "blank" (consisting entirely of whitespace) lines which the C<^\s+> would remove all by itself: - while( <> ) { - s/^\s+|\s+$//g; - print "$_\n"; - } + while( <> ) { + s/^\s+|\s+$//g; + print "$_\n"; + } For a multi-line string, you can apply the regular expression to each logical line in the string by adding the C</m> flag (for @@ -976,7 +976,7 @@ logical line in the string by adding the C</m> flag (for embedded newline, so it doesn't remove it. This pattern still removes the newline at the end of the string: - $string =~ s/^\s+|\s+$//gm; + $string =~ s/^\s+|\s+$//gm; Remember that lines consisting entirely of whitespace will disappear, since the first part of the alternation can match the entire string @@ -984,7 +984,7 @@ and replace it with nothing. If you need to keep embedded blank lines, you have to do a little more work. Instead of matching any whitespace (since that includes a newline), just match the other whitespace: - $string =~ s/^[\t\f ]+|[\t\f ]+$//mg; + $string =~ s/^[\t\f ]+|[\t\f ]+$//mg; =head2 How do I pad a string with blanks or pad a number with zeroes? @@ -1001,20 +1001,20 @@ truncate the result. The C<pack> function can only pad strings on the right with blanks and it will truncate the result to a maximum length of C<$pad_len>. - # Left padding a string with blanks (no truncation): - $padded = sprintf("%${pad_len}s", $text); - $padded = sprintf("%*s", $pad_len, $text); # same thing + # Left padding a string with blanks (no truncation): + $padded = sprintf("%${pad_len}s", $text); + $padded = sprintf("%*s", $pad_len, $text); # same thing - # Right padding a string with blanks (no truncation): - $padded = sprintf("%-${pad_len}s", $text); - $padded = sprintf("%-*s", $pad_len, $text); # same thing + # Right padding a string with blanks (no truncation): + $padded = sprintf("%-${pad_len}s", $text); + $padded = sprintf("%-*s", $pad_len, $text); # same thing - # Left padding a number with 0 (no truncation): - $padded = sprintf("%0${pad_len}d", $num); - $padded = sprintf("%0*d", $pad_len, $num); # same thing + # Left padding a number with 0 (no truncation): + $padded = sprintf("%0${pad_len}d", $num); + $padded = sprintf("%0*d", $pad_len, $num); # same thing - # Right padding a string with blanks using pack (will truncate): - $padded = pack("A$pad_len",$text); + # Right padding a string with blanks using pack (will truncate): + $padded = pack("A$pad_len",$text); If you need to pad with a character other than blank or zero you can use one of the following methods. They all generate a pad string with the @@ -1023,13 +1023,13 @@ not truncate C<$text>. Left and right padding with any character, creating a new string: - $padded = $pad_char x ( $pad_len - length( $text ) ) . $text; - $padded = $text . $pad_char x ( $pad_len - length( $text ) ); + $padded = $pad_char x ( $pad_len - length( $text ) ) . $text; + $padded = $text . $pad_char x ( $pad_len - length( $text ) ); Left and right padding with any character, modifying C<$text> directly: - substr( $text, 0, 0 ) = $pad_char x ( $pad_len - length( $text ) ); - $text .= $pad_char x ( $pad_len - length( $text ) ); + substr( $text, 0, 0 ) = $pad_char x ( $pad_len - length( $text ) ); + $text .= $pad_char x ( $pad_len - length( $text ) ); =head2 How do I extract selected columns from a string? @@ -1038,31 +1038,31 @@ Left and right padding with any character, modifying C<$text> directly: If you know the columns that contain the data, you can use C<substr> to extract a single column. - my $column = substr( $line, $start_column, $length ); + my $column = substr( $line, $start_column, $length ); You can use C<split> if the columns are separated by whitespace or some other delimiter, as long as whitespace or the delimiter cannot appear as part of the data. - my $line = ' fred barney betty '; - my @columns = split /\s+/, $line; - # ( '', 'fred', 'barney', 'betty' ); + my $line = ' fred barney betty '; + my @columns = split /\s+/, $line; + # ( '', 'fred', 'barney', 'betty' ); - my $line = 'fred||barney||betty'; - my @columns = split /\|/, $line; - # ( 'fred', '', 'barney', '', 'betty' ); + my $line = 'fred||barney||betty'; + my @columns = split /\|/, $line; + # ( 'fred', '', 'barney', '', 'betty' ); If you want to work with comma-separated values, don't do this since that format is a bit more complicated. Use one of the modules that -handle that format, such as C<Text::CSV>, C<Text::CSV_XS>, or -C<Text::CSV_PP>. +handle that format, such as L<Text::CSV>, L<Text::CSV_XS>, or +L<Text::CSV_PP>. If you want to break apart an entire line of fixed columns, you can use C<unpack> with the A (ASCII) format. By using a number after the format specifier, you can denote the column width. See the C<pack> and C<unpack> entries in L<perlfunc> for more details. - my @fields = unpack( $line, "A8 A8 A8 A16 A4" ); + my @fields = unpack( $line, "A8 A8 A8 A16 A4" ); Note that spaces in the format argument to C<unpack> do not denote literal spaces. If you have space separated data, you may want C<split> instead. @@ -1071,28 +1071,28 @@ spaces. If you have space separated data, you may want C<split> instead. (contributed by brian d foy) -You can use the Text::Soundex module. If you want to do fuzzy or close -matching, you might also try the C<String::Approx>, and -C<Text::Metaphone>, and C<Text::DoubleMetaphone> modules. +You can use the C<Text::Soundex> module. If you want to do fuzzy or close +matching, you might also try the L<String::Approx>, and +L<Text::Metaphone>, and L<Text::DoubleMetaphone> modules. =head2 How can I expand variables in text strings? (contributed by brian d foy) If you can avoid it, don't, or if you can use a templating system, -such as C<Text::Template> or C<Template> Toolkit, do that instead. You +such as L<Text::Template> or L<Template> Toolkit, do that instead. You might even be able to get the job done with C<sprintf> or C<printf>: - my $string = sprintf 'Say hello to %s and %s', $foo, $bar; + my $string = sprintf 'Say hello to %s and %s', $foo, $bar; However, for the one-off simple case where I don't want to pull out a full templating system, I'll use a string that has two Perl scalar variables in it. In this example, I want to expand C<$foo> and C<$bar> to their variable's values: - my $foo = 'Fred'; - my $bar = 'Barney'; - $string = 'Say hello to $foo and $bar'; + my $foo = 'Fred'; + my $bar = 'Barney'; + $string = 'Say hello to $foo and $bar'; One way I can do this involves the substitution operator and a double C</e> flag. The first C</e> evaluates C<$1> on the replacement side and @@ -1100,7 +1100,7 @@ turns it into C<$foo>. The second /e starts with C<$foo> and replaces it with its value. C<$foo>, then, turns into 'Fred', and that's finally what's left in the string: - $string =~ s/(\$\w+)/$1/eeg; # 'Say hello to Fred and Barney' + $string =~ s/(\$\w+)/$1/eeg; # 'Say hello to Fred and Barney' The C</e> will also silently ignore violations of strict, replacing undefined variable names with the empty string. Since I'm using the @@ -1115,18 +1115,18 @@ can check the hash to ensure the value exists, and if it doesn't, I can replace the missing value with a marker, in this case C<???> to signal that I missed something: - my $string = 'This has $foo and $bar'; + my $string = 'This has $foo and $bar'; - my %Replacements = ( - foo => 'Fred', - ); + my %Replacements = ( + foo => 'Fred', + ); - # $string =~ s/\$(\w+)/$Replacements{$1}/g; - $string =~ s/\$(\w+)/ - exists $Replacements{$1} ? $Replacements{$1} : '???' - /eg; + # $string =~ s/\$(\w+)/$Replacements{$1}/g; + $string =~ s/\$(\w+)/ + exists $Replacements{$1} ? $Replacements{$1} : '???' + /eg; - print $string; + print $string; =head2 What's wrong with always quoting "$vars"? @@ -1138,26 +1138,26 @@ have a string, why do you need more? If you get used to writing odd things like these: - print "$var"; # BAD - $new = "$old"; # BAD - somefunc("$var"); # BAD + print "$var"; # BAD + $new = "$old"; # BAD + somefunc("$var"); # BAD You'll be in trouble. Those should (in 99.8% of the cases) be the simpler and more direct: - print $var; - $new = $old; - somefunc($var); + print $var; + $new = $old; + somefunc($var); Otherwise, besides slowing you down, you're going to break code when the thing in the scalar is actually neither a string nor a number, but a reference: - func(\@array); - sub func { - my $aref = shift; - my $oref = "$aref"; # WRONG - } + func(\@array); + sub func { + my $aref = shift; + my $oref = "$aref"; # WRONG + } You can also get into subtle problems on those few operations in Perl that actually do care about the difference between a string and a @@ -1166,9 +1166,9 @@ syscall() function. Stringification also destroys arrays. - @lines = `command`; - print "@lines"; # WRONG - extra blanks - print @lines; # right + @lines = `command`; + print "@lines"; # WRONG - extra blanks + print @lines; # right =head2 Why don't my E<lt>E<lt>HERE documents work? @@ -1228,29 +1228,29 @@ subsequent line. This works with leading special strings, dynamically determined: - $remember_the_main = fix<<' MAIN_INTERPRETER_LOOP'; - @@@ int - @@@ runops() { - @@@ SAVEI32(runlevel); - @@@ runlevel++; - @@@ while ( op = (*op->op_ppaddr)() ); - @@@ TAINT_NOT; - @@@ return 0; - @@@ } - MAIN_INTERPRETER_LOOP + $remember_the_main = fix<<' MAIN_INTERPRETER_LOOP'; + @@@ int + @@@ runops() { + @@@ SAVEI32(runlevel); + @@@ runlevel++; + @@@ while ( op = (*op->op_ppaddr)() ); + @@@ TAINT_NOT; + @@@ return 0; + @@@ } + MAIN_INTERPRETER_LOOP Or with a fixed amount of leading whitespace, with remaining indentation correctly preserved: - $poem = fix<<EVER_ON_AND_ON; + $poem = fix<<EVER_ON_AND_ON; Now far ahead the Road has gone, - And I must follow, if I can, + And I must follow, if I can, Pursuing it with eager feet, - Until it joins some larger way + Until it joins some larger way Where many paths and errands meet. - And whither then? I cannot say. - --Bilbo in /usr/src/perl/pp_ctl.c - EVER_ON_AND_ON + And whither then? I cannot say. + --Bilbo in /usr/src/perl/pp_ctl.c + EVER_ON_AND_ON =head1 Data: Arrays @@ -1262,20 +1262,20 @@ A list is a fixed collection of scalars. An array is a variable that holds a variable collection of scalars. An array can supply its collection for list operations, so list operations also work on arrays: - # slices - ( 'dog', 'cat', 'bird' )[2,3]; - @animals[2,3]; + # slices + ( 'dog', 'cat', 'bird' )[2,3]; + @animals[2,3]; - # iteration - foreach ( qw( dog cat bird ) ) { ... } - foreach ( @animals ) { ... } + # iteration + foreach ( qw( dog cat bird ) ) { ... } + foreach ( @animals ) { ... } - my @three = grep { length == 3 } qw( dog cat bird ); - my @three = grep { length == 3 } @animals; + my @three = grep { length == 3 } qw( dog cat bird ); + my @three = grep { length == 3 } @animals; - # supply an argument list - wash_animals( qw( dog cat bird ) ); - wash_animals( @animals ); + # supply an argument list + wash_animals( qw( dog cat bird ) ); + wash_animals( @animals ); Array operations, which change the scalars, rearranges them, or adds or subtracts some scalars, only work on arrays. These can't work on a @@ -1284,21 +1284,21 @@ C<push>, C<pop>, and C<splice>. An array can also change its length: - $#animals = 1; # truncate to two elements - $#animals = 10000; # pre-extend to 10,001 elements + $#animals = 1; # truncate to two elements + $#animals = 10000; # pre-extend to 10,001 elements You can change an array element, but you can't change a list element: - $animals[0] = 'Rottweiler'; - qw( dog cat bird )[0] = 'Rottweiler'; # syntax error! + $animals[0] = 'Rottweiler'; + qw( dog cat bird )[0] = 'Rottweiler'; # syntax error! - foreach ( @animals ) { - s/^d/fr/; # works fine - } + foreach ( @animals ) { + s/^d/fr/; # works fine + } - foreach ( qw( dog cat bird ) ) { - s/^d/fr/; # Error! Modification of read only value! - } + foreach ( qw( dog cat bird ) ) { + s/^d/fr/; # Error! Modification of read only value! + } However, if the list element is itself a variable, it appears that you can change a list element. However, the list element is the variable, not @@ -1310,14 +1310,14 @@ You also have to be careful about context. You can assign an array to a scalar to get the number of elements in the array. This only works for arrays, though: - my $count = @animals; # only works with arrays + my $count = @animals; # only works with arrays If you try to do the same thing with what you think is a list, you get a quite different result. Although it looks like you have a list on the righthand side, Perl actually sees a bunch of scalars separated by a comma: - my $scalar = ( 'dog', 'cat', 'bird' ); # $scalar gets bird + my $scalar = ( 'dog', 'cat', 'bird' ); # $scalar gets bird Since you're assigning to a scalar, the righthand side is in scalar context. The comma operator (yes, it's an operator!) in scalar @@ -1327,7 +1327,7 @@ that list-lookalike assigns to C<$scalar> it's rightmost value. Many people mess this up because they choose a list-lookalike whose last element is also the count they expect: - my $scalar = ( 1, 2, 3 ); # $scalar gets 3, accidentally + my $scalar = ( 1, 2, 3 ); # $scalar gets 3, accidentally =head2 What is the difference between $array[1] and @array[1]? @@ -1355,13 +1355,13 @@ context to the righthand side. This can lead to unexpected results. For instance, if you want to read a single line from a filehandle, assigning to a scalar value is fine: - $array[1] = <STDIN>; + $array[1] = <STDIN>; However, in list context, the line input operator returns all of the lines as a list. The first line goes into C<@array[1]> and the rest of the lines mysteriously disappear: - @array[1] = <STDIN>; # most likely not what you want + @array[1] = <STDIN>; # most likely not what you want Either the C<use warnings> pragma or the B<-w> flag will warn you when you use an array slice with a single index. @@ -1378,21 +1378,21 @@ create the hash then extract the keys. It's not important how you create that hash: just that you use C<keys> to get the unique elements. - my %hash = map { $_, 1 } @array; - # or a hash slice: @hash{ @array } = (); - # or a foreach: $hash{$_} = 1 foreach ( @array ); + my %hash = map { $_, 1 } @array; + # or a hash slice: @hash{ @array } = (); + # or a foreach: $hash{$_} = 1 foreach ( @array ); - my @unique = keys %hash; + my @unique = keys %hash; If you want to use a module, try the C<uniq> function from -C<List::MoreUtils>. In list context it returns the unique elements, +L<List::MoreUtils>. In list context it returns the unique elements, preserving their order in the list. In scalar context, it returns the number of unique elements. - use List::MoreUtils qw(uniq); + use List::MoreUtils qw(uniq); - my @unique = uniq( 1, 2, 3, 4, 4, 5, 6, 5, 7 ); # 1,2,3,4,5,6,7 - my $unique = uniq( 1, 2, 3, 4, 4, 5, 6, 5, 7 ); # 7 + my @unique = uniq( 1, 2, 3, 4, 4, 5, 6, 5, 7 ); # 1,2,3,4,5,6,7 + my $unique = uniq( 1, 2, 3, 4, 4, 5, 6, 5, 7 ); # 7 You can also go through each element and skip the ones you've seen before. Use a hash to keep track. The first time the loop sees an @@ -1404,20 +1404,19 @@ the hash I<and> the value for that key is true (since it's not 0 or C<undef>), so the next skips that iteration and the loop goes to the next element. - my @unique = (); - my %seen = (); + my @unique = (); + my %seen = (); - foreach my $elem ( @array ) - { - next if $seen{ $elem }++; - push @unique, $elem; - } + foreach my $elem ( @array ) { + next if $seen{ $elem }++; + push @unique, $elem; + } You can write this more briefly using a grep, which does the same thing. - my %seen = (); - my @unique = grep { ! $seen{ $_ }++ } @array; + my %seen = (); + my @unique = grep { ! $seen{ $_ }++ } @array; =head2 How can I tell whether a certain element is contained in a list or array? @@ -1431,26 +1430,24 @@ That being said, there are several ways to approach this. In Perl 5.10 and later, you can use the smart match operator to check that an item is contained in an array or a hash: - use 5.010; + use 5.010; - if( $item ~~ @array ) - { - say "The array contains $item" - } + if( $item ~~ @array ) { + say "The array contains $item" + } - if( $item ~~ %hash ) - { - say "The hash contains $item" - } + if( $item ~~ %hash ) { + say "The hash contains $item" + } With earlier versions of Perl, you have to do a bit more work. If you are going to make this query many times over arbitrary string values, the fastest way is probably to invert the original array and maintain a hash whose keys are the first array's values: - @blues = qw/azure cerulean teal turquoise lapis-lazuli/; - %is_blue = (); - for (@blues) { $is_blue{$_} = 1 } + @blues = qw/azure cerulean teal turquoise lapis-lazuli/; + %is_blue = (); + for (@blues) { $is_blue{$_} = 1 } Now you can check whether C<$is_blue{$some_color}>. It might have been a good idea to keep the blues all in a hash in the first place. @@ -1458,19 +1455,19 @@ been a good idea to keep the blues all in a hash in the first place. If the values are all small integers, you could use a simple indexed array. This kind of an array will take up less space: - @primes = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31); - @is_tiny_prime = (); - for (@primes) { $is_tiny_prime[$_] = 1 } - # or simply @istiny_prime[@primes] = (1) x @primes; + @primes = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31); + @is_tiny_prime = (); + for (@primes) { $is_tiny_prime[$_] = 1 } + # or simply @istiny_prime[@primes] = (1) x @primes; Now you check whether $is_tiny_prime[$some_number]. If the values in question are integers instead of strings, you can save quite a lot of space by using bit strings instead: - @articles = ( 1..10, 150..2000, 2017 ); - undef $read; - for (@articles) { vec($read,$_,1) = 1 } + @articles = ( 1..10, 150..2000, 2017 ); + undef $read; + for (@articles) { vec($read,$_,1) = 1 } Now check whether C<vec($read,$n,1)> is true for some C<$n>. @@ -1478,43 +1475,43 @@ These methods guarantee fast individual tests but require a re-organization of the original list or array. They only pay off if you have to test multiple values against the same array. -If you are testing only once, the standard module C<List::Util> exports +If you are testing only once, the standard module L<List::Util> exports the function C<first> for this purpose. It works by stopping once it finds the element. It's written in C for speed, and its Perl equivalent looks like this subroutine: - sub first (&@) { - my $code = shift; - foreach (@_) { - return $_ if &{$code}(); - } - undef; - } + sub first (&@) { + my $code = shift; + foreach (@_) { + return $_ if &{$code}(); + } + undef; + } If speed is of little concern, the common idiom uses grep in scalar context (which returns the number of items that passed its condition) to traverse the entire list. This does have the benefit of telling you how many matches it found, though. - my $is_there = grep $_ eq $whatever, @array; + my $is_there = grep $_ eq $whatever, @array; If you want to actually extract the matching elements, simply use grep in list context. - my @matches = grep $_ eq $whatever, @array; + my @matches = grep $_ eq $whatever, @array; =head2 How do I compute the difference of two arrays? How do I compute the intersection of two arrays? Use a hash. Here's code to do both and more. It assumes that each element is unique in a given array: - @union = @intersection = @difference = (); - %count = (); - foreach $element (@array1, @array2) { $count{$element}++ } - foreach $element (keys %count) { - push @union, $element; - push @{ $count{$element} > 1 ? \@intersection : \@difference }, $element; - } + @union = @intersection = @difference = (); + %count = (); + foreach $element (@array1, @array2) { $count{$element}++ } + foreach $element (keys %count) { + push @union, $element; + push @{ $count{$element} > 1 ? \@intersection : \@difference }, $element; + } Note that this is the I<symmetric difference>, that is, all elements in either A or in B but not in both. Think of it as an xor operation. @@ -1524,59 +1521,57 @@ in either A or in B but not in both. Think of it as an xor operation. With Perl 5.10 and later, the smart match operator can give you the answer with the least amount of work: - use 5.010; + use 5.010; - if( @array1 ~~ @array2 ) - { - say "The arrays are the same"; - } + if( @array1 ~~ @array2 ) { + say "The arrays are the same"; + } - if( %hash1 ~~ %hash2 ) # doesn't check values! - { - say "The hash keys are the same"; - } + if( %hash1 ~~ %hash2 ) # doesn't check values! { + say "The hash keys are the same"; + } The following code works for single-level arrays. It uses a stringwise comparison, and does not distinguish defined versus undefined empty strings. Modify if you have other needs. - $are_equal = compare_arrays(\@frogs, \@toads); + $are_equal = compare_arrays(\@frogs, \@toads); - sub compare_arrays { - my ($first, $second) = @_; - no warnings; # silence spurious -w undef complaints - return 0 unless @$first == @$second; - for (my $i = 0; $i < @$first; $i++) { - return 0 if $first->[$i] ne $second->[$i]; - } - return 1; - } + sub compare_arrays { + my ($first, $second) = @_; + no warnings; # silence spurious -w undef complaints + return 0 unless @$first == @$second; + for (my $i = 0; $i < @$first; $i++) { + return 0 if $first->[$i] ne $second->[$i]; + } + return 1; + } For multilevel structures, you may wish to use an approach more -like this one. It uses the CPAN module C<FreezeThaw>: +like this one. It uses the CPAN module L<FreezeThaw>: - use FreezeThaw qw(cmpStr); - @a = @b = ( "this", "that", [ "more", "stuff" ] ); + use FreezeThaw qw(cmpStr); + @a = @b = ( "this", "that", [ "more", "stuff" ] ); - printf "a and b contain %s arrays\n", - cmpStr(\@a, \@b) == 0 - ? "the same" - : "different"; + printf "a and b contain %s arrays\n", + cmpStr(\@a, \@b) == 0 + ? "the same" + : "different"; This approach also works for comparing hashes. Here we'll demonstrate two different answers: - use FreezeThaw qw(cmpStr cmpStrHard); + use FreezeThaw qw(cmpStr cmpStrHard); - %a = %b = ( "this" => "that", "extra" => [ "more", "stuff" ] ); - $a{EXTRA} = \%b; - $b{EXTRA} = \%a; + %a = %b = ( "this" => "that", "extra" => [ "more", "stuff" ] ); + $a{EXTRA} = \%b; + $b{EXTRA} = \%a; - printf "a and b contain %s hashes\n", - cmpStr(\%a, \%b) == 0 ? "the same" : "different"; + printf "a and b contain %s hashes\n", + cmpStr(\%a, \%b) == 0 ? "the same" : "different"; - printf "a and b contain %s hashes\n", - cmpStrHard(\%a, \%b) == 0 ? "the same" : "different"; + printf "a and b contain %s hashes\n", + cmpStrHard(\%a, \%b) == 0 ? "the same" : "different"; The first reports that both those the hashes contain the same data, @@ -1586,34 +1581,34 @@ an exercise to the reader. =head2 How do I find the first array element for which a condition is true? To find the first array element which satisfies a condition, you can -use the C<first()> function in the C<List::Util> module, which comes +use the C<first()> function in the L<List::Util> module, which comes with Perl 5.8. This example finds the first element that contains "Perl". - use List::Util qw(first); + use List::Util qw(first); - my $element = first { /Perl/ } @array; + my $element = first { /Perl/ } @array; -If you cannot use C<List::Util>, you can make your own loop to do the +If you cannot use L<List::Util>, you can make your own loop to do the same thing. Once you find the element, you stop the loop with last. - my $found; - foreach ( @array ) { - if( /Perl/ ) { $found = $_; last } - } + my $found; + foreach ( @array ) { + if( /Perl/ ) { $found = $_; last } + } If you want the array index, you can iterate through the indices and check the array element at each index until you find one that satisfies the condition. - my( $found, $index ) = ( undef, -1 ); - for( $i = 0; $i < @array; $i++ ) { - if( $array[$i] =~ /Perl/ ) { - $found = $array[$i]; - $index = $i; - last; - } - } + my( $found, $index ) = ( undef, -1 ); + for( $i = 0; $i < @array; $i++ ) { + if( $array[$i] =~ /Perl/ ) { + $found = $array[$i]; + $index = $i; + last; + } + } =head2 How do I handle linked lists? @@ -1633,7 +1628,7 @@ the chain. Steve Lembark goes through the details in his YAPC::NA 2009 talk "Perly Linked Lists" ( http://www.slideshare.net/lembark/perly-linked-lists ), -although you can just use his C<LinkedList::Single> module. +although you can just use his L<LinkedList::Single> module. =head2 How do I handle circular lists? X<circular> X<array> X<Tie::Cycle> X<Array::Iterator::Circular> @@ -1644,65 +1639,65 @@ X<cycle> X<modulus> If you want to cycle through an array endlessly, you can increment the index modulo the number of elements in the array: - my @array = qw( a b c ); - my $i = 0; + my @array = qw( a b c ); + my $i = 0; - while( 1 ) { - print $array[ $i++ % @array ], "\n"; - last if $i > 20; - } + while( 1 ) { + print $array[ $i++ % @array ], "\n"; + last if $i > 20; + } -You can also use C<Tie::Cycle> to use a scalar that always has the +You can also use L<Tie::Cycle> to use a scalar that always has the next element of the circular array: - use Tie::Cycle; + use Tie::Cycle; - tie my $cycle, 'Tie::Cycle', [ qw( FFFFFF 000000 FFFF00 ) ]; + tie my $cycle, 'Tie::Cycle', [ qw( FFFFFF 000000 FFFF00 ) ]; - print $cycle; # FFFFFF - print $cycle; # 000000 - print $cycle; # FFFF00 + print $cycle; # FFFFFF + print $cycle; # 000000 + print $cycle; # FFFF00 -The C<Array::Iterator::Circular> creates an iterator object for +The L<Array::Iterator::Circular> creates an iterator object for circular arrays: - use Array::Iterator::Circular; + use Array::Iterator::Circular; - my $color_iterator = Array::Iterator::Circular->new( - qw(red green blue orange) - ); + my $color_iterator = Array::Iterator::Circular->new( + qw(red green blue orange) + ); - foreach ( 1 .. 20 ) { - print $color_iterator->next, "\n"; - } + foreach ( 1 .. 20 ) { + print $color_iterator->next, "\n"; + } =head2 How do I shuffle an array randomly? If you either have Perl 5.8.0 or later installed, or if you have Scalar-List-Utils 1.03 or later installed, you can say: - use List::Util 'shuffle'; + use List::Util 'shuffle'; - @shuffled = shuffle(@list); + @shuffled = shuffle(@list); If not, you can use a Fisher-Yates shuffle. - sub fisher_yates_shuffle { - my $deck = shift; # $deck is a reference to an array - return unless @$deck; # must not be empty! + sub fisher_yates_shuffle { + my $deck = shift; # $deck is a reference to an array + return unless @$deck; # must not be empty! - my $i = @$deck; - while (--$i) { - my $j = int rand ($i+1); - @$deck[$i,$j] = @$deck[$j,$i]; - } - } + my $i = @$deck; + while (--$i) { + my $j = int rand ($i+1); + @$deck[$i,$j] = @$deck[$j,$i]; + } + } - # shuffle my mpeg collection - # - my @mpeg = <audio/*/*.mp3>; - fisher_yates_shuffle( \@mpeg ); # randomize @mpeg in place - print @mpeg; + # shuffle my mpeg collection + # + my @mpeg = <audio/*/*.mp3>; + fisher_yates_shuffle( \@mpeg ); # randomize @mpeg in place + print @mpeg; Note that the above implementation shuffles an array in place, unlike the C<List::Util::shuffle()> which takes a list and returns @@ -1711,12 +1706,12 @@ a new shuffled list. You've probably seen shuffling algorithms that work using splice, randomly picking another element to swap the current element with - srand; - @new = (); - @old = 1 .. 10; # just a demo - while (@old) { - push(@new, splice(@old, rand @old, 1)); - } + srand; + @new = (); + @old = 1 .. 10; # just a demo + while (@old) { + push(@new, splice(@old, rand @old, 1)); + } This is bad because splice is already O(N), and since you do it N times, you just invented a quadratic algorithm; that is, O(N**2). @@ -1727,31 +1722,31 @@ won't notice this until you have rather largish arrays. Use C<for>/C<foreach>: - for (@lines) { - s/foo/bar/; # change that word - tr/XZ/ZX/; # swap those letters - } + for (@lines) { + s/foo/bar/; # change that word + tr/XZ/ZX/; # swap those letters + } Here's another; let's compute spherical volumes: - for (@volumes = @radii) { # @volumes has changed parts - $_ **= 3; - $_ *= (4/3) * 3.14159; # this will be constant folded - } + for (@volumes = @radii) { # @volumes has changed parts + $_ **= 3; + $_ *= (4/3) * 3.14159; # this will be constant folded + } which can also be done with C<map()> which is made to transform one list into another: - @volumes = map {$_ ** 3 * (4/3) * 3.14159} @radii; + @volumes = map {$_ ** 3 * (4/3) * 3.14159} @radii; If you want to do the same thing to modify the values of the hash, you can use the C<values> function. As of Perl 5.6 the values are not copied, so if you modify $orbit (in this case), you modify the value. - for $orbit ( values %orbits ) { - ($orbit **= 3) *= (4/3) * 3.14159; - } + for $orbit ( values %orbits ) { + ($orbit **= 3) *= (4/3) * 3.14159; + } Prior to perl 5.6 C<values> returned copies of the values, so older perl code often contains constructions such as @@ -1762,64 +1757,64 @@ the hash is to be modified. Use the C<rand()> function (see L<perlfunc/rand>): - $index = rand @array; - $element = $array[$index]; + $index = rand @array; + $element = $array[$index]; Or, simply: - my $element = $array[ rand @array ]; + my $element = $array[ rand @array ]; =head2 How do I permute N elements of a list? X<List::Permutor> X<permute> X<Algorithm::Loops> X<Knuth> X<The Art of Computer Programming> X<Fischer-Krause> -Use the C<List::Permutor> module on CPAN. If the list is actually an -array, try the C<Algorithm::Permute> module (also on CPAN). It's +Use the L<List::Permutor> module on CPAN. If the list is actually an +array, try the L<Algorithm::Permute> module (also on CPAN). It's written in XS code and is very efficient: - use Algorithm::Permute; + use Algorithm::Permute; - my @array = 'a'..'d'; - my $p_iterator = Algorithm::Permute->new ( \@array ); + my @array = 'a'..'d'; + my $p_iterator = Algorithm::Permute->new ( \@array ); - while (my @perm = $p_iterator->next) { - print "next permutation: (@perm)\n"; - } + while (my @perm = $p_iterator->next) { + print "next permutation: (@perm)\n"; + } For even faster execution, you could do: - use Algorithm::Permute; + use Algorithm::Permute; - my @array = 'a'..'d'; + my @array = 'a'..'d'; - Algorithm::Permute::permute { - print "next permutation: (@array)\n"; - } @array; + Algorithm::Permute::permute { + print "next permutation: (@array)\n"; + } @array; Here's a little program that generates all permutations of all the words on each line of input. The algorithm embodied in the C<permute()> function is discussed in Volume 4 (still unpublished) of Knuth's I<The Art of Computer Programming> and will work on any list: - #!/usr/bin/perl -n - # Fischer-Krause ordered permutation generator - - sub permute (&@) { - my $code = shift; - my @idx = 0..$#_; - while ( $code->(@_[@idx]) ) { - my $p = $#idx; - --$p while $idx[$p-1] > $idx[$p]; - my $q = $p or return; - push @idx, reverse splice @idx, $p; - ++$q while $idx[$p-1] > $idx[$q]; - @idx[$p-1,$q]=@idx[$q,$p-1]; - } - } - - permute { print "@_\n" } split; - -The C<Algorithm::Loops> module also provides the C<NextPermute> and + #!/usr/bin/perl -n + # Fischer-Krause ordered permutation generator + + sub permute (&@) { + my $code = shift; + my @idx = 0..$#_; + while ( $code->(@_[@idx]) ) { + my $p = $#idx; + --$p while $idx[$p-1] > $idx[$p]; + my $q = $p or return; + push @idx, reverse splice @idx, $p; + ++$q while $idx[$p-1] > $idx[$q]; + @idx[$p-1,$q]=@idx[$q,$p-1]; + } + } + + permute { print "@_\n" } split; + +The L<Algorithm::Loops> module also provides the C<NextPermute> and C<NextPermuteNum> functions which efficiently find all unique permutations of an array, even if it contains duplicate values, modifying it in-place: if its elements are in reverse-sorted order then the array is reversed, @@ -1829,7 +1824,7 @@ permutation is returned. C<NextPermute> uses string order and C<NextPermuteNum> numeric order, so you can enumerate all the permutations of C<0..9> like this: - use Algorithm::Loops qw(NextPermuteNum); + use Algorithm::Loops qw(NextPermuteNum); my @list= 0..9; do { print "@list\n" } while NextPermuteNum @list; @@ -1838,7 +1833,7 @@ you can enumerate all the permutations of C<0..9> like this: Supply a comparison function to sort() (described in L<perlfunc/sort>): - @list = sort { $a <=> $b } @list; + @list = sort { $a <=> $b } @list; The default sort function is cmp, string comparison, which would sort C<(1, 2, 10)> into C<(1, 10, 2)>. C<< <=> >>, used above, is @@ -1851,27 +1846,27 @@ same element. Here's an example of how to pull out the first word after the first number on each item, and then sort those words case-insensitively. - @idx = (); - for (@data) { - ($item) = /\d+\s*(\S+)/; - push @idx, uc($item); - } - @sorted = @data[ sort { $idx[$a] cmp $idx[$b] } 0 .. $#idx ]; + @idx = (); + for (@data) { + ($item) = /\d+\s*(\S+)/; + push @idx, uc($item); + } + @sorted = @data[ sort { $idx[$a] cmp $idx[$b] } 0 .. $#idx ]; which could also be written this way, using a trick that's come to be known as the Schwartzian Transform: - @sorted = map { $_->[0] } - sort { $a->[1] cmp $b->[1] } - map { [ $_, uc( (/\d+\s*(\S+)/)[0]) ] } @data; + @sorted = map { $_->[0] } + sort { $a->[1] cmp $b->[1] } + map { [ $_, uc( (/\d+\s*(\S+)/)[0]) ] } @data; If you need to sort on several fields, the following paradigm is useful. - @sorted = sort { - field1($a) <=> field1($b) || - field2($a) cmp field2($b) || - field3($a) cmp field3($b) - } @data; + @sorted = sort { + field1($a) <=> field1($b) || + field2($a) cmp field2($b) || + field3($a) cmp field3($b) + } @data; This can be conveniently combined with precalculation of keys as given above. @@ -1892,11 +1887,11 @@ For example, you don't have to store individual bits in an array array of bits to a string, use C<vec()> to set the right bits. This sets C<$vec> to have bit N set only if C<$ints[N]> was set: - @ints = (...); # array of bits, e.g. ( 1, 0, 0, 1, 1, 0 ... ) - $vec = ''; - foreach( 0 .. $#ints ) { - vec($vec,$_,1) = 1 if $ints[$_]; - } + @ints = (...); # array of bits, e.g. ( 1, 0, 0, 1, 1, 0 ... ) + $vec = ''; + foreach( 0 .. $#ints ) { + vec($vec,$_,1) = 1 if $ints[$_]; + } The string C<$vec> only takes up as many bits as it needs. For instance, if you had 16 entries in C<@ints>, C<$vec> only needs two @@ -1905,37 +1900,37 @@ bytes to store them (not counting the scalar variable overhead). Here's how, given a vector in C<$vec>, you can get those bits into your C<@ints> array: - sub bitvec_to_list { - my $vec = shift; - my @ints; - # Find null-byte density then select best algorithm - if ($vec =~ tr/\0// / length $vec > 0.95) { - use integer; - my $i; - - # This method is faster with mostly null-bytes - while($vec =~ /[^\0]/g ) { - $i = -9 + 8 * pos $vec; - push @ints, $i if vec($vec, ++$i, 1); - push @ints, $i if vec($vec, ++$i, 1); - push @ints, $i if vec($vec, ++$i, 1); - push @ints, $i if vec($vec, ++$i, 1); - push @ints, $i if vec($vec, ++$i, 1); - push @ints, $i if vec($vec, ++$i, 1); - push @ints, $i if vec($vec, ++$i, 1); - push @ints, $i if vec($vec, ++$i, 1); - } - } - else { - # This method is a fast general algorithm - use integer; - my $bits = unpack "b*", $vec; - push @ints, 0 if $bits =~ s/^(\d)// && $1; - push @ints, pos $bits while($bits =~ /1/g); - } - - return \@ints; - } + sub bitvec_to_list { + my $vec = shift; + my @ints; + # Find null-byte density then select best algorithm + if ($vec =~ tr/\0// / length $vec > 0.95) { + use integer; + my $i; + + # This method is faster with mostly null-bytes + while($vec =~ /[^\0]/g ) { + $i = -9 + 8 * pos $vec; + push @ints, $i if vec($vec, ++$i, 1); + push @ints, $i if vec($vec, ++$i, 1); + push @ints, $i if vec($vec, ++$i, 1); + push @ints, $i if vec($vec, ++$i, 1); + push @ints, $i if vec($vec, ++$i, 1); + push @ints, $i if vec($vec, ++$i, 1); + push @ints, $i if vec($vec, ++$i, 1); + push @ints, $i if vec($vec, ++$i, 1); + } + } + else { + # This method is a fast general algorithm + use integer; + my $bits = unpack "b*", $vec; + push @ints, 0 if $bits =~ s/^(\d)// && $1; + push @ints, pos $bits while($bits =~ /1/g); + } + + return \@ints; + } This method gets faster the more sparse the bit vector is. (Courtesy of Tim Bunce and Winfried Koenig.) @@ -1943,61 +1938,61 @@ This method gets faster the more sparse the bit vector is. You can make the while loop a lot shorter with this suggestion from Benjamin Goldberg: - while($vec =~ /[^\0]+/g ) { - push @ints, grep vec($vec, $_, 1), $-[0] * 8 .. $+[0] * 8; - } + while($vec =~ /[^\0]+/g ) { + push @ints, grep vec($vec, $_, 1), $-[0] * 8 .. $+[0] * 8; + } -Or use the CPAN module C<Bit::Vector>: +Or use the CPAN module L<Bit::Vector>: - $vector = Bit::Vector->new($num_of_bits); - $vector->Index_List_Store(@ints); - @ints = $vector->Index_List_Read(); + $vector = Bit::Vector->new($num_of_bits); + $vector->Index_List_Store(@ints); + @ints = $vector->Index_List_Read(); -C<Bit::Vector> provides efficient methods for bit vector, sets of +L<Bit::Vector> provides efficient methods for bit vector, sets of small integers and "big int" math. Here's a more extensive illustration using vec(): - # vec demo - $vector = "\xff\x0f\xef\xfe"; - print "Ilya's string \\xff\\x0f\\xef\\xfe represents the number ", - unpack("N", $vector), "\n"; - $is_set = vec($vector, 23, 1); - print "Its 23rd bit is ", $is_set ? "set" : "clear", ".\n"; - pvec($vector); - - set_vec(1,1,1); - set_vec(3,1,1); - set_vec(23,1,1); - - set_vec(3,1,3); - set_vec(3,2,3); - set_vec(3,4,3); - set_vec(3,4,7); - set_vec(3,8,3); - set_vec(3,8,7); - - set_vec(0,32,17); - set_vec(1,32,17); - - sub set_vec { - my ($offset, $width, $value) = @_; - my $vector = ''; - vec($vector, $offset, $width) = $value; - print "offset=$offset width=$width value=$value\n"; - pvec($vector); - } - - sub pvec { - my $vector = shift; - my $bits = unpack("b*", $vector); - my $i = 0; - my $BASE = 8; - - print "vector length in bytes: ", length($vector), "\n"; - @bytes = unpack("A8" x length($vector), $bits); - print "bits are: @bytes\n\n"; - } + # vec demo + $vector = "\xff\x0f\xef\xfe"; + print "Ilya's string \\xff\\x0f\\xef\\xfe represents the number ", + unpack("N", $vector), "\n"; + $is_set = vec($vector, 23, 1); + print "Its 23rd bit is ", $is_set ? "set" : "clear", ".\n"; + pvec($vector); + + set_vec(1,1,1); + set_vec(3,1,1); + set_vec(23,1,1); + + set_vec(3,1,3); + set_vec(3,2,3); + set_vec(3,4,3); + set_vec(3,4,7); + set_vec(3,8,3); + set_vec(3,8,7); + + set_vec(0,32,17); + set_vec(1,32,17); + + sub set_vec { + my ($offset, $width, $value) = @_; + my $vector = ''; + vec($vector, $offset, $width) = $value; + print "offset=$offset width=$width value=$value\n"; + pvec($vector); + } + + sub pvec { + my $vector = shift; + my $bits = unpack("b*", $vector); + my $i = 0; + my $BASE = 8; + + print "vector length in bytes: ", length($vector), "\n"; + @bytes = unpack("A8" x length($vector), $bits); + print "bits are: @bytes\n\n"; + } =head2 Why does defined() return true on empty arrays and hashes? @@ -2019,36 +2014,36 @@ To go through all of the keys, use the C<keys> function. This extracts all of the keys of the hash and gives them back to you as a list. You can then get the value through the particular key you're processing: - foreach my $key ( keys %hash ) { - my $value = $hash{$key} - ... - } + foreach my $key ( keys %hash ) { + my $value = $hash{$key} + ... + } Once you have the list of keys, you can process that list before you process the hash elements. For instance, you can sort the keys so you can process them in lexical order: - foreach my $key ( sort keys %hash ) { - my $value = $hash{$key} - ... - } + foreach my $key ( sort keys %hash ) { + my $value = $hash{$key} + ... + } Or, you might want to only process some of the items. If you only want to deal with the keys that start with C<text:>, you can select just those using C<grep>: - foreach my $key ( grep /^text:/, keys %hash ) { - my $value = $hash{$key} - ... - } + foreach my $key ( grep /^text:/, keys %hash ) { + my $value = $hash{$key} + ... + } If the hash is very large, you might not want to create a long list of keys. To save some memory, you can grab one key-value pair at a time using C<each()>, which returns a pair you haven't seen yet: - while( my( $key, $value ) = each( %hash ) ) { - ... - } + while( my( $key, $value ) = each( %hash ) ) { + ... + } The C<each> operator returns the pairs in apparently random order, so if ordering matters to you, you'll have to stick with the C<keys> method. @@ -2076,46 +2071,40 @@ to a new hash (C<%new_hash>), then add the keys from the other hash C<%new_hash> gives you a chance to decide what to do with the duplicates: - my %new_hash = %hash1; # make a copy; leave %hash1 alone - - foreach my $key2 ( keys %hash2 ) - { - if( exists $new_hash{$key2} ) - { - warn "Key [$key2] is in both hashes!"; - # handle the duplicate (perhaps only warning) - ... - next; - } - else - { - $new_hash{$key2} = $hash2{$key2}; - } - } + my %new_hash = %hash1; # make a copy; leave %hash1 alone + + foreach my $key2 ( keys %hash2 ) { + if( exists $new_hash{$key2} ) { + warn "Key [$key2] is in both hashes!"; + # handle the duplicate (perhaps only warning) + ... + next; + } + else { + $new_hash{$key2} = $hash2{$key2}; + } + } If you don't want to create a new hash, you can still use this looping technique; just change the C<%new_hash> to C<%hash1>. - foreach my $key2 ( keys %hash2 ) - { - if( exists $hash1{$key2} ) - { - warn "Key [$key2] is in both hashes!"; - # handle the duplicate (perhaps only warning) - ... - next; - } - else - { - $hash1{$key2} = $hash2{$key2}; - } - } + foreach my $key2 ( keys %hash2 ) { + if( exists $hash1{$key2} ) { + warn "Key [$key2] is in both hashes!"; + # handle the duplicate (perhaps only warning) + ... + next; + } + else { + $hash1{$key2} = $hash2{$key2}; + } + } If you don't care that one hash overwrites keys and values from the other, you could just use a hash slice to add one hash to another. In this case, values from C<%hash2> replace values from C<%hash1> when they have keys in common: - @hash1{ keys %hash2 } = values %hash2; + @hash1{ keys %hash2 } = values %hash2; =head2 What happens if I add or remove keys from a hash while iterating over it? @@ -2133,23 +2122,23 @@ entry for C<each()> in L<perlfunc>. Create a reverse hash: - %by_value = reverse %by_key; - $key = $by_value{$value}; + %by_value = reverse %by_key; + $key = $by_value{$value}; That's not particularly efficient. It would be more space-efficient to use: - while (($key, $value) = each %by_key) { - $by_value{$value} = $key; - } + while (($key, $value) = each %by_key) { + $by_value{$value} = $key; + } If your hash could have repeated values, the methods above will only find one of the associated keys. This may or may not worry you. If it does worry you, you can always reverse the hash into a hash of arrays instead: - while (($key, $value) = each %by_key) { - push @{$key_list_by_value{$value}}, $key; - } + while (($key, $value) = each %by_key) { + push @{$key_list_by_value{$value}}, $key; + } =head2 How can I know how many entries are in a hash? @@ -2161,23 +2150,23 @@ L<perlfaq4>, but a bit simpler in the common cases. You can use the C<keys()> built-in function in scalar context to find out have many entries you have in a hash: - my $key_count = keys %hash; # must be scalar context! + my $key_count = keys %hash; # must be scalar context! If you want to find out how many entries have a defined value, that's a bit different. You have to check each value. A C<grep> is handy: - my $defined_value_count = grep { defined } values %hash; + my $defined_value_count = grep { defined } values %hash; You can use that same structure to count the entries any way that you like. If you want the count of the keys with vowels in them, you just test for that instead: - my $vowel_count = grep { /[aeiou]/ } keys %hash; + my $vowel_count = grep { /[aeiou]/ } keys %hash; The C<grep> in scalar context returns the count. If you want the list of matching items, just use it in list context instead: - my @defined_values = grep { defined } values %hash; + my @defined_values = grep { defined } values %hash; The C<keys()> function also resets the iterator, which means that you may see strange results if you use this between uses of other hash operators @@ -2193,12 +2182,11 @@ might be affected by your locale settings). The output list has the keys in ASCIIbetical order. Once we have the keys, we can go through them to create a report which lists the keys in ASCIIbetical order. - my @keys = sort { $a cmp $b } keys %hash; + my @keys = sort { $a cmp $b } keys %hash; - foreach my $key ( @keys ) - { - printf "%-20s %6d\n", $key, $hash{$key}; - } + foreach my $key ( @keys ) { + printf "%-20s %6d\n", $key, $hash{$key}; + } We could get more fancy in the C<sort()> block though. Instead of comparing the keys, we can compute a value with them and use that @@ -2209,7 +2197,7 @@ the C<\L> sequence in a double-quoted string to make everything lowercase. The C<sort()> block then compares the lowercased values to determine in which order to put the keys. - my @keys = sort { "\L$a" cmp "\L$b" } keys %hash; + my @keys = sort { "\L$a" cmp "\L$b" } keys %hash; Note: if the computation is expensive or the hash has many elements, you may want to look at the Schwartzian Transform to cache the @@ -2219,23 +2207,23 @@ If we want to sort by the hash value instead, we use the hash key to look it up. We still get out a list of keys, but this time they are ordered by their value. - my @keys = sort { $hash{$a} <=> $hash{$b} } keys %hash; + my @keys = sort { $hash{$a} <=> $hash{$b} } keys %hash; From there we can get more complex. If the hash values are the same, we can provide a secondary sort on the hash key. - my @keys = sort { - $hash{$a} <=> $hash{$b} - or - "\L$a" cmp "\L$b" - } keys %hash; + my @keys = sort { + $hash{$a} <=> $hash{$b} + or + "\L$a" cmp "\L$b" + } keys %hash; =head2 How can I always keep my hash sorted? X<hash tie sort DB_File Tie::IxHash> You can look into using the C<DB_File> module and C<tie()> using the C<$DB_BTREE> hash bindings as documented in L<DB_File/"In Memory -Databases">. The C<Tie::IxHash> module from CPAN might also be +Databases">. The L<Tie::IxHash> module from CPAN might also be instructive. Although this does keep your hash sorted, you might not like the slowdown you suffer from the tie interface. Are you sure you need to do this? :) @@ -2254,70 +2242,70 @@ being in the hash. Pictures help... Here's the C<%hash> table: - keys values - +------+------+ - | a | 3 | - | x | 7 | - | d | 0 | - | e | 2 | - +------+------+ + keys values + +------+------+ + | a | 3 | + | x | 7 | + | d | 0 | + | e | 2 | + +------+------+ And these conditions hold - $hash{'a'} is true - $hash{'d'} is false - defined $hash{'d'} is true - defined $hash{'a'} is true - exists $hash{'a'} is true (Perl 5 only) - grep ($_ eq 'a', keys %hash) is true + $hash{'a'} is true + $hash{'d'} is false + defined $hash{'d'} is true + defined $hash{'a'} is true + exists $hash{'a'} is true (Perl 5 only) + grep ($_ eq 'a', keys %hash) is true If you now say - undef $hash{'a'} + undef $hash{'a'} your table now reads: - keys values - +------+------+ - | a | undef| - | x | 7 | - | d | 0 | - | e | 2 | - +------+------+ + keys values + +------+------+ + | a | undef| + | x | 7 | + | d | 0 | + | e | 2 | + +------+------+ and these conditions now hold; changes in caps: - $hash{'a'} is FALSE - $hash{'d'} is false - defined $hash{'d'} is true - defined $hash{'a'} is FALSE - exists $hash{'a'} is true (Perl 5 only) - grep ($_ eq 'a', keys %hash) is true + $hash{'a'} is FALSE + $hash{'d'} is false + defined $hash{'d'} is true + defined $hash{'a'} is FALSE + exists $hash{'a'} is true (Perl 5 only) + grep ($_ eq 'a', keys %hash) is true Notice the last two: you have an undef value, but a defined key! Now, consider this: - delete $hash{'a'} + delete $hash{'a'} your table now reads: - keys values - +------+------+ - | x | 7 | - | d | 0 | - | e | 2 | - +------+------+ + keys values + +------+------+ + | x | 7 | + | d | 0 | + | e | 2 | + +------+------+ and these conditions now hold; changes in caps: - $hash{'a'} is false - $hash{'d'} is false - defined $hash{'d'} is true - defined $hash{'a'} is false - exists $hash{'a'} is FALSE (Perl 5 only) - grep ($_ eq 'a', keys %hash) is FALSE + $hash{'a'} is false + $hash{'d'} is false + defined $hash{'d'} is true + defined $hash{'a'} is false + exists $hash{'a'} is FALSE (Perl 5 only) + grep ($_ eq 'a', keys %hash) is FALSE See, the whole entry is gone! @@ -2337,8 +2325,8 @@ You can use the C<keys> or C<values> functions to reset C<each>. To simply reset the iterator used by C<each> without doing anything else, use one of them in void context: - keys %hash; # resets iterator, nothing else. - values %hash; # resets iterator, nothing else. + keys %hash; # resets iterator, nothing else. + values %hash; # resets iterator, nothing else. See the documentation for C<each> in L<perlfunc>. @@ -2347,26 +2335,26 @@ See the documentation for C<each> in L<perlfunc>. First you extract the keys from the hashes into lists, then solve the "removing duplicates" problem described above. For example: - %seen = (); - for $element (keys(%foo), keys(%bar)) { - $seen{$element}++; - } - @uniq = keys %seen; + %seen = (); + for $element (keys(%foo), keys(%bar)) { + $seen{$element}++; + } + @uniq = keys %seen; Or more succinctly: - @uniq = keys %{{%foo,%bar}}; + @uniq = keys %{{%foo,%bar}}; Or if you really want to save space: - %seen = (); - while (defined ($key = each %foo)) { - $seen{$key}++; - } - while (defined ($key = each %bar)) { - $seen{$key}++; - } - @uniq = keys %seen; + %seen = (); + while (defined ($key = each %foo)) { + $seen{$key}++; + } + while (defined ($key = each %bar)) { + $seen{$key}++; + } + @uniq = keys %seen; =head2 How can I store a multidimensional array in a DBM file? @@ -2377,18 +2365,18 @@ it can be a bit slow. =head2 How can I make my hash remember the order I put elements into it? -Use the C<Tie::IxHash> from CPAN. +Use the L<Tie::IxHash> from CPAN. - use Tie::IxHash; + use Tie::IxHash; - tie my %myhash, 'Tie::IxHash'; + tie my %myhash, 'Tie::IxHash'; - for (my $i=0; $i<20; $i++) { - $myhash{$i} = 2*$i; - } + for (my $i=0; $i<20; $i++) { + $myhash{$i} = 2*$i; + } - my @keys = keys %myhash; - # @keys = (0,1,2,3,...) + my @keys = keys %myhash; + # @keys = (0,1,2,3,...) =head2 Why does passing a subroutine an undefined element in a hash create it? @@ -2399,50 +2387,50 @@ Are you using a really old version of Perl? Normally, accessing a hash key's value for a nonexistent key will I<not> create the key. - my %hash = (); - my $value = $hash{ 'foo' }; - print "This won't print\n" if exists $hash{ 'foo' }; + my %hash = (); + my $value = $hash{ 'foo' }; + print "This won't print\n" if exists $hash{ 'foo' }; Passing C<$hash{ 'foo' }> to a subroutine used to be a special case, though. Since you could assign directly to C<$_[0]>, Perl had to be ready to make that assignment so it created the hash key ahead of time: my_sub( $hash{ 'foo' } ); - print "This will print before 5.004\n" if exists $hash{ 'foo' }; + print "This will print before 5.004\n" if exists $hash{ 'foo' }; - sub my_sub { - # $_[0] = 'bar'; # create hash key in case you do this - 1; - } + sub my_sub { + # $_[0] = 'bar'; # create hash key in case you do this + 1; + } Since Perl 5.004, however, this situation is a special case and Perl creates the hash key only when you make the assignment: my_sub( $hash{ 'foo' } ); - print "This will print, even after 5.004\n" if exists $hash{ 'foo' }; + print "This will print, even after 5.004\n" if exists $hash{ 'foo' }; - sub my_sub { - $_[0] = 'bar'; - } + sub my_sub { + $_[0] = 'bar'; + } However, if you want the old behavior (and think carefully about that because it's a weird side effect), you can pass a hash slice instead. Perl 5.004 didn't make this a special case: - my_sub( @hash{ qw/foo/ } ); + my_sub( @hash{ qw/foo/ } ); =head2 How can I make the Perl equivalent of a C structure/C++ class/hash or array of hashes or arrays? Usually a hash ref, perhaps like this: - $record = { - NAME => "Jason", - EMPNO => 132, - TITLE => "deputy peon", - AGE => 23, - SALARY => 37_000, - PALS => [ "Norbert", "Rhys", "Phineas"], - }; + $record = { + NAME => "Jason", + EMPNO => 132, + TITLE => "deputy peon", + AGE => 23, + SALARY => 37_000, + PALS => [ "Norbert", "Rhys", "Phineas"], + }; References are documented in L<perlref> and L<perlreftut>. Examples of complex data structures are given in L<perldsc> and @@ -2484,66 +2472,66 @@ required work for you. The trick to this problem is avoiding accidental autovivification. If you want to check three keys deep, you might naE<0xEF>vely try this: - my %hash; - if( exists $hash{key1}{key2}{key3} ) { - ...; - } + my %hash; + if( exists $hash{key1}{key2}{key3} ) { + ...; + } Even though you started with a completely empty hash, after that call to C<exists> you've created the structure you needed to check for C<key3>: - %hash = ( - 'key1' => { - 'key2' => {} - } - ); + %hash = ( + 'key1' => { + 'key2' => {} + } + ); That's autovivification. You can get around this in a few ways. The easiest way is to just turn it off. The lexical C<autovivification> pragma is available on CPAN. Now you don't add to the hash: - { - no autovivification; - my %hash; - if( exists $hash{key1}{key2}{key3} ) { - ...; - } - } + { + no autovivification; + my %hash; + if( exists $hash{key1}{key2}{key3} ) { + ...; + } + } -The C<Data::Diver> module on CPAN can do it for you too. Its C<Dive> +The L<Data::Diver> module on CPAN can do it for you too. Its C<Dive> subroutine can tell you not only if the keys exist but also get the value: - use Data::Diver qw(Dive); + use Data::Diver qw(Dive); my @exists = Dive( \%hash, qw(key1 key2 key3) ); if( ! @exists ) { ...; # keys do not exist - } + } elsif( ! defined $exists[0] ) { ...; # keys exist but value is undef - } + } You can easily do this yourself too by checking each level of the hash before you move onto the next level. This is essentially what -C<Data::Diver> does for you: +L<Data::Diver> does for you: - if( check_hash( \%hash, qw(key1 key2 key3) ) ) { - ...; - } + if( check_hash( \%hash, qw(key1 key2 key3) ) ) { + ...; + } - sub check_hash { - my( $hash, @keys ) = @_; + sub check_hash { + my( $hash, @keys ) = @_; - return unless @keys; + return unless @keys; - foreach my $key ( @keys ) { - return unless eval { exists $hash->{$key} }; - $hash = $hash->{$key}; - } + foreach my $key ( @keys ) { + return unless eval { exists $hash->{$key} }; + $hash = $hash->{$key}; + } - return 1; - } + return 1; + } =head1 Data: Misc @@ -2565,29 +2553,29 @@ some gotchas. See the section on Regular Expressions. Assuming that you don't care about IEEE notations like "NaN" or "Infinity", you probably just want to use a regular expression: - use 5.010; - - given( $number ) { - when( /\D/ ) - { say "\thas nondigits"; continue } - when( /^\d+\z/ ) - { say "\tis a whole number"; continue } - when( /^-?\d+\z/ ) - { say "\tis an integer"; continue } - when( /^[+-]?\d+\z/ ) - { say "\tis a +/- integer"; continue } - when( /^-?(?:\d+\.?|\.\d)\d*\z/ ) - { say "\tis a real number"; continue } - when( /^[+-]?(?=\.?\d)\d*\.?\d*(?:e[+-]?\d+)?\z/i) - { say "\tis a C float" } - } + use 5.010; + + given( $number ) { + when( /\D/ ) + { say "\thas nondigits"; continue } + when( /^\d+\z/ ) + { say "\tis a whole number"; continue } + when( /^-?\d+\z/ ) + { say "\tis an integer"; continue } + when( /^[+-]?\d+\z/ ) + { say "\tis a +/- integer"; continue } + when( /^-?(?:\d+\.?|\.\d)\d*\z/ ) + { say "\tis a real number"; continue } + when( /^[+-]?(?=\.?\d)\d*\.?\d*(?:e[+-]?\d+)?\z/i) + { say "\tis a C float" } + } There are also some commonly used modules for the task. L<Scalar::Util> (distributed with 5.8) provides access to perl's internal function C<looks_like_number> for determining whether a variable looks like a number. L<Data::Types> exports functions that validate data types using both the above and other regular -expressions. Thirdly, there is C<Regexp::Common> which has regular +expressions. Thirdly, there is L<Regexp::Common> which has regular expressions to match various types of numbers. Those three modules are available from the CPAN. @@ -2599,22 +2587,22 @@ takes a string and returns the number it found, or C<undef> for input that isn't a C float. The C<is_numeric> function is a front end to C<getnum> if you just want to say, "Is this a float?" - sub getnum { - use POSIX qw(strtod); - my $str = shift; - $str =~ s/^\s+//; - $str =~ s/\s+$//; - $! = 0; - my($num, $unparsed) = strtod($str); - if (($str eq '') || ($unparsed != 0) || $!) { - return undef; - } - else { - return $num; - } - } - - sub is_numeric { defined getnum($_[0]) } + sub getnum { + use POSIX qw(strtod); + my $str = shift; + $str =~ s/^\s+//; + $str =~ s/\s+$//; + $! = 0; + my($num, $unparsed) = strtod($str); + if (($str eq '') || ($unparsed != 0) || $!) { + return undef; + } + else { + return $num; + } + } + + sub is_numeric { defined getnum($_[0]) } Or you could check out the L<String::Scanf> module on the CPAN instead. @@ -2622,34 +2610,34 @@ instead. =head2 How do I keep persistent data across program calls? For some specific applications, you can use one of the DBM modules. -See L<AnyDBM_File>. More generically, you should consult the C<FreezeThaw> -or C<Storable> modules from CPAN. Starting from Perl 5.8 C<Storable> is part -of the standard distribution. Here's one example using C<Storable>'s C<store> +See L<AnyDBM_File>. More generically, you should consult the L<FreezeThaw> +or L<Storable> modules from CPAN. Starting from Perl 5.8 L<Storable> is part +of the standard distribution. Here's one example using L<Storable>'s C<store> and C<retrieve> functions: - use Storable; - store(\%hash, "filename"); + use Storable; + store(\%hash, "filename"); - # later on... - $href = retrieve("filename"); # by ref - %hash = %{ retrieve("filename") }; # direct to hash + # later on... + $href = retrieve("filename"); # by ref + %hash = %{ retrieve("filename") }; # direct to hash =head2 How do I print out or copy a recursive data structure? -The C<Data::Dumper> module on CPAN (or the 5.005 release of Perl) is great -for printing out data structures. The C<Storable> module on CPAN (or the +The L<Data::Dumper> module on CPAN (or the 5.005 release of Perl) is great +for printing out data structures. The L<Storable> module on CPAN (or the 5.8 release of Perl), provides a function called C<dclone> that recursively copies its argument. - use Storable qw(dclone); - $r2 = dclone($r1); + use Storable qw(dclone); + $r2 = dclone($r1); Where C<$r1> can be a reference to any kind of data structure you'd like. It will be deeply copied. Because C<dclone> takes and returns references, you'd have to add extra punctuation if you had a hash of arrays that you wanted to copy. - %newhash = %{ dclone(\%oldhash) }; + %newhash = %{ dclone(\%oldhash) }; =head2 How do I define methods for every class/object? @@ -2664,13 +2652,13 @@ Moose that supports roles. =head2 How do I verify a credit card checksum? -Get the C<Business::CreditCard> module from CPAN. +Get the L<Business::CreditCard> module from CPAN. =head2 How do I pack arrays of doubles or floats for XS code? -The arrays.h/arrays.c code in the C<PGPLOT> module on CPAN does just this. +The arrays.h/arrays.c code in the L<PGPLOT> module on CPAN does just this. If you're doing a lot of float or double processing, consider using -the C<PDL> module from CPAN instead--it makes number-crunching easy. +the L<PDL> module from CPAN instead--it makes number-crunching easy. See L<http://search.cpan.org/dist/PGPLOT> for the code. diff --git a/cpan/perlfaq/lib/perlfaq5.pod b/cpan/perlfaq/lib/perlfaq5.pod index e7ed594517..011e2e1ce2 100644 --- a/cpan/perlfaq/lib/perlfaq5.pod +++ b/cpan/perlfaq/lib/perlfaq5.pod @@ -22,76 +22,75 @@ screen for every line you process to watch the progress of your program. Instead of seeing a dot for every line, Perl buffers the output and you have a long wait before you see a row of 50 dots all at once: - # long wait, then row of dots all at once - while( <> ) { - print "."; - print "\n" unless ++$count % 50; + # long wait, then row of dots all at once + while( <> ) { + print "."; + print "\n" unless ++$count % 50; - #... expensive line processing operations - } + #... expensive line processing operations + } To get around this, you have to unbuffer the output filehandle, in this case, C<STDOUT>. You can set the special variable C<$|> to a true value (mnemonic: making your filehandles "piping hot"): - $|++; + $|++; - # dot shown immediately - while( <> ) { - print "."; - print "\n" unless ++$count % 50; + # dot shown immediately + while( <> ) { + print "."; + print "\n" unless ++$count % 50; - #... expensive line processing operations - } + #... expensive line processing operations + } The C<$|> is one of the per-filehandle special variables, so each filehandle has its own copy of its value. If you want to merge standard output and standard error for instance, you have to unbuffer each (although STDERR might be unbuffered by default): - { - my $previous_default = select(STDOUT); # save previous default - $|++; # autoflush STDOUT - select(STDERR); - $|++; # autoflush STDERR, to be sure - select($previous_default); # restore previous default - } - - # now should alternate . and + - while( 1 ) - { - sleep 1; - print STDOUT "."; - print STDERR "+"; - print STDOUT "\n" unless ++$count % 25; - } + { + my $previous_default = select(STDOUT); # save previous default + $|++; # autoflush STDOUT + select(STDERR); + $|++; # autoflush STDERR, to be sure + select($previous_default); # restore previous default + } + + # now should alternate . and + + while( 1 ) { + sleep 1; + print STDOUT "."; + print STDERR "+"; + print STDOUT "\n" unless ++$count % 25; + } Besides the C<$|> special variable, you can use C<binmode> to give your filehandle a C<:unix> layer, which is unbuffered: - binmode( STDOUT, ":unix" ); + binmode( STDOUT, ":unix" ); - while( 1 ) { - sleep 1; - print "."; - print "\n" unless ++$count % 50; - } + while( 1 ) { + sleep 1; + print "."; + print "\n" unless ++$count % 50; + } For more information on output layers, see the entries for C<binmode> -and C<open> in L<perlfunc>, and the C<PerlIO> module documentation. +and L<open> in L<perlfunc>, and the L<PerlIO> module documentation. -If you are using C<IO::Handle> or one of its subclasses, you can +If you are using L<IO::Handle> or one of its subclasses, you can call the C<autoflush> method to change the settings of the filehandle: - use IO::Handle; - open my( $io_fh ), ">", "output.txt"; - $io_fh->autoflush(1); + use IO::Handle; + open my( $io_fh ), ">", "output.txt"; + $io_fh->autoflush(1); -The C<IO::Handle> objects also have a C<flush> method. You can flush +The L<IO::Handle> objects also have a C<flush> method. You can flush the buffer any time you want without auto-buffering - $io_fh->flush; + $io_fh->flush; =head2 How do I change, delete, or insert a line in a file, or append to the beginning of a file? X<file, editing> @@ -103,20 +102,19 @@ file involves reading and printing the file to the point you want to make the change, making the change, then reading and printing the rest of the file. Perl doesn't provide random access to lines (especially since the record input separator, C<$/>, is mutable), although modules -such as C<Tie::File> can fake it. +such as L<Tie::File> can fake it. A Perl program to do these tasks takes the basic form of opening a file, printing its lines, then closing the file: - open my $in, '<', $file or die "Can't read old file: $!"; - open my $out, '>', "$file.new" or die "Can't write new file: $!"; + open my $in, '<', $file or die "Can't read old file: $!"; + open my $out, '>', "$file.new" or die "Can't write new file: $!"; - while( <$in> ) - { - print $out $_; - } + while( <$in> ) { + print $out $_; + } - close $out; + close $out; Within that basic form, add the parts that you need to insert, change, or delete lines. @@ -124,91 +122,85 @@ or delete lines. To prepend lines to the beginning, print those lines before you enter the loop that prints the existing lines. - open my $in, '<', $file or die "Can't read old file: $!"; - open my $out, '>', "$file.new" or die "Can't write new file: $!"; + open my $in, '<', $file or die "Can't read old file: $!"; + open my $out, '>', "$file.new" or die "Can't write new file: $!"; - print $out "# Add this line to the top\n"; # <--- HERE'S THE MAGIC + print $out "# Add this line to the top\n"; # <--- HERE'S THE MAGIC - while( <$in> ) - { - print $out $_; - } + while( <$in> ) { + print $out $_; + } - close $out; + close $out; To change existing lines, insert the code to modify the lines inside the C<while> loop. In this case, the code finds all lowercased versions of "perl" and uppercases them. The happens for every line, so be sure that you're supposed to do that on every line! - open my $in, '<', $file or die "Can't read old file: $!"; - open my $out, '>', "$file.new" or die "Can't write new file: $!"; + open my $in, '<', $file or die "Can't read old file: $!"; + open my $out, '>', "$file.new" or die "Can't write new file: $!"; - print $out "# Add this line to the top\n"; + print $out "# Add this line to the top\n"; - while( <$in> ) - { - s/\b(perl)\b/Perl/g; - print $out $_; - } + while( <$in> ) { + s/\b(perl)\b/Perl/g; + print $out $_; + } - close $out; + close $out; To change only a particular line, the input line number, C<$.>, is useful. First read and print the lines up to the one you want to change. Next, read the single line you want to change, change it, and print it. After that, read the rest of the lines and print those: - while( <$in> ) # print the lines before the change - { - print $out $_; - last if $. == 4; # line number before change - } + while( <$in> ) { # print the lines before the change + print $out $_; + last if $. == 4; # line number before change + } - my $line = <$in>; - $line =~ s/\b(perl)\b/Perl/g; - print $out $line; + my $line = <$in>; + $line =~ s/\b(perl)\b/Perl/g; + print $out $line; - while( <$in> ) # print the rest of the lines - { - print $out $_; - } + while( <$in> ) { # print the rest of the lines + print $out $_; + } To skip lines, use the looping controls. The C<next> in this example skips comment lines, and the C<last> stops all processing once it encounters either C<__END__> or C<__DATA__>. - while( <$in> ) - { - next if /^\s+#/; # skip comment lines - last if /^__(END|DATA)__$/; # stop at end of code marker - print $out $_; - } + while( <$in> ) { + next if /^\s+#/; # skip comment lines + last if /^__(END|DATA)__$/; # stop at end of code marker + print $out $_; + } Do the same sort of thing to delete a particular line by using C<next> to skip the lines you don't want to show up in the output. This example skips every fifth line: - while( <$in> ) - { - next unless $. % 5; - print $out $_; - } + while( <$in> ) { + next unless $. % 5; + print $out $_; + } If, for some odd reason, you really want to see the whole file at once rather than processing line-by-line, you can slurp it in (as long as you can fit the whole thing in memory!): - open my $in, '<', $file or die "Can't read old file: $!" - open my $out, '>', "$file.new" or die "Can't write new file: $!"; + open my $in, '<', $file or die "Can't read old file: $!" + open my $out, '>', "$file.new" or die "Can't write new file: $!"; - my @lines = do { local $/; <$in> }; # slurp! + my @lines = do { local $/; <$in> }; # slurp! - # do your magic here + # do your magic here - print $out @lines; + print $out @lines; -Modules such as C<File::Slurp> and C<Tie::File> can help with that +Modules such as L<File::Slurp> and L<Tie::File> can help with that too. If you can, however, avoid reading the entire file at once. Perl won't give that memory back to the operating system until the process finishes. @@ -221,42 +213,42 @@ on in-place editing. The current line is in C<$_>. With C<-p>, Perl automatically prints the value of C<$_> at the end of the loop. See L<perlrun> for more details. - perl -pi -e 's/Fred/Barney/' inFile.txt + perl -pi -e 's/Fred/Barney/' inFile.txt To make a backup of C<inFile.txt>, give C<-i> a file extension to add: - perl -pi.bak -e 's/Fred/Barney/' inFile.txt + perl -pi.bak -e 's/Fred/Barney/' inFile.txt To change only the fifth line, you can add a test checking C<$.>, the input line number, then only perform the operation when the test passes: - perl -pi -e 's/Fred/Barney/ if $. == 5' inFile.txt + perl -pi -e 's/Fred/Barney/ if $. == 5' inFile.txt To add lines before a certain line, you can add a line (or lines!) before Perl prints C<$_>: - perl -pi -e 'print "Put before third line\n" if $. == 3' inFile.txt + perl -pi -e 'print "Put before third line\n" if $. == 3' inFile.txt You can even add a line to the beginning of a file, since the current line prints at the end of the loop: - perl -pi -e 'print "Put before first line\n" if $. == 1' inFile.txt + perl -pi -e 'print "Put before first line\n" if $. == 1' inFile.txt To insert a line after one already in the file, use the C<-n> switch. It's just like C<-p> except that it doesn't print C<$_> at the end of the loop, so you have to do that yourself. In this case, print C<$_> first, then print the line that you want to add. - perl -ni -e 'print; print "Put after fifth line\n" if $. == 5' inFile.txt + perl -ni -e 'print; print "Put after fifth line\n" if $. == 5' inFile.txt To delete lines, only print the ones that you want. - perl -ni -e 'print unless /d/' inFile.txt + perl -ni -e 'print unless /d/' inFile.txt - ... or ... + ... or ... - perl -pi -e 'next unless /d/' inFile.txt + perl -pi -e 'next unless /d/' inFile.txt =head2 How do I count the number of lines in a file? X<file, counting lines> X<lines> X<line> @@ -266,58 +258,58 @@ X<file, counting lines> X<lines> X<line> Conceptually, the easiest way to count the lines in a file is to simply read them and count them: - my $count = 0; - while( <$fh> ) { $count++; } + my $count = 0; + while( <$fh> ) { $count++; } You don't really have to count them yourself, though, since Perl already does that with the C<$.> variable, which is the current line number from the last filehandle read: - 1 while( <$fh> ); - my $count = $.; + 1 while( <$fh> ); + my $count = $.; If you want to use C<$.>, you can reduce it to a simple one-liner, like one of these: - % perl -lne '} print $.; {' file + % perl -lne '} print $.; {' file - % perl -lne 'END { print $. }' file + % perl -lne 'END { print $. }' file Those can be rather inefficient though. If they aren't fast enough for you, you might just read chunks of data and count the number of newlines: - my $lines = 0; - open my($fh), '<:raw', $filename or die "Can't open $filename: $!"; - while( sysread $fh, $buffer, 4096 ) { - $lines += ( $buffer =~ tr/\n// ); - } - close FILE; + my $lines = 0; + open my($fh), '<:raw', $filename or die "Can't open $filename: $!"; + while( sysread $fh, $buffer, 4096 ) { + $lines += ( $buffer =~ tr/\n// ); + } + close FILE; However, that doesn't work if the line ending isn't a newline. You might change that C<tr///> to a C<s///> so you can count the number of times the input record separator, C<$/>, shows up: - my $lines = 0; - open my($fh), '<:raw', $filename or die "Can't open $filename: $!"; - while( sysread $fh, $buffer, 4096 ) { - $lines += ( $buffer =~ s|$/||g; ); - } - close FILE; + my $lines = 0; + open my($fh), '<:raw', $filename or die "Can't open $filename: $!"; + while( sysread $fh, $buffer, 4096 ) { + $lines += ( $buffer =~ s|$/||g; ); + } + close FILE; If you don't mind shelling out, the C<wc> command is usually the fastest, even with the extra interprocess overhead. Ensure that you have an untainted filename though: - #!perl -T + #!perl -T - $ENV{PATH} = undef; + $ENV{PATH} = undef; - my $lines; - if( $filename =~ /^([0-9a-z_.]+)\z/ ) { - $lines = `/usr/bin/wc -l $1` - chomp $lines; - } + my $lines; + if( $filename =~ /^([0-9a-z_.]+)\z/ ) { + $lines = `/usr/bin/wc -l $1` + chomp $lines; + } =head2 How do I delete the last N lines from a file? X<lines> X<file> @@ -333,34 +325,33 @@ without making more than one pass over the file, or how to do it without a lot of copying. The easy concept is the hard reality when you might have millions of lines in your file. -One trick is to use C<File::ReadBackwards>, which starts at the end of +One trick is to use L<File::ReadBackwards>, which starts at the end of the file. That module provides an object that wraps the real filehandle to make it easy for you to move around the file. Once you get to the spot you need, you can get the actual filehandle and work with it as normal. In this case, you get the file position at the end of the last line you want to keep and truncate the file to that point: - use File::ReadBackwards; + use File::ReadBackwards; - my $filename = 'test.txt'; - my $Lines_to_truncate = 2; + my $filename = 'test.txt'; + my $Lines_to_truncate = 2; - my $bw = File::ReadBackwards->new( $filename ) - or die "Could not read backwards in [$filename]: $!"; + my $bw = File::ReadBackwards->new( $filename ) + or die "Could not read backwards in [$filename]: $!"; - my $lines_from_end = 0; - until( $bw->eof or $lines_from_end == $Lines_to_truncate ) - { - print "Got: ", $bw->readline; - $lines_from_end++; - } + my $lines_from_end = 0; + until( $bw->eof or $lines_from_end == $Lines_to_truncate ) { + print "Got: ", $bw->readline; + $lines_from_end++; + } - truncate( $filename, $bw->tell ); + truncate( $filename, $bw->tell ); -The C<File::ReadBackwards> module also has the advantage of setting +The L<File::ReadBackwards> module also has the advantage of setting the input record separator to a regular expression. -You can also use the C<Tie::File> module which lets you access +You can also use the L<Tie::File> module which lets you access the lines through a tied array. You can use normal array operations to modify your file, including setting the last index and using C<splice>. @@ -373,19 +364,19 @@ the behavior of C<< <> >>; see L<perlrun> for more details. By modifying the appropriate variables directly, you can get the same behavior within a larger program. For example: - # ... - { - local($^I, @ARGV) = ('.orig', glob("*.c")); - while (<>) { - if ($. == 1) { - print "This line should appear at the top of each file\n"; - } - s/\b(p)earl\b/${1}erl/i; # Correct typos, preserving case - print; - close ARGV if eof; # Reset $. - } - } - # $^I and @ARGV return to their old values here + # ... + { + local($^I, @ARGV) = ('.orig', glob("*.c")); + while (<>) { + if ($. == 1) { + print "This line should appear at the top of each file\n"; + } + s/\b(p)earl\b/${1}erl/i; # Correct typos, preserving case + print; + close ARGV if eof; # Reset $. + } + } + # $^I and @ARGV return to their old values here This block modifies all the C<.c> files in the current directory, leaving a backup of the original data from each file in a new @@ -396,15 +387,15 @@ X<copy> X<file, copy> X<File::Copy> (contributed by brian d foy) -Use the C<File::Copy> module. It comes with Perl and can do a +Use the L<File::Copy> module. It comes with Perl and can do a true copy across file systems, and it does its magic in a portable fashion. - use File::Copy; + use File::Copy; - copy( $original, $new_copy ) or die "Copy failed: $!"; + copy( $original, $new_copy ) or die "Copy failed: $!"; -If you can't use C<File::Copy>, you'll have to do the work yourself: +If you can't use L<File::Copy>, you'll have to do the work yourself: open the original file, open the destination file, then print to the destination file as you read the original. You also have to remember to copy the permissions, owner, and group to the new file. @@ -416,55 +407,54 @@ If you don't need to know the name of the file, you can use C<open()> with C<undef> in place of the file name. In Perl 5.8 or later, the C<open()> function creates an anonymous temporary file: - open my $tmp, '+>', undef or die $!; + open my $tmp, '+>', undef or die $!; Otherwise, you can use the File::Temp module. - use File::Temp qw/ tempfile tempdir /; + use File::Temp qw/ tempfile tempdir /; - my $dir = tempdir( CLEANUP => 1 ); - ($fh, $filename) = tempfile( DIR => $dir ); + my $dir = tempdir( CLEANUP => 1 ); + ($fh, $filename) = tempfile( DIR => $dir ); - # or if you don't need to know the filename + # or if you don't need to know the filename - my $fh = tempfile( DIR => $dir ); + my $fh = tempfile( DIR => $dir ); The File::Temp has been a standard module since Perl 5.6.1. If you don't have a modern enough Perl installed, use the C<new_tmpfile> class method from the IO::File module to get a filehandle opened for reading and writing. Use it if you don't need to know the file's name: - use IO::File; - my $fh = IO::File->new_tmpfile() - or die "Unable to make new temporary file: $!"; + use IO::File; + my $fh = IO::File->new_tmpfile() + or die "Unable to make new temporary file: $!"; If you're committed to creating a temporary file by hand, use the process ID and/or the current time-value. If you need to have many temporary files in one process, use a counter: - BEGIN { - use Fcntl; - my $temp_dir = -d '/tmp' ? '/tmp' : $ENV{TMPDIR} || $ENV{TEMP}; - my $base_name = sprintf "%s/%d-%d-0000", $temp_dir, $$, time; - - sub temp_file { - my $fh; - my $count = 0; - until( defined(fileno($fh)) || $count++ > 100 ) { - $base_name =~ s/-(\d+)$/"-" . (1 + $1)/e; - # O_EXCL is required for security reasons. - sysopen $fh, $base_name, O_WRONLY|O_EXCL|O_CREAT; - } - - if( defined fileno($fh) ) { - return ($fh, $base_name); - } - else { - return (); - } - } - - } + BEGIN { + use Fcntl; + my $temp_dir = -d '/tmp' ? '/tmp' : $ENV{TMPDIR} || $ENV{TEMP}; + my $base_name = sprintf "%s/%d-%d-0000", $temp_dir, $$, time; + + sub temp_file { + my $fh; + my $count = 0; + until( defined(fileno($fh)) || $count++ > 100 ) { + $base_name =~ s/-(\d+)$/"-" . (1 + $1)/e; + # O_EXCL is required for security reasons. + sysopen $fh, $base_name, O_WRONLY|O_EXCL|O_CREAT; + } + + if( defined fileno($fh) ) { + return ($fh, $base_name); + } + else { + return (); + } + } + } =head2 How can I manipulate fixed-record-length files? X<fixed-length> X<file, fixed-length records> @@ -478,20 +468,20 @@ Here is a sample chunk of code to break up and put back together again some fixed-format input lines, in this case from the output of a normal, Berkeley-style ps: - # sample input line: - # 15158 p5 T 0:00 perl /home/tchrist/scripts/now-what - my $PS_T = 'A6 A4 A7 A5 A*'; - open my $ps, '-|', 'ps'; - print scalar <$ps>; - my @fields = qw( pid tt stat time command ); - while (<$ps>) { - my %process; - @process{@fields} = unpack($PS_T, $_); - for my $field ( @fields ) { - print "$field: <$process{$field}>\n"; - } - print 'line=', pack($PS_T, @process{@fields} ), "\n"; - } + # sample input line: + # 15158 p5 T 0:00 perl /home/tchrist/scripts/now-what + my $PS_T = 'A6 A4 A7 A5 A*'; + open my $ps, '-|', 'ps'; + print scalar <$ps>; + my @fields = qw( pid tt stat time command ); + while (<$ps>) { + my %process; + @process{@fields} = unpack($PS_T, $_); + for my $field ( @fields ) { + print "$field: <$process{$field}>\n"; + } + print 'line=', pack($PS_T, @process{@fields} ), "\n"; + } We've used a hash slice in order to easily handle the fields of each row. Storing the keys in an array makes it easy to operate on them as a @@ -506,13 +496,13 @@ as references if you pass it an uninitialized scalar variable. You can then pass these references just like any other scalar, and use them in the place of named handles. - open my $fh, $file_name; + open my $fh, $file_name; - open local $fh, $file_name; + open local $fh, $file_name; - print $fh "Hello World!\n"; + print $fh "Hello World!\n"; - process_file( $fh ); + process_file( $fh ); If you like, you can store these filehandles in an array or a hash. If you access them directly, they aren't simple scalars and you @@ -520,21 +510,21 @@ need to give C<print> a little help by placing the filehandle reference in braces. Perl can only figure it out on its own when the filehandle reference is a simple scalar. - my @fhs = ( $fh1, $fh2, $fh3 ); + my @fhs = ( $fh1, $fh2, $fh3 ); - for( $i = 0; $i <= $#fhs; $i++ ) { - print {$fhs[$i]} "just another Perl answer, \n"; - } + for( $i = 0; $i <= $#fhs; $i++ ) { + print {$fhs[$i]} "just another Perl answer, \n"; + } Before perl5.6, you had to deal with various typeglob idioms which you may see in older code. - open FILE, "> $filename"; - process_typeglob( *FILE ); - process_reference( \*FILE ); + open FILE, "> $filename"; + process_typeglob( *FILE ); + process_reference( \*FILE ); - sub process_typeglob { local *FH = shift; print FH "Typeglob!" } - sub process_reference { local $fh = shift; print $fh "Reference!" } + sub process_typeglob { local *FH = shift; print FH "Typeglob!" } + sub process_reference { local $fh = shift; print $fh "Reference!" } If you want to create many anonymous handles, you should check out the Symbol or IO::Handle modules. @@ -546,17 +536,17 @@ An indirect filehandle is the use of something other than a symbol in a place that a filehandle is expected. Here are ways to get indirect filehandles: - $fh = SOME_FH; # bareword is strict-subs hostile - $fh = "SOME_FH"; # strict-refs hostile; same package only - $fh = *SOME_FH; # typeglob - $fh = \*SOME_FH; # ref to typeglob (bless-able) - $fh = *SOME_FH{IO}; # blessed IO::Handle from *SOME_FH typeglob + $fh = SOME_FH; # bareword is strict-subs hostile + $fh = "SOME_FH"; # strict-refs hostile; same package only + $fh = *SOME_FH; # typeglob + $fh = \*SOME_FH; # ref to typeglob (bless-able) + $fh = *SOME_FH{IO}; # blessed IO::Handle from *SOME_FH typeglob Or, you can use the C<new> method from one of the IO::* modules to create an anonymous filehandle and store that in a scalar variable. - use IO::Handle; # 5.004 or higher - my $fh = IO::Handle->new(); + use IO::Handle; # 5.004 or higher + my $fh = IO::Handle->new(); Then use any of those as you would a normal filehandle. Anywhere that Perl is expecting a filehandle, an indirect filehandle may be used @@ -565,32 +555,32 @@ a filehandle. Functions like C<print>, C<open>, C<seek>, or the C<< <FH> >> diamond operator will accept either a named filehandle or a scalar variable containing one: - ($ifh, $ofh, $efh) = (*STDIN, *STDOUT, *STDERR); - print $ofh "Type it: "; - my $got = <$ifh> - print $efh "What was that: $got"; + ($ifh, $ofh, $efh) = (*STDIN, *STDOUT, *STDERR); + print $ofh "Type it: "; + my $got = <$ifh> + print $efh "What was that: $got"; If you're passing a filehandle to a function, you can write the function in two ways: - sub accept_fh { - my $fh = shift; - print $fh "Sending to indirect filehandle\n"; - } + sub accept_fh { + my $fh = shift; + print $fh "Sending to indirect filehandle\n"; + } Or it can localize a typeglob and use the filehandle directly: - sub accept_fh { - local *FH = shift; - print FH "Sending to localized filehandle\n"; - } + sub accept_fh { + local *FH = shift; + print FH "Sending to localized filehandle\n"; + } Both styles work with either objects or typeglobs of real filehandles. (They might also work with strings under some circumstances, but this is risky.) - accept_fh(*STDOUT); - accept_fh($handle); + accept_fh(*STDOUT); + accept_fh($handle); In the examples above, we assigned the filehandle to a scalar variable before using it. That is because only simple scalar variables, not @@ -599,24 +589,24 @@ built-ins like C<print>, C<printf>, or the diamond operator. Using something other than a simple scalar variable as a filehandle is illegal and won't even compile: - my @fd = (*STDIN, *STDOUT, *STDERR); - print $fd[1] "Type it: "; # WRONG - my $got = <$fd[0]> # WRONG - print $fd[2] "What was that: $got"; # WRONG + my @fd = (*STDIN, *STDOUT, *STDERR); + print $fd[1] "Type it: "; # WRONG + my $got = <$fd[0]> # WRONG + print $fd[2] "What was that: $got"; # WRONG With C<print> and C<printf>, you get around this by using a block and an expression where you would place the filehandle: - print { $fd[1] } "funny stuff\n"; - printf { $fd[1] } "Pity the poor %x.\n", 3_735_928_559; - # Pity the poor deadbeef. + print { $fd[1] } "funny stuff\n"; + printf { $fd[1] } "Pity the poor %x.\n", 3_735_928_559; + # Pity the poor deadbeef. That block is a proper block like any other, so you can put more complicated code there. This sends the message out to one of two places: - my $ok = -x "/bin/cat"; - print { $ok ? $fd[1] : $fd[2] } "cat stat $ok\n"; - print { $fd[ 1+ ($ok || 0) ] } "cat stat $ok\n"; + my $ok = -x "/bin/cat"; + print { $ok ? $fd[1] : $fd[2] } "cat stat $ok\n"; + print { $fd[ 1+ ($ok || 0) ] } "cat stat $ok\n"; This approach of treating C<print> and C<printf> like object methods calls doesn't work for the diamond operator. That's because it's a @@ -627,7 +617,7 @@ as C<< <> >> does. Given the initialization shown above for @fd, this would work, but only because readline() requires a typeglob. It doesn't work with objects or strings, which might be a bug we haven't fixed yet. - $got = readline($fd[0]); + $got = readline($fd[0]); Let it be noted that the flakiness of indirect filehandles is not related to whether they're strings, typeglobs, objects, or anything else. @@ -648,8 +638,8 @@ X<write, into a string> If you want to C<write> into a string, you just have to <open> a filehandle to a string, which Perl has been able to do since Perl 5.6: - open FH, '>', \my $string; - write( FH ); + open FH, '>', \my $string; + write( FH ); Since you want to be a good programmer, you probably want to use a lexical filehandle, even though formats are designed to work with bareword filehandles @@ -658,34 +648,34 @@ control this with some Perl special per-filehandle variables: C<$^>, which names the top-of-page format, and C<$~> which shows the line format. You have to change the default filehandle to set these variables: - open my($fh), '>', \my $string; + open my($fh), '>', \my $string; - { # set per-filehandle variables - my $old_fh = select( $fh ); - $~ = 'ANIMAL'; - $^ = 'ANIMAL_TOP'; - select( $old_fh ); - } + { # set per-filehandle variables + my $old_fh = select( $fh ); + $~ = 'ANIMAL'; + $^ = 'ANIMAL_TOP'; + select( $old_fh ); + } - format ANIMAL_TOP = - ID Type Name - . + format ANIMAL_TOP = + ID Type Name + . - format ANIMAL = - @## @<<< @<<<<<<<<<<<<<< - $id, $type, $name - . + format ANIMAL = + @## @<<< @<<<<<<<<<<<<<< + $id, $type, $name + . Although write can work with lexical or package variables, whatever variables you use have to scope in the format. That most likely means you'll want to localize some package variables: - { - local( $id, $type, $name ) = qw( 12 cat Buster ); - write( $fh ); - } + { + local( $id, $type, $name ) = qw( 12 cat Buster ); + write( $fh ); + } - print $string; + print $string; There are also some tricks that you can play with C<formline> and the accumulator variable C<$^A>, but you lose a lot of the value of formats @@ -701,14 +691,14 @@ Since Perl 5.8.0 a file handle referring to a string can be created by calling open with a reference to that string instead of the filename. This file handle can then be used to read from or write to the string: - open(my $fh, '>', \$string) or die "Could not open string for writing"; - print $fh "foo\n"; - print $fh "bar\n"; # $string now contains "foo\nbar\n" + open(my $fh, '>', \$string) or die "Could not open string for writing"; + print $fh "foo\n"; + print $fh "bar\n"; # $string now contains "foo\nbar\n" - open(my $fh, '<', \$string) or die "Could not open string for reading"; - my $x = <$fh>; # $x now contains "foo\n" + open(my $fh, '<', \$string) or die "Could not open string for reading"; + my $x = <$fh>; # $x now contains "foo\n" -With older versions of Perl, the C<IO::String> module provides similar +With older versions of Perl, the L<IO::String> module provides similar functionality. =head2 How can I output my numbers with commas added? @@ -723,29 +713,29 @@ really). This subroutine will add commas to your number: - sub commify { - local $_ = shift; - 1 while s/^([-+]?\d+)(\d{3})/$1,$2/; - return $_; - } + sub commify { + local $_ = shift; + 1 while s/^([-+]?\d+)(\d{3})/$1,$2/; + return $_; + } This regex from Benjamin Goldberg will add commas to numbers: - s/(^[-+]?\d+?(?=(?>(?:\d{3})+)(?!\d))|\G\d{3}(?=\d))/$1,/g; + s/(^[-+]?\d+?(?=(?>(?:\d{3})+)(?!\d))|\G\d{3}(?=\d))/$1,/g; It is easier to see with comments: - s/( - ^[-+]? # beginning of number. - \d+? # first digits before first comma - (?= # followed by, (but not included in the match) : - (?>(?:\d{3})+) # some positive multiple of three digits. - (?!\d) # an *exact* multiple, not x * 3 + 1 or whatever. - ) - | # or: - \G\d{3} # after the last group, get three digits - (?=\d) # but they have to have more digits after them. - )/$1,/xg; + s/( + ^[-+]? # beginning of number. + \d+? # first digits before first comma + (?= # followed by, (but not included in the match) : + (?>(?:\d{3})+) # some positive multiple of three digits. + (?!\d) # an *exact* multiple, not x * 3 + 1 or whatever. + ) + | # or: + \G\d{3} # after the last group, get three digits + (?=\d) # but they have to have more digits after them. + )/$1,/xg; =head2 How can I translate tildes (~) in a filename? X<tilde> X<tilde expansion> @@ -753,22 +743,22 @@ X<tilde> X<tilde expansion> Use the E<lt>E<gt> (C<glob()>) operator, documented in L<perlfunc>. Versions of Perl older than 5.6 require that you have a shell installed that groks tildes. Later versions of Perl have this feature -built in. The C<File::KGlob> module (available from CPAN) gives more +built in. The L<File::KGlob> module (available from CPAN) gives more portable glob functionality. Within Perl, you may use this directly: - $filename =~ s{ - ^ ~ # find a leading tilde - ( # save this in $1 - [^/] # a non-slash character - * # repeated 0 or more times (0 means me) - ) - }{ - $1 - ? (getpwnam($1))[7] - : ( $ENV{HOME} || $ENV{LOGDIR} ) - }ex; + $filename =~ s{ + ^ ~ # find a leading tilde + ( # save this in $1 + [^/] # a non-slash character + * # repeated 0 or more times (0 means me) + ) + }{ + $1 + ? (getpwnam($1))[7] + : ( $ENV{HOME} || $ENV{LOGDIR} ) + }ex; =head2 How come when I open a file read-write it wipes it out? X<clobber> X<read-write> X<clobbering> X<truncate> X<truncating> @@ -776,66 +766,66 @@ X<clobber> X<read-write> X<clobbering> X<truncate> X<truncating> Because you're using something like this, which truncates the file I<then> gives you read-write access: - open my $fh, '+>', '/path/name'; # WRONG (almost always) + open my $fh, '+>', '/path/name'; # WRONG (almost always) Whoops. You should instead use this, which will fail if the file doesn't exist: - open my $fh, '+<', '/path/name'; # open for update + open my $fh, '+<', '/path/name'; # open for update Using ">" always clobbers or creates. Using "<" never does either. The "+" doesn't change this. Here are examples of many kinds of file opens. Those using C<sysopen> -all assume that you've pulled in the constants from C<Fcntl>: +all assume that you've pulled in the constants from L<Fcntl>: - use Fcntl; + use Fcntl; To open file for reading: - open my $fh, '<', $path or die $!; - sysopen my $fh, $path, O_RDONLY or die $!; + open my $fh, '<', $path or die $!; + sysopen my $fh, $path, O_RDONLY or die $!; To open file for writing, create new file if needed or else truncate old file: - open my $fh, '>', $path or die $!; - sysopen my $fh, $path, O_WRONLY|O_TRUNC|O_CREAT or die $!; - sysopen my $fh, $path, O_WRONLY|O_TRUNC|O_CREAT, 0666 or die $!; + open my $fh, '>', $path or die $!; + sysopen my $fh, $path, O_WRONLY|O_TRUNC|O_CREAT or die $!; + sysopen my $fh, $path, O_WRONLY|O_TRUNC|O_CREAT, 0666 or die $!; To open file for writing, create new file, file must not exist: - sysopen my $fh, $path, O_WRONLY|O_EXCL|O_CREAT or die $!; - sysopen my $fh, $path, O_WRONLY|O_EXCL|O_CREAT, 0666 or die $!; + sysopen my $fh, $path, O_WRONLY|O_EXCL|O_CREAT or die $!; + sysopen my $fh, $path, O_WRONLY|O_EXCL|O_CREAT, 0666 or die $!; To open file for appending, create if necessary: - open my $fh, '>>' $path or die $!; - sysopen my $fh, $path, O_WRONLY|O_APPEND|O_CREAT or die $!; - sysopen my $fh, $path, O_WRONLY|O_APPEND|O_CREAT, 0666 or die $!; + open my $fh, '>>' $path or die $!; + sysopen my $fh, $path, O_WRONLY|O_APPEND|O_CREAT or die $!; + sysopen my $fh, $path, O_WRONLY|O_APPEND|O_CREAT, 0666 or die $!; To open file for appending, file must exist: - sysopen my $fh, $path, O_WRONLY|O_APPEND or die $!; + sysopen my $fh, $path, O_WRONLY|O_APPEND or die $!; To open file for update, file must exist: - open my $fh, '+<', $path or die $!; - sysopen my $fh, $path, O_RDWR or die $!; + open my $fh, '+<', $path or die $!; + sysopen my $fh, $path, O_RDWR or die $!; To open file for update, create file if necessary: - sysopen my $fh, $path, O_RDWR|O_CREAT or die $!; - sysopen my $fh, $path, O_RDWR|O_CREAT, 0666 or die $!; + sysopen my $fh, $path, O_RDWR|O_CREAT or die $!; + sysopen my $fh, $path, O_RDWR|O_CREAT, 0666 or die $!; To open file for update, file must not exist: - sysopen my $fh, $path, O_RDWR|O_EXCL|O_CREAT or die $!; - sysopen my $fh, $path, O_RDWR|O_EXCL|O_CREAT, 0666 or die $!; + sysopen my $fh, $path, O_RDWR|O_EXCL|O_CREAT or die $!; + sysopen my $fh, $path, O_RDWR|O_EXCL|O_CREAT, 0666 or die $!; To open a file without blocking, creating if necessary: - sysopen my $fh, '/foo/somefile', O_WRONLY|O_NDELAY|O_CREAT - or die "can't open /foo/somefile: $!": + sysopen my $fh, '/foo/somefile', O_WRONLY|O_NDELAY|O_CREAT + or die "can't open /foo/somefile: $!": Be warned that neither creation nor deletion of files is guaranteed to be an atomic operation over NFS. That is, two processes might both @@ -881,8 +871,8 @@ Unless you have a particular reason to use the two-argument form you should use the three-argument form of open() which does not treat any characters in the filename as special. - open my $fh, "<", " file "; # filename is " file " - open my $fh, ">", ">file"; # filename is ">file" + open my $fh, "<", " file "; # filename is " file " + open my $fh, ">", ">file"; # filename is ">file" =head2 How can I reliably rename a file? X<rename> X<mv> X<move> X<file, rename> @@ -890,9 +880,9 @@ X<rename> X<mv> X<move> X<file, rename> If your operating system supports a proper mv(1) utility or its functional equivalent, this works: - rename($old, $new) or system("mv", $old, $new); + rename($old, $new) or system("mv", $old, $new); -It may be more portable to use the C<File::Copy> module instead. +It may be more portable to use the L<File::Copy> module instead. You just copy to the new file to the new name (checking return values), then delete the old one. This isn't really the same semantically as a C<rename()>, which preserves meta-information like @@ -951,15 +941,15 @@ X<lock, lockfile race condition> A common bit of code B<NOT TO USE> is this: - sleep(3) while -e 'file.lock'; # PLEASE DO NOT USE - open my $lock, '>', 'file.lock'; # THIS BROKEN CODE + sleep(3) while -e 'file.lock'; # PLEASE DO NOT USE + open my $lock, '>', 'file.lock'; # THIS BROKEN CODE This is a classic race condition: you take two steps to do something which must be done in one. That's why computer hardware provides an atomic test-and-set instruction. In theory, this "ought" to work: - sysopen my $fh, "file.lock", O_WRONLY|O_EXCL|O_CREAT - or die "can't open file.lock: $!"; + sysopen my $fh, "file.lock", O_WRONLY|O_EXCL|O_CREAT + or die "can't open file.lock: $!"; except that lamentably, file creation (and deletion) is not atomic over NFS, so this won't work (at least, not every time) over the net. @@ -976,18 +966,18 @@ they're more realistic. Anyway, this is what you can do if you can't help yourself. - use Fcntl qw(:DEFAULT :flock); - sysopen my $fh, "numfile", O_RDWR|O_CREAT or die "can't open numfile: $!"; - flock $fh, LOCK_EX or die "can't flock numfile: $!"; - my $num = <$fh> || 0; - seek $fh, 0, 0 or die "can't rewind numfile: $!"; - truncate $fh, 0 or die "can't truncate numfile: $!"; - (print $fh $num+1, "\n") or die "can't write numfile: $!"; - close $fh or die "can't close numfile: $!"; + use Fcntl qw(:DEFAULT :flock); + sysopen my $fh, "numfile", O_RDWR|O_CREAT or die "can't open numfile: $!"; + flock $fh, LOCK_EX or die "can't flock numfile: $!"; + my $num = <$fh> || 0; + seek $fh, 0, 0 or die "can't rewind numfile: $!"; + truncate $fh, 0 or die "can't truncate numfile: $!"; + (print $fh $num+1, "\n") or die "can't write numfile: $!"; + close $fh or die "can't close numfile: $!"; Here's a much better web-page hit counter: - $hits = int( (time() - 850_000_000) / rand(1_000) ); + $hits = int( (time() - 850_000_000) / rand(1_000) ); If the count doesn't impress your friends, then the code might. :-) @@ -1028,20 +1018,20 @@ X<file, binary patch> If you're just trying to patch a binary, in many cases something as simple as this works: - perl -i -pe 's{window manager}{window mangler}g' /usr/bin/emacs + perl -i -pe 's{window manager}{window mangler}g' /usr/bin/emacs However, if you have fixed sized records, then you might do something more like this: - $RECSIZE = 220; # size of record, in bytes - $recno = 37; # which record to update - open my $fh, '+<', 'somewhere' or die "can't update somewhere: $!"; - seek $fh, $recno * $RECSIZE, 0; - read $fh, $record, $RECSIZE == $RECSIZE or die "can't read record $recno: $!"; - # munge the record - seek $fh, -$RECSIZE, 1; - print $fh $record; - close $fh; + $RECSIZE = 220; # size of record, in bytes + $recno = 37; # which record to update + open my $fh, '+<', 'somewhere' or die "can't update somewhere: $!"; + seek $fh, $recno * $RECSIZE, 0; + read $fh, $record, $RECSIZE == $RECSIZE or die "can't read record $recno: $!"; + # munge the record + seek $fh, -$RECSIZE, 1; + print $fh $record; + close $fh; Locking and error checking are left as an exercise for the reader. Don't forget them or you'll be quite sorry. @@ -1061,18 +1051,18 @@ C<POSIX::strftime()> to convert this into human-readable form. Here's an example: - my $write_secs = (stat($file))[9]; - printf "file %s updated at %s\n", $file, - scalar localtime($write_secs); + my $write_secs = (stat($file))[9]; + printf "file %s updated at %s\n", $file, + scalar localtime($write_secs); If you prefer something more legible, use the File::stat module (part of the standard distribution in version 5.004 and later): - # error checking left as an exercise for reader. - use File::stat; - use Time::localtime; - my $date_string = ctime(stat($file)->mtime); - print "file $file updated at $date_string\n"; + # error checking left as an exercise for reader. + use File::stat; + use Time::localtime; + my $date_string = ctime(stat($file)->mtime); + print "file $file updated at $date_string\n"; The POSIX::strftime() approach has the benefit of being, in theory, independent of the current locale. See L<perllocale> @@ -1086,12 +1076,12 @@ By way of example, here's a little program that copies the read and write times from its first argument to all the rest of them. - if (@ARGV < 2) { - die "usage: cptimes timestamp_file other_files ...\n"; - } - my $timestamp = shift; - my($atime, $mtime) = (stat($timestamp))[8,9]; - utime $atime, $mtime, @ARGV; + if (@ARGV < 2) { + die "usage: cptimes timestamp_file other_files ...\n"; + } + my $timestamp = shift; + my($atime, $mtime) = (stat($timestamp))[8,9]; + utime $atime, $mtime, @ARGV; Error checking is, as usual, left as an exercise for the reader. @@ -1113,7 +1103,7 @@ you can use the IO::Tee or Tie::FileHandle::Multiplex modules. If you only have to do this once, you can print individually to each filehandle. - for my $fh (FH1, FH2, FH3) { print $fh "whatever\n" } + for my $fh (FH1, FH2, FH3) { print $fh "whatever\n" } =head2 How can I read in an entire file all at once? X<slurp> X<file, slurping> @@ -1121,19 +1111,19 @@ X<slurp> X<file, slurping> The customary Perl approach for processing all the lines in a file is to do so one line at a time: - open my $input, '<', $file or die "can't open $file: $!"; - while (<$input>) { - chomp; - # do something with $_ - } - close $input or die "can't close $file: $!"; + open my $input, '<', $file or die "can't open $file: $!"; + while (<$input>) { + chomp; + # do something with $_ + } + close $input or die "can't close $file: $!"; This is tremendously more efficient than reading the entire file into memory as an array of lines and then processing it one element at a time, which is often--if not almost always--the wrong approach. Whenever you see someone do this: - my @lines = <INPUT>; + my @lines = <INPUT>; You should think long and hard about why you need everything loaded at once. It's just not a scalable solution. @@ -1142,49 +1132,49 @@ If you "mmap" the file with the File::Map module from CPAN, you can virtually load the entire file into a string without actually storing it in memory: - use File::Map qw(map_file); + use File::Map qw(map_file); - map_file my $string, $filename; + map_file my $string, $filename; Once mapped, you can treat C<$string> as you would any other string. Since you don't necessarily have to load the data, mmap-ing can be very fast and may not increase your memory footprint. You might also find it more -fun to use the standard C<Tie::File> module, or the C<DB_File> module's +fun to use the standard L<Tie::File> module, or the L<DB_File> module's C<$DB_RECNO> bindings, which allow you to tie an array to a file so that accessing an element of the array actually accesses the corresponding line in the file. -If you want to load the entire file, you can use the C<File::Slurp> +If you want to load the entire file, you can use the L<File::Slurp> module to do it in one one simple and efficient step: - use File::Slurp; + use File::Slurp; - my $all_of_it = read_file($filename); # entire file in scalar - my @all_lines = read_file($filename); # one line per element + my $all_of_it = read_file($filename); # entire file in scalar + my @all_lines = read_file($filename); # one line per element Or you can read the entire file contents into a scalar like this: - my $var; - { - local $/; - open my $fh, '<', $file or die "can't open $file: $!"; - $var = <$fh>; - } + my $var; + { + local $/; + open my $fh, '<', $file or die "can't open $file: $!"; + $var = <$fh>; + } That temporarily undefs your record separator, and will automatically close the file at block exit. If the file is already open, just use this: - my $var = do { local $/; <$fh> }; + my $var = do { local $/; <$fh> }; You can also use a localized C<@ARGV> to eliminate the C<open>: - my $var = do { local( @ARGV, $/ ) = $file; <> }; + my $var = do { local( @ARGV, $/ ) = $file; <> }; For ordinary files you can also use the C<read> function. - read( $fh, $var, -s $fh ); + read( $fh, $var, -s $fh ); That third argument tests the byte size of the data on the C<$fh> filehandle and reads that many bytes into the buffer C<$var>. @@ -1212,65 +1202,64 @@ If your system supports the portable operating system programming interface (POSIX), you can use the following code, which you'll note turns off echo processing as well. - #!/usr/bin/perl -w - use strict; - $| = 1; - for (1..4) { - print "gimme: "; - my $got = getone(); - print "--> $got\n"; - } + #!/usr/bin/perl -w + use strict; + $| = 1; + for (1..4) { + print "gimme: "; + my $got = getone(); + print "--> $got\n"; + } exit; - BEGIN { - use POSIX qw(:termios_h); + BEGIN { + use POSIX qw(:termios_h); - my ($term, $oterm, $echo, $noecho, $fd_stdin); + my ($term, $oterm, $echo, $noecho, $fd_stdin); - my $fd_stdin = fileno(STDIN); + my $fd_stdin = fileno(STDIN); - $term = POSIX::Termios->new(); - $term->getattr($fd_stdin); - $oterm = $term->getlflag(); + $term = POSIX::Termios->new(); + $term->getattr($fd_stdin); + $oterm = $term->getlflag(); - $echo = ECHO | ECHOK | ICANON; - $noecho = $oterm & ~$echo; + $echo = ECHO | ECHOK | ICANON; + $noecho = $oterm & ~$echo; - sub cbreak { - $term->setlflag($noecho); - $term->setcc(VTIME, 1); - $term->setattr($fd_stdin, TCSANOW); - } + sub cbreak { + $term->setlflag($noecho); + $term->setcc(VTIME, 1); + $term->setattr($fd_stdin, TCSANOW); + } - sub cooked { - $term->setlflag($oterm); - $term->setcc(VTIME, 0); - $term->setattr($fd_stdin, TCSANOW); - } + sub cooked { + $term->setlflag($oterm); + $term->setcc(VTIME, 0); + $term->setattr($fd_stdin, TCSANOW); + } - sub getone { - my $key = ''; - cbreak(); - sysread(STDIN, $key, 1); - cooked(); - return $key; - } + sub getone { + my $key = ''; + cbreak(); + sysread(STDIN, $key, 1); + cooked(); + return $key; + } + } - } - - END { cooked() } + END { cooked() } The Term::ReadKey module from CPAN may be easier to use. Recent versions include also support for non-portable systems as well. - use Term::ReadKey; - open my $tty, '<', '/dev/tty'; - print "Gimme a char: "; - ReadMode "raw"; - my $key = ReadKey 0, $tty; - ReadMode "normal"; - printf "\nYou said %s, char number %03d\n", - $key, ord $key; + use Term::ReadKey; + open my $tty, '<', '/dev/tty'; + print "Gimme a char: "; + ReadMode "raw"; + my $key = ReadKey 0, $tty; + ReadMode "normal"; + printf "\nYou said %s, char number %03d\n", + $key, ord $key; =head2 How can I tell whether there's a character waiting on a filehandle? @@ -1284,11 +1273,11 @@ comp.unix.* for things like this: the answer is essentially the same. It's very system-dependent. Here's one solution that works on BSD systems: - sub key_ready { - my($rin, $nfd); - vec($rin, fileno(STDIN), 1) = 1; - return $nfd = select($rin,undef,undef,0); - } + sub key_ready { + my($rin, $nfd); + vec($rin, fileno(STDIN), 1) = 1; + return $nfd = select($rin,undef,undef,0); + } If you want to find out how many characters are waiting, there's also the FIONREAD ioctl call to be looked at. The I<h2ph> tool that @@ -1296,37 +1285,37 @@ comes with Perl tries to convert C include files to Perl code, which can be C<require>d. FIONREAD ends up defined as a function in the I<sys/ioctl.ph> file: - require 'sys/ioctl.ph'; + require 'sys/ioctl.ph'; - $size = pack("L", 0); - ioctl(FH, FIONREAD(), $size) or die "Couldn't call ioctl: $!\n"; - $size = unpack("L", $size); + $size = pack("L", 0); + ioctl(FH, FIONREAD(), $size) or die "Couldn't call ioctl: $!\n"; + $size = unpack("L", $size); If I<h2ph> wasn't installed or doesn't work for you, you can I<grep> the include files by hand: - % grep FIONREAD /usr/include/*/* - /usr/include/asm/ioctls.h:#define FIONREAD 0x541B + % grep FIONREAD /usr/include/*/* + /usr/include/asm/ioctls.h:#define FIONREAD 0x541B Or write a small C program using the editor of champions: - % cat > fionread.c - #include <sys/ioctl.h> - main() { - printf("%#08x\n", FIONREAD); - } - ^D - % cc -o fionread fionread.c - % ./fionread - 0x4004667f + % cat > fionread.c + #include <sys/ioctl.h> + main() { + printf("%#08x\n", FIONREAD); + } + ^D + % cc -o fionread fionread.c + % ./fionread + 0x4004667f And then hard-code it, leaving porting as an exercise to your successor. - $FIONREAD = 0x4004667f; # XXX: opsys dependent + $FIONREAD = 0x4004667f; # XXX: opsys dependent - $size = pack("L", 0); - ioctl(FH, $FIONREAD, $size) or die "Couldn't call ioctl: $!\n"; - $size = unpack("L", $size); + $size = pack("L", 0); + ioctl(FH, $FIONREAD, $size) or die "Couldn't call ioctl: $!\n"; + $size = unpack("L", $size); FIONREAD requires a filehandle connected to a stream, meaning that sockets, pipes, and tty devices work, but I<not> files. @@ -1336,7 +1325,7 @@ X<tail> X<IO::Handle> X<File::Tail> X<clearerr> First try - seek(GWFILE, 0, 1); + seek(GWFILE, 0, 1); The statement C<seek(GWFILE, 0, 1)> doesn't change the current position, but it does clear the end-of-file condition on the handle, so that the @@ -1345,19 +1334,19 @@ next C<< <GWFILE> >> makes Perl try again to read something. If that doesn't work (it relies on features of your stdio implementation), then you need something more like this: - for (;;) { - for ($curpos = tell(GWFILE); <GWFILE>; $curpos = tell(GWFILE)) { - # search for some stuff and put it into files - } - # sleep for a while - seek(GWFILE, $curpos, 0); # seek to where we had been - } + for (;;) { + for ($curpos = tell(GWFILE); <GWFILE>; $curpos = tell(GWFILE)) { + # search for some stuff and put it into files + } + # sleep for a while + seek(GWFILE, $curpos, 0); # seek to where we had been + } If this still doesn't work, look into the C<clearerr> method -from C<IO::Handle>, which resets the error and end-of-file states +from L<IO::Handle>, which resets the error and end-of-file states on the handle. -There's also a C<File::Tail> module from CPAN. +There's also a L<File::Tail> module from CPAN. =head2 How do I dup() a filehandle in Perl? X<dup> @@ -1365,13 +1354,13 @@ X<dup> If you check L<perlfunc/open>, you'll see that several of the ways to call open() should do the trick. For example: - open my $log, '>>', '/foo/logfile'; - open STDERR, '>&LOG'; + open my $log, '>>', '/foo/logfile'; + open STDERR, '>&LOG'; Or even with a literal numeric descriptor: - my $fd = $ENV{MHCONTEXTFD}; - open $mhcontext, "<&=$fd"; # like fdopen(3S) + my $fd = $ENV{MHCONTEXTFD}; + open $mhcontext, "<&=$fd"; # like fdopen(3S) Note that "<&STDIN" makes a copy, but "<&=STDIN" makes an alias. That means if you close an aliased handle, all @@ -1385,27 +1374,27 @@ X<file, closing file descriptors> X<POSIX> X<close> If, for some reason, you have a file descriptor instead of a filehandle (perhaps you used C<POSIX::open>), you can use the -C<close()> function from the C<POSIX> module: +C<close()> function from the L<POSIX> module: - use POSIX (); + use POSIX (); - POSIX::close( $fd ); + POSIX::close( $fd ); This should rarely be necessary, as the Perl C<close()> function is to be used for things that Perl opened itself, even if it was a dup of a numeric descriptor as with C<MHCONTEXT> above. But if you really have to, you may be able to do this: - require 'sys/syscall.ph'; - my $rc = syscall(&SYS_close, $fd + 0); # must force numeric - die "can't sysclose $fd: $!" unless $rc == -1; + require 'sys/syscall.ph'; + my $rc = syscall(&SYS_close, $fd + 0); # must force numeric + die "can't sysclose $fd: $!" unless $rc == -1; Or, just use the fdopen(3S) feature of C<open()>: - { - open my( $fh ), "<&=$fd" or die "Cannot reopen fd=$fd: $!"; - close $fh; - } + { + open my( $fh ), "<&=$fd" or die "Cannot reopen fd=$fd: $!"; + close $fh; + } =head2 Why can't I use "C:\temp\foo" in DOS paths? Why doesn't `C:\temp\foo.exe` work? X<filename, DOS issues> @@ -1455,20 +1444,20 @@ the file, there are a couple of things that you can do. Here's a reservoir-sampling algorithm from the Camel Book: - srand; - rand($.) < 1 && ($line = $_) while <>; + srand; + rand($.) < 1 && ($line = $_) while <>; This has a significant advantage in space over reading the whole file in. You can find a proof of this method in I<The Art of Computer Programming>, Volume 2, Section 3.4.2, by Donald E. Knuth. -You can use the C<File::Random> module which provides a function +You can use the L<File::Random> module which provides a function for that algorithm: - use File::Random qw/random_line/; - my $line = random_line($filename); + use File::Random qw/random_line/; + my $line = random_line($filename); -Another way is to use the C<Tie::File> module, which treats the entire +Another way is to use the L<Tie::File> module, which treats the entire file as an array. Simply access a random array element. =head2 Why do I get weird spaces when I print an array of lines? @@ -1479,82 +1468,82 @@ If you are seeing spaces between the elements of your array when you print the array, you are probably interpolating the array in double quotes: - my @animals = qw(camel llama alpaca vicuna); - print "animals are: @animals\n"; + my @animals = qw(camel llama alpaca vicuna); + print "animals are: @animals\n"; It's the double quotes, not the C<print>, doing this. Whenever you interpolate an array in a double quote context, Perl joins the elements with spaces (or whatever is in C<$">, which is a space by default): - animals are: camel llama alpaca vicuna + animals are: camel llama alpaca vicuna This is different than printing the array without the interpolation: - my @animals = qw(camel llama alpaca vicuna); - print "animals are: ", @animals, "\n"; + my @animals = qw(camel llama alpaca vicuna); + print "animals are: ", @animals, "\n"; Now the output doesn't have the spaces between the elements because the elements of C<@animals> simply become part of the list to C<print>: - animals are: camelllamaalpacavicuna + animals are: camelllamaalpacavicuna You might notice this when each of the elements of C<@array> end with a newline. You expect to print one element per line, but notice that every line after the first is indented: - this is a line - this is another line - this is the third line + this is a line + this is another line + this is the third line That extra space comes from the interpolation of the array. If you don't want to put anything between your array elements, don't use the array in double quotes. You can send it to print without them: - print @lines; + print @lines; =head2 How do I traverse a directory tree? (contributed by brian d foy) -The C<File::Find> module, which comes with Perl, does all of the hard +The L<File::Find> module, which comes with Perl, does all of the hard work to traverse a directory structure. It comes with Perl. You simply call the C<find> subroutine with a callback subroutine and the directories you want to traverse: - use File::Find; + use File::Find; - find( \&wanted, @directories ); + find( \&wanted, @directories ); - sub wanted { - # full path in $File::Find::name - # just filename in $_ - ... do whatever you want to do ... - } + sub wanted { + # full path in $File::Find::name + # just filename in $_ + ... do whatever you want to do ... + } -The C<File::Find::Closures>, which you can download from CPAN, provides -many ready-to-use subroutines that you can use with C<File::Find>. +The L<File::Find::Closures>, which you can download from CPAN, provides +many ready-to-use subroutines that you can use with L<File::Find>. -The C<File::Finder>, which you can download from CPAN, can help you +The L<File::Finder>, which you can download from CPAN, can help you create the callback subroutine using something closer to the syntax of the C<find> command-line utility: - use File::Find; - use File::Finder; + use File::Find; + use File::Finder; - my $deep_dirs = File::Finder->depth->type('d')->ls->exec('rmdir','{}'); + my $deep_dirs = File::Finder->depth->type('d')->ls->exec('rmdir','{}'); - find( $deep_dirs->as_options, @places ); + find( $deep_dirs->as_options, @places ); -The C<File::Find::Rule> module, which you can download from CPAN, has +The L<File::Find::Rule> module, which you can download from CPAN, has a similar interface, but does the traversal for you too: - use File::Find::Rule; + use File::Find::Rule; - my @files = File::Find::Rule->file() - ->name( '*.pm' ) - ->in( @INC ); + my @files = File::Find::Rule->file() + ->name( '*.pm' ) + ->in( @INC ); =head2 How do I delete a directory tree? @@ -1565,14 +1554,14 @@ If the directory is not empty (so, no files or subdirectories), you either have to empty it yourself (a lot of work) or use a module to help you. -The C<File::Path> module, which comes with Perl, has a C<remove_tree> +The L<File::Path> module, which comes with Perl, has a C<remove_tree> which can take care of all of the hard work for you: - use File::Path qw(remove_tree); + use File::Path qw(remove_tree); - remove_tree( @directories ); + remove_tree( @directories ); -The C<File::Path> module also has a legacy interface to the older +The L<File::Path> module also has a legacy interface to the older C<rmtree> subroutine. =head2 How do I copy an entire directory? diff --git a/cpan/perlfaq/lib/perlfaq6.pod b/cpan/perlfaq/lib/perlfaq6.pod index d72fee946b..7327b68d95 100644 --- a/cpan/perlfaq/lib/perlfaq6.pod +++ b/cpan/perlfaq/lib/perlfaq6.pod @@ -26,9 +26,9 @@ understandable. Describe what you're doing and how you're doing it, using normal Perl comments. - # turn the line into the first word, a colon, and the - # number of characters on the rest of the line - s/^(\w+)(.*)/ lc($1) . ":" . length($2) /meg; + # turn the line into the first word, a colon, and the + # number of characters on the rest of the line + s/^(\w+)(.*)/ lc($1) . ":" . length($2) /meg; =item Comments Inside the Regex @@ -39,20 +39,20 @@ help a lot. C</x> lets you turn this: - s{<(?:[^>'"]*|".*?"|'.*?')+>}{}gs; + s{<(?:[^>'"]*|".*?"|'.*?')+>}{}gs; into this: - s{ < # opening angle bracket - (?: # Non-backreffing grouping paren - [^>'"] * # 0 or more things that are neither > nor ' nor " - | # or else - ".*?" # a section between double quotes (stingy match) - | # or else - '.*?' # a section between single quotes (stingy match) - ) + # all occurring one or more times - > # closing angle bracket - }{}gsx; # replace with nothing, i.e. delete + s{ < # opening angle bracket + (?: # Non-backreffing grouping paren + [^>'"] * # 0 or more things that are neither > nor ' nor " + | # or else + ".*?" # a section between double quotes (stingy match) + | # or else + '.*?' # a section between single quotes (stingy match) + ) + # all occurring one or more times + > # closing angle bracket + }{}gsx; # replace with nothing, i.e. delete It's still not quite so clear as prose, but it is very useful for describing the meaning of each part of the pattern. @@ -65,8 +65,8 @@ describes this. For example, the C<s///> above uses braces as delimiters. Selecting another delimiter can avoid quoting the delimiter within the pattern: - s/\/usr\/local/\/usr\/share/g; # bad delimiter choice - s#/usr/local#/usr/share#g; # better + s/\/usr\/local/\/usr\/share/g; # bad delimiter choice + s#/usr/local#/usr/share#g; # better =back @@ -97,31 +97,31 @@ to newlines. But it's imperative that $/ be set to something other than the default, or else we won't actually ever have a multiline record read in. - $/ = ''; # read in whole paragraph, not just one line - while ( <> ) { - while ( /\b([\w'-]+)(\s+\g1)+\b/gi ) { # word starts alpha - print "Duplicate $1 at paragraph $.\n"; - } - } + $/ = ''; # read in whole paragraph, not just one line + while ( <> ) { + while ( /\b([\w'-]+)(\s+\g1)+\b/gi ) { # word starts alpha + print "Duplicate $1 at paragraph $.\n"; + } + } Here's code that finds sentences that begin with "From " (which would be mangled by many mailers): - $/ = ''; # read in whole paragraph, not just one line - while ( <> ) { - while ( /^From /gm ) { # /m makes ^ match next to \n - print "leading from in paragraph $.\n"; - } - } + $/ = ''; # read in whole paragraph, not just one line + while ( <> ) { + while ( /^From /gm ) { # /m makes ^ match next to \n + print "leading from in paragraph $.\n"; + } + } Here's code that finds everything between START and END in a paragraph: - undef $/; # read in whole file, not just one line or paragraph - while ( <> ) { - while ( /START(.*?)END/sgm ) { # /s makes . cross line boundaries - print "$1\n"; - } - } + undef $/; # read in whole file, not just one line or paragraph + while ( <> ) { + while ( /START(.*?)END/sgm ) { # /s makes . cross line boundaries + print "$1\n"; + } + } =head2 How can I pull out lines between two patterns that are themselves on different lines? X<..> @@ -129,11 +129,11 @@ X<..> You can use Perl's somewhat exotic C<..> operator (documented in L<perlop>): - perl -ne 'print if /START/ .. /END/' file1 file2 ... + perl -ne 'print if /START/ .. /END/' file1 file2 ... If you wanted text and not lines, you would use - perl -0777 -ne 'print "$1\n" while /START(.*?)END/gs' file1 file2 ... + perl -0777 -ne 'print "$1\n" while /START(.*?)END/gs' file1 file2 ... But if you want nested occurrences of C<START> through C<END>, you'll run up against the problem described in the question in this section @@ -141,13 +141,13 @@ on matching balanced text. Here's another example of using C<..>: - while (<>) { - $in_header = 1 .. /^$/; - $in_body = /^$/ .. eof; - # now choose between them - } continue { - $. = 0 if eof; # fix $. - } + while (<>) { + $in_header = 1 .. /^$/; + $in_body = /^$/ .. eof; + # now choose between them + } continue { + $. = 0 if eof; # fix $. + } =head2 How do I match XML, HTML, or other nasty, ugly things with a regex? X<regex, XML> X<regex, HTML> X<XML> X<HTML> X<pain> X<frustration> @@ -156,7 +156,7 @@ X<sucking out, will to live> (contributed by brian d foy) If you just want to get work done, use a module and forget about the -regular expressions. The C<XML::Parser> and C<HTML::Parser> modules +regular expressions. The L<XML::Parser> and L<HTML::Parser> modules are good starts, although each namespace has other parsing modules specialized for certain tasks and different ways of doing it. Start at CPAN Search ( http://search.cpan.org ) and wonder at all the work people @@ -199,14 +199,14 @@ do this. If you have File::Stream, this is easy. - use File::Stream; + use File::Stream; - my $stream = File::Stream->new( - $filehandle, - separator => qr/\s*,\s*/, - ); + my $stream = File::Stream->new( + $filehandle, + separator => qr/\s*,\s*/, + ); - print "$_\n" while <$stream>; + print "$_\n" while <$stream>; If you don't have File::Stream, you have to do a little more work. @@ -214,25 +214,25 @@ You can use the four-argument form of sysread to continually add to a buffer. After you add to the buffer, you check if you have a complete line (using your regular expression). - local $_ = ""; - while( sysread FH, $_, 8192, length ) { - while( s/^((?s).*?)your_pattern// ) { - my $record = $1; - # do stuff here. - } - } + local $_ = ""; + while( sysread FH, $_, 8192, length ) { + while( s/^((?s).*?)your_pattern// ) { + my $record = $1; + # do stuff here. + } + } You can do the same thing with foreach and a match using the c flag and the \G anchor, if you do not mind your entire file being in memory at the end. - local $_ = ""; - while( sysread FH, $_, 8192, length ) { - foreach my $record ( m/\G((?s).*?)your_pattern/gc ) { - # do stuff here. - } - substr( $_, 0, pos ) = "" if pos; - } + local $_ = ""; + while( sysread FH, $_, 8192, length ) { + foreach my $record ( m/\G((?s).*?)your_pattern/gc ) { + # do stuff here. + } + substr( $_, 0, pos ) = "" if pos; + } =head2 How do I substitute case-insensitively on the LHS while preserving case on the RHS? @@ -242,49 +242,49 @@ X<substitution, case preserving> X<s, case preserving> Here's a lovely Perlish solution by Larry Rosler. It exploits properties of bitwise xor on ASCII strings. - $_= "this is a TEsT case"; + $_= "this is a TEsT case"; - $old = 'test'; - $new = 'success'; + $old = 'test'; + $new = 'success'; - s{(\Q$old\E)} - { uc $new | (uc $1 ^ $1) . - (uc(substr $1, -1) ^ substr $1, -1) x - (length($new) - length $1) - }egi; + s{(\Q$old\E)} + { uc $new | (uc $1 ^ $1) . + (uc(substr $1, -1) ^ substr $1, -1) x + (length($new) - length $1) + }egi; - print; + print; And here it is as a subroutine, modeled after the above: - sub preserve_case($$) { - my ($old, $new) = @_; - my $mask = uc $old ^ $old; + sub preserve_case($$) { + my ($old, $new) = @_; + my $mask = uc $old ^ $old; - uc $new | $mask . - substr($mask, -1) x (length($new) - length($old)) + uc $new | $mask . + substr($mask, -1) x (length($new) - length($old)) } - $string = "this is a TEsT case"; - $string =~ s/(test)/preserve_case($1, "success")/egi; - print "$string\n"; + $string = "this is a TEsT case"; + $string =~ s/(test)/preserve_case($1, "success")/egi; + print "$string\n"; This prints: - this is a SUcCESS case + this is a SUcCESS case As an alternative, to keep the case of the replacement word if it is longer than the original, you can use this code, by Jeff Pinyan: - sub preserve_case { - my ($from, $to) = @_; - my ($lf, $lt) = map length, @_; + sub preserve_case { + my ($from, $to) = @_; + my ($lf, $lt) = map length, @_; - if ($lt < $lf) { $from = substr $from, 0, $lt } - else { $from .= substr $to, $lf } + if ($lt < $lf) { $from = substr $from, 0, $lt } + else { $from .= substr $to, $lf } - return uc $to | ($from ^ uc $from); - } + return uc $to | ($from ^ uc $from); + } This changes the sentence to "this is a SUcCess case." @@ -295,36 +295,36 @@ substitution have the same case, letter by letter, as the original. If the substitution has more characters than the string being substituted, the case of the last character is used for the rest of the substitution. - # Original by Nathan Torkington, massaged by Jeffrey Friedl - # - sub preserve_case($$) - { - my ($old, $new) = @_; - my ($state) = 0; # 0 = no change; 1 = lc; 2 = uc - my ($i, $oldlen, $newlen, $c) = (0, length($old), length($new)); - my ($len) = $oldlen < $newlen ? $oldlen : $newlen; - - for ($i = 0; $i < $len; $i++) { - if ($c = substr($old, $i, 1), $c =~ /[\W\d_]/) { - $state = 0; - } elsif (lc $c eq $c) { - substr($new, $i, 1) = lc(substr($new, $i, 1)); - $state = 1; - } else { - substr($new, $i, 1) = uc(substr($new, $i, 1)); - $state = 2; - } - } - # finish up with any remaining new (for when new is longer than old) - if ($newlen > $oldlen) { - if ($state == 1) { - substr($new, $oldlen) = lc(substr($new, $oldlen)); - } elsif ($state == 2) { - substr($new, $oldlen) = uc(substr($new, $oldlen)); - } - } - return $new; - } + # Original by Nathan Torkington, massaged by Jeffrey Friedl + # + sub preserve_case($$) + { + my ($old, $new) = @_; + my ($state) = 0; # 0 = no change; 1 = lc; 2 = uc + my ($i, $oldlen, $newlen, $c) = (0, length($old), length($new)); + my ($len) = $oldlen < $newlen ? $oldlen : $newlen; + + for ($i = 0; $i < $len; $i++) { + if ($c = substr($old, $i, 1), $c =~ /[\W\d_]/) { + $state = 0; + } elsif (lc $c eq $c) { + substr($new, $i, 1) = lc(substr($new, $i, 1)); + $state = 1; + } else { + substr($new, $i, 1) = uc(substr($new, $i, 1)); + $state = 2; + } + } + # finish up with any remaining new (for when new is longer than old) + if ($newlen > $oldlen) { + if ($state == 1) { + substr($new, $oldlen) = lc(substr($new, $oldlen)); + } elsif ($state == 2) { + substr($new, $oldlen) = uc(substr($new, $oldlen)); + } + } + return $new; + } =head2 How can I make C<\w> match national character sets? X<\w> @@ -356,11 +356,11 @@ a double-quoted string (see L<perlop> for more details). Remember also that any regex special characters will be acted on unless you precede the substitution with \Q. Here's an example: - $string = "Placido P. Octopus"; - $regex = "P."; + $string = "Placido P. Octopus"; + $regex = "P."; - $string =~ s/$regex/Polyp/; - # $string is now "Polypacido P. Octopus" + $string =~ s/$regex/Polyp/; + # $string is now "Polypacido P. Octopus" Because C<.> is special in regular expressions, and can match any single character, the regex C<P.> here has matched the <Pl> in the @@ -368,11 +368,11 @@ original string. To escape the special meaning of C<.>, we use C<\Q>: - $string = "Placido P. Octopus"; - $regex = "P."; + $string = "Placido P. Octopus"; + $regex = "P."; - $string =~ s/\Q$regex/Polyp/; - # $string is now "Placido Polyp Octopus" + $string =~ s/\Q$regex/Polyp/; + # $string is now "Placido Polyp Octopus" The use of C<\Q> causes the <.> in the regex to be treated as a regular character, so that C<P.> matches a C<P> followed by a dot. @@ -396,22 +396,22 @@ details. This example takes a regular expression from the argument list and prints the lines of input that match it: - my $pattern = shift @ARGV; + my $pattern = shift @ARGV; - while( <> ) { - print if m/$pattern/; - } + while( <> ) { + print if m/$pattern/; + } Versions of Perl prior to 5.6 would recompile the regular expression for each iteration, even if C<$pattern> had not changed. The C</o> would prevent this by telling Perl to compile the pattern the first time, then reuse that for subsequent iterations: - my $pattern = shift @ARGV; + my $pattern = shift @ARGV; - while( <> ) { - print if m/$pattern/o; # useful for Perl < 5.6 - } + while( <> ) { + print if m/$pattern/o; # useful for Perl < 5.6 + } In versions 5.6 and later, Perl won't recompile the regular expression if the variable hasn't changed, so you probably don't need the C</o> @@ -427,31 +427,31 @@ With Perls before 5.6, you should see C<re> reporting that its compiling the regular expression on each iteration. With Perl 5.6 or later, you should only see C<re> report that for the first iteration. - use re 'debug'; + use re 'debug'; - $regex = 'Perl'; - foreach ( qw(Perl Java Ruby Python) ) { - print STDERR "-" x 73, "\n"; - print STDERR "Trying $_...\n"; - print STDERR "\t$_ is good!\n" if m/$regex/; - } + $regex = 'Perl'; + foreach ( qw(Perl Java Ruby Python) ) { + print STDERR "-" x 73, "\n"; + print STDERR "Trying $_...\n"; + print STDERR "\t$_ is good!\n" if m/$regex/; + } =head2 How do I use a regular expression to strip C-style comments from a file? While this actually can be done, it's much harder than you'd think. For example, this one-liner - perl -0777 -pe 's{/\*.*?\*/}{}gs' foo.c + perl -0777 -pe 's{/\*.*?\*/}{}gs' foo.c will work in many but not all cases. You see, it's too simple-minded for certain kinds of C programs, in particular, those with what appear to be comments in quoted strings. For that, you'd need something like this, created by Jeffrey Friedl and later modified by Fred Curtis. - $/ = undef; - $_ = <>; - s#/\*[^*]*\*+([^/*][^*]*\*+)*/|("(\\.|[^"\\])*"|'(\\.|[^'\\])*'|.[^/"'\\]*)#defined $2 ? $2 : ""#gse; - print; + $/ = undef; + $_ = <>; + s#/\*[^*]*\*+([^/*][^*]*\*+)*/|("(\\.|[^"\\])*"|'(\\.|[^'\\])*'|.[^/"'\\]*)#defined $2 ? $2 : ""#gse; + print; This could, of course, be more legibly written with the C</x> modifier, adding whitespace and comments. Here it is expanded, courtesy of Fred Curtis. @@ -505,9 +505,9 @@ X<Text::Balanced> X<Regexp::Common> X<backtracking> X<recursion> (contributed by brian d foy) -Your first try should probably be the C<Text::Balanced> module, which +Your first try should probably be the L<Text::Balanced> module, which is in the Perl standard library since Perl 5.8. It has a variety of -functions to deal with tricky text. The C<Regexp::Common> module can +functions to deal with tricky text. The L<Regexp::Common> module can also help by providing canned patterns you can use. As of Perl 5.10, you can match balanced text with regular expressions @@ -520,9 +520,9 @@ nested angle brackets. This sample text has two "major" groups: a group with one level of nesting and a group with two levels of nesting. There are five total groups in angle brackets: - I have some <brackets in <nested brackets> > and - <another group <nested once <nested twice> > > - and that's it. + I have some <brackets in <nested brackets> > and + <another group <nested once <nested twice> > > + and that's it. The regular expression to match the balanced text uses two new (to Perl 5.10) regular expression features. These are covered in L<perlre> @@ -543,34 +543,34 @@ in the outer capture group as an independent part of the regex. Putting it all together, you have: - #!/usr/local/bin/perl5.10.0 - - my $string =<<"HERE"; - I have some <brackets in <nested brackets> > and - <another group <nested once <nested twice> > > - and that's it. - HERE - - my @groups = $string =~ m/ - ( # start of capture group 1 - < # match an opening angle bracket - (?: - [^<>]++ # one or more non angle brackets, non backtracking - | - (?1) # found < or >, so recurse to capture group 1 - )* - > # match a closing angle bracket - ) # end of capture group 1 - /xg; - - $" = "\n\t"; - print "Found:\n\t@groups\n"; + #!/usr/local/bin/perl5.10.0 + + my $string =<<"HERE"; + I have some <brackets in <nested brackets> > and + <another group <nested once <nested twice> > > + and that's it. + HERE + + my @groups = $string =~ m/ + ( # start of capture group 1 + < # match an opening angle bracket + (?: + [^<>]++ # one or more non angle brackets, non backtracking + | + (?1) # found < or >, so recurse to capture group 1 + )* + > # match a closing angle bracket + ) # end of capture group 1 + /xg; + + $" = "\n\t"; + print "Found:\n\t@groups\n"; The output shows that Perl found the two major groups: - Found: - <brackets in <nested brackets> > - <another group <nested once <nested twice> > > + Found: + <brackets in <nested brackets> > + <another group <nested once <nested twice> > > With a little extra work, you can get the all of the groups in angle brackets even if they are in other angle brackets too. Each time you @@ -578,53 +578,52 @@ get a balanced match, remove its outer delimiter (that's the one you just matched so don't match it again) and add it to a queue of strings to process. Keep doing that until you get no matches: - #!/usr/local/bin/perl5.10.0 + #!/usr/local/bin/perl5.10.0 - my @queue =<<"HERE"; - I have some <brackets in <nested brackets> > and - <another group <nested once <nested twice> > > - and that's it. - HERE + my @queue =<<"HERE"; + I have some <brackets in <nested brackets> > and + <another group <nested once <nested twice> > > + and that's it. + HERE - my $regex = qr/ - ( # start of bracket 1 - < # match an opening angle bracket - (?: - [^<>]++ # one or more non angle brackets, non backtracking - | - (?1) # recurse to bracket 1 - )* - > # match a closing angle bracket - ) # end of bracket 1 - /x; + my $regex = qr/ + ( # start of bracket 1 + < # match an opening angle bracket + (?: + [^<>]++ # one or more non angle brackets, non backtracking + | + (?1) # recurse to bracket 1 + )* + > # match a closing angle bracket + ) # end of bracket 1 + /x; - $" = "\n\t"; + $" = "\n\t"; - while( @queue ) - { - my $string = shift @queue; + while( @queue ) { + my $string = shift @queue; - my @groups = $string =~ m/$regex/g; - print "Found:\n\t@groups\n\n" if @groups; + my @groups = $string =~ m/$regex/g; + print "Found:\n\t@groups\n\n" if @groups; - unshift @queue, map { s/^<//; s/>$//; $_ } @groups; - } + unshift @queue, map { s/^<//; s/>$//; $_ } @groups; + } The output shows all of the groups. The outermost matches show up first and the nested matches so up later: - Found: - <brackets in <nested brackets> > - <another group <nested once <nested twice> > > + Found: + <brackets in <nested brackets> > + <another group <nested once <nested twice> > > - Found: - <nested brackets> + Found: + <nested brackets> - Found: - <nested once <nested twice> > + Found: + <nested once <nested twice> > - Found: - <nested twice> + Found: + <nested twice> =head2 What does it mean that regexes are greedy? How can I get around it? X<greedy> X<greediness> @@ -637,9 +636,9 @@ versions of the same quantifiers, use (C<??>, C<*?>, C<+?>, C<{}?>). An example: - $s1 = $s2 = "I am very very cold"; - $s1 =~ s/ve.*y //; # I am cold - $s2 =~ s/ve.*?y //; # I am very cold + $s1 = $s2 = "I am very very cold"; + $s1 =~ s/ve.*y //; # I am cold + $s2 =~ s/ve.*?y //; # I am very cold Notice how the second substitution stopped matching as soon as it encountered "y ". The C<*?> quantifier effectively tells the regular @@ -652,11 +651,11 @@ X<word> Use the split function: - while (<>) { - foreach $word ( split ) { - # do something with $word here - } - } + while (<>) { + foreach $word ( split ) { + # do something with $word here + } + } Note that this isn't really a word in the English sense; it's just chunks of consecutive non-whitespace characters. @@ -664,11 +663,11 @@ chunks of consecutive non-whitespace characters. To work with only alphanumeric sequences (including underscores), you might consider - while (<>) { - foreach $word (m/(\w+)/g) { - # do something with $word here - } - } + while (<>) { + foreach $word (m/(\w+)/g) { + # do something with $word here + } + } =head2 How can I print out a word-frequency or line-frequency summary? @@ -677,26 +676,26 @@ pretend that by word you mean chunk of alphabetics, hyphens, or apostrophes, rather than the non-whitespace chunk idea of a word given in the previous question: - while (<>) { - while ( /(\b[^\W_\d][\w'-]+\b)/g ) { # misses "`sheep'" - $seen{$1}++; - } - } + while (<>) { + while ( /(\b[^\W_\d][\w'-]+\b)/g ) { # misses "`sheep'" + $seen{$1}++; + } + } - while ( ($word, $count) = each %seen ) { - print "$count $word\n"; - } + while ( ($word, $count) = each %seen ) { + print "$count $word\n"; + } If you wanted to do the same thing for lines, you wouldn't need a regular expression: - while (<>) { - $seen{$_}++; - } + while (<>) { + $seen{$_}++; + } - while ( ($line, $count) = each %seen ) { - print "$count $line"; - } + while ( ($line, $count) = each %seen ) { + print "$count $line"; + } If you want these output in a sorted order, see L<perlfaq4>: "How do I sort a hash (optionally by value instead of key)?". @@ -715,11 +714,11 @@ X<regular expression, efficiency> If you have Perl 5.10 or later, this is almost trivial. You just smart match against an array of regular expression objects: - my @patterns = ( qr/Fr.d/, qr/B.rn.y/, qr/W.lm./ ); + my @patterns = ( qr/Fr.d/, qr/B.rn.y/, qr/W.lm./ ); - if( $string ~~ @patterns ) { - ... - }; + if( $string ~~ @patterns ) { + ... + }; The smart match stops when it finds a match, so it doesn't have to try every expression. @@ -730,16 +729,16 @@ In this example, perl must recompile the regular expression for every iteration of the C<foreach> loop since it has no way to know what C<$pattern> will be: - my @patterns = qw( foo bar baz ); + my @patterns = qw( foo bar baz ); - LINE: while( <DATA> ) { - foreach $pattern ( @patterns ) { - if( /\b$pattern\b/i ) { - print; - next LINE; - } - } - } + LINE: while( <DATA> ) { + foreach $pattern ( @patterns ) { + if( /\b$pattern\b/i ) { + print; + next LINE; + } + } + } The C<qr//> operator showed up in perl 5.005. It compiles a regular expression, but doesn't apply it. When you use the pre-compiled @@ -747,27 +746,26 @@ version of the regex, perl does less work. In this example, I inserted a C<map> to turn each pattern into its pre-compiled form. The rest of the script is the same, but faster: - my @patterns = map { qr/\b$_\b/i } qw( foo bar baz ); + my @patterns = map { qr/\b$_\b/i } qw( foo bar baz ); - LINE: while( <> ) { - foreach $pattern ( @patterns ) { - if( /$pattern/ ) - { - print; - next LINE; - } - } - } + LINE: while( <> ) { + foreach $pattern ( @patterns ) { + if( /$pattern/ ) { + print; + next LINE; + } + } + } In some cases, you may be able to make several patterns into a single regular expression. Beware of situations that require backtracking though. - my $regex = join '|', qw( foo bar baz ); + my $regex = join '|', qw( foo bar baz ); - LINE: while( <> ) { - print if /\b(?:$regex)\b/i; - } + LINE: while( <> ) { + print if /\b(?:$regex)\b/i; + } For more details on regular expression efficiency, see I<Mastering Regular Expressions> by Jeffrey Friedl. He explains how regular @@ -797,26 +795,26 @@ boundary before the "P" and after the "l". As long as something other than a word character precedes the "P" and succeeds the "l", the pattern will match. These strings match /\bPerl\b/. - "Perl" # no word char before P or after l - "Perl " # same as previous (space is not a word char) - "'Perl'" # the ' char is not a word char - "Perl's" # no word char before P, non-word char after "l" + "Perl" # no word char before P or after l + "Perl " # same as previous (space is not a word char) + "'Perl'" # the ' char is not a word char + "Perl's" # no word char before P, non-word char after "l" These strings do not match /\bPerl\b/. - "Perl_" # _ is a word char! - "Perler" # no word char before P, but one after l + "Perl_" # _ is a word char! + "Perler" # no word char before P, but one after l You don't have to use \b to match words though. You can look for non-word characters surrounded by word characters. These strings match the pattern /\b'\b/. - "don't" # the ' char is surrounded by "n" and "t" - "qep'a'" # the ' char is surrounded by "p" and "a" + "don't" # the ' char is surrounded by "n" and "t" + "qep'a'" # the ' char is surrounded by "p" and "a" These strings do not match /\b'\b/. - "foo'" # there is no word char after non-word ' + "foo'" # there is no word char after non-word ' You can also use the complement of \b, \B, to specify that there should not be a word boundary. @@ -824,13 +822,13 @@ should not be a word boundary. In the pattern /\Bam\B/, there must be a word character before the "a" and after the "m". These patterns match /\Bam\B/: - "llama" # "am" surrounded by word chars - "Samuel" # same + "llama" # "am" surrounded by word chars + "Samuel" # same These strings do not match /\Bam\B/ - "Sam" # no word boundary before "a", but one after "m" - "I am Sam" # "am" surrounded by non-word chars + "Sam" # no word boundary before "a", but one after "m" + "I am Sam" # "am" surrounded by non-word chars =head2 Why does using $&, $`, or $' slow my program down? @@ -881,8 +879,8 @@ the letter <a> shows up between C<22> and C<44> and you want to stop at C<a>. Simply matching pairs of digits skips over the C<a> and still matches C<44>. - $_ = "1122a44"; - my @pairs = m/(\d\d)/g; # qw( 11 22 44 ) + $_ = "1122a44"; + my @pairs = m/(\d\d)/g; # qw( 11 22 44 ) If you use the C<\G> anchor, you force the match after C<22> to start with the C<a>. The regular expression cannot match @@ -890,28 +888,26 @@ there since it does not find a digit, so the next match fails and the match operator returns the pairs it already found. - $_ = "1122a44"; - my @pairs = m/\G(\d\d)/g; # qw( 11 22 ) + $_ = "1122a44"; + my @pairs = m/\G(\d\d)/g; # qw( 11 22 ) You can also use the C<\G> anchor in scalar context. You still need the C<g> flag. - $_ = "1122a44"; - while( m/\G(\d\d)/g ) - { - print "Found $1\n"; - } + $_ = "1122a44"; + while( m/\G(\d\d)/g ) { + print "Found $1\n"; + } After the match fails at the letter C<a>, perl resets C<pos()> and the next match on the same string starts at the beginning. - $_ = "1122a44"; - while( m/\G(\d\d)/g ) - { - print "Found $1\n"; - } + $_ = "1122a44"; + while( m/\G(\d\d)/g ) { + print "Found $1\n"; + } - print "Found $1 after while" if m/(\d\d)/g; # finds "11" + print "Found $1 after while" if m/(\d\d)/g; # finds "11" You can disable C<pos()> resets on fail with the C<c> flag, documented in L<perlop> and L<perlreref>. Subsequent matches start where the last @@ -921,28 +917,27 @@ the C<while()> loop starts at the C<a> (where the last match stopped), and since it does not use any anchor it can skip over the C<a> to find C<44>. - $_ = "1122a44"; - while( m/\G(\d\d)/gc ) - { - print "Found $1\n"; - } + $_ = "1122a44"; + while( m/\G(\d\d)/gc ) { + print "Found $1\n"; + } - print "Found $1 after while" if m/(\d\d)/g; # finds "44" + print "Found $1 after while" if m/(\d\d)/g; # finds "44" Typically you use the C<\G> anchor with the C<c> flag when you want to try a different match if one fails, such as in a tokenizer. Jeffrey Friedl offers this example which works in 5.004 or later. - while (<>) { - chomp; - PARSER: { - m/ \G( \d+\b )/gcx && do { print "number: $1\n"; redo; }; - m/ \G( \w+ )/gcx && do { print "word: $1\n"; redo; }; - m/ \G( \s+ )/gcx && do { print "space: $1\n"; redo; }; - m/ \G( [^\w\d]+ )/gcx && do { print "other: $1\n"; redo; }; - } - } + while (<>) { + chomp; + PARSER: { + m/ \G( \d+\b )/gcx && do { print "number: $1\n"; redo; }; + m/ \G( \w+ )/gcx && do { print "word: $1\n"; redo; }; + m/ \G( \s+ )/gcx && do { print "space: $1\n"; redo; }; + m/ \G( [^\w\d]+ )/gcx && do { print "other: $1\n"; redo; }; + } + } For each line, the C<PARSER> loop first tries to match a series of digits followed by a word boundary. This match has to @@ -991,8 +986,8 @@ through the Encode module. See L<perluniintro>, L<perlunicode>, and L<Encode>. If you are stuck with older Perls, you can do Unicode with the -C<Unicode::String> module, and character conversions using the -C<Unicode::Map8> and C<Unicode::Map> modules. If you are using +L<Unicode::String> module, and character conversions using the +L<Unicode::Map8> and L<Unicode::Map> modules. If you are using Japanese encodings, you might try using the jperl 5.005_03. Finally, the following set of approaches was offered by Jeffrey @@ -1016,34 +1011,34 @@ looks like it is because "SG" is next to "XX", but there's no real Here are a few ways, all painful, to deal with it: - # Make sure adjacent "martian" bytes are no longer adjacent. - $martian =~ s/([A-Z][A-Z])/ $1 /g; + # Make sure adjacent "martian" bytes are no longer adjacent. + $martian =~ s/([A-Z][A-Z])/ $1 /g; - print "found GX!\n" if $martian =~ /GX/; + print "found GX!\n" if $martian =~ /GX/; Or like this: - @chars = $martian =~ m/([A-Z][A-Z]|[^A-Z])/g; - # above is conceptually similar to: @chars = $text =~ m/(.)/g; - # - foreach $char (@chars) { - print "found GX!\n", last if $char eq 'GX'; - } + @chars = $martian =~ m/([A-Z][A-Z]|[^A-Z])/g; + # above is conceptually similar to: @chars = $text =~ m/(.)/g; + # + foreach $char (@chars) { + print "found GX!\n", last if $char eq 'GX'; + } Or like this: - while ($martian =~ m/\G([A-Z][A-Z]|.)/gs) { # \G probably unneeded - print "found GX!\n", last if $1 eq 'GX'; - } + while ($martian =~ m/\G([A-Z][A-Z]|.)/gs) { # \G probably unneeded + print "found GX!\n", last if $1 eq 'GX'; + } Here's another, slightly less painful, way to do it from Benjamin Goldberg, who uses a zero-width negative look-behind assertion. - print "found GX!\n" if $martian =~ m/ - (?<![A-Z]) - (?:[A-Z][A-Z])*? - GX - /x; + print "found GX!\n" if $martian =~ m/ + (?<![A-Z]) + (?:[A-Z][A-Z])*? + GX + /x; This succeeds if the "martian" character GX is in the string, and fails otherwise. If you don't like using (?<!), a zero-width negative @@ -1068,71 +1063,71 @@ read the regular expression as user input and store it in C<$regex>. Once you have the pattern in C<$regex>, you use that variable in the match operator. - chomp( my $regex = <STDIN> ); + chomp( my $regex = <STDIN> ); - if( $string =~ m/$regex/ ) { ... } + if( $string =~ m/$regex/ ) { ... } Any regular expression special characters in C<$regex> are still special, and the pattern still has to be valid or Perl will complain. For instance, in this pattern there is an unpaired parenthesis. - my $regex = "Unmatched ( paren"; + my $regex = "Unmatched ( paren"; - "Two parens to bind them all" =~ m/$regex/; + "Two parens to bind them all" =~ m/$regex/; When Perl compiles the regular expression, it treats the parenthesis as the start of a memory match. When it doesn't find the closing parenthesis, it complains: - Unmatched ( in regex; marked by <-- HERE in m/Unmatched ( <-- HERE paren/ at script line 3. + Unmatched ( in regex; marked by <-- HERE in m/Unmatched ( <-- HERE paren/ at script line 3. You can get around this in several ways depending on our situation. First, if you don't want any of the characters in the string to be special, you can escape them with C<quotemeta> before you use the string. - chomp( my $regex = <STDIN> ); - $regex = quotemeta( $regex ); + chomp( my $regex = <STDIN> ); + $regex = quotemeta( $regex ); - if( $string =~ m/$regex/ ) { ... } + if( $string =~ m/$regex/ ) { ... } You can also do this directly in the match operator using the C<\Q> and C<\E> sequences. The C<\Q> tells Perl where to start escaping special characters, and the C<\E> tells it where to stop (see L<perlop> for more details). - chomp( my $regex = <STDIN> ); + chomp( my $regex = <STDIN> ); - if( $string =~ m/\Q$regex\E/ ) { ... } + if( $string =~ m/\Q$regex\E/ ) { ... } Alternately, you can use C<qr//>, the regular expression quote operator (see L<perlop> for more details). It quotes and perhaps compiles the pattern, and you can apply regular expression flags to the pattern. - chomp( my $input = <STDIN> ); + chomp( my $input = <STDIN> ); - my $regex = qr/$input/is; + my $regex = qr/$input/is; - $string =~ m/$regex/ # same as m/$input/is; + $string =~ m/$regex/ # same as m/$input/is; You might also want to trap any errors by wrapping an C<eval> block around the whole thing. - chomp( my $input = <STDIN> ); + chomp( my $input = <STDIN> ); - eval { - if( $string =~ m/\Q$input\E/ ) { ... } - }; - warn $@ if $@; + eval { + if( $string =~ m/\Q$input\E/ ) { ... } + }; + warn $@ if $@; Or... - my $regex = eval { qr/$input/is }; - if( defined $regex ) { - $string =~ m/$regex/; - } - else { - warn $@; - } + my $regex = eval { qr/$input/is }; + if( defined $regex ) { + $string =~ m/$regex/; + } + else { + warn $@; + } =head1 AUTHOR AND COPYRIGHT diff --git a/cpan/perlfaq/lib/perlfaq7.pod b/cpan/perlfaq/lib/perlfaq7.pod index c32b109542..df7a07ae9f 100644 --- a/cpan/perlfaq/lib/perlfaq7.pod +++ b/cpan/perlfaq/lib/perlfaq7.pod @@ -22,19 +22,19 @@ and mirrors." They are type specifiers, as detailed in L<perldata>: - $ for scalar values (number, string or reference) - @ for arrays - % for hashes (associative arrays) - & for subroutines (aka functions, procedures, methods) - * for all types of that symbol name. In version 4 you used them like - pointers, but in modern perls you can just use references. + $ for scalar values (number, string or reference) + @ for arrays + % for hashes (associative arrays) + & for subroutines (aka functions, procedures, methods) + * for all types of that symbol name. In version 4 you used them like + pointers, but in modern perls you can just use references. There are a couple of other symbols that you're likely to encounter that aren't really type specifiers: - <> are used for inputting a record from a filehandle. - \ takes a reference to something. + <> are used for inputting a record from a filehandle. + \ takes a reference to something. Note that <FILE> is I<neither> the type specifier for files nor the name of the handle. It is the C<< <> >> operator applied @@ -53,41 +53,41 @@ consisting of a simple word and the left-hand operand to the C<< => >> operator both count as though they were quoted: - This is like this - ------------ --------------- - $foo{line} $foo{'line'} - bar => stuff 'bar' => stuff + This is like this + ------------ --------------- + $foo{line} $foo{'line'} + bar => stuff 'bar' => stuff The final semicolon in a block is optional, as is the final comma in a list. Good style (see L<perlstyle>) says to put them in except for one-liners: - if ($whoops) { exit 1 } - @nums = (1, 2, 3); + if ($whoops) { exit 1 } + @nums = (1, 2, 3); - if ($whoops) { - exit 1; - } + if ($whoops) { + exit 1; + } - @lines = ( - "There Beren came from mountains cold", - "And lost he wandered under leaves", - ); + @lines = ( + "There Beren came from mountains cold", + "And lost he wandered under leaves", + ); =head2 How do I skip some return values? One way is to treat the return values as a list and index into it: - $dir = (getpwnam($user))[7]; + $dir = (getpwnam($user))[7]; Another way is to use undef as an element on the left-hand-side: - ($dev, $ino, undef, undef, $uid, $gid) = stat($file); + ($dev, $ino, undef, undef, $uid, $gid) = stat($file); You can also use a list slice to select only the elements that you need: - ($dev, $ino, $uid, $gid) = ( stat($file) )[0,1,4,5]; + ($dev, $ino, $uid, $gid) = ( stat($file) )[0,1,4,5]; =head2 How do I temporarily block warnings? @@ -95,28 +95,28 @@ If you are running Perl 5.6.0 or better, the C<use warnings> pragma allows fine control of what warning are produced. See L<perllexwarn> for more details. - { - no warnings; # temporarily turn off warnings - $a = $b + $c; # I know these might be undef - } + { + no warnings; # temporarily turn off warnings + $a = $b + $c; # I know these might be undef + } Additionally, you can enable and disable categories of warnings. You turn off the categories you want to ignore and you can still get other categories of warnings. See L<perllexwarn> for the complete details, including the category names and hierarchy. - { - no warnings 'uninitialized'; - $a = $b + $c; - } + { + no warnings 'uninitialized'; + $a = $b + $c; + } If you have an older version of Perl, the C<$^W> variable (documented in L<perlvar>) controls runtime warnings for a block: - { - local $^W = 0; # temporarily turn off warnings - $a = $b + $c; # I know these might be undef - } + { + local $^W = 0; # temporarily turn off warnings + $a = $b + $c; # I know these might be undef + } Note that like all the punctuation variables, you cannot currently use my() on C<$^W>, only local(). @@ -137,17 +137,17 @@ L<perlop>. A common mistake is to write: - unlink $file || die "snafu"; + unlink $file || die "snafu"; This gets interpreted as: - unlink ($file || die "snafu"); + unlink ($file || die "snafu"); To avoid this problem, either put in extra parentheses or use the super low precedence C<or> operator: - (unlink $file) || die "snafu"; - unlink $file or die "snafu"; + (unlink $file) || die "snafu"; + unlink $file or die "snafu"; The "English" operators (C<and>, C<or>, C<xor>, and C<not>) deliberately have precedence lower than that of list operators for @@ -162,7 +162,7 @@ Although it has the same precedence as in C, Perl's C<?:> operator produces an lvalue. This assigns $x to either $a or $b, depending on the trueness of $maybe: - ($maybe ? $a : $b) = $x; + ($maybe ? $a : $b) = $x; =head2 How do I declare/create a structure? @@ -170,9 +170,9 @@ In general, you don't "declare" a structure. Just use a (probably anonymous) hash reference. See L<perlref> and L<perldsc> for details. Here's an example: - $person = {}; # new anonymous hash - $person->{AGE} = 24; # set field AGE to 24 - $person->{NAME} = "Nat"; # set field NAME to "Nat" + $person = {}; # new anonymous hash + $person->{AGE} = 24; # set field AGE to 24 + $person->{NAME} = "Nat"; # set field NAME to "Nat" If you're looking for something a bit more rigorous, try L<perltoot>. @@ -280,12 +280,12 @@ but encourages closures. Here's a classic non-closure function-generating function: - sub add_function_generator { - return sub { shift() + shift() }; - } + sub add_function_generator { + return sub { shift() + shift() }; + } - $add_sub = add_function_generator(); - $sum = $add_sub->(4,5); # $sum is 9 now. + $add_sub = add_function_generator(); + $sum = $add_sub->(4,5); # $sum is 9 now. The anonymous subroutine returned by add_function_generator() isn't technically a closure because it refers to no lexicals outside its own @@ -298,13 +298,13 @@ outside the scope of that function itself. Such a reference requires that Perl return a proper closure, thus locking in for all time the value that the lexical had when the function was created. - sub make_adder { - my $addpiece = shift; - return sub { shift() + $addpiece }; - } + sub make_adder { + my $addpiece = shift; + return sub { shift() + $addpiece }; + } - $f1 = make_adder(20); - $f2 = make_adder(555); + $f1 = make_adder(20); + $f2 = make_adder(555); Now C<&$f1($n)> is always 20 plus whatever $n you pass in, whereas C<&$f2($n)> is always 555 plus whatever $n you pass in. The $addpiece @@ -313,8 +313,8 @@ in the closure sticks around. Closures are often used for less esoteric purposes. For example, when you want to pass in a bit of code into a function: - my $line; - timeout( 30, sub { $line = <STDIN> } ); + my $line; + timeout( 30, sub { $line = <STDIN> } ); If the code to execute had been passed in as a string, C<< '$line = <STDIN>' >>, there would have been no way for the @@ -328,10 +328,10 @@ This is sometimes used with a BEGIN block in package files to make sure a variable doesn't get meddled with during the lifetime of the package: - BEGIN { - my $id = 0; - sub next_id { ++$id } - } + BEGIN { + my $id = 0; + sub next_id { ++$id } + } This is discussed in more detail in L<perlsub>; see the entry on I<Persistent Private Variables>. @@ -347,32 +347,32 @@ interacting with either closures or aliased foreach() iterator variables and subroutine arguments. It used to be easy to inadvertently lose a variable's value this way, but now it's much harder. Take this code: - my $f = 'foo'; - sub T { - while ($i++ < 3) { my $f = $f; $f .= "bar"; print $f, "\n" } - } + my $f = 'foo'; + sub T { + while ($i++ < 3) { my $f = $f; $f .= "bar"; print $f, "\n" } + } - T; - print "Finally $f\n"; + T; + print "Finally $f\n"; If you are experiencing variable suicide, that C<my $f> in the subroutine doesn't pick up a fresh copy of the C<$f> whose value is <foo>. The output shows that inside the subroutine the value of C<$f> leaks through when it shouldn't, as in this output: - foobar - foobarbar - foobarbarbar - Finally foo + foobar + foobarbar + foobarbarbar + Finally foo The $f that has "bar" added to it three times should be a new C<$f> C<my $f> should create a new lexical variable each time through the loop. The expected output is: - foobar - foobar - foobar - Finally foo + foobar + foobar + foobar + Finally foo =head2 How can I pass/return a {Function, FileHandle, Array, Hash, Method, Regex}? @@ -387,30 +387,30 @@ information on references. Regular variables and functions are quite easy to pass: just pass in a reference to an existing or anonymous variable or function: - func( \$some_scalar ); + func( \$some_scalar ); - func( \@some_array ); - func( [ 1 .. 10 ] ); + func( \@some_array ); + func( [ 1 .. 10 ] ); - func( \%some_hash ); - func( { this => 10, that => 20 } ); + func( \%some_hash ); + func( { this => 10, that => 20 } ); - func( \&some_func ); - func( sub { $_[0] ** $_[1] } ); + func( \&some_func ); + func( sub { $_[0] ** $_[1] } ); =item Passing Filehandles As of Perl 5.6, you can represent filehandles with scalar variables which you treat as any other scalar. - open my $fh, $filename or die "Cannot open $filename! $!"; - func( $fh ); + open my $fh, $filename or die "Cannot open $filename! $!"; + func( $fh ); - sub func { - my $passed_fh = shift; + sub func { + my $passed_fh = shift; - my $line = <$passed_fh>; - } + my $line = <$passed_fh>; + } Before Perl 5.6, you had to use the C<*FH> or C<\*FH> notations. These are "typeglobs"--see L<perldata/"Typeglobs and Filehandles"> @@ -422,34 +422,34 @@ Here's an example of how to pass in a string and a regular expression for it to match against. You construct the pattern with the C<qr//> operator: - sub compare($$) { - my ($val1, $regex) = @_; - my $retval = $val1 =~ /$regex/; - return $retval; - } - $match = compare("old McDonald", qr/d.*D/i); + sub compare($$) { + my ($val1, $regex) = @_; + my $retval = $val1 =~ /$regex/; + return $retval; + } + $match = compare("old McDonald", qr/d.*D/i); =item Passing Methods To pass an object method into a subroutine, you can do this: - call_a_lot(10, $some_obj, "methname") - sub call_a_lot { - my ($count, $widget, $trick) = @_; - for (my $i = 0; $i < $count; $i++) { - $widget->$trick(); - } - } + call_a_lot(10, $some_obj, "methname") + sub call_a_lot { + my ($count, $widget, $trick) = @_; + for (my $i = 0; $i < $count; $i++) { + $widget->$trick(); + } + } Or, you can use a closure to bundle up the object, its method call, and arguments: - my $whatnot = sub { $some_obj->obfuscate(@args) }; - func($whatnot); - sub func { - my $code = shift; - &$code(); - } + my $whatnot = sub { $some_obj->obfuscate(@args) }; + func($whatnot); + sub func { + my $code = shift; + &$code(); + } You could also investigate the can() method in the UNIVERSAL class (part of the standard perl distribution). @@ -464,7 +464,7 @@ In Perl 5.10, declare the variable with C<state>. The C<state> declaration creates the lexical variable that persists between calls to the subroutine: - sub counter { state $count = 1; $counter++ } + sub counter { state $count = 1; $counter++ } You can fake a static variable by using a lexical variable which goes out of scope. In this example, you define the subroutine C<counter>, and @@ -480,16 +480,16 @@ can access the value (and each time you do, you increment the value). The data in chunk of memory defined by C<$count> is private to C<counter>. - BEGIN { - my $count = 1; - sub counter { $count++ } - } + BEGIN { + my $count = 1; + sub counter { $count++ } + } - my $start = counter(); + my $start = counter(); - .... # code that calls counter(); + .... # code that calls counter(); - my $end = counter(); + my $end = counter(); In the previous example, you created a function-private variable because only one function remembered its reference. You could define @@ -502,11 +502,11 @@ function adds to the value and the other simply returns the value. They can both access C<$count>, and since it has gone out of scope, there is no other way to access it. - BEGIN { - my $count = 1; - sub increment_count { $count++ } - sub return_count { $count } - } + BEGIN { + my $count = 1; + sub increment_count { $count++ } + sub return_count { $count } + } To declare a file-private variable, you still use a lexical variable. A file is also a scope, so a lexical variable defined in the file @@ -532,25 +532,25 @@ lexical variables or (improperly) static(ly scoped) variables. For instance: - sub visible { - print "var has value $var\n"; - } + sub visible { + print "var has value $var\n"; + } - sub dynamic { - local $var = 'local'; # new temporary value for the still-global - visible(); # variable called $var - } + sub dynamic { + local $var = 'local'; # new temporary value for the still-global + visible(); # variable called $var + } - sub lexical { - my $var = 'private'; # new private variable, $var - visible(); # (invisible outside of sub scope) - } + sub lexical { + my $var = 'private'; # new private variable, $var + visible(); # (invisible outside of sub scope) + } - $var = 'global'; + $var = 'global'; - visible(); # prints global - dynamic(); # prints local - lexical(); # prints global + visible(); # prints global + dynamic(); # prints local + lexical(); # prints global Notice how at no point does the value "private" get printed. That's because $var only has that value within the block of the lexical() @@ -570,28 +570,28 @@ $Some_Pack::var. Note that the notation $::var is B<not> the dynamic $var in the current package, but rather the one in the "main" package, as though you had written $main::var. - use vars '$var'; - local $var = "global"; - my $var = "lexical"; + use vars '$var'; + local $var = "global"; + my $var = "lexical"; - print "lexical is $var\n"; - print "global is $main::var\n"; + print "lexical is $var\n"; + print "global is $main::var\n"; Alternatively you can use the compiler directive our() to bring a dynamic variable into the current lexical scope. - require 5.006; # our() did not exist before 5.6 - use vars '$var'; + require 5.006; # our() did not exist before 5.6 + use vars '$var'; - local $var = "global"; - my $var = "lexical"; + local $var = "global"; + my $var = "lexical"; - print "lexical is $var\n"; + print "lexical is $var\n"; - { - our $var; - print "global is $var\n"; - } + { + our $var; + print "global is $var\n"; + } =head2 What's the difference between deep and shallow binding? @@ -618,15 +618,15 @@ doesn't help you (such as with sort()). To enforce scalar context in this particular case, however, you need merely omit the parentheses: - local($foo) = <FILE>; # WRONG - local($foo) = scalar(<FILE>); # ok - local $foo = <FILE>; # right + local($foo) = <FILE>; # WRONG + local($foo) = scalar(<FILE>); # ok + local $foo = <FILE>; # right You should probably be using lexical variables anyway, although the issue is the same here: - my($foo) = <FILE>; # WRONG - my $foo = <FILE>; # right + my($foo) = <FILE>; # WRONG + my $foo = <FILE>; # right =head2 How do I redefine a builtin function, operator, or method? @@ -652,29 +652,29 @@ the prototype of C<foo> and passes it the current value of the argument list, C<@_>. Here's an example; the C<bar> subroutine calls C<&foo>, which prints its arguments list: - sub bar { &foo } + sub bar { &foo } - sub foo { print "Args in foo are: @_\n" } + sub foo { print "Args in foo are: @_\n" } - bar( qw( a b c ) ); + bar( qw( a b c ) ); When you call C<bar> with arguments, you see that C<foo> got the same C<@_>: - Args in foo are: a b c + Args in foo are: a b c Calling the subroutine with trailing parentheses, with or without arguments, does not use the current C<@_> and respects the subroutine prototype. Changing the example to put parentheses after the call to C<foo> changes the program: - sub bar { &foo() } + sub bar { &foo() } - sub foo { print "Args in foo are: @_\n" } + sub foo { print "Args in foo are: @_\n" } - bar( qw( a b c ) ); + bar( qw( a b c ) ); Now the output shows that C<foo> doesn't get the C<@_> from its caller. - Args in foo are: + Args in foo are: The main use of the C<@_> pass-through feature is to write subroutines whose main job it is to call other subroutines for you. For further @@ -684,24 +684,24 @@ details, see L<perlsub>. In Perl 5.10, use the C<given-when> construct described in L<perlsyn>: - use 5.010; + use 5.010; - given ( $string ) { - when( 'Fred' ) { say "I found Fred!" } - when( 'Barney' ) { say "I found Barney!" } - when( /Bamm-?Bamm/ ) { say "I found Bamm-Bamm!" } - default { say "I don't recognize the name!" } - }; + given ( $string ) { + when( 'Fred' ) { say "I found Fred!" } + when( 'Barney' ) { say "I found Barney!" } + when( /Bamm-?Bamm/ ) { say "I found Bamm-Bamm!" } + default { say "I don't recognize the name!" } + }; If one wants to use pure Perl and to be compatible with Perl versions prior to 5.10, the general answer is to use C<if-elsif-else>: - for ($variable_to_test) { - if (/pat1/) { } # do something - elsif (/pat2/) { } # do something else - elsif (/pat3/) { } # do something else - else { } # default - } + for ($variable_to_test) { + if (/pat1/) { } # do something + elsif (/pat2/) { } # do something else + elsif (/pat3/) { } # do something else + else { } # default + } Here's a simple example of a switch based on pattern matching, lined up in a way to make it look more like a switch statement. @@ -710,31 +710,31 @@ in $whatchamacallit: SWITCH: for (ref $whatchamacallit) { - /^$/ && die "not a reference"; + /^$/ && die "not a reference"; - /SCALAR/ && do { - print_scalar($$ref); - last SWITCH; - }; + /SCALAR/ && do { + print_scalar($$ref); + last SWITCH; + }; - /ARRAY/ && do { - print_array(@$ref); - last SWITCH; - }; + /ARRAY/ && do { + print_array(@$ref); + last SWITCH; + }; - /HASH/ && do { - print_hash(%$ref); - last SWITCH; - }; + /HASH/ && do { + print_hash(%$ref); + last SWITCH; + }; - /CODE/ && do { - warn "can't print function ref"; - last SWITCH; - }; + /CODE/ && do { + warn "can't print function ref"; + last SWITCH; + }; - # DEFAULT + # DEFAULT - warn "User defined type skipped"; + warn "User defined type skipped"; } @@ -748,29 +748,29 @@ different characters or if you want to arrange the matches so that one takes precedence over another, as C<"SEND"> has precedence over C<"STOP"> here: - chomp($answer = <>); - if ("SEND" =~ /^\Q$answer/i) { print "Action is send\n" } - elsif ("STOP" =~ /^\Q$answer/i) { print "Action is stop\n" } - elsif ("ABORT" =~ /^\Q$answer/i) { print "Action is abort\n" } - elsif ("LIST" =~ /^\Q$answer/i) { print "Action is list\n" } - elsif ("EDIT" =~ /^\Q$answer/i) { print "Action is edit\n" } + chomp($answer = <>); + if ("SEND" =~ /^\Q$answer/i) { print "Action is send\n" } + elsif ("STOP" =~ /^\Q$answer/i) { print "Action is stop\n" } + elsif ("ABORT" =~ /^\Q$answer/i) { print "Action is abort\n" } + elsif ("LIST" =~ /^\Q$answer/i) { print "Action is list\n" } + elsif ("EDIT" =~ /^\Q$answer/i) { print "Action is edit\n" } A totally different approach is to create a hash of function references. - my %commands = ( - "happy" => \&joy, - "sad", => \&sullen, - "done" => sub { die "See ya!" }, - "mad" => \&angry, - ); - - print "How are you? "; - chomp($string = <STDIN>); - if ($commands{$string}) { - $commands{$string}->(); - } else { - print "No such command: $string\n"; - } + my %commands = ( + "happy" => \&joy, + "sad", => \&sullen, + "done" => sub { die "See ya!" }, + "mad" => \&angry, + ); + + print "How are you? "; + chomp($string = <STDIN>); + if ($commands{$string}) { + $commands{$string}->(); + } else { + print "No such command: $string\n"; + } Starting from Perl 5.8, a source filter module, C<Switch>, can also be used to get switch and case. Its use is now discouraged, because it's @@ -787,7 +787,7 @@ undefined functions and methods. When it comes to undefined variables that would trigger a warning under C<use warnings>, you can promote the warning to an error. - use warnings FATAL => qw(uninitialized); + use warnings FATAL => qw(uninitialized); =head2 Why can't a method included in this same file be found? @@ -818,39 +818,39 @@ C<__PACKAGE__>, as documented in L<perldata>. You can only use the special literals as separate tokens, so you can't interpolate them into strings like you can with variables: - my $current_package = __PACKAGE__; - print "I am in package $current_package\n"; + my $current_package = __PACKAGE__; + print "I am in package $current_package\n"; If you want to find the package calling your code, perhaps to give better -diagnostics as C<Carp> does, use the C<caller> built-in: +diagnostics as L<Carp> does, use the C<caller> built-in: - sub foo { - my @args = ...; - my( $package, $filename, $line ) = caller; + sub foo { + my @args = ...; + my( $package, $filename, $line ) = caller; - print "I was called from package $package\n"; - ); + print "I was called from package $package\n"; + ); By default, your program starts in package C<main>, so you will always be in some package. This is different from finding out the package an object is blessed into, which might not be the current package. For that, use C<blessed> -from C<Scalar::Util>, part of the Standard Library since Perl 5.8: +from L<Scalar::Util>, part of the Standard Library since Perl 5.8: - use Scalar::Util qw(blessed); - my $object_package = blessed( $object ); + use Scalar::Util qw(blessed); + my $object_package = blessed( $object ); Most of the time, you shouldn't care what package an object is blessed into, however, as long as it claims to inherit from that class: - my $is_right_class = eval { $object->isa( $package ) }; # true or false + my $is_right_class = eval { $object->isa( $package ) }; # true or false And, with Perl 5.10 and later, you don't have to check for an inheritance to see if the object can handle a role. For that, you can use C<DOES>, which comes from C<UNIVERSAL>: - my $class_does_it = eval { $object->DOES( $role ) }; # true or false + my $class_does_it = eval { $object->DOES( $role ) }; # true or false You can safely replace C<isa> with C<DOES> (although the converse is not true). @@ -864,15 +864,15 @@ directives at the beginning of the line and somewhere where Perl expects a new statement (so not in the middle of statements like the # comments). You end the comment with C<=cut>, ending the Pod section: - =pod + =pod - my $object = NotGonnaHappen->new(); + my $object = NotGonnaHappen->new(); - ignored_sub(); + ignored_sub(); - $wont_be_assigned = 37; + $wont_be_assigned = 37; - =cut + =cut The quick-and-dirty method only works well when you don't plan to leave the commented code in the source. If a Pod parser comes along, @@ -885,17 +885,17 @@ the comments with C<comment>. End the comment using C<=end> with the same label. You still need the C<=cut> to go back to Perl code from the Pod comment: - =begin comment + =begin comment - my $object = NotGonnaHappen->new(); + my $object = NotGonnaHappen->new(); - ignored_sub(); + ignored_sub(); - $wont_be_assigned = 37; + $wont_be_assigned = 37; - =end comment + =end comment - =cut + =cut For more information on Pod, check out L<perlpod> and L<perlpodspec>. @@ -903,23 +903,23 @@ For more information on Pod, check out L<perlpod> and L<perlpodspec>. Use this code, provided by Mark-Jason Dominus: - sub scrub_package { - no strict 'refs'; - my $pack = shift; - die "Shouldn't delete main package" - if $pack eq "" || $pack eq "main"; - my $stash = *{$pack . '::'}{HASH}; - my $name; - foreach $name (keys %$stash) { - my $fullname = $pack . '::' . $name; - # Get rid of everything with that name. - undef $$fullname; - undef @$fullname; - undef %$fullname; - undef &$fullname; - undef *$fullname; - } - } + sub scrub_package { + no strict 'refs'; + my $pack = shift; + die "Shouldn't delete main package" + if $pack eq "" || $pack eq "main"; + my $stash = *{$pack . '::'}{HASH}; + my $name; + foreach $name (keys %$stash) { + my $fullname = $pack . '::' . $name; + # Get rid of everything with that name. + undef $$fullname; + undef @$fullname; + undef %$fullname; + undef &$fullname; + undef *$fullname; + } + } Or, if you're using a recent release of Perl, you can just use the Symbol::delete_package() function instead. @@ -929,9 +929,9 @@ just use the Symbol::delete_package() function instead. Beginners often think they want to have a variable contain the name of a variable. - $fred = 23; - $varname = "fred"; - ++$$varname; # $fred now 24 + $fred = 23; + $varname = "fred"; + ++$$varname; # $fred now 24 This works I<sometimes>, but it is a very bad idea for two reasons. @@ -953,9 +953,9 @@ symbolic references, you are just using the package's symbol-table hash (like C<%main::>) instead of a user-defined hash. The solution is to use your own hash or a real reference instead. - $USER_VARS{"fred"} = 23; - $varname = "fred"; - $USER_VARS{$varname}++; # not $$varname++ + $USER_VARS{"fred"} = 23; + $varname = "fred"; + $USER_VARS{$varname}++; # not $$varname++ There we're using the %USER_VARS hash instead of symbolic references. Sometimes this comes up in reading strings from the user with variable @@ -965,20 +965,20 @@ program-addressable namespace and the user-addressable one. Instead of reading a string and expanding it to the actual contents of your program's own variables: - $str = 'this has a $fred and $barney in it'; - $str =~ s/(\$\w+)/$1/eeg; # need double eval + $str = 'this has a $fred and $barney in it'; + $str =~ s/(\$\w+)/$1/eeg; # need double eval it would be better to keep a hash around like %USER_VARS and have variable references actually refer to entries in that hash: - $str =~ s/\$(\w+)/$USER_VARS{$1}/g; # no /e here at all + $str =~ s/\$(\w+)/$USER_VARS{$1}/g; # no /e here at all That's faster, cleaner, and safer than the previous approach. Of course, you don't need to use a dollar sign. You could use your own scheme to make it less confusing, like bracketed percent symbols, etc. - $str = 'this has a %fred% and %barney% in it'; - $str =~ s/%(\w+)%/$USER_VARS{$1}/g; # no /e here at all + $str = 'this has a %fred% and %barney% in it'; + $str =~ s/%(\w+)%/$USER_VARS{$1}/g; # no /e here at all Another reason that folks sometimes think they want a variable to contain the name of a variable is that they don't know how to build @@ -986,17 +986,17 @@ proper data structures using hashes. For example, let's say they wanted two hashes in their program: %fred and %barney, and that they wanted to use another scalar variable to refer to those by name. - $name = "fred"; - $$name{WIFE} = "wilma"; # set %fred + $name = "fred"; + $$name{WIFE} = "wilma"; # set %fred - $name = "barney"; - $$name{WIFE} = "betty"; # set %barney + $name = "barney"; + $$name{WIFE} = "betty"; # set %barney This is still a symbolic reference, and is still saddled with the problems enumerated above. It would be far better to write: - $folks{"fred"}{WIFE} = "wilma"; - $folks{"barney"}{WIFE} = "betty"; + $folks{"fred"}{WIFE} = "wilma"; + $folks{"barney"}{WIFE} = "betty"; And just use a multilevel hash to start with. @@ -1009,11 +1009,11 @@ through the symbol table for resolution. In those cases, you would turn off C<strict 'refs'> temporarily so you can play around with the symbol table. For example: - @colors = qw(red blue green yellow orange purple violet); - for my $name (@colors) { - no strict 'refs'; # renege for the block - *$name = sub { "<FONT COLOR='$name'>@_</FONT>" }; - } + @colors = qw(red blue green yellow orange purple violet); + for my $name (@colors) { + no strict 'refs'; # renege for the block + *$name = sub { "<FONT COLOR='$name'>@_</FONT>" }; + } All those functions (red(), blue(), green(), etc.) appear to be separate, but the real code in the closure actually was compiled only once. @@ -1048,7 +1048,7 @@ script executable. In either case, you should still be able to run the scripts with perl explicitly: - % perl script.pl + % perl script.pl If you get a message like "perl: command not found", perl is not in your PATH, which might also mean that the location of perl is not diff --git a/cpan/perlfaq/lib/perlfaq8.pod b/cpan/perlfaq/lib/perlfaq8.pod index c3b179d4be..91093b134c 100644 --- a/cpan/perlfaq/lib/perlfaq8.pod +++ b/cpan/perlfaq/lib/perlfaq8.pod @@ -40,21 +40,21 @@ How you access/control keyboards, screens, and pointing devices =item Keyboard - Term::Cap Standard perl distribution - Term::ReadKey CPAN - Term::ReadLine::Gnu CPAN - Term::ReadLine::Perl CPAN - Term::Screen CPAN + Term::Cap Standard perl distribution + Term::ReadKey CPAN + Term::ReadLine::Gnu CPAN + Term::ReadLine::Perl CPAN + Term::Screen CPAN =item Screen - Term::Cap Standard perl distribution - Curses CPAN - Term::ANSIColor CPAN + Term::Cap Standard perl distribution + Curses CPAN + Term::ANSIColor CPAN =item Mouse - Tk CPAN + Tk CPAN =back @@ -66,17 +66,17 @@ in this section of the perlfaq. In general, you don't, because you don't know whether the recipient has a color-aware display device. If you know that they have an ANSI terminal that understands -color, you can use the C<Term::ANSIColor> module from CPAN: +color, you can use the L<Term::ANSIColor> module from CPAN: - use Term::ANSIColor; - print color("red"), "Stop!\n", color("reset"); - print color("green"), "Go!\n", color("reset"); + use Term::ANSIColor; + print color("red"), "Stop!\n", color("reset"); + print color("green"), "Go!\n", color("reset"); Or like this: - use Term::ANSIColor qw(:constants); - print RED, "Stop!\n", RESET; - print GREEN, "Go!\n", RESET; + use Term::ANSIColor qw(:constants); + print RED, "Stop!\n", RESET; + print GREEN, "Go!\n", RESET; =head2 How do I read just one key without waiting for a return key? @@ -85,92 +85,92 @@ On many systems, you can just use the B<stty> command as shown in L<perlfunc/getc>, but as you see, that's already getting you into portability snags. - open(TTY, "+</dev/tty") or die "no tty: $!"; - system "stty cbreak </dev/tty >/dev/tty 2>&1"; - $key = getc(TTY); # perhaps this works - # OR ELSE - sysread(TTY, $key, 1); # probably this does - system "stty -cbreak </dev/tty >/dev/tty 2>&1"; + open(TTY, "+</dev/tty") or die "no tty: $!"; + system "stty cbreak </dev/tty >/dev/tty 2>&1"; + $key = getc(TTY); # perhaps this works + # OR ELSE + sysread(TTY, $key, 1); # probably this does + system "stty -cbreak </dev/tty >/dev/tty 2>&1"; -The C<Term::ReadKey> module from CPAN offers an easy-to-use interface that +The L<Term::ReadKey> module from CPAN offers an easy-to-use interface that should be more efficient than shelling out to B<stty> for each key. It even includes limited support for Windows. - use Term::ReadKey; - ReadMode('cbreak'); - $key = ReadKey(0); - ReadMode('normal'); + use Term::ReadKey; + ReadMode('cbreak'); + $key = ReadKey(0); + ReadMode('normal'); However, using the code requires that you have a working C compiler and can use it to build and install a CPAN module. Here's a solution -using the standard C<POSIX> module, which is already on your system +using the standard L<POSIX> module, which is already on your system (assuming your system supports POSIX). - use HotKey; - $key = readkey(); + use HotKey; + $key = readkey(); And here's the C<HotKey> module, which hides the somewhat mystifying calls to manipulate the POSIX termios structures. - # HotKey.pm - package HotKey; + # HotKey.pm + package HotKey; - @ISA = qw(Exporter); - @EXPORT = qw(cbreak cooked readkey); + @ISA = qw(Exporter); + @EXPORT = qw(cbreak cooked readkey); - use strict; - use POSIX qw(:termios_h); - my ($term, $oterm, $echo, $noecho, $fd_stdin); + use strict; + use POSIX qw(:termios_h); + my ($term, $oterm, $echo, $noecho, $fd_stdin); - $fd_stdin = fileno(STDIN); - $term = POSIX::Termios->new(); - $term->getattr($fd_stdin); - $oterm = $term->getlflag(); + $fd_stdin = fileno(STDIN); + $term = POSIX::Termios->new(); + $term->getattr($fd_stdin); + $oterm = $term->getlflag(); - $echo = ECHO | ECHOK | ICANON; - $noecho = $oterm & ~$echo; + $echo = ECHO | ECHOK | ICANON; + $noecho = $oterm & ~$echo; - sub cbreak { - $term->setlflag($noecho); # ok, so i don't want echo either - $term->setcc(VTIME, 1); - $term->setattr($fd_stdin, TCSANOW); - } + sub cbreak { + $term->setlflag($noecho); # ok, so i don't want echo either + $term->setcc(VTIME, 1); + $term->setattr($fd_stdin, TCSANOW); + } - sub cooked { - $term->setlflag($oterm); - $term->setcc(VTIME, 0); - $term->setattr($fd_stdin, TCSANOW); - } + sub cooked { + $term->setlflag($oterm); + $term->setcc(VTIME, 0); + $term->setattr($fd_stdin, TCSANOW); + } - sub readkey { - my $key = ''; - cbreak(); - sysread(STDIN, $key, 1); - cooked(); - return $key; - } + sub readkey { + my $key = ''; + cbreak(); + sysread(STDIN, $key, 1); + cooked(); + return $key; + } - END { cooked() } + END { cooked() } - 1; + 1; =head2 How do I check whether input is ready on the keyboard? The easiest way to do this is to read a key in nonblocking mode with the -C<Term::ReadKey> module from CPAN, passing it an argument of -1 to indicate +L<Term::ReadKey> module from CPAN, passing it an argument of -1 to indicate not to block: - use Term::ReadKey; + use Term::ReadKey; - ReadMode('cbreak'); + ReadMode('cbreak'); - if (defined ($char = ReadKey(-1)) ) { - # input was waiting and it was $char - } else { - # no input was waiting - } + if (defined ($char = ReadKey(-1)) ) { + # input was waiting and it was $char + } else { + # no input was waiting + } - ReadMode('normal'); # restore normal tty settings + ReadMode('normal'); # restore normal tty settings =head2 How do I clear the screen? @@ -180,66 +180,66 @@ To clear the screen, you just have to print the special sequence that tells the terminal to clear the screen. Once you have that sequence, output it when you want to clear the screen. -You can use the C<Term::ANSIScreen> module to get the special +You can use the L<Term::ANSIScreen> module to get the special sequence. Import the C<cls> function (or the C<:screen> tag): - use Term::ANSIScreen qw(cls); - my $clear_screen = cls(); + use Term::ANSIScreen qw(cls); + my $clear_screen = cls(); - print $clear_screen; + print $clear_screen; -The C<Term::Cap> module can also get the special sequence if you want +The L<Term::Cap> module can also get the special sequence if you want to deal with the low-level details of terminal control. The C<Tputs> method returns the string for the given capability: - use Term::Cap; + use Term::Cap; - $terminal = Term::Cap->Tgetent( { OSPEED => 9600 } ); - $clear_string = $terminal->Tputs('cl'); + $terminal = Term::Cap->Tgetent( { OSPEED => 9600 } ); + $clear_string = $terminal->Tputs('cl'); - print $clear_screen; + print $clear_screen; -On Windows, you can use the C<Win32::Console> module. After creating +On Windows, you can use the L<Win32::Console> module. After creating an object for the output filehandle you want to affect, call the C<Cls> method: - Win32::Console; + Win32::Console; - $OUT = Win32::Console->new(STD_OUTPUT_HANDLE); - my $clear_string = $OUT->Cls; + $OUT = Win32::Console->new(STD_OUTPUT_HANDLE); + my $clear_string = $OUT->Cls; - print $clear_screen; + print $clear_screen; If you have a command-line program that does the job, you can call it in backticks to capture whatever it outputs so you can use it later: - $clear_string = `clear`; + $clear_string = `clear`; - print $clear_string; + print $clear_string; =head2 How do I get the screen size? -If you have C<Term::ReadKey> module installed from CPAN, +If you have L<Term::ReadKey> module installed from CPAN, you can use it to fetch the width and height in characters and in pixels: - use Term::ReadKey; - ($wchar, $hchar, $wpixels, $hpixels) = GetTerminalSize(); + use Term::ReadKey; + ($wchar, $hchar, $wpixels, $hpixels) = GetTerminalSize(); This is more portable than the raw C<ioctl>, but not as illustrative: - require 'sys/ioctl.ph'; - die "no TIOCGWINSZ " unless defined &TIOCGWINSZ; - open(TTY, "+</dev/tty") or die "No tty: $!"; - unless (ioctl(TTY, &TIOCGWINSZ, $winsize='')) { - die sprintf "$0: ioctl TIOCGWINSZ (%08x: $!)\n", &TIOCGWINSZ; - } - ($row, $col, $xpixel, $ypixel) = unpack('S4', $winsize); - print "(row,col) = ($row,$col)"; - print " (xpixel,ypixel) = ($xpixel,$ypixel)" if $xpixel || $ypixel; - print "\n"; + require 'sys/ioctl.ph'; + die "no TIOCGWINSZ " unless defined &TIOCGWINSZ; + open(TTY, "+</dev/tty") or die "No tty: $!"; + unless (ioctl(TTY, &TIOCGWINSZ, $winsize='')) { + die sprintf "$0: ioctl TIOCGWINSZ (%08x: $!)\n", &TIOCGWINSZ; + } + ($row, $col, $xpixel, $ypixel) = unpack('S4', $winsize); + print "(row,col) = ($row,$col)"; + print " (xpixel,ypixel) = ($xpixel,$ypixel)" if $xpixel || $ypixel; + print "\n"; =head2 How do I ask the user for a password? @@ -252,13 +252,13 @@ You may do this with an old-style C<ioctl()> function, POSIX terminal control (see L<POSIX> or its documentation the Camel Book), or a call to the B<stty> program, with varying degrees of portability. -You can also do this for most systems using the C<Term::ReadKey> module +You can also do this for most systems using the L<Term::ReadKey> module from CPAN, which is easier to use and in theory more portable. - use Term::ReadKey; + use Term::ReadKey; - ReadMode('noecho'); - $password = ReadLine(0); + ReadMode('noecho'); + $password = ReadLine(0); =head2 How do I read and write the serial port? @@ -282,7 +282,7 @@ If you expect to use both read and write operations on the device, you'll have to open it for update (see L<perlfunc/"open"> for details). You may wish to open it without running the risk of blocking by using C<sysopen()> and C<O_RDWR|O_NDELAY|O_NOCTTY> from the -C<Fcntl> module (part of the standard perl distribution). See +L<Fcntl> module (part of the standard perl distribution). See L<perlfunc/"sysopen"> for more on this approach. =item end of line @@ -293,8 +293,8 @@ their usual (Unix) ASCII values of "\015" and "\012". You may have to give the numeric values you want directly, using octal ("\015"), hex ("0x0D"), or as a control-character specification ("\cM"). - print DEV "atv1\012"; # wrong, for some devices - print DEV "atv1\015"; # right, for some devices + print DEV "atv1\012"; # wrong, for some devices + print DEV "atv1\015"; # right, for some devices Even though with normal text files a "\n" will do the trick, there is still no unified scheme for terminating a line that is portable @@ -311,19 +311,19 @@ and the C<$|> variable to control autoflushing (see L<perlvar/$E<verbar>> and L<perlfunc/select>, or L<perlfaq5>, "How do I flush/unbuffer an output filehandle? Why must I do this?"): - $oldh = select(DEV); - $| = 1; - select($oldh); + $oldh = select(DEV); + $| = 1; + select($oldh); You'll also see code that does this without a temporary variable, as in - select((select(DEV), $| = 1)[0]); + select((select(DEV), $| = 1)[0]); Or if you don't mind pulling in a few thousand lines of code just because you're afraid of a little C<$|> variable: - use IO::Handle; - DEV->autoflush(1); + use IO::Handle; + DEV->autoflush(1); As mentioned in the previous item, this still doesn't work when using socket I/O between Unix and Macintosh. You'll need to hard code your @@ -346,19 +346,19 @@ fighting with C<sysread>, C<sysopen>, POSIX's C<tcgetattr> business, and various other functions that go bump in the night, finally came up with this: - sub open_modem { - use IPC::Open2; - my $stty = `/bin/stty -g`; - open2( \*MODEM_IN, \*MODEM_OUT, "cu -l$modem_device -s2400 2>&1"); - # starting cu hoses /dev/tty's stty settings, even when it has - # been opened on a pipe... - system("/bin/stty $stty"); - $_ = <MODEM_IN>; - chomp; - if ( !m/^Connected/ ) { - print STDERR "$0: cu printed `$_' instead of `Connected'\n"; - } - } + sub open_modem { + use IPC::Open2; + my $stty = `/bin/stty -g`; + open2( \*MODEM_IN, \*MODEM_OUT, "cu -l$modem_device -s2400 2>&1"); + # starting cu hoses /dev/tty's stty settings, even when it has + # been opened on a pipe... + system("/bin/stty $stty"); + $_ = <MODEM_IN>; + chomp; + if ( !m/^Connected/ ) { + print STDERR "$0: cu printed `$_' instead of `Connected'\n"; + } + } =head2 How do I decode encrypted password files? @@ -385,16 +385,16 @@ have to wait for it to finish before your program moves on to other tasks. Process management depends on your particular operating system, and many of the techniques are in L<perlipc>. -Several CPAN modules may be able to help, including C<IPC::Open2> or -C<IPC::Open3>, C<IPC::Run>, C<Parallel::Jobs>, -C<Parallel::ForkManager>, C<POE>, C<Proc::Background>, and -C<Win32::Process>. There are many other modules you might use, so +Several CPAN modules may be able to help, including L<IPC::Open2> or +L<IPC::Open3>, L<IPC::Run>, L<Parallel::Jobs>, +L<Parallel::ForkManager>, L<POE>, L<Proc::Background>, and +L<Win32::Process>. There are many other modules you might use, so check those namespaces for other options too. If you are on a Unix-like system, you might be able to get away with a system call where you put an C<&> on the end of the command: - system("cmd &") + system("cmd &") You can also try using C<fork>, as described in L<perlfunc> (although this is the same thing that many of the modules will do for you). @@ -422,22 +422,22 @@ not an issue with C<system("cmd&")>. You have to be prepared to "reap" the child process when it finishes. - $SIG{CHLD} = sub { wait }; + $SIG{CHLD} = sub { wait }; - $SIG{CHLD} = 'IGNORE'; + $SIG{CHLD} = 'IGNORE'; You can also use a double fork. You immediately C<wait()> for your first child, and the init daemon will C<wait()> for your grandchild once it exits. - unless ($pid = fork) { - unless (fork) { - exec "what you really wanna do"; - die "exec failed!"; - } - exit 0; - } - waitpid($pid, 0); + unless ($pid = fork) { + unless (fork) { + exec "what you really wanna do"; + die "exec failed!"; + } + exit 0; + } + waitpid($pid, 0); See L<perlipc/"Signals"> for other examples of code to do this. Zombies are not an issue with C<system("prog &")>. @@ -457,17 +457,17 @@ to handle the signal. After perl catches the signal, it looks in C<%SIG> for a key with the same name as the signal, then calls the subroutine value for that key. - # as an anonymous subroutine + # as an anonymous subroutine - $SIG{INT} = sub { syswrite(STDERR, "ouch\n", 5 ) }; + $SIG{INT} = sub { syswrite(STDERR, "ouch\n", 5 ) }; - # or a reference to a function + # or a reference to a function - $SIG{INT} = \&ouch; + $SIG{INT} = \&ouch; - # or the name of the function as a string + # or the name of the function as a string - $SIG{INT} = "ouch"; + $SIG{INT} = "ouch"; Perl versions before 5.8 had in its C source code signal handlers which would catch the signal and possibly run a Perl function that you had set @@ -496,28 +496,28 @@ the VMS equivalent is C<set time>. However, if all you want to do is change your time zone, you can probably get away with setting an environment variable: - $ENV{TZ} = "MST7MDT"; # Unixish - $ENV{'SYS$TIMEZONE_DIFFERENTIAL'}="-5" # vms - system "trn comp.lang.perl.misc"; + $ENV{TZ} = "MST7MDT"; # Unixish + $ENV{'SYS$TIMEZONE_DIFFERENTIAL'}="-5" # vms + system "trn comp.lang.perl.misc"; =head2 How can I sleep() or alarm() for under a second? X<Time::HiRes> X<BSD::Itimer> X<sleep> X<select> If you want finer granularity than the 1 second that the C<sleep()> function provides, the easiest way is to use the C<select()> function as -documented in L<perlfunc/"select">. Try the C<Time::HiRes> and -the C<BSD::Itimer> modules (available from CPAN, and starting from -Perl 5.8 C<Time::HiRes> is part of the standard distribution). +documented in L<perlfunc/"select">. Try the L<Time::HiRes> and +the L<BSD::Itimer> modules (available from CPAN, and starting from +Perl 5.8 L<Time::HiRes> is part of the standard distribution). =head2 How can I measure time under a second? X<Time::HiRes> X<BSD::Itimer> X<sleep> X<select> (contributed by brian d foy) -The C<Time::HiRes> module (part of the standard distribution as of +The L<Time::HiRes> module (part of the standard distribution as of Perl 5.8) measures time with the C<gettimeofday()> system call, which returns the time in microseconds since the epoch. If you can't install -C<Time::HiRes> for older Perls and you are on a Unixish system, you +L<Time::HiRes> for older Perls and you are on a Unixish system, you may be able to call C<gettimeofday(2)> directly. See L<perlfunc/syscall>. @@ -530,14 +530,14 @@ manpage for more details about C<END> blocks. For example, you can use this to make sure your filter program managed to finish its output without filling up the disk: - END { - close(STDOUT) || die "stdout close failed: $!"; - } + END { + close(STDOUT) || die "stdout close failed: $!"; + } The C<END> block isn't called when untrapped signals kill the program, though, so if you use C<END> blocks you should also use - use sigtrap qw(die normal-signals); + use sigtrap qw(die normal-signals); Perl's exception-handling mechanism is its C<eval()> operator. You can use C<eval()> as C<setjmp> and C<die()> as C<longjmp>. For @@ -546,7 +546,7 @@ handler for a blocking C<flock()> in L<perlipc/"Signals"> or the section on "Signals" in I<Programming Perl>. If exception handling is all you're interested in, use one of the -many CPAN modules that handle exceptions, such as C<Try::Tiny>. +many CPAN modules that handle exceptions, such as L<Try::Tiny>. If you want the C<atexit()> syntax (and an C<rmexit()> as well), try the C<AtExit> module available from CPAN. @@ -571,13 +571,13 @@ L<perlfunc>). Remember to check the modules that came with your distribution, and CPAN as well--someone may already have written a module to do it. On -Windows, try C<Win32::API>. On Macs, try C<Mac::Carbon>. If no module +Windows, try L<Win32::API>. On Macs, try L<Mac::Carbon>. If no module has an interface to the C function, you can inline a bit of C in your -Perl source with C<Inline::C>. +Perl source with L<Inline::C>. =head2 Where do I get the include files to do ioctl() or syscall()? -Historically, these would be generated by the C<h2ph> tool, part of the +Historically, these would be generated by the L<h2ph> tool, part of the standard perl distribution. This program converts C<cpp(1)> directives in C header files to files containing subroutine definitions, like C<&SYS_getitimer>, which you can use as arguments to your functions. @@ -586,17 +586,17 @@ Simple files like F<errno.h>, F<syscall.h>, and F<socket.h> were fine, but the hard ones like F<ioctl.h> nearly always need to be hand-edited. Here's how to install the *.ph files: - 1. become super-user - 2. cd /usr/include - 3. h2ph *.h */*.h + 1. become super-user + 2. cd /usr/include + 3. h2ph *.h */*.h If your system supports dynamic loading, for reasons of portability and -sanity you probably ought to use C<h2xs> (also part of the standard perl +sanity you probably ought to use L<h2xs> (also part of the standard perl distribution). This tool converts C header files to Perl extensions. -See L<perlxstut> for how to get started with C<h2xs>. +See L<perlxstut> for how to get started with L<h2xs>. If your system doesn't support dynamic loading, you still probably -ought to use C<h2xs>. See L<perlxstut> and L<ExtUtils::MakeMaker> for +ought to use L<h2xs>. See L<perlxstut> and L<ExtUtils::MakeMaker> for more information (in brief, just use B<make perl> instead of a plain B<make> to rebuild perl with a new static extension). @@ -608,16 +608,16 @@ scripts inherently insecure. Perl gives you a number of options =head2 How can I open a pipe both to and from a command? -The C<IPC::Open2> module (part of the standard perl distribution) is +The L<IPC::Open2> module (part of the standard perl distribution) is an easy-to-use approach that internally uses C<pipe()>, C<fork()>, and C<exec()> to do the job. Make sure you read the deadlock warnings in its documentation, though (see L<IPC::Open2>). See L<perlipc/"Bidirectional Communication with Another Process"> and L<perlipc/"Bidirectional Communication with Yourself"> -You may also use the C<IPC::Open3> module (part of the standard perl +You may also use the L<IPC::Open3> module (part of the standard perl distribution), but be warned that it has a different order of -arguments from C<IPC::Open2> (see L<IPC::Open3>). +arguments from L<IPC::Open2> (see L<IPC::Open3>). =head2 Why can't I get the output of a command with system()? @@ -627,105 +627,105 @@ the low 7 bits are the signal the process died from, if any, and the high 8 bits are the actual exit value). Backticks (``) run a command and return what it sent to STDOUT. - $exit_status = system("mail-users"); - $output_string = `ls`; + $exit_status = system("mail-users"); + $output_string = `ls`; =head2 How can I capture STDERR from an external command? There are three basic ways of running external commands: - system $cmd; # using system() - $output = `$cmd`; # using backticks (``) - open (PIPE, "cmd |"); # using open() + system $cmd; # using system() + $output = `$cmd`; # using backticks (``) + open (PIPE, "cmd |"); # using open() With C<system()>, both STDOUT and STDERR will go the same place as the script's STDOUT and STDERR, unless the C<system()> command redirects them. Backticks and C<open()> read B<only> the STDOUT of your command. -You can also use the C<open3()> function from C<IPC::Open3>. Benjamin +You can also use the C<open3()> function from L<IPC::Open3>. Benjamin Goldberg provides some sample code: To capture a program's STDOUT, but discard its STDERR: - use IPC::Open3; - use File::Spec; - use Symbol qw(gensym); - open(NULL, ">", File::Spec->devnull); - my $pid = open3(gensym, \*PH, ">&NULL", "cmd"); - while( <PH> ) { } - waitpid($pid, 0); + use IPC::Open3; + use File::Spec; + use Symbol qw(gensym); + open(NULL, ">", File::Spec->devnull); + my $pid = open3(gensym, \*PH, ">&NULL", "cmd"); + while( <PH> ) { } + waitpid($pid, 0); To capture a program's STDERR, but discard its STDOUT: - use IPC::Open3; - use File::Spec; - use Symbol qw(gensym); - open(NULL, ">", File::Spec->devnull); - my $pid = open3(gensym, ">&NULL", \*PH, "cmd"); - while( <PH> ) { } - waitpid($pid, 0); + use IPC::Open3; + use File::Spec; + use Symbol qw(gensym); + open(NULL, ">", File::Spec->devnull); + my $pid = open3(gensym, ">&NULL", \*PH, "cmd"); + while( <PH> ) { } + waitpid($pid, 0); To capture a program's STDERR, and let its STDOUT go to our own STDERR: - use IPC::Open3; - use Symbol qw(gensym); - my $pid = open3(gensym, ">&STDERR", \*PH, "cmd"); - while( <PH> ) { } - waitpid($pid, 0); + use IPC::Open3; + use Symbol qw(gensym); + my $pid = open3(gensym, ">&STDERR", \*PH, "cmd"); + while( <PH> ) { } + waitpid($pid, 0); To read both a command's STDOUT and its STDERR separately, you can redirect them to temp files, let the command run, then read the temp files: - use IPC::Open3; - use Symbol qw(gensym); - use IO::File; - local *CATCHOUT = IO::File->new_tmpfile; - local *CATCHERR = IO::File->new_tmpfile; - my $pid = open3(gensym, ">&CATCHOUT", ">&CATCHERR", "cmd"); - waitpid($pid, 0); - seek $_, 0, 0 for \*CATCHOUT, \*CATCHERR; - while( <CATCHOUT> ) {} - while( <CATCHERR> ) {} + use IPC::Open3; + use Symbol qw(gensym); + use IO::File; + local *CATCHOUT = IO::File->new_tmpfile; + local *CATCHERR = IO::File->new_tmpfile; + my $pid = open3(gensym, ">&CATCHOUT", ">&CATCHERR", "cmd"); + waitpid($pid, 0); + seek $_, 0, 0 for \*CATCHOUT, \*CATCHERR; + while( <CATCHOUT> ) {} + while( <CATCHERR> ) {} But there's no real need for B<both> to be tempfiles... the following should work just as well, without deadlocking: - use IPC::Open3; - use Symbol qw(gensym); - use IO::File; - local *CATCHERR = IO::File->new_tmpfile; - my $pid = open3(gensym, \*CATCHOUT, ">&CATCHERR", "cmd"); - while( <CATCHOUT> ) {} - waitpid($pid, 0); - seek CATCHERR, 0, 0; - while( <CATCHERR> ) {} + use IPC::Open3; + use Symbol qw(gensym); + use IO::File; + local *CATCHERR = IO::File->new_tmpfile; + my $pid = open3(gensym, \*CATCHOUT, ">&CATCHERR", "cmd"); + while( <CATCHOUT> ) {} + waitpid($pid, 0); + seek CATCHERR, 0, 0; + while( <CATCHERR> ) {} And it'll be faster, too, since we can begin processing the program's stdout immediately, rather than waiting for the program to finish. With any of these, you can change file descriptors before the call: - open(STDOUT, ">logfile"); - system("ls"); + open(STDOUT, ">logfile"); + system("ls"); or you can use Bourne shell file-descriptor redirection: - $output = `$cmd 2>some_file`; - open (PIPE, "cmd 2>some_file |"); + $output = `$cmd 2>some_file`; + open (PIPE, "cmd 2>some_file |"); You can also use file-descriptor redirection to make STDERR a duplicate of STDOUT: - $output = `$cmd 2>&1`; - open (PIPE, "cmd 2>&1 |"); + $output = `$cmd 2>&1`; + open (PIPE, "cmd 2>&1 |"); Note that you I<cannot> simply open STDERR to be a dup of STDOUT in your Perl program and avoid calling the shell to do the redirection. This doesn't work: - open(STDERR, ">&STDOUT"); - $alloutput = `cmd args`; # stderr still escapes + open(STDERR, ">&STDOUT"); + $alloutput = `cmd args`; # stderr still escapes This fails because the C<open()> makes STDERR go to where STDOUT was going at the time of the C<open()>. The backticks then make STDOUT go to @@ -739,40 +739,40 @@ F<versus/csh.whynot> article in the "Far More Than You Ever Wanted To Know" collection in http://www.cpan.org/misc/olddoc/FMTEYEWTK.tgz . To capture a command's STDERR and STDOUT together: - $output = `cmd 2>&1`; # either with backticks - $pid = open(PH, "cmd 2>&1 |"); # or with an open pipe - while (<PH>) { } # plus a read + $output = `cmd 2>&1`; # either with backticks + $pid = open(PH, "cmd 2>&1 |"); # or with an open pipe + while (<PH>) { } # plus a read To capture a command's STDOUT but discard its STDERR: - $output = `cmd 2>/dev/null`; # either with backticks - $pid = open(PH, "cmd 2>/dev/null |"); # or with an open pipe - while (<PH>) { } # plus a read + $output = `cmd 2>/dev/null`; # either with backticks + $pid = open(PH, "cmd 2>/dev/null |"); # or with an open pipe + while (<PH>) { } # plus a read To capture a command's STDERR but discard its STDOUT: - $output = `cmd 2>&1 1>/dev/null`; # either with backticks - $pid = open(PH, "cmd 2>&1 1>/dev/null |"); # or with an open pipe - while (<PH>) { } # plus a read + $output = `cmd 2>&1 1>/dev/null`; # either with backticks + $pid = open(PH, "cmd 2>&1 1>/dev/null |"); # or with an open pipe + while (<PH>) { } # plus a read To exchange a command's STDOUT and STDERR in order to capture the STDERR but leave its STDOUT to come out our old STDERR: - $output = `cmd 3>&1 1>&2 2>&3 3>&-`; # either with backticks - $pid = open(PH, "cmd 3>&1 1>&2 2>&3 3>&-|");# or with an open pipe - while (<PH>) { } # plus a read + $output = `cmd 3>&1 1>&2 2>&3 3>&-`; # either with backticks + $pid = open(PH, "cmd 3>&1 1>&2 2>&3 3>&-|");# or with an open pipe + while (<PH>) { } # plus a read To read both a command's STDOUT and its STDERR separately, it's easiest to redirect them separately to files, and then read from those files when the program is done: - system("program args 1>program.stdout 2>program.stderr"); + system("program args 1>program.stdout 2>program.stderr"); Ordering is important in all these examples. That's because the shell processes file descriptor redirections in strictly left to right order. - system("prog args 1>tmpfile 2>&1"); - system("prog args 2>&1 1>tmpfile"); + system("prog args 1>tmpfile 2>&1"); + system("prog args 2>&1 1>tmpfile"); The first command sends both standard out and standard error to the temporary file. The second command sends only the old standard output @@ -788,7 +788,7 @@ your Perl program can find out is whether the shell itself could be successfully started. You can still capture the shell's STDERR and check it for error messages. See L<"How can I capture STDERR from an external command?"> elsewhere in this document, or use the -C<IPC::Open3> module. +L<IPC::Open3> module. If there are no shell metacharacters in the argument of C<open()>, Perl runs the command directly, without using the shell, and can correctly @@ -808,17 +808,17 @@ Why send a clear message that isn't true? Consider this line: - `cat /etc/termcap`; + `cat /etc/termcap`; You forgot to check C<$?> to see whether the program even ran correctly. Even if you wrote - print `cat /etc/termcap`; + print `cat /etc/termcap`; this code could and probably should be written as - system("cat /etc/termcap") == 0 - or die "cat program failed!"; + system("cat /etc/termcap") == 0 + or die "cat program failed!"; which will echo the cat command's output as it is generated, instead of waiting until the program has completed to print it out. It also @@ -832,28 +832,28 @@ processing may take place, whereas backticks do not. This is a bit tricky. You can't simply write the command like this: - @ok = `grep @opts '$search_string' @filenames`; + @ok = `grep @opts '$search_string' @filenames`; As of Perl 5.8.0, you can use C<open()> with multiple arguments. Just like the list forms of C<system()> and C<exec()>, no shell escapes happen. - open( GREP, "-|", 'grep', @opts, $search_string, @filenames ); - chomp(@ok = <GREP>); - close GREP; + open( GREP, "-|", 'grep', @opts, $search_string, @filenames ); + chomp(@ok = <GREP>); + close GREP; You can also: - my @ok = (); - if (open(GREP, "-|")) { - while (<GREP>) { - chomp; - push(@ok, $_); - } - close GREP; - } else { - exec 'grep', @opts, $search_string, @filenames; - } + my @ok = (); + if (open(GREP, "-|")) { + while (<GREP>) { + chomp; + push(@ok, $_); + } + close GREP; + } else { + exec 'grep', @opts, $search_string, @filenames; + } Just as with C<system()>, no shell escapes happen when you C<exec()> a list. Further examples of this can be found in L<perlipc/"Safe Pipe @@ -867,7 +867,7 @@ stuck, because Windows does not have an argc/argv-style API. This happens only if your perl is compiled to use stdio instead of perlio, which is the default. Some (maybe all?) stdios set error and -eof flags that you may need to clear. The C<POSIX> module defines +eof flags that you may need to clear. The L<POSIX> module defines C<clearerr()> that you can use. That is the technically correct way to do it. Here are some less reliable workarounds: @@ -877,8 +877,8 @@ do it. Here are some less reliable workarounds: Try keeping around the seekpointer and go there, like this: - $where = tell(LOG); - seek(LOG, $where, 0); + $where = tell(LOG); + seek(LOG, $where, 0); =item 2 @@ -908,28 +908,28 @@ causes many inefficiencies. =head2 Can I use perl to run a telnet or ftp session? -Try the C<Net::FTP>, C<TCP::Client>, and C<Net::Telnet> modules +Try the L<Net::FTP>, L<TCP::Client>, and L<Net::Telnet> modules (available from CPAN). http://www.cpan.org/scripts/netstuff/telnet.emul.shar will also help -for emulating the telnet protocol, but C<Net::Telnet> is quite +for emulating the telnet protocol, but L<Net::Telnet> is quite probably easier to use. If all you want to do is pretend to be telnet but don't need the initial telnet handshaking, then the standard dual-process approach will suffice: - use IO::Socket; # new in 5.004 - $handle = IO::Socket::INET->new('www.perl.com:80') - or die "can't connect to port 80 on www.perl.com: $!"; - $handle->autoflush(1); - if (fork()) { # XXX: undef means failure - select($handle); - print while <STDIN>; # everything from stdin to socket - } else { - print while <$handle>; # everything from socket to stdout - } - close $handle; - exit; + use IO::Socket; # new in 5.004 + $handle = IO::Socket::INET->new('www.perl.com:80') + or die "can't connect to port 80 on www.perl.com: $!"; + $handle->autoflush(1); + if (fork()) { # XXX: undef means failure + select($handle); + print while <STDIN>; # everything from stdin to socket + } else { + print while <$handle>; # everything from socket to stdout + } + close $handle; + exit; =head2 How can I write expect in Perl? @@ -937,7 +937,7 @@ Once upon a time, there was a library called F<chat2.pl> (part of the standard perl distribution), which never really got finished. If you find it somewhere, I<don't use it>. These days, your best bet is to look at the Expect module available from CPAN, which also requires two -other modules from CPAN, C<IO::Pty> and C<IO::Stty>. +other modules from CPAN, L<IO::Pty> and L<IO::Stty>. =head2 Is there a way to hide perl's command line from programs such as "ps"? @@ -952,7 +952,7 @@ variable $0 as documented in L<perlvar>. This won't work on all operating systems, though. Daemon programs like sendmail place their state there, as in: - $0 = "orcus [accepting connections]"; + $0 = "orcus [accepting connections]"; =head2 I {changed directory, modified my environment} in a perl script. How come the change disappeared when I exited the script? How do I get my changes to be visible? @@ -1003,11 +1003,11 @@ tty. Background yourself like this: - fork && exit; + fork && exit; =back -The C<Proc::Daemon> module, available from CPAN, provides a function to +The L<Proc::Daemon> module, available from CPAN, provides a function to perform these actions for you. =head2 How do I find out if I'm running interactively or not? @@ -1021,24 +1021,24 @@ What do you really want to know? If you merely want to know if one of your filehandles is connected to a terminal, you can try the C<-t> file test: - if( -t STDOUT ) { - print "I'm connected to a terminal!\n"; - } + if( -t STDOUT ) { + print "I'm connected to a terminal!\n"; + } However, you might be out of luck if you expect that means there is a -real person on the other side. With the C<Expect> module, another +real person on the other side. With the L<Expect> module, another program can pretend to be a person. The program might even come close to passing the Turing test. -The C<IO::Interactive> module does the best it can to give you an +The L<IO::Interactive> module does the best it can to give you an answer. Its C<is_interactive> function returns an output filehandle; that filehandle points to standard output if the module thinks the session is interactive. Otherwise, the filehandle is a null handle that simply discards the output: - use IO::Interactive; + use IO::Interactive; - print { is_interactive } "I might go to standard output!\n"; + print { is_interactive } "I might go to standard output!\n"; This still doesn't guarantee that a real person is answering your prompts or reading your output. @@ -1047,16 +1047,16 @@ If you want to know how to handle automated testing for your distribution, you can check the environment. The CPAN Testers, for instance, set the value of C<AUTOMATED_TESTING>: - unless( $ENV{AUTOMATED_TESTING} ) { - print "Hello interactive tester!\n"; - } + unless( $ENV{AUTOMATED_TESTING} ) { + print "Hello interactive tester!\n"; + } =head2 How do I timeout a slow event? Use the C<alarm()> function, probably in conjunction with a signal handler, as documented in L<perlipc/"Signals"> and the section on "Signals" in the Camel. You may instead use the more flexible -C<Sys::AlarmCall> module available from CPAN. +L<Sys::AlarmCall> module available from CPAN. The C<alarm()> function is not implemented on all versions of Windows. Check the documentation for your specific version of Perl. @@ -1066,10 +1066,10 @@ X<BSD::Resource> X<limit> X<CPU> (contributed by Xho) -Use the C<BSD::Resource> module from CPAN. As an example: +Use the L<BSD::Resource> module from CPAN. As an example: - use BSD::Resource; - setrlimit(RLIMIT_CPU,10,20) or die $!; + use BSD::Resource; + setrlimit(RLIMIT_CPU,10,20) or die $!; This sets the soft and hard limits to 10 and 20 seconds, respectively. After 10 seconds of time spent running on the CPU (not "wall" time), @@ -1078,7 +1078,7 @@ trapped, will cause the process to terminate. If that signal is trapped, then after 10 more seconds (20 seconds in total) the process will be killed with a non-trappable signal. -See the C<BSD::Resource> and your systems documentation for the gory +See the L<BSD::Resource> and your systems documentation for the gory details. =head2 How do I avoid zombies on a Unix system? @@ -1089,14 +1089,14 @@ in L<perlfaq8/"How do I start a process in the background?">. =head2 How do I use an SQL database? -The C<DBI> module provides an abstract interface to most database +The L<DBI> module provides an abstract interface to most database servers and types, including Oracle, DB2, Sybase, mysql, Postgresql, ODBC, and flat files. The DBI module accesses each database type through a database driver, or DBD. You can see a complete list of available drivers on CPAN: http://www.cpan.org/modules/by-module/DBD/ . You can read more about DBI on http://dbi.perl.org . -Other modules provide more specific access: C<Win32::ODBC>, C<Alzabo>, +Other modules provide more specific access: L<Win32::ODBC>, L<Alzabo>, C<iodbc>, and others found on CPAN Search: http://search.cpan.org . =head2 How do I make a system() exit on control-C? @@ -1105,8 +1105,8 @@ You can't. You need to imitate the C<system()> call (see L<perlipc> for sample code) and then have a signal handler for the INT signal that passes the signal on to the subprocess. Or you can check for it: - $rc = system($cmd); - if ($rc & 127) { die "signal death" } + $rc = system($cmd); + if ($rc & 127) { die "signal death" } =head2 How do I open a file without blocking? @@ -1115,9 +1115,9 @@ non-blocking reads (most Unixish systems do), you need only to use the C<O_NDELAY> or C<O_NONBLOCK> flag from the C<Fcntl> module in conjunction with C<sysopen()>: - use Fcntl; - sysopen(FH, "/foo/somefile", O_WRONLY|O_NDELAY|O_CREAT, 0644) - or die "can't open /foo/somefile: $!": + use Fcntl; + sysopen(FH, "/foo/somefile", O_WRONLY|O_NDELAY|O_CREAT, 0644) + or die "can't open /foo/somefile: $!": =head2 How do I tell the difference between errors from the shell and perl? @@ -1133,43 +1133,43 @@ perl outputs its warnings by defining a custom warning and die functions. Consider this script, which has an error you may not notice immediately. - #!/usr/locl/bin/perl + #!/usr/locl/bin/perl - print "Hello World\n"; + print "Hello World\n"; I get an error when I run this from my shell (which happens to be bash). That may look like perl forgot it has a C<print()> function, but my shebang line is not the path to perl, so the shell runs the script, and I get the error. - $ ./test - ./test: line 3: print: command not found + $ ./test + ./test: line 3: print: command not found A quick and dirty fix involves a little bit of code, but this may be all you need to figure out the problem. - #!/usr/bin/perl -w + #!/usr/bin/perl -w - BEGIN { - $SIG{__WARN__} = sub{ print STDERR "Perl: ", @_; }; - $SIG{__DIE__} = sub{ print STDERR "Perl: ", @_; exit 1}; - } + BEGIN { + $SIG{__WARN__} = sub{ print STDERR "Perl: ", @_; }; + $SIG{__DIE__} = sub{ print STDERR "Perl: ", @_; exit 1}; + } - $a = 1 + undef; - $x / 0; - __END__ + $a = 1 + undef; + $x / 0; + __END__ The perl message comes out with "Perl" in front. The C<BEGIN> block works at compile time so all of the compilation errors and warnings get the "Perl:" prefix too. - Perl: Useless use of division (/) in void context at ./test line 9. - Perl: Name "main::a" used only once: possible typo at ./test line 8. - Perl: Name "main::x" used only once: possible typo at ./test line 9. - Perl: Use of uninitialized value in addition (+) at ./test line 8. - Perl: Use of uninitialized value in division (/) at ./test line 9. - Perl: Illegal division by zero at ./test line 9. - Perl: Illegal division by zero at -e line 3. + Perl: Useless use of division (/) in void context at ./test line 9. + Perl: Name "main::a" used only once: possible typo at ./test line 8. + Perl: Name "main::x" used only once: possible typo at ./test line 9. + Perl: Use of uninitialized value in addition (+) at ./test line 8. + Perl: Use of uninitialized value in division (/) at ./test line 9. + Perl: Illegal division by zero at ./test line 9. + Perl: Illegal division by zero at -e line 3. If I don't see that "Perl:", it's not from perl. @@ -1182,7 +1182,7 @@ Looking up every message is not the easiest way, so let perl to do it for you. Use the diagnostics pragma with turns perl's normal messages into longer discussions on the topic. - use diagnostics; + use diagnostics; If you don't get a paragraph or two of expanded discussion, it might not be perl's message. @@ -1195,16 +1195,16 @@ The easiest way is to have a module also named CPAN do it for you by using the C<cpan> command that comes with Perl. You can give it a list of modules to install: - $ cpan IO::Interactive Getopt::Whatever + $ cpan IO::Interactive Getopt::Whatever If you prefer C<CPANPLUS>, it's just as easy: - $ cpanp i IO::Interactive Getopt::Whatever + $ cpanp i IO::Interactive Getopt::Whatever If you want to install a distribution from the current directory, you can tell C<CPAN.pm> to install C<.> (the full stop): - $ cpan . + $ cpan . See the documentation for either of those commands to see what else you can do. @@ -1215,14 +1215,14 @@ paths. For distributions that use I<Makefile.PL>: - $ perl Makefile.PL - $ make test install + $ perl Makefile.PL + $ make test install For distributions that use I<Build.PL>: - $ perl Build.PL - $ ./Build test - $ ./Build install + $ perl Build.PL + $ ./Build test + $ ./Build install Some distributions may need to link to libraries or other third-party code and their build and installation sequences may be more complicated. @@ -1237,27 +1237,27 @@ and runs the file, it doesn't do anything else. The C<use> statement is the same as a C<require> run at compile-time, but Perl also calls the C<import> method for the loaded package. These two are the same: - use MODULE qw(import list); + use MODULE qw(import list); - BEGIN { - require MODULE; - MODULE->import(import list); - } + BEGIN { + require MODULE; + MODULE->import(import list); + } However, you can suppress the C<import> by using an explicit, empty import list. Both of these still happen at compile-time: - use MODULE (); + use MODULE (); - BEGIN { - require MODULE; - } + BEGIN { + require MODULE; + } Since C<use> will also call the C<import> method, the actual value for C<MODULE> must be a bareword. That is, C<use> cannot load files by name, although C<require> can: - require "$ENV{HOME}/lib/Foo.pm"; # no @INC searching! + require "$ENV{HOME}/lib/Foo.pm"; # no @INC searching! See the entry for C<use> in L<perlfunc> for more details. @@ -1266,7 +1266,7 @@ See the entry for C<use> in L<perlfunc> for more details. When you build modules, tell Perl where to install the modules. If you want to install modules for your own use, the easiest way might -be C<local::lib>, which you can download from CPAN. It sets various +be L<local::lib>, which you can download from CPAN. It sets various installation settings for you, and uses those same settings within your programs. @@ -1276,25 +1276,25 @@ for your particular situation. For C<Makefile.PL>-based distributions, use the INSTALL_BASE option when generating Makefiles: - perl Makefile.PL INSTALL_BASE=/mydir/perl + perl Makefile.PL INSTALL_BASE=/mydir/perl You can set this in your C<CPAN.pm> configuration so modules automatically install in your private library directory when you use the CPAN.pm shell: - % cpan - cpan> o conf makepl_arg INSTALL_BASE=/mydir/perl - cpan> o conf commit + % cpan + cpan> o conf makepl_arg INSTALL_BASE=/mydir/perl + cpan> o conf commit For C<Build.PL>-based distributions, use the --install_base option: - perl Build.PL --install_base /mydir/perl + perl Build.PL --install_base /mydir/perl You can configure C<CPAN.pm> to automatically use this option too: - % cpan - cpan> o conf mbuild_arg "--install_base /mydir/perl" - cpan> o conf commit + % cpan + cpan> o conf mbuild_arg "--install_base /mydir/perl" + cpan> o conf commit INSTALL_BASE tells these tools to put your modules into F</mydir/perl/lib/perl5>. See L<How do I add a directory to my @@ -1303,11 +1303,11 @@ installed modules. There is one caveat with INSTALL_BASE, though, since it acts differently from the PREFIX and LIB settings that older versions of -C<ExtUtils::MakeMaker> advocated. INSTALL_BASE does not support +L<ExtUtils::MakeMaker> advocated. INSTALL_BASE does not support installing modules for multiple versions of Perl or different architectures under the same directory. You should consider whether you really want that and, if you do, use the older PREFIX and LIB -settings. See the C<ExtUtils::Makemaker> documentation for more details. +settings. See the L<ExtUtils::Makemaker> documentation for more details. =head2 How do I add the directory my program lives in to the module/library search path? @@ -1317,46 +1317,46 @@ If you know the directory already, you can add it to C<@INC> as you would for any other directory. You might <use lib> if you know the directory at compile time: - use lib $directory; + use lib $directory; The trick in this task is to find the directory. Before your script does anything else (such as a C<chdir>), you can get the current working directory with the C<Cwd> module, which comes with Perl: - BEGIN { - use Cwd; - our $directory = cwd; - } + BEGIN { + use Cwd; + our $directory = cwd; + } - use lib $directory; + use lib $directory; You can do a similar thing with the value of C<$0>, which holds the script name. That might hold a relative path, but C<rel2abs> can turn it into an absolute path. Once you have the - BEGIN { - use File::Spec::Functions qw(rel2abs); - use File::Basename qw(dirname); + BEGIN { + use File::Spec::Functions qw(rel2abs); + use File::Basename qw(dirname); - my $path = rel2abs( $0 ); - our $directory = dirname( $path ); - } + my $path = rel2abs( $0 ); + our $directory = dirname( $path ); + } - use lib $directory; + use lib $directory; -The C<FindBin> module, which comes with Perl, might work. It finds the +The L<FindBin> module, which comes with Perl, might work. It finds the directory of the currently running script and puts it in C<$Bin>, which you can then use to construct the right library path: - use FindBin qw($Bin); + use FindBin qw($Bin); -You can also use C<local::lib> to do much of the same thing. Install -modules using C<local::lib>'s settings then use the module in your +You can also use L<local::lib> to do much of the same thing. Install +modules using L<local::lib>'s settings then use the module in your program: - use local::lib; # sets up a local lib at ~/perl5 + use local::lib; # sets up a local lib at ~/perl5 -See the C<local::lib> documentation for more details. +See the L<local::lib> documentation for more details. =head2 How do I add a directory to my include path (@INC) at runtime? @@ -1367,27 +1367,27 @@ environment variables, run-time switches, and in-code statements: =item the C<PERLLIB> environment variable - $ export PERLLIB=/path/to/my/dir - $ perl program.pl + $ export PERLLIB=/path/to/my/dir + $ perl program.pl =item the C<PERL5LIB> environment variable - $ export PERL5LIB=/path/to/my/dir - $ perl program.pl + $ export PERL5LIB=/path/to/my/dir + $ perl program.pl =item the C<perl -Idir> command line flag - $ perl -I/path/to/my/dir program.pl + $ perl -I/path/to/my/dir program.pl =item the C<lib> pragma: - use lib "$ENV{HOME}/myown_perllib"; + use lib "$ENV{HOME}/myown_perllib"; -=item the C<local::lib> module: +=item the L<local::lib> module: - use local::lib; + use local::lib; - use local::lib "~/myown_perllib"; + use local::lib "~/myown_perllib"; =back @@ -1398,7 +1398,7 @@ included with the 5.002 release of Perl. =head2 What is socket.ph and where do I get it? It's a Perl 4 style file defining values for system networking -constants. Sometimes it is built using C<h2ph> when Perl is installed, +constants. Sometimes it is built using L<h2ph> when Perl is installed, but other times it is not. Modern programs C<use Socket;> instead. =head1 AUTHOR AND COPYRIGHT diff --git a/cpan/perlfaq/lib/perlfaq9.pod b/cpan/perlfaq/lib/perlfaq9.pod index 8ab955243a..81e7613fc8 100644 --- a/cpan/perlfaq/lib/perlfaq9.pod +++ b/cpan/perlfaq/lib/perlfaq9.pod @@ -48,37 +48,36 @@ systems. C<CGI.pm> selects an appropriate newline representation (contributed by brian d foy) There are many things that might be wrong with your CGI program, and only -some of them might be related to Perl. Try going through the troubleshooting -guide on Perlmonks: - - http://www.perlmonks.org/?node_id=380424 +some of them might be related to Perl. Try going through the L<troubleshooting +guide|http://www.perlmonks.org/?node_id=380424> on +L<Perlmonks|http://www.perlmonks.org> =head2 How can I get better error messages from a CGI program? -Use the C<CGI::Carp> module. It replaces C<warn> and C<die>, plus the -normal C<Carp> module's C<carp>, C<croak>, and C<confess> functions with +Use the L<CGI::Carp> module. It replaces C<warn> and C<die>, plus the +normal L<Carp> module's C<carp>, C<croak>, and C<confess> functions with more verbose and safer versions. It still sends them to the normal server error log. - use CGI::Carp; - warn "This is a complaint"; - die "But this one is serious"; + use CGI::Carp; + warn "This is a complaint"; + die "But this one is serious"; -The following use of C<CGI::Carp> also redirects errors to a file of your choice, +The following use of L<CGI::Carp> also redirects errors to a file of your choice, placed in a C<BEGIN> block to catch compile-time warnings as well: - BEGIN { - use CGI::Carp qw(carpout); - open(LOG, ">>/var/local/cgi-logs/mycgi-log") - or die "Unable to append to mycgi-log: $!\n"; - carpout(*LOG); - } + BEGIN { + use CGI::Carp qw(carpout); + open(LOG, ">>/var/local/cgi-logs/mycgi-log") + or die "Unable to append to mycgi-log: $!\n"; + carpout(*LOG); + } You can even arrange for fatal errors to go back to the client browser, which is nice for your own debugging, but might confuse the end user. - use CGI::Carp qw(fatalsToBrowser); - die "Bad error here"; + use CGI::Carp qw(fatalsToBrowser); + die "Bad error here"; Even if the error happens before you get the HTTP header out, the module will try to take care of this to avoid the dreaded server 500 errors. @@ -88,9 +87,9 @@ stamp prepended. =head2 How do I remove HTML from a string? -The most correct way (albeit not the fastest) is to use C<HTML::Parser> +The most correct way (albeit not the fastest) is to use L<HTML::Parser> from CPAN. Another mostly correct -way is to use C<HTML::FormatText> which not only removes HTML but also +way is to use L<HTML::FormatText> which not only removes HTML but also attempts to do a little simple formatting of the resulting plain text. Many folks attempt a simple-minded regular expression approach, like @@ -101,8 +100,8 @@ entities--like C<<> for example. Here's one "simple-minded" approach, that works for most files: - #!/usr/bin/perl -p0777 - s/<(?:[^>'"]*|(['"]).*?\g1)*>//gs + #!/usr/bin/perl -p0777 + s/<(?:[^>'"]*|(['"]).*?\g1)*>//gs If you want a more complete solution, see the 3-stage striphtml program in @@ -112,37 +111,37 @@ http://www.cpan.org/authors/Tom_Christiansen/scripts/striphtml.gz Here are some tricky cases that you should think about when picking a solution: - <IMG SRC = "foo.gif" ALT = "A > B"> + <IMG SRC = "foo.gif" ALT = "A > B"> - <IMG SRC = "foo.gif" - ALT = "A > B"> + <IMG SRC = "foo.gif" + ALT = "A > B"> - <!-- <A comment> --> + <!-- <A comment> --> - <script>if (a<b && a>c)</script> + <script>if (a<b && a>c)</script> - <# Just data #> + <# Just data #> - <![INCLUDE CDATA [ >>>>>>>>>>>> ]]> + <![INCLUDE CDATA [ >>>>>>>>>>>> ]]> If HTML comments include other tags, those solutions would also break on text like this: - <!-- This section commented out. - <B>You can't see me!</B> - --> + <!-- This section commented out. + <B>You can't see me!</B> + --> =head2 How do I extract URLs? You can easily extract all sorts of URLs from HTML with -C<HTML::SimpleLinkExtor> which handles anchors, images, objects, +L<HTML::SimpleLinkExtor> which handles anchors, images, objects, frames, and many other tags that can contain a URL. If you need anything more complex, you can create your own subclass of -C<HTML::LinkExtor> or C<HTML::Parser>. You might even use -C<HTML::SimpleLinkExtor> as an example for something specifically +L<HTML::LinkExtor> or L<HTML::Parser>. You might even use +L<HTML::SimpleLinkExtor> as an example for something specifically suited to your needs. -You can use C<URI::Find> to extract URLs from an arbitrary text document. +You can use L<URI::Find> to extract URLs from an arbitrary text document. Less complete solutions involving regular expressions can save you a lot of processing time if you know that the input is simple. One @@ -150,13 +149,13 @@ solution from Tom Christiansen runs 100 times faster than most module-based approaches but only extracts URLs from anchors where the first attribute is HREF and there are no other attributes. - #!/usr/bin/perl -n00 - # qxurl - tchrist@perl.com - print "$2\n" while m{ - < \s* - A \s+ HREF \s* = \s* (["']) (.*?) \g1 - \s* > - }gsix; + #!/usr/bin/perl -n00 + # qxurl - tchrist@perl.com + print "$2\n" while m{ + < \s* + A \s+ HREF \s* = \s* (["']) (.*?) \g1 + \s* > + }gsix; =head2 How do I download a file from the user's machine? How do I open a file on another machine? @@ -180,69 +179,69 @@ The C<CGI.pm> module (which comes with Perl) has functions to create the HTML form widgets. See the C<CGI.pm> documentation for more examples. - use CGI qw/:standard/; - print header, - start_html('Favorite Animals'), + use CGI qw/:standard/; + print header, + start_html('Favorite Animals'), - start_form, - "What's your favorite animal? ", - popup_menu( - -name => 'animal', - -values => [ qw( Llama Alpaca Camel Ram ) ] - ), - submit, + start_form, + "What's your favorite animal? ", + popup_menu( + -name => 'animal', + -values => [ qw( Llama Alpaca Camel Ram ) ] + ), + submit, - end_form, - end_html; + end_form, + end_html; =head2 How do I fetch an HTML file? (contributed by brian d foy) -Use the libwww-perl distribution. The C<LWP::Simple> module can fetch web +Use the libwww-perl distribution. The L<LWP::Simple> module can fetch web resources and give their content back to you as a string: - use LWP::Simple qw(get); + use LWP::Simple qw(get); - my $html = get( "http://www.example.com/index.html" ); + my $html = get( "http://www.example.com/index.html" ); It can also store the resource directly in a file: - use LWP::Simple qw(getstore); + use LWP::Simple qw(getstore); - getstore( "http://www.example.com/index.html", "foo.html" ); + getstore( "http://www.example.com/index.html", "foo.html" ); If you need to do something more complicated, you can use -C<LWP::UserAgent> module to create your own user-agent (e.g. browser) +L<LWP::UserAgent> module to create your own user-agent (e.g. browser) to get the job done. If you want to simulate an interactive web -browser, you can use the C<WWW::Mechanize> module. +browser, you can use the L<WWW::Mechanize> module. =head2 How do I automate an HTML form submission? If you are doing something complex, such as moving through many pages -and forms or a web site, you can use C<WWW::Mechanize>. See its +and forms or a web site, you can use L<WWW::Mechanize>. See its documentation for all the details. If you're submitting values using the GET method, create a URL and encode the form using the C<query_form> method: - use LWP::Simple; - use URI::URL; + use LWP::Simple; + use URI::URL; - my $url = url('http://www.perl.com/cgi-bin/cpan_mod'); - $url->query_form(module => 'DB_File', readme => 1); - $content = get($url); + my $url = url('http://www.perl.com/cgi-bin/cpan_mod'); + $url->query_form(module => 'DB_File', readme => 1); + $content = get($url); If you're using the POST method, create your own user agent and encode the content appropriately. - use HTTP::Request::Common qw(POST); - use LWP::UserAgent; + use HTTP::Request::Common qw(POST); + use LWP::UserAgent; - $ua = LWP::UserAgent->new(); - my $req = POST 'http://www.perl.com/cgi-bin/cpan_mod', - [ module => 'DB_File', readme => 1 ]; - $content = $ua->request($req)->as_string; + $ua = LWP::UserAgent->new(); + my $req = POST 'http://www.perl.com/cgi-bin/cpan_mod', + [ module => 'DB_File', readme => 1 ]; + $content = $ua->request($req)->as_string; =head2 How do I decode or create those %-encodings on the web? X<URI> X<CGI.pm> X<CGI> X<URI::Escape> X<RFC 2396> @@ -261,30 +260,30 @@ either on the way in or the way out. If you have to encode a string yourself, remember that you should never try to encode an already-composed URI. You need to escape the components separately then put them together. To encode a string, you -can use the C<URI::Escape> module. The C<uri_escape> function +can use the L<URI::Escape> module. The C<uri_escape> function returns the escaped string: - my $original = "Colon : Hash # Percent %"; + my $original = "Colon : Hash # Percent %"; - my $escaped = uri_escape( $original ); + my $escaped = uri_escape( $original ); - print "$escaped\n"; # 'Colon%20%3A%20Hash%20%23%20Percent%20%25' + print "$escaped\n"; # 'Colon%20%3A%20Hash%20%23%20Percent%20%25' To decode the string, use the C<uri_unescape> function: - my $unescaped = uri_unescape( $escaped ); + my $unescaped = uri_unescape( $escaped ); - print $unescaped; # back to original + print $unescaped; # back to original If you wanted to do it yourself, you simply need to replace the reserved characters with their encodings. A global substitution is one way to do it: - # encode - $string =~ s/([^^A-Za-z0-9\-_.!~*'()])/ sprintf "%%%0x", ord $1 /eg; + # encode + $string =~ s/([^^A-Za-z0-9\-_.!~*'()])/ sprintf "%%%0x", ord $1 /eg; - #decode - $string =~ s/%([A-Fa-f\d]{2})/chr hex $1/eg; + #decode + $string =~ s/%([A-Fa-f\d]{2})/chr hex $1/eg; =head2 How do I redirect to another page? @@ -298,23 +297,23 @@ allow relative URLs in either case. Use of C<CGI.pm> is strongly recommended. This example shows redirection with a complete URL. This redirection is handled by the web browser. - use CGI qw/:standard/; + use CGI qw/:standard/; - my $url = 'http://www.cpan.org/'; - print redirect($url); + my $url = 'http://www.cpan.org/'; + print redirect($url); This example shows a redirection with an absolute URLpath. This redirection is handled by the local web server. - my $url = '/CPAN/index.html'; - print redirect($url); + my $url = '/CPAN/index.html'; + print redirect($url); But if coded directly, it could be as follows (the final "\n" is shown separately, for clarity), using either a complete URL or an absolute URLpath. - print "Location: $url\n"; # CGI response header - print "\n"; # end of headers + print "Location: $url\n"; # CGI response header + print "\n"; # end of headers =head2 How do I put a password on my web pages? @@ -326,16 +325,16 @@ the details for your particular server. =head2 How do I edit my .htpasswd and .htgroup files with Perl? -The C<HTTPD::UserAdmin> and C<HTTPD::GroupAdmin> modules provide a +The L<HTTPD::UserAdmin> and L<HTTPD::GroupAdmin> modules provide a consistent OO interface to these files, regardless of how they're stored. Databases may be text, dbm, Berkeley DB or any database with -a DBI compatible driver. C<HTTPD::UserAdmin> supports files used by the +a DBI compatible driver. L<HTTPD::UserAdmin> supports files used by the "Basic" and "Digest" authentication schemes. Here's an example: - use HTTPD::UserAdmin (); - HTTPD::UserAdmin - ->new(DB => "/foo/.htpasswd") - ->add($username => $password); + use HTTPD::UserAdmin (); + HTTPD::UserAdmin + ->new(DB => "/foo/.htpasswd") + ->add($username => $password); =head2 How do I make sure users can't enter values into a form that cause my CGI script to do bad things? @@ -344,12 +343,12 @@ a DBI compatible driver. C<HTTPD::UserAdmin> supports files used by the You can't prevent people from sending your script bad data. Even if you add some client-side checks, people may disable them or bypass them completely. For instance, someone might use a module such as -C<LWP> to access your CGI program. If you want to prevent data that +L<LWP> to access your CGI program. If you want to prevent data that try to use SQL injection or other sorts of attacks (and you should want to), you have to not trust any data that enter your program. The L<perlsec> documentation has general advice about data security. -If you are using the C<DBI> module, use placeholder to fill in data. +If you are using the L<DBI> module, use placeholder to fill in data. If you are running external programs with C<system> or C<exec>, use the list forms. There are many other precautions that you should take, too many to list here, and most of them fall under the category of not @@ -360,14 +359,14 @@ using any data that you don't intend to use. Trust no one. For a quick-and-dirty solution, try this solution derived from L<perlfunc/split>: - $/ = ''; - $header = <MSG>; - $header =~ s/\n\s+/ /g; # merge continuation lines - %head = ( UNIX_FROM_LINE, split /^([-\w]+):\s*/m, $header ); + $/ = ''; + $header = <MSG>; + $header =~ s/\n\s+/ /g; # merge continuation lines + %head = ( UNIX_FROM_LINE, split /^([-\w]+):\s*/m, $header ); That solution doesn't do well if, for example, you're trying to maintain all the Received lines. A more complete approach is to use -the C<Mail::Header> module from CPAN (part of the C<MailTools> package). +the L<Mail::Header> module from CPAN (part of the L<MailTools> package). =head2 How do I decode a CGI form? @@ -384,23 +383,23 @@ It doesn't get much easier: the C<CGI.pm> module automatically parses the input and makes each value available through the C<param()> function. - use CGI qw(:standard); + use CGI qw(:standard); - my $total = param( 'price' ) + param( 'shipping' ); + my $total = param( 'price' ) + param( 'shipping' ); - my @items = param( 'item' ); # multiple values, same field name + my @items = param( 'item' ); # multiple values, same field name If you want an object-oriented approach, C<CGI.pm> can do that too. - use CGI; + use CGI; - my $cgi = CGI->new(); + my $cgi = CGI->new(); - my $total = $cgi->param( 'price' ) + $cgi->param( 'shipping' ); + my $total = $cgi->param( 'price' ) + $cgi->param( 'shipping' ); - my @items = $cgi->param( 'item' ); + my @items = $cgi->param( 'item' ); -You might also try C<CGI::Minimal> which is a lightweight version +You might also try L<CGI::Minimal> which is a lightweight version of the same thing. Other CGI::* modules on CPAN might work better for you, too. @@ -420,7 +419,7 @@ b) How do I verify that an email address targets a valid recipient? Without sending mail to the address and seeing whether there's a human on the other end to answer you, you cannot fully answer part I<b>, but -either the C<Email::Valid> or the C<RFC::RFC822::Address> module will do +either the L<Email::Valid> or the C<RFC::RFC822::Address> module will do both part I<a> and part I<b> as far as you can in real-time. If you want to just check part I<a> to see that the address is valid @@ -432,14 +431,14 @@ following will match valid RFC-2822 addresses that do not have comments, folding whitespace, or any other obsolete or non-essential elements. This I<just> matches the address itself: - my $atom = qr{[a-zA-Z0-9_!#\$\%&'*+/=?\^`{}~|\-]+}; - my $dot_atom = qr{$atom(?:\.$atom)*}; - my $quoted = qr{"(?:\\[^\r\n]|[^\\"])*"}; - my $local = qr{(?:$dot_atom|$quoted)}; - my $quotedpair = qr{\\[\x00-\x09\x0B-\x0c\x0e-\x7e]}; - my $domain_lit = qr{\[(?:$quotedpair|[\x21-\x5a\x5e-\x7e])*\]}; - my $domain = qr{(?:$dot_atom|$domain_lit)}; - my $addr_spec = qr{$local\@$domain}; + my $atom = qr{[a-zA-Z0-9_!#\$\%&'*+/=?\^`{}~|\-]+}; + my $dot_atom = qr{$atom(?:\.$atom)*}; + my $quoted = qr{"(?:\\[^\r\n]|[^\\"])*"}; + my $local = qr{(?:$dot_atom|$quoted)}; + my $quotedpair = qr{\\[\x00-\x09\x0B-\x0c\x0e-\x7e]}; + my $domain_lit = qr{\[(?:$quotedpair|[\x21-\x5a\x5e-\x7e])*\]}; + my $domain = qr{(?:$dot_atom|$domain_lit)}; + my $addr_spec = qr{$local\@$domain}; Just match an address against C</^${addr_spec}$/> to see if it follows the RFC2822 specification. However, because it is impossible to be @@ -464,13 +463,13 @@ with the characters reversed, one added or subtracted to each digit, etc. =head2 How do I decode a MIME/BASE64 string? -The C<MIME-Base64> package (available from CPAN) handles this as well as +The L<MIME::Base64> package (available from CPAN) handles this as well as the MIME/QP encoding. Decoding BASE64 becomes as simple as: - use MIME::Base64; - $decoded = decode_base64($encoded); + use MIME::Base64; + $decoded = decode_base64($encoded); -The C<MIME-Tools> package (available from CPAN) supports extraction with +The L<MIME::Tools> package (available from CPAN) supports extraction with decoding of BASE64 encoded attachments and content directly from email messages. @@ -478,26 +477,26 @@ If the string to decode is short (less than 84 bytes long) a more direct approach is to use the C<unpack()> function's "u" format after minor transliterations: - tr#A-Za-z0-9+/##cd; # remove non-base64 chars - tr#A-Za-z0-9+/# -_#; # convert to uuencoded format - $len = pack("c", 32 + 0.75*length); # compute length byte - print unpack("u", $len . $_); # uudecode and print + tr#A-Za-z0-9+/##cd; # remove non-base64 chars + tr#A-Za-z0-9+/# -_#; # convert to uuencoded format + $len = pack("c", 32 + 0.75*length); # compute length byte + print unpack("u", $len . $_); # uudecode and print =head2 How do I return the user's mail address? On systems that support getpwuid, the C<< $< >> variable, and the -C<Sys::Hostname> module (which is part of the standard perl distribution), +L<Sys::Hostname> module (which is part of the standard perl distribution), you can probably try using something like this: - use Sys::Hostname; - $address = sprintf('%s@%s', scalar getpwuid($<), hostname); + use Sys::Hostname; + $address = sprintf('%s@%s', scalar getpwuid($<), hostname); Company policies on mail address can mean that this generates addresses that the company's mail system will not accept, so you should ask for users' mail addresses when this matters. Furthermore, not all systems on which Perl runs are so forthcoming with this information as is Unix. -The C<Mail::Util> module from CPAN (part of the C<MailTools> package) provides a +The L<Mail::Util> module from CPAN (part of the L<MailTools> package) provides a C<mailaddress()> function that tries to guess the mail address of the user. It makes a more intelligent guess than the code above, using information given when the module was installed, but it could still be incorrect. @@ -507,17 +506,17 @@ Again, the best way is often just to ask the user. Use the C<sendmail> program directly: - open(SENDMAIL, "|/usr/lib/sendmail -oi -t -odq") - or die "Can't fork for sendmail: $!\n"; - print SENDMAIL <<"EOF"; - From: User Originating Mail <me\@host> - To: Final Destination <you\@otherhost> - Subject: A relevant subject line + open(SENDMAIL, "|/usr/lib/sendmail -oi -t -odq") + or die "Can't fork for sendmail: $!\n"; + print SENDMAIL <<"EOF"; + From: User Originating Mail <me\@host> + To: Final Destination <you\@otherhost> + Subject: A relevant subject line - Body of the message goes here after the blank line - in as many lines as you like. - EOF - close(SENDMAIL) or warn "sendmail didn't close nicely"; + Body of the message goes here after the blank line + in as many lines as you like. + EOF + close(SENDMAIL) or warn "sendmail didn't close nicely"; The B<-oi> option prevents C<sendmail> from interpreting a line consisting of a single dot as "end of message". The B<-t> option says to use the @@ -531,89 +530,89 @@ called C<mailx>) directly or simply opening up port 25 have having an intimate conversation between just you and the remote SMTP daemon, probably C<sendmail>. -Or you might be able use the CPAN module C<Mail::Mailer>: +Or you might be able use the CPAN module L<Mail::Mailer>: - use Mail::Mailer; + use Mail::Mailer; - $mailer = Mail::Mailer->new(); - $mailer->open({ From => $from_address, - To => $to_address, - Subject => $subject, - }) - or die "Can't open: $!\n"; - print $mailer $body; - $mailer->close(); + $mailer = Mail::Mailer->new(); + $mailer->open({ From => $from_address, + To => $to_address, + Subject => $subject, + }) + or die "Can't open: $!\n"; + print $mailer $body; + $mailer->close(); -The C<Mail::Internet> module uses C<Net::SMTP> which is less Unix-centric than -C<Mail::Mailer>, but less reliable. Avoid raw SMTP commands. There +The L<Mail::Internet> module uses L<Net::SMTP> which is less Unix-centric than +L<Mail::Mailer>, but less reliable. Avoid raw SMTP commands. There are many reasons to use a mail transport agent like C<sendmail>. These include queuing, MX records, and security. =head2 How do I use MIME to make an attachment to a mail message? -This answer is extracted directly from the C<MIME::Lite> documentation. +This answer is extracted directly from the L<MIME::Lite> documentation. Create a multipart message (i.e., one with attachments). - use MIME::Lite; + use MIME::Lite; - ### Create a new multipart message: - $msg = MIME::Lite->new( - From =>'me@myhost.com', - To =>'you@yourhost.com', - Cc =>'some@other.com, some@more.com', - Subject =>'A message with 2 parts...', - Type =>'multipart/mixed' - ); + ### Create a new multipart message: + $msg = MIME::Lite->new( + From =>'me@myhost.com', + To =>'you@yourhost.com', + Cc =>'some@other.com, some@more.com', + Subject =>'A message with 2 parts...', + Type =>'multipart/mixed' + ); - ### Add parts (each "attach" has same arguments as "new"): - $msg->attach(Type =>'TEXT', - Data =>"Here's the GIF file you wanted" - ); - $msg->attach(Type =>'image/gif', - Path =>'aaa000123.gif', - Filename =>'logo.gif' - ); + ### Add parts (each "attach" has same arguments as "new"): + $msg->attach(Type =>'TEXT', + Data =>"Here's the GIF file you wanted" + ); + $msg->attach(Type =>'image/gif', + Path =>'aaa000123.gif', + Filename =>'logo.gif' + ); - $text = $msg->as_string; + $text = $msg->as_string; -C<MIME::Lite> also includes a method for sending these things. +L<MIME::Lite> also includes a method for sending these things. - $msg->send; + $msg->send; This defaults to using L<sendmail(1)> but can be customized to use SMTP via L<Net::SMTP>. =head2 How do I read mail? -While you could use the C<Mail::Folder> module from CPAN (part of the -C<MailFolder> package) or the C<Mail::Internet> module from CPAN (part -of the C<MailTools> package), often a module is overkill. Here's a +While you could use the L<Mail::Folder> module from CPAN (part of the +L<MailFolder> package) or the L<Mail::Internet> module from CPAN (part +of the L<MailTools> package), often a module is overkill. Here's a mail sorter. - #!/usr/bin/perl - - my(@msgs, @sub); - my $msgno = -1; - $/ = ''; # paragraph reads - while (<>) { - if (/^From /m) { - /^Subject:\s*(?:Re:\s*)*(.*)/mi; - $sub[++$msgno] = lc($1) || ''; - } - $msgs[$msgno] .= $_; - } - for my $i (sort { $sub[$a] cmp $sub[$b] || $a <=> $b } (0 .. $#msgs)) { - print $msgs[$i]; - } + #!/usr/bin/perl + + my(@msgs, @sub); + my $msgno = -1; + $/ = ''; # paragraph reads + while (<>) { + if (/^From /m) { + /^Subject:\s*(?:Re:\s*)*(.*)/mi; + $sub[++$msgno] = lc($1) || ''; + } + $msgs[$msgno] .= $_; + } + for my $i (sort { $sub[$a] cmp $sub[$b] || $a <=> $b } (0 .. $#msgs)) { + print $msgs[$i]; + } Or more succinctly, - #!/usr/bin/perl -n00 - # bysub2 - awkish sort-by-subject - BEGIN { $msgno = -1 } - $sub[++$msgno] = (/^Subject:\s*(?:Re:\s*)*(.*)/mi)[0] if /^From/m; - $msg[$msgno] .= $_; - END { print @msg[ sort { $sub[$a] cmp $sub[$b] || $a <=> $b } (0 .. $#msg) ] } + #!/usr/bin/perl -n00 + # bysub2 - awkish sort-by-subject + BEGIN { $msgno = -1 } + $sub[++$msgno] = (/^Subject:\s*(?:Re:\s*)*(.*)/mi)[0] if /^From/m; + $msg[$msgno] .= $_; + END { print @msg[ sort { $sub[$a] cmp $sub[$b] || $a <=> $b } (0 .. $#msg) ] } =head2 How do I find out my hostname, domainname, or IP address? X<hostname, domainname, IP address, host, domain, hostfqdn, inet_ntoa, @@ -621,54 +620,54 @@ gethostbyname, Socket, Net::Domain, Sys::Hostname> (contributed by brian d foy) -The C<Net::Domain> module, which is part of the Standard Library starting +The L<Net::Domain> module, which is part of the Standard Library starting in Perl 5.7.3, can get you the fully qualified domain name (FQDN), the host name, or the domain name. - use Net::Domain qw(hostname hostfqdn hostdomain); + use Net::Domain qw(hostname hostfqdn hostdomain); - my $host = hostfqdn(); + my $host = hostfqdn(); -The C<Sys::Hostname> module, part of the Standard Library, can also get the +The L<Sys::Hostname> module, part of the Standard Library, can also get the hostname: - use Sys::Hostname; + use Sys::Hostname; - $host = hostname(); + $host = hostname(); To get the IP address, you can use the C<gethostbyname> built-in function to turn the name into a number. To turn that number into the dotted octet form (a.b.c.d) that most people expect, use the C<inet_ntoa> function -from the C<Socket> module, which also comes with perl. +from the L<Socket> module, which also comes with perl. - use Socket; + use Socket; - my $address = inet_ntoa( - scalar gethostbyname( $host || 'localhost' ) - ); + my $address = inet_ntoa( + scalar gethostbyname( $host || 'localhost' ) + ); =head2 How do I fetch a news article or the active newsgroups? -Use the C<Net::NNTP> or C<News::NNTPClient> modules, both available from CPAN. +Use the L<Net::NNTP> or L<News::NNTPClient> modules, both available from CPAN. This can make tasks like fetching the newsgroup list as simple as - perl -MNews::NNTPClient - -e 'print News::NNTPClient->new->list("newsgroups")' + perl -MNews::NNTPClient + -e 'print News::NNTPClient->new->list("newsgroups")' =head2 How do I fetch/put an FTP file? (contributed by brian d foy) -The C<LWP> family of modules (available on CPAN as the libwww-perl distribution) -can work with FTP just like it can with many other protocols. C<LWP::Simple> +The L<LWP> family of modules (available on CPAN as the libwww-perl distribution) +can work with FTP just like it can with many other protocols. L<LWP::Simple> makes it quite easy to fetch a file: - use LWP::Simple; + use LWP::Simple; - my $data = get( 'ftp://some.ftp.site/some/file.txt' ); + my $data = get( 'ftp://some.ftp.site/some/file.txt' ); If you want more direct or low-level control of the FTP process, you can use -the C<Net::FTP> module (in the Standard Library since Perl 5.8). It's +the L<Net::FTP> module (in the Standard Library since Perl 5.8). It's documentation has examples showing you just how to do that. =head2 How can I do RPC in Perl? |