author | Gurusamy Sarathy <gsar@cpan.org> | 1999-01-17 09:02:07 +0000 |
---|---|---|
committer | Gurusamy Sarathy <gsar@cpan.org> | 1999-01-17 09:02:07 +0000 |
commit | be59e445a231e0102a0fd9822727ddbe3e12d0bb (patch) | |
tree | 4c547e81e163e9470a1090d056d83c6a0083a308 /pod | |
parent | f244e06d4740a118d980f79807cb4f393cc3087b (diff) | |
parent | f828431348b2bbf6fe06182e862634247523af66 (diff) | |
integrate cfgperl changes into mainline, fix conflicts
p4raw-id: //depot/perl@2620
Diffstat (limited to 'pod')
-rw-r--r-- | pod/perl.pod | 1 |
-rw-r--r-- | pod/perl5005delta.pod | 47 |
-rw-r--r-- | pod/perlcall.pod | 8 |
-rw-r--r-- | pod/perldata.pod | 3 |
-rw-r--r-- | pod/perldelta.pod | 86 |
-rw-r--r-- | pod/perldiag.pod | 13 |
-rw-r--r-- | pod/perldsc.pod | 2 |
-rw-r--r-- | pod/perlfaq.pod | 651 |
-rw-r--r-- | pod/perlfaq1.pod | 183 |
-rw-r--r-- | pod/perlfaq2.pod | 198 |
-rw-r--r-- | pod/perlfaq3.pod | 100 |
-rw-r--r-- | pod/perlfaq4.pod | 449 |
-rw-r--r-- | pod/perlfaq5.pod | 173 |
-rw-r--r-- | pod/perlfaq6.pod | 127 |
-rw-r--r-- | pod/perlfaq7.pod | 42 |
-rw-r--r-- | pod/perlfaq8.pod | 62 |
-rw-r--r-- | pod/perlfaq9.pod | 24 |
-rw-r--r-- | pod/perlfunc.pod | 150 |
-rw-r--r-- | pod/perlguts.pod | 25 |
-rw-r--r-- | pod/perllocale.pod | 6 |
-rw-r--r-- | pod/perlmodlib.pod | 2 |
-rw-r--r-- | pod/perlobj.pod | 2 |
-rw-r--r-- | pod/perlop.pod | 43 |
-rw-r--r-- | pod/perlopentut.pod | 862 |
-rw-r--r-- | pod/perlport.pod | 121 |
-rw-r--r-- | pod/perlsub.pod | 2 |
-rw-r--r-- | pod/perlvar.pod | 8 |
-rw-r--r-- | pod/perlxstut.pod | 2 |
-rw-r--r-- | pod/pod2html.PL | 2 |
-rw-r--r-- | pod/pod2man.PL | 6 |
-rw-r--r-- | pod/roffitall | 1 |
31 files changed, 2762 insertions, 639 deletions
diff --git a/pod/perl.pod b/pod/perl.pod index 1b886d01e0..ae81c7d3d6 100644 --- a/pod/perl.pod +++ b/pod/perl.pod @@ -33,6 +33,7 @@ of sections: perlfunc Perl builtin functions perlvar Perl predefined variables perlsub Perl subroutines + perlopentut Perl opening things tutorial perlmod Perl modules: how they work perlmodlib Perl modules: how to write and use perlmodinstall Perl modules: how to install from CPAN diff --git a/pod/perl5005delta.pod b/pod/perl5005delta.pod index 205c6af78b..1c69bdd246 100644 --- a/pod/perl5005delta.pod +++ b/pod/perl5005delta.pod @@ -85,7 +85,7 @@ begin with C<perl> be referenced with a C<Perl_> prefix. The bare function names without the C<Perl_> prefix are supported with macros, but this support may cease in a future release. -See L<perlguts/API LISTING>. +See L<perlguts/"API LISTING">. =item Enabling threads has source compatibility issues @@ -100,7 +100,7 @@ directly accessing perl globals as C<GvSV(errgv)>. The API call is backward compatible with existing perls and provides source compatibility with threading is enabled. -See L<API Changes for more information>. +See L<"C Source Compatibility"> for more information. =back @@ -502,7 +502,9 @@ DOS is now supported under the DJGPP tools. See L<README.dos>. MPE/iX is now supported. See L<README.mpeix>. -MVS (OS390) is now supported. See L<README.os390>. +MVS (aka OS390, aka Open Edition) is now supported. See L<README.os390>. + +Stratus VOS is now supported. See L<README.vos>. =head2 Changes in existing support @@ -587,10 +589,33 @@ Various pragmata to control behavior of regular expressions. =over +=item Benchmark + +You can now run tests for I<x> seconds instead of guessing the right +number of tests to run. + =item CGI CGI has been updated to version 2.42. 
+=item Fcntl + +More Fcntl constants added: F_SETLK64, F_SETLKW64, O_LARGEFILE for +large (more than 4G) file access (the 64-bit support is not yet +working, though, so no need to get overly excited), Free/Net/OpenBSD +locking behaviour flags F_FLOCK, F_POSIX, Linux F_SHLCK, and +O_ACCMODE: the mask of O_RDONLY, O_WRONLY, and O_RDWR. + +=item Math::Complex + +The accessors methods Re, Im, arg, abs, rho, theta, methods can +($z->Re()) now also act as mutators ($z->Re(3)). + +=item Math::Trig + +A little bit of radial trigonometry (cylindrical and spherical) added, +for example the great circle distance. + =item POSIX POSIX now has its own platform-specific hints files. @@ -860,7 +885,7 @@ not use those settings. This was not dead serious, fortunately: there is a "default locale" called "C" that Perl can and will use, the script will be run. Before you really fix the problem, however, you will get the same error message each time you run Perl. How to really -fix the problem can be found in L<perllocale> section B<LOCALE PROBLEMS>. +fix the problem can be found in L<perllocale/"LOCALE PROBLEMS">. =back @@ -874,16 +899,30 @@ fix the problem can be found in L<perllocale> section B<LOCALE PROBLEMS>. (F) The mktemp() routine failed for some reason while trying to process a B<-e> switch. Maybe your /tmp partition is full, or clobbered. +Removed because B<-e> doesn't use temporary files any more. + =item Can't write to temp file for B<-e>: %s (F) The write routine failed for some reason while trying to process a B<-e> switch. Maybe your /tmp partition is full, or clobbered. +Removed because B<-e> doesn't use temporary files any more. + =item Cannot open temporary file (F) The create routine failed for some reason while trying to process a B<-e> switch. Maybe your /tmp partition is full, or clobbered. +Removed because B<-e> doesn't use temporary files any more. 
+ +=item regexp too big + +(F) The current implementation of regular expressions uses shorts as +address offsets within a string. Unfortunately this means that if +the regular expression compiles to longer than 32767, it'll blow up. +Usually when you want a regular expression this big, there is a better +way to do it with multiple statements. See L<perlre>. + =item regexp too big (F) The current implementation of regular expressions uses shorts as diff --git a/pod/perlcall.pod b/pod/perlcall.pod index e3e02de613..8771be852b 100644 --- a/pod/perlcall.pod +++ b/pod/perlcall.pod @@ -72,7 +72,7 @@ Each of the functions will now be discussed in turn. =over 5 -=item B<perl_call_sv> +=item perl_call_sv I<perl_call_sv> takes two parameters, the first, C<sv>, is an SV*. This allows you to specify the Perl subroutine to be called either as a @@ -80,7 +80,7 @@ C string (which has first been converted to an SV) or a reference to a subroutine. The section, I<Using perl_call_sv>, shows how you can make use of I<perl_call_sv>. -=item B<perl_call_pv> +=item perl_call_pv The function, I<perl_call_pv>, is similar to I<perl_call_sv> except it expects its first parameter to be a C char* which identifies the Perl @@ -88,7 +88,7 @@ subroutine you want to call, e.g., C<perl_call_pv("fred", 0)>. If the subroutine you want to call is in another package, just include the package name in the string, e.g., C<"pkg::fred">. -=item B<perl_call_method> +=item perl_call_method The function I<perl_call_method> is used to call a method from a Perl class. The parameter C<methname> corresponds to the name of the method @@ -99,7 +99,7 @@ object (for a virtual method). See L<perlobj> for more information on static and virtual methods and L<Using perl_call_method> for an example of using I<perl_call_method>. -=item B<perl_call_argv> +=item perl_call_argv I<perl_call_argv> calls the Perl subroutine specified by the C string stored in the C<subname> parameter. 
It also takes the usual C<flags> diff --git a/pod/perldata.pod b/pod/perldata.pod index 8f700f634c..7b9a323338 100644 --- a/pod/perldata.pod +++ b/pod/perldata.pod @@ -245,6 +245,7 @@ integer formats: .23E-10 0xffff # hex 0377 # octal + 0b111000 # binary 4_294_967_296 # underline for legibility String literals are usually delimited by either single or double @@ -253,7 +254,7 @@ literals are subject to backslash and variable substitution; single-quoted strings are not (except for "C<\'>" and "C<\\>"). The usual Unix backslash rules apply for making characters such as newline, tab, etc., as well as some more exotic forms. See -L<perlop/Quote and Quotelike Operators> for a list. +L<perlop/"Quote and Quotelike Operators"> for a list. Octal or hex representations in string literals (e.g. '0xffff') are not automatically converted to their integer representation. The hex() and diff --git a/pod/perldelta.pod b/pod/perldelta.pod index d4fe20f299..bd2b715605 100644 --- a/pod/perldelta.pod +++ b/pod/perldelta.pod @@ -40,14 +40,81 @@ maintenance versions. =head1 Core Changes -Todo. +Binary numbers are now supported as literals, in s?printf formats, and +C<oct()>: + + $answer = 0b101010; + printf "The answer is: %b\n", oct("0b101010"); + +The length argument of C<syswrite()> is now optional. + +Better 64-bit support -- but full support still a distant goal. One +must Configure with -Duse64bits to get Configure to probe for the +extent of 64-bit support. Depending on the platform (hints file) more +or less 64-awareness becomes available. As of 5.005_54 at least +somewhat 64-bit aware platforms are HP-UX 11 or better, Solaris 2.6 or +better, IRIX 6.2 or better. Naturally 64-bit platforms like Digital +UNIX and UNICOS also have 64-bit support. =head1 Supported Platforms -Todo. +VM/ESA is now supported. + +Siemens BS200 is now supported. + +The Mach CThreads (NeXTstep) are now supported by the Thread extension. 
+ +=head1 New tests + +=over 4 + +=item op/io_const + +IO constants (SEEK_*, _IO*). + +=item op/io_dir + +Directory-related IO methods (new, read, close, rewind, tied delete). + +=item op/io_multihomed + +INET sockets with multi-homed hosts. + +=item op/io_poll + +IO poll(). + +=item op/io_unix + +UNIX sockets. + +=item op/filetest + +File test operators. + +=item op/lex_assign + +Guard against lexicals leaking (internal stuff). + +=back =head1 Modules and Pragmata +=head2 Modules + +Dumpvalue module provides screen dumps of Perl data. + +=head2 Pragmata + +Lexical warnings pragma, "use warning;", to control optional warnings. + +Filetest pragma, to control the behaviour of filetests (C<-r> C<-w> ...). +Currently only one subpragma implemented, "use filetest 'access';", +that enables the use of access(2) or equivalent to check the +permissions instead of using stat(2) as usual. This matters +in filesystems where there are ACLs (access control lists), the +stat(2) might lie, while access(2) knows better. + Todo. =head1 Utility Changes @@ -56,11 +123,22 @@ Todo. =head1 Documentation Changes -Todo. +perlopentut, tutorial on opening things in Perl, was added. + +perlreftut, tutorial on references, was added. =head1 New Diagnostics -Todo. +=item /%s/: Unrecognized escape \\%c passed through + +(W) You used a backslash-character combination which is not recognized +by Perl. This combination appears in an interpolated variable or a +C<'>-delimited regular expression. + +=item Unrecognized escape \\%c passed through + +(W) You used a backslash-character combination which is not recognized +by Perl. =head1 Obsolete Diagnostics diff --git a/pod/perldiag.pod b/pod/perldiag.pod index 6b4c1277bc..e0e9b128e9 100644 --- a/pod/perldiag.pod +++ b/pod/perldiag.pod @@ -57,6 +57,12 @@ no useful value. See L<perlmod>. checksumming process loses information, and you can't go the other way. See L<perlfunc/unpack>. 
+=item /%s/: Unrecognized escape \\%c passed through + +(W) You used a backslash-character combination which is not recognized +by Perl. This combination appears in an interpolated variable or a +C<'>-delimited regular expression. + =item %s (...) interpreted as function (W) You've run afoul of the rule that says that any list operator followed @@ -1428,7 +1434,7 @@ architecture. On a 32-bit architecture the largest octal literal is (S) A warning peculiar to VMS. Perl keeps track of the number of times you've called C<fork> and C<exec>, to determine whether the current call to C<exec> should affect the current -script or a subprocess (see L<perlvms/exec>). Somehow, this count +script or a subprocess (see L<perlvms/"exec LIST">). Somehow, this count has become scrambled, so Perl is making a guess and treating this C<exec> as a request to terminate the Perl script and execute the specified command. @@ -2770,6 +2776,11 @@ an underbar into it. You might also declare it as a subroutine. in your Perl script (or eval). Perhaps you tried to run a compressed script, a binary program, or a directory as a Perl program. +=item Unrecognized escape \\%c passed through + +(W) You used a backslash-character combination which is not recognized +by Perl. + =item Unrecognized signal name "%s" (F) You specified a signal name to the kill() function that was not recognized. 
diff --git a/pod/perldsc.pod b/pod/perldsc.pod index d0cc335736..ef3ae750a5 100644 --- a/pod/perldsc.pod +++ b/pod/perldsc.pod @@ -690,7 +690,7 @@ many different sorts: print $rec->{TEXT}; - print $rec->{LIST}[0]; + print $rec->{SEQUENCE}[0]; $last = pop @ { $rec->{SEQUENCE} }; print $rec->{LOOKUP}{"key"}; diff --git a/pod/perlfaq.pod b/pod/perlfaq.pod index e6be112008..cb354931cc 100644 --- a/pod/perlfaq.pod +++ b/pod/perlfaq.pod @@ -1,6 +1,6 @@ =head1 NAME -perlfaq - frequently asked questions about Perl ($Date: 1998/08/05 12:09:32 $) +perlfaq - frequently asked questions about Perl ($Date: 1999/01/08 05:54:52 $) =head1 DESCRIPTION @@ -16,42 +16,682 @@ This document. Very general, high-level information about Perl. +=over 4 + +=item * What is Perl? + +=item * Who supports Perl? Who develops it? Why is it free? + +=item * Which version of Perl should I use? + +=item * What are perl4 and perl5? + +=item * What is perl6? + +=item * How stable is Perl? + +=item * Is Perl difficult to learn? + +=item * How does Perl compare with other languages like Java, Python, REXX, Scheme, or Tcl? + +=item * Can I do [task] in Perl? + +=item * When shouldn't I program in Perl? + +=item * What's the difference between "perl" and "Perl"? + +=item * Is it a Perl program or a Perl script? + +=item * What is a JAPH? + +=item * Where can I get a list of Larry Wall witticisms? + +=item * How can I convince my sysadmin/supervisor/employees to use version (5/5.005/Perl instead of some other language)? + +=back + + =item L<perlfaq2>: Obtaining and Learning about Perl Where to find source and documentation to Perl, support, and related matters. +=over 4 + +=item * What machines support Perl? Where do I get it? + +=item * How can I get a binary version of Perl? + +=item * I don't have a C compiler on my system. How can I compile perl? + +=item * I copied the Perl binary from one machine to another, but scripts don't work. 
+ +=item * I grabbed the sources and tried to compile but gdbm/dynamic loading/malloc/linking/... failed. How do I make it work? + +=item * What modules and extensions are available for Perl? What is CPAN? What does CPAN/src/... mean? + +=item * Is there an ISO or ANSI certified version of Perl? + +=item * Where can I get information on Perl? + +=item * What are the Perl newsgroups on USENET? Where do I post questions? + +=item * Where should I post source code? + +=item * Perl Books + +=item * Perl in Magazines + +=item * Perl on the Net: FTP and WWW Access + +=item * What mailing lists are there for perl? + +=item * Archives of comp.lang.perl.misc + +=item * Where can I buy a commercial version of Perl? + +=item * Where do I send bug reports? + +=item * What is perl.com? + +=back + + =item L<perlfaq3>: Programming Tools Programmer tools and programming support. +=over 4 + +=item * How do I do (anything)? + +=item * How can I use Perl interactively? + +=item * Is there a Perl shell? + +=item * How do I debug my Perl programs? + +=item * How do I profile my Perl programs? + +=item * How do I cross-reference my Perl programs? + +=item * Is there a pretty-printer (formatter) for Perl? + +=item * Is there a ctags for Perl? + +=item * Is there an IDE or Windows Perl Editor? + +=item * Where can I get Perl macros for vi? + +=item * Where can I get perl-mode for emacs? + +=item * How can I use curses with Perl? + +=item * How can I use X or Tk with Perl? + +=item * How can I generate simple menus without using CGI or Tk? + +=item * What is undump? + +=item * How can I make my Perl program run faster? + +=item * How can I make my Perl program take less memory? + +=item * Is it unsafe to return a pointer to local data? + +=item * How can I free an array or hash so my program shrinks? + +=item * How can I make my CGI script more efficient? + +=item * How can I hide the source for my Perl program? + +=item * How can I compile my Perl program into byte code or C? 
+ +=item * How can I compile Perl into Java? + +=item * How can I get C<#!perl> to work on [MS-DOS,NT,...]? + +=item * Can I write useful perl programs on the command line? + +=item * Why don't perl one-liners work on my DOS/Mac/VMS system? + +=item * Where can I learn about CGI or Web programming in Perl? + +=item * Where can I learn about object-oriented Perl programming? + +=item * Where can I learn about linking C with Perl? [h2xs, xsubpp] + +=item * I've read perlembed, perlguts, etc., but I can't embed perl in +my C program, what am I doing wrong? + +=item * When I tried to run my script, I got this message. What does it +mean? + +=item * What's MakeMaker? + +=back + + =item L<perlfaq4>: Data Manipulation Manipulating numbers, dates, strings, arrays, hashes, and miscellaneous data issues. +=over 4 + +=item * Why am I getting long decimals (eg, 19.9499999999999) instead of the numbers I should be getting (eg, 19.95)? + +=item * Why isn't my octal data interpreted correctly? + +=item * Does Perl have a round() function? What about ceil() and floor()? Trig functions? + +=item * How do I convert bits into ints? + +=item * Why doesn't & work the way I want it to? + +=item * How do I multiply matrices? + +=item * How do I perform an operation on a series of integers? + +=item * How can I output Roman numerals? + +=item * Why aren't my random numbers random? + +=item * How do I find the week-of-the-year/day-of-the-year? + +=item * How can I compare two dates and find the difference? + +=item * How can I take a string and turn it into epoch seconds? + +=item * How can I find the Julian Day? + +=item * How do I find yesterday's date? + +=item * Does Perl have a year 2000 problem? Is Perl Y2K compliant? + +=item * How do I validate input? + +=item * How do I unescape a string? + +=item * How do I remove consecutive pairs of characters? + +=item * How do I expand function calls in a string? + +=item * How do I find matching/nesting anything? 
+ +=item * How do I reverse a string? + +=item * How do I expand tabs in a string? + +=item * How do I reformat a paragraph? + +=item * How can I access/change the first N letters of a string? + +=item * How do I change the Nth occurrence of something? + +=item * How can I count the number of occurrences of a substring within a string? + +=item * How do I capitalize all the words on one line? + +=item * How can I split a [character] delimited string except when inside +[character]? (Comma-separated files) + +=item * How do I strip blank space from the beginning/end of a string? + +=item * How do I pad a string with blanks or pad a number with zeroes? + +=item * How do I extract selected columns from a string? + +=item * How do I find the soundex value of a string? + +=item * How can I expand variables in text strings? + +=item * What's wrong with always quoting "$vars"? + +=item * Why don't my E<lt>E<lt>HERE documents work? + +=item * What is the difference between a list and an array? + +=item * What is the difference between $array[1] and @array[1]? + +=item * How can I extract just the unique elements of an array? + +=item * How can I tell whether a list or array contains a certain element? + +=item * How do I compute the difference of two arrays? How do I compute the intersection of two arrays? + +=item * How do I test whether two arrays or hashes are equal? + +=item * How do I find the first array element for which a condition is true? + +=item * How do I handle linked lists? + +=item * How do I handle circular lists? + +=item * How do I shuffle an array randomly? + +=item * How do I process/modify each element of an array? + +=item * How do I select a random element from an array? + +=item * How do I permute N elements of a list? + +=item * How do I sort an array by (anything)? + +=item * How do I manipulate arrays of bits? + +=item * Why does defined() return true on empty arrays and hashes? + +=item * How do I process an entire hash? 
+ +=item * What happens if I add or remove keys from a hash while iterating over it? + +=item * How do I look up a hash element by value? + +=item * How can I know how many entries are in a hash? + +=item * How do I sort a hash (optionally by value instead of key)? + +=item * How can I always keep my hash sorted? + +=item * What's the difference between "delete" and "undef" with hashes? + +=item * Why don't my tied hashes make the defined/exists distinction? + +=item * How do I reset an each() operation part-way through? + +=item * How can I get the unique keys from two hashes? + +=item * How can I store a multidimensional array in a DBM file? + +=item * How can I make my hash remember the order I put elements into it? + +=item * Why does passing a subroutine an undefined element in a hash create it? + +=item * How can I make the Perl equivalent of a C structure/C++ class/hash or array of hashes or arrays? + +=item * How can I use a reference as a hash key? + +=item * How do I handle binary data correctly? + +=item * How do I determine whether a scalar is a number/whole/integer/float? + +=item * How do I keep persistent data across program calls? + +=item * How do I print out or copy a recursive data structure? + +=item * How do I define methods for every class/object? + +=item * How do I verify a credit card checksum? + +=item * How do I pack arrays of doubles or floats for XS code? + +=back + + =item L<perlfaq5>: Files and Formats I/O and the "f" issues: filehandles, flushing, formats and footers. +=over 4 + +=item * How do I flush/unbuffer an output filehandle? Why must I do this? + +=item * How do I change one line in a file/delete a line in a file/insert a line in the middle of a file/append to the beginning of a file? + +=item * How do I count the number of lines in a file? + +=item * How do I make a temporary file name? + +=item * How can I manipulate fixed-record-length files? + +=item * How can I make a filehandle local to a subroutine? 
How do I pass filehandles between subroutines? How do I make an array of filehandles? + +=item * How can I use a filehandle indirectly? + +=item * How can I set up a footer format to be used with write()? + +=item * How can I write() into a string? + +=item * How can I output my numbers with commas added? + +=item * How can I translate tildes (~) in a filename? + +=item * How come when I open a file read-write it wipes it out? + +=item * Why do I sometimes get an "Argument list too long" when I use E<lt>*E<gt>? + +=item * Is there a leak/bug in glob()? + +=item * How can I open a file with a leading "E<gt>" or trailing blanks? + +=item * How can I reliably rename a file? + +=item * How can I lock a file? + +=item * Why can't I just open(FH, ">file.lock")? + +=item * I still don't get locking. I just want to increment the number in the file. How can I do this? + +=item * How do I randomly update a binary file? + +=item * How do I get a file's timestamp in perl? + +=item * How do I set a file's timestamp in perl? + +=item * How do I print to more than one file at once? + +=item * How can I read in a file by paragraphs? + +=item * How can I read a single character from a file? From the keyboard? + +=item * How can I tell whether there's a character waiting on a filehandle? + +=item * How do I do a C<tail -f> in perl? + +=item * How do I dup() a filehandle in Perl? + +=item * How do I close a file descriptor by number? + +=item * Why can't I use "C:\temp\foo" in DOS paths? What doesn't `C:\temp\foo.exe` work? + +=item * Why doesn't glob("*.*") get all the files? + +=item * Why does Perl let me delete read-only files? Why does C<-i> clobber protected files? Isn't this a bug in Perl? + +=item * How do I select a random line from a file? + +=item * Why do I get weird spaces when I print an array of lines? + +=back + + =item L<perlfaq6>: Regexps Pattern matching and regular expressions. 
+=over 4 + +=item * How can I hope to use regular expressions without creating illegible and unmaintainable code? + +=item * I'm having trouble matching over more than one line. What's wrong? + +=item * How can I pull out lines between two patterns that are themselves on different lines? + +=item * I put a regular expression into $/ but it didn't work. What's wrong? + +=item * How do I substitute case insensitively on the LHS, but preserving case on the RHS? + +=item * How can I make C<\w> match national character sets? + +=item * How can I match a locale-smart version of C</[a-zA-Z]/>? + +=item * How can I quote a variable to use in a regexp? + +=item * What is C</o> really for? + +=item * How do I use a regular expression to strip C style comments from a file? + +=item * Can I use Perl regular expressions to match balanced text? + +=item * What does it mean that regexps are greedy? How can I get around it? + +=item * How do I process each word on each line? + +=item * How can I print out a word-frequency or line-frequency summary? + +=item * How can I do approximate matching? + +=item * How do I efficiently match many regular expressions at once? + +=item * Why don't word-boundary searches with C<\b> work for me? + +=item * Why does using $&, $`, or $' slow my program down? + +=item * What good is C<\G> in a regular expression? + +=item * Are Perl regexps DFAs or NFAs? Are they POSIX compliant? + +=item * What's wrong with using grep or map in a void context? + +=item * How can I match strings with multibyte characters? + +=item * How do I match a pattern that is supplied by the user? + +=back + + =item L<perlfaq7>: General Perl Language Issues General Perl language issues that don't clearly fit into any of the other sections. +=over 4 + +=item * Can I get a BNF/yacc/RE for the Perl language? + +=item * What are all these $@%* punctuation signs, and how do I know when to use them? + +=item * Do I always/never have to quote my strings or use semicolons and commas? 
+ +=item * How do I skip some return values? + +=item * How do I temporarily block warnings? + +=item * What's an extension? + +=item * Why do Perl operators have different precedence than C operators? + +=item * How do I declare/create a structure? + +=item * How do I create a module? + +=item * How do I create a class? + +=item * How can I tell if a variable is tainted? + +=item * What's a closure? + +=item * What is variable suicide and how can I prevent it? + +=item * How can I pass/return a {Function, FileHandle, Array, Hash, Method, Regexp}? + +=item * How do I create a static variable? + +=item * What's the difference between dynamic and lexical (static) scoping? Between local() and my()? + +=item * How can I access a dynamic variable while a similarly named lexical is in scope? + +=item * What's the difference between deep and shallow binding? + +=item * Why doesn't "my($foo) = E<lt>FILEE<gt>;" work right? + +=item * How do I redefine a builtin function, operator, or method? + +=item * What's the difference between calling a function as &foo and foo()? + +=item * How do I create a switch or case statement? + +=item * How can I catch accesses to undefined variables/functions/methods? + +=item * Why can't a method included in this same file be found? + +=item * How can I find out my current package? + +=item * How can I comment out a large block of perl code? + +=item * How do I clear a package? + +=back + + =item L<perlfaq8>: System Interaction Interprocess communication (IPC), control over the user-interface (keyboard, screen and pointing devices). +=over 4 + +=item * How do I find out which operating system I'm running under? + +=item * How come exec() doesn't return? + +=item * How do I do fancy stuff with the keyboard/screen/mouse? + +=item * How do I print something out in color? + +=item * How do I read just one key without waiting for a return key? + +=item * How do I check whether input is ready on the keyboard? + +=item * How do I clear the screen? 
+ +=item * How do I get the screen size? + +=item * How do I ask the user for a password? + +=item * How do I read and write the serial port? + +=item * How do I decode encrypted password files? + +=item * How do I start a process in the background? + +=item * How do I trap control characters/signals? + +=item * How do I modify the shadow password file on a Unix system? + +=item * How do I set the time and date? + +=item * How can I sleep() or alarm() for under a second? + +=item * How can I measure time under a second? + +=item * How can I do an atexit() or setjmp()/longjmp()? (Exception handling) + +=item * Why doesn't my sockets program work under System V (Solaris)? What does the error message "Protocol not supported" mean? + +=item * How can I call my system's unique C functions from Perl? + +=item * Where do I get the include files to do ioctl() or syscall()? + +=item * Why do setuid perl scripts complain about kernel problems? + +=item * How can I open a pipe both to and from a command? + +=item * Why can't I get the output of a command with system()? + +=item * How can I capture STDERR from an external command? + +=item * Why doesn't open() return an error when a pipe open fails? + +=item * What's wrong with using backticks in a void context? + +=item * How can I call backticks without shell processing? + +=item * Why can't my script read from STDIN after I gave it EOF (^D on Unix, ^Z on MS-DOS)? + +=item * How can I convert my shell script to perl? + +=item * Can I use perl to run a telnet or ftp session? + +=item * How can I write expect in Perl? + +=item * Is there a way to hide perl's command line from programs such as "ps"? + +=item * I {changed directory, modified my environment} in a perl script. How come the change disappeared when I exited the script? How do I get my changes to be visible? + +=item * How do I close a process's filehandle without waiting for it to complete? + +=item * How do I fork a daemon process? 
+ +=item * How do I make my program run with sh and csh? + +=item * How do I find out if I'm running interactively or not? + +=item * How do I timeout a slow event? + +=item * How do I set CPU limits? + +=item * How do I avoid zombies on a Unix system? + +=item * How do I use an SQL database? + +=item * How do I make a system() exit on control-C? + +=item * How do I open a file without blocking? + +=item * How do I install a CPAN module? + +=item * What's the difference between require and use? + +=item * How do I keep my own module/library directory? + +=item * How do I add the directory my program lives in to the module/library search path? + +=item * How do I add a directory to my include path at runtime? + +=item * What is socket.ph and where do I get it? + +=back + + =item L<perlfaq9>: Networking Networking, the Internet, and a few on the web. +=over 4 + +=item * My CGI script runs from the command line but not the browser. (500 Server Error) + +=item * How can I get better error messages from a CGI program? + +=item * How do I remove HTML from a string? + +=item * How do I extract URLs? + +=item * How do I download a file from the user's machine? How do I open a file on another machine? + +=item * How do I make a pop-up menu in HTML? + +=item * How do I fetch an HTML file? + +=item * How do I automate an HTML form submission? + +=item * How do I decode or create those %-encodings on the web? + +=item * How do I redirect to another page? + +=item * How do I put a password on my web pages? + +=item * How do I edit my .htpasswd and .htgroup files with Perl? + +=item * How do I make sure users can't enter values into a form that cause my CGI script to do bad things? + +=item * How do I parse a mail header? + +=item * How do I decode a CGI form? + +=item * How do I check a valid mail address? + +=item * How do I decode a MIME/BASE64 string? + +=item * How do I return the user's mail address? + +=item * How do I send mail? + +=item * How do I read mail? 
+ +=item * How do I find out my hostname/domainname/IP address? + +=item * How do I fetch a news article or the active newsgroups? + +=item * How do I fetch/put an FTP file? + +=item * How can I do RPC in Perl? + +=back + + =back =head2 Where to get this document @@ -66,6 +706,7 @@ at http://www.perl.com/perl/faq/ . You may mail corrections, additions, and suggestions to perlfaq-suggestions@perl.com . This alias should not be used to I<ask> FAQs. It's for fixing the current FAQ. +Send questions to the comp.lang.perl.misc newsgroup. =head2 What will happen if you mail your Perl programming problems to the authors @@ -88,7 +729,7 @@ Perl Porters. =head1 Author and Copyright Information -Copyright (c) 1997, 1998 Tom Christiansen and Nathan Torkington. +Copyright (c) 1997-1999 Tom Christiansen and Nathan Torkington. All rights reserved. =head2 Bundled Distributions @@ -117,6 +758,11 @@ in respect of this information or its use. =over 4 +=item 7/January/99 + +Small touchups here and there. Added all questions in this +document as a sort of table of contents. + =item 22/June/98 Significant changes throughout in preparation for the 5.005 @@ -170,3 +816,4 @@ This is the initial release of version 3 of the FAQ; consequently there have been no changes since its initial release. =back + diff --git a/pod/perlfaq1.pod b/pod/perlfaq1.pod index c6d53b3161..6a752b9db9 100644 --- a/pod/perlfaq1.pod +++ b/pod/perlfaq1.pod @@ -1,6 +1,6 @@ =head1 NAME -perlfaq1 - General Questions About Perl ($Revision: 1.15 $, $Date: 1998/08/05 11:52:24 $) +perlfaq1 - General Questions About Perl ($Revision: 1.20 $, $Date: 1999/01/08 04:22:09 $) =head1 DESCRIPTION @@ -32,12 +32,14 @@ the personal note at the end of the README file in the perl source distribution for more details. See L<perlhist> (new as of 5.005) for Perl's milestone releases. 
-In particular, the core development team (known as the Perl -Porters) are a rag-tag band of highly altruistic individuals -committed to producing better software for free than you -could hope to purchase for money. You may snoop on pending -developments via news://news.perl.com/perl.porters-gw/ and -http://www.frii.com/~gnat/perl/porters/summary.html. +In particular, the core development team (known as the Perl Porters) +are a rag-tag band of highly altruistic individuals committed +to producing better software for free than you could hope to +purchase for money. You may snoop on pending developments via +nntp://news.perl.com/perl.porters-gw/ and the Deja News archive at +http://www.dejanews.com/ using the perl.porters-gw newsgroup, or you can +subscribe to the mailing list by sending perl5-porters-request@perl.org +a subscription request. While the GNU project includes Perl in its distributions, there's no such thing as "GNU Perl". Perl is not produced nor maintained by the @@ -51,12 +53,16 @@ users the informal support will more than suffice. See the answer to =head2 Which version of Perl should I use? You should definitely use version 5. Version 4 is old, limited, and -no longer maintained; its last patch (4.036) was in 1992. The most -recent production release is 5.005_01. Further references to the Perl -language in this document refer to this production release unless -otherwise specified. There may be one or more official bug fixes for -5.005_01 by the time you read this, and also perhaps some experimental -versions on the way to the next release. +no longer maintained; its last patch (4.036) was in 1992, long ago and +far away. Sure, it's stable, but so is anything that's dead; in fact, +perl4 had been called a dead, flea-bitten camel carcass. The most recent +production release is 5.005_02 (although 5.004_04 is still supported). +The most cutting-edge development release is 5.005_54. 
Further references +to the Perl language in this document refer to the production release +unless otherwise specified. There may be one or more official bug +fixes for 5.005_02 by the time you read this, and also perhaps some +experimental versions on the way to the next release. All releases +prior to 5.004 were subject to buffer overruns, a grave security issue. =head2 What are perl4 and perl5? @@ -68,11 +74,12 @@ Perl5 is merely the popular name for the fifth major release (October 1994), while perl4 was the fourth major release (March 1991). There was also a perl1 (in January 1988), a perl2 (June 1988), and a perl3 (October 1989). -The 5.0 release is, essentially, a complete rewrite of the perl source -code from the ground up. It has been modularized, object-oriented, -tweaked, trimmed, and optimized until it almost doesn't look like the -old code. However, the interface is mostly the same, and compatibility -with previous releases is very high. +The 5.0 release is, essentially, a ground-up rewrite of the original +perl source code from releases 1 through 4. It has been modularized, +object-oriented, tweaked, trimmed, and optimized until it almost doesn't +look like the old code. However, the interface is mostly the same, and +compatibility with previous releases is very high. See L<perltrap/"Perl4 +to Perl5 Traps">. To avoid the "what language is perl5?" confusion, some people prefer to simply use "perl" to refer to the latest version of perl and avoid using @@ -80,6 +87,27 @@ simply use "perl" to refer to the latest version of perl and avoid using See L<perlhist> for a history of Perl revisions. +=head2 What is perl6? + +Perl6 is a semi-jocular reference to the Topaz project. Headed by Chip +Salzenberg, Topaz is yet-another ground-up rewrite of the current release +of Perl, one whose major goal is to create a more maintainable core than +found in release 5. 
Written in nominally portable C++, Topaz hopes to +maintain 100% source-compatibility with previous releases of Perl but to +run significantly faster and smaller. The Topaz team hopes to provide +an XS compatibility interface to allow most XS modules to work unchanged, +albeit perhaps without the efficiency that the new interface would allow. +New features in Topaz are as yet undetermined, and will be addressed +once compatibility and performance goals are met. + +If you are a hard-working C++ wizard with a firm command of Perl's +internals, and you would like to work on the project, send a request to +perl6-porters-request@perl.org to subscribe to the Topaz mailing list. + +There is no ETA for Topaz. It is expected to be several years before it +achieves enough robustness, compatibility, portability, and performance +to replace perl5 for ordinary use by mere mortals. + =head2 How stable is Perl? Production releases, which incorporate bug fixes and new functionality, @@ -106,18 +134,18 @@ to do it" (TMTOWTDI, sometimes pronounced "tim toady"). Perl's learning curve is therefore shallow (easy to learn) and long (there's a whole lot you can do if you really want). -Finally, Perl is (frequently) an interpreted language. This means -that you can write your programs and test them without an intermediate -compilation step, allowing you to experiment and test/debug quickly -and easily. This ease of experimentation flattens the learning curve -even more. +Finally, because Perl is frequently (but not always, and certainly not by +definition) an interpreted language, you can write your programs and test +them without an intermediate compilation step, allowing you to experiment +and test/debug quickly and easily. This ease of experimentation flattens +the learning curve even more. Things that make Perl easier to learn: Unix experience, almost any kind of programming experience, an understanding of regular expressions, and the ability to understand other people's code. 
If there's something you need to do, then it's probably already been done, and a working example is usually available for free. Don't forget the new perl modules, either. -They're discussed in Part 3 of this FAQ, along with the CPAN, which is +They're discussed in Part 3 of this FAQ, along with CPAN, which is discussed in Part 2. =head2 How does Perl compare with other languages like Java, Python, REXX, Scheme, or Tcl? @@ -130,22 +158,25 @@ Probably the best thing to do is try to write equivalent code to do a set of tasks. These languages have their own newsgroups in which you can learn about (but hopefully not argue about) them. +Some comparison documents can be found at http://language.perl.com/versus/ +if you really can't stop yourself. + =head2 Can I do [task] in Perl? -Perl is flexible and extensible enough for you to use on almost any -task, from one-line file-processing tasks to complex systems. For -many people, Perl serves as a great replacement for shell scripting. -For others, it serves as a convenient, high-level replacement for most -of what they'd program in low-level languages like C or C++. It's -ultimately up to you (and possibly your management ...) which tasks -you'll use Perl for and which you won't. +Perl is flexible and extensible enough for you to use on virtually any +task, from one-line file-processing tasks to large, elaborate systems. +For many people, Perl serves as a great replacement for shell scripting. +For others, it serves as a convenient, high-level replacement for most of +what they'd program in low-level languages like C or C++. It's ultimately +up to you (and possibly your management) which tasks you'll use Perl +for and which you won't. If you have a library that provides an API, you can make any component of it available as just another Perl function or variable using a Perl extension written in C or C++ and dynamically linked into your main perl interpreter. 
You can also go the other direction, and write your main program in C or C++, and then link in some Perl code on the fly, -to create a powerful application. +to create a powerful application. See L<perlembed>. That said, there will always be small, focused, special-purpose languages dedicated to a specific problem domain that are simply more @@ -164,17 +195,16 @@ certain task (e.g. prolog, make). For various reasons, Perl is probably not well-suited for real-time embedded systems, low-level operating systems development work like -device drivers or context-switching code, complex multithreaded +device drivers or context-switching code, complex multi-threaded shared-memory applications, or extremely large applications. You'll notice that perl is not itself written in Perl. -The new native-code compiler for Perl may reduce the limitations given -in the previous statement to some degree, but understand that Perl -remains fundamentally a dynamically typed language, and not a -statically typed one. You certainly won't be chastized if you don't -trust nuclear-plant or brain-surgery monitoring code to it. And -Larry will sleep easier, too -- Wall Street programs not -withstanding. :-) +The new, native-code compiler for Perl may eventually reduce the +limitations given in the previous statement to some degree, but understand +that Perl remains fundamentally a dynamically typed language, not +a statically typed one. You certainly won't be chastised if you don't +trust nuclear-plant or brain-surgery monitoring code to it. And Larry +will sleep easier, too -- Wall Street programs not withstanding. :-) =head2 What's the difference between "perl" and "Perl"? @@ -183,33 +213,58 @@ signify the language proper and "perl" the implementation of it, i.e. the current interpreter. Hence Tom's quip that "Nothing but perl can parse Perl." You may or may not choose to follow this usage. 
For example, parallelism means "awk and perl" and "Python and Perl" look -ok, while "awk and Perl" and "Python and perl" do not. +ok, while "awk and Perl" and "Python and perl" do not. But never +write "PERL", because perl isn't really an acronym, apocryphal +folklore and post-facto expansions notwithstanding. =head2 Is it a Perl program or a Perl script? -It doesn't matter. - -In "standard terminology" a I<program> has been compiled to physical -machine code once, and can then be be run multiple times, whereas a -I<script> must be translated by a program each time it's used. Perl -programs, however, are usually neither strictly compiled nor strictly -interpreted. They can be compiled to a byte code form (something of a +Larry doesn't really care. He says (half in jest) that "a script is +what you give the actors. A program is what you give the audience." + +Originally, a script was a canned sequence of normally interactive +commands, that is, a chat script. Something like a uucp or ppp chat +script or an expect script fits the bill nicely, as do configuration +scripts run by a program at its start up, such as F<.cshrc> or F<.ircrc>, +for example. Chat scripts were just drivers for existing programs, +not stand-alone programs in their own right. + +A computer scientist will correctly explain that all programs are +interpreted, and that the only question is at what level. But if you +ask this question of someone who isn't a computer scientist, they might +tell you that a I<program> has been compiled to physical machine code +once, and can then be run multiple times, whereas a I<script> must be +translated by a program each time it's used. + +Perl programs are (usually) neither strictly compiled nor strictly +interpreted. They can be compiled to a byte-code form (something of a Perl virtual machine) or to completely different languages, like C or -assembly language. 
You can't tell just by looking whether the source -is destined for a pure interpreter, a parse-tree interpreter, a byte -code interpreter, or a native-code compiler, so it's hard to give a -definitive answer here. +assembly language. You can't tell just by looking at it whether the +source is destined for a pure interpreter, a parse-tree interpreter, +a byte-code interpreter, or a native-code compiler, so it's hard to give +a definitive answer here. + +Now that "script" and "scripting" are terms that have been seized by +unscrupulous or unknowing marketeers for their own nefarious purposes, +they have begun to take on strange and often pejorative meanings, +like "non serious" or "not real programming". Consequently, some perl +programmers prefer to avoid them altogether. =head2 What is a JAPH? These are the "just another perl hacker" signatures that some people -sign their postings with. About 100 of the of the earlier ones are -available from http://www.perl.com/CPAN/misc/japh . +sign their postings with. Randal Schwartz made these famous. About +100 of the earlier ones are available from +http://www.perl.com/CPAN/misc/japh . =head2 Where can I get a list of Larry Wall witticisms? Over a hundred quips by Larry, from postings of his or source code, -can be found at http://www.perl.com/CPAN/misc/lwall-quotes . +can be found at http://www.perl.com/CPAN/misc/lwall-quotes.txt.gz . + +Newer examples can be found by perusing Larry's postings: + + http://x1.dejanews.com/dnquery.xp?QRY=*&DBS=2&ST=PS&defaultOp=AND&LNG=ALL&format=terse&showsort=date&maxhits=100&subjects=&groups=&authors=larry@*wall.org&fromdate=&todate= =head2 How can I convince my sysadmin/supervisor/employees to use version (5/5.005/Perl instead of some other language)? @@ -232,28 +287,29 @@ many Unix vendors now ship Perl by default, and support is usually just a news-posting away, if you can't find the answer in the I<comprehensive> documentation, including this FAQ. 
+See http://www.perl.org/advocacy/ for more information. + If you face reluctance to upgrading from an older version of perl, then point out that version 4 is utterly unmaintained and unsupported by the Perl Development Team. Another big sell for Perl5 is the large number of modules and extensions which greatly reduce development time for any given task. Also mention that the difference between version 4 and version 5 of Perl is like the difference between awk and C++. -(Well, ok, maybe not quite that distinct, but you get the idea.) If -you want support and a reasonable guarantee that what you're -developing will continue to work in the future, then you have to run -the supported version. That probably means running the 5.005 release, -although 5.004 isn't that bad (it's just one year and one release -behind). Several important bugs were fixed from the 5.000 through +(Well, ok, maybe not quite that distinct, but you get the idea.) If you +want support and a reasonable guarantee that what you're developing +will continue to work in the future, then you have to run the supported +version. That probably means running the 5.005 release, although 5.004 +isn't that bad. Several important bugs were fixed from the 5.000 through 5.003 versions, though, so try upgrading past them if possible. Of particular note is the massive bughunt for buffer overflow problems that went into the 5.004 release. All releases prior to that, including perl4, are considered insecure and should be upgraded -as soon as possible. +as soon as possible. =head1 AUTHOR AND COPYRIGHT -Copyright (c) 1997, 1998 Tom Christiansen and Nathan Torkington. +Copyright (c) 1997-1999 Tom Christiansen and Nathan Torkington. All rights reserved. When included as an integrated part of the Standard Distribution @@ -266,3 +322,4 @@ domain. You are permitted and encouraged to use this code and any derivatives thereof in your own programs for fun or for profit as you see fit. 
A simple comment in the code giving credit to the FAQ would be courteous but is not required. + diff --git a/pod/perlfaq2.pod b/pod/perlfaq2.pod index 918e9369ae..13a29072b5 100644 --- a/pod/perlfaq2.pod +++ b/pod/perlfaq2.pod @@ -1,6 +1,6 @@ =head1 NAME -perlfaq2 - Obtaining and Learning about Perl ($Revision: 1.25 $, $Date: 1998/08/05 11:47:25 $) +perlfaq2 - Obtaining and Learning about Perl ($Revision: 1.30 $, $Date: 1998/12/29 19:43:32 $) =head1 DESCRIPTION @@ -12,7 +12,7 @@ related matters. The standard release of Perl (the one maintained by the perl development team) is distributed only in source code form. You -can find this at http://www.perl.com/CPAN/src/latest.tar.gz, which +can find this at http://www.perl.com/CPAN/src/latest.tar.gz , which in standard Internet format (a gzipped archive in POSIX tar format). Perl builds and runs on a bewildering number of platforms. Virtually @@ -22,7 +22,7 @@ QNX, BeOS, and the Amiga. There are also the beginnings of support for MPE/iX. Binary distributions for some proprietary platforms, including -Apple systems can be found http://www.perl.com/CPAN/ports/ directory. +Apple systems, can be found http://www.perl.com/CPAN/ports/ directory. Because these are not part of the standard distribution, they may and in fact do differ from the base Perl port in a variety of ways. You'll have to check their respective release notes to see just @@ -31,22 +31,23 @@ what the differences are. These differences can be either positive are not supported in the source release of perl) or negative (e.g. might be based upon a less current source release of perl). -A useful FAQ for Win32 Perl users is -http://www.endcontsw.com/people/evangelo/Perl_for_Win32_FAQ.html - =head2 How can I get a binary version of Perl? 
-If you don't have a C compiler because for whatever reasons your -vendor did not include one with your system, the best thing to do is +If you don't have a C compiler because your vendor for whatever +reasons did not include one with your system, the best thing to do is grab a binary version of gcc from the net and use that to compile perl with. CPAN only has binaries for systems that are terribly hard to get free compilers for, not for Unix systems. -Your first stop should be http://www.perl.com/CPAN/ports to see what -information is already available. A simple installation guide for -MS-DOS is available at http://www.cs.ruu.nl/~piet/perl5dos.html , and -similarly for Windows 3.1 at http://www.cs.ruu.nl/~piet/perlwin3.html -. +Some URLs that might help you are: + + http://language.perl.com/info/software.html + http://www.perl.com/latest/ + http://www.perl.com/CPAN/ports/ + +If you want information on proprietary systems. A simple installation +guide for MS-DOS is available at http://www.cs.ruu.nl/~piet/perl5dos.html +and similarly for Windows 3.1 at http://www.cs.ruu.nl/~piet/perlwin3.html . =head2 I don't have a C compiler on my system. How can I compile perl? @@ -67,11 +68,14 @@ approaches are doomed to failure. One simple way to check that things are in the right place is to print out the hard-coded @INC which perl is looking for. - perl -e 'print join("\n",@INC)' + % perl -e 'print join("\n",@INC)' If this command lists any paths which don't exist on your system, then you may need to move the appropriate libraries to these locations, or create -symlinks, aliases, or shortcuts appropriately. +symlinks, aliases, or shortcuts appropriately. @INC is also printed as +part of the output of + + % perl -V You might also want to check out L<perlfaq8/"How do I keep my own module/library directory?">. @@ -79,7 +83,7 @@ module/library directory?">. =head2 I grabbed the sources and tried to compile but gdbm/dynamic loading/malloc/linking/... failed. How do I make it work? 
Read the F<INSTALL> file, which is part of the source distribution. -It describes in detail how to cope with most idiosyncracies that the +It describes in detail how to cope with most idiosyncrasies that the Configure script can't work around for any given system or architecture. @@ -141,6 +145,16 @@ http://www.perl.com/perl/info/documentation.html that might help. Many good books have been written about Perl -- see the section below for more details. +Tutorial documents included in current or upcoming Perl releases +include L<perltoot> for objects, L<perlopentut> for file opening +semantics, L<perlreftut> for managing references, and L<perlxstut> +for linking C and Perl together. There may be more by the +time you read this. The following URLs might also be of +assistance: + + http://language.perl.com/info/documentation.html + http://reference.perl.com/query.cgi?tutorials + =head2 What are the Perl newsgroups on USENET? Where do I post questions? The now defunct comp.lang.perl newsgroup has been superseded by the @@ -154,20 +168,17 @@ following groups: comp.infosystems.www.authoring.cgi Writing CGI scripts for the Web. -Actually, the moderated group hasn't passed yet, but we're -keeping our fingers crossed. - There is also USENET gateway to the mailing list used by the crack Perl development team (perl5-porters) at news://news.perl.com/perl.porters-gw/ . =head2 Where should I post source code? -You should post source code to whichever group is most appropriate, -but feel free to cross-post to comp.lang.perl.misc. If you want to -cross-post to alt.sources, please make sure it follows their posting -standards, including setting the Followup-To header line to NOT -include alt.sources; see their FAQ for details. +You should post source code to whichever group is most appropriate, but +feel free to cross-post to comp.lang.perl.misc. 
If you want to cross-post +to alt.sources, please make sure it follows their posting standards, +including setting the Followup-To header line to NOT include alt.sources; +see their FAQ (http://www.faqs.org/faqs/alt-sources-intro/) for details. If you're just looking for software, first use Alta Vista, Deja News, and search CPAN. This is faster and more productive than just posting @@ -184,7 +195,7 @@ The incontestably definitive reference book on Perl, written by the creator of Perl, is now in its second edition: Programming Perl (the "Camel Book"): - Authors: Larry Wall, Tom Christiansen, and Randal Schwartz + by Larry Wall, Tom Christiansen, and Randal Schwartz ISBN 1-56592-149-6 (English) ISBN 4-89052-384-7 (Japanese) URL: http://www.oreilly.com/catalog/pperl2/ @@ -196,7 +207,7 @@ of real-world examples, mini-tutorials, and complete programs (first premiering at the 1998 Perl Conference), is: The Perl Cookbook (the "Ram Book"): - Authors: Tom Christiansen and Nathan Torkington, + by Tom Christiansen and Nathan Torkington, with Foreword by Larry Wall ISBN: 1-56592-243-3 URL: http://perl.oreilly.com/cookbook/ @@ -206,7 +217,7 @@ might suffice for you to learn Perl from. But if you're not, check out: Learning Perl (the "Llama Book"): - Authors: Randal Schwartz and Tom Christiansen + by Randal Schwartz and Tom Christiansen with Foreword by Larry Wall ISBN: 1-56592-284-0 URL: http://www.oreilly.com/catalog/lperl2/ @@ -230,7 +241,7 @@ See http://www.ora.com/ on the Web. What follows is a list of the books that the FAQ authors found personally useful. Your mileage may (but, we hope, probably won't) vary. -Recommended books on (or muchly on) Perl follow; those marked with +Recommended books on (or mostly on) Perl follow; those marked with a star may be ordered from O'Reilly. =over @@ -262,7 +273,7 @@ a star may be ordered from O'Reilly. 
MacPerl: Power and Ease by Vicki Brown and Chris Nandor, foreword by Matthias Neeracher -=item Task-Oriented +=item Task-Oriented *The Perl Cookbook by Tom Christiansen and Nathan Torkington @@ -296,7 +307,7 @@ development, databases, Win32 Perl, graphical programming, regular expressions, and networking, and sponsors the Obfuscated Perl Contest. It is published quarterly under the gentle hand of its editor, Jon Orwant. See http://www.tpj.com/ or send mail to -subscriptions@tpj.com. +subscriptions@tpj.com . Beyond this, magazines that frequently carry high-quality articles on Perl are I<Web Techniques> (see http://www.webtechniques.com/), @@ -309,10 +320,11 @@ http://www.stonehenge.com/merlyn/WebTechniques/. To get the best (and possibly cheapest) performance, pick a site from the list below and use it to grab the complete list of mirror sites. -From there you can find the quickest site for you. Remember, the +>From there you can find the quickest site for you. Remember, the following list is I<not> the complete list of CPAN mirrors. - http://www.perl.com/CPAN (redirects to another mirror) + http://www.perl.com/CPAN-local + http://www.perl.com/CPAN (redirects to an ftp mirror) http://www.perl.org/CPAN ftp://ftp.funet.fi/pub/languages/perl/CPAN/ http://www.cs.ruu.nl/pub/PERL/CPAN/ @@ -322,69 +334,19 @@ following list is I<not> the complete list of CPAN mirrors. Most of the major modules (tk, CGI, libwww-perl) have their own mailing lists. Consult the documentation that came with the module for -subscription information. The following are a list of mailing lists -related to perl itself. - -If you subscribe to a mailing list, it behooves you to know how to -unsubscribe from it. Strident pleas to the list itself to get you off -will not be favorably received. - -=over 4 - -=item MacPerl - -There is a mailing list for discussing Macintosh Perl. Contact -"mac-perl-request@iis.ee.ethz.ch". 
- -Also see Matthias Neeracher's (the creator and maintainer of MacPerl) -webpage at http://www.iis.ee.ethz.ch/~neeri/macintosh/perl.html for -many links to interesting MacPerl sites, and the applications/MPW -tools, precompiled. - -=item Perl5-Porters - -The core development team have a mailing list for discussing fixes and -changes to the language. Send mail to -"perl5-porters-request@perl.org" with help in the body of the message -for information on subscribing. - -=item NTPerl +subscription information. The Perl Institute attempts to maintain a +list of mailing lists at: -This list is used to discuss issues involving Win32 Perl 5 (Windows NT -and Win95). Subscribe by mailing ListManager@ActiveWare.com with the -message body: + http://www.perl.org/maillist.html - subscribe Perl-Win32-Users - -The list software, also written in perl, will automatically determine -your address, and subscribe you automatically. To unsubscribe, mail -the following in the message body to the same address like so: - - unsubscribe Perl-Win32-Users - -You can also check http://www.activeware.com/ and select "Mailing Lists" -to join or leave this list. - -=item Perl-Packrats - -Discussion related to archiving of perl materials, particularly the -Comprehensive Perl Archive Network (CPAN). Subscribe by emailing -majordomo@cis.ufl.edu: - - subscribe perl-packrats - -The list software, also written in perl, will automatically determine -your address, and subscribe you automatically. To unsubscribe, simple -prepend the same command with an "un", and mail to the same address -like so: - - unsubscribe perl-packrats +=head2 Archives of comp.lang.perl.misc -=back +Have you tried Deja News or Alta Vista? Those are the +best archives. Just look up "*perl*" as a newsgroup. 
-=head2 Archives of comp.lang.perl.misc + http://www.dejanews.com/dnquery.xp?QRY=&DBS=2&ST=PS&defaultOp=AND&LNG=ALL&format=terse&showsort=date&maxhits=25&subjects=&groups=*perl*&authors=&fromdate=&todate= -Have you tried Deja News or Alta Vista? +You'll probably want to trim that down a bit, though. ftp.cis.ufl.edu:/pub/perl/comp.lang.perl.*/monthly has an almost complete collection dating back to 12/89 (missing 08/91 through @@ -402,21 +364,24 @@ let perlfaq-suggestions@perl.com know. =head2 Where can I buy a commercial version of Perl? -In a sense, Perl already I<is> commercial software: It has a licence -that you can grab and carefully read to your manager. It is -distributed in releases and comes in well-defined packages. There is a -very large user community and an extensive literature. The -comp.lang.perl.* newsgroups and several of the mailing lists provide -free answers to your questions in near real-time. Perl has -traditionally been supported by Larry, dozens of software designers -and developers, and thousands of programmers, all working for free -to create a useful thing to make life better for everyone. +In a real sense, Perl already I<is> commercial software: It has a licence +that you can grab and carefully read to your manager. It is distributed +in releases and comes in well-defined packages. There is a very large +user community and an extensive literature. The comp.lang.perl.* +newsgroups and several of the mailing lists provide free answers to your +questions in near real-time. Perl has traditionally been supported by +Larry, scores of software designers and developers, and myriads of +programmers, all working for free to create a useful thing to make life +better for everyone. However, these answers may not suffice for managers who require a -purchase order from a company whom they can sue should anything go -wrong. Or maybe they need very serious hand-holding and contractual -obligations. 
Shrink-wrapped CDs with perl on them are available from -several sources if that will help. +purchase order from a company whom they can sue should anything go awry. +Or maybe they need very serious hand-holding and contractual obligations. +Shrink-wrapped CDs with perl on them are available from several sources if +that will help. For example, many perl books carry a perl distribution +on them, as do the O'Reilly Perl Resource Kits (in both the Unix flavor +and in the proprietary Microsoft flavor); the free Unix distributions +also all come with Perl. Or you can purchase a real support contract. Although Cygnus historically provided this service, they no longer sell support contracts for Perl. @@ -438,20 +403,20 @@ Oraperl and related modules (which Oracle is planning to ship as part of Oracle Web Server 3). 20% of the profit from our Perl support work will be donated to The Perl Institute." -For more information, contact the The Perl Clinic: +For more information, contact The Perl Clinic: Tel: +44 1483 424424 Fax: +44 1483 419419 Web: http://www.perl.co.uk/ Email: perl-support-info@perl.co.uk or Tim.Bunce@ig.co.uk -See also www.perl.com for updates on training and support. +See also www.perl.com for updates on tutorials, training, and support. =head2 Where do I send bug reports? If you are reporting a bug in the perl interpreter or the modules shipped with perl, use the I<perlbug> program in the perl distribution or -mail your report to perlbug@perl.com. +mail your report to perlbug@perl.com . If you are posting a bug with a non-standard port (see the answer to "What platforms is Perl available for?"), a binary distribution, or a @@ -461,30 +426,24 @@ bugs. Read the perlbug(1) man page (perl5.004 or later) for more information. -=head2 What is perl.com? perl.org? The Perl Institute? +=head2 What is perl.com? 
-The perl.com domain is managed by Tom Christiansen, who created it as a +The perl.com domain is owned by Tom Christiansen, who created it as a public service long before perl.org came about. Despite the name, it's a pretty non-commercial site meant to be a clearinghouse for information about all things Perlian, accepting no paid advertisements, bouncy happy gifs, or silly java applets on its pages. The Perl Home Page at http://www.perl.com/ is currently hosted on a T3 line courtesy of Songline Systems, a software-oriented subsidiary of O'Reilly and Associates. +Other starting points include -perl.org is the official vehicle for The Perl Institute. The motto of -TPI is "helping people help Perl help people" (or something like -that). It's a non-profit organization supporting development, -documentation, and dissemination of perl. - -=head2 How do I learn about object-oriented Perl programming? - -L<perltoot> (distributed with 5.004 or later) is a good place to start. -Also, L<perlobj>, L<perlref>, and L<perlmod> are useful references, -while L<perlbot> has some excellent tips and tricks. + http://language.perl.com/ + http://conference.perl.com/ + http://reference.perl.com/ =head1 AUTHOR AND COPYRIGHT -Copyright (c) 1997, 1998 Tom Christiansen and Nathan Torkington. +Copyright (c) 1997-1999 Tom Christiansen and Nathan Torkington. All rights reserved. When included as an integrated part of the Standard Distribution @@ -497,3 +456,4 @@ domain. You are permitted and encouraged to use this code and any derivatives thereof in your own programs for fun or for profit as you see fit. A simple comment in the code giving credit to the FAQ would be courteous but is not required. 
+ diff --git a/pod/perlfaq3.pod b/pod/perlfaq3.pod index 478b0805d4..28e64ec5e2 100644 --- a/pod/perlfaq3.pod +++ b/pod/perlfaq3.pod @@ -1,6 +1,6 @@ =head1 NAME -perlfaq3 - Programming Tools ($Revision: 1.29 $, $Date: 1998/08/05 11:57:04 $) +perlfaq3 - Programming Tools ($Revision: 1.33 $, $Date: 1998/12/29 20:12:12 $) =head1 DESCRIPTION @@ -102,6 +102,10 @@ on your hardware, operating system, and the load on your machine): for: 4 secs ( 3.97 usr 0.01 sys = 3.98 cpu) map: 6 secs ( 4.97 usr 0.00 sys = 4.97 cpu) +Be aware that a good benchmark is very hard to write. It only tests the +data you give it, and really proves little about differing complexities +of contrasting algorithms. + =head2 How do I cross-reference my Perl programs? The B::Xref module, shipped with the new, alpha-release Perl compiler @@ -122,23 +126,50 @@ shouldn't need to reformat. The habit of formatting your code as you write it will help prevent bugs. Your editor can and should help you with this. The perl-mode for emacs can provide a remarkable amount of help with most (but not all) code, and even less programmable editors -can provide significant assistance. +can provide significant assistance. Tom swears by the following +settings in vi and its clones: + + set ai sw=4 + map ^O {^M}^[O^T + +Now put that in your F<.exrc> file (replacing the caret characters +with control characters) and away you go. In insert mode, ^T is +for indenting, ^D is for undenting, and ^O is for blockdenting -- +as it were. If you haven't used the last one, you're missing +a lot. 
A more complete example, with comments, can be found at +http://www.perl.com/CPAN-local/authors/id/TOMC/scripts/toms.exrc.gz -If you are used to using I<vgrind> program for printing out nice code +If you are used to using the I<vgrind> program for printing out nice code to a laser printer, you can take a stab at this using http://www.perl.com/CPAN/doc/misc/tips/working.vgrind.entry, but the results are not particularly satisfying for sophisticated code. +The a2ps at http://www.infres.enst.fr/~demaille/a2ps/ does lots of things +related to generating nicely printed output of documents. + =head2 Is there a ctags for Perl? There's a simple one at http://www.perl.com/CPAN/authors/id/TOMC/scripts/ptags.gz which may do -the trick. +the trick. And if not, it's easy to hack into what you want. + +=head2 Is there an IDE or Windows Perl Editor? + +If you're on Unix, you already have an IDE -- Unix itself. +You just have to learn the toolbox. If you're not, then you +probably don't have a toolbox, so may need something else. + +PerlBuilder (XXX URL to follow) is an integrated development +environment for Windows that supports Perl development. Perl programs +are just plain text, though, so you could download emacs for Windows +(XXX) or vim for win32 (http://www.cs.vu.nl/~tmgil/vi.html). If +you're transferring Windows files to Unix, be sure to transfer in +ASCII mode so the ends of lines are appropriately converted. =head2 Where can I get Perl macros for vi? For a complete version of Tom Christiansen's vi configuration file, -see http://www.perl.com/CPAN/authors/Tom_Christiansen/scripts/toms.exrc, +see http://www.perl.com/CPAN/authors/Tom_Christiansen/scripts/toms.exrc.gz, the standard benchmark file for vi emulators. This runs best with nvi, the current version of vi out of Berkeley, which incidentally can be built with an embedded Perl interpreter -- see http://www.perl.com/CPAN/src/misc. @@ -155,7 +186,7 @@ context-sensitive help, and other nifty things. 
Note that the perl-mode of emacs will have fits with C<"main'foo"> (single quote), and mess up the indentation and hilighting. You -should be using C<"main::foo"> in new Perl code anyway, so this +are probably using C<"main::foo"> in new Perl code anyway, so this shouldn't be an issue. =head2 How can I use curses with Perl? @@ -236,7 +267,7 @@ wasn't a good solution anyway. When it comes to time-space tradeoffs, Perl nearly always prefers to throw memory at a problem. Scalars in Perl use more memory than -strings in C, arrays take more that, and hashes use even more. While +strings in C, arrays take more than that, and hashes use even more. While there's still a lot to be done, recent releases have been addressing these issues. For example, as of 5.004, duplicate hash keys are shared amongst all hashes using them, so require no reallocation. @@ -278,10 +309,15 @@ No, Perl's garbage collection system takes care of this. You can't. On most operating systems, memory allocated to a program can never be returned to the system. That's why long-running programs -sometimes re-exec themselves. Some operating systems (notably, FreeBSD) -allegedly reclaim large chunks of memory that is no longer used, but -it doesn't appear to happen with Perl (yet). The Mac appears to be the -only platform that will reliably (albeit, slowly) return memory to the OS. +sometimes re-exec themselves. Some operating systems (notably, +FreeBSD and Linux) allegedly reclaim large chunks of memory that is no +longer used, but it doesn't appear to happen with Perl (yet). The Mac +appears to be the only platform that will reliably (albeit, slowly) +return memory to the OS. + +We've had reports that on Linux (Redhat 5.1) on Intel, C<undef +$scalar> will return memory to the system, while on Solaris 2.6 it +won't. In general, try it yourself and see. 
However, judicious use of my() on your variables will help make sure that they go out of scope so that Perl can free up their storage for @@ -314,8 +350,7 @@ the internal server API, so modules written in Perl can do just about anything a module written in C can. For more on mod_perl, see http://perl.apache.org/ -With the FCGI module (from CPAN), a Perl executable compiled with sfio -(see the F<INSTALL> file in the distribution) and the mod_fastcgi +With the FCGI module (from CPAN) and the mod_fastcgi module (available from http://www.fastcgi.com/) each of your perl scripts becomes a permanent CGI daemon process. @@ -325,7 +360,7 @@ care. See http://www.perl.com/CPAN/modules/by-category/15_World_Wide_Web_HTML_HTTP_CGI/ . -A non-free, commerical product, ``The Velocity Engine for Perl'', +A non-free, commercial product, ``The Velocity Engine for Perl'', (http://www.binevolve.com/ or http://www.binevolve.com/bine/vep) might also be worth looking at. It will allow you to increase the performance of your perl scripts, upto 25 times faster than normal CGI perl by @@ -353,12 +388,12 @@ source. Security through obscurity, the name for hiding your bugs instead of fixing them, is little security indeed. You can try using encryption via source filters (Filter::* from CPAN), -but crackers might be able to decrypt it. You can try using the byte -code compiler and interpreter described below, but crackers might be -able to de-compile it. You can try using the native-code compiler -described below, but crackers might be able to disassemble it. These -pose varying degrees of difficulty to people wanting to get at your -code, but none can definitively conceal it (this is true of every +but any decent programmer will be able to decrypt it. You can try using +the byte code compiler and interpreter described below, but the curious +might still be able to de-compile it. You can try using the native-code +compiler described below, but crackers might be able to disassemble it. 
+These pose varying degrees of difficulty to people wanting to get at +your code, but none can definitively conceal it (this is true of every language, not just Perl). If you're concerned about people profiting from your code, then the @@ -407,6 +442,14 @@ packaging, and once you see the size of what it makes (well, unless you use a shared I<libperl.so>), you'll probably want a complete Perl install anyway. +=head2 How can I compile Perl into Java? + +You can't. Not yet, anyway. You can integrate Java and Perl with the +Perl Resource Kit from O'Reilly and Associates. See +http://www.oreilly.com/catalog/prkunix/ for more information. +The Java interface will be supported in the core 5.006 release +of Perl. + =head2 How can I get C<#!perl> to work on [MS-DOS,NT,...]? For OS/2 just use @@ -420,10 +463,13 @@ F<INSTALL> file in the source distribution for more information). The Win95/NT installation, when using the ActiveState port of Perl, will modify the Registry to associate the C<.pl> extension with the -perl interpreter. If you install another port (Gurusaramy Sarathy's -is the recommended Win95/NT port), or (eventually) build your own -Win95/NT Perl using WinGCC, then you'll have to modify the Registry -yourself. +perl interpreter. If you install another port (Gurusamy Sarathy's is +the recommended Win95/NT port), or (eventually) build your own +Win95/NT Perl using a Windows port of gcc (e.g., with cygwin32 or +mingw32), then you'll have to modify the Registry yourself. In +addition to associating C<.pl> with the interpreter, NT people can +use: C<SET PATHEXT=%PATHEXT%;.PL> to let them run the program +C<install-linux.pl> merely by typing C<install-linux>. Macintosh perl scripts will have the appropriate Creator and Type, so that double-clicking them will invoke the perl application. 
@@ -494,6 +540,9 @@ shell, or MPW, is much like Unix shells in its support for several quoting variants, except that it makes free use of the Mac's non-ASCII characters as control characters. +Using qq(), q(), and qx(), instead of "double quotes", 'single +quotes', and `backticks`, may make one-liners easier to write. + There is no general solution to all of this. It is a mess, pure and simple. Sucks to be away from Unix, huh? :-) @@ -580,7 +629,7 @@ information, see L<ExtUtils::MakeMaker>. =head1 AUTHOR AND COPYRIGHT -Copyright (c) 1997, 1998 Tom Christiansen and Nathan Torkington. +Copyright (c) 1997-1999 Tom Christiansen and Nathan Torkington. All rights reserved. When included as an integrated part of the Standard Distribution @@ -593,3 +642,4 @@ domain. You are permitted and encouraged to use this code and any derivatives thereof in your own programs for fun or for profit as you see fit. A simple comment in the code giving credit to the FAQ would be courteous but is not required. + diff --git a/pod/perlfaq4.pod b/pod/perlfaq4.pod index 39325c2f69..c477b9d4c6 100644 --- a/pod/perlfaq4.pod +++ b/pod/perlfaq4.pod @@ -1,6 +1,6 @@ =head1 NAME -perlfaq4 - Data Manipulation ($Revision: 1.26 $, $Date: 1998/08/05 12:04:00 $) +perlfaq4 - Data Manipulation ($Revision: 1.40 $, $Date: 1999/01/08 04:26:39 $) =head1 DESCRIPTION @@ -41,7 +41,7 @@ are consequently slower. To get rid of the superfluous digits, just use a format (eg, C<printf("%.2f", 19.95)>) to get the required precision. -See L<perlop/"Floating-point Arithmetic">. +See L<perlop/"Floating-point Arithmetic">. =head2 Why isn't my octal data interpreted correctly? @@ -59,7 +59,7 @@ umask(), or sysopen(), which all want permissions in octal. chmod(644, $file); # WRONG -- perl -w catches this chmod(0644, $file); # right -=head2 Does perl have a round function? What about ceil() and floor()? Trig functions? +=head2 Does Perl have a round() function? What about ceil() and floor()? Trig functions? 
Remember that int() merely truncates toward 0. For rounding to a certain number of digits, sprintf() or printf() is usually the easiest @@ -88,6 +88,19 @@ cases, it probably pays not to trust whichever system rounding is being used by Perl, but to instead implement the rounding function you need yourself. +To see why, notice how you'll still have an issue on half-way-point +alternation: + + for ($i = 0; $i < 1.01; $i += 0.05) { printf "%.1f ",$i} + + 0.0 0.1 0.1 0.2 0.2 0.2 0.3 0.3 0.4 0.4 0.5 0.5 0.6 0.7 0.7 + 0.8 0.8 0.9 0.9 1.0 1.0 + +Don't blame Perl. It's the same as in C. IEEE says we have to do this. +Perl numbers whose absolute values are integers under 2**31 (on 32 bit +machines) will work pretty much like mathematical integers. Other numbers +are not guaranteed. + =head2 How do I convert bits into ints? To turn a string of 1s and 0s like C<10110110> into a scalar containing @@ -100,6 +113,33 @@ Here's an example of going the other way: $binary_string = join('', unpack('B*', "\x29")); +=head2 Why doesn't & work the way I want it to? + +The behavior of binary arithmetic operators depends on whether they're +used on numbers or strings. The operators treat a string as a series +of bits and work with that (the string C<"3"> is the bit pattern +C<00110011>). The operators work with the binary form of a number +(the number C<3> is treated as the bit pattern C<00000011>). + +So, saying C<11 & 3> performs the "and" operation on numbers (yielding +C<1>). Saying C<"11" & "3"> performs the "and" operation on strings +(yielding C<"1">). + +Most problems with C<&> and C<|> arise because the programmer thinks +they have a number but really it's a string. The rest arise because +the programmer says: + + if ("\020\020" & "\101\101") { + # ... + } + +but a string consisting of two null bytes (the result of C<"\020\020" +& "\101\101">) is not a false value in Perl. You need: + + if ( ("\020\020" & "\101\101") !~ /[^\000]/) { + # ... + } + =head2 How do I multiply matrices? 
Use the Math::Matrix or Math::MatrixReal modules (available from CPAN) @@ -120,12 +160,12 @@ To call a function on each element of an array, but ignore the results: foreach $iterator (@array) { - &my_func($iterator); + some_func($iterator); } To call a function on each integer in a (small) range, you B<can> use: - @results = map { &my_func($_) } (5 .. 25); + @results = map { some_func($_) } (5 .. 25); but you should be aware that the C<..> operator creates an array of all integers in the range. This can take a lot of memory for large @@ -133,7 +173,7 @@ ranges. Instead use: @results = (); for ($i=5; $i < 500_005; $i++) { - push(@results, &my_func($i)); + push(@results, some_func($i)); } =head2 How can I output Roman numerals? @@ -142,20 +182,25 @@ Get the http://www.perl.com/CPAN/modules/by-module/Roman module. =head2 Why aren't my random numbers random? -The short explanation is that you're getting pseudorandom numbers, not -random ones, because computers are good at being predictable and bad -at being random (despite appearances caused by bugs in your programs -:-). A longer explanation is available on -http://www.perl.com/CPAN/doc/FMTEYEWTK/random, courtesy of Tom -Phoenix. John von Neumann said, ``Anyone who attempts to generate -random numbers by deterministic means is, of course, living in a state -of sin.'' +If you're using a version of Perl before 5.004, you must call C<srand> +once at the start of your program to seed the random number generator. +5.004 and later automatically call C<srand> at the beginning. Don't +call C<srand> more than once--you make your numbers less random, rather +than more. -You should also check out the Math::TrulyRandom module from CPAN. It -uses the imperfections in your system's timer to generate random -numbers, but this takes quite a while. If you want a better +Computers are good at being predictable and bad at being random +(despite appearances caused by bugs in your programs :-). 
+http://www.perl.com/CPAN/doc/FMTEYEWTK/random, courtesy of Tom +Phoenix, talks more about this. John von Neumann said, ``Anyone who +attempts to generate random numbers by deterministic means is, of +course, living in a state of sin.'' + +If you want numbers that are more random than C<rand> with C<srand> +provides, you should also check out the Math::TrulyRandom module from +CPAN. It uses the imperfections in your system's timer to generate +random numbers, but this takes quite a while. If you want a better pseudorandom generator than comes with your operating system, look at -``Numerical Recipes in C'' at http://nr.harvard.edu/nr/bookc.html . +``Numerical Recipes in C'' at http://www.nr.com/ . =head1 Data: Dates @@ -178,10 +223,10 @@ You can find the week of the year by dividing this by 7: Of course, this believes that weeks start at zero. The Date::Calc module from CPAN has a lot of date calculation functions, including day of the year, week of the year, and so on. Note that not -all business consider ``week 1'' to be the same; for example, -American business often consider the first week with a Monday -in it to be Work Week #1, despite ISO 8601, which consider -WW1 to be the frist week with a Thursday in it. +all businesses consider ``week 1'' to be the same; for example, +American businesses often consider the first week with a Monday +in it to be Work Week #1, despite ISO 8601, which considers +WW1 to be the first week with a Thursday in it. =head2 How can I compare two dates and find the difference? @@ -204,20 +249,34 @@ there is an example of Julian date calculation that should help you in http://www.perl.com/CPAN/authors/David_Muir_Sharnoff/modules/Time/JulianDay.pm.gz . +=head2 How do I find yesterday's date? + +The C<time()> function returns the current time in seconds since the +epoch.
Take one day off that: + + $yesterday = time() - ( 24 * 60 * 60 ); + +Then you can pass this to C<localtime()> and get the individual year, +month, day, hour, minute, seconds values. + =head2 Does Perl have a year 2000 problem? Is Perl Y2K compliant? -Short answer: No, Perl does not have a Year 2000 problem. Yes, -Perl is Y2K compliant. The programmers you've hired to use it, -however, probably are not. +Short answer: No, Perl does not have a Year 2000 problem. Yes, Perl is +Y2K compliant (whatever that means). The programmers you've hired to +use it, however, probably are not. + +Long answer: The question belies a true understanding of the issue. +Perl is just as Y2K compliant as your pencil--no more, and no less. +Can you use your pencil to write a non-Y2K-compliant memo? Of course +you can. Is that the pencil's fault? Of course it isn't. -Long answer: Perl is just as Y2K compliant as your pencil--no more, -and no less. The date and time functions supplied with perl (gmtime -and localtime) supply adequate information to determine the year well -beyond 2000 (2038 is when trouble strikes for 32-bit machines). The -year returned by these functions when used in an array context is the -year minus 1900. For years between 1910 and 1999 this I<happens> to -be a 2-digit decimal number. To avoid the year 2000 problem simply do -not treat the year as a 2-digit number. It isn't. +The date and time functions supplied with perl (gmtime and localtime) +supply adequate information to determine the year well beyond 2000 +(2038 is when trouble strikes for 32-bit machines). The year returned +by these functions when used in an array context is the year minus 1900. +For years between 1910 and 1999 this I<happens> to be a 2-digit decimal +number. To avoid the year 2000 problem simply do not treat the year as +a 2-digit number. It isn't. When gmtime() and localtime() are used in scalar context they return a timestamp string that contains a fully-expanded year. 
For example, @@ -286,8 +345,9 @@ parser. If you are serious about writing a parser, there are a number of modules or oddities that will make your life a lot easier. There is the CPAN module Parse::RecDescent, the standard module Text::Balanced, -the byacc program, and Mark-Jason Dominus's excellent I<py> tool at -http://www.plover.com/~mjd/perl/py/ . +the byacc program, the CPAN module Parse::Yapp, and Mark-Jason +Dominus's excellent I<py> tool at http://www.plover.com/~mjd/perl/py/ +. One simple destructive, inside-out approach that you might try is to pull out the smallest nesting parts one at a time: @@ -296,6 +356,21 @@ pull out the smallest nesting parts one at a time: # do something with $1 } +A more complicated and sneaky approach is to make Perl's regular +expression engine do it for you. This is courtesy Dean Inada, and +rather has the nature of an Obfuscated Perl Contest entry, but it +really does work: + + # $_ contains the string to parse + # BEGIN and END are the opening and closing markers for the + # nested text. + + @( = ('(',''); + @) = (')',''); + ($re=$_)=~s/((BEGIN)|(END)|.)/$)[!$3]\Q$1\E$([!$2]/gs; + @$ = (eval{/$re/},$@!~/unmatched/); + print join("\n",@$[0..$#$]) if( $$[-1] ); + =head2 How do I reverse a string? Use reverse() in scalar context, as documented in @@ -422,6 +497,11 @@ You can (and probably should) enable locale awareness of those characters by placing a C<use locale> pragma in your program. See L<perllocale> for endless details on locales. +This is sometimes referred to as putting something into "title +case", but that's not quite accurate. Consdier the proper +capitalization of the movie I<Dr. Strangelove or: How I Learned to +Stop Worrying and Love the Bomb>, for example. + =head2 How can I split a [character] delimited string except when inside [character]? (Comma-separated files) @@ -457,13 +537,15 @@ distribution) lets you say: use Text::ParseWords; @new = quotewords(",", 0, $text); +There's also a Text::CSV module on CPAN. 
+ =head2 How do I strip blank space from the beginning/end of a string? Although the simplest approach would seem to be: $string =~ s/^\s*(.*?)\s*$/$1/; -This is unneccesarily slow, destructive, and fails with embedded newlines. +This is unnecessarily slow, destructive, and fails with embedded newlines. It is much faster to do this in two steps: $string =~ s/^\s+//; @@ -488,6 +570,44 @@ values of a hash if you use a slice: s/\s+$//; } +=head2 How do I pad a string with blanks or pad a number with zeroes? + +(This answer contributed by Uri Guttman) + +In the following examples, C<$pad_len> is the length to which you wish +to pad the string, C<$text> or C<$num> contains the string to be +padded, and C<$pad_char> contains the padding character. You can use a +single character string constant instead of the C<$pad_char> variable +if you know what it is in advance. + +The simplest method uses the C<sprintf> function. It can pad on the +left or right with blanks and on the left with zeroes. + + # Left padding with blank: + $padded = sprintf( "%${pad_len}s", $text ) ; + + # Right padding with blank: + $padded = sprintf( "%-${pad_len}s", $text ) ; + + # Left padding with 0: + $padded = sprintf( "%0${pad_len}d", $num ) ; + +If you need to pad with a character other than blank or zero you can use +one of the following methods. + +These methods generate a pad string with the C<x> operator and +concatenate that with the original text. + +Left and right padding with any character: + + $padded = $pad_char x ( $pad_len - length( $text ) ) . $text ; + $padded = $text . $pad_char x ( $pad_len - length( $text ) ) ; + +Or you can left or right pad $text directly: + + $text .= $pad_char x ( $pad_len - length( $text ) ) ; + substr( $text, 0, 0 ) = $pad_char x ( $pad_len - length( $text ) ) ; + =head2 How do I extract selected columns from a string? Use substr() or unpack(), both documented in L<perlfunc>.
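As a rough illustration of the substr() and unpack() approaches to column extraction mentioned above, here is a minimal sketch. The record layout (a 10-column name, a 3-column age, then a city) and the sub names are assumptions made up for this example; they are not part of the FAQ text.

```perl
use strict;
use warnings;

# Hypothetical fixed-width layout, assumed for illustration only:
# columns 0-9 hold a name, 10-12 an age, and the rest a city.

sub columns_with_substr {
    my ($line) = @_;
    my $name = substr($line, 0, 10);
    my $age  = substr($line, 10, 3);
    my $city = substr($line, 13);
    # substr() keeps the padding, so trim trailing blanks by hand
    s/\s+$// for $name, $age, $city;
    return ($name, $age, $city);
}

sub columns_with_unpack {
    my ($line) = @_;
    # The "A" template strips trailing blanks from each field
    return unpack("A10 A3 A*", $line);
}

my $line = sprintf("%-10s%-3s%s", "Jason", 23, "Boulder");
print join("|", columns_with_substr($line)), "\n";
print join("|", columns_with_unpack($line)), "\n";
```

unpack() tends to be the shorter choice when there are many fields, while substr() is handy when only one or two columns are wanted. This sketch assumes every line is at least 13 characters long; shorter lines would need a length check first.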
@@ -523,13 +643,13 @@ Let's assume that you have a string like: If those were both global variables, then this would suffice: - $text =~ s/\$(\w+)/${$1}/g; + $text =~ s/\$(\w+)/${$1}/g; # no /e needed But since they are probably lexicals, or at least, they could be, you'd have to do this: $text =~ s/(\$\w+)/$1/eeg; - die if $@; # needed on /ee, not /e + die if $@; # needed /ee, not /e It's probably better in the general case to treat those variables as entries in some special hash. For example: @@ -547,7 +667,9 @@ of the FAQ. The problem is that those double-quotes force stringification, coercing numbers and references into strings, even when you -don't want them to be. +don't want them to be. Think of it this way: double-quote +expansion is used to produce new strings. If you already +have a string, why do you need more? If you get used to writing odd things like these: @@ -583,7 +705,7 @@ Stringification also destroys arrays. print "@lines"; # WRONG - extra blanks print @lines; # right -=head2 Why don't my <<HERE documents work? +=head2 Why don't my E<lt>E<lt>HERE documents work? Check for these three things: @@ -665,6 +787,27 @@ indentation correctly preserved: =head1 Data: Arrays +=head2 What is the difference between a list and an array? + +An array has a changeable length. A list does not. An array is something +you can push or pop, while a list is a set of values. Some people make +the distinction that a list is a value while an array is a variable. +Subroutines are passed and return lists, you put things into list +context, you initialize arrays with lists, and you foreach() across +a list. C<@> variables are arrays, anonymous arrays are arrays, arrays +in scalar context behave like the number of elements in them, subroutines +access their arguments through the array C<@_>, push/pop/shift only work +on arrays. + +As a side note, there's no such thing as a list in scalar context. 
+When you say + + $scalar = (2, 5, 7, 9); + +you're using the comma operator in scalar context, so it evaluates the +left hand side, then evaluates and returns the right hand side. This +causes the last value to be returned: 9. + =head2 What is the difference between $array[1] and @array[1]? The former is a scalar value, the latter an array slice, which makes @@ -724,6 +867,8 @@ nice in that it won't work with false values like undef, 0, or ""; =back +But perhaps you should have been using a hash all along, eh? + =head2 How can I tell whether a list or array contains a certain element? Hearing the word "in" is an I<in>dication that you probably should have @@ -770,7 +915,17 @@ or worse yet These are slow (checks every element even if the first matches), inefficient (same reason), and potentially buggy (what if there are -regexp characters in $whatever?). +regexp characters in $whatever?). If you're only testing once, then +use: + + $is_there = 0; + foreach $elt (@array) { + if ($elt eq $elt_to_find) { + $is_there = 1; + last; + } + } + if ($is_there) { ... } =head2 How do I compute the difference of two arrays? How do I compute the intersection of two arrays? @@ -785,11 +940,60 @@ each element is unique in a given array: push @{ $count{$element} > 1 ? \@intersection : \@difference }, $element; } +=head2 How do I test whether two arrays or hashes are equal? + +The following code works for single-level arrays. It uses a stringwise +comparison, and does not distinguish defined versus undefined empty +strings. Modify if you have other needs. + + $are_equal = compare_arrays(\@frogs, \@toads); + + sub compare_arrays { + my ($first, $second) = @_; + local $^W = 0; # silence spurious -w undef complaints + return 0 unless @$first == @$second; + for (my $i = 0; $i < @$first; $i++) { + return 0 if $first->[$i] ne $second->[$i]; + } + return 1; + } + +For multilevel structures, you may wish to use an approach more +like this one.
It uses the CPAN module FreezeThaw: + + use FreezeThaw qw(cmpStr); + @a = @b = ( "this", "that", [ "more", "stuff" ] ); + + printf "a and b contain %s arrays\n", + cmpStr(\@a, \@b) == 0 + ? "the same" + : "different"; + +This approach also works for comparing hashes. Here +we'll demonstrate two different answers: + + use FreezeThaw qw(cmpStr cmpStrHard); + + %a = %b = ( "this" => "that", "extra" => [ "more", "stuff" ] ); + $a{EXTRA} = \%b; + $b{EXTRA} = \%a; + + printf "a and b contain %s hashes\n", + cmpStr(\%a, \%b) == 0 ? "the same" : "different"; + + printf "a and b contain %s hashes\n", + cmpStrHard(\%a, \%b) == 0 ? "the same" : "different"; + + +The first reports that both hashes contain the same data, +while the second reports that they do not. Which you prefer is left as +an exercise to the reader. + =head2 How do I find the first array element for which a condition is true? You can use this if you care about the index: - for ($i=0; $i < @array; $i++) { + for ($i= 0; $i < @array; $i++) { if ($array[$i] eq "Waldo") { $found_index = $i; last; @@ -810,7 +1014,42 @@ need to copy pointers each time. If you really, really wanted, you could use structures as described in L<perldsc> or L<perltoot> and do just what the algorithm book tells you -to do. +to do. For example, imagine a list node like this: + + $node = { + VALUE => 42, + LINK => undef, + }; + +You could walk the list this way: + + print "List: "; + for ($node = $head; $node; $node = $node->{LINK}) { + print $node->{VALUE}, " "; + } + print "\n"; + +You could grow the list this way: + + my ($head, $tail); + $tail = append($head, 1); # grow a new head + for $value ( 2 ..
10 ) { + $tail = append($tail, $value); + } + + sub append { + my($list, $value) = @_; + my $node = { VALUE => $value }; + if ($list) { + $node->{LINK} = $list->{LINK}; + $list->{LINK} = $node; + } else { + $_[0] = $node; # replace caller's version + } + return $node; + } + +But again, Perl's built-ins are virtually always good enough. =head2 How do I handle circular lists? @@ -1006,9 +1245,54 @@ get those bits into your @ints array: This method gets faster the more sparse the bit vector is. (Courtesy of Tim Bunce and Winfried Koenig.) +Here's a demo on how to use vec(): + + # vec demo + $vector = "\xff\x0f\xef\xfe"; + print "Ilya's string \\xff\\x0f\\xef\\xfe represents the number ", + unpack("N", $vector), "\n"; + $is_set = vec($vector, 23, 1); + print "Its 23rd bit is ", $is_set ? "set" : "clear", ".\n"; + pvec($vector); + + set_vec(1,1,1); + set_vec(3,1,1); + set_vec(23,1,1); + + set_vec(3,1,3); + set_vec(3,2,3); + set_vec(3,4,3); + set_vec(3,4,7); + set_vec(3,8,3); + set_vec(3,8,7); + + set_vec(0,32,17); + set_vec(1,32,17); + + sub set_vec { + my ($offset, $width, $value) = @_; + my $vector = ''; + vec($vector, $offset, $width) = $value; + print "offset=$offset width=$width value=$value\n"; + pvec($vector); + } + + sub pvec { + my $vector = shift; + my $bits = unpack("b*", $vector); + my $i = 0; + my $BASE = 8; + + print "vector length in bytes: ", length($vector), "\n"; + @bytes = unpack("A8" x length($vector), $bits); + print "bits are: @bytes\n\n"; + } + =head2 Why does defined() return true on empty arrays and hashes? -See L<perlfunc/defined> in the 5.004 release or later of Perl. +The short story is that you should probably only use defined on scalars or +functions, not on aggregates (arrays and hashes). See L<perlfunc/defined> +in the 5.004 release or later of Perl for more detail. =head1 Data: Hashes (Associative Arrays) @@ -1243,9 +1527,21 @@ awk's behavior.
=head2 How can I make the Perl equivalent of a C structure/C++ class/hash or array of hashes or arrays? -Use references (documented in L<perlref>). Examples of complex data -structures are given in L<perldsc> and L<perllol>. Examples of -structures and object-oriented classes are in L<perltoot>. +Usually a hash ref, perhaps like this: + + $record = { + NAME => "Jason", + EMPNO => 132, + TITLE => "deputy peon", + AGE => 23, + SALARY => 37_000, + PALS => [ "Norbert", "Rhys", "Phineas"], + }; + +References are documented in L<perlref> and the upcoming L<perlreftut>. +Examples of complex data structures are given in L<perldsc> and +L<perllol>. Examples of structures and object-oriented classes are +in L<perltoot>. =head2 How can I use a reference as a hash key? @@ -1263,8 +1559,9 @@ this works fine (assuming the files are found): print "Your kernel is GNU-zip enabled!\n"; } -On some systems, however, you have to play tedious games with "text" -versus "binary" files. See L<perlfunc/"binmode">. +On some legacy systems, however, you have to play tedious games with +"text" versus "binary" files. See L<perlfunc/"binmode">, or the upcoming +L<perlopentut> manpage. If you're concerned about 8-bit ASCII data, then see L<perllocale>. @@ -1276,14 +1573,14 @@ some gotchas. See the section on Regular Expressions. Assuming that you don't care about IEEE notations like "NaN" or "Infinity", you probably just want to use a regular expression. 
- warn "has nondigits" if /\D/; - warn "not a natural number" unless /^\d+$/; # rejects -3 - warn "not an integer" unless /^-?\d+$/; # rejects +3 - warn "not an integer" unless /^[+-]?\d+$/; - warn "not a decimal number" unless /^-?\d+\.?\d*$/; # rejects .2 - warn "not a decimal number" unless /^-?(?:\d+(?:\.\d*)?|\.\d+)$/; - warn "not a C float" - unless /^([+-]?)(?=\d|\.\d)\d*(\.\d*)?([Ee]([+-]?\d+))?$/; + if (/\D/) { print "has nondigits\n" } + if (/^\d+$/) { print "is a whole number\n" } + if (/^-?\d+$/) { print "is an integer\n" } + if (/^[+-]?\d+$/) { print "is a +/- integer\n" } + if (/^-?\d+\.?\d*$/) { print "is a real number\n" } + if (/^-?(?:\d+(?:\.\d*)?|\.\d+)$/) { print "is a decimal number" } + if (/^([+-]?)(?=\d|\.\d)\d*(\.\d*)?([Ee]([+-]?\d+))?$/) + { print "a C float" } If you're on a POSIX system, Perl's supports the C<POSIX::strtod> function. Its semantics are somewhat cumbersome, so here's a C<getnum> @@ -1317,19 +1614,32 @@ and longs, respectively. =head2 How do I keep persistent data across program calls? For some specific applications, you can use one of the DBM modules. -See L<AnyDBM_File>. More generically, you should consult the -FreezeThaw, Storable, or Class::Eroot modules from CPAN. +See L<AnyDBM_File>. More generically, you should consult the FreezeThaw, +Storable, or Class::Eroot modules from CPAN. Here's one example using +Storable's C<store> and C<retrieve> functions: + + use Storable; + store(\%hash, "filename"); + + # later on... + $href = retrieve("filename"); # by ref + %hash = %{ retrieve("filename") }; # direct to hash =head2 How do I print out or copy a recursive data structure? -The Data::Dumper module on CPAN is nice for printing out -data structures, and FreezeThaw for copying them. For example: +The Data::Dumper module on CPAN (or the 5.005 release of Perl) is great +for printing out data structures. The Storable module, found on CPAN, +provides a function called C<dclone> that recursively copies its argument. 
+ + use Storable qw(dclone); + $r2 = dclone($r1); - use FreezeThaw qw(freeze thaw); - ($new) = thaw freeze $old; +Where $r1 can be a reference to any kind of data structure you'd like. +It will be deeply copied. Because C<dclone> takes and returns references, +you'd have to add extra punctuation if you had a hash of arrays that +you wanted to copy. -Where $old can be (a reference to) any kind of data structure you'd like. -It will be deeply copied. + %newhash = %{ dclone(\%oldhash) }; =head2 How do I define methods for every class/object? @@ -1339,9 +1649,15 @@ Use the UNIVERSAL class (see L<UNIVERSAL>). Get the Business::CreditCard module from CPAN. +=head2 How do I pack arrays of doubles or floats for XS code? + +The kgbpack.c code in the PGPLOT module on CPAN does just this. +If you're doing a lot of float or double processing, consider using +the PDL module from CPAN instead--it makes number-crunching easy. + =head1 AUTHOR AND COPYRIGHT -Copyright (c) 1997, 1998 Tom Christiansen and Nathan Torkington. +Copyright (c) 1997-1999 Tom Christiansen and Nathan Torkington. All rights reserved. When included as part of the Standard Version of Perl, or as part of @@ -1356,3 +1672,4 @@ are hereby placed into the public domain. You are permitted and encouraged to use this code in your own programs for fun or for profit as you see fit. A simple comment in the code giving credit would be courteous but is not required. + diff --git a/pod/perlfaq5.pod b/pod/perlfaq5.pod index 015c9b4d21..119ffa4103 100644 --- a/pod/perlfaq5.pod +++ b/pod/perlfaq5.pod @@ -1,6 +1,6 @@ =head1 NAME -perlfaq5 - Files and Formats ($Revision: 1.24 $, $Date: 1998/07/05 15:07:20 $) +perlfaq5 - Files and Formats ($Revision: 1.34 $, $Date: 1999/01/08 05:46:13 $) =head1 DESCRIPTION @@ -78,12 +78,15 @@ See L<perlfaq9> for other examples of fetching URLs over the web. =head2 How do I change one line in a file/delete a line in a file/insert a line in the middle of a file/append to the beginning of a file? 
+Those are operations of a text editor. Perl is not a text editor. +Perl is a programming language. You have to decompose the problem into +low-level calls to read, write, open, close, and seek. + Although humans have an easy time thinking of a text file as being a -sequence of lines that operates much like a stack of playing cards -- -or punch cards -- computers usually see the text file as a sequence of -bytes. In general, there's no direct way for Perl to seek to a -particular line of a file, insert text into a file, or remove text -from a file. +sequence of lines that operates much like a stack of playing cards -- or +punch cards -- computers usually see the text file as a sequence of bytes. +In general, there's no direct way for Perl to seek to a particular line +of a file, insert text into a file, or remove text from a file. (There are exceptions in special circumstances. You can add or remove at the very end of the file. Another is replacing a sequence of bytes with @@ -97,7 +100,7 @@ no locking. $old = $file; $new = "$file.tmp.$$"; - $bak = "$file.bak"; + $bak = "$file.orig"; open(OLD, "< $old") or die "can't open $old: $!"; open(NEW, "> $new") or die "can't open $new: $!"; @@ -124,7 +127,7 @@ platform-specific documentation that came with your port. perl -pi -e 's/(^\s+test\s+)\d+/ $1 . ++$count /e' t/op/taint.t # form a script - local($^I, @ARGV) = ('.bak', glob("*.c")); + local($^I, @ARGV) = ('.orig', glob("*.c")); while (<>) { if ($. == 1) { print "This line should appear at the top of each file\n"; @@ -174,9 +177,9 @@ Use the C<new_tmpfile> class method from the IO::File module to get a filehandle opened for reading and writing. Use this if you don't need to know the file's name. - use IO::File; + use IO::File; $fh = IO::File->new_tmpfile() - or die "Unable to make new temporary file: $!"; + or die "Unable to make new temporary file: $!"; Or you can use the C<tmpnam> function from the POSIX module to get a filename that you then open yourself. 
Use this if you do need to know @@ -222,7 +225,7 @@ one process, use a counter: =head2 How can I manipulate fixed-record-length files? The most efficient way is using pack() and unpack(). This is faster than -using substr() when take many, many strings. It is slower for just a few. +using substr() when taking many, many strings. It is slower for just a few. Here is a sample chunk of code to break up and put back together again some fixed-format input lines, in this case from the output of a normal, @@ -289,10 +292,10 @@ pair to make it easy to sort the hash in insertion order. } For passing filehandles to functions, the easiest way is to -prefer them with a star, as in func(*STDIN). See L<perlfaq7/"Passing +preface them with a star, as in func(*STDIN). See L<perlfaq7/"Passing Filehandles"> for details. -If you want to create many, anonymous handles, you should check out the +If you want to create many anonymous handles, you should check out the Symbol, FileHandle, or IO::Handle (etc.) modules. Here's the equivalent code with Symbol::gensym, which is reasonably light-weight: @@ -303,8 +306,8 @@ code with Symbol::gensym, which is reasonably light-weight: $file{$filename} = [ $i++, $fh ]; } -Or here using the semi-object-oriented FileHandle, which certainly isn't -light-weight: +Or here using the semi-object-oriented FileHandle module, which certainly +isn't light-weight: use FileHandle; @@ -344,7 +347,7 @@ Then use any of those as you would a normal filehandle. Anywhere that Perl is expecting a filehandle, an indirect filehandle may be used instead. An indirect filehandle is just a scalar variable that contains a filehandle. 
Functions like C<print>, C<open>, C<seek>, or -the C<E<lt>FHE<gt>> diamond operator will accept either a real filehandle +the C<E<lt>FHE<gt>> diamond operator will accept either a read filehandle or a scalar variable containing one: ($ifh, $ofh, $efh) = (*STDIN, *STDOUT, *STDERR); @@ -422,7 +425,7 @@ techniques to make it possible for the intrepid hacker. =head2 How can I write() into a string? -See L<perlform> for an swrite() function. +See L<perlform/"Accessing Formatting Internals"> for an swrite() function. =head2 How can I output my numbers with commas added? @@ -430,7 +433,7 @@ This one will do it for you: sub commify { local $_ = shift; - 1 while s/^(-?\d+)(\d{3})/$1,$2/; + 1 while s/^([-+]?\d+)(\d{3})/$1,$2/; return $_; } @@ -441,7 +444,7 @@ This one will do it for you: You can't just: - s/^(-?\d+)(\d{3})/$1,$2/g; + s/^([-+]?\d+)(\d{3})/$1,$2/g; because you have to put the comma in and then recalculate your position. @@ -455,7 +458,7 @@ whatever: my $input = shift; $input = reverse $input; $input =~ s<(\d\d\d)(?=\d)(?!\d*\.)><$1,>g; - return reverse $input; + return scalar reverse $input; } =head2 How can I translate tildes (~) in a filename? @@ -547,7 +550,9 @@ be an atomic operation over NFS. That is, two processes might both successful create or unlink the same file! Therefore O_EXCL isn't so exclusive as you might wish. -=head2 Why do I sometimes get an "Argument list too long" when I use <*>? +See also the new L<perlopentut> if you have it (new for 5.006). + +=head2 Why do I sometimes get an "Argument list too long" when I use E<lt>*E<gt>? The C<E<lt>E<gt>> operator performs a globbing operation (see above). By default glob() forks csh(1) to do the actual glob expansion, but @@ -555,9 +560,9 @@ csh can't handle more than 127 items and so gives the error message C<Argument list too long>. People who installed tcsh as csh won't have this problem, but their users may be surprised by it. 
-To get around this, either do the glob yourself with C<Dirhandle>s and +To get around this, either do the glob yourself with readdir() and patterns, or use a module like Glob::KGlob, one that doesn't use the -shell to do globbing. +shell to do globbing. This is expected to be fixed soon. =head2 Is there a leak/bug in glob()? @@ -576,15 +581,28 @@ trailing null byte on the name to make perl leave it alone: sub safe_filename { local $_ = shift; - return m#^/# - ? "$_\0" - : "./$_\0"; + s#^([^./])#./$1#; + $_ .= "\0"; + return $_; } - $fn = safe_filename("<<<something really wicked "); - open(FH, "> $fn") or "couldn't open $fn: $!"; + $badpath = "<<<something really wicked "; + $fn = safe_filename($badpath"); + open(FH, "> $fn") or "couldn't open $badpath: $!"; + +This assumes that you are using POSIX (portable operating systems +interface) paths. If you are on a closed, non-portable, proprietary +system, you may have to adjust the C<"./"> above. + +It would be a lot clearer to use sysopen(), though: + + use Fcntl; + $badpath = "<<<something really wicked "; + open (FH, $badpath, O_WRONLY | O_CREAT | O_TRUNC) + or die "can't open $badpath: $!"; -You could also use the sysopen() function (see L<perlfunc/sysopen>). +For more information, see also the new L<perlopentut> if you have it +(new for 5.006). =head2 How can I reliably rename a file? @@ -601,7 +619,7 @@ then delete the old one. This isn't really the same semantics as a real rename(), though, which preserves metainformation like permissions, timestamps, inode info, etc. -The newer version of File::Copy export a move() function. +The newer version of File::Copy exports a move() function. =head2 How can I lock a file? @@ -631,9 +649,12 @@ build Perl. See the flock entry of L<perlfunc>, and the F<INSTALL> file in the source distribution for information on building Perl to do this. +For more information on file locking, see also L<perlopentut/"File +Locking"> if you have it (new for 5.006). 
+ =back -=head2 What can't I just open(FH, ">file.lock")? +=head2 Why can't I just open(FH, ">file.lock")? A common bit of code B<NOT TO USE> is this: @@ -649,7 +670,7 @@ atomic test-and-set instruction. In theory, this "ought" to work: except that lamentably, file creation (and deletion) is not atomic over NFS, so this won't work (at least, not every time) over the net. -Various schemes involving involving link() have been suggested, but +Various schemes involving link() have been suggested, but these tend to involve busy-wait, which is also subdesirable. =head2 I still don't get locking. I just want to increment the number in the file. How can I do this? @@ -661,14 +682,15 @@ It's more realistic. Anyway, this is what you can do if you can't help yourself. - use Fcntl; + use Fcntl ':flock'; sysopen(FH, "numfile", O_RDWR|O_CREAT) or die "can't open numfile: $!"; - flock(FH, 2) or die "can't flock numfile: $!"; + flock(FH, LOCK_EX) or die "can't flock numfile: $!"; $num = <FH> || 0; seek(FH, 0, 0) or die "can't rewind numfile: $!"; truncate(FH, 0) or die "can't truncate numfile: $!"; (print FH $num+1, "\n") or die "can't write numfile: $!"; - # DO NOT UNLOCK THIS UNTIL YOU CLOSE + # Perl as of 5.004 automatically flushes before unlocking + flock(FH, LOCK_UN) or die "can't flock numfile: $!"; close FH or die "can't close numfile: $!"; Here's a much better web-page hit counter: @@ -693,7 +715,7 @@ like this: seek(FH, $recno * $RECSIZE, 0); read(FH, $record, $RECSIZE) == $RECSIZE || die "can't read record $recno: $!"; # munge the record - seek(FH, $recno * $RECSIZE, 0); + seek(FH, -$RECSIZE, 1); print FH $record; close FH; @@ -720,12 +742,15 @@ Here's an example: If you prefer something more legible, use the File::stat module (part of the standard distribution in version 5.004 and later): + # error checking left as an exercise for reader. 
use File::stat; use Time::localtime; $date_string = ctime(stat($file)->mtime); print "file $file updated at $date_string\n"; -Error checking is left as an exercise for the reader. +The POSIX::strftime() approach has the benefit of being, +in theory, independent of the current locale. See L<perllocale> +for details. =head2 How do I set a file's timestamp in perl? @@ -741,7 +766,7 @@ of them. ($atime, $mtime) = (stat($timestamp))[8,9]; utime $atime, $mtime, @ARGV; -Error checking is left as an exercise for the reader. +Error checking is, as usual, left as an exercise for the reader. Note that utime() currently doesn't work correctly with Win95/NT ports. A bug has been reported. Check it carefully before using @@ -774,11 +799,14 @@ than the stock version. =head2 How can I read in a file by paragraphs? -Use the C<$\> variable (see L<perlvar> for details). You can either +Use the C<$/> variable (see L<perlvar> for details). You can either set it to C<""> to eliminate empty paragraphs (C<"abc\n\n\n\ndef">, for instance, gets treated as two paragraphs and not three), or C<"\n\n"> to accept empty paragraphs. +Note that a blank line must have no blanks in it. Thus C<"fred\n +\nstuff\n\n"> is one paragraph, but C<"fred\n\nstuff\n\n"> is two. + =head2 How can I read a single character from a file? From the keyboard? You can use the builtin C<getc()> function for most filehandles, but @@ -786,8 +814,9 @@ it won't (easily) work on a terminal device. For STDIN, either use the Term::ReadKey module from CPAN, or use the sample code in L<perlfunc/getc>. -If your system supports POSIX, you can use the following code, which -you'll note turns off echo processing as well. +If your system supports the portable operating system programming +interface (POSIX), you can use the following code, which you'll note +turns off echo processing as well. #!/usr/bin/perl -w use strict; @@ -838,7 +867,8 @@ you'll note turns off echo processing as well. 
END { cooked() } -The Term::ReadKey module from CPAN may be easier to use: +The Term::ReadKey module from CPAN may be easier to use. Recent version +include also support for non-portable systems as well. use Term::ReadKey; open(TTY, "</dev/tty"); @@ -849,7 +879,7 @@ The Term::ReadKey module from CPAN may be easier to use: printf "\nYou said %s, char number %03d\n", $key, ord $key; -For DOS systems, Dan Carson <dbc@tc.fluke.COM> reports the following: +For legacy DOS systems, Dan Carson <dbc@tc.fluke.COM> reports the following: To put the PC in "raw" mode, use ioctl with some magic numbers gleaned from msdos.c (Perl source file) and Ralf Brown's interrupt list (comes @@ -895,11 +925,12 @@ table: This is all trial and error I did a long time ago, I hope I'm reading the file that worked. -=head2 How can I tell if there's a character waiting on a filehandle? +=head2 How can I tell whether there's a character waiting on a filehandle? The very first thing you should do is look into getting the Term::ReadKey -extension from CPAN. It now even has limited support for closed, proprietary -(read: not open systems, not POSIX, not Unix, etc) systems. +extension from CPAN. As we mentioned earlier, it now even has limited +support for non-portable (read: not open systems, closed, proprietary, +not POSIX, not Unix, etc) systems. You should also check out the Frequently Asked Questions list in comp.unix.* for things like this: the answer is essentially the same. @@ -912,12 +943,11 @@ systems: return $nfd = select($rin,undef,undef,0); } -If you want to find out how many characters are waiting, -there's also the FIONREAD ioctl call to be looked at. - -The I<h2ph> tool that comes with Perl tries to convert C include -files to Perl code, which can be C<require>d. FIONREAD ends -up defined as a function in the I<sys/ioctl.ph> file: +If you want to find out how many characters are waiting, there's +also the FIONREAD ioctl call to be looked at. 
The I<h2ph> tool that +comes with Perl tries to convert C include files to Perl code, which +can be C<require>d. FIONREAD ends up defined as a function in the +I<sys/ioctl.ph> file: require 'sys/ioctl.ph'; @@ -939,7 +969,7 @@ Or write a small C program using the editor of champions: printf("%#08x\n", FIONREAD); } ^D - % cc -o fionread fionread + % cc -o fionread fionread.c % ./fionread 0x4004667f @@ -980,6 +1010,8 @@ the clearerr() method, which can remove the end of file condition on a filehandle. The method: read until end of file, clearerr(), read some more. Lather, rinse, repeat. +There's also a File::Tail module from CPAN. + =head2 How do I dup() a filehandle in Perl? If you check L<perlfunc/open>, you'll see that several of the ways @@ -1018,19 +1050,22 @@ Remember that within double quoted strings ("like\this"), the backslash is an escape character. The full list of these is in L<perlop/Quote and Quote-like Operators>. Unsurprisingly, you don't have a file called "c:(tab)emp(formfeed)oo" or -"c:(tab)emp(formfeed)oo.exe" on your DOS filesystem. +"c:(tab)emp(formfeed)oo.exe" on your legacy DOS filesystem. Either single-quote your strings, or (preferably) use forward slashes. Since all DOS and Windows versions since something like MS-DOS 2.0 or so have treated C</> and C<\> the same in a path, you might as well use the one that doesn't clash with Perl -- or the POSIX shell, ANSI C and C++, -awk, Tcl, Java, or Python, just to mention a few. +awk, Tcl, Java, or Python, just to mention a few. POSIX paths +are more portable, too. =head2 Why doesn't glob("*.*") get all the files? Because even on non-Unix ports, Perl's glob function follows standard Unix globbing semantics. You'll need C<glob("*")> to get all (non-hidden) -files. This makes glob() portable. +files. This makes glob() portable even to legacy systems. Your +port may include proprietary globbing functions as well. Check its +documentation for details. =head2 Why does Perl let me delete read-only files? 
Why does C<-i> clobber protected files? Isn't this a bug in Perl? @@ -1057,9 +1092,32 @@ This has a significant advantage in space over reading the whole file in. A simple proof by induction is available upon request if you doubt its correctness. +=head2 Why do I get weird spaces when I print an array of lines? + +Saying + + print "@lines\n"; + +joins together the elements of C<@lines> with a space between them. +If C<@lines> were C<("little", "fluffy", "clouds")> then the above +statement would print: + + little fluffy clouds + +but if each element of C<@lines> was a line of text, ending a newline +character C<("little\n", "fluffy\n", "clouds\n")> then it would print: + + little + fluffy + clouds + +If your array contains lines, just print them: + + print @lines; + =head1 AUTHOR AND COPYRIGHT -Copyright (c) 1997, 1998 Tom Christiansen and Nathan Torkington. +Copyright (c) 1997-1999 Tom Christiansen and Nathan Torkington. All rights reserved. When included as an integrated part of the Standard Distribution @@ -1072,3 +1130,4 @@ domain. You are permitted and encouraged to use this code and any derivatives thereof in your own programs for fun or for profit as you see fit. A simple comment in the code giving credit to the FAQ would be courteous but is not required. + diff --git a/pod/perlfaq6.pod b/pod/perlfaq6.pod index 488a27c83a..834fd89aa1 100644 --- a/pod/perlfaq6.pod +++ b/pod/perlfaq6.pod @@ -1,6 +1,6 @@ =head1 NAME -perlfaq6 - Regexps ($Revision: 1.22 $, $Date: 1998/07/16 14:01:07 $) +perlfaq6 - Regexps ($Revision: 1.25 $, $Date: 1999/01/08 04:50:47 $) =head1 DESCRIPTION @@ -128,7 +128,7 @@ L<perlop>): If you wanted text and not lines, you would use - perl -0777 -pe 'print "$1\n" while /START(.*?)END/gs' file1 file2 ... + perl -0777 -ne 'print "$1\n" while /START(.*?)END/gs' file1 file2 ... 
But if you want nested occurrences of C<START> through C<END>, you'll run up against the problem described in the question in this section @@ -387,48 +387,31 @@ See the module String::Approx available from CPAN. =head2 How do I efficiently match many regular expressions at once? -The following is super-inefficient: +The following is extremely inefficient: - while (<FH>) { - foreach $pat (@patterns) { - if ( /$pat/ ) { - # do something - } - } - } - -Instead, you either need to use one of the experimental Regexp extension -modules from CPAN (which might well be overkill for your purposes), -or else put together something like this, inspired from a routine -in Jeffrey Friedl's book: - - sub _bm_build { - my $condition = shift; - my @regexp = @_; # this MUST not be local(); need my() - my $expr = join $condition => map { "m/\$regexp[$_]/o" } (0..$#regexp); - my $match_func = eval "sub { $expr }"; - die if $@; # propagate $@; this shouldn't happen! - return $match_func; - } - - sub bm_and { _bm_build('&&', @_) } - sub bm_or { _bm_build('||', @_) } - - $f1 = bm_and qw{ - xterm - (?i)window - }; - - $f2 = bm_or qw{ - \b[Ff]ree\b - \bBSD\B - (?i)sys(tem)?\s*[V5]\b - }; - - # feed me /etc/termcap, prolly - while ( <> ) { - print "1: $_" if &$f1; - print "2: $_" if &$f2; + # slow but obvious way + @popstates = qw(CO ON MI WI MN); + while (defined($line = <>)) { + for $state (@popstates) { + if ($line =~ /\b$state\b/i) { + print $line; + last; + } + } + } + +That's because Perl has to recompile all those patterns for each of +the lines of the file. As of the 5.005 release, there's a much better +approach, one which makes use of the new C<qr//> operator: + + # use spiffy new qr// operator, with /i flag even + use 5.005; + @popstates = qw(CO ON MI WI MN); + @poppats = map { qr/\b$_\b/i } @popstates; + while (defined($line = <>)) { + for $patobj (@poppats) { + print $line if $line =~ /$patobj/; + } } =head2 Why don't word-boundary searches with C<\b> work for me? 
@@ -460,22 +443,24 @@ not "this" or "island". =head2 Why does using $&, $`, or $' slow my program down? -Because once Perl sees that you need one of these variables anywhere -in the program, it has to provide them on each and every pattern -match. The same mechanism that handles these provides for the use of -$1, $2, etc., so you pay the same price for each regexp that contains -capturing parentheses. But if you never use $&, etc., in your script, -then regexps I<without> capturing parentheses won't be penalized. So -avoid $&, $', and $` if you can, but if you can't (and some algorithms -really appreciate them), once you've used them once, use them at will, -because you've already paid the price. +Because once Perl sees that you need one of these variables anywhere in +the program, it has to provide them on each and every pattern match. +The same mechanism that handles these provides for the use of $1, $2, +etc., so you pay the same price for each regexp that contains capturing +parentheses. But if you never use $&, etc., in your script, then regexps +I<without> capturing parentheses won't be penalized. So avoid $&, $', +and $` if you can, but if you can't, once you've used them at all, use +them at will because you've already paid the price. Remember that some +algorithms really appreciate them. As of the 5.005 release. the $& +variable is no longer "expensive" the way the other two are. =head2 What good is C<\G> in a regular expression? The notation C<\G> is used in a match or substitution in conjunction the C</g> modifier (and ignored if there's no C</g>) to anchor the regular expression to the point just past where the last match occurred, i.e. the -pos() point. +pos() point. A failed match resets the position of C<\G> unless the +C</c> modifier is in effect. 
For example, suppose you had a line of text quoted in standard mail and Usenet notation, (that is, with leading C<E<gt>> characters), and @@ -596,20 +581,41 @@ Or like this: Or like this: - die "sorry, Perl doesn't (yet) have Martian support )-:\n"; - -In addition, a sample program which converts half-width to full-width -katakana (in Shift-JIS or EUC encoding) is available from CPAN as - -=for Tom make it so + die "sorry, Perl doesn't (yet) have Martian support )-:\n"; There are many double- (and multi-) byte encodings commonly used these days. Some versions of these have 1-, 2-, 3-, and 4-byte characters, all mixed. +=head2 How do I match a pattern that is supplied by the user? + +Well, if it's really a pattern, then just use + + chomp($pattern = <STDIN>); + if ($line =~ /$pattern/) { } + +Or, since you have no guarantee that your user entered +a valid regular expression, trap the exception this way: + + if (eval { $line =~ /$pattern/ }) { } + +But if all you really want to search for a string, not a pattern, +then you should either use the index() function, which is made for +string searching, or if you can't be disabused of using a pattern +match on a non-pattern, then be sure to use C<\Q>...C<\E>, documented +in L<perlre>. + + $pattern = <STDIN>; + + open (FILE, $input) or die "Couldn't open input $input: $!; aborting"; + while (<FILE>) { + print if /\Q$pattern\E/; + } + close FILE; + =head1 AUTHOR AND COPYRIGHT -Copyright (c) 1997, 1998 Tom Christiansen and Nathan Torkington. +Copyright (c) 1997-1999 Tom Christiansen and Nathan Torkington. All rights reserved. When included as part of the Standard Version of Perl, or as part of @@ -624,3 +630,4 @@ are hereby placed into the public domain. You are permitted and encouraged to use this code in your own programs for fun or for profit as you see fit. A simple comment in the code giving credit would be courteous but is not required. 
+ diff --git a/pod/perlfaq7.pod b/pod/perlfaq7.pod index cb7f3c027a..5794bfe372 100644 --- a/pod/perlfaq7.pod +++ b/pod/perlfaq7.pod @@ -1,6 +1,6 @@ =head1 NAME -perlfaq7 - Perl Language Issues ($Revision: 1.21 $, $Date: 1998/06/22 15:20:07 $) +perlfaq7 - Perl Language Issues ($Revision: 1.24 $, $Date: 1999/01/08 05:32:11 $) =head1 DESCRIPTION @@ -180,7 +180,7 @@ own module. Make sure to change the names appropriately. # if using RCS/CVS, this next line may be preferred, # but beware two-digit versions. - $VERSION = do{my@r=q$Revision: 1.21 $=~/\d+/g;sprintf '%d.'.'%02d'x$#r,@r}; + $VERSION = do{my@r=q$Revision: 1.24 $=~/\d+/g;sprintf '%d.'.'%02d'x$#r,@r}; @ISA = qw(Exporter); @EXPORT = qw(&func1 &func2 &func3); @@ -229,6 +229,10 @@ own module. Make sure to change the names appropriately. 1; # modules must return true +The h2xs program will create stubs for all the important stuff for you: + + % h2xs -XA -n My::Module + =head2 How do I create a class? See L<perltoot> for an introduction to classes and objects, as well as @@ -344,7 +348,7 @@ reference to an existing or anonymous variable or function: func( \$some_scalar ); - func( \$some_array ); + func( \@some_array ); func( [ 1 .. 10 ] ); func( \%some_hash ); @@ -392,7 +396,7 @@ If you're planning on generating new filehandles, you could do this: To pass regexps around, you'll need to either use one of the highly experimental regular expression modules from CPAN (Nick Ing-Simmons's Regexp or Ilya Zakharevich's Devel::Regexp), pass around strings -and use an exception-trapping eval, or else be be very, very clever. +and use an exception-trapping eval, or else be very, very clever. Here's an example of how to pass in a string to be regexp compared: sub compare($$) { @@ -563,7 +567,7 @@ However, dynamic variables (aka global, local, or package variables) are effectively shallowly bound. Consider this just one more reason not to use them. See the answer to L<"What's a closure?">. 
-=head2 Why doesn't "my($foo) = <FILE>;" work right? +=head2 Why doesn't "my($foo) = E<lt>FILEE<gt>;" work right? C<my()> and C<local()> give list context to the right hand side of C<=>. The E<lt>FHE<gt> read operation, like so many of Perl's @@ -797,9 +801,34 @@ This can't go just anywhere. You have to put a pod directive where the parser is expecting a new statement, not just in the middle of an expression or some other arbitrary yacc grammar production. +=head2 How do I clear a package? + +Use this code, provided by Mark-Jason Dominus: + + sub scrub_package { + no strict 'refs'; + my $pack = shift; + die "Shouldn't delete main package" + if $pack eq "" || $pack eq "main"; + my $stash = *{$pack . '::'}{HASH}; + my $name; + foreach $name (keys %$stash) { + my $fullname = $pack . '::' . $name; + # Get rid of everything with that name. + undef $$fullname; + undef @$fullname; + undef %$fullname; + undef &$fullname; + undef *$fullname; + } + } + +Or, if you're using a recent release of Perl, you can +just use the Symbol::delete_package() function instead. + =head1 AUTHOR AND COPYRIGHT -Copyright (c) 1997, 1998 Tom Christiansen and Nathan Torkington. +Copyright (c) 1997-1999 Tom Christiansen and Nathan Torkington. All rights reserved. When included as part of the Standard Version of Perl, or as part of @@ -814,3 +843,4 @@ are hereby placed into the public domain. You are permitted and encouraged to use this code in your own programs for fun or for profit as you see fit. A simple comment in the code giving credit would be courteous but is not required. 
+
diff --git a/pod/perlfaq8.pod b/pod/perlfaq8.pod
index cbc87b5fd7..7b3ac3edf4 100644
--- a/pod/perlfaq8.pod
+++ b/pod/perlfaq8.pod
@@ -1,6 +1,6 @@
 =head1 NAME
-perlfaq8 - System Interaction ($Revision: 1.26 $, $Date: 1998/08/05 12:20:28 $)
+perlfaq8 - System Interaction ($Revision: 1.36 $, $Date: 1999/01/08 05:36:34 $)
 =head1 DESCRIPTION
@@ -325,7 +325,6 @@ go bump in the night, finally came up with this: } }
-
 =head2 How do I decode encrypted password files?
 You spend lots and lots of money on dedicated hardware, but this is
@@ -449,12 +448,12 @@ http://www.perl.com/CPAN/doc/misc/ancient/tutorial/eg/itimers.pl .
 =head2 How can I measure time under a second?
-The Time::HiRes module (available from CPAN) provides this
-functionality for some systems.
+In general, you may not be able to. The Time::HiRes module (available
+from CPAN) provides this functionality for some systems.
-In general, you may not be able to. But if your system supports both the
-syscall() function in Perl as well as a system call like gettimeofday(2),
-then you may be able to do something like this:
+If your system supports both the syscall() function in Perl as well as
+a system call like gettimeofday(2), then you may be able to do
+something like this:
 require 'sys/syscall.ph';
@@ -462,7 +461,7 @@ then you may be able to do something like this: $done = $start = pack($TIMEVAL_T, ());
-    syscall( &SYS_gettimeofday, $start, 0)) != -1
+    syscall( &SYS_gettimeofday, $start, 0) != -1
 or die "gettimeofday: $!"; ##########################
@@ -674,19 +673,26 @@ there, and the old standard error shows up on the old standard out.
 =head2 Why doesn't open() return an error when a pipe open fails?
-It does, but probably not how you expect it to. On systems that
-follow the standard fork()/exec() paradigm (such as Unix), it works like
-this: open() causes a fork(). In the parent, open() returns with the
-process ID of the child. The child exec()s the command to be piped
-to/from. The parent can't know whether the exec() was successful or
-not - all it can return is whether the fork() succeeded or not. To
-find out if the command succeeded, you have to catch SIGCHLD and
-wait() to get the exit status. You should also catch SIGPIPE if
-you're writing to the child -- you may not have found out the exec()
+Because the pipe open takes place in two steps: first Perl calls
+fork() to start a new process, then this new process calls exec() to
+run the program you really wanted to open. The first step reports
+success or failure to your process, so open() can only tell you
+whether the fork() succeeded or not.
+
+To find out if the exec() step succeeded, you have to catch SIGCHLD
+and wait() to get the exit status. You should also catch SIGPIPE if
+you're writing to the child--you may not have found out the exec()
 failed by the time you write. This is documented in L<perlipc>.
+In some cases, even this won't work. If the second argument to a
+piped open() contains shell metacharacters, perl fork()s, then exec()s
+a shell to decode the metacharacters and eventually run the desired
+program. Now when you call wait(), you only learn whether or not the
+I<shell> could be successfully started. Best to avoid shell
+metacharacters.
+
 On systems that follow the spawn() paradigm, open() I<might> do what
-you expect - unless perl uses a shell to start your command. In this
+you expect--unless perl uses a shell to start your command. In this
 case the fork()/exec() description still applies.
 =head2 What's wrong with using backticks in a void context?
@@ -869,7 +875,7 @@ module for other solutions.
 =item *
-Open /dev/tty and use the the TIOCNOTTY ioctl on it. See L<tty(4)>
+Open /dev/tty and use the TIOCNOTTY ioctl on it. See L<tty(4)>
 for details. Or better yet, you can just use the POSIX::setsid() function, so you don't have to worry about process groups.
@@ -908,7 +914,7 @@ the current process group of your controlling terminal as follows: use POSIX qw/getpgrp tcgetpgrp/; open(TTY, "/dev/tty") or die $!;
-    $tpgrp = tcgetpgrp(TTY);
+    $tpgrp = tcgetpgrp(fileno(*TTY));
 $pgrp = getpgrp(); if ($tpgrp == $pgrp) { print "foreground\n";
@@ -1034,6 +1040,13 @@ scripts that use the modules/libraries (see L<perlrun>) or say use lib '/u/mydir/perl';
+This is almost the same as:
+
+    BEGIN {
+        unshift(@INC, '/u/mydir/perl');
+    }
+
+except that the lib module checks for machine-dependent subdirectories.
 See Perl's L<lib> for more information.
 =head2 How do I add the directory my program lives in to the module/library search path?
@@ -1056,9 +1069,15 @@ The latter is particularly useful because it knows about machine dependent architectures. The lib.pm pragmatic module was first included with the 5.002 release of Perl.
+=head2 What is socket.ph and where do I get it?
+
+It's a perl4-style file defining values for system networking
+constants. Sometimes it is built using h2ph when Perl is installed,
+but other times it is not. Modern programs C<use Socket;> instead.
+
 =head1 AUTHOR AND COPYRIGHT
-Copyright (c) 1997, 1998 Tom Christiansen and Nathan Torkington.
+Copyright (c) 1997-1999 Tom Christiansen and Nathan Torkington.
 All rights reserved. When included as part of the Standard Version of Perl, or as part of
@@ -1073,3 +1092,4 @@ are hereby placed into the public domain. You are permitted and encouraged to use this code in your own programs for fun or for profit as you see fit. A simple comment in the code giving credit would be courteous but is not required.
+
diff --git a/pod/perlfaq9.pod b/pod/perlfaq9.pod
index 330158b77b..46c487bea3 100644
--- a/pod/perlfaq9.pod
+++ b/pod/perlfaq9.pod
@@ -1,6 +1,6 @@
 =head1 NAME
-perlfaq9 - Networking ($Revision: 1.20 $, $Date: 1998/06/22 18:31:09 $)
+perlfaq9 - Networking ($Revision: 1.24 $, $Date: 1999/01/08 05:39:48 $)
 =head1 DESCRIPTION
@@ -77,8 +77,7 @@ stamp prepended.
 =head2 How do I remove HTML from a string?
 The most correct way (albeit not the fastest) is to use HTML::Parse
-from CPAN (part of the libwww-perl distribution, which is a must-have
-module for all web hackers).
+from CPAN (part of the HTML-Tree package on CPAN).
 Many folks attempt a simple-minded regular expression approach, like C<s/E<lt>.*?E<gt>//g>, but that fails in many cases because the tags
@@ -172,6 +171,7 @@ do this. They work through proxies, and don't require lynx: getprint "http://www.sn.no/libwww-perl/"; # or print ASCII from HTML from a URL
+    # also need HTML-Tree package from CPAN
 use LWP::Simple; use HTML::Parse; use HTML::FormatText;
@@ -303,7 +303,7 @@ In short, they're bad hacks. Resist them at all costs. Please do not be tempted to reinvent the wheel. Instead, use the CGI.pm or CGI_Lite.pm (available from CPAN), or if you're trapped in the module-free land of perl1 .. perl4, you might look into cgi-lib.pl (available from
-http://www.bio.cam.ac.uk/web/form.html).
+http://cgi-lib.stanford.edu/cgi-lib/ ).
 Make sure you know whether to use a GET or a POST in your form. GETs should only be used for something that doesn't update the server.
@@ -411,7 +411,8 @@ Use the C<sendmail> program directly: To: Final Destination <you\@otherhost> Subject: A relevant subject line
-    Body of the message goes here, in as many lines as you like.
+    Body of the message goes here after the blank line
+    in as many lines as you like.
 EOF close(SENDMAIL) or warn "sendmail didn't close nicely";
@@ -442,9 +443,8 @@ include queueing, MX records, and security.
 =head2 How do I read mail?
-Use the Mail::Folder module from CPAN
-(part of the MailFolder package) or the Mail::Internet module from
-CPAN (also part of the MailTools package).
+Use the Mail::Folder module from CPAN (part of the MailFolder package) or
+the Mail::Internet module from CPAN (also part of the MailTools package).
 # sending mail use Mail::Internet;
@@ -504,7 +504,7 @@ give you the hostname after which you can find out the IP address use Socket; use Sys::Hostname; my $host = hostname();
-    my $addr = inet_ntoa(scalar(gethostbyname($name)) || 'localhost');
+    my $addr = inet_ntoa(scalar gethostbyname($host || 'localhost'));
 Probably the simplest way to learn your DNS domain name is to grok it out of /etc/resolv.conf, at least under Unix. Of course, this
@@ -531,11 +531,12 @@ available from CPAN) is more complex but can put as well as fetch. A DCE::RPC module is being developed (but is not yet available), and will be released as part of the DCE-Perl package (available from
-CPAN). No ONC::RPC module is known.
+CPAN). The rpcgen suite, available from CPAN/authors/id/JAKE/, is
+an RPC stub generator and includes an RPC::ONC module.
 =head1 AUTHOR AND COPYRIGHT
-Copyright (c) 1997, 1998 Tom Christiansen and Nathan Torkington.
+Copyright (c) 1997-1999 Tom Christiansen and Nathan Torkington.
 All rights reserved. When included as part of the Standard Version of Perl, or as part of
@@ -550,3 +551,4 @@ are hereby placed into the public domain. You are permitted and encouraged to use this code in your own programs for fun or for profit as you see fit. A simple comment in the code giving credit would be courteous but is not required.
+
diff --git a/pod/perlfunc.pod b/pod/perlfunc.pod
index 0f8a0609c5..f9bd2c56ad 100644
--- a/pod/perlfunc.pod
+++ b/pod/perlfunc.pod
@@ -135,10 +135,8 @@ C<unlink>, C<utime>
 =item Keywords related to the control flow of your perl program
-C<caller>, C<continue>, C<die>, C<do>, C<dump>, C<else>, C<elsif>,
-C<eval>, C<exit>, C<for>, C<foreach>, C<goto>, C<if>, C<last>,
-C<next>, C<redo>, C<return>, C<sub>, C<unless>, C<wantarray>,
-C<while>, C<until>
+C<caller>, C<continue>, C<die>, C<do>, C<dump>, C<eval>, C<exit>,
+C<goto>, C<last>, C<next>, C<redo>, C<return>, C<sub>, C<wantarray>
 =item Keywords related to scoping
@@ -680,14 +678,14 @@ L<perlipc/"Sockets: Client/Server Communication">.
 =item continue BLOCK
 Actually a flow control statement rather than a function. If there is a
-C<continue> BLOCK attached to a BLOCK (typically in a L</while> or
-L</foreach>), it is always executed just before the conditional is about to
-be evaluated again, just like the third part of a L</for> loop in C. Thus
+C<continue> BLOCK attached to a BLOCK (typically in a C<while> or
+C<foreach>), it is always executed just before the conditional is about to
+be evaluated again, just like the third part of a C<for> loop in C. Thus
 it can be used to increment a loop variable, even when the loop has been continued via the C<next> statement (which is similar to the C C<continue> statement).
-L</last>, L</next>, or L</redo> may appear within a C<continue>
+C<last>, C<next>, or C<redo> may appear within a C<continue>
 block. C<last> and C<redo> will behave as if they had been executed within the main block. So will C<next>, but since it will execute a C<continue> block, it may be more entertaining.
@@ -706,8 +704,6 @@ Omitting the C<continue> section is semantically equivalent to using an empty one, logically enough. In that case, C<next> goes directly back to check the condition at the top of the loop.
-See also L<perlsyn>.
-
 =item cos EXPR
 Returns the cosine of EXPR (expressed in radians). If EXPR is omitted,
@@ -953,12 +949,11 @@ as the first line of the handler (see L<perlvar/$^S>). Not really a function. Returns the value of the last command in the sequence of commands indicated by BLOCK. When modified by a loop
-modifier such as L</while> or L</until>, executes the BLOCK once
-before testing the loop condition. (On other statements the loop
-modifiers test the conditional first.)
+modifier, executes the BLOCK once before testing the loop condition.
+(On other statements the loop modifiers test the conditional first.)
 C<do BLOCK> does I<not> count as a loop, so the loop control statements
-L</next>, L</last> or L</redo> cannot be used to leave or restart the block.
+C<next>, C<last> or C<redo> cannot be used to leave or restart the block.
 =item do SUBROUTINE(LIST)
@@ -1077,12 +1072,6 @@ only in a different order: See also C<keys()>, C<values()> and C<sort()>.
-=item else BLOCK
-
-=item elsif (EXPR) BLOCK
-
-See L</if>.
-
 =item eof FILEHANDLE
 =item eof ()
@@ -1454,38 +1443,6 @@ Here's a mailbox appender for BSD systems. See also L<DB_File> for other flock() examples.
-=item for (INITIAL; WHILE; EACH) BLOCK
-
-Do INITIAL, enter BLOCK while EXPR is true, at the end of each round
-do EACH. For example:
-
-    for ($i = 0, $j = 0; $i < 10; $i++) {
-        if ($i % 3 == 0) { $j++ }
-        print "i = $i, j = $j\n";
-    }
-
-See L<perlsyn> for more details. See also L</foreach>, a twin of
-C<for>, L</while> and L</until>, close cousins of L<for>, and
-L</last>, L</next>, and L</redo> for additional control flow.
-
-=item foreach LOOPVAR (LIST) BLOCK
-
-Enter BLOCK as LOOPVAR set in turn to each element of LIST.
-For example:
-
-    foreach $rolling (@stones) { print "$rolling stone\n" }
-
-    foreach my $file (@files) { print "file $file\n" }
-
-The LOOPVAR is optional and defaults to C<$_>. If the elements are
-modifiable (as opposed to constants or tied variables) you can modify them.
-
-    foreach (@words) { tr/abc/xyz/ }
-
-See L<perlsyn> for more details. See also L</for>, a twin of
-C<foreach>, L</while> and L</until>, close cousins of L<for>, and
-L</last>, L</next>, and L</redo> for additional control flow.
-
 =item fork
 Does a fork(2) system call. Returns the child pid to the parent process,
@@ -1592,7 +1549,7 @@ is left as an exercise to the reader. The C<POSIX::getattr()> function can do this more portably on systems purporting POSIX compliance. See also the C<Term::ReadKey> module from your nearest CPAN site;
-details on CPAN can be found on L<perlmod/CPAN>.
+details on CPAN can be found on L<perlmodlib/CPAN>.
 =item getlogin
@@ -1892,24 +1849,6 @@ see L</oct>.) If EXPR is omitted, uses C<$_>. print hex '0xAf'; # prints '175' print hex 'aF'; # same
-=item if (EXPR) BLOCK
-
-=item if (EXPR) BLOCK else BLOCK2
-
-=item if (EXPR) BLOCK elsif (EXPR2) BLOCK2
-
-Enter BLOCKs conditionally. The first EXPR to return true
-causes the corresponding BLOCK to be entered, or, in the case
-of C<else>, the fall-through default BLOCK.
-
-Note 1: Perl wants BLOCKS, expressions won't do (like they do
-e.g. in C, C++, Java, Pascal).
-
-Note 2: It's C<elsif>, not C<elseif>. You can have as many
-C<elsif>s as you want.
-
-See L<perlsyn> for more details. See also C<unless>.
-
 =item import
 There is no builtin C<import()> function. It is just an ordinary
@@ -2075,10 +2014,8 @@ C<continue> block, if any, is not executed: C<last> cannot be used to exit a block which returns a value such as C<eval {}>, C<sub {}> or C<do {}>.
-See also L</continue> for an illustration of how C<last>, L</next>, and
-L</redo> work.
-
-See also L<perlsyn>.
+See also L</continue> for an illustration of how C<last>, C<next>, and
+C<redo> work.
 =item lc EXPR
@@ -2288,10 +2225,8 @@ refers to the innermost enclosing loop. C<next> cannot be used to exit a block which returns a value such as C<eval {}>, C<sub {}> or C<do {}>.
-See also L</continue> for an illustration of how L</last>, C<next>, and
-L</redo> work.
-
-See also L<perlsyn>.
+See also L</continue> for an illustration of how C<last>, C<next>, and
+C<redo> work.
 =item no Module LIST
@@ -2302,8 +2237,9 @@ See the L</use> function, which C<no> is the opposite of.
 =item oct
 Interprets EXPR as an octal string and returns the corresponding
-value. (If EXPR happens to start off with C<0x>, interprets it as
-a hex string instead.) The following will handle decimal, octal, and
+value. (If EXPR happens to start off with C<0x>, interprets it as a
+hex string. If EXPR starts off with C<0b>, it is interpreted as a
+binary string.) The following will handle decimal, binary, octal, and
 hex in the standard Perl or C notation: $val = oct($val) if $val =~ /^0/;
@@ -2527,7 +2463,7 @@ them, and automatically close whenever and however you leave that scope: $first; # Or here. }
-See L</seek()> for some details about mixing reading and writing.
+See L</seek> for some details about mixing reading and writing.
 =item opendir DIRHANDLE,EXPR
@@ -2912,7 +2848,7 @@ See L<perlipc/"UDP: Message Passing"> for examples.
 =item redo
 The C<redo> command restarts the loop block without evaluating the
-conditional again. The L</continue> block, if any, is not executed. If
+conditional again. The C<continue> block, if any, is not executed. If
 the LABEL is omitted, the command refers to the innermost enclosing loop. This command is normally used by programs that want to lie to themselves about what was just input:
@@ -2937,11 +2873,9 @@ themselves about what was just input: C<redo> cannot be used to retry a block which returns a value such as C<eval {}>, C<sub {}> or C<do {}>.
-See also L</continue> for an illustration of how L</last>, L</next>, and
+See also L</continue> for an illustration of how C<last>, C<next>, and
 C<redo> work.
-See also L<perlsyn>.
-
 =item ref EXPR
 =item ref
@@ -3411,7 +3345,7 @@ busy multitasking system. For delays of finer granularity than one second, you may use Perl's C<syscall()> interface to access setitimer(2) if your system supports it,
-or else see L</select()> above.
+or else see L</select> above.
 See also the POSIX module's C<sigpause()> function.
@@ -3712,6 +3646,7 @@ In addition, Perl permits the following widely-supported conversions:
    %X   like %x, but using upper-case letters
    %E   like %e, but using an upper-case "E"
    %G   like %g, but with an upper-case "E" (if applicable)
+   %b   an unsigned integer, in binary
    %p   a pointer (outputs the Perl value's address in hexadecimal)
    %n   special: *stores* the number of characters output so far into the next variable in the parameter list
@@ -4222,7 +4157,7 @@ Unlike C<dbmopen()>, the C<tie()> function will not use or require a module for you--you need to do that explicitly yourself. See L<DB_File> or the F<Config> module for interesting C<tie()> implementations.
-For further details see L<perltie>, L<tied VARIABLE>.
+For further details see L<perltie>, L<"tied VARIABLE">.
 =item tied VARIABLE
@@ -4290,7 +4225,7 @@ If EXPR is omitted, merely returns the current umask. The Unix permission C<rwxr-x---> is represented as three sets of three bits, or three octal digits: C<0750> (the leading 0 indicates octal
-and isn't one of the the digits). The C<umask> value is such a number
+and isn't one of the digits). The C<umask> value is such a number
 representing disabled permissions bits. The permission (or "mode") values you pass C<mkdir> or C<sysopen> are modified by your umask, so even if you tell C<sysopen> to create a file with permissions C<0777>,
@@ -4318,6 +4253,8 @@ not trying to restrict access for yourself, returns C<undef>. Remember that a umask is a number, usually given in octal; it is I<not> a string of octal digits. See also L</oct>, if all you have is a string.
+
+
 =item undef EXPR
 =item undef
@@ -4344,13 +4281,6 @@ parameter. Examples: Note that this is a unary operator, not a list operator.
-=item unless (EXPR) BLOCK
-
-The negative counterpart of L</if>. If the EXPR returns false the
-BLOCK is entered.
-
-See also L<perlsyn>.
-
 =item unlink LIST
 =item unlink
@@ -4400,6 +4330,10 @@ The following efficiently counts the number of set bits in a bit vector: $setbits = unpack("%32b*", $selectmask);
+=item untie VARIABLE
+
+Breaks the binding between a variable and a package. (See C<tie()>.)
+
 =item unshift ARRAY,LIST
 Does the opposite of a C<shift()>. Or the opposite of a C<push()>,
@@ -4412,21 +4346,6 @@ Note the LIST is prepended whole, not one element at a time, so the prepended elements stay in the same order. Use C<reverse()> to do the reverse.
-=item until (EXPR) BLOCK
-
-=item do BLOCK until (EXPR)
-
-Enter BLOCK until EXPR returns false. The first form may avoid entering
-the BLOCK, the second form enters the BLOCK at least once.
-
-See L</do>, L</while>, and L</for>.
-
-See also L<perlsyn>.
-
-=item untie VARIABLE
-
-Breaks the binding between a variable and a package. (See C<tie()>.)
-
 =item use Module LIST
 =item use Module
@@ -4646,15 +4565,6 @@ warnings (even the so-called mandatory ones). An example: See L<perlvar> for details on setting C<%SIG> entries, and for more examples.
-=item while (EXPR) BLOCK
-
-=item do BLOCK while (EXPR)
-
-Enter BLOCK while EXPR is true. The first form may avoid entering the
-BLOCK, the second form enters the BLOCK at least once.
-
-See also L<perlsyn>, L</for>, L</until>, and L</continue>.
-
 =item write FILEHANDLE
 =item write EXPR
diff --git a/pod/perlguts.pod b/pod/perlguts.pod
index 38d75691f2..9b16a8a026 100644
--- a/pod/perlguts.pod
+++ b/pod/perlguts.pod
@@ -1025,13 +1025,13 @@ There is a way to achieve a similar task from C via Perl API: create a I<pseudo-block>, and arrange for some changes to be automatically undone at the end of it, either explicit, or via a non-local exit (via die()). A I<block>-like construct is created by a pair of
-C<ENTER>/C<LEAVE> macros (see L<perlcall/EXAMPLE/"Returning a
-Scalar">). Such a construct may be created specially for some
-important localized task, or an existing one (like boundaries of
-enclosing Perl subroutine/block, or an existing pair for freeing TMPs)
-may be used. (In the second case the overhead of additional
-localization must be almost negligible.) Note that any XSUB is
-automatically enclosed in an C<ENTER>/C<LEAVE> pair.
+C<ENTER>/C<LEAVE> macros (see L<perlcall/"Returning a Scalar">).
+Such a construct may be created specially for some important localized
+task, or an existing one (like boundaries of enclosing Perl
+subroutine/block, or an existing pair for freeing TMPs) may be
+used. (In the second case the overhead of additional localization must
+be almost negligible.) Note that any XSUB is automatically enclosed in
+an C<ENTER>/C<LEAVE> pair.
 Inside such a I<pseudo-block> the following service is available:
@@ -1503,7 +1503,7 @@ It is strongly recommended that all Perl API functions that don't begin with C<perl> be referenced with an explicit C<Perl_> prefix. The sort order of the listing is case insensitive, with any
-occurrences of '_' ignored for the the purpose of sorting.
+occurrences of '_' ignored for the purpose of sorting.
 =over 8
@@ -2171,6 +2171,15 @@ Do magic after a value is assigned to the SV. See C<sv_magic>. int mg_set (SV* sv)
+=item modglobal
+
+C<modglobal> is a general purpose, interpreter global HV for use by
+extensions. While it could hold extension specific information, it is
+meant primarily for information that needs to be shared between
+extensions. Moreover, while it could be used for any kind of
+information, it is meant for information that should be not accessible
+in the usual way from the perl symbol table.
+
 =item Move
 The XSUB-writer's interface to the C C<memmove> function. The C<s> is the
diff --git a/pod/perllocale.pod b/pod/perllocale.pod
index ba93f18edd..dba15feffe 100644
--- a/pod/perllocale.pod
+++ b/pod/perllocale.pod
@@ -330,7 +330,7 @@ Second, if using the listed commands you see something B<exactly> (prefix matches do not count and case usually counts) like "En_US" without the quotes, then you should be okay because you are using a locale name that should be installed and available in your system.
-In this case, see L<Fixing system locale configuration>.
+In this case, see L<Permanently fixing system locale configuration>.
 =head2 Permanently fixing your locale configuration
@@ -349,7 +349,7 @@ rules for matching locale names are a bit vague because standardization is weak in this area. See again the L<Finding locales> about general rules.
-=head2 Permanently fixing system locale configuration
+=head2 Fixing system locale configuration
 Contact a system administrator (preferably your own) and report the exact error message you get, and ask them to read this same documentation you
@@ -855,7 +855,7 @@ always in force, even if the program environment suggested otherwise (see L<The setlocale function>). By default, Perl still behaves this way for backward compatibility. If you want a Perl application to pay attention to locale information, you B<must> use the S<C<use locale>>
-pragma (see L<The use locale Pragma>) to instruct it to do so.
+pragma (see L<The use locale pragma>) to instruct it to do so.
 Versions of Perl from 5.002 to 5.003 did use the C<LC_CTYPE> information if available; that is, C<\w> did understand what
diff --git a/pod/perlmodlib.pod b/pod/perlmodlib.pod
index 5d0e5b048a..f10d04bb4a 100644
--- a/pod/perlmodlib.pod
+++ b/pod/perlmodlib.pod
@@ -21,7 +21,7 @@ bulletproof. They work somewhat like pragmas in that they tend to affect the compilation of your program, and thus will usually work well only when used within a
-C<use>, or C<no>.
Most of these are lexically scoped, so an inner BLOCK
 may countermand any of these by saying: no integer;
diff --git a/pod/perlobj.pod b/pod/perlobj.pod
index f10fbdfe2e..182e3ee830 100644
--- a/pod/perlobj.pod
+++ b/pod/perlobj.pod
@@ -84,7 +84,7 @@ that wish to call methods in the class as part of the construction: } If you care about inheritance (and you should; see
-L<perlmod/"Modules: Creation, Use, and Abuse">),
+L<perlmodlib/"Modules: Creation, Use, and Abuse">),
 then you want to use the two-arg form of bless so that your constructors may be inherited:
diff --git a/pod/perlop.pod b/pod/perlop.pod
index 857b951486..a485781e40 100644
--- a/pod/perlop.pod
+++ b/pod/perlop.pod
@@ -620,9 +620,9 @@ the same character fore and aft, but the 4 sorts of brackets
     ""      qq{}      Literal           yes
     ``      qx{}      Command           yes (unless '' is delimiter)
             qw{}      Word list         no
-    //      m{}       Pattern match     yes
-            qr{}      Pattern           yes
-            s{}{}     Substitution      yes
+    //      m{}       Pattern match     yes (unless '' is delimiter)
+            qr{}      Pattern           yes (unless '' is delimiter)
+            s{}{}     Substitution      yes (unless '' is delimiter)
             tr{}{}    Transliteration   no (but see below)
 Note that there can be whitespace between the operator and the quoting
@@ -753,22 +753,22 @@ Options are: If "/" is the delimiter then the initial C<m> is optional. With the C<m> you can use any pair of non-alphanumeric, non-whitespace characters
-as delimiters (if single quotes are used, no interpretation is done
-on the replacement string. Unlike Perl 4, Perl 5 treats backticks as normal
-delimiters; the replacement text is not evaluated as a command).
-This is particularly useful for matching Unix path names
-that contain "/", to avoid LTS (leaning toothpick syndrome). If "?" is
+as delimiters. This is particularly useful for matching Unix path names
+that contain "/", to avoid LTS (leaning toothpick syndrome). If "?" is
 the delimiter, then the match-only-once rule of C<?PATTERN?> applies.
+If "'" is the delimiter, no variable interpolation is performed on the
+PATTERN.
 PATTERN may contain variables, which will be interpolated (and the
-pattern recompiled) every time the pattern search is evaluated. (Note
-that C<$)> and C<$|> might not be interpolated because they look like
-end-of-string tests.) If you want such a pattern to be compiled only
-once, add a C</o> after the trailing delimiter. This avoids expensive
-run-time recompilations, and is useful when the value you are
-interpolating won't change over the life of the script. However, mentioning
-C</o> constitutes a promise that you won't change the variables in the pattern.
-If you change them, Perl won't even notice.
+pattern recompiled) every time the pattern search is evaluated, except
+for when the delimiter is a single quote. (Note that C<$)> and C<$|>
+might not be interpolated because they look like end-of-string tests.)
+If you want such a pattern to be compiled only once, add a C</o> after
+the trailing delimiter. This avoids expensive run-time recompilations,
+and is useful when the value you are interpolating won't change over
+the life of the script. However, mentioning C</o> constitutes a promise
+that you won't change the variables in the pattern. If you change them,
+Perl won't even notice.
 If the PATTERN evaluates to the empty string, the last I<successfully> matched regular expression is used instead.
@@ -911,8 +911,9 @@ A double-quoted, interpolated string.
 =item qr/STRING/imosx
 Quote-as-a-regular-expression operator. I<STRING> is interpolated the
-same way as I<PATTERN> in C<m/PATTERN/>. Returns a Perl value which
-may be used instead of the corresponding C</STRING/imosx> expression.
+same way as I<PATTERN> in C<m/PATTERN/>. If "'" is used as the
+delimiter, no variable interpolation is done. Returns a Perl value
+which may be used instead of the corresponding C</STRING/imosx> expression.
 For example,
@@ -1057,6 +1058,10 @@ comments into a multi-line C<qw>-string.
 For this reason the C<-w> switch produce warnings if the STRING contains the "," or the "#" character.
+Note that under use L<locale> qw() taints because the definition of
+whitespace is tainted. See L<perlsec> for more information about
+tainting and L<perllocale> for more information about locales.
+
 =item s/PATTERN/REPLACEMENT/egimosx
 Searches a string for a pattern, and if found, replaces that pattern
@@ -1068,7 +1073,7 @@ variable is searched and modified. (The string specified with C<=~> must be scalar variable, an array element, a hash element, or an assignment to one of those, i.e., an lvalue.)
-If the delimiter chosen is single quote, no variable interpolation is
+If the delimiter chosen is a single quote, no variable interpolation is
 done on either the PATTERN or the REPLACEMENT. Otherwise, if the PATTERN contains a $ that looks like a variable rather than an end-of-string test, the variable will be interpolated into the pattern
diff --git a/pod/perlopentut.pod b/pod/perlopentut.pod
new file mode 100644
index 0000000000..6e6091ab49
--- /dev/null
+++ b/pod/perlopentut.pod
@@ -0,0 +1,862 @@
+=head1 NAME
+
+perlopentut - tutorial on opening things in Perl
+
+=head1 DESCRIPTION
+
+Perl has two simple, built-in ways to open files: the shell way for
+convenience, and the C way for precision. The choice is yours.
+
+=head1 Open E<agrave> la shell
+
+Perl's C<open> function was designed to mimic the way command-line
+redirection in the shell works. Here are some basic examples
+from the shell:
+
+    $ myprogram file1 file2 file3
+    $ myprogram < inputfile
+    $ myprogram > outputfile
+    $ myprogram >> outputfile
+    $ myprogram | otherprogram
+    $ otherprogram | myprogram
+
+And here are some more advanced examples:
+
+    $ otherprogram | myprogram f1 - f2
+    $ otherprogram 2>&1 | myprogram -
+    $ myprogram <&3
+    $ myprogram >&4
+
+Programmers accustomed to constructs like those above can take comfort
+in learning that Perl directly supports these familiar constructs using
+virtually the same syntax as the shell.
+
+=head2 Simple Opens
+
+The C<open> function takes two arguments: the first is a filehandle,
+and the second is a single string comprising both what to open and how
+to open it. C<open> returns true when it works, and when it fails,
+returns a false value and sets the special variable $! to reflect
+the system error. If the filehandle was previously opened, it will
+be implicitly closed first.
+
+For example:
+
+    open(INFO, "datafile") || die("can't open datafile: $!");
+    open(INFO, "< datafile") || die("can't open datafile: $!");
+    open(RESULTS,"> runstats") || die("can't open runstats: $!");
+    open(LOG, ">> logfile ") || die("can't open logfile: $!");
+
+If you prefer the low-punctuation version, you could write that this way:
+
+    open INFO, "< datafile" or die "can't open datafile: $!";
+    open RESULTS,"> runstats" or die "can't open runstats: $!";
+    open LOG, ">> logfile " or die "can't open logfile: $!";
+
+A few things to notice. First, the leading less-than is optional.
+If omitted, Perl assumes that you want to open the file for reading.
+
+The other important thing to notice is that, just as in the shell,
+any white space before or after the filename is ignored. This is good,
+because you wouldn't want these to do different things:
+
+    open INFO, "<datafile"
+    open INFO, "< datafile"
+    open INFO, "<  datafile"
+
+Ignoring surround whitespace also helps for when you read a filename in
+from a different file, and forget to trim it before opening:
+
+    $filename = <INFO>; # oops, \n still there
+    open(EXTRA, "< $filename") || die "can't open $filename: $!";
+
+This is not a bug, but a feature. Because C<open> mimics the shell in
+its style of using redirection arrows to specify how to open the file, it
+also does so with respect to extra white space around the filename itself
+as well. For accessing files with naughty names, see L</"Dispelling
+the Dweomer">.
+
+=head2 Pipe Opens
+
+In C, when you want to open a file using the standard I/O library,
+you use the C<fopen> function, but when opening a pipe, you use the
+C<popen> function. But in the shell, you just use a different redirection
+character. That's also the case for Perl. The C<open> call
+remains the same--just its argument differs.
+
+If the leading character is a pipe symbol, C<open) starts up a new
+command and open a write-only filehandle leading into that command.
+This lets you write into that handle and have what you write show up on
+that command's standard input. For example:
+
+    open(PRINTER, "| lpr -Plp1") || die "cannot fork: $!";
+    print PRINTER "stuff\n";
+    close(PRINTER) || die "can't close lpr: $!";
+
+If the trailing character is a pipe, you start up a new command and open a
+read-only filehandle leading out of that command. This lets whatever that
+command writes to its standard output show up on your handle for reading.
+For example:
+
+    open(NET, "netstat -i -n |") || die "cannot fork: $!";
+    while (<NET>) { } # do something with input
+    close(NET) || die "can't close netstat: $!";
+
+What happens if you try to open a pipe to or from a non-existent command?
+In most systems, such an C<open> will not return an error. That's
+because in the traditional C<fork>/C<exec> model, running the other
+program happens only in the forked child process, which means that
+the failed C<exec> can't be reflected in the return value of C<open>.
+Only a failed C<fork> shows up there. See L<perlfaq8/"Why doesn't open()
+return an error when a pipe open fails?"> to see how to cope with this.
+There's also an explanation in L<perlipc>.
+
+If you would like to open a bidirectional pipe, the IPC::Open2
+library will handle this for you. Check out L<perlipc/"Bidirectional
+Communication with Another Process">
+
+=head2 The Minus File
+
+Again following the lead of the standard shell utilities, Perl's
+C<open> function treats a file whose name is a single minus, "-", in a
+special way. If you open minus for reading, it really means to access
+the standard input. If you open minus for writing, it really means to
+access the standard output.
+
+If minus can be used as the default input or default output? What happens
+if you open a pipe into or out of minus? What's the default command it
+would run? The same script as you're current running! This is actually
+a stealth C<fork> hidden inside an C<open> call. See L<perlipc/"Safe Pipe
+Opens"> for details.
+
+=head2 Mixing Reads and Writes
+
+It is possible to specify both read and write access. All you do is
+add a "+" symbol in front of the redirection. But as in the shell,
+using a less-than on a file never creates a new file; it only opens an
+existing one. On the other hand, using a greater-than always clobbers
+(truncates to zero length) an existing file, or creates a brand-new one
+if there isn't an old one. Adding a "+" for read-write doesn't affect
+whether it only works on existing files or always clobbers existing ones.
+ + open(WTMP, "+< /usr/adm/wtmp") + || die "can't open /usr/adm/wtmp: $!"; + + open(SCREEN, "+> /tmp/lkscreen") + || die "can't open /tmp/lkscreen: $!"; + + open(LOGFILE, "+>> /tmp/applog") + || die "can't open /tmp/applog: $!"; + +The first one won't create a new file, and the second one will always +clobber an old one. The third one will create a new file if necessary +and not clobber an old one, and it will allow you to read at any point +in the file, but all writes will always go to the end. In short, +the first case is substantially more common than the second and third +cases, which are almost always wrong. (If you know C, the plus in +Perl's C<open> is historically derived from the one in C's fopen(3S), +which it ultimately calls.) + +In fact, when it comes to updating a file, unless you're working on +a binary file as in the WTMP case above, you probably don't want to +use this approach for updating. Instead, Perl's B<-i> flag comes to +the rescue. The following command takes all the C, C++, or yacc source +or header files and changes all their foo's to bar's, leaving +the old version in the original file name with a ".orig" tacked +on the end: + + $ perl -i.orig -pe 's/\bfoo\b/bar/g' *.[Cchy] + +This is a short cut for some renaming games that are really +the best way to update textfiles. See the second question in +L<perlfaq5> for more details. + +=head2 Filters + +One of the most common uses for C<open> is one you never +even notice. When you process the ARGV filehandle using +C<E<lt>ARGVE<gt>>, Perl actually does an implicit open +on each file in @ARGV. Thus a program called like this: + + $ myprogram file1 file2 file3 + +can have all its files opened and processed one at a time +using a construct no more complex than: + + while (<>) { + # do something with $_ + } + +If @ARGV is empty when the loop first begins, Perl pretends you've opened +up minus, that is, the standard input.
In fact, $ARGV, the currently +open file during C<E<lt>ARGVE<gt>> processing, is even set to "-" +in these circumstances. + +You are welcome to pre-process your @ARGV before starting the loop to +make sure it's to your liking. One reason to do this might be to remove +command options beginning with a minus. While you can always roll the +simple ones by hand, the Getopt modules are good for this. + + use Getopt::Std; + + # -v, -D, -o ARG, sets $opt_v, $opt_D, $opt_o + getopts("vDo:"); + + # -v, -D, -o ARG, sets $args{v}, $args{D}, $args{o} + getopts("vDo:", \%args); + +Or the standard Getopt::Long module to permit named arguments: + + use Getopt::Long; + GetOptions( "verbose" => \$verbose, # --verbose + "Debug" => \$debug, # --Debug + "output=s" => \$output ); + # --output=somestring or --output somestring + +Another reason for preprocessing arguments is to make an empty +argument list default to all files: + + @ARGV = glob("*") unless @ARGV; + +You could even filter out all but plain, text files. This is a bit +silent, of course, and you might prefer to mention them on the way. + + @ARGV = grep { -f && -T } @ARGV; + +If you're using the B<-n> or B<-p> command-line options, you +should put changes to @ARGV in a C<BEGIN{}> block. + +Remember that a normal C<open> has special properties, in that it might +call fopen(3S) or it might call popen(3S), depending on what its +argument looks like; that's why it's sometimes called "magic open". +Here's an example: + + $pwdinfo = `domainname` =~ /^(\(none\))?$/ + ? '< /etc/passwd' + : 'ypcat passwd |'; + + open(PWD, $pwdinfo) + or die "can't open $pwdinfo: $!"; + +This sort of thing also comes into play in filter processing.
Because +C<E<lt>ARGVE<gt>> processing employs the normal, shell-style Perl C<open>, +it respects all the special things we've already seen: + + $ myprogram f1 "cmd1|" - f2 "cmd2|" f3 < tmpfile + +That program will read from the file F<f1>, the process F<cmd1>, standard +input (F<tmpfile> in this case), the F<f2> file, the F<cmd2> command, +and finally the F<f3> file. + +Yes, this also means that if you have files named "-" (and so on) in +your directory, they won't be processed as literal files by C<open>. +You'll need to pass them as "./-" much as you would for the I<rm> program. +Or you could use C<sysopen> as described below. + +One of the more interesting applications is to change files of a certain +name into pipes. For example, to autoprocess gzipped or compressed +files by decompressing them with I<gzip>: + + @ARGV = map { /\.(gz|Z)$/ ? "gzip -dc $_ |" : $_ } @ARGV; + +Or, if you have the I<GET> program installed from LWP, +you can fetch URLs before processing them: + + @ARGV = map { m#^\w+://# ? "GET $_ |" : $_ } @ARGV; + +It's not for nothing that this is called magic C<E<lt>ARGVE<gt>>. +Pretty nifty, eh? + +=head1 Open E<agrave> la C + +If you want the convenience of the shell, then Perl's C<open> is +definitely the way to go. On the other hand, if you want finer precision +than C's simplistic fopen(3S) provides, then you should look to Perl's +C<sysopen>, which is a direct hook into the open(2) system call. +That does mean it's a bit more involved, but that's the price of +precision. + +C<sysopen> takes 3 (or 4) arguments. + + sysopen HANDLE, PATH, FLAGS, [MASK] + +The HANDLE argument is a filehandle just as with C<open>. The PATH is +a literal path, one that doesn't pay attention to any greater-thans or +less-thans or pipes or minuses, nor ignore white space. If it's there, +it's part of the path. The FLAGS argument contains one or more values +derived from the Fcntl module that have been or'd together using the +bitwise "|" operator.
The final argument, the MASK, is optional; if +present, it is combined with the user's current umask for the creation +mode of the file. You should usually omit this. + +Although the traditional values of read-only, write-only, and read-write +are 0, 1, and 2 respectively, this is known not to hold true on some +systems. Instead, it's best to load in the appropriate constants first +from the Fcntl module, which supplies the following standard flags: + + O_RDONLY Read only + O_WRONLY Write only + O_RDWR Read and write + O_CREAT Create the file if it doesn't exist + O_EXCL Fail if the file already exists + O_APPEND Append to the file + O_TRUNC Truncate the file + O_NONBLOCK Non-blocking access + +Less common flags that are sometimes available on some operating systems +include C<O_BINARY>, C<O_TEXT>, C<O_SHLOCK>, C<O_EXLOCK>, C<O_DEFER>, +C<O_SYNC>, C<O_ASYNC>, C<O_DSYNC>, C<O_RSYNC>, C<O_NOCTTY>, C<O_NDELAY> +and C<O_LARGEFILE>. Consult your open(2) manpage or its local equivalent +for details. + +Here's how to use C<sysopen> to emulate the simple C<open> calls we had +before. We'll omit the C<|| die $!> checks for clarity, but make sure +you always check the return values in real code. These aren't quite +the same, since C<open> will trim leading and trailing white space, +but you'll get the idea: + +To open a file for reading: + + open(FH, "< $path"); + sysopen(FH, $path, O_RDONLY); + +To open a file for writing, creating a new file if needed or else truncating +an old file: + + open(FH, "> $path"); + sysopen(FH, $path, O_WRONLY | O_TRUNC | O_CREAT); + +To open a file for appending, creating one if necessary: + + open(FH, ">> $path"); + sysopen(FH, $path, O_WRONLY | O_APPEND | O_CREAT); + +To open a file for update, where the file must already exist: + + open(FH, "+< $path"); + sysopen(FH, $path, O_RDWR); + +And here are things you can do with C<sysopen> that you cannot do with +a regular C<open>. 
As you see, it's just a matter of controlling the +flags in the third argument. + +To open a file for writing, creating a new file which must not previously +exist: + + sysopen(FH, $path, O_WRONLY | O_EXCL | O_CREAT); + +To open a file for appending, where that file must already exist: + + sysopen(FH, $path, O_WRONLY | O_APPEND); + +To open a file for update, creating a new file if necessary: + + sysopen(FH, $path, O_RDWR | O_CREAT); + +To open a file for update, where that file must not already exist: + + sysopen(FH, $path, O_RDWR | O_EXCL | O_CREAT); + +To open a file without blocking, creating one if necessary: + + sysopen(FH, $path, O_WRONLY | O_NONBLOCK | O_CREAT); + +=head2 Permissions E<agrave> la mode + +If you omit the MASK argument to C<sysopen>, Perl uses the octal value +0666. The normal MASK to use for executables and directories should +be 0777, and for anything else, 0666. + +Why so permissive? Well, it isn't really. The MASK will be modified +by your process's current C<umask>. A umask is a number representing +I<disabled> permissions bits; that is, bits that will not be turned on +in the created files' permissions field. + +For example, if your C<umask> were 027, then the 020 part would +disable the group from writing, and the 007 part would disable others +from reading, writing, or executing. Under these conditions, passing +C<sysopen> 0666 would create a file with mode 0640, since C<0666 &~ 027> +is 0640. + +You should seldom use the MASK argument to C<sysopen()>. That takes +away the user's freedom to choose what permission new files will have. +Denying choice is almost always a bad thing. One exception would be for +cases where sensitive or private data is being stored, such as with mail +folders, cookie files, and internal temporary files. + +=head1 Obscure Open Tricks + +=head2 Re-Opening Files (dups) + +Sometimes you already have a filehandle open, and want to make another +handle that's a duplicate of the first one. 
In the shell, we place an +ampersand in front of a file descriptor number when doing redirections. +For example, C<2E<gt>&1> makes descriptor 2 (that's STDERR in Perl) +be redirected into descriptor 1 (which is usually Perl's STDOUT). +The same is essentially true in Perl: a filename that begins with an +ampersand is treated instead as a file descriptor if a number, or as a +filehandle if a string. + + open(SAVEOUT, ">&SAVEERR") || die "couldn't dup SAVEERR: $!"; + open(MHCONTEXT, "<&4") || die "couldn't dup fd4: $!"; + +That means that if a function is expecting a filename, but you don't +want to give it a filename because you already have the file open, you +can just pass the filehandle with a leading ampersand. It's best to +use a fully qualified handle though, just in case the function happens +to be in a different package: + + somefunction("&main::LOGFILE"); + +This way if somefunction() is planning on opening its argument, it can +just use the already opened handle. This differs from passing a handle, +because with a handle, you don't open the file. Here you have something +you can pass to open. + +If you have one of those tricky, newfangled I/O objects that the C++ +folks are raving about, then this doesn't work because those aren't a +proper filehandle in the native Perl sense. 
You'll have to use fileno() +to pull out the proper descriptor number, assuming you can: + + use IO::Socket; + $handle = IO::Socket::INET->new("www.perl.com:80"); + $fd = $handle->fileno; + somefunction("&$fd"); # not an indirect function call + +It can be easier (and certainly will be faster) just to use real +filehandles though: + + use IO::Socket; + local *REMOTE = IO::Socket::INET->new("www.perl.com:80"); + die "can't connect" unless defined(fileno(REMOTE)); + somefunction("&main::REMOTE"); + +If the filehandle or descriptor number is preceded not just with a simple +"&" but rather with a "&=" combination, then Perl will not create a +completely new descriptor opened to the same place using the dup(2) +system call. Instead, it will just make something of an alias to the +existing one using the fdopen(3S) library call. This is slightly more +parsimonious of system resources, although this is less a concern +these days. Here's an example of that: + + $fd = $ENV{"MHCONTEXTFD"}; + open(MHCONTEXT, "<&=$fd") or die "couldn't fdopen $fd: $!"; + +If you're using magic C<E<lt>ARGVE<gt>>, you could even pass in as a +command line argument in @ARGV something like C<"E<lt>&=$MHCONTEXTFD">, +but we've never seen anyone actually do this. + +=head2 Dispelling the Dweomer + +Perl is more of a DWIMmer language than something like Java--where DWIM +is an acronym for "do what I mean". But this principle sometimes leads +to more hidden magic than one knows what to do with. In this way, Perl +is also filled with I<dweomer>, an obscure word meaning an enchantment. +Sometimes, Perl's DWIMmer is just too much like dweomer for comfort. + +If magic C<open> is a bit too magical for you, you don't have to turn +to C<sysopen>. To open a file with arbitrary weird characters in +it, it's necessary to protect any leading and trailing whitespace. +Leading whitespace is protected by inserting a C<"./"> in front of a +filename that starts with whitespace.
Trailing whitespace is protected +by appending an ASCII NUL byte (C<"\0">) at the end of the string. + + $file =~ s#^(\s)#./$1#; + open(FH, "< $file\0") || die "can't open $file: $!"; + +This assumes, of course, that your system considers dot the current +working directory, slash the directory separator, and disallows ASCII +NULs within a valid filename. Most systems follow these conventions, +including all POSIX systems as well as proprietary Microsoft systems. +The only vaguely popular system that doesn't work this way is the +proprietary Macintosh system, which uses a colon where the rest of us +use a slash. Maybe C<sysopen> isn't such a bad idea after all. + +If you want to use C<E<lt>ARGVE<gt>> processing in a totally boring +and non-magical way, you could do this first: + + # "Sam sat on the ground and put his head in his hands. + # 'I wish I had never come here, and I don't want to see + # no more magic,' he said, and fell silent." + for (@ARGV) { + s#^([^./])#./$1#; + $_ .= "\0"; + } + while (<>) { + # now process $_ + } + +But be warned that users will not appreciate being unable to use "-" +to mean standard input, per the standard convention. + +=head2 Paths as Opens + +You've probably noticed how Perl's C<warn> and C<die> functions can +produce messages like: + + Some warning at scriptname line 29, <FH> chunk 7. + +That's because you opened a filehandle FH, and had read in seven records +from it. But what was the name of the file, not the handle? + +If you aren't running with C<strict refs>, or if you've turned them off +temporarily, then all you have to do is this: + + open($path, "< $path") || die "can't open $path: $!"; + while (<$path>) { + # whatever + } + +Since you're using the pathname of the file as its handle, +you'll get warnings more like + + Some warning at scriptname line 29, </etc/motd> chunk 7. + +=head2 Single Argument Open + +Remember how we said that Perl's open took two arguments? That was a +passive prevarication.
You see, it can also take just one argument. +If and only if the variable is a global variable, not a lexical, you +can pass C<open> just one argument, the filehandle, and it will +get the path from the global scalar variable of the same name. + + $FILE = "/etc/motd"; + open FILE or die "can't open $FILE: $!"; + while (<FILE>) { + # whatever + } + +Why is this here? Someone has to cater to the hysterical porpoises. +It's something that's been in Perl since the very beginning, if not +before. + +=head2 Playing with STDIN and STDOUT + +One clever move with STDOUT is to explicitly close it when you're done +with the program. + + END { close(STDOUT) || die "can't close stdout: $!" } + +If you don't do this, and your program fills up the disk partition due +to a command line redirection, it won't report the error or exit with a +failure status. + +You don't have to accept the STDIN and STDOUT you were given. You are +welcome to reopen them if you'd like. + + open(STDIN, "< datafile") + || die "can't open datafile: $!"; + + open(STDOUT, "> output") + || die "can't open output: $!"; + +And then these can be read directly or passed on to subprocesses. +This makes it look as though the program were initially invoked +with those redirections from the command line. + +It's probably more interesting to connect these to pipes. For example: + + $pager = $ENV{PAGER} || "(less || more)"; + open(STDOUT, "| $pager") + || die "can't fork a pager: $!"; + +This makes it appear as though your program were called with its stdout +already piped into your pager. You can also use this kind of thing +in conjunction with an implicit fork to yourself. You might do this +if you would rather handle the post processing in your own program, +just in a different process: + + head(100); + while (<>) { + print; + } + + sub head { + my $lines = shift || 20; + return if $pid = open(STDOUT, "|-"); # return if parent + die "cannot fork: $!"
unless defined $pid; + while (<STDIN>) { + last if --$lines < 0; + print; + } + exit; + } + +This technique can be applied to repeatedly push as many filters on your +output stream as you wish. + +=head1 Other I/O Issues + +These topics aren't really arguments related to C<open> or C<sysopen>, +but they do affect what you do with your open files. + +=head2 Opening Non-File Files + +When is a file not a file? Well, you could say when it exists but +isn't a plain file. We'll check whether it's a symbolic link first, +just in case. + + if (-l $file || ! -f _) { + print "$file is not a plain file\n"; + } + +What other kinds of files are there than, well, files? Directories, +symbolic links, named pipes, Unix-domain sockets, and block and character +devices. Those are all files, too--just not I<plain> files. This isn't +the same issue as being a text file. Not all text files are plain files. +Not all plain files are text files. That's why there are separate C<-f> +and C<-T> file tests. + +To open a directory, you should use the C<opendir> function, then +process it with C<readdir>, carefully restoring the directory +name if necessary: + + opendir(DIR, $dirname) or die "can't opendir $dirname: $!"; + while (defined($file = readdir(DIR))) { + # do something with "$dirname/$file" + } + closedir(DIR); + +If you want to process directories recursively, it's better to use the +File::Find module. For example, this prints out all files recursively, +and adds a slash to their names if the file is a directory. + + @ARGV = qw(.) unless @ARGV; + use File::Find; + find sub { print $File::Find::name, -d && '/', "\n" }, @ARGV; + +This finds all bogus symbolic links beneath a particular directory: + + find sub { print "$File::Find::name\n" if -l && !-e }, $dir; + +As you see, with symbolic links, you can just pretend that it is +what it points to.
Or, if you want to know I<what> it points to, then +C<readlink> is called for: + + if (-l $file) { + if (defined($whither = readlink($file))) { + print "$file points to $whither\n"; + } else { + print "$file points nowhere: $!\n"; + } + } + +Named pipes are a different matter. You pretend they're regular files, +but their opens will normally block until there is both a reader and +a writer. You can read more about them in L<perlipc/"Named Pipes">. +Unix-domain sockets are rather different beasts as well; they're +described in L<perlipc/"Unix-Domain TCP Clients and Servers">. + +When it comes to opening devices, it can be easy and it can be tricky. +We'll assume that if you're opening up a block device, you know what +you're doing. The character devices are more interesting. These are +typically used for modems, mice, and some kinds of printers. This is +described in L<perlfaq8/"How do I read and write the serial port?">. +It's often enough to open them carefully: + + sysopen(TTYIN, "/dev/ttyS1", O_RDWR | O_NDELAY | O_NOCTTY) + # (O_NOCTTY no longer needed on POSIX systems) + or die "can't open /dev/ttyS1: $!"; + open(TTYOUT, "+>&TTYIN") + or die "can't dup TTYIN: $!"; + + $ofh = select(TTYOUT); $| = 1; select($ofh); + + print TTYOUT "+++at\015"; + $answer = <TTYIN>; + +With descriptors that you haven't opened using C<sysopen>, such as a +socket, you can set them to be non-blocking using C<fcntl>: + + use Fcntl; + fcntl(Connection, F_SETFL, O_NONBLOCK) + or die "can't set non blocking: $!"; + +Rather than losing yourself in a morass of twisting, turning C<ioctl>s, +all dissimilar, if you're going to manipulate ttys, it's best to +make calls out to the stty(1) program if you have it, or else use the +portable POSIX interface. To figure this all out, you'll need to read the +termios(3) manpage, which describes the POSIX interface to tty devices, +and then L<POSIX>, which describes Perl's interface to POSIX.
There are +also some high-level modules on CPAN that can help you with these games. +Check out Term::ReadKey and Term::ReadLine. + +What else can you open? To open a connection using sockets, you won't use +one of Perl's two open functions. See L<perlipc/"Sockets: Client/Server +Communication"> for that. Here's an example. Once you have it, +you can use FH as a bidirectional filehandle. + + use IO::Socket; + local *FH = IO::Socket::INET->new("www.perl.com:80"); + +For opening up a URL, the LWP modules from CPAN are just what +the doctor ordered. There's no filehandle interface, but +it's still easy to get the contents of a document: + + use LWP::Simple; + $doc = get('http://www.sn.no/libwww-perl/'); + +=head2 Binary Files + +On certain legacy systems with what could charitably be called terminally +convoluted (some would say broken) I/O models, a file isn't a file--at +least, not with respect to the C standard I/O library. On these old +systems whose libraries (but not kernels) distinguish between text and +binary streams, to get files to behave properly you'll have to bend over +backwards to avoid nasty problems. On such infelicitous systems, sockets +and pipes are already opened in binary mode, and there is currently no +way to turn that off. With files, you have more options. + +One option is to use the C<binmode> function on the appropriate +handles before doing regular I/O on them: + + binmode(STDIN); + binmode(STDOUT); + while (<STDIN>) { print } + +Passing C<sysopen> a non-standard flag option will also open the file in +binary mode on those systems that support it. This is the equivalent of +opening the file normally, then calling C<binmode> on the handle. + + sysopen(BINDAT, "records.data", O_RDWR | O_BINARY) + || die "can't open records.data: $!"; + +Now you can use C<read> and C<print> on that handle without worrying +about the system's non-standard I/O library breaking your data. It's not +a pretty picture, but then, legacy systems seldom are.
CP/M will be +with us until the end of days, and after. + +On systems with exotic I/O systems, it turns out that, astonishingly +enough, even unbuffered I/O using C<sysread> and C<syswrite> might do +sneaky data mutilation behind your back. + + while (sysread(WHENCE, $buf, 1024)) { + syswrite(WHITHER, $buf, length($buf)); + } + +Depending on the vicissitudes of your runtime system, even these calls +may need C<binmode> or C<O_BINARY> first. Systems known to be free of +such difficulties include Unix, the Mac OS, Plan9, and Inferno. + +=head2 File Locking + +In a multitasking environment, you may need to be careful not to collide +with other processes that want to do I/O on the same files you +are working on. You'll often need shared or exclusive locks +on files for reading and writing respectively. You might just +pretend that only exclusive locks exist. + +Never use the existence of a file C<-e $file> as a locking indication, +because there is a race condition between the test for the existence of +the file and its creation. Atomicity is critical. + +Perl's most portable locking interface is via the C<flock> function, +whose simplicity is emulated on systems that don't directly support it, +such as SysV or WindowsNT. The underlying semantics may affect how +it all works, so you should learn how C<flock> is implemented on your +system's port of Perl. + +File locking I<does not> lock out another process that would like to +do I/O. A file lock only locks out others trying to get a lock, not +processes trying to do I/O. Because locks are advisory, if one process +uses locking and another doesn't, all bets are off. + +By default, the C<flock> call will block until a lock is granted. +A request for a shared lock will be granted as soon as there is no +exclusive locker. A request for an exclusive lock will be granted as +soon as there is no locker of any kind. Locks are on file descriptors, +not file names.
You can't lock a file until you open it, and you can't +hold on to a lock once the file has been closed. + +Here's how to get a blocking shared lock on a file, typically used +for reading: + + use 5.004; + use Fcntl qw(:DEFAULT :flock); + open(FH, "< filename") or die "can't open filename: $!"; + flock(FH, LOCK_SH) or die "can't lock filename: $!"; + # now read from FH + +You can get a non-blocking lock by using C<LOCK_NB>. + + flock(FH, LOCK_SH | LOCK_NB) + or die "can't lock filename: $!"; + +This can be useful for producing more user-friendly behaviour by warning +if you're going to be blocking: + + use 5.004; + use Fcntl qw(:DEFAULT :flock); + open(FH, "< filename") or die "can't open filename: $!"; + unless (flock(FH, LOCK_SH | LOCK_NB)) { + $| = 1; + print "Waiting for lock..."; + flock(FH, LOCK_SH) or die "can't lock filename: $!"; + print "got it.\n" + } + # now read from FH + +To get an exclusive lock, typically used for writing, you have to be +careful. We C<sysopen> the file so it can be locked before it gets +emptied. You can get a nonblocking version using C<LOCK_EX | LOCK_NB>. 
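The nonblocking exclusive variant just mentioned might look something like the following sketch. It is not taken from the original tutorial: the scratch file (made with File::Temp) merely stands in for "filename", and the policy of skipping the write when the lock is unavailable is an assumption chosen for illustration.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock);
use File::Temp qw(tempfile);

# Hypothetical scratch file standing in for "filename".
my ($tmp, $path) = tempfile(UNLINK => 1);
close($tmp) || die "can't close $path: $!";

# Open without O_TRUNC so nothing is emptied before we hold the lock.
sysopen(FH, $path, O_WRONLY | O_CREAT)
    || die "can't open $path: $!";

# LOCK_NB makes flock return false at once instead of waiting.
my $locked = flock(FH, LOCK_EX | LOCK_NB);
if ($locked) {
    truncate(FH, 0) || die "can't truncate $path: $!";
    print FH "safe to write now\n";
} else {
    warn "$path is locked by someone else, skipping: $!\n";
}
close(FH) || die "can't close $path: $!";
```

As with the blocking form, the file is truncated only once the lock is held, so a competing writer's data is not emptied out from under it.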
+ + use 5.004; + use Fcntl qw(:DEFAULT :flock); + sysopen(FH, "filename", O_WRONLY | O_CREAT) + or die "can't open filename: $!"; + flock(FH, LOCK_EX) + or die "can't lock filename: $!"; + truncate(FH, 0) + or die "can't truncate filename: $!"; + # now write to FH + +Finally, due to the uncounted millions who cannot be dissuaded from +wasting cycles on useless vanity devices called hit counters, here's +how to increment a number in a file safely: + + use Fcntl qw(:DEFAULT :flock); + + sysopen(FH, "numfile", O_RDWR | O_CREAT) + or die "can't open numfile: $!"; + # autoflush FH + $ofh = select(FH); $| = 1; select ($ofh); + flock(FH, LOCK_EX) + or die "can't write-lock numfile: $!"; + + $num = <FH> || 0; + seek(FH, 0, 0) + or die "can't rewind numfile : $!"; + print FH $num+1, "\n" + or die "can't write numfile: $!"; + + truncate(FH, tell(FH)) + or die "can't truncate numfile: $!"; + close(FH) + or die "can't close numfile: $!"; + +=head1 SEE ALSO + +The C<open> and C<sysopen> function in perlfunc(1); +the standard open(2), dup(2), fopen(3), and fdopen(3) manpages; +the POSIX documentation. + +=head1 AUTHOR and COPYRIGHT + +Copyright 1998 Tom Christiansen. + +When included as part of the Standard Version of Perl, or as part of +its complete documentation whether printed or otherwise, this work may +be distributed only under the terms of Perl's Artistic License. Any +distribution of this file or derivatives thereof outside of that +package require that special arrangements be made with copyright +holder. + +Irrespective of its distribution, all code examples in these files are +hereby placed into the public domain. You are permitted and +encouraged to use this code in your own programs for fun or for profit +as you see fit. A simple comment in the code giving credit would be +courteous but is not required. 
+ +=head1 HISTORY + +First release: Sat Jan 9 08:09:11 MST 1999 diff --git a/pod/perlport.pod b/pod/perlport.pod index 918827c9d7..d6eb10b2d2 100644 --- a/pod/perlport.pod +++ b/pod/perlport.pod @@ -175,7 +175,7 @@ transfer and store numbers always in text format, instead of raw binary, or consider using modules like C<Data::Dumper> (included in the standard distribution as of Perl 5.005) and C<Storable>. -=head2 Files +=head2 Files and Filesystems Most platforms these days structure files in a hierarchical fashion. So, it is reasonably safe to assume that any platform supports the @@ -189,11 +189,19 @@ root directory. VMS, Windows, and OS/2 can work similarly to Unix with C</> as path separator, or in their own idiosyncratic ways (such as having several -root directories and various "unrooted" device files such NIL: and +root directories and various "unrooted" device files such NIL: and LPT:). S<Mac OS> uses C<:> as a path separator instead of C</>. +The filesystem may support neither hard links (C<link()>) nor +symbolic links (C<symlink()>, C<readlink()>, C<lstat()>). + +The filesystem may not support neither access timestamp nor change +timestamp (meaning that about the only portable timestamp is the +modification timestamp), or one second granularity of any timestamps +(e.g. the FAT filesystem limits the time granularity to two seconds). + VOS perl can emulate Unix filenames with C</> as path separator. The native pathname characters greater-than, less-than, number-sign, and percent-sign are always accepted. @@ -228,19 +236,21 @@ Also of use is C<File::Basename>, from the standard distribution, which splits a pathname into pieces (base filename, full path to directory, and file suffix). -Even when on a single platform (if you can call UNIX a single -platform), remember not to count on the existence or the contents of -system-specific files, like F</etc/passwd>, F</etc/sendmail.conf>, or -F</etc/resolv.conf>. 
For example the F</etc/passwd> may exist but it -may not contain the encrypted passwords because the system is using -some form of enhanced security-- or it may not contain all the -accounts because the system is using NIS. If code does need to rely -on such a file, include a description of the file and its format in -the code's documentation, and make it easy for the user to override -the default location of the file. +Even when on a single platform (if you can call UNIX a single platform), +remember not to count on the existence or the contents of +system-specific files or directories, like F</etc/passwd>, +F</etc/sendmail.conf>, F</etc/resolv.conf>, or even F</tmp/>. For +example, F</etc/passwd> may exist but it may not contain the encrypted +passwords because the system is using some form of enhanced security -- +or it may not contain all the accounts because the system is using NIS. +If code does need to rely on such a file, include a description of the +file and its format in the code's documentation, and make it easy for +the user to override the default location of the file. + +Don't assume a text file will end with a newline. Do not have two files of the same name with different case, like -F<test.pl> and <Test.pl>, as many platforms have case-insensitive +F<test.pl> and F<Test.pl>, as many platforms have case-insensitive filenames. Also, try not to have non-word characters (except for C<.>) in the names, and keep them to the 8.3 convention, for maximum portability. @@ -250,11 +260,17 @@ Likewise, if using C<AutoSplit>, try to keep the split functions to make it so the resulting files have a unique (case-insensitively) first 8 characters. -Don't assume C<E<lt>> won't be the first character of a filename. Always -use C<E<lt>> explicitly to open a file for reading: +There certainly can be whitespace in filenames. Many systems (DOS, +VMS) cannot have more than one C<"."> in their filenames. + +Don't assume C<E<gt>> won't be the first character of a filename. 
+Always use C<E<lt>> explicitly to open a file for reading. open(FILE, "<$existing_file") or die $!; +Actually, though, if filenames might use strange characters, it is +safest to open it with C<sysopen> instead of C<open>, which is magic. + =head2 System Interaction @@ -284,6 +300,8 @@ C<closedir> instead. Don't count on per-program environment variables, or per-program current directories. +Don't count on specific values of C<$!>. + =head2 Interprocess Communication (IPC) @@ -320,6 +338,7 @@ code, but expose a common interface). The UNIX System V IPC (C<msg*(), sem*(), shm*()>) is not available even in all UNIX platforms. + =head2 External Subroutines (XS) XS code, in general, can be made to work with any platform; but dependent @@ -375,7 +394,7 @@ C<Time::Local>. Assume very little about character sets. Do not assume anything about the numerical values (C<ord()>, C<chr()>) of characters. Do not assume that the alphabetic characters are encoded contiguously (in -numerical sense). Do no assume anything about the ordering of the +numerical sense). Do not assume anything about the ordering of the characters. The lowercase letters may come before or after the uppercase letters, the lowercase and uppercase may be interlaced so that both 'a' and 'A' come before the 'b', the accented and other @@ -385,10 +404,10 @@ before the 'b'. =head2 Internationalisation -If you may assume POSIX (a rather large assumption, that: in practise -that means UNIX) you may read more about the POSIX locale system from +If you may assume POSIX (a rather large assumption, that in practice +means UNIX), you may read more about the POSIX locale system from L<perllocale>. The locale system at least attempts to make things a -little bit more portable or at least more convenient and +little bit more portable, or at least more convenient and native-friendly for non-English users. The system affects character sets and encoding, and date and time formatting, among other things. 
@@ -480,7 +499,7 @@ Unix flavors: FreeBSD freebsd freebsd-i386 Linux linux i386-linux HP-UX hpux PA-RISC1.1 - IRIX irix irix + IRIX irix irix OSF1 dec_osf alpha-dec_osf SunOS solaris sun4-solaris SunOS solaris i86pc-solaris @@ -582,7 +601,7 @@ limited to 31 characters, and may include any character except C<:>, which is reserved as a path separator. Instead of C<flock>, see C<FSpSetFLock> and C<FSpRstFLock> in the -C<Mac::Files> module. +C<Mac::Files> module, or C<chmod(0444, ...)> and C<chmod(0666, ...)>. In the MacPerl application, you can't run a program from the command line; programs that expect C<@ARGV> to be populated can be edited with something @@ -617,10 +636,9 @@ the application or MPW tool version is running, check: $is_ppc = $MacPerl::Architecture eq 'MacPPC'; $is_68k = $MacPerl::Architecture eq 'Mac68K'; -S<Mac OS X>, to be based on NeXT's OpenStep OS, will be able to run -MacPerl natively (in the Blue Box, and even in the Yellow Box, once some -changes to the toolbox calls are made), but Unix perl will also run -natively. +S<Mac OS X>, to be based on NeXT's OpenStep OS, will (in theory) be able +to run MacPerl natively, but Unix perl will also run natively under the +built-in Unix environment. Also see: @@ -822,7 +840,7 @@ an effect on what happens with some perl functions (such as C<chr>, C<pack>, C<print>, C<printf>, C<ord>, C<sort>, C<sprintf>, C<unpack>), as well as bit-fiddling with ASCII constants using operators like C<^>, C<&> and C<|>, not to mention dealing with socket interfaces to ASCII computers -(see L<"NEWLINES">). +(see L<Newlines>). Fortunately, most web servers for the mainframe will correctly translate the C<\n> in the following statement to its ASCII equivalent (note that @@ -833,6 +851,7 @@ C<\r> is the same under both Unix and OS/390 & VM/ESA): The value of C<$^O> on OS/390 is "os390". The value of C<$^O> on VM/ESA is "vmesa". 
+ Some simple tricks for determining if you are running on an EBCDIC platform could include any of the following (perhaps all): @@ -905,7 +924,7 @@ C<System$Path> contains a single item list. The filesystem will also expand system variables in filenames if enclosed in angle brackets, so C<E<lt>System$DirE<gt>.Modules> would look for the file S<C<$ENV{'System$Dir'} . 'Modules'>>. The obvious implication of this is -that B<fully qualified filenames can start with C<E<lt>E<gt>> and should +that B<fully qualified filenames can start with C<E<lt>E<gt>>> and should be protected when C<open> is used for input. Because C<.> was in use as a directory separator and filenames could not @@ -1126,6 +1145,7 @@ Invokes VMS debugger. (VMS) Not implemented. (S<Mac OS>) Implemented via Spawn. (VM/ESA) + =item fcntl FILEHANDLE,FUNCTION,SCALAR Not implemented. (Win32, VMS) @@ -1305,6 +1325,9 @@ method of spawning a process. (Win32) Not implemented. (S<Mac OS>, Win32, VMS, S<RISC OS>) +Link count not updated because hard links are not quite that hard +(They are sort of half-way between hard and soft links). (AmigaOS) + =item lstat FILEHANDLE =item lstat EXPR @@ -1338,6 +1361,8 @@ open to C<|-> and C<-|> are unsupported. (S<Mac OS>, Win32, S<RISC OS>) Not implemented. (S<Mac OS>) +Very limited functionality. (MiNT) + =item readlink EXPR =item readlink @@ -1435,6 +1460,11 @@ the child program uses a compatible version of the emulation library. I<scalar> will call the native command line direct and no such emulation of a child Unix program will exists. Mileage B<will> vary. (S<RISC OS>) +Far from being POSIX compliant. Because there may be no underlying +/bin/sh tries to work around the problem by forking and execing the +first token in its argument string. Handles basic redirection +("E<lt>" or "E<gt>") on its own behalf. (MiNT) + =item times Only the first entry returned is nonzero. (S<Mac OS>) @@ -1464,6 +1494,9 @@ should not be held open elsewhere. 
(Win32) Returns undef where unavailable, as of version 5.005. +C<umask()> works but the correct permissions are only set when the file +is finally close()d. (AmigaOS) + =item utime LIST Only the modification time is updated. (S<Mac OS>, VMS, S<RISC OS>) @@ -1491,23 +1524,38 @@ Not useful. (S<RISC OS>) =over 4 -=item 1.35, 9 September 1998 +=item v1.38, 31 December 1998 + +More changes from Jarkko. + +=item v1.37, 19 December 1998 -Updated for Stratus VOS. +More minor changes. Merge two separate version 1.35 documents. -=item 1.33, 06 August 1998 +=item v1.36, 9 September 1998 + +Updated for Stratus VOS. Also known as version 1.35. + +=item v1.35, 13 August 1998 + +Integrate more minor changes, plus addition of new sections under +L<"ISSUES">: L<"Numbers endianness and Width">, +L<"Character sets and character encoding">, +L<"Internationalisation">. + +=item v1.33, 06 August 1998 Integrate more minor changes. -=item 1.32, 05 August 1998 +=item v1.32, 05 August 1998 Integrate more minor changes. -=item 1.30, 03 August 1998 +=item v1.30, 03 August 1998 Major update for RISC OS, other minor changes. -=item 1.23, 10 July 1998 +=item v1.23, 10 July 1998 First public release with perl5.005. @@ -1529,6 +1577,7 @@ Jarkko Hietaniemi E<lt>jhi@iki.fi<gt>, Luther Huffman E<lt>lutherh@stratcom.comE<gt>, Nick Ing-Simmons E<lt>nick@ni-s.u-net.comE<gt>, Andreas J. KE<ouml>nig E<lt>koenig@kulturbox.deE<gt>, +Markus Laker E<lt>mlaker@contax.co.ukE<gt>, Andrew M. Langmead E<lt>aml@world.std.comE<gt>, Paul Moore E<lt>Paul.Moore@uk.origin-it.comE<gt>, Chris Nandor E<lt>pudge@pobox.comE<gt>, @@ -1542,9 +1591,9 @@ Paul J. Schinder E<lt>schinder@pobox.comE<gt>, Dan Sugalski E<lt>sugalskd@ous.eduE<gt>, Nathan Torkington E<lt>gnat@frii.comE<gt>. -This document is maintained by Chris Nandor. +This document is maintained by Chris Nandor +E<lt>pudge@pobox.comE<gt>. =head1 VERSION -Version 1.35, last modified 09 September 1998. 
- +Version 1.38, last modified 31 December 1998 diff --git a/pod/perlsub.pod b/pod/perlsub.pod index 957b3d8ad8..95fbb6b342 100644 --- a/pod/perlsub.pod +++ b/pod/perlsub.pod @@ -199,7 +199,7 @@ pre-defined things are C<BEGIN>, C<END>, C<AUTOLOAD>, and C<DESTROY>--plus all t functions mentioned in L<perltie>. The 5.005 release adds C<INIT> to this list. -=head2 Private Variables via C<my()> +=head2 Private Variables via my() Synopsis: diff --git a/pod/perlvar.pod b/pod/perlvar.pod index fb27bfba46..b9b0ce6c0a 100644 --- a/pod/perlvar.pod +++ b/pod/perlvar.pod @@ -845,12 +845,16 @@ specified, and the value is the location of the file actually found. The C<require> command uses this array to determine whether a given file has already been included. -=item %ENV $ENV{expr} +=item %ENV + +=item $ENV{expr} The hash %ENV contains your current environment. Setting a value in C<ENV> changes the environment for child processes. -=item %SIG $SIG{expr} +=item %SIG + +=item $SIG{expr} The hash %SIG is used to set signal handlers for various signals. Example: diff --git a/pod/perlxstut.pod b/pod/perlxstut.pod index 867d42a8c2..69a1a25d73 100644 --- a/pod/perlxstut.pod +++ b/pod/perlxstut.pod @@ -465,7 +465,7 @@ include a C source file and a header file. We'll also create a Makefile.PL in this directory. Then we'll make sure that running make at the Mytest2 level will automatically run this Makefile.PL file and the resulting Makefile. -In the testlib directory, create a file mylib.h that looks like this: +In the mylib directory, create a file mylib.h that looks like this: #define TESTVAL 4 diff --git a/pod/pod2html.PL b/pod/pod2html.PL index 4eec29c26b..366dc163bf 100644 --- a/pod/pod2html.PL +++ b/pod/pod2html.PL @@ -164,7 +164,7 @@ See L<Pod::Html> for a list of known bugs in the translator. 
=head1 SEE ALSO -L<perlpod>, L<Pod::HTML> +L<perlpod>, L<Pod::Html> =head1 COPYRIGHT diff --git a/pod/pod2man.PL b/pod/pod2man.PL index 4edf4f8bb2..3c55d6e29c 100644 --- a/pod/pod2man.PL +++ b/pod/pod2man.PL @@ -318,7 +318,11 @@ $cutting = 1; # running an installed version of Perl to produce documentation from an # uninstalled newer version's pod files. if ($^O ne 'plan9' and $^O ne 'dos' and $^O ne 'os2' and $^O ne 'MSWin32') { - my $perl = (-x './perl') ? './perl' : ((-x '../perl') ? '../perl' : ''); + my $perl = (-x './perl' && -f './perl' ) ? + './perl' : + ((-x '../perl' && -f '../perl') ? + '../perl' : + ''); ($version,$patch) = `$perl -e 'print $]'` =~ /^(\d\.\d{3})(\d{2})?/ if $perl; } # No luck; we'll just go with the running Perl's version diff --git a/pod/roffitall b/pod/roffitall index 421b37a8f0..3cea6859e9 100644 --- a/pod/roffitall +++ b/pod/roffitall @@ -38,6 +38,7 @@ toroff=` $mandir/perlfunc.1 \ $mandir/perlvar.1 \ $mandir/perlsub.1 \ + $mandir/perlopentut.1 \ $mandir/perlmod.1 \ $mandir/perlmodlib.1 \ $mandir/perlmodinstall.1 \ |
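Reviewer's note, not part of the commit: the pod/perlport.pod hunk in this diff adds the advice that, when filenames might contain strange characters, it is safest to open them with C<sysopen> rather than the "magic" C<open>. The following is a minimal sketch of that advice, assuming a Unix-like filesystem that permits a trailing space in a filename. The helper name C<slurp_literal> and the sample filename are hypothetical and exist only for illustration.

```perl
use strict;
use warnings;
use Fcntl qw(O_RDONLY O_WRONLY O_CREAT O_TRUNC);
use File::Temp qw(tempdir);

# Hypothetical helper (not from the patch): read a whole file whose name
# is taken literally, whatever characters it contains.
sub slurp_literal {
    my ($path) = @_;
    # sysopen does not interpret mode characters or strip whitespace
    # from the name; the mode comes only from the Fcntl flags.
    sysopen(my $fh, $path, O_RDONLY)
        or die "Cannot open '$path': $!";
    local $/;                       # slurp mode: read to end of file
    my $data = <$fh>;
    close($fh) or die "Cannot close '$path': $!";
    return $data;
}

# Assumption: the filesystem allows a trailing space in a filename.
my $dir  = tempdir(CLEANUP => 1);
my $name = "$dir/odd name ";        # note the trailing space

# Create the file with sysopen as well, so the name is used verbatim.
sysopen(my $out, $name, O_WRONLY | O_CREAT | O_TRUNC, 0600)
    or die "Cannot create '$name': $!";
print {$out} "hello\n";
close($out) or die "Cannot close '$name': $!";

print slurp_literal($name);
```

The error messages interpolate C<$!> as text rather than comparing it to a number, in line with the same hunk's advice not to count on specific values of C<$!>.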