author | Tom Christiansen <tchrist@perl.com> | 1999-01-07 16:05:02 -0700 |
---|---|---|
committer | Jarkko Hietaniemi <jhi@iki.fi> | 1999-01-08 11:51:52 +0000 |
commit | 65acb1b1d672587d3a0d073613a475584830e38e (patch) | |
tree | fcb09719fada1c9453493712a798b889dd89b086 /pod/perlfaq5.pod | |
parent | ae83f3772b2dd371e676035c6714025e89d7e08f (diff) | |
FAQ jumbo patch from tchrist.
Message-Id: <199901080605.XAA20229@jhereg.perl.com>
To: pumpkings@jhereg.perl.com
Subject: newest version of perlfaq.pod
Date: Thu, 7 Jan 1999 23:05:02 -0700
Message-Id: <199901080605.XAA20231@jhereg.perl.com>
From: Tom Christiansen <tchrist@jhereg.perl.com>
To: pumpkings@jhereg.perl.com
Subject: newest version of perlfaq1.pod
Date: Thu, 7 Jan 1999 23:05:02 -0700
Message-Id: <199901080605.XAA20233@jhereg.perl.com>
From: Tom Christiansen <tchrist@jhereg.perl.com>
To: pumpkings@jhereg.perl.com
Subject: newest version of perlfaq2.pod
Date: Thu, 7 Jan 1999 23:05:02 -0700
Message-Id: <199901080605.XAA20235@jhereg.perl.com>
From: Tom Christiansen <tchrist@jhereg.perl.com>
To: pumpkings@jhereg.perl.com
Subject: newest version of perlfaq3.pod
Date: Thu, 7 Jan 1999 23:05:02 -0700
Message-Id: <199901080605.XAA20237@jhereg.perl.com>
From: Tom Christiansen <tchrist@jhereg.perl.com>
To: pumpkings@jhereg.perl.com
Subject: newest version of perlfaq4.pod
Date: Thu, 7 Jan 1999 23:05:02 -0700
Message-Id: <199901080605.XAA20239@jhereg.perl.com>
From: Tom Christiansen <tchrist@jhereg.perl.com>
To: pumpkings@jhereg.perl.com
Subject: newest version of perlfaq5.pod
Date: Thu, 7 Jan 1999 23:05:02 -0700
Message-Id: <199901080605.XAA20241@jhereg.perl.com>
From: Tom Christiansen <tchrist@jhereg.perl.com>
To: pumpkings@jhereg.perl.com
Subject: newest version of perlfaq6.pod
Date: Thu, 7 Jan 1999 23:05:02 -0700
Message-Id: <199901080605.XAA20243@jhereg.perl.com>
From: Tom Christiansen <tchrist@jhereg.perl.com>
To: pumpkings@jhereg.perl.com
Subject: newest version of perlfaq7.pod
Date: Thu, 7 Jan 1999 23:05:03 -0700
Message-Id: <199901080605.XAA20245@jhereg.perl.com>
From: Tom Christiansen <tchrist@jhereg.perl.com>
To: pumpkings@jhereg.perl.com
Subject: newest version of perlfaq8.pod
Date: Thu, 7 Jan 1999 23:05:03 -0700
Message-Id: <199901080605.XAA20257@jhereg.perl.com>
From: Tom Christiansen <tchrist@jhereg.perl.com>
To: pumpkings@jhereg.perl.com
Subject: newest version of perlfaq9.pod
Date: Thu, 7 Jan 1999 23:05:03 -0700
p4raw-id: //depot/cfgperl@2588
Diffstat (limited to 'pod/perlfaq5.pod')
-rw-r--r-- | pod/perlfaq5.pod | 175 |
1 files changed, 117 insertions, 58 deletions
diff --git a/pod/perlfaq5.pod b/pod/perlfaq5.pod
index 3e1103b2a4..119ffa4103 100644
--- a/pod/perlfaq5.pod
+++ b/pod/perlfaq5.pod
@@ -1,6 +1,6 @@
 =head1 NAME
 
-perlfaq5 - Files and Formats ($Revision: 1.24 $, $Date: 1998/07/05 15:07:20 $)
+perlfaq5 - Files and Formats ($Revision: 1.34 $, $Date: 1999/01/08 05:46:13 $)
 
 =head1 DESCRIPTION
 
@@ -78,12 +78,15 @@ See L<perlfaq9> for other examples of fetching URLs over the web.
 
 =head2 How do I change one line in a file/delete a line in a file/insert a line in the middle of a file/append to the beginning of a file?
 
+Those are operations of a text editor. Perl is not a text editor.
+Perl is a programming language. You have to decompose the problem into
+low-level calls to read, write, open, close, and seek.
+
 Although humans have an easy time thinking of a text file as being a
-sequence of lines that operates much like a stack of playing cards --
-or punch cards -- computers usually see the text file as a sequence of
-bytes. In general, there's no direct way for Perl to seek to a
-particular line of a file, insert text into a file, or remove text
-from a file.
+sequence of lines that operates much like a stack of playing cards -- or
+punch cards -- computers usually see the text file as a sequence of bytes.
+In general, there's no direct way for Perl to seek to a particular line
+of a file, insert text into a file, or remove text from a file.
 
 (There are exceptions in special circumstances. You can add or remove at
 the very end of the file. Another is replacing a sequence of bytes with
@@ -97,7 +100,7 @@ no locking.
 
     $old = $file;
     $new = "$file.tmp.$$";
-    $bak = "$file.bak";
+    $bak = "$file.orig";
 
     open(OLD, "< $old") or die "can't open $old: $!";
     open(NEW, "> $new") or die "can't open $new: $!";
@@ -124,7 +127,7 @@ platform-specific documentation that came with your port.
     perl -pi -e 's/(^\s+test\s+)\d+/ $1 . ++$count /e' t/op/taint.t
 
     # form a script
-    local($^I, @ARGV) = ('.bak', glob("*.c"));
+    local($^I, @ARGV) = ('.orig', glob("*.c"));
     while (<>) {
         if ($. == 1) {
             print "This line should appear at the top of each file\n";
@@ -174,9 +177,9 @@ Use the C<new_tmpfile> class method from the IO::File module to get a
 filehandle opened for reading and writing. Use this if you don't
 need to know the file's name.
 
-    use IO::File;
+    use IO::File;
     $fh = IO::File->new_tmpfile()
-        or die "Unable to make new temporary file: $!";
+        or die "Unable to make new temporary file: $!";
 
 Or you can use the C<tmpnam> function from the POSIX module to get a
 filename that you then open yourself. Use this if you do need to know
@@ -222,7 +225,7 @@ one process, use a counter:
 =head2 How can I manipulate fixed-record-length files?
 
 The most efficient way is using pack() and unpack(). This is faster than
-using substr() when take many, many strings. It is slower for just a few.
+using substr() when taking many, many strings. It is slower for just a few.
 
 Here is a sample chunk of code to break up and put back together again
 some fixed-format input lines, in this case from the output of a normal,
@@ -289,10 +292,10 @@ pair to make it easy to sort the hash in insertion order.
     }
 
 For passing filehandles to functions, the easiest way is to
-prefer them with a star, as in func(*STDIN).
-See L<perlfaq7/"Passing Filehandles"> for details.
+preface them with a star, as in func(*STDIN). See L<perlfaq7/"Passing
+Filehandles"> for details.
 
-If you want to create many, anonymous handles, you should check out the
+If you want to create many anonymous handles, you should check out the
 Symbol, FileHandle, or IO::Handle (etc.) modules. Here's the equivalent
 code with Symbol::gensym, which is reasonably light-weight:
 
@@ -303,8 +306,8 @@ code with Symbol::gensym, which is reasonably light-weight:
         $file{$filename} = [ $i++, $fh ];
     }
 
-Or here using the semi-object-oriented FileHandle, which certainly isn't
-light-weight:
+Or here using the semi-object-oriented FileHandle module, which certainly
+isn't light-weight:
 
     use FileHandle;
 
@@ -344,7 +347,7 @@ Then use any of those as you would a normal filehandle. Anywhere that
 Perl is expecting a filehandle, an indirect filehandle may be used
 instead. An indirect filehandle is just a scalar variable that contains
 a filehandle. Functions like C<print>, C<open>, C<seek>, or
-the C<E<lt>FHE<gt>> diamond operator will accept either a real filehandle
+the C<E<lt>FHE<gt>> diamond operator will accept either a read filehandle
 or a scalar variable containing one:
 
     ($ifh, $ofh, $efh) = (*STDIN, *STDOUT, *STDERR);
@@ -422,7 +425,7 @@ techniques to make it possible for the intrepid hacker.
 
 =head2 How can I write() into a string?
 
-See L<perlform> for an swrite() function.
+See L<perlform/"Accessing Formatting Internals"> for an swrite() function.
 
 =head2 How can I output my numbers with commas added?
 
@@ -430,7 +433,7 @@ This one will do it for you:
 
     sub commify {
         local $_ = shift;
-        1 while s/^(-?\d+)(\d{3})/$1,$2/;
+        1 while s/^([-+]?\d+)(\d{3})/$1,$2/;
         return $_;
     }
 
@@ -441,7 +444,7 @@ This one will do it for you:
 
 You can't just:
 
-    s/^(-?\d+)(\d{3})/$1,$2/g;
+    s/^([-+]?\d+)(\d{3})/$1,$2/g;
 
 because you have to put the comma in and then recalculate your
 position.
@@ -455,7 +458,7 @@ whatever:
         my $input = shift;
         $input = reverse $input;
         $input =~ s<(\d\d\d)(?=\d)(?!\d*\.)><$1,>g;
-        return reverse $input;
+        return scalar reverse $input;
     }
 
 =head2 How can I translate tildes (~) in a filename?
@@ -547,7 +550,9 @@ be an atomic operation over NFS. That is, two processes might both
 successful create or unlink the same file! Therefore O_EXCL
 isn't so exclusive as you might wish.
 
-=head2 Why do I sometimes get an "Argument list too long" when I use <*>?
+See also the new L<perlopentut> if you have it (new for 5.006).
+
+=head2 Why do I sometimes get an "Argument list too long" when I use E<lt>*E<gt>?
 
 The C<E<lt>E<gt>> operator performs a globbing operation (see above).
 By default glob() forks csh(1) to do the actual glob expansion, but
@@ -555,9 +560,9 @@ csh can't handle more than 127 items and so gives the error message
 C<Argument list too long>. People who installed tcsh as csh won't
 have this problem, but their users may be surprised by it.
 
-To get around this, either do the glob yourself with C<Dirhandle>s and
+To get around this, either do the glob yourself with readdir() and
 patterns, or use a module like Glob::KGlob, one that doesn't use the
-shell to do globbing.
+shell to do globbing. This is expected to be fixed soon.
 
 =head2 Is there a leak/bug in glob()?
 
@@ -576,15 +581,28 @@ trailing null byte on the name to make perl leave it alone:
 
     sub safe_filename {
         local $_ = shift;
-        return m#^/#
-            ? "$_\0"
-            : "./$_\0";
+        s#^([^./])#./$1#;
+        $_ .= "\0";
+        return $_;
     }
 
-    $fn = safe_filename("<<<something really wicked ");
-    open(FH, "> $fn") or "couldn't open $fn: $!";
+    $badpath = "<<<something really wicked ";
+    $fn = safe_filename($badpath");
+    open(FH, "> $fn") or "couldn't open $badpath: $!";
+
+This assumes that you are using POSIX (portable operating systems
+interface) paths. If you are on a closed, non-portable, proprietary
+system, you may have to adjust the C<"./"> above.
+
+It would be a lot clearer to use sysopen(), though:
+
+    use Fcntl;
+    $badpath = "<<<something really wicked ";
+    open (FH, $badpath, O_WRONLY | O_CREAT | O_TRUNC)
+        or die "can't open $badpath: $!";
 
-You could also use the sysopen() function (see L<perlfunc/sysopen>).
+For more information, see also the new L<perlopentut> if you have it
+(new for 5.006).
 
 =head2 How can I reliably rename a file?
 
@@ -601,7 +619,7 @@ then delete the old one. This isn't really the same semantics as a
 real rename(), though, which preserves metainformation like
 permissions, timestamps, inode info, etc.
 
-The newer version of File::Copy export a move() function.
+The newer version of File::Copy exports a move() function.
 
 =head2 How can I lock a file?
 
@@ -631,9 +649,12 @@ build Perl.
 See the flock entry of L<perlfunc>, and the F<INSTALL> file in
 the source distribution for information on building Perl to do this.
 
+For more information on file locking, see also L<perlopentut/"File
+Locking"> if you have it (new for 5.006).
+
 =back
 
-=head2 What can't I just open(FH, ">file.lock")?
+=head2 Why can't I just open(FH, ">file.lock")?
 
 A common bit of code B<NOT TO USE> is this:
 
@@ -649,7 +670,7 @@ atomic test-and-set instruction. In theory, this "ought" to work:
 
 except that lamentably, file creation (and deletion) is not atomic
 over NFS, so this won't work (at least, not every time) over the net.
-Various schemes involving involving link() have been suggested, but
+Various schemes involving link() have been suggested, but
 these tend to involve busy-wait, which is also subdesirable.
 
 =head2 I still don't get locking. I just want to increment the number in the file. How can I do this?
@@ -661,14 +682,15 @@ It's more realistic.
 
 Anyway, this is what you can do if you can't help yourself.
 
-    use Fcntl;
+    use Fcntl ':flock';
     sysopen(FH, "numfile", O_RDWR|O_CREAT) or die "can't open numfile: $!";
-    flock(FH, 2) or die "can't flock numfile: $!";
+    flock(FH, LOCK_EX) or die "can't flock numfile: $!";
     $num = <FH> || 0;
     seek(FH, 0, 0) or die "can't rewind numfile: $!";
     truncate(FH, 0) or die "can't truncate numfile: $!";
     (print FH $num+1, "\n") or die "can't write numfile: $!";
-    # DO NOT UNLOCK THIS UNTIL YOU CLOSE
+    # Perl as of 5.004 automatically flushes before unlocking
+    flock(FH, LOCK_UN) or die "can't flock numfile: $!";
     close FH or die "can't close numfile: $!";
 
 Here's a much better web-page hit counter:
@@ -693,7 +715,7 @@ like this:
     seek(FH, $recno * $RECSIZE, 0);
     read(FH, $record, $RECSIZE) == $RECSIZE || die "can't read record $recno: $!";
     # munge the record
-    seek(FH, $recno * $RECSIZE, 0);
+    seek(FH, -$RECSIZE, 1);
     print FH $record;
     close FH;
 
@@ -720,12 +742,15 @@ Here's an example:
 If you prefer something more legible, use the File::stat module
 (part of the standard distribution in version 5.004 and later):
 
+    # error checking left as an exercise for reader.
     use File::stat;
     use Time::localtime;
     $date_string = ctime(stat($file)->mtime);
     print "file $file updated at $date_string\n";
 
-Error checking is left as an exercise for the reader.
+The POSIX::strftime() approach has the benefit of being,
+in theory, independent of the current locale. See L<perllocale>
+for details.
 
 =head2 How do I set a file's timestamp in perl?
 
@@ -741,7 +766,7 @@ of them.
     ($atime, $mtime) = (stat($timestamp))[8,9];
     utime $atime, $mtime, @ARGV;
 
-Error checking is left as an exercise for the reader.
+Error checking is, as usual, left as an exercise for the reader.
 
 Note that utime() currently doesn't work correctly with Win95/NT
 ports. A bug has been reported. Check it carefully before using
@@ -774,11 +799,14 @@ than the stock version.
 
 =head2 How can I read in a file by paragraphs?
 
-Use the C<$\> variable (see L<perlvar> for details). You can either
+Use the C<$/> variable (see L<perlvar> for details). You can either
 set it to C<""> to eliminate empty paragraphs (C<"abc\n\n\n\ndef">,
 for instance, gets treated as two paragraphs and not three), or
 C<"\n\n"> to accept empty paragraphs.
 
+Note that a blank line must have no blanks in it. Thus C<"fred\n
+\nstuff\n\n"> is one paragraph, but C<"fred\n\nstuff\n\n"> is two.
+
 =head2 How can I read a single character from a file? From the keyboard?
 
 You can use the builtin C<getc()> function for most filehandles, but
@@ -786,8 +814,9 @@ it won't (easily) work on a terminal device. For STDIN, either use
 the Term::ReadKey module from CPAN, or use the sample code in
 L<perlfunc/getc>.
 
-If your system supports POSIX, you can use the following code, which
-you'll note turns off echo processing as well.
+If your system supports the portable operating system programming
+interface (POSIX), you can use the following code, which you'll note
+turns off echo processing as well.
 
     #!/usr/bin/perl -w
     use strict;
@@ -838,7 +867,8 @@ you'll note turns off echo processing as well.
 
     END { cooked() }
 
-The Term::ReadKey module from CPAN may be easier to use:
+The Term::ReadKey module from CPAN may be easier to use. Recent version
+include also support for non-portable systems as well.
 
     use Term::ReadKey;
     open(TTY, "</dev/tty");
@@ -849,7 +879,7 @@ The Term::ReadKey module from CPAN may be easier to use:
     printf "\nYou said %s, char number %03d\n",
         $key, ord $key;
 
-For DOS systems, Dan Carson <dbc@tc.fluke.COM> reports the following:
+For legacy DOS systems, Dan Carson <dbc@tc.fluke.COM> reports the following:
 
 To put the PC in "raw" mode, use ioctl with some magic numbers gleaned
 from msdos.c (Perl source file) and Ralf Brown's interrupt list (comes
@@ -895,11 +925,12 @@ table:
 This is all trial and error I did a long time ago, I hope I'm reading the
 file that worked.
 
-=head2 How can I tell if there's a character waiting on a filehandle?
+=head2 How can I tell whether there's a character waiting on a filehandle?
 
 The very first thing you should do is look into getting the Term::ReadKey
-extension from CPAN. It now even has limited support for closed, proprietary
-(read: not open systems, not POSIX, not Unix, etc) systems.
+extension from CPAN. As we mentioned earlier, it now even has limited
+support for non-portable (read: not open systems, closed, proprietary,
+not POSIX, not Unix, etc) systems.
 
 You should also check out the Frequently Asked Questions list in
 comp.unix.* for things like this: the answer is essentially the same.
@@ -912,12 +943,11 @@ systems:
         return $nfd = select($rin,undef,undef,0);
     }
 
-If you want to find out how many characters are waiting,
-there's also the FIONREAD ioctl call to be looked at.
-
-The I<h2ph> tool that comes with Perl tries to convert C include
-files to Perl code, which can be C<require>d. FIONREAD ends
-up defined as a function in the I<sys/ioctl.ph> file:
+If you want to find out how many characters are waiting, there's
+also the FIONREAD ioctl call to be looked at. The I<h2ph> tool that
+comes with Perl tries to convert C include files to Perl code, which
+can be C<require>d. FIONREAD ends up defined as a function in the
+I<sys/ioctl.ph> file:
 
     require 'sys/ioctl.ph';
 
@@ -939,7 +969,7 @@ Or write a small C program using the editor of champions:
         printf("%#08x\n", FIONREAD);
     }
     ^D
-    % cc -o fionread fionread
+    % cc -o fionread fionread.c
     % ./fionread
     0x4004667f
 
@@ -980,6 +1010,8 @@ the clearerr() method, which can remove the end of file condition on a
 filehandle. The method: read until end of file, clearerr(), read some
 more. Lather, rinse, repeat.
 
+There's also a File::Tail module from CPAN.
+
 =head2 How do I dup() a filehandle in Perl?
 
 If you check L<perlfunc/open>, you'll see that several of the ways
@@ -1018,19 +1050,22 @@ Remember that within double quoted strings ("like\this"), the
 backslash is an escape character. The full list of these is in
 L<perlop/Quote and Quote-like Operators>. Unsurprisingly, you don't
 have a file called "c:(tab)emp(formfeed)oo" or
-"c:(tab)emp(formfeed)oo.exe" on your DOS filesystem.
+"c:(tab)emp(formfeed)oo.exe" on your legacy DOS filesystem.
 
 Either single-quote your strings, or (preferably) use forward slashes.
 Since all DOS and Windows versions since something like MS-DOS 2.0 or so
 have treated C</> and C<\> the same in a path, you might as well use the
 one that doesn't clash with Perl -- or the POSIX shell, ANSI C and C++,
-awk, Tcl, Java, or Python, just to mention a few.
+awk, Tcl, Java, or Python, just to mention a few. POSIX paths
+are more portable, too.
 
 =head2 Why doesn't glob("*.*") get all the files?
 
 Because even on non-Unix ports, Perl's glob function follows standard
 Unix globbing semantics. You'll need C<glob("*")> to get all (non-hidden)
-files. This makes glob() portable.
+files. This makes glob() portable even to legacy systems. Your
+port may include proprietary globbing functions as well. Check its
+documentation for details.
 
 =head2 Why does Perl let me delete read-only files? Why does C<-i> clobber protected files? Isn't this a bug in Perl?
 
@@ -1057,9 +1092,32 @@ This has a significant advantage in space over reading the whole
 file in. A simple proof by induction is available upon request if you
 doubt its correctness.
 
+=head2 Why do I get weird spaces when I print an array of lines?
+
+Saying
+
+    print "@lines\n";
+
+joins together the elements of C<@lines> with a space between them.
+If C<@lines> were C<("little", "fluffy", "clouds")> then the above
+statement would print:
+
+    little fluffy clouds
+
+but if each element of C<@lines> was a line of text, ending a newline
+character C<("little\n", "fluffy\n", "clouds\n")> then it would print:
+
+    little
+     fluffy
+     clouds
+
+If your array contains lines, just print them:
+
+    print @lines;
+
 =head1 AUTHOR AND COPYRIGHT
 
-Copyright (c) 1997, 1998 Tom Christiansen and Nathan Torkington.
+Copyright (c) 1997-1999 Tom Christiansen and Nathan Torkington.
 All rights reserved.
 
 When included as an integrated part of the Standard Distribution
@@ -1072,3 +1130,4 @@ domain. You are permitted and encouraged to use this code and any
 derivatives thereof in your own programs for fun or for profit as you
 see fit. A simple comment in the code giving credit to the FAQ would
 be courteous but is not required.
+
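The two commify hunks in this patch (the `[-+]?` character class and the `scalar reverse` return) can be exercised with a minimal sketch like the one below. It copies the post-patch code from the diff. The name `commify_all` is hypothetical, chosen here only so both subs can live in one file (the FAQ calls both `commify`), and the sample inputs are made up for illustration.

```perl
use strict;
use warnings;

# First variant, as it reads after the patch: [-+]? accepts a leading
# plus sign as well as a minus sign.
sub commify {
    local $_ = shift;
    1 while s/^([-+]?\d+)(\d{3})/$1,$2/;
    return $_;
}

# Second variant, renamed commify_all (hypothetical name) for this sketch.
# reverse() in list context reverses the argument list rather than the
# string, so "scalar" forces string reversal whatever the caller's context.
sub commify_all {
    my $input = shift;
    $input = reverse $input;
    $input =~ s<(\d\d\d)(?=\d)(?!\d*\.)><$1,>g;
    return scalar reverse $input;
}

print commify("+1234567"), "\n";                    # +1,234,567
print commify("-23659019423.2331"), "\n";           # -23,659,019,423.2331
print commify_all("total 1234567.891011"), "\n";    # total 1,234,567.891011
```

Without `scalar`, the last `print` would call `commify_all` in list context and show the string still reversed, which appears to be the behavior the `return scalar reverse $input;` hunk guards against.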