author: Tom Christiansen <tchrist@perl.com> 1999-01-07 16:05:02 -0700
committer: Jarkko Hietaniemi <jhi@iki.fi> 1999-01-08 11:51:52 +0000
commit: 65acb1b1d672587d3a0d073613a475584830e38e
tree: fcb09719fada1c9453493712a798b889dd89b086 /pod/perlfaq5.pod
parent: ae83f3772b2dd371e676035c6714025e89d7e08f
FAQ jumbo patch from tchrist.
Message-Id: <199901080605.XAA20229@jhereg.perl.com> To: pumpkings@jhereg.perl.com Subject: newest version of perlfaq.pod Date: Thu, 7 Jan 1999 23:05:02 -0700
Message-Id: <199901080605.XAA20231@jhereg.perl.com> From: Tom Christiansen <tchrist@jhereg.perl.com> To: pumpkings@jhereg.perl.com Subject: newest version of perlfaq1.pod Date: Thu, 7 Jan 1999 23:05:02 -0700
Message-Id: <199901080605.XAA20233@jhereg.perl.com> From: Tom Christiansen <tchrist@jhereg.perl.com> To: pumpkings@jhereg.perl.com Subject: newest version of perlfaq2.pod Date: Thu, 7 Jan 1999 23:05:02 -0700
Message-Id: <199901080605.XAA20235@jhereg.perl.com> From: Tom Christiansen <tchrist@jhereg.perl.com> To: pumpkings@jhereg.perl.com Subject: newest version of perlfaq3.pod Date: Thu, 7 Jan 1999 23:05:02 -0700
Message-Id: <199901080605.XAA20237@jhereg.perl.com> From: Tom Christiansen <tchrist@jhereg.perl.com> To: pumpkings@jhereg.perl.com Subject: newest version of perlfaq4.pod Date: Thu, 7 Jan 1999 23:05:02 -0700
Message-Id: <199901080605.XAA20239@jhereg.perl.com> From: Tom Christiansen <tchrist@jhereg.perl.com> To: pumpkings@jhereg.perl.com Subject: newest version of perlfaq5.pod Date: Thu, 7 Jan 1999 23:05:02 -0700
Message-Id: <199901080605.XAA20241@jhereg.perl.com> From: Tom Christiansen <tchrist@jhereg.perl.com> To: pumpkings@jhereg.perl.com Subject: newest version of perlfaq6.pod Date: Thu, 7 Jan 1999 23:05:02 -0700
Message-Id: <199901080605.XAA20243@jhereg.perl.com> From: Tom Christiansen <tchrist@jhereg.perl.com> To: pumpkings@jhereg.perl.com Subject: newest version of perlfaq7.pod Date: Thu, 7 Jan 1999 23:05:03 -0700
Message-Id: <199901080605.XAA20245@jhereg.perl.com> From: Tom Christiansen <tchrist@jhereg.perl.com> To: pumpkings@jhereg.perl.com Subject: newest version of perlfaq8.pod Date: Thu, 7 Jan 1999 23:05:03 -0700
Message-Id: <199901080605.XAA20257@jhereg.perl.com> From: Tom Christiansen <tchrist@jhereg.perl.com> To: pumpkings@jhereg.perl.com Subject: newest version of perlfaq9.pod Date: Thu, 7 Jan 1999 23:05:03 -0700
p4raw-id: //depot/cfgperl@2588
Diffstat (limited to 'pod/perlfaq5.pod')
-rw-r--r-- pod/perlfaq5.pod 175
1 files changed, 117 insertions, 58 deletions
diff --git a/pod/perlfaq5.pod b/pod/perlfaq5.pod
index 3e1103b2a4..119ffa4103 100644
--- a/pod/perlfaq5.pod
+++ b/pod/perlfaq5.pod
@@ -1,6 +1,6 @@
=head1 NAME
-perlfaq5 - Files and Formats ($Revision: 1.24 $, $Date: 1998/07/05 15:07:20 $)
+perlfaq5 - Files and Formats ($Revision: 1.34 $, $Date: 1999/01/08 05:46:13 $)
=head1 DESCRIPTION
@@ -78,12 +78,15 @@ See L<perlfaq9> for other examples of fetching URLs over the web.
=head2 How do I change one line in a file/delete a line in a file/insert a line in the middle of a file/append to the beginning of a file?
+Those are operations of a text editor. Perl is not a text editor.
+Perl is a programming language. You have to decompose the problem into
+low-level calls to read, write, open, close, and seek.
+
Although humans have an easy time thinking of a text file as being a
-sequence of lines that operates much like a stack of playing cards --
-or punch cards -- computers usually see the text file as a sequence of
-bytes. In general, there's no direct way for Perl to seek to a
-particular line of a file, insert text into a file, or remove text
-from a file.
+sequence of lines that operates much like a stack of playing cards -- or
+punch cards -- computers usually see the text file as a sequence of bytes.
+In general, there's no direct way for Perl to seek to a particular line
+of a file, insert text into a file, or remove text from a file.
(There are exceptions in special circumstances. You can add or remove at
the very end of the file. Another is replacing a sequence of bytes with
@@ -97,7 +100,7 @@ no locking.
$old = $file;
$new = "$file.tmp.$$";
- $bak = "$file.bak";
+ $bak = "$file.orig";
open(OLD, "< $old") or die "can't open $old: $!";
open(NEW, "> $new") or die "can't open $new: $!";
@@ -124,7 +127,7 @@ platform-specific documentation that came with your port.
perl -pi -e 's/(^\s+test\s+)\d+/ $1 . ++$count /e' t/op/taint.t
# from a script
- local($^I, @ARGV) = ('.bak', glob("*.c"));
+ local($^I, @ARGV) = ('.orig', glob("*.c"));
while (<>) {
if ($. == 1) {
print "This line should appear at the top of each file\n";
@@ -174,9 +177,9 @@ Use the C<new_tmpfile> class method from the IO::File module to get a
filehandle opened for reading and writing. Use this if you don't
need to know the file's name.
- use IO::File;
+ use IO::File;
$fh = IO::File->new_tmpfile()
- or die "Unable to make new temporary file: $!";
+ or die "Unable to make new temporary file: $!";
Or you can use the C<tmpnam> function from the POSIX module to get a
filename that you then open yourself. Use this if you do need to know
@@ -222,7 +225,7 @@ one process, use a counter:
=head2 How can I manipulate fixed-record-length files?
The most efficient way is using pack() and unpack(). This is faster than
-using substr() when take many, many strings. It is slower for just a few.
+using substr() when taking many, many strings. It is slower for just a few.
Here is a sample chunk of code to break up and put back together again
some fixed-format input lines, in this case from the output of a normal,
@@ -289,10 +292,10 @@ pair to make it easy to sort the hash in insertion order.
}
For passing filehandles to functions, the easiest way is to
-prefer them with a star, as in func(*STDIN).
-See L<perlfaq7/"Passing Filehandles"> for details.
+preface them with a star, as in func(*STDIN). See L<perlfaq7/"Passing
+Filehandles"> for details.
-If you want to create many, anonymous handles, you should check out the
+If you want to create many anonymous handles, you should check out the
Symbol, FileHandle, or IO::Handle (etc.) modules. Here's the equivalent
code with Symbol::gensym, which is reasonably light-weight:
@@ -303,8 +306,8 @@ code with Symbol::gensym, which is reasonably light-weight:
$file{$filename} = [ $i++, $fh ];
}
-Or here using the semi-object-oriented FileHandle, which certainly isn't
-light-weight:
+Or here using the semi-object-oriented FileHandle module, which certainly
+isn't light-weight:
use FileHandle;
@@ -344,7 +347,7 @@ Then use any of those as you would a normal filehandle. Anywhere that
Perl is expecting a filehandle, an indirect filehandle may be used
instead. An indirect filehandle is just a scalar variable that contains
a filehandle. Functions like C<print>, C<open>, C<seek>, or
-the C<E<lt>FHE<gt>> diamond operator will accept either a real filehandle
+the C<E<lt>FHE<gt>> diamond operator will accept either a read filehandle
or a scalar variable containing one:
($ifh, $ofh, $efh) = (*STDIN, *STDOUT, *STDERR);
@@ -422,7 +425,7 @@ techniques to make it possible for the intrepid hacker.
=head2 How can I write() into a string?
-See L<perlform> for an swrite() function.
+See L<perlform/"Accessing Formatting Internals"> for an swrite() function.
=head2 How can I output my numbers with commas added?
@@ -430,7 +433,7 @@ This one will do it for you:
sub commify {
local $_ = shift;
- 1 while s/^(-?\d+)(\d{3})/$1,$2/;
+ 1 while s/^([-+]?\d+)(\d{3})/$1,$2/;
return $_;
}
@@ -441,7 +444,7 @@ This one will do it for you:
You can't just:
- s/^(-?\d+)(\d{3})/$1,$2/g;
+ s/^([-+]?\d+)(\d{3})/$1,$2/g;
because you have to put the comma in and then recalculate your
position.
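The difference between the loop and the single C</g> pass can be seen in a small sketch. The names C<commify_loop> and C<commify_once> are made up for this illustration and are not part of the FAQ; the regex is the one from the patch.

```perl
use strict;
use warnings;

# Hypothetical helper: repeat the substitution until it stops matching.
sub commify_loop {
    my $n = shift;
    1 while $n =~ s/^([-+]?\d+)(\d{3})/$1,$2/;
    return $n;
}

# Hypothetical helper: the tempting single pass with /g.
# The ^ anchor only matches at the start, so at most one comma goes in.
sub commify_once {
    my $n = shift;
    $n =~ s/^([-+]?\d+)(\d{3})/$1,$2/g;
    return $n;
}

print commify_loop("1234567"), "\n";      # 1,234,567
print commify_once("1234567"), "\n";      # 1234,567
print commify_loop("-1234.5678"), "\n";   # -1,234.5678
```

Each trip through the loop restarts the match at the beginning of the string, which is the "recalculate your position" step that a single pass cannot do.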
@@ -455,7 +458,7 @@ whatever:
my $input = shift;
$input = reverse $input;
$input =~ s<(\d\d\d)(?=\d)(?!\d*\.)><$1,>g;
- return reverse $input;
+ return scalar reverse $input;
}
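The added C<scalar> matters because the expression after C<return> is evaluated in the caller's context, and C<reverse> in list context reverses the order of its arguments rather than the characters of a string. A minimal sketch, assuming the function is named C<commify> as in the earlier example:

```perl
use strict;
use warnings;

sub commify {
    my $input = shift;
    $input = reverse $input;                     # scalar assignment: characters reversed
    $input =~ s<(\d\d\d)(?=\d)(?!\d*\.)><$1,>g;  # comma after every third digit
    return scalar reverse $input;                # force string reversal in any context
}

my @list   = reverse "abc";   # list context: one-element list, unchanged
my $string = reverse "abc";   # scalar context: "cba"
print "@list $string\n";      # abc cba

# print() calls commify in list context; scalar() keeps the result correct.
print commify("1234567.891"), "\n";   # 1,234,567.891
```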
=head2 How can I translate tildes (~) in a filename?
@@ -547,7 +550,9 @@ be an atomic operation over NFS. That is, two processes might both
successfully create or unlink the same file! Therefore O_EXCL
isn't so exclusive as you might wish.
-=head2 Why do I sometimes get an "Argument list too long" when I use <*>?
+See also the new L<perlopentut> if you have it (new for 5.006).
+
+=head2 Why do I sometimes get an "Argument list too long" when I use E<lt>*E<gt>?
The C<E<lt>E<gt>> operator performs a globbing operation (see above).
By default glob() forks csh(1) to do the actual glob expansion, but
@@ -555,9 +560,9 @@ csh can't handle more than 127 items and so gives the error message
C<Argument list too long>. People who installed tcsh as csh won't
have this problem, but their users may be surprised by it.
-To get around this, either do the glob yourself with C<Dirhandle>s and
+To get around this, either do the glob yourself with readdir() and
patterns, or use a module like Glob::KGlob, one that doesn't use the
-shell to do globbing.
+shell to do globbing. This is expected to be fixed soon.
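Doing the glob yourself with readdir() and a pattern might look roughly like this. The helper name C<match_files> is hypothetical, and the sketch assumes a Perl recent enough (5.6 or later) to support lexical directory handles.

```perl
use strict;
use warnings;

# Hypothetical helper: list the entries of $dir whose names match $pattern,
# skipping dotfiles the way a shell glob would.
sub match_files {
    my ($dir, $pattern) = @_;
    opendir(my $dh, $dir) or die "can't opendir $dir: $!";
    my @names = sort grep { !/^\./ && /$pattern/ } readdir($dh);
    closedir($dh);
    return @names;
}

# Roughly what glob("*.c") would return for the current directory.
print "$_\n" for match_files(".", qr/\.c\z/);
```

No shell is forked, so the length of the list is limited only by memory.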
=head2 Is there a leak/bug in glob()?
@@ -576,15 +581,28 @@ trailing null byte on the name to make perl leave it alone:
sub safe_filename {
local $_ = shift;
- return m#^/#
- ? "$_\0"
- : "./$_\0";
+ s#^([^./])#./$1#;
+ $_ .= "\0";
+ return $_;
}
- $fn = safe_filename("<<<something really wicked ");
- open(FH, "> $fn") or "couldn't open $fn: $!";
+ $badpath = "<<<something really wicked ";
+ $fn = safe_filename($badpath);
+ open(FH, "> $fn") or die "couldn't open $badpath: $!";
+
+This assumes that you are using POSIX (portable operating systems
+interface) paths. If you are on a closed, non-portable, proprietary
+system, you may have to adjust the C<"./"> above.
+
+It would be a lot clearer to use sysopen(), though:
+
+ use Fcntl;
+ $badpath = "<<<something really wicked ";
+ sysopen(FH, $badpath, O_WRONLY | O_CREAT | O_TRUNC)
+ or die "can't open $badpath: $!";
-You could also use the sysopen() function (see L<perlfunc/sysopen>).
+For more information, see also the new L<perlopentut> if you have it
+(new for 5.006).
=head2 How can I reliably rename a file?
@@ -601,7 +619,7 @@ then delete the old one. This isn't really the same semantics as a
real rename(), though, which preserves metainformation like
permissions, timestamps, inode info, etc.
-The newer version of File::Copy export a move() function.
+The newer version of File::Copy exports a move() function.
=head2 How can I lock a file?
@@ -631,9 +649,12 @@ build Perl. See the flock entry of L<perlfunc>, and the F<INSTALL>
file in the source distribution for information on building Perl to do
this.
+For more information on file locking, see also L<perlopentut/"File
+Locking"> if you have it (new for 5.006).
+
=back
-=head2 What can't I just open(FH, ">file.lock")?
+=head2 Why can't I just open(FH, ">file.lock")?
A common bit of code B<NOT TO USE> is this:
@@ -649,7 +670,7 @@ atomic test-and-set instruction. In theory, this "ought" to work:
except that lamentably, file creation (and deletion) is not atomic
over NFS, so this won't work (at least, not every time) over the net.
-Various schemes involving involving link() have been suggested, but
+Various schemes involving link() have been suggested, but
these tend to involve busy-wait, which is also subdesirable.
=head2 I still don't get locking. I just want to increment the number in the file. How can I do this?
@@ -661,14 +682,15 @@ It's more realistic.
Anyway, this is what you can do if you can't help yourself.
- use Fcntl;
+ use Fcntl qw(:DEFAULT :flock);
sysopen(FH, "numfile", O_RDWR|O_CREAT) or die "can't open numfile: $!";
- flock(FH, 2) or die "can't flock numfile: $!";
+ flock(FH, LOCK_EX) or die "can't flock numfile: $!";
$num = <FH> || 0;
seek(FH, 0, 0) or die "can't rewind numfile: $!";
truncate(FH, 0) or die "can't truncate numfile: $!";
(print FH $num+1, "\n") or die "can't write numfile: $!";
- # DO NOT UNLOCK THIS UNTIL YOU CLOSE
+ # Perl as of 5.004 automatically flushes before unlocking
+ flock(FH, LOCK_UN) or die "can't flock numfile: $!";
close FH or die "can't close numfile: $!";
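The same steps can be wrapped in a function. This is only a sketch: the name C<bump_counter> is hypothetical, it uses a lexical filehandle (Perl 5.6 or later), and it relies on close() to flush and release the lock instead of unlocking explicitly.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock :seek);   # O_* flags, LOCK_* and SEEK_* constants

# Hypothetical helper: add one to the number stored in $path
# and return the new value.
sub bump_counter {
    my ($path) = @_;
    sysopen(my $fh, $path, O_RDWR | O_CREAT) or die "can't open $path: $!";
    flock($fh, LOCK_EX)                      or die "can't flock $path: $!";
    my $line = <$fh>;
    my $num  = (defined $line && $line =~ /^(\d+)/) ? $1 : 0;
    seek($fh, 0, SEEK_SET)                   or die "can't rewind $path: $!";
    truncate($fh, 0)                         or die "can't truncate $path: $!";
    print {$fh} $num + 1, "\n"               or die "can't write $path: $!";
    close($fh)                               or die "can't close $path: $!";
    return $num + 1;
}
```

Calling C<bump_counter("numfile")> twice on a fresh file should return 1 and then 2.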
Here's a much better web-page hit counter:
@@ -693,7 +715,7 @@ like this:
seek(FH, $recno * $RECSIZE, 0);
read(FH, $record, $RECSIZE) == $RECSIZE || die "can't read record $recno: $!";
# munge the record
- seek(FH, $recno * $RECSIZE, 0);
+ seek(FH, -$RECSIZE, 1);
print FH $record;
close FH;
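A self-contained sketch of the relative seek, run against a scratch file. The helper C<update_record>, the 8-byte record size, and the sample data are assumptions made for illustration.

```perl
use strict;
use warnings;
use Fcntl qw(:seek);
use File::Temp qw(tempfile);

my $RECSIZE = 8;

# Hypothetical helper: read record $recno, pass it through $munge, write it back.
sub update_record {
    my ($fh, $recno, $munge) = @_;
    seek($fh, $recno * $RECSIZE, SEEK_SET) or die "can't seek: $!";
    read($fh, my $record, $RECSIZE) == $RECSIZE
        or die "can't read record $recno: $!";
    $record = pack("A$RECSIZE", $munge->($record));   # keep the length fixed
    # Step back over the record just read instead of recomputing its offset.
    seek($fh, -$RECSIZE, SEEK_CUR) or die "can't seek back: $!";
    print {$fh} $record or die "can't write record $recno: $!";
}

# Build three space-padded records, then upper-case the second one.
my ($fh, $name) = tempfile(UNLINK => 1);
print {$fh} pack("A$RECSIZE", $_) for qw(alpha beta gamma);
update_record($fh, 1, sub { uc shift });

seek($fh, 0, SEEK_SET) or die "can't rewind: $!";
my $contents = do { local $/; <$fh> };
print "[$contents]\n";
```

The seek between the read and the write is required in any case, since a handle opened for both reading and writing needs a seek when switching direction.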
@@ -720,12 +742,15 @@ Here's an example:
If you prefer something more legible, use the File::stat module
(part of the standard distribution in version 5.004 and later):
+ # error checking left as an exercise for reader.
use File::stat;
use Time::localtime;
$date_string = ctime(stat($file)->mtime);
print "file $file updated at $date_string\n";
-Error checking is left as an exercise for the reader.
+The POSIX::strftime() approach has the benefit of being,
+in theory, independent of the current locale. See L<perllocale>
+for details.
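For comparison, a sketch that combines File::stat with POSIX::strftime() and an all-numeric format, which does not vary with locale. The helper name C<mtime_string> is hypothetical. Time::localtime is deliberately not loaded here, so C<localtime> is the builtin.

```perl
use strict;
use warnings;
use File::stat;
use POSIX qw(strftime);

# Hypothetical helper: format a file's modification time as YYYY-MM-DD HH:MM:SS.
sub mtime_string {
    my ($file) = @_;
    my $st = stat($file) or die "can't stat $file: $!";
    return strftime("%Y-%m-%d %H:%M:%S", localtime($st->mtime));
}

print "current directory last modified at ", mtime_string("."), "\n";
```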
=head2 How do I set a file's timestamp in perl?
@@ -741,7 +766,7 @@ of them.
($atime, $mtime) = (stat($timestamp))[8,9];
utime $atime, $mtime, @ARGV;
-Error checking is left as an exercise for the reader.
+Error checking is, as usual, left as an exercise for the reader.
Note that utime() currently doesn't work correctly with Win95/NT
ports. A bug has been reported. Check it carefully before using
@@ -774,11 +799,14 @@ than the stock version.
=head2 How can I read in a file by paragraphs?
-Use the C<$\> variable (see L<perlvar> for details). You can either
+Use the C<$/> variable (see L<perlvar> for details). You can either
set it to C<""> to eliminate empty paragraphs (C<"abc\n\n\n\ndef">,
for instance, gets treated as two paragraphs and not three), or
C<"\n\n"> to accept empty paragraphs.
+Note that a blank line must have no blanks in it. Thus C<"fred\n
+\nstuff\n\n"> is one paragraph, but C<"fred\n\nstuff\n\n"> is two.
+
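The two settings can be compared with a short sketch. The helper C<read_records> is hypothetical, and reading from an in-memory filehandle assumes Perl 5.8 or later.

```perl
use strict;
use warnings;

# Hypothetical helper: split $text into records using $separator as $/.
sub read_records {
    my ($text, $separator) = @_;
    open(my $fh, '<', \$text) or die "can't open in-memory file: $!";
    local $/ = $separator;
    my @records = <$fh>;
    close($fh);
    return @records;
}

my $text = "abc\n\n\n\ndef\n";
my @para_mode = read_records($text, "");       # runs of blank lines collapse
my @two_nl    = read_records($text, "\n\n");   # every "\n\n" ends a record

print scalar(@para_mode), "\n";   # 2
print scalar(@two_nl),    "\n";   # 3
```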
=head2 How can I read a single character from a file? From the keyboard?
You can use the builtin C<getc()> function for most filehandles, but
@@ -786,8 +814,9 @@ it won't (easily) work on a terminal device. For STDIN, either use
the Term::ReadKey module from CPAN, or use the sample code in
L<perlfunc/getc>.
-If your system supports POSIX, you can use the following code, which
-you'll note turns off echo processing as well.
+If your system supports the portable operating system programming
+interface (POSIX), you can use the following code, which you'll note
+turns off echo processing as well.
#!/usr/bin/perl -w
use strict;
@@ -838,7 +867,8 @@ you'll note turns off echo processing as well.
END { cooked() }
-The Term::ReadKey module from CPAN may be easier to use:
+The Term::ReadKey module from CPAN may be easier to use. Recent versions
+also include support for non-portable systems.
use Term::ReadKey;
open(TTY, "</dev/tty");
@@ -849,7 +879,7 @@ The Term::ReadKey module from CPAN may be easier to use:
printf "\nYou said %s, char number %03d\n",
$key, ord $key;
-For DOS systems, Dan Carson <dbc@tc.fluke.COM> reports the following:
+For legacy DOS systems, Dan Carson <dbc@tc.fluke.COM> reports the following:
To put the PC in "raw" mode, use ioctl with some magic numbers gleaned
from msdos.c (Perl source file) and Ralf Brown's interrupt list (comes
@@ -895,11 +925,12 @@ table:
This is all trial and error I did a long time ago, I hope I'm reading the
file that worked.
-=head2 How can I tell if there's a character waiting on a filehandle?
+=head2 How can I tell whether there's a character waiting on a filehandle?
The very first thing you should do is look into getting the Term::ReadKey
-extension from CPAN. It now even has limited support for closed, proprietary
-(read: not open systems, not POSIX, not Unix, etc) systems.
+extension from CPAN. As we mentioned earlier, it now even has limited
+support for non-portable (read: not open systems, closed, proprietary,
+not POSIX, not Unix, etc) systems.
You should also check out the Frequently Asked Questions list in
comp.unix.* for things like this: the answer is essentially the same.
@@ -912,12 +943,11 @@ systems:
return $nfd = select($rin,undef,undef,0);
}
-If you want to find out how many characters are waiting,
-there's also the FIONREAD ioctl call to be looked at.
-
-The I<h2ph> tool that comes with Perl tries to convert C include
-files to Perl code, which can be C<require>d. FIONREAD ends
-up defined as a function in the I<sys/ioctl.ph> file:
+If you want to find out how many characters are waiting, there's
+also the FIONREAD ioctl call to be looked at. The I<h2ph> tool that
+comes with Perl tries to convert C include files to Perl code, which
+can be C<require>d. FIONREAD ends up defined as a function in the
+I<sys/ioctl.ph> file:
require 'sys/ioctl.ph';
@@ -939,7 +969,7 @@ Or write a small C program using the editor of champions:
printf("%#08x\n", FIONREAD);
}
^D
- % cc -o fionread fionread
+ % cc -o fionread fionread.c
% ./fionread
0x4004667f
@@ -980,6 +1010,8 @@ the clearerr() method, which can remove the end of file condition on a
filehandle. The method: read until end of file, clearerr(), read some
more. Lather, rinse, repeat.
+There's also a File::Tail module from CPAN.
+
=head2 How do I dup() a filehandle in Perl?
If you check L<perlfunc/open>, you'll see that several of the ways
@@ -1018,19 +1050,22 @@ Remember that within double quoted strings ("like\this"), the
backslash is an escape character. The full list of these is in
L<perlop/Quote and Quote-like Operators>. Unsurprisingly, you don't
have a file called "c:(tab)emp(formfeed)oo" or
-"c:(tab)emp(formfeed)oo.exe" on your DOS filesystem.
+"c:(tab)emp(formfeed)oo.exe" on your legacy DOS filesystem.
Either single-quote your strings, or (preferably) use forward slashes.
Since all DOS and Windows versions since something like MS-DOS 2.0 or so
have treated C</> and C<\> the same in a path, you might as well use the
one that doesn't clash with Perl -- or the POSIX shell, ANSI C and C++,
-awk, Tcl, Java, or Python, just to mention a few.
+awk, Tcl, Java, or Python, just to mention a few. POSIX paths
+are more portable, too.
=head2 Why doesn't glob("*.*") get all the files?
Because even on non-Unix ports, Perl's glob function follows standard
Unix globbing semantics. You'll need C<glob("*")> to get all (non-hidden)
-files. This makes glob() portable.
+files. This makes glob() portable even to legacy systems. Your
+port may include proprietary globbing functions as well. Check its
+documentation for details.
=head2 Why does Perl let me delete read-only files? Why does C<-i> clobber protected files? Isn't this a bug in Perl?
@@ -1057,9 +1092,32 @@ This has a significant advantage in space over reading the whole
file in. A simple proof by induction is available upon
request if you doubt its correctness.
+=head2 Why do I get weird spaces when I print an array of lines?
+
+Saying
+
+ print "@lines\n";
+
+joins together the elements of C<@lines> with a space between them.
+If C<@lines> were C<("little", "fluffy", "clouds")> then the above
+statement would print:
+
+ little fluffy clouds
+
+but if each element of C<@lines> was a line of text, ending in a newline
+character C<("little\n", "fluffy\n", "clouds\n")> then it would print:
+
+ little
+ fluffy
+ clouds
+
+If your array contains lines, just print them:
+
+ print @lines;
+
=head1 AUTHOR AND COPYRIGHT
-Copyright (c) 1997, 1998 Tom Christiansen and Nathan Torkington.
+Copyright (c) 1997-1999 Tom Christiansen and Nathan Torkington.
All rights reserved.
When included as an integrated part of the Standard Distribution
@@ -1072,3 +1130,4 @@ domain. You are permitted and encouraged to use this code and any
derivatives thereof in your own programs for fun or for profit as you
see fit. A simple comment in the code giving credit to the FAQ would
be courteous but is not required.
+