diff options
author | Tom Christiansen <tchrist@perl.com> | 1999-01-09 01:13:18 -0700 |
---|---|---|
committer | Jarkko Hietaniemi <jhi@iki.fi> | 1999-01-14 12:16:14 +0000 |
commit | f828431348b2bbf6fe06182e862634247523af66 (patch) | |
tree | 91a1eb1eddf3debbb6bd53192465af6585376faf | |
parent | 2f0242eb70a44bece80969fa65b9c9aa17ee6415 (diff) | |
download | perl-f828431348b2bbf6fe06182e862634247523af66.tar.gz |
perlopentut.pod
To: pumpkings@jhereg.perl.com
Message-Id: <199901091513.IAA17512@jhereg.perl.com>
p4raw-id: //depot/cfgperl@2611
-rw-r--r-- | MANIFEST | 1 | ||||
-rw-r--r-- | pod/perl.pod | 1 | ||||
-rw-r--r-- | pod/perldelta.pod | 4 | ||||
-rw-r--r-- | pod/perlopentut.pod | 862 | ||||
-rw-r--r-- | pod/roffitall | 1 |
5 files changed, 868 insertions, 1 deletions
@@ -950,6 +950,7 @@ pod/perlmodinstall.pod Installing CPAN Modules pod/perlmodlib.pod Module policy info pod/perlobj.pod Object info pod/perlop.pod Operator info +pod/perlopentut.pod Object info pod/perlpod.pod Pod info pod/perlport.pod Portability guide pod/perlre.pod Regular expression info diff --git a/pod/perl.pod b/pod/perl.pod index 1b886d01e0..ae81c7d3d6 100644 --- a/pod/perl.pod +++ b/pod/perl.pod @@ -33,6 +33,7 @@ of sections: perlfunc Perl builtin functions perlvar Perl predefined variables perlsub Perl subroutines + perlopentut Perl opening things tutorial perlmod Perl modules: how they work perlmodlib Perl modules: how to write and use perlmodinstall Perl modules: how to install from CPAN diff --git a/pod/perldelta.pod b/pod/perldelta.pod index b0d0b837f4..bd2b715605 100644 --- a/pod/perldelta.pod +++ b/pod/perldelta.pod @@ -123,7 +123,9 @@ Todo. =head1 Documentation Changes -Todo. +perlopentut, tutorial on opening things in Perl, was added. + +perlreftut, tutorial on references, was added. =head1 New Diagnostics diff --git a/pod/perlopentut.pod b/pod/perlopentut.pod new file mode 100644 index 0000000000..6e6091ab49 --- /dev/null +++ b/pod/perlopentut.pod @@ -0,0 +1,862 @@ +=head1 NAME + +perlopentut - tutorial on opening things in Perl + +=head1 DESCRIPTION + +Perl has two simple, built-in ways to open files: the shell way for +convenience, and the C way for precision. The choice is yours. + +=head1 Open E<agrave> la shell + +Perl's C<open> function was designed to mimic the way command-line +redirection in the shell works. Here are some basic examples +from the shell: + + $ myprogram file1 file2 file3 + $ myprogram < inputfile + $ myprogram > outputfile + $ myprogram >> outputfile + $ myprogram | otherprogram + $ otherprogram | myprogram + +And here are some more advanced examples: + + $ otherprogram | myprogram f1 - f2 + $ otherprogram 2>&1 | myprogram - + $ myprogram <&3 + $ myprogram >&4 + +Programmers accustomed to constructs like those above can take comfort +in learning that Perl directly supports these familiar constructs using +virtually the same syntax as the shell. + +=head2 Simple Opens + +The C<open> function takes two arguments: the first is a filehandle, +and the second is a single string comprising both what to open and how +to open it. C<open> returns true when it works, and when it fails, +returns a false value and sets the special variable $! to reflect +the system error. If the filehandle was previously opened, it will +be implicitly closed first. + +For example: + + open(INFO, "datafile") || die("can't open datafile: $!"); + open(INFO, "< datafile") || die("can't open datafile: $!"); + open(RESULTS,"> runstats") || die("can't open runstats: $!"); + open(LOG, ">> logfile ") || die("can't open logfile: $!"); + +If you prefer the low-punctuation version, you could write that this way: + + open INFO, "< datafile" or die "can't open datafile: $!"; + open RESULTS,"> runstats" or die "can't open runstats: $!"; + open LOG, ">> logfile " or die "can't open logfile: $!"; + +A few things to notice. First, the leading less-than is optional. +If omitted, Perl assumes that you want to open the file for reading. + +The other important thing to notice is that, just as in the shell, +any white space before or after the filename is ignored. This is good, +because you wouldn't want these to do different things: + + open INFO, "<datafile" + open INFO, "< datafile" + open INFO, "< datafile" + +Ignoring surround whitespace also helps for when you read a filename in +from a different file, and forget to trim it before opening: + + $filename = <INFO>; # oops, \n still there + open(EXTRA, "< $filename") || die "can't open $filename: $!"; + +This is not a bug, but a feature. Because C<open> mimics the shell in +its style of using redirection arrows to specify how to open the file, it +also does so with respect to extra white space around the filename itself +as well. For accessing files with naughty names, see L</"Dispelling +the Dweomer">. + +=head2 Pipe Opens + +In C, when you want to open a file using the standard I/O library, +you use the C<fopen> function, but when opening a pipe, you use the +C<popen> function. But in the shell, you just use a different redirection +character. That's also the case for Perl. The C<open> call +remains the same--just its argument differs. + +If the leading character is a pipe symbol, C<open) starts up a new +command and open a write-only filehandle leading into that command. +This lets you write into that handle and have what you write show up on +that command's standard input. For example: + + open(PRINTER, "| lpr -Plp1") || die "cannot fork: $!"; + print PRINTER "stuff\n"; + close(PRINTER) || die "can't close lpr: $!"; + +If the trailing character is a pipe, you start up a new command and open a +read-only filehandle leading out of that command. This lets whatever that +command writes to its standard output show up on your handle for reading. +For example: + + open(NET, "netstat -i -n |") || die "cannot fork: $!"; + while (<NET>) { } # do something with input + close(NET) || die "can't close netstat: $!"; + +What happens if you try to open a pipe to or from a non-existent command? +In most systems, such an C<open> will not return an error. That's +because in the traditional C<fork>/C<exec> model, running the other +program happens only in the forked child process, which means that +the failed C<exec> can't be reflected in the return value of C<open>. +Only a failed C<fork> shows up there. See L<perlfaq8/"Why doesn't open() +return an error when a pipe open fails?"> to see how to cope with this. +There's also an explanation in L<perlipc>. + +If you would like to open a bidirectional pipe, the IPC::Open2 +library will handle this for you. Check out L<perlipc/"Bidirectional +Communication with Another Process"> + +=head2 The Minus File + +Again following the lead of the standard shell utilities, Perl's +C<open> function treats a file whose name is a single minus, "-", in a +special way. If you open minus for reading, it really means to access +the standard input. If you open minus for writing, it really means to +access the standard output. + +If minus can be used as the default input or default output? What happens +if you open a pipe into or out of minus? What's the default command it +would run? The same script as you're current running! This is actually +a stealth C<fork> hidden inside an C<open> call. See L<perlipc/"Safe Pipe +Opens"> for details. + +=head2 Mixing Reads and Writes + +It is possible to specify both read and write access. All you do is +add a "+" symbol in front of the redirection. But as in the shell, +using a less-than on a file never creates a new file; it only opens an +existing one. On the other hand, using a greater-than always clobbers +(truncates to zero length) an existing file, or creates a brand-new one +if there isn't an old one. Adding a "+" for read-write doesn't affect +whether it only works on existing files or always clobbers existing ones. + + open(WTMP, "+< /usr/adm/wtmp") + || die "can't open /usr/adm/wtmp: $!"; + + open(SCREEN, "+> /tmp/lkscreen") + || die "can't open /tmp/lkscreen: $!"; + + open(LOGFILE, "+>> /tmp/applog" + || die "can't open /tmp/applog: $!"; + +The first one won't create a new file, and the second one will always +clobber an old one. The third one will create a new file if necessary +and not clobber an old one, and it will allow you to read at any point +in the file, but all writes will always go to the end. In short, +the first case is substantially more common than the second and third +cases, which are almost always wrong. (If you know C, the plus in +Perl's C<open> is historically derived from the one in C's fopen(3S), +which it ultimately calls.) + +In fact, when it comes to updating a file, unless you're working on +a binary file as in the WTMP case above, you probably don't want to +use this approach for updating. Instead, Perl's B<-i> flag comes to +the rescue. The following command takes all the C, C++, or yacc source +or header files and changes all their foo's to bar's, leaving +the old version in the original file name with a ".orig" tacked +on the end: + + $ perl -i.orig -pe 's/\bfoo\b/bar/g' *.[Cchy] + +This is a short cut for some renaming games that are really +the best way to update textfiles. See the second question in +L<perlfaq5> for more details. + +=head2 Filters + +One of the most common uses for C<open> is one you never +even notice. When you process the ARGV filehandle using +C<E<lt>ARGVE<gt>>, Perl actually does an implicit open +on each file in @ARGV. Thus a program called like this: + + $ myprogram file1 file2 file3 + +Can have all its files opened and processed one at a time +using a construct no more complex than: + + while (<>) { + # do something with $_ + } + +If @ARGV is empty when the loop first begins, Perl pretends you've opened +up minus, that is, the standard input. In fact, $ARGV, the currently +open file during C<E<lt>ARGVE<gt>> processing, is even set to "-" +in these circumstances. + +You are welcome to pre-process your @ARGV before starting the loop to +make sure it's to your liking. One reason to do this might be to remove +command options beginning with a minus. While you can always roll the +simple ones by hand, the Getopts modules are good for this. + + use Getopt::Std; + + # -v, -D, -o ARG, sets $opt_v, $opt_D, $opt_o + getopts("vDo:"); + + # -v, -D, -o ARG, sets $args{v}, $args{D}, $args{o} + getopts("vDo:", \%args); + +Or the standard Getopt::Long module to permit named arguments: + + use Getopt::Long; + GetOptions( "verbose" => \$verbose, # --verbose + "Debug" => \$debug, # --Debug + "output=s" => \$output ); + # --output=somestring or --output somestring + +Another reason for preprocessing arguments is to make an empty +argument list default to all files: + + @ARGV = glob("*") unless @ARGV; + +You could even filter out all but plain, text files. This is a bit +silent, of course, and you might prefer to mention them on the way. + + @ARGV = grep { -f && -T } @ARGV; + +If you're using the B<-n> or B<-p> command-line options, you +should put changes to @ARGV in a C<BEGIN{}> block. + +Remember that a normal C<open> has special properties, in that it might +call fopen(3S) or it might called popen(3S), depending on what its +argument looks like; that's why it's sometimes called "magic open". +Here's an example: + + $pwdinfo = `domainname` =~ /^(\(none\))?$/ + ? '< /etc/passwd' + : 'ypcat passwd |'; + + open(PWD, $pwdinfo) + or die "can't open $pwdinfo: $!"; + +This sort of thing also comes into play in filter processing. Because +C<E<lt>ARGVE<gt>> processing employs the normal, shell-style Perl C<open>, +it respects all the special things we've already seen: + + $ myprogram f1 "cmd1|" - f2 "cmd2|" f3 < tmpfile + +That program will read from the file F<f1>, the process F<cmd1>, standard +input (F<tmpfile> in this case), the F<f2> file, the F<cmd2> command, +and finally the F<f3> file. + +Yes, this also means that if you have a file named "-" (and so on) in +your directory, that they won't be processed as literal files by C<open>. +You'll need to pass them as "./-" much as you would for the I<rm> program. +Or you could use C<sysopen> as described below. + +One of the more interesting applications is to change files of a certain +name into pipes. For example, to autoprocess gzipped or compressed +files by decompressing them with I<gzip>: + + @ARGV = map { /^\.(gz|Z)$/ ? "gzip -dc $_ |" : $_ } @ARGV; + +Or, if you have the I<GET> program installed from LWP, +you can fetch URLs before processing them: + + @ARGV = map { m#^\w+://# ? "GET $_ |" : $_ } @ARGV; + +It's not for nothing that this is called magic C<E<lt>ARGVE<gt>>. +Pretty nifty, eh? + +=head1 Open E<agrave> la C + +If you want the convenience of the shell, then Perl's C<open> is +definitely the way to go. On the other hand, if you want finer precision +than C's simplistic fopen(3S) provides, then you should look to Perl's +C<sysopen>, which is a direct hook into the open(2) system call. +That does mean it's a bit more involved, but that's the price of +precision. + +C<sysopen> takes 3 (or 4) arguments. + + sysopen HANDLE, PATH, FLAGS, [MASK] + +The HANDLE argument is a filehandle just as with C<open>. The PATH is +a literal path, one that doesn't pay attention to any greater-thans or +less-thans or pipes or minuses, nor ignore white space. If it's there, +it's part of the path. The FLAGS argument contains one or more values +derived from the Fcntl module that have been or'd together using the +bitwise "|" operator. The final argument, the MASK, is optional; if +present, it is combined with the user's current umask for the creation +mode of the file. You should usually omit this. + +Although the traditional values of read-only, write-only, and read-write +are 0, 1, and 2 respectively, this is known not to hold true on some +systems. Instead, it's best to load in the appropriate constants first +from the Fcntl module, which supplies the following standard flags: + + O_RDONLY Read only + O_WRONLY Write only + O_RDWR Read and write + O_CREAT Create the file if it doesn't exist + O_EXCL Fail if the file already exists + O_APPEND Append to the file + O_TRUNC Truncate the file + O_NONBLOCK Non-blocking access + +Less common flags that are sometimes available on some operating systems +include C<O_BINARY>, C<O_TEXT>, C<O_SHLOCK>, C<O_EXLOCK>, C<O_DEFER>, +C<O_SYNC>, C<O_ASYNC>, C<O_DSYNC>, C<O_RSYNC>, C<O_NOCTTY>, C<O_NDELAY> +and C<O_LARGEFILE>. Consult your open(2) manpage or its local equivalent +for details. + +Here's how to use C<sysopen> to emulate the simple C<open> calls we had +before. We'll omit the C<|| die $!> checks for clarity, but make sure +you always check the return values in real code. These aren't quite +the same, since C<open> will trim leading and trailing white space, +but you'll get the idea: + +To open a file for reading: + + open(FH, "< $path"); + sysopen(FH, $path, O_RDONLY); + +To open a file for writing, creating a new file if needed or else truncating +an old file: + + open(FH, "> $path"); + sysopen(FH, $path, O_WRONLY | O_TRUNC | O_CREAT); + +To open a file for appending, creating one if necessary: + + open(FH, ">> $path"); + sysopen(FH, $path, O_WRONLY | O_APPEND | O_CREAT); + +To open a file for update, where the file must already exist: + + open(FH, "+< $path"); + sysopen(FH, $path, O_RDWR); + +And here are things you can do with C<sysopen> that you cannot do with +a regular C<open>. As you see, it's just a matter of controlling the +flags in the third argument. + +To open a file for writing, creating a new file which must not previously +exist: + + sysopen(FH, $path, O_WRONLY | O_EXCL | O_CREAT); + +To open a file for appending, where that file must already exist: + + sysopen(FH, $path, O_WRONLY | O_APPEND); + +To open a file for update, creating a new file if necessary: + + sysopen(FH, $path, O_RDWR | O_CREAT); + +To open a file for update, where that file must not already exist: + + sysopen(FH, $path, O_RDWR | O_EXCL | O_CREAT); + +To open a file without blocking, creating one if necessary: + + sysopen(FH, $path, O_WRONLY | O_NONBLOCK | O_CREAT); + +=head2 Permissions E<agrave> la mode + +If you omit the MASK argument to C<sysopen>, Perl uses the octal value +0666. The normal MASK to use for executables and directories should +be 0777, and for anything else, 0666. + +Why so permissive? Well, it isn't really. The MASK will be modified +by your process's current C<umask>. A umask is a number representing +I<disabled> permissions bits; that is, bits that will not be turned on +in the created files' permissions field. + +For example, if your C<umask> were 027, then the 020 part would +disable the group from writing, and the 007 part would disable others +from reading, writing, or executing. Under these conditions, passing +C<sysopen> 0666 would create a file with mode 0640, since C<0666 &~ 027> +is 0640. + +You should seldom use the MASK argument to C<sysopen()>. That takes +away the user's freedom to choose what permission new files will have. +Denying choice is almost always a bad thing. One exception would be for +cases where sensitive or private data is being stored, such as with mail +folders, cookie files, and internal temporary files. + +=head1 Obscure Open Tricks + +=head2 Re-Opening Files (dups) + +Sometimes you already have a filehandle open, and want to make another +handle that's a duplicate of the first one. In the shell, we place an +ampersand in front of a file descriptor number when doing redirections. +For example, C<2E<gt>&1> makes descriptor 2 (that's STDERR in Perl) +be redirected into descriptor 1 (which is usually Perl's STDOUT). +The same is essentially true in Perl: a filename that begins with an +ampersand is treated instead as a file descriptor if a number, or as a +filehandle if a string. + + open(SAVEOUT, ">&SAVEERR") || die "couldn't dup SAVEERR: $!"; + open(MHCONTEXT, "<&4") || die "couldn't dup fd4: $!"; + +That means that if a function is expecting a filename, but you don't +want to give it a filename because you already have the file open, you +can just pass the filehandle with a leading ampersand. It's best to +use a fully qualified handle though, just in case the function happens +to be in a different package: + + somefunction("&main::LOGFILE"); + +This way if somefunction() is planning on opening its argument, it can +just use the already opened handle. This differs from passing a handle, +because with a handle, you don't open the file. Here you have something +you can pass to open. + +If you have one of those tricky, newfangled I/O objects that the C++ +folks are raving about, then this doesn't work because those aren't a +proper filehandle in the native Perl sense. You'll have to use fileno() +to pull out the proper descriptor number, assuming you can: + + use IO::Socket; + $handle = IO::Socket::INET->new("www.perl.com:80"); + $fd = $handle->fileno; + somefunction("&$fd"); # not an indirect function call + +It can be easier (and certainly will be faster) just to use real +filehandles though: + + use IO::Socket; + local *REMOTE = IO::Socket::INET->new("www.perl.com:80"); + die "can't connect" unless defined(fileno(REMOTE)); + somefunction("&main::REMOTE"); + +If the filehandle or descriptor number is preceded not just with a simple +"&" but rather with a "&=" combination, then Perl will not create a +completely new descriptor opened to the same place using the dup(2) +system call. Instead, it will just make something of an alias to the +existing one using the fdopen(3S) library call This is slightly more +parsimonious of systems resources, although this is less a concern +these days. Here's an example of that: + + $fd = $ENV{"MHCONTEXTFD"}; + open(MHCONTEXT, "<&=$fd") or die "couldn't fdopen $fd: $!"; + +If you're using magic C<E<lt>ARGVE<gt>>, you could even pass in as a +command line argument in @ARGV something like C<"E<lt>&=$MHCONTEXTFD">, +but we've never seen anyone actually do this. + +=head2 Dispelling the Dweomer + +Perl is more of a DWIMmer language than something like Java--where DWIM +is an acronym for "do what I mean". But this principle sometimes leads +to more hidden magic than one knows what to do with. In this way, Perl +is also filled with I<dweomer>, an obscure word meaning an enchantment. +Sometimes, Perl's DWIMmer is just too much like dweomer for comfort. + +If magic C<open> is a bit too magical for you, you don't have to turn +to C<sysopen>. To open a file with arbitrary weird characters in +it, it's necessary to protect any leading and trailing whitespace. +Leading whitespace is protected by inserting a C<"./"> in front of a +filename that starts with whitespace. Trailing whitespace is protected +by appending an ASCII NUL byte (C<"\0">) at the end off the string. + + $file =~ s#^(\s)#./$1#; + open(FH, "< $file\0") || die "can't open $file: $!"; + +This assumes, of course, that your system considers dot the current +working directory, slash the directory separator, and disallows ASCII +NULs within a valid filename. Most systems follow these conventions, +including all POSIX systems as well as proprietary Microsoft systems. +The only vaguely popular system that doesn't work this way is the +proprietary Macintosh system, which uses a colon where the rest of us +use a slash. Maybe C<sysopen> isn't such a bad idea after all. + +If you want to use C<E<lt>ARGVE<gt>> processing in a totally boring +and non-magical way, you could do this first: + + # "Sam sat on the ground and put his head in his hands. + # 'I wish I had never come here, and I don't want to see + # no more magic,' he said, and fell silent." + for (@ARGV) { + s#^([^./])#./$1#; + $_ .= "\0"; + } + while (<>) { + # now process $_ + } + +But be warned that users will not appreciate being unable to use "-" +to mean standard input, per the standard convention. + +=head2 Paths as Opens + +You've probably noticed how Perl's C<warn> and C<die> functions can +produce messages like: + + Some warning at scriptname line 29, <FH> chunk 7. + +That's because you opened a filehandle FH, and had read in seven records +from it. But what was the name of the file, not the handle? + +If you aren't running with C<strict refs>, or if you've turn them off +temporarily, then all you have to do is this: + + open($path, "< $path") || die "can't open $path: $!"; + while (<$path>) { + # whatever + } + +Since you're using the pathname of the file as its handle, +you'll get warnings more like + + Some warning at scriptname line 29, </etc/motd> chunk 7. + +=head2 Single Argument Open + +Remember how we said that Perl's open took two arguments? That was a +passive prevarication. You see, it can also take just one argument. +If and only if the variable is a global variable, not a lexical, you +can pass C<open> just one argument, the filehandle, and it will +get the path from the global scalar variable of the same name. + + $FILE = "/etc/motd"; + open FILE or die "can't open $FILE: $!"; + while (<FILE>) { + # whatever + } + +Why is this here? Someone has to cater to the hysterical porpoises. +It's something that's been in Perl since the very beginning, if not +before. + +=head2 Playing with STDIN and STDOUT + +One clever move with STDOUT is to explicitly close it when you're done +with the program. + + END { close(STDOUT) || die "can't close stdout: $!" } + +If you don't do this, and your program fills up the disk partition due +to a command line redirection, it won't report the error exit with a +failure status. + +You don't have to accept the STDIN and STDOUT you were given. You are +welcome to reopen them if you'd like. + + open(STDIN, "< datafile") + || die "can't open datafile: $!"; + + open(STDOUT, "> output") + || die "can't open output: $!"; + +And then these can be read directly or passed on to subprocesses. +This makes it look as though the program were initially invoked +with those redirections from the command line. + +It's probably more interesting to connect these to pipes. For example: + + $pager = $ENV{PAGER} || "(less || more)"; + open(STDOUT, "| $pager") + || die "can't fork a pager: $!"; + +This makes it appear as though your program were called with its stdout +already piped into your pager. You can also use this kind of thing +in conjunction with an implicit fork to yourself. You might do this +if you would rather handle the post processing in your own program, +just in a different process: + + head(100); + while (<>) { + print; + } + + sub head { + my $lines = shift || 20; + return unless $pid = open(STDOUT, "|-"); + die "cannot fork: $!" unless defined $pid; + while (<STDIN>) { + print; + last if --$lines < 0; + } + exit; + } + +This technique can be applied to repeatedly push as many filters on your +output stream as you wish. + +=head1 Other I/O Issues + +These topics aren't really arguments related to C<open> or C<sysopen>, +but they do affect what you do with your open files. + +=head2 Opening Non-File Files + +When is a file not a file? Well, you could say when it exists but +isn't a plain file. We'll check whether it's a symbolic link first, +just in case. + + if (-l $file || ! -f _) { + print "$file is not a plain file\n"; + } + +What other kinds of files are there than, well, files? Directories, +symbolic links, named pipes, Unix-domain sockets, and block and character +devices. Those are all files, too--just not I<plain> files. This isn't +the same issue as being a text file. Not all text files are plain files. +Not all plain files are textfiles. That's why there are separate C<-f> +and C<-T> file tests. + +To open a directory, you should use the C<opendir> function, then +process it with C<readdir>, carefully restoring the directory +name if necessary: + + opendir(DIR, $dirname) or die "can't opendir $dirname: $!"; + while (defined($file = readdir(DIR))) { + # do something with "$dirname/$file" + } + closedir(DIR); + +If you want to process directories recursively, it's better to use the +File::Find module. For example, this prints out all files recursively, +add adds a slash to their names if the file is a directory. + + @ARGV = qw(.) unless @ARGV; + use File::Find; + find sub { print $File::Find::name, -d && '/', "\n" }, @ARGV; + +This finds all bogus symbolic links beneath a particular directory: + + find sub { print "$File::Find::name\n" if -l && !-e }, $dir; + +As you see, with symbolic links, you can just pretend that it is +what it points to. Or, if you want to know I<what> it points to, then +C<readlink> is called for: + + if (-l $file) { + if (defined($whither = readlink($file))) { + print "$file points to $whither\n"; + } else { + print "$file points nowhere: $!\n"; + } + } + +Named pipes are a different matter. You pretend they're regular files, +but their opens will normally block until there is both a reader and +a writer. You can read more about them in L<perlipc/"Named Pipes">. +Unix-domain sockets are rather different beasts as well; they're +described in L<perlipc/"Unix-Domain TCP Clients and Servers">. + +When it comes to opening devices, it can be easy and it can tricky. +We'll assume that if you're opening up a block device, you know what +you're doing. The character devices are more interesting. These are +typically used for modems, mice, and some kinds of printers. This is +described in L<perlfaq8/"How do I read and write the serial port?"> +It's often enough to open them carefully: + + sysopen(TTYIN, "/dev/ttyS1", O_RDWR | O_NDELAY | O_NOCTTY) + # (O_NOCTTY no longer needed on POSIX systems) + or die "can't open /dev/ttyS1: $!"; + open(TTYOUT, "+>&TTYIN") + or die "can't dup TTYIN: $!"; + + $ofh = select(TTYOUT); $| = 1; select($ofh); + + print TTYOUT "+++at\015"; + $answer = <TTYIN>; + +With descriptors that you haven't opened using C<sysopen>, such as a +socket, you can set them to be non-blocking using C<fcntl>: + + use Fcntl; + fcntl(Connection, F_SETFL, O_NONBLOCK) + or die "can't set non blocking: $!"; + +Rather than losing yourself in a morass of twisting, turning C<ioctl>s, +all dissimilar, if you're going to manipulate ttys, it's best to +make calls out to the stty(1) program if you have it, or else use the +portable POSIX interface. To figure this all out, you'll need to read the +termios(3) manpage, which describes the POSIX interface to tty devices, +and then L<POSIX>, which describes Perl's interface to POSIX. There are +also some high-level modules on CPAN that can help you with these games. +Check out Term::ReadKey and Term::ReadLine. + +What else can you open? To open a connection using sockets, you won't use +one of Perl's two open functions. See L<perlipc/"Sockets: Client/Server +Communication"> for that. Here's an example. Once you have it, +you can use FH as a bidirectional filehandle. + + use IO::Socket; + local *FH = IO::Socket::INET->new("www.perl.com:80"); + +For opening up a URL, the LWP modules from CPAN are just what +the doctor ordered. There's no filehandle interface, but +it's still easy to get the contents of a document: + + use LWP::Simple; + $doc = get('http://www.sn.no/libwww-perl/'); + +=head2 Binary Files + +On certain legacy systems with what could charitably be called terminally +convoluted (some would say broken) I/O models, a file isn't a file--at +least, not with respect to the C standard I/O library. On these old +systems whose libraries (but not kernels) distinguish between text and +binary streams, to get files to behave properly you'll have to bend over +backwards to avoid nasty problems. On such infelicitous systems, sockets +and pipes are already opened in binary mode, and there is currently no +way to turn that off. With files, you have more options. + +Another option is to use the C<binmode> function on the appropriate +handles before doing regular I/O on them: + + binmode(STDIN); + binmode(STDOUT); + while (<STDIN>) { print } + +Passing C<sysopen> a non-standard flag option will also open the file in +binary mode on those systems that support it. This is the equivalent of +opening the file normally, then calling C<binmode>ing on the handle. + + sysopen(BINDAT, "records.data", O_RDWR | O_BINARY) + || die "can't open records.data: $!"; + +Now you can use C<read> and C<print> on that handle without worrying +about the system non-standard I/O library breaking your data. It's not +a pretty picture, but then, legacy systems seldom are. CP/M will be +with us until the end of days, and after. + +On systems with exotic I/O systems, it turns out that, astonishingly +enough, even unbuffered I/O using C<sysread> and C<syswrite> might do +sneaky data mutilation behind your back. + + while (sysread(WHENCE, $buf, 1024)) { + syswrite(WHITHER, $buf, length($buf)); + } + +Depending on the vicissitudes of your runtime system, even these calls +may need C<binmode> or C<O_BINARY> first. Systems known to be free of +such difficulties include Unix, the Mac OS, Plan9, and Inferno. + +=head2 File Locking + +In a multitasking environment, you may need to be careful not to collide +with other processes who want to do I/O on the same files as others +are working on. You'll often need shared or exclusive locks +on files for reading and writing respectively. You might just +pretend that only exclusive locks exist. + +Never use the existence of a file C<-e $file> as a locking indication, +because there is a race condition between the test for the existence of +the file and its creation. Atomicity is critical. + +Perl's most portable locking interface is via the C<flock> function, +whose simplicity is emulated on systems that don't directly support it, +such as SysV or WindowsNT. The underlying semantics may affect how +it all works, so you should learn how C<flock> is implemented on your +system's port of Perl. + +File locking I<does not> lock out another process that would like to +do I/O. A file lock only locks out others trying to get a lock, not +processes trying to do I/O. Because locks are advisory, if one process +uses locking and another doesn't, all bets are off. + +By default, the C<flock> call will block until a lock is granted. +A request for a shared lock will be granted as soon as there is no +exclusive locker. A request for a exclusive lock will be granted as +soon as there is no locker of any kind. Locks are on file descriptors, +not file names. You can't lock a file until you open it, and you can't +hold on to a lock once the file has been closed. + +Here's how to get a blocking shared lock on a file, typically used +for reading: + + use 5.004; + use Fcntl qw(:DEFAULT :flock); + open(FH, "< filename") or die "can't open filename: $!"; + flock(FH, LOCK_SH) or die "can't lock filename: $!"; + # now read from FH + +You can get a non-blocking lock by using C<LOCK_NB>. + + flock(FH, LOCK_SH | LOCK_NB) + or die "can't lock filename: $!"; + +This can be useful for producing more user-friendly behaviour by warning +if you're going to be blocking: + + use 5.004; + use Fcntl qw(:DEFAULT :flock); + open(FH, "< filename") or die "can't open filename: $!"; + unless (flock(FH, LOCK_SH | LOCK_NB)) { + $| = 1; + print "Waiting for lock..."; + flock(FH, LOCK_SH) or die "can't lock filename: $!"; + print "got it.\n" + } + # now read from FH + +To get an exclusive lock, typically used for writing, you have to be +careful. We C<sysopen> the file so it can be locked before it gets +emptied. You can get a nonblocking version using C<LOCK_EX | LOCK_NB>. + + use 5.004; + use Fcntl qw(:DEFAULT :flock); + sysopen(FH, "filename", O_WRONLY | O_CREAT) + or die "can't open filename: $!"; + flock(FH, LOCK_EX) + or die "can't lock filename: $!"; + truncate(FH, 0) + or die "can't truncate filename: $!"; + # now write to FH + +Finally, due to the uncounted millions who cannot be dissuaded from +wasting cycles on useless vanity devices called hit counters, here's +how to increment a number in a file safely: + + use Fcntl qw(:DEFAULT :flock); + + sysopen(FH, "numfile", O_RDWR | O_CREAT) + or die "can't open numfile: $!"; + # autoflush FH + $ofh = select(FH); $| = 1; select ($ofh); + flock(FH, LOCK_EX) + or die "can't write-lock numfile: $!"; + + $num = <FH> || 0; + seek(FH, 0, 0) + or die "can't rewind numfile : $!"; + print FH $num+1, "\n" + or die "can't write numfile: $!"; + + truncate(FH, tell(FH)) + or die "can't truncate numfile: $!"; + close(FH) + or die "can't close numfile: $!"; + +=head1 SEE ALSO + +The C<open> and C<sysopen> function in perlfunc(1); +the standard open(2), dup(2), fopen(3), and fdopen(3) manpages; +the POSIX documentation. + +=head1 AUTHOR and COPYRIGHT + +Copyright 1998 Tom Christiansen. + +When included as part of the Standard Version of Perl, or as part of +its complete documentation whether printed or otherwise, this work may +be distributed only under the terms of Perl's Artistic License. Any +distribution of this file or derivatives thereof outside of that +package require that special arrangements be made with copyright +holder. + +Irrespective of its distribution, all code examples in these files are +hereby placed into the public domain. You are permitted and +encouraged to use this code in your own programs for fun or for profit +as you see fit. A simple comment in the code giving credit would be +courteous but is not required. + +=head1 HISTORY + +First release: Sat Jan 9 08:09:11 MST 1999 diff --git a/pod/roffitall b/pod/roffitall index 421b37a8f0..3cea6859e9 100644 --- a/pod/roffitall +++ b/pod/roffitall @@ -38,6 +38,7 @@ toroff=` $mandir/perlfunc.1 \ $mandir/perlvar.1 \ $mandir/perlsub.1 \ + $mandir/perlopentut.1 \ $mandir/perlmod.1 \ $mandir/perlmodlib.1 \ $mandir/perlmodinstall.1 \ |