Diffstat (limited to 'perl.man')
-rw-r--r-- | perl.man | 5938 |
1 files changed, 5938 insertions, 0 deletions
diff --git a/perl.man b/perl.man new file mode 100644 index 0000000000..111dca0579 --- /dev/null +++ b/perl.man @@ -0,0 +1,5938 @@ +.rn '' }` +''' $Header: perl.man,v 4.0 91/03/20 01:38:08 lwall Locked $ +''' +''' $Log: perl.man,v $ +''' Revision 4.0 91/03/20 01:38:08 lwall +''' 4.0 baseline. +''' +''' +.de Sh +.br +.ne 5 +.PP +\fB\\$1\fR +.PP +.. +.de Sp +.if t .sp .5v +.if n .sp +.. +.de Ip +.br +.ie \\n(.$>=3 .ne \\$3 +.el .ne 3 +.IP "\\$1" \\$2 +.. +''' +''' Set up \*(-- to give an unbreakable dash; +''' string Tr holds user defined translation string. +''' Bell System Logo is used as a dummy character. +''' +.tr \(*W-|\(bv\*(Tr +.ie n \{\ +.ds -- \(*W- +.if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch +.if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch +.ds L" "" +.ds R" "" +.ds L' ' +.ds R' ' +'br\} +.el\{\ +.ds -- \(em\| +.tr \*(Tr +.ds L" `` +.ds R" '' +.ds L' ` +.ds R' ' +'br\} +.TH PERL 1 "\*(RP" +.UC +.SH NAME +perl \- Practical Extraction and Report Language +.SH SYNOPSIS +.B perl +[options] filename args +.SH DESCRIPTION +.I Perl +is an interpreted language optimized for scanning arbitrary text files, +extracting information from those text files, and printing reports based +on that information. +It's also a good language for many system management tasks. +The language is intended to be practical (easy to use, efficient, complete) +rather than beautiful (tiny, elegant, minimal). +It combines (in the author's opinion, anyway) some of the best features of C, +\fIsed\fR, \fIawk\fR, and \fIsh\fR, +so people familiar with those languages should have little difficulty with it. +(Language historians will also note some vestiges of \fIcsh\fR, Pascal, and +even BASIC-PLUS.) +Expression syntax corresponds quite closely to C expression syntax. 
+Unlike most Unix utilities, +.I perl +does not arbitrarily limit the size of your data\*(--if you've got +the memory, +.I perl +can slurp in your whole file as a single string. +Recursion is of unlimited depth. +And the hash tables used by associative arrays grow as necessary to prevent +degraded performance. +.I Perl +uses sophisticated pattern matching techniques to scan large amounts of +data very quickly. +Although optimized for scanning text, +.I perl +can also deal with binary data, and can make dbm files look like associative +arrays (where dbm is available). +Setuid +.I perl +scripts are safer than C programs +through a dataflow tracing mechanism which prevents many stupid security holes. +If you have a problem that would ordinarily use \fIsed\fR +or \fIawk\fR or \fIsh\fR, but it +exceeds their capabilities or must run a little faster, +and you don't want to write the silly thing in C, then +.I perl +may be for you. +There are also translators to turn your +.I sed +and +.I awk +scripts into +.I perl +scripts. +OK, enough hype. +.PP +Upon startup, +.I perl +looks for your script in one of the following places: +.Ip 1. 4 2 +Specified line by line via +.B \-e +switches on the command line. +.Ip 2. 4 2 +Contained in the file specified by the first filename on the command line. +(Note that systems supporting the #! notation invoke interpreters this way.) +.Ip 3. 4 2 +Passed in implicitly via standard input. +This only works if there are no filename arguments\*(--to pass +arguments to a +.I stdin +script you must explicitly specify a \- for the script name. +.PP +After locating your script, +.I perl +compiles it to an internal form. +If the script is syntactically correct, it is executed. +.Sh "Options" +Note: on first reading this section may not make much sense to you. It's here +at the front for easy reference. +.PP +A single-character option may be combined with the following option, if any. +This is particularly useful when invoking a script using the #! 
construct which +only allows one argument. Example: +.nf + +.ne 2 + #!/usr/bin/perl \-spi.bak # same as \-s \-p \-i.bak + .\|.\|. + +.fi +Options include: +.TP 5 +.BI \-0 digits +specifies the record separator ($/) as an octal number. +If there are no digits, the null character is the separator. +Other switches may precede or follow the digits. +For example, if you have a version of +.I find +which can print filenames terminated by the null character, you can say this: +.nf + + find . \-name '*.bak' \-print0 | perl \-n0e unlink + +.fi +The special value 00 will cause Perl to slurp files in paragraph mode. +The value 0777 will cause Perl to slurp files whole since there is no +legal character with that value. +.TP 5 +.B \-a +turns on autosplit mode when used with a +.B \-n +or +.BR \-p . +An implicit split command to the @F array +is done as the first thing inside the implicit while loop produced by +the +.B \-n +or +.BR \-p . +.nf + + perl \-ane \'print pop(@F), "\en";\' + +is equivalent to + + while (<>) { + @F = split(\' \'); + print pop(@F), "\en"; + } + +.fi +.TP 5 +.B \-c +causes +.I perl +to check the syntax of the script and then exit without executing it. +.TP 5 +.BI \-d +runs the script under the perl debugger. +See the section on Debugging. +.TP 5 +.BI \-D number +sets debugging flags. +To watch how it executes your script, use +.BR \-D14 . +(This only works if debugging is compiled into your +.IR perl .) +Another nice value is \-D1024, which lists your compiled syntax tree. +And \-D512 displays compiled regular expressions. +.TP 5 +.BI \-e " commandline" +may be used to enter one line of script. +Multiple +.B \-e +commands may be given to build up a multi-line script. +If +.B \-e +is given, +.I perl +will not look for a script filename in the argument list. +.TP 5 +.BI \-i extension +specifies that files processed by the <> construct are to be edited +in-place. 
+It does this by renaming the input file, opening the output file by the +same name, and selecting that output file as the default for print statements. +The extension, if supplied, is added to the name of the +old file to make a backup copy. +If no extension is supplied, no backup is made. +Saying \*(L"perl \-p \-i.bak \-e "s/foo/bar/;" .\|.\|. \*(R" is the same as using +the script: +.nf + +.ne 2 + #!/usr/bin/perl \-pi.bak + s/foo/bar/; + +which is equivalent to + +.ne 14 + #!/usr/bin/perl + while (<>) { + if ($ARGV ne $oldargv) { + rename($ARGV, $ARGV . \'.bak\'); + open(ARGVOUT, ">$ARGV"); + select(ARGVOUT); + $oldargv = $ARGV; + } + s/foo/bar/; + } + continue { + print; # this prints to original filename + } + select(STDOUT); + +.fi +except that the +.B \-i +form doesn't need to compare $ARGV to $oldargv to know when +the filename has changed. +It does, however, use ARGVOUT for the selected filehandle. +Note that +.I STDOUT +is restored as the default output filehandle after the loop. +.Sp +You can use eof to locate the end of each input file, in case you want +to append to each file, or reset line numbering (see example under eof). +.TP 5 +.BI \-I directory +may be used in conjunction with +.B \-P +to tell the C preprocessor where to look for include files. +By default /usr/include and /usr/lib/perl are searched. +.TP 5 +.BI \-l octnum +enables automatic line-ending processing. It has two effects: +first, it automatically chops the line terminator when used with +.B \-n +or +.B \-p , +and second, it assigns $\e to have the value of +.I octnum +so that any print statements will have that line terminator added back on. If +.I octnum +is omitted, sets $\e to the current value of $/. 
+For instance, to trim lines to 80 columns: +.nf + + perl -lpe \'substr($_, 80) = ""\' + +.fi +Note that the assignment $\e = $/ is done when the switch is processed, +so the input record separator can be different than the output record +separator if the +.B \-l +switch is followed by a +.B \-0 +switch: +.nf + + gnufind / -print0 | perl -ln0e 'print "found $_" if -p' + +.fi +This sets $\e to newline and then sets $/ to the null character. +.TP 5 +.B \-n +causes +.I perl +to assume the following loop around your script, which makes it iterate +over filename arguments somewhat like \*(L"sed \-n\*(R" or \fIawk\fR: +.nf + +.ne 3 + while (<>) { + .\|.\|. # your script goes here + } + +.fi +Note that the lines are not printed by default. +See +.B \-p +to have lines printed. +Here is an efficient way to delete all files older than a week: +.nf + + find . \-mtime +7 \-print | perl \-nle \'unlink;\' + +.fi +This is faster than using the \-exec switch of find because you don't have to +start a process on every filename found. +.TP 5 +.B \-p +causes +.I perl +to assume the following loop around your script, which makes it iterate +over filename arguments somewhat like \fIsed\fR: +.nf + +.ne 5 + while (<>) { + .\|.\|. # your script goes here + } continue { + print; + } + +.fi +Note that the lines are printed automatically. +To suppress printing use the +.B \-n +switch. +A +.B \-p +overrides a +.B \-n +switch. +.TP 5 +.B \-P +causes your script to be run through the C preprocessor before +compilation by +.IR perl . +(Since both comments and cpp directives begin with the # character, +you should avoid starting comments with any words recognized +by the C preprocessor such as \*(L"if\*(R", \*(L"else\*(R" or \*(L"define\*(R".) +.TP 5 +.B \-s +enables some rudimentary switch parsing for switches on the command line +after the script name but before any filename arguments (or before a \-\|\-). 
+Any switch found there is removed from @ARGV and sets the corresponding variable in the +.I perl +script. +The following script prints \*(L"true\*(R" if and only if the script is +invoked with a \-xyz switch. +.nf + +.ne 2 + #!/usr/bin/perl \-s + if ($xyz) { print "true\en"; } + +.fi +.TP 5 +.B \-S +makes +.I perl +use the PATH environment variable to search for the script +(unless the name of the script starts with a slash). +Typically this is used to emulate #! startup on machines that don't +support #!, in the following manner: +.nf + + #!/usr/bin/perl + eval "exec /usr/bin/perl \-S $0 $*" + if $running_under_some_shell; + +.fi +The system ignores the first line and feeds the script to /bin/sh, +which proceeds to try to execute the +.I perl +script as a shell script. +The shell executes the second line as a normal shell command, and thus +starts up the +.I perl +interpreter. +On some systems $0 doesn't always contain the full pathname, +so the +.B \-S +tells +.I perl +to search for the script if necessary. +After +.I perl +locates the script, it parses the lines and ignores them because +the variable $running_under_some_shell is never true. +A better construct than $* would be ${1+"$@"}, which handles embedded spaces +and such in the filenames, but doesn't work if the script is being interpreted +by csh. +In order to start up sh rather than csh, some systems may have to replace the +#! line with a line containing just +a colon, which will be politely ignored by perl. +Other systems can't control that, and need a totally devious construct that +will work under any of csh, sh or perl, such as the following: +.nf + +.ne 3 + eval '(exit $?0)' && eval 'exec /usr/bin/perl -S $0 ${1+"$@"}' + & eval 'exec /usr/bin/perl -S $0 $argv:q' + if 0; + +.fi +.TP 5 +.B \-u +causes +.I perl +to dump core after compiling your script. +You can then take this core dump and turn it into an executable file +by using the undump program (not supplied). 
+This speeds startup at the expense of some disk space (which you can +minimize by stripping the executable). +(Still, a "hello world" executable comes out to about 200K on my machine.) +If you are going to run your executable as a set-id program then you +should probably compile it using taintperl rather than normal perl. +If you want to execute a portion of your script before dumping, use the +dump operator instead. +Note: availability of undump is platform specific and may not be available +for a specific port of perl. +.TP 5 +.B \-U +allows +.I perl +to do unsafe operations. +Currently the only \*(L"unsafe\*(R" operation is the unlinking of directories while +running as superuser. +.TP 5 +.B \-v +prints the version and patchlevel of your +.I perl +executable. +.TP 5 +.B \-w +prints warnings about identifiers that are mentioned only once, and scalar +variables that are used before being set. +Also warns about redefined subroutines, and references to undefined +filehandles or filehandles opened readonly that you are attempting to +write on. +Also warns you if you use == on values that don't look like numbers, and if +your subroutines recurse more than 100 deep. +.TP 5 +.BI \-x directory +tells +.I perl +that the script is embedded in a message. +Leading garbage will be discarded until the first line that starts +with #! and contains the string "perl". +Any meaningful switches on that line will be applied (but only one +group of switches, as with normal #! processing). +If a directory name is specified, Perl will switch to that directory +before running the script. +The +.B \-x +switch only controls the disposal of leading garbage. +The script must be terminated with __END__ if there is trailing garbage +to be ignored (the script can process any or all of the trailing garbage +via the DATA filehandle if desired). +.Sh "Data Types and Objects" +.PP +.I Perl +has three data types: scalars, arrays of scalars, and +associative arrays of scalars. 
+Normal arrays are indexed by number, and associative arrays by string. +.PP +The interpretation of operations and values in perl sometimes +depends on the requirements +of the context around the operation or value. +There are three major contexts: string, numeric and array. +Certain operations return array values +in contexts wanting an array, and scalar values otherwise. +(If this is true of an operation it will be mentioned in the documentation +for that operation.) +Operations which return scalars don't care whether the context is looking +for a string or a number, but +scalar variables and values are interpreted as strings or numbers +as appropriate to the context. +A scalar is interpreted as TRUE in the boolean sense if it is not the null +string or 0. +Booleans returned by operators are 1 for true and 0 or \'\' (the null +string) for false. +.PP +There are actually two varieties of null string: defined and undefined. +Undefined null strings are returned when there is no real value for something, +such as when there was an error, or at end of file, or when you refer +to an uninitialized variable or element of an array. +An undefined null string may become defined the first time you access it, but +prior to that you can use the defined() operator to determine whether the +value is defined or not. +.PP +References to scalar variables always begin with \*(L'$\*(R', even when referring +to a scalar that is part of an array. +Thus: +.nf + +.ne 3 + $days \h'|2i'# a simple scalar variable + $days[28] \h'|2i'# 29th element of array @days + $days{\'Feb\'}\h'|2i'# one value from an associative array + $#days \h'|2i'# last index of array @days + +but entire arrays or array slices are denoted by \*(L'@\*(R': + + @days \h'|2i'# ($days[0], $days[1],\|.\|.\|. 
$days[n]) + @days[3,4,5]\h'|2i'# same as @days[3.\|.5] + @days{'a','c'}\h'|2i'# same as ($days{'a'},$days{'c'}) + +and entire associative arrays are denoted by \*(L'%\*(R': + + %days \h'|2i'# (key1, val1, key2, val2 .\|.\|.) +.fi +.PP +Any of these eight constructs may serve as an lvalue, +that is, may be assigned to. +(It also turns out that an assignment is itself an lvalue in +certain contexts\*(--see examples under s, tr and chop.) +Assignment to a scalar evaluates the righthand side in a scalar context, +while assignment to an array or array slice evaluates the righthand side +in an array context. +.PP +You may find the length of array @days by evaluating +\*(L"$#days\*(R", as in +.IR csh . +(Actually, it's not the length of the array, it's the subscript of the last element, since there is (ordinarily) a 0th element.) +Assigning to $#days changes the length of the array. +Shortening an array by this method does not actually destroy any values. +Lengthening an array that was previously shortened recovers the values that +were in those elements. +You can also gain some measure of efficiency by preextending an array that +is going to get big. +(You can also extend an array by assigning to an element that is off the +end of the array. +This differs from assigning to $#whatever in that intervening values +are set to null rather than recovered.) +You can truncate an array down to nothing by assigning the null list () to +it. +The following are exactly equivalent +.nf + + @whatever = (); + $#whatever = $[ \- 1; + +.fi +.PP +If you evaluate an array in a scalar context, it returns the length of +the array. +The following is always true: +.nf + + @whatever == $#whatever \- $[ + 1; + +.fi +.PP +Multi-dimensional arrays are not directly supported, but see the discussion +of the $; variable later for a means of emulating multiple subscripts with +an associative array. +You could also write a subroutine to turn multiple subscripts into a single +subscript. 
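The two emulation techniques mentioned above can be sketched as follows. This is only an illustration: the names `%cell`, `@grid`, `$width` and the helper `flat_index` are hypothetical and do not come from the manual, and the code avoids `$[` so that it makes no assumption about the array base.

```perl
# Emulate a two-dimensional table with an associative array.
# $cell{$row,$col} is shorthand for $cell{join($;, $row, $col)}.
$cell{0,0} = 'origin';
$cell{1,2} = 'x';

# Hypothetical helper (not part of perl): flatten (row, col) into one
# subscript for a normal array that is $width columns wide.
sub flat_index {
    local($row, $col, $width) = @_;
    $row * $width + $col;
}

$width = 4;
$grid[&flat_index(2, 3, $width)] = 'last';   # assigning off the end extends @grid

$last_index = $#grid;    # subscript of the last element
$length     = @grid;     # an array in a scalar context yields its length

print "cell(1,2) = ", $cell{1,2}, "\n";
print "last index $last_index, length $length\n";
```

The associative-array form needs no knowledge of the table's width, while the flattened form keeps the data in a normal array at the cost of fixing `$width` in advance.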
+.PP +Every data type has its own namespace. +You can, without fear of conflict, use the same name for a scalar variable, +an array, an associative array, a filehandle, a subroutine name, and/or +a label. +Since variable and array references always start with \*(L'$\*(R', \*(L'@\*(R', +or \*(L'%\*(R', the \*(L"reserved\*(R" words aren't in fact reserved +with respect to variable names. +(They ARE reserved with respect to labels and filehandles, however, which +don't have an initial special character. +Hint: you could say open(LOG,\'logfile\') rather than open(log,\'logfile\'). +Using uppercase filehandles also improves readability and protects you +from conflict with future reserved words.) +Case IS significant\*(--\*(L"FOO\*(R", \*(L"Foo\*(R" and \*(L"foo\*(R" are all +different names. +Names which start with a letter may also contain digits and underscores. +Names which do not start with a letter are limited to one character, +e.g. \*(L"$%\*(R" or \*(L"$$\*(R". +(Most of the one character names have a predefined significance to +.IR perl . +More later.) +.PP +Numeric literals are specified in any of the usual floating point or +integer formats: +.nf + +.ne 5 + 12345 + 12345.67 + .23E-10 + 0xffff # hex + 0377 # octal + +.fi +String literals are delimited by either single or double quotes. +They work much like shell quotes: +double-quoted string literals are subject to backslash and variable +substitution; single-quoted strings are not (except for \e\' and \e\e). +The usual backslash rules apply for making characters such as newline, tab, +etc., as well as some more exotic forms: +.nf + + \et tab + \en newline + \er return + \ef form feed + \eb backspace + \ea alarm (bell) + \ee escape + \e033 octal char + \ex1b hex char + \ec[ control char + \el lowercase next char + \eu uppercase next char + \eL lowercase till \eE + \eU uppercase till \eE + \eE end case modification + +.fi +You can also embed newlines directly in your strings, i.e. 
they can end on +a different line than they begin. +This is nice, but if you forget your trailing quote, the error will not be +reported until +.I perl +finds another line containing the quote character, which +may be much further on in the script. +Variable substitution inside strings is limited to scalar variables, normal +array values, and array slices. +(In other words, identifiers beginning with $ or @, followed by an optional +bracketed expression as a subscript.) +The following code segment prints out \*(L"The price is $100.\*(R" +.nf + +.ne 2 + $Price = \'$100\';\h'|3.5i'# not interpreted + print "The price is $Price.\e\|n";\h'|3.5i'# interpreted + +.fi +Note that you can put curly brackets around the identifier to delimit it +from following alphanumerics. +Also note that a single quoted string must be separated from a preceding +word by a space, since single quote is a valid character in an identifier +(see Packages). +.PP +Two special literals are __LINE__ and __FILE__, which represent the current +line number and filename at that point in your program. +They may only be used as separate tokens; they will not be interpolated +into strings. +In addition, the token __END__ may be used to indicate the logical end of the +script before the actual end of file. +Any following text is ignored (but may be read via the DATA filehandle). +The two control characters ^D and ^Z are synonyms for __END__. +.PP +A word that doesn't have any other interpretation in the grammar will be +treated as if it had single quotes around it. +For this purpose, a word consists only of alphanumeric characters and underline, +and must start with an alphabetic character. +As with filehandles and labels, a bare word that consists entirely of +lowercase letters risks conflict with future reserved words, and if you +use the +.B \-w +switch, Perl will warn you about any such words. 
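As a rough sketch of the substitution rules described above (the variable names `$unit`, `$plain`, `$braced` and `$where` are made up for illustration), the following shows a single-quoted literal, a double-quoted literal, curly brackets delimiting an identifier, and the __LINE__ and __FILE__ tokens used outside of a string:

```perl
$Price = '$100';    # single quotes: the $ is not interpreted
$unit  = 'meter';

$plain  = "The price is $Price.";   # $Price is substituted
$braced = "Measured in ${unit}s";   # braces end the name before the "s"

# __LINE__ and __FILE__ are separate tokens, so they are concatenated
# rather than written inside the quotes.
$where = 'line ' . __LINE__ . ' of ' . __FILE__;

print "$plain\n$braced\n$where\n";
```

Without the curly brackets, "$units" would name a different (here unset) variable, and the string would end up as just "Measured in ".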
+.PP +Array values are interpolated into double-quoted strings by joining all the +elements of the array with the delimiter specified in the $" variable, +space by default. +(Since in versions of perl prior to 3.0 the @ character was not a metacharacter +in double-quoted strings, the interpolation of @array, $array[EXPR], +@array[LIST], $array{EXPR}, or @array{LIST} only happens if array is +referenced elsewhere in the program or is predefined.) +The following are equivalent: +.nf + +.ne 4 + $temp = join($",@ARGV); + system "echo $temp"; + + system "echo @ARGV"; + +.fi +Within search patterns (which also undergo double-quotish substitution) +there is a bad ambiguity: Is /$foo[bar]/ to be +interpreted as /${foo}[bar]/ (where [bar] is a character class for the +regular expression) or as /${foo[bar]}/ (where [bar] is the subscript to +array @foo)? +If @foo doesn't otherwise exist, then it's obviously a character class. +If @foo exists, perl takes a good guess about [bar], and is almost always right. +If it does guess wrong, or if you're just plain paranoid, +you can force the correct interpretation with curly brackets as above. +.PP +A line-oriented form of quoting is based on the shell here-is syntax. +Following a << you specify a string to terminate the quoted material, and all lines +following the current line down to the terminating string are the value +of the item. +The terminating string may be either an identifier (a word), or some +quoted text. +If quoted, the type of quotes you use determines the treatment of the text, +just as in regular quoting. +An unquoted identifier works like double quotes. +There must be no space between the << and the identifier. +(If you put a space it will be treated as a null identifier, which is +valid, and matches the first blank line\*(--see Merry Christmas example below.) +The terminating string must appear by itself (unquoted and with no surrounding +whitespace) on the terminating line. 
+.nf + + print <<EOF; # same as above +The price is $Price. +EOF + + print <<"EOF"; # same as above +The price is $Price. +EOF + + print << x 10; # null identifier is delimiter +Merry Christmas! + + print <<`EOC`; # execute commands +echo hi there +echo lo there +EOC + + print <<foo, <<bar; # you can stack them +I said foo. +foo +I said bar. +bar + +.fi +Array literals are denoted by separating individual values by commas, and +enclosing the list in parentheses: +.nf + + (LIST) + +.fi +In a context not requiring an array value, the value of the array literal +is the value of the final element, as in the C comma operator. +For example, +.nf + +.ne 4 + @foo = (\'cc\', \'\-E\', $bar); + +assigns the entire array value to array foo, but + + $foo = (\'cc\', \'\-E\', $bar); + +.fi +assigns the value of variable bar to variable foo. +Note that the value of an actual array in a scalar context is the length +of the array; the following assigns to $foo the value 3: +.nf + +.ne 2 + @foo = (\'cc\', \'\-E\', $bar); + $foo = @foo; # $foo gets 3 + +.fi +You may have an optional comma before the closing parenthesis of an +array literal, so that you can say: +.nf + + @foo = ( + 1, + 2, + 3, + ); + +.fi +When a LIST is evaluated, each element of the list is evaluated in +an array context, and the resulting array value is interpolated into LIST +just as if each individual element were a member of LIST. Thus arrays +lose their identity in a LIST\*(--the list + + (@foo,@bar,&SomeSub) + +contains all the elements of @foo followed by all the elements of @bar, +followed by all the elements returned by the subroutine named SomeSub. +.PP +A list value may also be subscripted like a normal array. 
+Examples: +.nf + + $time = (stat($file))[8]; # stat returns array value + $digit = ('a','b','c','d','e','f')[$digit-10]; + return (pop(@foo),pop(@foo))[0]; + +.fi +.PP +Array lists may be assigned to if and only if each element of the list +is an lvalue: +.nf + + ($a, $b, $c) = (1, 2, 3); + + ($map{\'red\'}, $map{\'blue\'}, $map{\'green\'}) = (0x00f, 0x0f0, 0xf00); + +The final element may be an array or an associative array: + + ($a, $b, @rest) = split; + local($a, $b, %rest) = @_; + +.fi +You can actually put an array anywhere in the list, but the first array +in the list will soak up all the values, and anything after it will get +a null value. +This may be useful in a local(). +.PP +An associative array literal contains pairs of values to be interpreted +as a key and a value: +.nf + +.ne 2 + # same as map assignment above + %map = ('red',0x00f,'blue',0x0f0,'green',0xf00); + +.fi +Array assignment in a scalar context returns the number of elements +produced by the expression on the right side of the assignment: +.nf + + $x = (($foo,$bar) = (3,2,1)); # set $x to 3, not 2 + +.fi +.PP +There are several other pseudo-literals that you should know about. +If a string is enclosed by backticks (grave accents), it first undergoes +variable substitution just like a double quoted string. +It is then interpreted as a command, and the output of that command +is the value of the pseudo-literal, like in a shell. +In a scalar context, a single string consisting of all the output is +returned. +In an array context, an array of values is returned, one for each line +of output. +(You can set $/ to use a different line terminator.) +The command is executed each time the pseudo-literal is evaluated. +The status value of the command is returned in $? (see Predefined Names +for the interpretation of $?). +Unlike in \f2csh\f1, no translation is done on the return +data\*(--newlines remain newlines. 
+Unlike in any of the shells, single quotes do not hide variable names +in the command from interpretation. +To pass a $ through to the shell you need to hide it with a backslash. +.PP +Evaluating a filehandle in angle brackets yields the next line +from that file (newline included, so it's never false until EOF, at +which time an undefined value is returned). +Ordinarily you must assign that value to a variable, +but there is one situation where an automatic assignment happens. +If (and only if) the input symbol is the only thing inside the conditional of a +.I while +loop, the value is +automatically assigned to the variable \*(L"$_\*(R". +(This may seem like an odd thing to you, but you'll use the construct +in almost every +.I perl +script you write.) +Anyway, the following lines are equivalent to each other: +.nf + +.ne 5 + while ($_ = <STDIN>) { print; } + while (<STDIN>) { print; } + for (\|;\|<STDIN>;\|) { print; } + print while $_ = <STDIN>; + print while <STDIN>; + +.fi +The filehandles +.IR STDIN , +.I STDOUT +and +.I STDERR +are predefined. +(The filehandles +.IR stdin , +.I stdout +and +.I stderr +will also work except in packages, where they would be interpreted as +local identifiers rather than global.) +Additional filehandles may be created with the +.I open +function. +.PP +If a <FILEHANDLE> is used in a context that is looking for an array, an array +consisting of all the input lines is returned, one line per array element. +It's easy to make a LARGE data space this way, so use with care. +.PP +The null filehandle <> is special and can be used to emulate the behavior of +\fIsed\fR and \fIawk\fR. +Input from <> comes either from standard input, or from each file listed on +the command line. +Here's how it works: the first time <> is evaluated, the ARGV array is checked, +and if it is null, $ARGV[0] is set to \'-\', which when opened gives you standard +input. +The ARGV array is then processed as a list of filenames. 
+The loop +.nf + +.ne 3 + while (<>) { + .\|.\|. # code for each line + } + +.ne 10 +is equivalent to + + unshift(@ARGV, \'\-\') \|if \|$#ARGV < $[; + while ($ARGV = shift) { + open(ARGV, $ARGV); + while (<ARGV>) { + .\|.\|. # code for each line + } + } + +.fi +except that it isn't as cumbersome to say. +It really does shift array ARGV and put the current filename into +variable ARGV. +It also uses filehandle ARGV internally. +You can modify @ARGV before the first <> as long as you leave the first +filename at the beginning of the array. +Line numbers ($.) continue as if the input was one big happy file. +(But see example under eof for how to reset line numbers on each file.) +.PP +.ne 5 +If you want to set @ARGV to your own list of files, go right ahead. +If you want to pass switches into your script, you can +put a loop on the front like this: +.nf + +.ne 10 + while ($_ = $ARGV[0], /\|^\-/\|) { + shift; + last if /\|^\-\|\-$\|/\|; + /\|^\-D\|(.*\|)/ \|&& \|($debug = $1); + /\|^\-v\|/ \|&& \|$verbose++; + .\|.\|. # other switches + } + while (<>) { + .\|.\|. # code for each line + } + +.fi +The <> symbol will return FALSE only once. +If you call it again after this it will assume you are processing another +@ARGV list, and if you haven't set @ARGV, will input from +.IR STDIN . +.PP +If the string inside the angle brackets is a reference to a scalar variable +(e.g. <$foo>), +then that variable contains the name of the filehandle to input from. +.PP +If the string inside angle brackets is not a filehandle, it is interpreted +as a filename pattern to be globbed, and either an array of filenames or the +next filename in the list is returned, depending on context. +One level of $ interpretation is done first, but you can't say <$foo> +because that's an indirect filehandle as explained in the previous +paragraph. +You could insert curly brackets to force interpretation as a +filename glob: <${foo}>. 
+Example: +.nf + +.ne 3 + while (<*.c>) { + chmod 0644, $_; + } + +is equivalent to + +.ne 5 + open(foo, "echo *.c | tr \-s \' \et\er\ef\' \'\e\e012\e\e012\e\e012\e\e012\'|"); + while (<foo>) { + chop; + chmod 0644, $_; + } + +.fi +In fact, it's currently implemented that way. +(Which means it will not work on filenames with spaces in them unless +you have /bin/csh on your machine.) +Of course, the shortest way to do the above is: +.nf + + chmod 0644, <*.c>; + +.fi +.Sh "Syntax" +.PP +A +.I perl +script consists of a sequence of declarations and commands. +The only things that need to be declared in +.I perl +are report formats and subroutines. +See the sections below for more information on those declarations. +All uninitialized user-created objects are assumed to +start with a null or 0 value until they +are defined by some explicit operation such as assignment. +The sequence of commands is executed just once, unlike in +.I sed +and +.I awk +scripts, where the sequence of commands is executed for each input line. +While this means that you must explicitly loop over the lines of your input file +(or files), it also means you have much more control over which files and which +lines you look at. +(Actually, I'm lying\*(--it is possible to do an implicit loop with either the +.B \-n +or +.B \-p +switch.) +.PP +A declaration can be put anywhere a command can, but has no effect on the +execution of the primary sequence of commands\*(--declarations all take effect +at compile time. +Typically all the declarations are put at the beginning or the end of the script. +.PP +.I Perl +is, for the most part, a free-form language. +(The only exception to this is format declarations, for fairly obvious reasons.) +Comments are indicated by the # character, and extend to the end of the line. +If you attempt to use /* */ C comments, it will be interpreted either as +division or pattern matching, depending on the context. +So don't do that. 
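A small sketch of the points made in this section, assuming nothing beyond what is stated above: none of the variables below is declared, each starts out null or 0, and comments run from # to the end of the line. The names `$count`, `$text`, `%seen` and `@queue` are illustrative only.

```perl
# Nothing here is declared; each object starts out null or 0.
$count++;                  # 0 incremented to 1
$text .= 'abc';            # 'abc' appended to the null string
$seen{'perl'} += 2;        # associative array elements behave the same way
push(@queue, 'first');     # @queue starts out empty

print "$count $text $seen{'perl'} $queue[0]\n";
```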
+.Sh "Compound statements" +In +.IR perl , +a sequence of commands may be treated as one command by enclosing it +in curly brackets. +We will call this a BLOCK. +.PP +The following compound commands may be used to control flow: +.nf + +.ne 4 + if (EXPR) BLOCK + if (EXPR) BLOCK else BLOCK + if (EXPR) BLOCK elsif (EXPR) BLOCK .\|.\|. else BLOCK + LABEL while (EXPR) BLOCK + LABEL while (EXPR) BLOCK continue BLOCK + LABEL for (EXPR; EXPR; EXPR) BLOCK + LABEL foreach VAR (ARRAY) BLOCK + LABEL BLOCK continue BLOCK + +.fi +Note that, unlike C and Pascal, these are defined in terms of BLOCKs, not +statements. +This means that the curly brackets are \fIrequired\fR\*(--no dangling statements allowed. +If you want to write conditionals without curly brackets there are several +other ways to do it. +The following all do the same thing: +.nf + +.ne 5 + if (!open(foo)) { die "Can't open $foo: $!"; } + die "Can't open $foo: $!" unless open(foo); + open(foo) || die "Can't open $foo: $!"; # foo or bust! + open(foo) ? \'hi mom\' : die "Can't open $foo: $!"; + # a bit exotic, that last one + +.fi +.PP +The +.I if +statement is straightforward. +Since BLOCKs are always bounded by curly brackets, there is never any +ambiguity about which +.I if +an +.I else +goes with. +If you use +.I unless +in place of +.IR if , +the sense of the test is reversed. +.PP +The +.I while +statement executes the block as long as the expression is true +(does not evaluate to the null string or 0). +The LABEL is optional, and if present, consists of an identifier followed by +a colon. +The LABEL identifies the loop for the loop control statements +.IR next , +.IR last , +and +.I redo +(see below). +If there is a +.I continue +BLOCK, it is always executed just before +the conditional is about to be evaluated again, similarly to the third part +of a +.I for +loop in C. 
+Thus it can be used to increment a loop variable, even when the loop has +been continued via the +.I next +statement (similar to the C \*(L"continue\*(R" statement). +.PP +If the word +.I while +is replaced by the word +.IR until , +the sense of the test is reversed, but the conditional is still tested before +the first iteration. +.PP +In either the +.I if +or the +.I while +statement, you may replace \*(L"(EXPR)\*(R" with a BLOCK, and the conditional +is true if the value of the last command in that block is true. +.PP +The +.I for +loop works exactly like the corresponding +.I while +loop: +.nf + +.ne 12 + for ($i = 1; $i < 10; $i++) { + .\|.\|. + } + +is the same as + + $i = 1; + while ($i < 10) { + .\|.\|. + } continue { + $i++; + } +.fi +.PP +The foreach loop iterates over a normal array value and sets the variable +VAR to be each element of the array in turn. +The variable is implicitly local to the loop, and regains its former value +upon exiting the loop. +The \*(L"foreach\*(R" keyword is actually identical to the \*(L"for\*(R" keyword, +so you can use \*(L"foreach\*(R" for readability or \*(L"for\*(R" for brevity. +If VAR is omitted, $_ is set to each value. +If ARRAY is an actual array (as opposed to an expression returning an array +value), you can modify each element of the array +by modifying VAR inside the loop. +Examples: +.nf + +.ne 5 + for (@ary) { s/foo/bar/; } + + foreach $elem (@elements) { + $elem *= 2; + } + +.ne 3 + for ((10,9,8,7,6,5,4,3,2,1,\'BOOM\')) { + print $_, "\en"; sleep(1); + } + + for (1..15) { print "Merry Christmas\en"; } + +.ne 3 + foreach $item (split(/:[\e\e\en:]*/, $ENV{\'TERMCAP\'})) { + print "Item: $item\en"; + } + +.fi +.PP +The BLOCK by itself (labeled or not) is equivalent to a loop that executes +once. +Thus you can use any of the loop control statements in it to leave or +restart the block. +The +.I continue +block is optional. +This construct is particularly nice for doing case structures. 
+.nf + +.ne 6 + foo: { + if (/^abc/) { $abc = 1; last foo; } + if (/^def/) { $def = 1; last foo; } + if (/^xyz/) { $xyz = 1; last foo; } + $nothing = 1; + } + +.fi +There is no official switch statement in perl, because there +are already several ways to write the equivalent. +In addition to the above, you could write +.nf + +.ne 6 + foo: { + $abc = 1, last foo if /^abc/; + $def = 1, last foo if /^def/; + $xyz = 1, last foo if /^xyz/; + $nothing = 1; + } + +or + +.ne 6 + foo: { + /^abc/ && do { $abc = 1; last foo; }; + /^def/ && do { $def = 1; last foo; }; + /^xyz/ && do { $xyz = 1; last foo; }; + $nothing = 1; + } + +or + +.ne 6 + foo: { + /^abc/ && ($abc = 1, last foo); + /^def/ && ($def = 1, last foo); + /^xyz/ && ($xyz = 1, last foo); + $nothing = 1; + } + +or even + +.ne 8 + if (/^abc/) + { $abc = 1; } + elsif (/^def/) + { $def = 1; } + elsif (/^xyz/) + { $xyz = 1; } + else + {$nothing = 1;} + +.fi +As it happens, these are all optimized internally to a switch structure, +so perl jumps directly to the desired statement, and you needn't worry +about perl executing a lot of unnecessary statements when you have a string +of 50 elsifs, as long as you are testing the same simple scalar variable +using ==, eq, or pattern matching as above. +(If you're curious as to whether the optimizer has done this for a particular +case statement, you can use the \-D1024 switch to list the syntax tree +before execution.) +.Sh "Simple statements" +The only kind of simple statement is an expression evaluated for its side +effects. +Every expression (simple statement) must be terminated with a semicolon. +Note that this is like C, but unlike Pascal (and +.IR awk ). +.PP +Any simple statement may optionally be followed by a +single modifier, just before the terminating semicolon. +The possible modifiers are: +.nf + +.ne 4 + if EXPR + unless EXPR + while EXPR + until EXPR + +.fi +The +.I if +and +.I unless +modifiers have the expected semantics. 
+The +.I while +and +.I until +modifiers also have the expected semantics (conditional evaluated first), +except when applied to a do-BLOCK or a do-SUBROUTINE command, +in which case the block executes once before the conditional is evaluated. +This is so that you can write loops like: +.nf + +.ne 4 + do { + $_ = <STDIN>; + .\|.\|. + } until $_ \|eq \|".\|\e\|n"; + +.fi +(See the +.I do +operator below. Note also that the loop control commands described later will +NOT work in this construct, since modifiers don't take loop labels. +Sorry.) +.Sh "Expressions" +Since +.I perl +expressions work almost exactly like C expressions, only the differences +will be mentioned here. +.PP +Here's what +.I perl +has that C doesn't: +.Ip ** 8 2 +The exponentiation operator. +.Ip **= 8 +The exponentiation assignment operator. +.Ip (\|) 8 3 +The null list, used to initialize an array to null. +.Ip . 8 +Concatenation of two strings. +.Ip .= 8 +The concatenation assignment operator. +.Ip eq 8 +String equality (== is numeric equality). +For a mnemonic just think of \*(L"eq\*(R" as a string. +(If you are used to the +.I awk +behavior of using == for either string or numeric equality +based on the current form of the comparands, beware! +You must be explicit here.) +.Ip ne 8 +String inequality (!= is numeric inequality). +.Ip lt 8 +String less than. +.Ip gt 8 +String greater than. +.Ip le 8 +String less than or equal. +.Ip ge 8 +String greater than or equal. +.Ip cmp 8 +String comparison, returning -1, 0, or 1. +.Ip <=> 8 +Numeric comparison, returning -1, 0, or 1. +.Ip =~ 8 2 +Certain operations search or modify the string \*(L"$_\*(R" by default. +This operator makes that kind of operation work on some other string. +The right argument is a search pattern, substitution, or translation. +The left argument is what is supposed to be searched, substituted, or +translated instead of the default \*(L"$_\*(R". +The return value indicates the success of the operation. 
+(If the right argument is an expression other than a search pattern, +substitution, or translation, it is interpreted as a search pattern +at run time. +This is less efficient than an explicit search, since the pattern must +be compiled every time the expression is evaluated.) +The precedence of this operator is lower than unary minus and autoincrement/decrement, but higher than everything else. +.Ip !~ 8 +Just like =~ except the return value is negated. +.Ip x 8 +The repetition operator. +Returns a string consisting of the left operand repeated the +number of times specified by the right operand. +In an array context, if the left operand is a list in parens, it repeats +the list. +.nf + + print \'\-\' x 80; # print row of dashes + print \'\-\' x80; # illegal, x80 is identifier + + print "\et" x ($tab/8), \' \' x ($tab%8); # tab over + + @ones = (1) x 80; # an array of 80 1's + @ones = (5) x @ones; # set all elements to 5 + +.fi +.Ip x= 8 +The repetition assignment operator. +Only works on scalars. +.Ip .\|. 8 +The range operator, which is really two different operators depending +on the context. +In an array context, returns an array of values counting (by ones) +from the left value to the right value. +This is useful for writing \*(L"for (1..10)\*(R" loops and for doing +slice operations on arrays. +.Sp +In a scalar context, .\|. returns a boolean value. +The operator is bistable, like a flip-flop. +Each .\|. operator maintains its own boolean state. +It is false as long as its left operand is false. +Once the left operand is true, the range operator stays true +until the right operand is true, +AFTER which the range operator becomes false again. +(It doesn't become false till the next time the range operator is evaluated. +It can become false on the same evaluation it became true, but it still returns +true once.)
+The right operand is not evaluated while the operator is in the \*(L"false\*(R" state, +and the left operand is not evaluated while the operator is in the \*(L"true\*(R" state. +The scalar .\|. operator is primarily intended for doing line number ranges +after +the fashion of \fIsed\fR or \fIawk\fR. +The precedence is a little lower than || and &&. +The value returned is either the null string for false, or a sequence number +(beginning with 1) for true. +The sequence number is reset for each range encountered. +The final sequence number in a range has the string \'E0\' appended to it, which +doesn't affect its numeric value, but gives you something to search for if you +want to exclude the endpoint. +You can exclude the beginning point by waiting for the sequence number to be +greater than 1. +If either operand of scalar .\|. is static, that operand is implicitly compared +to the $. variable, the current line number. +Examples: +.nf + +.ne 6 +As a scalar operator: + if (101 .\|. 200) { print; } # print 2nd hundred lines + + next line if (1 .\|. /^$/); # skip header lines + + s/^/> / if (/^$/ .\|. eof()); # quote body + +.ne 4 +As an array operator: + for (101 .\|. 200) { print; } # print $_ 100 times + + @foo = @foo[$[ .\|. $#foo]; # an expensive no-op + @foo = @foo[$#foo-4 .\|. $#foo]; # slice last 5 items + +.fi +.Ip \-x 8 +A file test. +This unary operator takes one argument, either a filename or a filehandle, +and tests the associated file to see if something is true about it. +If the argument is omitted, tests $_, except for \-t, which tests +.IR STDIN . +It returns 1 for true and \'\' for false, or the undefined value if the +file doesn't exist. +Precedence is higher than logical and relational operators, but lower than +arithmetic operators. +The operator may be any of: +.nf + \-r File is readable by effective uid. + \-w File is writable by effective uid. + \-x File is executable by effective uid. + \-o File is owned by effective uid. 
+ \-R File is readable by real uid. + \-W File is writable by real uid. + \-X File is executable by real uid. + \-O File is owned by real uid. + \-e File exists. + \-z File has zero size. + \-s File has non-zero size (returns size). + \-f File is a plain file. + \-d File is a directory. + \-l File is a symbolic link. + \-p File is a named pipe (FIFO). + \-S File is a socket. + \-b File is a block special file. + \-c File is a character special file. + \-u File has setuid bit set. + \-g File has setgid bit set. + \-k File has sticky bit set. + \-t Filehandle is opened to a tty. + \-T File is a text file. + \-B File is a binary file (opposite of \-T). + \-M Age of file in days when script started. + \-A Same for access time. + \-C Same for inode change time. + +.fi +The interpretation of the file permission operators \-r, \-R, \-w, \-W, \-x and \-X +is based solely on the mode of the file and the uids and gids of the user. +There may be other reasons you can't actually read, write or execute the file. +Also note that, for the superuser, \-r, \-R, \-w and \-W always return 1, and +\-x and \-X return 1 if any execute bit is set in the mode. +Scripts run by the superuser may thus need to do a stat() in order to determine +the actual mode of the file, or temporarily set the uid to something else. +.Sp +Example: +.nf +.ne 7 + + while (<>) { + chop; + next unless \-f $_; # ignore specials + .\|.\|. + } + +.fi +Note that \-s/a/b/ does not do a negated substitution. +Saying \-exp($foo) still works as expected, however\*(--only single letters +following a minus are interpreted as file tests. +.Sp +The \-T and \-B switches work as follows. +The first block or so of the file is examined for odd characters such as +strange control codes or metacharacters. +If too many odd characters (>10%) are found, it's a \-B file, otherwise it's a \-T file. +Also, any file containing null in the first block is considered a binary file. 
+If \-T or \-B is used on a filehandle, the current stdio buffer is examined +rather than the first block. +Both \-T and \-B return TRUE on a null file, or a file at EOF when testing +a filehandle. +.PP +If any of the file tests (or either stat operator) are given the special +filehandle consisting of a solitary underline, then the stat structure +of the previous file test (or stat operator) is used, saving a system +call. +(This doesn't work with \-t, and you need to remember that lstat and -l +will leave values in the stat structure for the symbolic link, not the +real file.) +Example: +.nf + + print "Can do.\en" if -r $a || -w _ || -x _; + +.ne 9 + stat($filename); + print "Readable\en" if -r _; + print "Writable\en" if -w _; + print "Executable\en" if -x _; + print "Setuid\en" if -u _; + print "Setgid\en" if -g _; + print "Sticky\en" if -k _; + print "Text\en" if -T _; + print "Binary\en" if -B _; + +.fi +.PP +Here is what C has that +.I perl +doesn't: +.Ip "unary &" 12 +Address-of operator. +.Ip "unary *" 12 +Dereference-address operator. +.Ip "(TYPE)" 12 +Type casting operator. +.PP +Like C, +.I perl +does a certain amount of expression evaluation at compile time, whenever +it determines that all of the arguments to an operator are static and have +no side effects. +In particular, string concatenation happens at compile time between literals that don't do variable substitution. +Backslash interpretation also happens at compile time. +You can say +.nf + +.ne 2 + \'Now is the time for all\' . "\|\e\|n" . + \'good men to come to.\' + +.fi +and this all reduces to one string internally. +.PP +The autoincrement operator has a little extra built-in magic to it. +If you increment a variable that is numeric, or that has ever been used in +a numeric context, you get a normal increment. 
+If, however, the variable has only been used in string contexts since it +was set, and has a value that is not null and matches the +pattern /^[a\-zA\-Z]*[0\-9]*$/, the increment is done +as a string, preserving each character within its range, with carry: +.nf + + print ++($foo = \'99\'); # prints \*(L'100\*(R' + print ++($foo = \'a0\'); # prints \*(L'a1\*(R' + print ++($foo = \'Az\'); # prints \*(L'Ba\*(R' + print ++($foo = \'zz\'); # prints \*(L'aaa\*(R' + +.fi +The autodecrement is not magical. +.PP +The range operator (in an array context) makes use of the magical +autoincrement algorithm if the minimum and maximum are strings. +You can say + + @alphabet = (\'A\' .. \'Z\'); + +to get all the letters of the alphabet, or + + $hexdigit = (0 .. 9, \'a\' .. \'f\')[$num & 15]; + +to get a hexadecimal digit, or + + @z2 = (\'01\' .. \'31\'); print @z2[$mday]; + +to get dates with leading zeros. +(If the final value specified is not in the sequence that the magical increment +would produce, the sequence goes until the next value would be longer than +the final value specified.) +.PP +The || and && operators differ from C's in that, rather than returning 0 or 1, +they return the last value evaluated. +Thus, a portable way to find out the home directory might be: +.nf + + $home = $ENV{'HOME'} || $ENV{'LOGDIR'} || + (getpwuid($<))[7] || die "You're homeless!\en"; + +.fi +''' Beginning of part 2 +''' $Header: perl.man,v 4.0 91/03/20 01:38:08 lwall Locked $ +''' +''' $Log: perl.man,v $ +''' Revision 4.0 91/03/20 01:38:08 lwall +''' 4.0 baseline. 
+''' +''' Revision 3.0.1.11 91/01/11 18:17:08 lwall +''' patch42: fixed some man page entries +''' +''' Revision 3.0.1.10 90/11/10 01:46:29 lwall +''' patch38: random cleanup +''' patch38: added alarm function +''' +''' Revision 3.0.1.9 90/10/15 18:17:37 lwall +''' patch29: added caller +''' patch29: index and substr now have optional 3rd args +''' patch29: added SysV IPC +''' +''' Revision 3.0.1.8 90/08/13 22:21:00 lwall +''' patch28: documented that you can't interpolate $) or $| in pattern +''' +''' Revision 3.0.1.7 90/08/09 04:27:04 lwall +''' patch19: added require operator +''' +''' Revision 3.0.1.6 90/08/03 11:15:29 lwall +''' patch19: Intermediate diffs for Randal +''' +''' Revision 3.0.1.5 90/03/27 16:15:17 lwall +''' patch16: MSDOS support +''' +''' Revision 3.0.1.4 90/03/12 16:46:02 lwall +''' patch13: documented behavior of @array = /noparens/ +''' +''' Revision 3.0.1.3 90/02/28 17:55:58 lwall +''' patch9: grep now returns number of items matched in scalar context +''' patch9: documented in-place modification capabilites of grep +''' +''' Revision 3.0.1.2 89/11/17 15:30:16 lwall +''' patch5: fixed some manual typos and indent problems +''' +''' Revision 3.0.1.1 89/11/11 04:43:10 lwall +''' patch2: made some line breaks depend on troff vs. nroff +''' patch2: example of unshift had args backwards +''' +''' Revision 3.0 89/10/18 15:21:37 lwall +''' 3.0 baseline +''' +''' +.PP +Along with the literals and variables mentioned earlier, +the operations in the following section can serve as terms in an expression. +Some of these operations take a LIST as an argument. +Such a list can consist of any combination of scalar arguments or array values; +the array values will be included in the list as if each individual element were +interpolated at that point in the list, forming a longer single-dimensional +array value. +Elements of the LIST should be separated by commas. 
+If an operation is listed both with and without parentheses around its +arguments, it means you can either use it as a unary operator or +as a function call. +To use it as a function call, the next token on the same line must +be a left parenthesis. +(There may be intervening white space.) +Such a function then has highest precedence, as you would expect from +a function. +If any token other than a left parenthesis follows, then it is a +unary operator, with a precedence depending only on whether it is a LIST +operator or not. +LIST operators have lowest precedence. +All other unary operators have a precedence greater than relational operators +but less than arithmetic operators. +See the section on Precedence. +.Ip "/PATTERN/" 8 4 +See m/PATTERN/. +.Ip "?PATTERN?" 8 4 +This is just like the /pattern/ search, except that it matches only once between +calls to the +.I reset +operator. +This is a useful optimization when you only want to see the first occurrence of +something in each file of a set of files, for instance. +Only ?? patterns local to the current package are reset. +.Ip "accept(NEWSOCKET,GENERICSOCKET)" 8 2 +Does the same thing that the accept system call does. +Returns true if it succeeded, false otherwise. +See example in section on Interprocess Communication. +.Ip "alarm(SECONDS)" 8 4 +.Ip "alarm SECONDS" 8 +Arranges to have a SIGALRM delivered to this process after the specified number +of seconds (minus 1, actually) have elapsed. Thus, alarm(15) will cause +a SIGALRM at some point more than 14 seconds in the future. +Only one timer may be counting at once. Each call disables the previous +timer, and an argument of 0 may be supplied to cancel the previous timer +without starting a new one. +The returned value is the amount of time remaining on the previous timer. +.Ip "atan2(Y,X)" 8 2 +Returns the arctangent of Y/X in the range +.if t \-\(*p to \(*p. +.if n \-PI to PI. +.Ip "bind(SOCKET,NAME)" 8 2 +Does the same thing that the bind system call does. 
+Returns true if it succeeded, false otherwise. +NAME should be a packed address of the proper type for the socket. +See example in section on Interprocess Communication. +.Ip "binmode(FILEHANDLE)" 8 4 +.Ip "binmode FILEHANDLE" 8 4 +Arranges for the file to be read in \*(L"binary\*(R" mode in operating systems +that distinguish between binary and text files. +Files that are not read in binary mode have CR LF sequences translated +to LF on input and LF translated to CR LF on output. +Binmode has no effect under Unix. +If FILEHANDLE is an expression, the value is taken as the name of +the filehandle. +.Ip "caller(EXPR)" +.Ip "caller" +Returns the context of the current subroutine call: +.nf + + ($package,$filename,$line) = caller; + +.fi +With EXPR, returns some extra information that the debugger uses to print +a stack trace. The value of EXPR indicates how many call frames to go +back before the current one. +.Ip "chdir(EXPR)" 8 2 +.Ip "chdir EXPR" 8 2 +Changes the working directory to EXPR, if possible. +If EXPR is omitted, changes to home directory. +Returns 1 upon success, 0 otherwise. +See example under +.IR die . +.Ip "chmod(LIST)" 8 2 +.Ip "chmod LIST" 8 2 +Changes the permissions of a list of files. +The first element of the list must be the numerical mode. +Returns the number of files successfully changed. +.nf + +.ne 2 + $cnt = chmod 0755, \'foo\', \'bar\'; + chmod 0755, @executables; + +.fi +.Ip "chop(LIST)" 8 7 +.Ip "chop(VARIABLE)" 8 +.Ip "chop VARIABLE" 8 +.Ip "chop" 8 +Chops off the last character of a string and returns the character chopped. +It's used primarily to remove the newline from the end of an input record, +but is much more efficient than s/\en// because it neither scans nor copies +the string. +If VARIABLE is omitted, chops $_. +Example: +.nf + +.ne 5 + while (<>) { + chop; # avoid \en on last field + @array = split(/:/); + .\|.\|. 
+ } + +.fi +You can actually chop anything that's an lvalue, including an assignment: +.nf + + chop($cwd = \`pwd\`); + chop($answer = <STDIN>); + +.fi +If you chop a list, each element is chopped. +Only the value of the last chop is returned. +.Ip "chown(LIST)" 8 2 +.Ip "chown LIST" 8 2 +Changes the owner (and group) of a list of files. +The first two elements of the list must be the NUMERICAL uid and gid, +in that order. +Returns the number of files successfully changed. +.nf + +.ne 2 + $cnt = chown $uid, $gid, \'foo\', \'bar\'; + chown $uid, $gid, @filenames; + +.fi +.ne 23 +Here's an example of looking up non-numeric uids: +.nf + + print "User: "; + $user = <STDIN>; + chop($user); + print "Files: "; + $pattern = <STDIN>; + chop($pattern); +.ie t \{\ + open(pass, \'/etc/passwd\') || die "Can't open passwd: $!\en"; +'br\} +.el \{\ + open(pass, \'/etc/passwd\') + || die "Can't open passwd: $!\en"; +'br\} + while (<pass>) { + ($login,$pass,$uid,$gid) = split(/:/); + $uid{$login} = $uid; + $gid{$login} = $gid; + } + @ary = <${pattern}>; # get filenames + if ($uid{$user} eq \'\') { + die "$user not in passwd file"; + } + else { + chown $uid{$user}, $gid{$user}, @ary; + } + +.fi +.Ip "chroot(FILENAME)" 8 5 +.Ip "chroot FILENAME" 8 +Does the same as the system call of that name. +If you don't know what it does, don't worry about it. +If FILENAME is omitted, does chroot to $_. +.Ip "close(FILEHANDLE)" 8 5 +.Ip "close FILEHANDLE" 8 +Closes the file or pipe associated with the file handle. +You don't have to close FILEHANDLE if you are immediately going to +do another open on it, since open will close it for you. +(See +.IR open .) +However, an explicit close on an input file resets the line counter ($.), while +the implicit close done by +.I open +does not. +Also, closing a pipe will wait for the process executing on the pipe to complete, +in case you want to look at the output of the pipe afterwards.
+Closing a pipe explicitly also puts the status value of the command into $?. +Example: +.nf + +.ne 4 + open(OUTPUT, \'|sort >foo\'); # pipe to sort + .\|.\|. # print stuff to output + close OUTPUT; # wait for sort to finish + open(INPUT, \'foo\'); # get sort's results + +.fi +FILEHANDLE may be an expression whose value gives the real filehandle name. +.Ip "closedir(DIRHANDLE)" 8 5 +.Ip "closedir DIRHANDLE" 8 +Closes a directory opened by opendir(). +.Ip "connect(SOCKET,NAME)" 8 2 +Does the same thing that the connect system call does. +Returns true if it succeeded, false otherwise. +NAME should be a packed address of the proper type for the socket. +See example in section on Interprocess Communication. +.Ip "cos(EXPR)" 8 6 +.Ip "cos EXPR" 8 6 +Returns the cosine of EXPR (expressed in radians). +If EXPR is omitted takes cosine of $_. +.Ip "crypt(PLAINTEXT,SALT)" 8 6 +Encrypts a string exactly like the crypt() function in the C library. +Useful for checking the password file for lousy passwords. +Only the guys wearing white hats should do this. +.Ip "dbmclose(ASSOC_ARRAY)" 8 6 +.Ip "dbmclose ASSOC_ARRAY" 8 +Breaks the binding between a dbm file and an associative array. +The values remaining in the associative array are meaningless unless +you happen to want to know what was in the cache for the dbm file. +This function is only useful if you have ndbm. +.Ip "dbmopen(ASSOC,DBNAME,MODE)" 8 6 +This binds a dbm or ndbm file to an associative array. +ASSOC is the name of the associative array. +(Unlike normal open, the first argument is NOT a filehandle, even though +it looks like one). +DBNAME is the name of the database (without the .dir or .pag extension). +If the database does not exist, it is created with protection specified +by MODE (as modified by the umask). +If your system only supports the older dbm functions, you may only have one +dbmopen in your program. +If your system has neither dbm nor ndbm, calling dbmopen produces a fatal +error.
+.Sp +Values assigned to the associative array prior to the dbmopen are lost. +A certain number of values from the dbm file are cached in memory. +By default this number is 64, but you can increase it by preallocating +that number of garbage entries in the associative array before the dbmopen. +You can flush the cache if necessary with the reset command. +.Sp +If you don't have write access to the dbm file, you can only read +associative array variables, not set them. +If you want to test whether you can write, either use file tests or +try setting a dummy array entry inside an eval, which will trap the error. +.Sp +Note that functions such as keys() and values() may return huge array values +when used on large dbm files. +You may prefer to use the each() function to iterate over large dbm files. +Example: +.nf + +.ne 6 + # print out history file offsets + dbmopen(HIST,'/usr/lib/news/history',0666); + while (($key,$val) = each %HIST) { + print $key, ' = ', unpack('L',$val), "\en"; + } + dbmclose(HIST); + +.fi +.Ip "defined(EXPR)" 8 6 +.Ip "defined EXPR" 8 +Returns a boolean value saying whether the lvalue EXPR has a real value +or not. +Many operations return the undefined value under exceptional conditions, +such as end of file, uninitialized variable, system error and such. +This function allows you to distinguish between an undefined null string +and a defined null string with operations that might return a real null +string, in particular referencing elements of an array. +You may also check to see if arrays or subroutines exist. +Use on predefined variables is not guaranteed to produce intuitive results. +Examples: +.nf + +.ne 7 + print if defined $switch{'D'}; + print "$val\en" while defined($val = pop(@ary)); + die "Can't readlink $sym: $!" + unless defined($value = readlink $sym); + eval '@foo = ()' if defined(@foo); + die "No XYZ package defined" unless defined %_XYZ; + sub foo { defined &bar ? &bar(@_) : die "No bar"; } + +.fi +See also undef. 
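The distinction that defined draws between an undefined value and a defined null string can be sketched as follows. The array %switch and its contents are illustrative assumptions only; the point is that a plain truth test cannot tell the 'D' and 'x' cases apart, while defined can.

```perl
# 'D' is present with a null-string value, 'v' has a true value,
# and 'x' was never set at all.
%switch = ('D', '', 'v', 1);

@report = ();
foreach $key ('D', 'v', 'x') {
    if (!defined $switch{$key}) {
        push(@report, "$key: undefined");
    }
    elsif ($switch{$key} eq '') {
        push(@report, "$key: defined but null");
    }
    else {
        push(@report, "$key: $switch{$key}");
    }
}
print join("\n", @report), "\n";
```

This reports 'D' as defined but null, 'v' with its value, and 'x' as undefined.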
+.Ip "delete $ASSOC{KEY}" 8 6 +Deletes the specified value from the specified associative array. +Returns the deleted value, or the undefined value if nothing was deleted. +Deleting from $ENV{} modifies the environment. +Deleting from an array bound to a dbm file deletes the entry from the dbm +file. +.Sp +The following deletes all the values of an associative array: +.nf + +.ne 3 + foreach $key (keys %ARRAY) { + delete $ARRAY{$key}; + } + +.fi +(But it would be faster to use the +.I reset +command. +Saying undef %ARRAY is faster yet.) +.Ip "die(LIST)" 8 +.Ip "die LIST" 8 +Outside of an eval, prints the value of LIST to +.I STDERR +and exits with the current value of $! +(errno). +If $! is 0, exits with the value of ($? >> 8) (\`command\` status). +If ($? >> 8) is 0, exits with 255. +Inside an eval, the error message is stuffed into $@ and the eval is terminated +with the undefined value. +.Sp +Equivalent examples: +.nf + +.ne 3 +.ie t \{\ + die "Can't cd to spool: $!\en" unless chdir \'/usr/spool/news\'; +'br\} +.el \{\ + die "Can't cd to spool: $!\en" + unless chdir \'/usr/spool/news\'; +'br\} + + chdir \'/usr/spool/news\' || die "Can't cd to spool: $!\en"; + +.fi +.Sp +If the value of LIST does not end in a newline, the current script line +number and input line number (if any) are also printed, and a newline is +supplied. +Hint: sometimes appending \*(L", stopped\*(R" to your message will cause it to make +better sense when the string \*(L"at foo line 123\*(R" is appended. +Suppose you are running script \*(L"canasta\*(R". +.nf + +.ne 7 + die "/etc/games is no good"; + die "/etc/games is no good, stopped"; + +produce, respectively + + /etc/games is no good at canasta line 123. + /etc/games is no good, stopped at canasta line 123. + +.fi +See also +.IR exit . +.Ip "do BLOCK" 8 4 +Returns the value of the last command in the sequence of commands indicated +by BLOCK. +When modified by a loop modifier, executes the BLOCK once before testing the +loop condition.
+(On other statements the loop modifiers test the conditional first.) +.Ip "do SUBROUTINE (LIST)" 8 3 +Executes a SUBROUTINE declared by a +.I sub +declaration, and returns the value +of the last expression evaluated in SUBROUTINE. +If there is no subroutine by that name, produces a fatal error. +(You may use the \*(L"defined\*(R" operator to determine if a subroutine +exists.) +If you pass arrays as part of LIST you may wish to pass the length +of the array in front of each array. +(See the section on subroutines later on.) +SUBROUTINE may be a scalar variable, in which case the variable contains +the name of the subroutine to execute. +The parentheses are required to avoid confusion with the \*(L"do EXPR\*(R" +form. +.Sp +As an alternate form, you may call a subroutine by prefixing the name with +an ampersand: &foo(@args). +If you aren't passing any arguments, you don't have to use parentheses. +If you omit the parentheses, no @_ array is passed to the subroutine. +The & form is also used to specify subroutines to the defined and undef +operators. +.Ip "do EXPR" 8 3 +Uses the value of EXPR as a filename and executes the contents of the file +as a +.I perl +script. +Its primary use is to include subroutines from a +.I perl +subroutine library. +.nf + + do \'stat.pl\'; + +is just like + + eval \`cat stat.pl\`; + +.fi +except that it's more efficient, more concise, keeps track of the current +filename for error messages, and searches all the +.B \-I +libraries if the file +isn't in the current directory (see also the @INC array in Predefined Names). +It's the same, however, in that it does reparse the file every time you +call it, so if you are going to use the file inside a loop you might prefer +to use \-P and #include, at the expense of a little more startup time. +(The main problem with #include is that cpp doesn't grok # comments\*(--a +workaround is to use \*(L";#\*(R" for standalone comments.) 
+Note that the following are NOT equivalent: +.nf + +.ne 2 + do $foo; # eval a file + do $foo(); # call a subroutine + +.fi +Note that inclusion of library routines is better done with +the \*(L"require\*(R" operator. +.Ip "dump LABEL" 8 6 +This causes an immediate core dump. +Primarily this is so that you can use the undump program to turn your +core dump into an executable binary after having initialized all your +variables at the beginning of the program. +When the new binary is executed it will begin by executing a "goto LABEL" +(with all the restrictions that goto suffers). +Think of it as a goto with an intervening core dump and reincarnation. +If LABEL is omitted, restarts the program from the top. +WARNING: any files opened at the time of the dump will NOT be open any more +when the program is reincarnated, with possible resulting confusion on the part +of perl. +See also \-u. +.Sp +Example: +.nf + +.ne 16 + #!/usr/bin/perl + require 'getopt.pl'; + require 'stat.pl'; + %days = ( + 'Sun',1, + 'Mon',2, + 'Tue',3, + 'Wed',4, + 'Thu',5, + 'Fri',6, + 'Sat',7); + + dump QUICKSTART if $ARGV[0] eq '-d'; + + QUICKSTART: + do Getopt('f'); + +.fi +.Ip "each(ASSOC_ARRAY)" 8 6 +.Ip "each ASSOC_ARRAY" 8 +Returns a 2 element array consisting of the key and value for the next +value of an associative array, so that you can iterate over it. +Entries are returned in an apparently random order. +When the array is entirely read, a null array is returned (which when +assigned produces a FALSE (0) value). +The next call to each() after that will start iterating again. +The iterator can be reset only by reading all the elements from the array. +You must not modify the array while iterating over it. +There is a single iterator for each associative array, shared by all +each(), keys() and values() function calls in the program. 
+The following prints out your environment like the printenv program, only +in a different order: +.nf + +.ne 3 + while (($key,$value) = each %ENV) { + print "$key=$value\en"; + } + +.fi +See also keys() and values(). +.Ip "eof(FILEHANDLE)" 8 8 +.Ip "eof()" 8 +.Ip "eof" 8 +Returns 1 if the next read on FILEHANDLE will return end of file, or if +FILEHANDLE is not open. +FILEHANDLE may be an expression whose value gives the real filehandle name. +(Note that this function actually reads a character and then ungetc's it, +so it is not very useful in an interactive context.) +An eof without an argument returns the eof status for the last file read. +Empty parentheses () may be used to indicate the pseudo file formed of the +files listed on the command line, i.e. eof() is reasonable to use inside +a while (<>) loop to detect the end of only the last file. +Use eof(ARGV) or eof without the parentheses to test EACH file in a while (<>) loop. +Examples: +.nf + +.ne 7 + # insert dashes just before last line of last file + while (<>) { + if (eof()) { + print "\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\|\-\en"; + } + print; + } + +.ne 7 + # reset line numbering on each input file + while (<>) { + print "$.\et$_"; + if (eof) { # Not eof(). + close(ARGV); + } + } + +.fi +.Ip "eval(EXPR)" 8 6 +.Ip "eval EXPR" 8 6 +EXPR is parsed and executed as if it were a little +.I perl +program. +It is executed in the context of the current +.I perl +program, so that +any variable settings, subroutine or format definitions remain afterwards. +The value returned is the value of the last expression evaluated, just +as with subroutines. +If there is a syntax error or runtime error, or a die statement is +executed, an undefined value is returned by +eval, and $@ is set to the error message. +If there was no error, $@ is guaranteed to be a null string. +If EXPR is omitted, evaluates $_. +The final semicolon, if any, may be omitted from the expression. 
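The return value and $@ behavior described above can be sketched as follows.
The variable names are made up for illustration, and the code sticks to constructs that should also run on a current perl:

```perl
# eval returns the value of the last expression evaluated
$sum = eval "2 + 3";
print "sum is $sum\n" if $@ eq '';

# a die inside the eval is trapped: eval returns the undefined
# value and the message is left in $@
$result = eval 'die "no good\n"; 1';
print "trapped: $@" unless defined($result);

# after an eval with no error, $@ is the null string again
eval "1";
print "clean\n" if $@ eq '';
```

Because the die message ends in a newline, no line number is appended to it.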
+.Sp +Note that, since eval traps otherwise-fatal errors, it is useful for +determining whether a particular feature +(such as dbmopen or symlink) is implemented. +It is also Perl's exception trapping mechanism, where the die operator is +used to raise exceptions. +.Ip "exec(LIST)" 8 8 +.Ip "exec LIST" 8 6 +If there is more than one argument in LIST, or if LIST is an array with +more than one value, +calls execvp() with the arguments in LIST. +If there is only one scalar argument, the argument is checked for shell metacharacters. +If there are any, the entire argument is passed to \*(L"/bin/sh \-c\*(R" for parsing. +If there are none, the argument is split into words and passed directly to +execvp(), which is more efficient. +Note: exec (and system) do not flush your output buffer, so you may need to +set $| to avoid lost output. +Examples: +.nf + + exec \'/bin/echo\', \'Your arguments are: \', @ARGV; + exec "sort $outfile | uniq"; + +.fi +.Sp +If you don't really want to execute the first argument, but want to lie +to the program you are executing about its own name, you can specify +the program you actually want to run by assigning that to a variable and +putting the name of the variable in front of the LIST without a comma. +(This always forces interpretation of the LIST as a multi-valued list, even +if there is only a single scalar in the list.) +Example: +.nf + +.ne 2 + $shell = '/bin/csh'; + exec $shell '-sh'; # pretend it's a login shell + +.fi +.Ip "exit(EXPR)" 8 6 +.Ip "exit EXPR" 8 +Evaluates EXPR and exits immediately with that value. +Example: +.nf + +.ne 2 + $ans = <STDIN>; + exit 0 \|if \|$ans \|=~ \|/\|^[Xx]\|/\|; + +.fi +See also +.IR die . +If EXPR is omitted, exits with 0 status. +.Ip "exp(EXPR)" 8 3 +.Ip "exp EXPR" 8 +Returns +.I e +to the power of EXPR. +If EXPR is omitted, gives exp($_). +.Ip "fcntl(FILEHANDLE,FUNCTION,SCALAR)" 8 4 +Implements the fcntl(2) function. 
+You'll probably have to say +.nf + + require "fcntl.ph"; # probably /usr/local/lib/perl/fcntl.ph + +.fi +first to get the correct function definitions. +If fcntl.ph doesn't exist or doesn't have the correct definitions +you'll have to roll +your own, based on your C header files such as <sys/fcntl.h>. +(There is a perl script called h2ph that comes with the perl kit +which may help you in this.) +Argument processing and value return works just like ioctl below. +Note that fcntl will produce a fatal error if used on a machine that doesn't implement +fcntl(2). +.Ip "fileno(FILEHANDLE)" 8 4 +.Ip "fileno FILEHANDLE" 8 4 +Returns the file descriptor for a filehandle. +Useful for constructing bitmaps for select(). +If FILEHANDLE is an expression, the value is taken as the name of +the filehandle. +.Ip "flock(FILEHANDLE,OPERATION)" 8 4 +Calls flock(2) on FILEHANDLE. +See manual page for flock(2) for definition of OPERATION. +Returns true for success, false on failure. +Will produce a fatal error if used on a machine that doesn't implement +flock(2). +Here's a mailbox appender for BSD systems. +.nf + +.ne 20 + $LOCK_SH = 1; + $LOCK_EX = 2; + $LOCK_NB = 4; + $LOCK_UN = 8; + + sub lock { + flock(MBOX,$LOCK_EX); + # and, in case someone appended + # while we were waiting... + seek(MBOX, 0, 2); + } + + sub unlock { + flock(MBOX,$LOCK_UN); + } + + open(MBOX, ">>/usr/spool/mail/$ENV{'USER'}") + || die "Can't open mailbox: $!"; + + do lock(); + print MBOX $msg,"\en\en"; + do unlock(); + +.fi +.Ip "fork" 8 4 +Does a fork() call. +Returns the child pid to the parent process and 0 to the child process. +Note: unflushed buffers remain unflushed in both processes, which means +you may need to set $| to avoid duplicate output. +.Ip "getc(FILEHANDLE)" 8 4 +.Ip "getc FILEHANDLE" 8 +.Ip "getc" 8 +Returns the next character from the input file attached to FILEHANDLE, or +a null string at EOF. +If FILEHANDLE is omitted, reads from STDIN. 
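As a rough sketch of reading a file one character at a time with getc: the scratch file name and the handle names OUT and IN are only examples, not anything the function requires.

```perl
# write a small scratch file, then count its characters with getc
$tmp = "/tmp/getc$$";                 # example name, $$ is our pid
open(OUT, ">$tmp") || die "Can't create $tmp: $!\n";
print OUT "abc\n";
close(OUT);

open(IN, $tmp) || die "Can't open $tmp: $!\n";
$count = 0;
$text = '';
# getc hands back a null string at EOF; newer perls return the
# undefined value instead, which also ends this loop
while (($ch = getc(IN)) ne '') {
    $text .= $ch;
    $count++;
}
close(IN);
unlink($tmp);
print "read $count characters\n";
```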
+.Ip "getlogin" 8 3 +Returns the current login from /etc/utmp, if any. +If null, use getpwuid. + + $login = getlogin || (getpwuid($<))[0] || "Somebody"; + +.Ip "getpeername(SOCKET)" 8 3 +Returns the packed sockaddr address of the other end of the SOCKET connection. +.nf + +.ne 4 + # An internet sockaddr + $sockaddr = 'S n a4 x8'; + $hersockaddr = getpeername(S); +.ie t \{\ + ($family, $port, $heraddr) = unpack($sockaddr,$hersockaddr); +'br\} +.el \{\ + ($family, $port, $heraddr) = + unpack($sockaddr,$hersockaddr); +'br\} + +.fi +.Ip "getpgrp(PID)" 8 4 +.Ip "getpgrp PID" 8 +Returns the current process group for the specified PID, 0 for the current +process. +Will produce a fatal error if used on a machine that doesn't implement +getpgrp(2). +If PID is omitted, returns the process group of the current process. +.Ip "getppid" 8 4 +Returns the process id of the parent process. +.Ip "getpriority(WHICH,WHO)" 8 4 +Returns the current priority for a process, a process group, or a user. +(See getpriority(2).) +Will produce a fatal error if used on a machine that doesn't implement +getpriority(2). +.Ip "getpwnam(NAME)" 8 +.Ip "getgrnam(NAME)" 8 +.Ip "gethostbyname(NAME)" 8 +.Ip "getnetbyname(NAME)" 8 +.Ip "getprotobyname(NAME)" 8 +.Ip "getpwuid(UID)" 8 +.Ip "getgrgid(GID)" 8 +.Ip "getservbyname(NAME,PROTO)" 8 +.Ip "gethostbyaddr(ADDR,ADDRTYPE)" 8 +.Ip "getnetbyaddr(ADDR,ADDRTYPE)" 8 +.Ip "getprotobynumber(NUMBER)" 8 +.Ip "getservbyport(PORT,PROTO)" 8 +.Ip "getpwent" 8 +.Ip "getgrent" 8 +.Ip "gethostent" 8 +.Ip "getnetent" 8 +.Ip "getprotoent" 8 +.Ip "getservent" 8 +.Ip "setpwent" 8 +.Ip "setgrent" 8 +.Ip "sethostent(STAYOPEN)" 8 +.Ip "setnetent(STAYOPEN)" 8 +.Ip "setprotoent(STAYOPEN)" 8 +.Ip "setservent(STAYOPEN)" 8 +.Ip "endpwent" 8 +.Ip "endgrent" 8 +.Ip "endhostent" 8 +.Ip "endnetent" 8 +.Ip "endprotoent" 8 +.Ip "endservent" 8 +These routines perform the same functions as their counterparts in the +system library.
+The return values from the various get routines are as follows: +.nf + + ($name,$passwd,$uid,$gid, + $quota,$comment,$gcos,$dir,$shell) = getpw.\|.\|. + ($name,$passwd,$gid,$members) = getgr.\|.\|. + ($name,$aliases,$addrtype,$length,@addrs) = gethost.\|.\|. + ($name,$aliases,$addrtype,$net) = getnet.\|.\|. + ($name,$aliases,$proto) = getproto.\|.\|. + ($name,$aliases,$port,$proto) = getserv.\|.\|. + +.fi +The $members value returned by getgr.\|.\|. is a space-separated list +of the login names of the members of the group. +.Sp +The @addrs value returned by the gethost.\|.\|. functions is a list of the +raw addresses returned by the corresponding system library call. +In the Internet domain, each address is four bytes long and you can unpack +it by saying something like: +.nf + + ($a,$b,$c,$d) = unpack('C4',$addrs[0]); + +.fi +.Ip "getsockname(SOCKET)" 8 3 +Returns the packed sockaddr address of this end of the SOCKET connection. +.nf + +.ne 4 + # An internet sockaddr + $sockaddr = 'S n a4 x8'; + $mysockaddr = getsockname(S); +.ie t \{\ + ($family, $port, $myaddr) = unpack($sockaddr,$mysockaddr); +'br\} +.el \{\ + ($family, $port, $myaddr) = + unpack($sockaddr,$mysockaddr); +'br\} + +.fi +.Ip "getsockopt(SOCKET,LEVEL,OPTNAME)" 8 3 +Returns the socket option requested, or undefined if there is an error. +.Ip "gmtime(EXPR)" 8 4 +.Ip "gmtime EXPR" 8 +Converts a time as returned by the time function to a 9-element array with +the time analyzed for the Greenwich timezone. +Typically used as follows: +.nf + +.ne 3 +.ie t \{\ + ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = gmtime(time); +'br\} +.el \{\ + ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = + gmtime(time); +'br\} + +.fi +All array elements are numeric, and come straight out of a struct tm. +In particular this means that $mon has the range 0.\|.11 and $wday has the +range 0.\|.6. +If EXPR is omitted, does gmtime(time).
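Since the elements come straight out of a struct tm, a little arithmetic is needed before printing them.
Here is a minimal sketch that formats time value 0; the @dayname array and $stamp are names invented for the example:

```perl
# format the epoch itself (time value 0) as a Greenwich date
($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = gmtime(0);
@dayname = ('Sun','Mon','Tue','Wed','Thu','Fri','Sat');
$stamp = sprintf("%s %04d-%02d-%02d %02d:%02d:%02d",
    $dayname[$wday],   # $wday is 0..6, Sunday first
    $year + 1900,      # tm_year counts years since 1900
    $mon + 1,          # $mon is 0..11
    $mday, $hour, $min, $sec);
print "$stamp\n";      # Thu 1970-01-01 00:00:00
```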
+.Ip "goto LABEL" 8 6 +Finds the statement labeled with LABEL and resumes execution there. +Currently you may only go to statements in the main body of the program +that are not nested inside a do {} construct. +This statement is not implemented very efficiently, and is here only to make +the +.IR sed -to- perl +translator easier. +I may change its semantics at any time, consistent with support for translated +.I sed +scripts. +Use it at your own risk. +Better yet, don't use it at all. +.Ip "grep(EXPR,LIST)" 8 4 +Evaluates EXPR for each element of LIST (locally setting $_ to each element) +and returns the array value consisting of those elements for which the +expression evaluated to true. +In a scalar context, returns the number of times the expression was true. +.nf + + @foo = grep(!/^#/, @bar); # weed out comments + +.fi +Note that, since $_ is a reference into the array value, it can be +used to modify the elements of the array. +While this is useful and supported, it can cause bizarre results if +the LIST is not a named array. +.Ip "hex(EXPR)" 8 4 +.Ip "hex EXPR" 8 +Returns the decimal value of EXPR interpreted as a hex string. +(To interpret strings that might start with 0 or 0x see oct().) +If EXPR is omitted, uses $_. +.Ip "index(STR,SUBSTR,POSITION)" 8 4 +.Ip "index(STR,SUBSTR)" 8 4 +Returns the position of the first occurrence of SUBSTR in STR at or after +POSITION. +If POSITION is omitted, starts searching from the beginning of the string. +The return value is based at 0, or whatever you've +set the $[ variable to. +If the substring is not found, returns one less than the base, ordinarily \-1. +.Ip "int(EXPR)" 8 4 +.Ip "int EXPR" 8 +Returns the integer portion of EXPR. +If EXPR is omitted, uses $_. +.Ip "ioctl(FILEHANDLE,FUNCTION,SCALAR)" 8 4 +Implements the ioctl(2) function. +You'll probably have to say +.nf + + require "ioctl.ph"; # probably /usr/local/lib/perl/ioctl.ph + +.fi +first to get the correct function definitions.
+If ioctl.ph doesn't exist or doesn't have the correct definitions +you'll have to roll +your own, based on your C header files such as <sys/ioctl.h>. +(There is a perl script called h2ph that comes with the perl kit +which may help you in this.) +SCALAR will be read and/or written depending on the FUNCTION\*(--a pointer +to the string value of SCALAR will be passed as the third argument of +the actual ioctl call. +(If SCALAR has no string value but does have a numeric value, that value +will be passed rather than a pointer to the string value. +To guarantee this to be true, add a 0 to the scalar before using it.) +The pack() and unpack() functions are useful for manipulating the values +of structures used by ioctl(). +The following example sets the erase character to DEL. +.nf + +.ne 9 + require 'ioctl.ph'; + $sgttyb_t = "ccccs"; # 4 chars and a short + if (ioctl(STDIN,$TIOCGETP,$sgttyb)) { + @ary = unpack($sgttyb_t,$sgttyb); + $ary[2] = 127; + $sgttyb = pack($sgttyb_t,@ary); + ioctl(STDIN,$TIOCSETP,$sgttyb) + || die "Can't ioctl: $!"; + } + +.fi +The return value of ioctl (and fcntl) is as follows: +.nf + +.ne 4 + if OS returns:\h'|3i'perl returns: + -1\h'|3i' undefined value + 0\h'|3i' string "0 but true" + anything else\h'|3i' that number + +.fi +Thus perl returns true on success and false on failure, yet you can still +easily determine the actual value returned by the operating system: +.nf + + ($retval = ioctl(...)) || ($retval = -1); + printf "System returned %d\en", $retval; +.fi +.Ip "join(EXPR,LIST)" 8 8 +.Ip "join(EXPR,ARRAY)" 8 +Joins the separate strings of LIST or ARRAY into a single string with fields +separated by the value of EXPR, and returns the string. +Example: +.nf + +.ie t \{\ + $_ = join(\|\':\', $login,$passwd,$uid,$gid,$gcos,$home,$shell); +'br\} +.el \{\ + $_ = join(\|\':\', + $login,$passwd,$uid,$gid,$gcos,$home,$shell); +'br\} + +.fi +See +.IR split . 
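As a small sketch of join working together with split: the field values below are made up for illustration.

```perl
# build a passwd-style line and take it apart again
@fields = ('guest', 'x', 100, 10, 'Guest User', '/home/guest', '/bin/sh');
$line = join(':', @fields);
print "$line\n";

# split on the same separator recovers the fields
@back = split(/:/, $line);
print "round trip ok\n" if join(':', @back) eq $line;
```

Note that the round trip is only exact when no field contains the separator itself, and when the trailing fields are not empty, since split strips trailing null fields.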
+.Ip "keys(ASSOC_ARRAY)" 8 6 +.Ip "keys ASSOC_ARRAY" 8 +Returns a normal array consisting of all the keys of the named associative +array. +The keys are returned in an apparently random order, but it is the same order +as either the values() or each() function produces (given that the associative array +has not been modified). +Here is yet another way to print your environment: +.nf + +.ne 5 + @keys = keys %ENV; + @values = values %ENV; + while ($#keys >= 0) { + print pop(@keys), \'=\', pop(@values), "\en"; + } + +or how about sorted by key: + +.ne 3 + foreach $key (sort(keys %ENV)) { + print $key, \'=\', $ENV{$key}, "\en"; + } + +.fi +.Ip "kill(LIST)" 8 8 +.Ip "kill LIST" 8 2 +Sends a signal to a list of processes. +The first element of the list must be the signal to send. +Returns the number of processes successfully signaled. +.nf + + $cnt = kill 1, $child1, $child2; + kill 9, @goners; + +.fi +If the signal is negative, kills process groups instead of processes. +(On System V, a negative \fIprocess\fR number will also kill process groups, +but that's not portable.) +You may use a signal name in quotes. +.Ip "last LABEL" 8 8 +.Ip "last" 8 +The +.I last +command is like the +.I break +statement in C (as used in loops); it immediately exits the loop in question. +If the LABEL is omitted, the command refers to the innermost enclosing loop. +The +.I continue +block, if any, is not executed: +.nf + +.ne 4 + line: while (<STDIN>) { + last line if /\|^$/; # exit when done with header + .\|.\|. + } + +.fi +.Ip "length(EXPR)" 8 4 +.Ip "length EXPR" 8 +Returns the length in characters of the value of EXPR. +If EXPR is omitted, returns length of $_. +.Ip "link(OLDFILE,NEWFILE)" 8 2 +Creates a new filename linked to the old filename. +Returns 1 for success, 0 otherwise. +.Ip "listen(SOCKET,QUEUESIZE)" 8 2 +Does the same thing that the listen system call does. +Returns true if it succeeded, false otherwise. +See example in section on Interprocess Communication. 
+.Ip "local(LIST)" 8 4 +Declares the listed variables to be local to the enclosing block, +subroutine, eval or \*(L"do\*(R". +All the listed elements must be legal lvalues. +This operator works by saving the current values of those variables in LIST +on a hidden stack and restoring them upon exiting the block, subroutine or eval. +This means that called subroutines can also reference the local variable, +but not the global one. +The LIST may be assigned to if desired, which allows you to initialize +your local variables. +(If no initializer is given for a particular variable, it is created with +an undefined value.) +Commonly this is used to name the parameters to a subroutine. +Examples: +.nf + +.ne 13 + sub RANGEVAL { + local($min, $max, $thunk) = @_; + local($result) = \'\'; + local($i); + + # Presumably $thunk makes reference to $i + + for ($i = $min; $i < $max; $i++) { + $result .= eval $thunk; + } + + $result; + } + +.ne 6 + if ($sw eq \'-v\') { + # init local array with global array + local(@ARGV) = @ARGV; + unshift(@ARGV,\'echo\'); + system @ARGV; + } + # @ARGV restored + +.ne 6 + # temporarily add to digits associative array + if ($base12) { + # (NOTE: not claiming this is efficient!) + local(%digits) = (%digits,'t',10,'e',11); + do parse_num(); + } + +.fi +Note that local() is a run-time command, and so gets executed every time +through a loop, using up more stack storage each time until it's all +released at once when the loop is exited. +.Ip "localtime(EXPR)" 8 4 +.Ip "localtime EXPR" 8 +Converts a time as returned by the time function to a 9-element array with +the time analyzed for the local timezone. +Typically used as follows: +.nf + +.ne 3 +.ie t \{\ + ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time); +'br\} +.el \{\ + ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = + localtime(time); +'br\} + +.fi +All array elements are numeric, and come straight out of a struct tm. 
+In particular this means that $mon has the range 0.\|.11 and $wday has the +range 0.\|.6. +If EXPR is omitted, does localtime(time). +.Ip "log(EXPR)" 8 4 +.Ip "log EXPR" 8 +Returns logarithm (base +.IR e ) +of EXPR. +If EXPR is omitted, returns log of $_. +.Ip "lstat(FILEHANDLE)" 8 6 +.Ip "lstat FILEHANDLE" 8 +.Ip "lstat(EXPR)" 8 +.Ip "lstat SCALARVARIABLE" 8 +Does the same thing as the stat() function, but stats a symbolic link +instead of the file the symbolic link points to. +If symbolic links are unimplemented on your system, a normal stat is done. +.Ip "m/PATTERN/io" 8 4 +.Ip "/PATTERN/io" 8 +Searches a string for a pattern match, and returns true (1) or false (\'\'). +If no string is specified via the =~ or !~ operator, +the $_ string is searched. +(The string specified with =~ need not be an lvalue\*(--it may be the result of an expression evaluation, but remember the =~ binds rather tightly.) +See also the section on regular expressions. +.Sp +If / is the delimiter then the initial \*(L'm\*(R' is optional. +With the \*(L'm\*(R' you can use any pair of non-alphanumeric characters +as delimiters. +This is particularly useful for matching Unix path names that contain \*(L'/\*(R'. +If the final delimiter is followed by the optional letter \*(L'i\*(R', the matching is +done in a case-insensitive manner. +PATTERN may contain references to scalar variables, which will be interpolated +(and the pattern recompiled) every time the pattern search is evaluated. +(Note that $) and $| may not be interpolated because they look like end-of-string tests.) +If you want such a pattern to be compiled only once, add an \*(L"o\*(R" after +the trailing delimiter. +This avoids expensive run-time recompilations, and +is useful when the value you are interpolating won't change over the +life of the script. +If the PATTERN evaluates to a null string, the most recent successful +regular expression is used instead. 
+.Sp +If used in a context that requires an array value, a pattern match returns an +array consisting of the subexpressions matched by the parentheses in the +pattern, +i.e. ($1, $2, $3.\|.\|.). +It does NOT actually set $1, $2, etc. in this case, nor does it set $+, $`, $& +or $'. +If the match fails, a null array is returned. +If the match succeeds, but there were no parentheses, an array value of (1) +is returned. +.Sp +Examples: +.nf + +.ne 4 + open(tty, \'/dev/tty\'); + <tty> \|=~ \|/\|^y\|/i \|&& \|do foo(\|); # do foo if desired + + if (/Version: \|*\|([0\-9.]*\|)\|/\|) { $version = $1; } + + next if m#^/usr/spool/uucp#; + +.ne 5 + # poor man's grep + $arg = shift; + while (<>) { + print if /$arg/o; # compile only once + } + + if (($F1, $F2, $Etc) = ($foo =~ /^(\eS+)\es+(\eS+)\es*(.*)/)) + +.fi +This last example splits $foo into the first two words and the remainder +of the line, and assigns those three fields to $F1, $F2 and $Etc. +The conditional is true if any variables were assigned, i.e. if the pattern +matched. +.Ip "mkdir(FILENAME,MODE)" 8 3 +Creates the directory specified by FILENAME, with permissions specified by +MODE (as modified by umask). +If it succeeds it returns 1, otherwise it returns 0 and sets $! (errno). +.Ip "msgctl(ID,CMD,ARG)" 8 4 +Calls the System V IPC function msgctl. If CMD is &IPC_STAT, then ARG +must be a variable which will hold the returned msqid_ds structure. +Returns like ioctl: the undefined value for error, "0 but true" for +zero, or the actual return value otherwise. +.Ip "msgget(KEY,FLAGS)" 8 4 +Calls the System V IPC function msgget. Returns the message queue id, +or the undefined value if there is an error. +.Ip "msgsnd(ID,MSG,FLAGS)" 8 4 +Calls the System V IPC function msgsnd to send the message MSG to the +message queue ID. MSG must begin with the long integer message type, +which may be created with pack("L", $type). Returns true if +successful, or false if there is an error. 
+.Ip "msgrcv(ID,VAR,SIZE,TYPE,FLAGS)" 8 4 +Calls the System V IPC function msgrcv to receive a message from +message queue ID into variable VAR with a maximum message size of +SIZE. Note that if a message is received, the message type will be +the first thing in VAR, and the maximum length of VAR is SIZE plus the +size of the message type. Returns true if successful, or false if +there is an error. +''' Beginning of part 3 +''' $Header: perl.man,v 4.0 91/03/20 01:38:08 lwall Locked $ +''' +''' $Log: perl.man,v $ +''' Revision 4.0 91/03/20 01:38:08 lwall +''' 4.0 baseline. +''' +''' Revision 3.0.1.12 91/01/11 18:18:15 lwall +''' patch42: added binary and hex pack/unpack options +''' +''' Revision 3.0.1.11 90/11/10 01:48:21 lwall +''' patch38: random cleanup +''' patch38: documented tr///cds +''' +''' Revision 3.0.1.10 90/10/20 02:15:17 lwall +''' patch37: patch37: fixed various typos in man page +''' +''' Revision 3.0.1.9 90/10/16 10:02:43 lwall +''' patch29: you can now read into the middle string +''' patch29: index and substr now have optional 3rd args +''' patch29: added scalar reverse +''' patch29: added scalar +''' patch29: added SysV IPC +''' patch29: added waitpid +''' patch29: added sysread and syswrite +''' +''' Revision 3.0.1.8 90/08/09 04:39:04 lwall +''' patch19: added require operator +''' patch19: added truncate operator +''' patch19: unpack can do checksumming +''' +''' Revision 3.0.1.7 90/08/03 11:15:42 lwall +''' patch19: Intermediate diffs for Randal +''' +''' Revision 3.0.1.6 90/03/27 16:17:56 lwall +''' patch16: MSDOS support +''' +''' Revision 3.0.1.5 90/03/12 16:52:21 lwall +''' patch13: documented that print $filehandle &foo is ambiguous +''' patch13: added splice operator: @oldelems = splice(@array,$offset,$len,LIST) +''' +''' Revision 3.0.1.4 90/02/28 18:00:09 lwall +''' patch9: added pipe function +''' patch9: documented how to handle arbitrary weird characters in filenames +''' patch9: documented the unflushed buffers problem on piped 
opens +''' patch9: documented how to force top of page +''' +''' Revision 3.0.1.3 89/12/21 20:10:12 lwall +''' patch7: documented that s`pat`repl` does command substitution on replacement +''' patch7: documented that $timeleft from select() is likely not implemented +''' +''' Revision 3.0.1.2 89/11/17 15:31:05 lwall +''' patch5: fixed some manual typos and indent problems +''' patch5: added warning about print making an array context +''' +''' Revision 3.0.1.1 89/11/11 04:45:06 lwall +''' patch2: made some line breaks depend on troff vs. nroff +''' +''' Revision 3.0 89/10/18 15:21:46 lwall +''' 3.0 baseline +''' +.Ip "next LABEL" 8 8 +.Ip "next" 8 +The +.I next +command is like the +.I continue +statement in C; it starts the next iteration of the loop: +.nf + +.ne 4 + line: while (<STDIN>) { + next line if /\|^#/; # discard comments + .\|.\|. + } + +.fi +Note that if there were a +.I continue +block on the above, it would get executed even on discarded lines. +If the LABEL is omitted, the command refers to the innermost enclosing loop. +.Ip "oct(EXPR)" 8 4 +.Ip "oct EXPR" 8 +Returns the decimal value of EXPR interpreted as an octal string. +(If EXPR happens to start off with 0x, interprets it as a hex string instead.) +The following will handle decimal, octal and hex in the standard notation: +.nf + + $val = oct($val) if $val =~ /^0/; + +.fi +If EXPR is omitted, uses $_. +.Ip "open(FILEHANDLE,EXPR)" 8 8 +.Ip "open(FILEHANDLE)" 8 +.Ip "open FILEHANDLE" 8 +Opens the file whose filename is given by EXPR, and associates it with +FILEHANDLE. +If FILEHANDLE is an expression, its value is used as the name of the +real filehandle wanted. +If EXPR is omitted, the scalar variable of the same name as the FILEHANDLE +contains the filename. +If the filename begins with \*(L"<\*(R" or nothing, the file is opened for +input. +If the filename begins with \*(L">\*(R", the file is opened for output. +If the filename begins with \*(L">>\*(R", the file is opened for appending. 
+(You can put a \'+\' in front of the \'>\' or \'<\' to indicate that you +want both read and write access to the file.) +If the filename begins with \*(L"|\*(R", the filename is interpreted +as a command to which output is to be piped, and if the filename ends +with a \*(L"|\*(R", the filename is interpreted as a command which pipes +input to us. +(You may not have a command that pipes both in and out.) +Opening \'\-\' opens +.I STDIN +and opening \'>\-\' opens +.IR STDOUT . +Open returns non-zero upon success, the undefined value otherwise. +If the open involved a pipe, the return value happens to be the pid +of the subprocess. +Examples: +.nf + +.ne 3 + $article = 100; + open article || die "Can't find article $article: $!\en"; + while (<article>) {\|.\|.\|. + +.ie t \{\ + open(LOG, \'>>/usr/spool/news/twitlog\'\|); # (log is reserved) +'br\} +.el \{\ + open(LOG, \'>>/usr/spool/news/twitlog\'\|); + # (log is reserved) +'br\} + +.ie t \{\ + open(article, "caesar <$article |"\|); # decrypt article +'br\} +.el \{\ + open(article, "caesar <$article |"\|); + # decrypt article +'br\} + +.ie t \{\ + open(extract, "|sort >/tmp/Tmp$$"\|); # $$ is our process# +'br\} +.el \{\ + open(extract, "|sort >/tmp/Tmp$$"\|); + # $$ is our process# +'br\} + +.ne 7 + # process argument list of files along with any includes + + foreach $file (@ARGV) { + do process($file, \'fh00\'); # no pun intended + } + + sub process { + local($filename, $input) = @_; + $input++; # this is a string increment + unless (open($input, $filename)) { + print STDERR "Can't open $filename: $!\en"; + return; + } +.ie t \{\ + while (<$input>) { # note the use of indirection +'br\} +.el \{\ + while (<$input>) { # note use of indirection +'br\} + if (/^#include "(.*)"/) { + do process($1, $input); + next; + } + .\|.\|.
# whatever + } + } + +.fi +You may also, in the Bourne shell tradition, specify an EXPR beginning +with \*(L">&\*(R", in which case the rest of the string +is interpreted as the name of a filehandle +(or file descriptor, if numeric) which is to be duped and opened. +You may use & after >, >>, <, +>, +>> and +<. +The mode you specify should match the mode of the original filehandle. +Here is a script that saves, redirects, and restores +.I STDOUT +and +.IR STDERR : +.nf + +.ne 21 + #!/usr/bin/perl + open(SAVEOUT, ">&STDOUT"); + open(SAVEERR, ">&STDERR"); + + open(STDOUT, ">foo.out") || die "Can't redirect stdout"; + open(STDERR, ">&STDOUT") || die "Can't dup stdout"; + + select(STDERR); $| = 1; # make unbuffered + select(STDOUT); $| = 1; # make unbuffered + + print STDOUT "stdout 1\en"; # this works for + print STDERR "stderr 1\en"; # subprocesses too + + close(STDOUT); + close(STDERR); + + open(STDOUT, ">&SAVEOUT"); + open(STDERR, ">&SAVEERR"); + + print STDOUT "stdout 2\en"; + print STDERR "stderr 2\en"; + +.fi +If you open a pipe on the command \*(L"\-\*(R", i.e. either \*(L"|\-\*(R" or \*(L"\-|\*(R", +then there is an implicit fork done, and the return value of open +is the pid of the child within the parent process, and 0 within the child +process. +(Use defined($pid) to determine if the open was successful.) +The filehandle behaves normally for the parent, but i/o to that +filehandle is piped from/to the +.IR STDOUT / STDIN +of the child process. +In the child process the filehandle isn't opened\*(--i/o happens from/to +the new +.I STDOUT +or +.IR STDIN . +Typically this is used like the normal piped open when you want to exercise +more control over just how the pipe command gets executed, such as when +you are running setuid, and don't want to have to scan shell commands +for metacharacters. 
+The following pairs are more or less equivalent: +.nf + +.ne 5 + open(FOO, "|tr \'[a\-z]\' \'[A\-Z]\'"); + open(FOO, "|\-") || exec \'tr\', \'[a\-z]\', \'[A\-Z]\'; + + open(FOO, "cat \-n '$file'|"); + open(FOO, "\-|") || exec \'cat\', \'\-n\', $file; + +.fi +Explicitly closing any piped filehandle causes the parent process to wait for the +child to finish, and returns the status value in $?. +Note: on any operation which may do a fork, +unflushed buffers remain unflushed in both +processes, which means you may need to set $| to +avoid duplicate output. +.Sp +The filename that is passed to open will have leading and trailing +whitespace deleted. +In order to open a file with arbitrary weird characters in it, it's necessary +to protect any leading and trailing whitespace thusly: +.nf + +.ne 2 + $file =~ s#^(\es)#./$1#; + open(FOO, "< $file\e0"); + +.fi +.Ip "opendir(DIRHANDLE,EXPR)" 8 3 +Opens a directory named EXPR for processing by readdir(), telldir(), seekdir(), +rewinddir() and closedir(). +Returns true if successful. +DIRHANDLEs have their own namespace separate from FILEHANDLEs. +.Ip "ord(EXPR)" 8 4 +.Ip "ord EXPR" 8 +Returns the numeric ascii value of the first character of EXPR. +If EXPR is omitted, uses $_. +''' Comments on f & d by gnb@melba.bby.oz.au 22/11/89 +.Ip "pack(TEMPLATE,LIST)" 8 4 +Takes an array or list of values and packs it into a binary structure, +returning the string containing the structure. +The TEMPLATE is a sequence of characters that give the order and type +of values, as follows: +.nf + + A An ascii string, will be space padded. + a An ascii string, will be null padded. + c A signed char value. + C An unsigned char value. + s A signed short value. + S An unsigned short value. + i A signed integer value. + I An unsigned integer value. + l A signed long value. + L An unsigned long value. + n A short in \*(L"network\*(R" order. + N A long in \*(L"network\*(R" order. + f A single-precision float in the native format. 
+ d A double-precision float in the native format. + p A pointer to a string. + x A null byte. + X Back up a byte. + @ Null fill to absolute position. + u A uuencoded string. + b A bit string (ascending bit order, like vec()). + B A bit string (descending bit order). + h A hex string (low nybble first). + H A hex string (high nybble first). + +.fi +Each letter may optionally be followed by a number which gives a repeat +count. +With all types except "a", "A", "b", "B", "h" and "H", +the pack function will gobble up that many values +from the LIST. +A * for the repeat count means to use however many items are left. +The "a" and "A" types gobble just one value, but pack it as a string of length +count, +padding with nulls or spaces as necessary. +(When unpacking, "A" strips trailing spaces and nulls, but "a" does not.) +Likewise, the "b" and "B" fields pack a string that many bits long. +The "h" and "H" fields pack a string that many nybbles long. +Real numbers (floats and doubles) are in the native machine format +only; due to the multiplicity of floating formats around, and the lack +of a standard \*(L"network\*(R" representation, no facility for +interchange has been made. +This means that packed floating point data +written on one machine may not be readable on another - even if both +use IEEE floating point arithmetic (as the endian-ness of the memory +representation is not part of the IEEE spec). +Note that perl uses +doubles internally for all numeric calculation, and converting from +double -> float -> double will lose precision (i.e. unpack("f", +pack("f", $foo)) will not in general equal $foo). 
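As a rough illustration of the precision note above, the following sketch packs a value as a single-precision float and unpacks it again. The variable names ($foo, $back, $half) are made up for the example, and the digits actually lost depend on the machine's float format, so only the inequality is checked.

```perl
# Round-trip a double through a single-precision float.
$foo  = 0.1;                            # not exactly representable in binary
$back = unpack("f", pack("f", $foo));   # double -> float -> double

print "precision lost\n" if $back != $foo;

# A value that fits in a float exactly survives the trip.
$half = unpack("f", pack("f", 0.5));
print "0.5 survived\n" if $half == 0.5;
```

If the exact value matters, packing with "d" instead of "f" avoids the narrowing step, at the cost of a larger field.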
+.br +Examples: +.nf + + $foo = pack("cccc",65,66,67,68); + # foo eq "ABCD" + $foo = pack("c4",65,66,67,68); + # same thing + + $foo = pack("ccxxcc",65,66,67,68); + # foo eq "AB\e0\e0CD" + + $foo = pack("s2",1,2); + # "\e1\e0\e2\e0" on little-endian + # "\e0\e1\e0\e2" on big-endian + + $foo = pack("a4","abcd","x","y","z"); + # "abcd" + + $foo = pack("aaaa","abcd","x","y","z"); + # "axyz" + + $foo = pack("a14","abcdefg"); + # "abcdefg\e0\e0\e0\e0\e0\e0\e0" + + $foo = pack("i9pl", gmtime); + # a real struct tm (on my system anyway) + + sub bintodec { + unpack("N", pack("B32", substr("0" x 32 . shift, -32))); + } +.fi +The same template may generally also be used in the unpack function. +.Ip "pipe(READHANDLE,WRITEHANDLE)" 8 3 +Opens a pair of connected pipes like the corresponding system call. +Note that if you set up a loop of piped processes, deadlock can occur +unless you are very careful. +In addition, note that perl's pipes use stdio buffering, so you may need +to set $| to flush your WRITEHANDLE after each command, depending on +the application. +[Requires version 3.0 patchlevel 9.] +.Ip "pop(ARRAY)" 8 +.Ip "pop ARRAY" 8 6 +Pops and returns the last value of the array, shortening the array by 1. +Has the same effect as +.nf + + $tmp = $ARRAY[$#ARRAY\-\|\-]; + +.fi +If there are no elements in the array, returns the undefined value. +.Ip "print(FILEHANDLE LIST)" 8 10 +.Ip "print(LIST)" 8 +.Ip "print FILEHANDLE LIST" 8 +.Ip "print LIST" 8 +.Ip "print" 8 +Prints a string or a comma-separated list of strings. +Returns non-zero if successful. +FILEHANDLE may be a scalar variable name, in which case the variable contains +the name of the filehandle, thus introducing one level of indirection. +(NOTE: If FILEHANDLE is a variable and the next token is a term, it may be +misinterpreted as an operator unless you interpose a + or put parens around +the arguments.) 
+If FILEHANDLE is omitted, prints by default to standard output (or to the +last selected output channel\*(--see select()). +If LIST is also omitted, prints $_ to +.IR STDOUT . +To set the default output channel to something other than +.I STDOUT +use the select operation. +Note that, because print takes a LIST, anything in the LIST is evaluated +in an array context, and any subroutine that you call will have one or more +of its expressions evaluated in an array context. +Also be careful not to follow the print keyword with a left parenthesis +unless you want the corresponding right parenthesis to terminate the +arguments to the print\*(--interpose a + or put parens around all the arguments. +.Ip "printf(FILEHANDLE LIST)" 8 10 +.Ip "printf(LIST)" 8 +.Ip "printf FILEHANDLE LIST" 8 +.Ip "printf LIST" 8 +Equivalent to a \*(L"print FILEHANDLE sprintf(LIST)\*(R". +.Ip "push(ARRAY,LIST)" 8 7 +Treats ARRAY (@ is optional) as a stack, and pushes the values of LIST +onto the end of ARRAY. +The length of ARRAY increases by the length of LIST. +Has the same effect as +.nf + + for $value (LIST) { + $ARRAY[++$#ARRAY] = $value; + } + +.fi +but is more efficient. +.Ip "q/STRING/" 8 5 +.Ip "qq/STRING/" 8 +.Ip "qx/STRING/" 8 +These are not really functions, but simply syntactic sugar to let you +avoid putting too many backslashes into quoted strings. +The q operator is a generalized single quote, and the qq operator a +generalized double quote. +The qx operator is a generalized backquote. +Any non-alphanumeric delimiter can be used in place of /, including newline. +If the delimiter is an opening bracket or parenthesis, the final delimiter +will be the corresponding closing bracket or parenthesis. +(Embedded occurrences of the closing bracket need to be backslashed as usual.) 
+Examples: +.nf + +.ne 5 + $foo = q!I said, "You said, \'She said it.\'"!; + $bar = q(\'This is it.\'); + $today = qx{ date }; + $_ .= qq +*** The previous line contains the naughty word "$&".\en + if /(ibm|apple|awk)/; # :-) + +.fi +.Ip "rand(EXPR)" 8 8 +.Ip "rand EXPR" 8 +.Ip "rand" 8 +Returns a random fractional number between 0 and the value of EXPR. +(EXPR should be positive.) +If EXPR is omitted, returns a value between 0 and 1. +See also srand(). +.Ip "read(FILEHANDLE,SCALAR,LENGTH,OFFSET)" 8 5 +.Ip "read(FILEHANDLE,SCALAR,LENGTH)" 8 5 +Attempts to read LENGTH bytes of data into variable SCALAR from the specified +FILEHANDLE. +Returns the number of bytes actually read, or undef if there was an error. +SCALAR will be grown or shrunk to the length actually read. +An OFFSET may be specified to place the read data at some other place +than the beginning of the string. +This call is actually implemented in terms of stdio's fread call. To get +a true read system call, see sysread. +.Ip "readdir(DIRHANDLE)" 8 3 +.Ip "readdir DIRHANDLE" 8 +Returns the next directory entry for a directory opened by opendir(). +If used in an array context, returns all the rest of the entries in the +directory. +If there are no more entries, returns an undefined value in a scalar context +or a null list in an array context. +.Ip "readlink(EXPR)" 8 6 +.Ip "readlink EXPR" 8 +Returns the value of a symbolic link, if symbolic links are implemented. +If not, gives a fatal error. +If there is some system error, returns the undefined value and sets $! (errno). +If EXPR is omitted, uses $_. +.Ip "recv(SOCKET,SCALAR,LEN,FLAGS)" 8 4 +Receives a message on a socket. +Attempts to receive LEN bytes of data into variable SCALAR from the specified +SOCKET filehandle. +Returns the address of the sender, or the undefined value if there's an error. +SCALAR will be grown or shrunk to the length actually read. +Takes the same flags as the system call of the same name.
+.Ip "redo LABEL" 8 8 +.Ip "redo" 8 +The +.I redo +command restarts the loop block without evaluating the conditional again. +The +.I continue +block, if any, is not executed. +If the LABEL is omitted, the command refers to the innermost enclosing loop. +This command is normally used by programs that want to lie to themselves +about what was just input: +.nf + +.ne 16 + # a simpleminded Pascal comment stripper + # (warning: assumes no { or } in strings) + line: while (<STDIN>) { + while (s|\|({.*}.*\|){.*}|$1 \||) {} + s|{.*}| \||; + if (s|{.*| \||) { + $front = $_; + while (<STDIN>) { + if (\|/\|}/\|) { # end of comment? + s|^|$front{|; + redo line; + } + } + } + print; + } + +.fi +.Ip "rename(OLDNAME,NEWNAME)" 8 2 +Changes the name of a file. +Returns 1 for success, 0 otherwise. +Will not work across filesystem boundaries. +.Ip "require(EXPR)" 8 6 +.Ip "require EXPR" 8 +.Ip "require" 8 +Includes the library file specified by EXPR, or by $_ if EXPR is not supplied. +Has semantics similar to the following subroutine: +.nf + + sub require { + local($filename) = @_; + return 1 if $INC{$filename}; + local($realfilename,$result); + ITER: { + foreach $prefix (@INC) { + $realfilename = "$prefix/$filename"; + if (-f $realfilename) { + $result = do $realfilename; + last ITER; + } + } + die "Can't find $filename in \e@INC"; + } + die $@ if $@; + die "$filename did not return true value" unless $result; + $INC{$filename} = $realfilename; + $result; + } + +.fi +Note that the file will not be included twice under the same specified name. +.Ip "reset(EXPR)" 8 6 +.Ip "reset EXPR" 8 +.Ip "reset" 8 +Generally used in a +.I continue +block at the end of a loop to clear variables and reset ?? searches +so that they work again. +The expression is interpreted as a list of single characters (hyphens allowed +for ranges). +All variables and arrays beginning with one of those letters are reset to +their pristine state. +If the expression is omitted, one-match searches (?pattern?) 
are reset to +match again. +Only resets variables or searches in the current package. +Always returns 1. +Examples: +.nf + +.ne 3 + reset \'X\'; \h'|2i'# reset all X variables + reset \'a\-z\';\h'|2i'# reset lower case variables + reset; \h'|2i'# just reset ?? searches + +.fi +Note: resetting \*(L"A\-Z\*(R" is not recommended since you'll wipe out your ARGV and ENV +arrays. +.Sp +The use of reset on dbm associative arrays does not change the dbm file. +(It does, however, flush any entries cached by perl, which may be useful if +you are sharing the dbm file. +Then again, maybe not.) +.Ip "return LIST" 8 3 +Returns from a subroutine with the value specified. +(Note that a subroutine can automatically return +the value of the last expression evaluated. +That's the preferred method\*(--use of an explicit +.I return +is a bit slower.) +.Ip "reverse(LIST)" 8 4 +.Ip "reverse LIST" 8 +In an array context, returns an array value consisting of the elements +of LIST in the opposite order. +In a scalar context, returns a string value consisting of the bytes of +the first element of LIST in the opposite order. +.Ip "rewinddir(DIRHANDLE)" 8 5 +.Ip "rewinddir DIRHANDLE" 8 +Sets the current position to the beginning of the directory for the readdir() routine on DIRHANDLE. +.Ip "rindex(STR,SUBSTR,POSITION)" 8 6 +.Ip "rindex(STR,SUBSTR)" 8 4 +Works just like index except that it +returns the position of the LAST occurrence of SUBSTR in STR. +If POSITION is specified, returns the last occurrence at or before that +position. +.Ip "rmdir(FILENAME)" 8 4 +.Ip "rmdir FILENAME" 8 +Deletes the directory specified by FILENAME if it is empty. +If it succeeds it returns 1, otherwise it returns 0 and sets $! (errno). +If FILENAME is omitted, uses $_. +.Ip "s/PATTERN/REPLACEMENT/gieo" 8 3 +Searches a string for a pattern, and if found, replaces that pattern with the +replacement text and returns the number of substitutions made. +Otherwise it returns false (0). 
+The \*(L"g\*(R" is optional, and if present, indicates that all occurrences +of the pattern are to be replaced. +The \*(L"i\*(R" is also optional, and if present, indicates that matching +is to be done in a case-insensitive manner. +The \*(L"e\*(R" is likewise optional, and if present, indicates that +the replacement string is to be evaluated as an expression rather than just +as a double-quoted string. +Any non-alphanumeric delimiter may replace the slashes; +if single quotes are used, no +interpretation is done on the replacement string (the e modifier overrides +this, however); if backquotes are used, the replacement string is a command +to execute whose output will be used as the actual replacement text. +If no string is specified via the =~ or !~ operator, +the $_ string is searched and modified. +(The string specified with =~ must be a scalar variable, an array element, +or an assignment to one of those, i.e. an lvalue.) +If the pattern contains a $ that looks like a variable rather than an +end-of-string test, the variable will be interpolated into the pattern at +run-time. +If you only want the pattern compiled once the first time the variable is +interpolated, add an \*(L"o\*(R" at the end. +If the PATTERN evaluates to a null string, the most recent successful +regular expression is used instead. +See also the section on regular expressions. +Examples: +.nf + + s/\|\e\|bgreen\e\|b/mauve/g; # don't change wintergreen + + $path \|=~ \|s|\|/usr/bin|\|/usr/local/bin|; + + s/Login: $foo/Login: $bar/; # run-time pattern + + ($foo = $bar) =~ s/bar/foo/; + + $_ = \'abc123xyz\'; + s/\ed+/$&*2/e; # yields \*(L'abc246xyz\*(R' + s/\ed+/sprintf("%5d",$&)/e; # yields \*(L'abc  246xyz\*(R' + s/\ew/$& x 2/eg; # yields \*(L'aabbcc  224466xxyyzz\*(R' + + s/\|([^ \|]*\|) *\|([^ \|]*\|)\|/\|$2 $1/; # reverse 1st two fields + +.fi +(Note the use of $ instead of \|\e\| in the last example. See section +on regular expressions.)
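The return value of the substitution is easy to overlook. The following is a minimal sketch of capturing it; the sample string and the names $text, $count and $none are invented for the illustration.

```perl
# Count how many substitutions s///g performed.
$text  = 'green wintergreen green';
$count = ($text =~ s/\bgreen\b/mauve/g);   # only whole-word matches change
print "$count replaced: $text\n";

# With no match the string is left alone and the result is false.
$none = ($text =~ s/purple/pink/);
print "nothing replaced\n" unless $none;
```

Run as written, the first print should show "2 replaced: mauve wintergreen mauve", since the word boundaries keep wintergreen intact.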
+.Ip "scalar(EXPR)" 8 3 +Forces EXPR to be interpreted in a scalar context and returns the value +of EXPR. +.Ip "seek(FILEHANDLE,POSITION,WHENCE)" 8 3 +Randomly positions the file pointer for FILEHANDLE, just like the fseek() +call of stdio. +FILEHANDLE may be an expression whose value gives the name of the filehandle. +Returns 1 upon success, 0 otherwise. +.Ip "seekdir(DIRHANDLE,POS)" 8 3 +Sets the current position for the readdir() routine on DIRHANDLE. +POS must be a value returned by telldir(). +Has the same caveats about possible directory compaction as the corresponding +system library routine. +.Ip "select(FILEHANDLE)" 8 3 +.Ip "select" 8 3 +Returns the currently selected filehandle. +Sets the current default filehandle for output, if FILEHANDLE is supplied. +This has two effects: first, a +.I write +or a +.I print +without a filehandle will default to this FILEHANDLE. +Second, references to variables related to output will refer to this output +channel. +For example, if you have to set the top of form format for more than +one output channel, you might do the following: +.nf + +.ne 4 + select(REPORT1); + $^ = \'report1_top\'; + select(REPORT2); + $^ = \'report2_top\'; + +.fi +FILEHANDLE may be an expression whose value gives the name of the actual filehandle. 
+Thus: +.nf + + $oldfh = select(STDERR); $| = 1; select($oldfh); + +.fi +.Ip "select(RBITS,WBITS,EBITS,TIMEOUT)" 8 3 +This calls the select system call with the bitmasks specified, which can +be constructed using fileno() and vec(), along these lines: +.nf + + $rin = $win = $ein = ''; + vec($rin,fileno(STDIN),1) = 1; + vec($win,fileno(STDOUT),1) = 1; + $ein = $rin | $win; + +.fi +If you want to select on many filehandles you might wish to write a subroutine: +.nf + + sub fhbits { + local(@fhlist) = split(' ',$_[0]); + local($bits); + for (@fhlist) { + vec($bits,fileno($_),1) = 1; + } + $bits; + } + $rin = &fhbits('STDIN TTY SOCK'); + +.fi +The usual idiom is: +.nf + + ($nfound,$timeleft) = + select($rout=$rin, $wout=$win, $eout=$ein, $timeout); + +or to block until something becomes ready: + +.ie t \{\ + $nfound = select($rout=$rin, $wout=$win, $eout=$ein, undef); +'br\} +.el \{\ + $nfound = select($rout=$rin, $wout=$win, + $eout=$ein, undef); +'br\} + +.fi +Any of the bitmasks can also be undef. +The timeout, if specified, is in seconds, which may be fractional. +NOTE: not all implementations are capable of returning the $timeleft. +If not, they always return $timeleft equal to the supplied $timeout. +.Ip "semctl(ID,SEMNUM,CMD,ARG)" 8 4 +Calls the System V IPC function semctl. If CMD is &IPC_STAT or +&GETALL, then ARG must be a variable which will hold the returned +semid_ds structure or semaphore value array. Returns like ioctl: the +undefined value for error, "0 but true" for zero, or the actual return +value otherwise. +.Ip "semget(KEY,NSEMS,SIZE,FLAGS)" 8 4 +Calls the System V IPC function semget. Returns the semaphore id, or +the undefined value if there is an error. +.Ip "semop(KEY,OPSTRING)" 8 4 +Calls the System V IPC function semop to perform semaphore operations +such as signaling and waiting. OPSTRING must be a packed array of +semop structures. Each semop structure can be generated with +\&'pack("sss", $semnum, $semop, $semflag)'. 
The number of semaphore +operations is implied by the length of OPSTRING. Returns true if +successful, or false if there is an error. As an example, the +following code waits on semaphore $semnum of semaphore id $semid: +.nf + + $semop = pack("sss", $semnum, -1, 0); + die "Semaphore trouble: $!\en" unless semop($semid, $semop); + +.fi +To signal the semaphore, replace "-1" with "1". +.Ip "send(SOCKET,MSG,FLAGS,TO)" 8 4 +.Ip "send(SOCKET,MSG,FLAGS)" 8 +Sends a message on a socket. +Takes the same flags as the system call of the same name. +On unconnected sockets you must specify a destination to send TO. +Returns the number of characters sent, or the undefined value if +there is an error. +.Ip "setpgrp(PID,PGRP)" 8 4 +Sets the current process group for the specified PID, 0 for the current +process. +Will produce a fatal error if used on a machine that doesn't implement +setpgrp(2). +.Ip "setpriority(WHICH,WHO,PRIORITY)" 8 4 +Sets the current priority for a process, a process group, or a user. +(See setpriority(2).) +Will produce a fatal error if used on a machine that doesn't implement +setpriority(2). +.Ip "setsockopt(SOCKET,LEVEL,OPTNAME,OPTVAL)" 8 3 +Sets the socket option requested. +Returns undefined if there is an error. +OPTVAL may be specified as undef if you don't want to pass an argument. +.Ip "shift(ARRAY)" 8 6 +.Ip "shift ARRAY" 8 +.Ip "shift" 8 +Shifts the first value of the array off and returns it, +shortening the array by 1 and moving everything down. +If there are no elements in the array, returns the undefined value. +If ARRAY is omitted, shifts the @ARGV array in the main program, and the @_ +array in subroutines. +(This is determined lexically.) +See also unshift(), push() and pop(). +Shift() and unshift() do the same thing to the left end of an array that push() +and pop() do to the right end. +.Ip "shmctl(ID,CMD,ARG)" 8 4 +Calls the System V IPC function shmctl. 
If CMD is &IPC_STAT, then ARG +must be a variable which will hold the returned shmid_ds structure. +Returns like ioctl: the undefined value for error, "0 but true" for +zero, or the actual return value otherwise. +.Ip "shmget(KEY,SIZE,FLAGS)" 8 4 +Calls the System V IPC function shmget. Returns the shared memory +segment id, or the undefined value if there is an error. +.Ip "shmread(ID,VAR,POS,SIZE)" 8 4 +.Ip "shmwrite(ID,STRING,POS,SIZE)" 8 +Reads or writes the System V shared memory segment ID starting at +position POS for size SIZE by attaching to it, copying in/out, and +detaching from it. When reading, VAR must be a variable which +will hold the data read. When writing, if STRING is too long, +only SIZE bytes are used; if STRING is too short, nulls are +written to fill out SIZE bytes. Returns true if successful, or +false if there is an error. +.Ip "shutdown(SOCKET,HOW)" 8 3 +Shuts down a socket connection in the manner indicated by HOW, which has +the same interpretation as in the system call of the same name. +.Ip "sin(EXPR)" 8 4 +.Ip "sin EXPR" 8 +Returns the sine of EXPR (expressed in radians). +If EXPR is omitted, returns sine of $_. +.Ip "sleep(EXPR)" 8 6 +.Ip "sleep EXPR" 8 +.Ip "sleep" 8 +Causes the script to sleep for EXPR seconds, or forever if no EXPR. +May be interrupted by sending the process a SIGALRM. +Returns the number of seconds actually slept. +.Ip "socket(SOCKET,DOMAIN,TYPE,PROTOCOL)" 8 3 +Opens a socket of the specified kind and attaches it to filehandle SOCKET. +DOMAIN, TYPE and PROTOCOL are specified the same as for the system call +of the same name. +You may need to run h2ph on sys/socket.h to get the proper values handy +in a perl library file. +Returns true if successful. +See the example in the section on Interprocess Communication. +.Ip "socketpair(SOCKET1,SOCKET2,DOMAIN,TYPE,PROTOCOL)" 8 3 +Creates an unnamed pair of sockets in the specified domain, of the specified +type.
+DOMAIN, TYPE and PROTOCOL are specified the same as for the system call +of the same name. +If unimplemented, yields a fatal error. +Return true if successful. +.Ip "sort(SUBROUTINE LIST)" 8 9 +.Ip "sort(LIST)" 8 +.Ip "sort SUBROUTINE LIST" 8 +.Ip "sort LIST" 8 +Sorts the LIST and returns the sorted array value. +Nonexistent values of arrays are stripped out. +If SUBROUTINE is omitted, sorts in standard string comparison order. +If SUBROUTINE is specified, gives the name of a subroutine that returns +an integer less than, equal to, or greater than 0, +depending on how the elements of the array are to be ordered. +In the interests of efficiency the normal calling code for subroutines +is bypassed, with the following effects: the subroutine may not be a recursive +subroutine, and the two elements to be compared are passed into the subroutine +not via @_ but as $a and $b (see example below). +They are passed by reference so don't modify $a and $b. +SUBROUTINE may be a scalar variable name, in which case the value provides +the name of the subroutine to use. +Examples: +.nf + +.ne 4 + sub byage { + $age{$a} - $age{$b}; # presuming integers + } + @sortedclass = sort byage @class; + +.ne 9 + sub reverse { $a lt $b ? 1 : $a gt $b ? \-1 : 0; } + @harry = (\'dog\',\'cat\',\'x\',\'Cain\',\'Abel\'); + @george = (\'gone\',\'chased\',\'yz\',\'Punished\',\'Axed\'); + print sort @harry; + # prints AbelCaincatdogx + print sort reverse @harry; + # prints xdogcatCainAbel + print sort @george, \'to\', @harry; + # prints AbelAxedCainPunishedcatchaseddoggonetoxyz + +.fi +.Ip "splice(ARRAY,OFFSET,LENGTH,LIST)" 8 8 +.Ip "splice(ARRAY,OFFSET,LENGTH)" 8 +.Ip "splice(ARRAY,OFFSET)" 8 +Removes the elements designated by OFFSET and LENGTH from an array, and +replaces them with the elements of LIST, if any. +Returns the elements removed from the array. +The array grows or shrinks as necessary. +If LENGTH is omitted, removes everything from OFFSET onward. 
+The following equivalencies hold (assuming $[ == 0): +.nf + + push(@a,$x,$y)\h'|3.5i'splice(@a,$#a+1,0,$x,$y) + pop(@a)\h'|3.5i'splice(@a,-1) + shift(@a)\h'|3.5i'splice(@a,0,1) + unshift(@a,$x,$y)\h'|3.5i'splice(@a,0,0,$x,$y) + $a[$x] = $y\h'|3.5i'splice(@a,$x,1,$y); + +Example, assuming array lengths are passed before arrays: + + sub aeq { # compare two array values + local(@a) = splice(@_,0,shift); + local(@b) = splice(@_,0,shift); + return 0 unless @a == @b; # same len? + while (@a) { + return 0 if pop(@a) ne pop(@b); + } + return 1; + } + if (&aeq($len,@foo[1..$len],0+@bar,@bar)) { ... } + +.fi +.Ip "split(/PATTERN/,EXPR,LIMIT)" 8 8 +.Ip "split(/PATTERN/,EXPR)" 8 8 +.Ip "split(/PATTERN/)" 8 +.Ip "split" 8 +Splits a string into an array of strings, and returns it. +(If not in an array context, returns the number of fields found and splits +into the @_ array. +(In an array context, you can force the split into @_ +by using ?? as the pattern delimiters, but it still returns the array value.)) +If EXPR is omitted, splits the $_ string. +If PATTERN is also omitted, splits on whitespace (/[\ \et\en]+/). +Anything matching PATTERN is taken to be a delimiter separating the fields. +(Note that the delimiter may be longer than one character.) +If LIMIT is specified, splits into no more than that many fields (though it +may split into fewer). +If LIMIT is unspecified, trailing null fields are stripped (which +potential users of pop() would do well to remember). +A pattern matching the null string (not to be confused with a null pattern //, +which is just one member of the set of patterns matching a null string) +will split the value of EXPR into separate characters at each point it +matches that way. +For example: +.nf + + print join(\':\', split(/ */, \'hi there\')); + +.fi +produces the output \*(L'h:i:t:h:e:r:e\*(R'. 
+.Sp +The LIMIT parameter can be used to partially split a line +.nf + + ($login, $passwd, $remainder) = split(\|/\|:\|/\|, $_, 3); + +.fi +(When assigning to a list, if LIMIT is omitted, perl supplies a LIMIT one +larger than the number of variables in the list, to avoid unnecessary work. +For the list above LIMIT would have been 4 by default. +In time critical applications it behooves you not to split into +more fields than you really need.) +.Sp +If the PATTERN contains parentheses, additional array elements are created +from each matching substring in the delimiter. +.Sp + split(/([,-])/,"1-10,20"); +.Sp +produces the array value +.Sp + (1,'-',10,',',20) +.Sp +The pattern /PATTERN/ may be replaced with an expression to specify patterns +that vary at runtime. +(To do runtime compilation only once, use /$variable/o.) +As a special case, specifying a space (\'\ \') will split on white space +just as split with no arguments does, but leading white space does NOT +produce a null first field. +Thus, split(\'\ \') can be used to emulate +.IR awk 's +default behavior, whereas +split(/\ /) will give you as many null initial fields as there are +leading spaces. +.Sp +Example: +.nf + +.ne 5 + open(passwd, \'/etc/passwd\'); + while (<passwd>) { +.ie t \{\ + ($login, $passwd, $uid, $gid, $gcos, $home, $shell) = split(\|/\|:\|/\|); +'br\} +.el \{\ + ($login, $passwd, $uid, $gid, $gcos, $home, $shell) + = split(\|/\|:\|/\|); +'br\} + .\|.\|. + } + +.fi +(Note that $shell above will still have a newline on it. See chop().) +See also +.IR join . +.Ip "sprintf(FORMAT,LIST)" 8 4 +Returns a string formatted by the usual printf conventions. +The * character is not supported. +.Ip "sqrt(EXPR)" 8 4 +.Ip "sqrt EXPR" 8 +Return the square root of EXPR. +If EXPR is omitted, returns square root of $_. +.Ip "srand(EXPR)" 8 4 +.Ip "srand EXPR" 8 +Sets the random number seed for the +.I rand +operator. +If EXPR is omitted, does srand(time). 
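As a small sketch of how srand and rand fit together: seeding with a fixed value should make the sequence repeatable, which can be handy when debugging. The seed 42 and the array names @first and @second are arbitrary choices for this example.

```perl
# Seeding with a fixed value makes the rand() sequence repeatable.
srand(42);
@first = (rand(10), rand(10), rand(10));

srand(42);                          # same seed again
@second = (rand(10), rand(10), rand(10));

print "same sequence\n" if "@first" eq "@second";

# Every value lies between 0 and the argument.
for (@first) {
    print "out of range: $_\n" if $_ < 0 || $_ >= 10;
}
```

The actual numbers produced depend on the underlying random number generator, so the sketch compares the two runs with each other and does not rely on specific values.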
+.Ip "stat(FILEHANDLE)" 8 8 +.Ip "stat FILEHANDLE" 8 +.Ip "stat(EXPR)" 8 +.Ip "stat SCALARVARIABLE" 8 +Returns a 13-element array giving the statistics for a file, either the file +opened via FILEHANDLE, or named by EXPR. +Typically used as follows: +.nf + +.ne 3 + ($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size, + $atime,$mtime,$ctime,$blksize,$blocks) + = stat($filename); + +.fi +If stat is passed the special filehandle consisting of an underline, +no stat is done, but the current contents of the stat structure from +the last stat or filetest are returned. +Example: +.nf + +.ne 3 + if (-x $file && (($d) = stat(_)) && $d < 0) { + print "$file is executable NFS file\en"; + } + +.fi +.Ip "study(SCALAR)" 8 6 +.Ip "study SCALAR" 8 +.Ip "study" +Takes extra time to study SCALAR ($_ if unspecified) in anticipation of +doing many pattern matches on the string before it is next modified. +This may or may not save time, depending on the nature and number of patterns +you are searching on, and on the distribution of character frequencies in +the string to be searched\*(--you probably want to compare runtimes with and +without it to see which runs faster. +Those loops which scan for many short constant strings (including the constant +parts of more complex patterns) will benefit most. +You may have only one study active at a time\*(--if you study a different +scalar the first is \*(L"unstudied\*(R". +(The way study works is this: a linked list of every character in the string +to be searched is made, so we know, for example, where all the \*(L'k\*(R' characters +are. +From each search string, the rarest character is selected, based on some +static frequency tables constructed from some C programs and English text. +Only those places that contain this \*(L"rarest\*(R" character are examined.) 
+.Sp +For example, here is a loop which inserts index producing entries before any line +containing a certain pattern: +.nf + +.ne 8 + while (<>) { + study; + print ".IX foo\en" if /\ebfoo\eb/; + print ".IX bar\en" if /\ebbar\eb/; + print ".IX blurfl\en" if /\ebblurfl\eb/; + .\|.\|. + print; + } + +.fi +In searching for /\ebfoo\eb/, only those locations in $_ that contain \*(L'f\*(R' +will be looked at, because \*(L'f\*(R' is rarer than \*(L'o\*(R'. +In general, this is a big win except in pathological cases. +The only question is whether it saves you more time than it took to build +the linked list in the first place. +.Sp +Note that if you have to look for strings that you don't know till runtime, +you can build an entire loop as a string and eval that to avoid recompiling +all your patterns all the time. +Together with undefining $/ to input entire files as one record, this can +be very fast, often faster than specialized programs like fgrep. +The following scans a list of files (@files) +for a list of words (@words), and prints out the names of those files that +contain a match: +.nf + +.ne 12 + $search = \'while (<>) { study;\'; + foreach $word (@words) { + $search .= "++\e$seen{\e$ARGV} if /\eb$word\eb/;\en"; + } + $search .= "}"; + @ARGV = @files; + undef $/; + eval $search; # this screams + $/ = "\en"; # put back to normal input delim + foreach $file (sort keys(%seen)) { + print $file, "\en"; + } + +.fi +.Ip "substr(EXPR,OFFSET,LEN)" 8 2 +.Ip "substr(EXPR,OFFSET)" 8 2 +Extracts a substring out of EXPR and returns it. +First character is at offset 0, or whatever you've set $[ to. +If OFFSET is negative, starts that far from the end of the string. +If LEN is omitted, returns everything to the end of the string. +You can use the substr() function as an lvalue, in which case EXPR must +be an lvalue. +If you assign something shorter than LEN, the string will shrink, and +if you assign something longer than LEN, the string will grow to accommodate it. 
+To keep the string the same length you may need to pad or chop your value using +sprintf(). +.Ip "symlink(OLDFILE,NEWFILE)" 8 2 +Creates a new filename symbolically linked to the old filename. +Returns 1 for success, 0 otherwise. +On systems that don't support symbolic links, produces a fatal error at +run time. +To check for that, use eval: +.nf + + $symlink_exists = (eval \'symlink("","");\', $@ eq \'\'); + +.fi +.Ip "syscall(LIST)" 8 6 +.Ip "syscall LIST" 8 +Calls the system call specified as the first element of the list, passing +the remaining elements as arguments to the system call. +If unimplemented, produces a fatal error. +The arguments are interpreted as follows: if a given argument is numeric, +the argument is passed as an int. +If not, the pointer to the string value is passed. +You are responsible for making sure a string is pre-extended long enough +to receive any result that might be written into it. +If your integer arguments are not literals and have never been interpreted +in a numeric context, you may need to add 0 to them to force them to look +like numbers. +.nf + + require 'syscall.ph'; # may need to run h2ph + syscall(&SYS_write, fileno(STDOUT), "hi there\en", 9); + +.fi +.Ip "sysread(FILEHANDLE,SCALAR,LENGTH,OFFSET)" 8 5 +.Ip "sysread(FILEHANDLE,SCALAR,LENGTH)" 8 5 +Attempts to read LENGTH bytes of data into variable SCALAR from the specified +FILEHANDLE, using the system call read(2). +It bypasses stdio, so mixing this with other kinds of reads may cause +confusion. +Returns the number of bytes actually read, or undef if there was an error. +SCALAR will be grown or shrunk to the length actually read. +An OFFSET may be specified to place the read data at some other place +than the beginning of the string. +.Ip "system(LIST)" 8 6 +.Ip "system LIST" 8 +Does exactly the same thing as \*(L"exec LIST\*(R" except that a fork +is done first, and the parent process waits for the child process to complete.
+Note that argument processing varies depending on the number of arguments. +The return value is the exit status of the program as returned by the wait() +call. +To get the actual exit value divide by 256. +See also +.IR exec . +.Ip "syswrite(FILEHANDLE,SCALAR,LENGTH,OFFSET)" 8 5 +.Ip "syswrite(FILEHANDLE,SCALAR,LENGTH)" 8 5 +Attempts to write LENGTH bytes of data from variable SCALAR to the specified +FILEHANDLE, using the system call write(2). +It bypasses stdio, so mixing this with prints may cause +confusion. +Returns the number of bytes actually written, or undef if there was an error. +An OFFSET may be specified to take the data to be written from some place +other than the beginning of the string. +.Ip "tell(FILEHANDLE)" 8 6 +.Ip "tell FILEHANDLE" 8 6 +.Ip "tell" 8 +Returns the current file position for FILEHANDLE. +FILEHANDLE may be an expression whose value gives the name of the actual +filehandle. +If FILEHANDLE is omitted, assumes the file last read. +.Ip "telldir(DIRHANDLE)" 8 5 +.Ip "telldir DIRHANDLE" 8 +Returns the current position of the readdir() routines on DIRHANDLE. +Value may be given to seekdir() to access a particular location in +a directory. +Has the same caveats about possible directory compaction as the corresponding +system library routine. +.Ip "time" 8 4 +Returns the number of non-leap seconds since 00:00:00 UTC, January 1, 1970. +Suitable for feeding to gmtime() and localtime(). +.Ip "times" 8 4 +Returns a four-element array giving the user and system times, in seconds, for this +process and the children of this process. +.Sp + ($user,$system,$cuser,$csystem) = times; +.Sp +.Ip "tr/SEARCHLIST/REPLACEMENTLIST/cds" 8 5 +.Ip "y/SEARCHLIST/REPLACEMENTLIST/cds" 8 +Translates all occurrences of the characters found in the search list with +the corresponding character in the replacement list. +It returns the number of characters replaced or deleted. +If no string is specified via the =~ or !~ operator, +the $_ string is translated.
+(The string specified with =~ must be a scalar variable, an array element, +or an assignment to one of those, i.e. an lvalue.) +For +.I sed +devotees, +.I y +is provided as a synonym for +.IR tr . +.Sp +If the c modifier is specified, the SEARCHLIST character set is complemented. +If the d modifier is specified, any characters specified by SEARCHLIST that +are not found in REPLACEMENTLIST are deleted. +(Note that this is slightly more flexible than the behavior of some +.I tr +programs, which delete anything they find in the SEARCHLIST, period.) +If the s modifier is specified, sequences of characters that were translated +to the same character are squashed down to 1 instance of the character. +.Sp +If the d modifier was used, the REPLACEMENTLIST is always interpreted exactly +as specified. +Otherwise, if the REPLACEMENTLIST is shorter than the SEARCHLIST, +the final character is replicated till it is long enough. +If the REPLACEMENTLIST is null, the SEARCHLIST is replicated. +This latter is useful for counting characters in a class, or for squashing +character sequences in a class. +.Sp +Examples: +.nf + + $ARGV[1] \|=~ \|y/A\-Z/a\-z/; \h'|3i'# canonicalize to lower case + + $cnt = tr/*/*/; \h'|3i'# count the stars in $_ + + $cnt = tr/0\-9//; \h'|3i'# count the digits in $_ + + tr/a\-zA\-Z//s; \h'|3i'# bookkeeper \-> bokeper + + ($HOST = $host) =~ tr/a\-z/A\-Z/; + + y/a\-zA\-Z/ /cs; \h'|3i'# change non-alphas to single space + + tr/\e200\-\e377/\e0\-\e177/;\h'|3i'# delete 8th bit + +.fi +.Ip "truncate(FILEHANDLE,LENGTH)" 8 4 +.Ip "truncate(EXPR,LENGTH)" 8 +Truncates the file opened on FILEHANDLE, or named by EXPR, to the specified +length. +Produces a fatal error if truncate isn't implemented on your system. +.Ip "umask(EXPR)" 8 4 +.Ip "umask EXPR" 8 +.Ip "umask" 8 +Sets the umask for the process and returns the old one. +If EXPR is omitted, merely returns current umask. 
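As a hedged illustration of the tr modifiers described earlier (c, d and s, plus the empty REPLACEMENTLIST case), here is a minimal sketch. The variable names $digits, $squeezed, $nodigits and $alpha are hypothetical and exist only for this example.

```perl
# Sketch of the tr modifiers; all variable names are illustrative only.
$_ = "bookkeeper 2024";

# Empty REPLACEMENTLIST: SEARCHLIST is replicated, so nothing changes
# and the return value is simply a count of matching characters.
$digits = tr/0-9//;

# s modifier: runs of the same translated character are squashed to one.
($squeezed = $_) =~ tr/a-z//s;

# d modifier: SEARCHLIST characters with no counterpart are deleted.
($nodigits = $_) =~ tr/0-9//d;

# c and s together: every non-letter becomes a space, runs squashed.
($alpha = $_) =~ tr/a-zA-Z/ /cs;

print "$digits\n";        # 4
print "$squeezed\n";      # bokeper 2024
print "[$nodigits]\n";    # [bookkeeper ]
print "[$alpha]\n";       # [bookkeeper ]
```

Copying into a new variable with the ($new = $old) =~ tr/.../.../ form keeps the original string intact, which is why $_ can be reused for each case.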
+.Ip "undef(EXPR)" 8 6 +.Ip "undef EXPR" 8 +.Ip "undef" 8 +Undefines the value of EXPR, which must be an lvalue. +Use only on a scalar value, an entire array, or a subroutine name (using &). +(Undef will probably not do what you expect on most predefined variables or +dbm array values.) +Always returns the undefined value. +You can omit the EXPR, in which case nothing is undefined, but you still +get an undefined value that you could, for instance, return from a subroutine. +Examples: +.nf + +.ne 6 + undef $foo; + undef $bar{'blurfl'}; + undef @ary; + undef %assoc; + undef &mysub; + return (wantarray ? () : undef) if $they_blew_it; + +.fi +.Ip "unlink(LIST)" 8 4 +.Ip "unlink LIST" 8 +Deletes a list of files. +Returns the number of files successfully deleted. +.nf + +.ne 2 + $cnt = unlink \'a\', \'b\', \'c\'; + unlink @goners; + unlink <*.bak>; + +.fi +Note: unlink will not delete directories unless you are superuser and the +.B \-U +flag is supplied to +.IR perl . +Even if these conditions are met, be warned that unlinking a directory +can inflict damage on your filesystem. +Use rmdir instead. +.Ip "unpack(TEMPLATE,EXPR)" 8 4 +Unpack does the reverse of pack: it takes a string representing +a structure and expands it out into an array value, returning the array +value. +(In a scalar context, it merely returns the first value produced.) +The TEMPLATE has the same format as in the pack function. +Here's a subroutine that does substring: +.nf + +.ne 4 + sub substr { + local($what,$where,$howmuch) = @_; + unpack("x$where a$howmuch", $what); + } + +.ne 3 +and then there's + + sub ord { unpack("c",$_[0]); } + +.fi +In addition, you may prefix a field with a %<number> to indicate that +you want a <number>-bit checksum of the items instead of the items themselves. +Default is a 16-bit checksum. 
+For example, the following computes the same number as the System V sum program: +.nf + +.ne 4 + while (<>) { + $checksum += unpack("%16C*", $_); + } + $checksum %= 65536; + +.fi +.Ip "unshift(ARRAY,LIST)" 8 4 +Does the opposite of a +.IR shift . +Or the opposite of a +.IR push , +depending on how you look at it. +Prepends list to the front of the array, and returns the number of elements +in the new array. +.nf + + unshift(ARGV, \'\-e\') unless $ARGV[0] =~ /^\-/; + +.fi +.Ip "utime(LIST)" 8 2 +.Ip "utime LIST" 8 2 +Changes the access and modification times on each file of a list of files. +The first two elements of the list must be the NUMERICAL access and +modification times, in that order. +Returns the number of files successfully changed. +The inode modification time of each file is set to the current time. +Example of a \*(L"touch\*(R" command: +.nf + +.ne 3 + #!/usr/bin/perl + $now = time; + utime $now, $now, @ARGV; + +.fi +.Ip "values(ASSOC_ARRAY)" 8 6 +.Ip "values ASSOC_ARRAY" 8 +Returns a normal array consisting of all the values of the named associative +array. +The values are returned in an apparently random order, but it is the same order +as either the keys() or each() function would produce on the same array. +See also keys() and each(). +.Ip "vec(EXPR,OFFSET,BITS)" 8 2 +Treats a string as a vector of unsigned integers, and returns the value +of the bitfield specified. +May also be assigned to. +BITS must be a power of two from 1 to 32. +.Sp +Vectors created with vec() can also be manipulated with the logical operators +|, & and ^, +which will assume a bit vector operation is desired when both operands are +strings. +This interpretation is not enabled unless there is at least one vec() in +your program, to protect older programs. 
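The following is a minimal sketch of vec() used as an lvalue together with the string forms of | and &. The names $x, $y, $both and $common are hypothetical; the bit numbering assumes that offset 0 is the low-order bit of the first byte, which is how vec() lays out 1-bit fields.

```perl
# Sketch only: two one-byte bit vectors combined with string | and &.
$x = '';
vec($x, 1, 1) = 1;        # set bit 1 of $x
$y = '';
vec($y, 2, 1) = 1;        # set bit 2 of $y

# Both operands are strings, so these are bit vector operations.
$both   = $x | $y;        # bits 1 and 2 set
$common = $x & $y;        # no bits set

print vec($both, 0, 1), vec($both, 1, 1), vec($both, 2, 1), "\n";   # prints 011
print vec($common, 1, 1), vec($common, 2, 1), "\n";                 # prints 00
```

If either operand had been used as a number beforehand, the operators could fall back to their numeric meaning, so the sketch keeps both vectors as pure strings.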
+.Sp +To transform a bit vector into a string or array of 0's and 1's, use these: +.nf + + $bits = unpack("b*", $vector); + @bits = split(//, unpack("b*", $vector)); + +.fi +If you know the exact length in bits, it can be used in place of the *. +.Ip "wait" 8 6 +Waits for a child process to terminate and returns the pid of the deceased +process, or -1 if there are no child processes. +The status is returned in $?. +.Ip "waitpid(PID,FLAGS)" 8 6 +Waits for a particular child process to terminate and returns the pid of the deceased +process, or -1 if there is no such child process. +The status is returned in $?. +If you say +.nf + + require "sys/wait.ph"; + .\|.\|. + waitpid(-1,&WNOHANG); + +.fi +then you can do a non-blocking wait for any process. Non-blocking wait +is only available on machines supporting either the +.I waitpid (2) +or +.I wait4 (2) +system calls. +However, waiting for a particular pid with FLAGS of 0 is implemented +everywhere. (Perl emulates the system call by remembering the status +values of processes that have exited but have not been harvested by the +Perl script yet.) +.Ip "wantarray" 8 4 +Returns true if the context of the currently executing subroutine +is looking for an array value. +Returns false if the context is looking for a scalar. +.nf + + return wantarray ? () : undef; + +.fi +.Ip "warn(LIST)" 8 4 +.Ip "warn LIST" 8 +Produces a message on STDERR just like \*(L"die\*(R", but doesn't exit. +.Ip "write(FILEHANDLE)" 8 6 +.Ip "write(EXPR)" 8 +.Ip "write" 8 +Writes a formatted record (possibly multi-line) to the specified file, +using the format associated with that file. +By default the format for a file is the one having the same name as the +filehandle, but the format for the current output channel (see +.IR select ) +may be set explicitly +by assigning the name of the format to the $~ variable.
+.Sp +Top of form processing is handled automatically: +if there is insufficient room on the current page for the formatted +record, the page is advanced by writing a form feed, +a special top-of-page format is used +to format the new page header, and then the record is written. +By default the top-of-page format is \*(L"top\*(R", but it +may be set to the +format of your choice by assigning the name to the $^ variable. +The number of lines remaining on the current page is in variable $-, which +can be set to 0 to force a new page. +.Sp +If FILEHANDLE is unspecified, output goes to the current default output channel, +which starts out as +.I STDOUT +but may be changed by the +.I select +operator. +If the FILEHANDLE is an EXPR, then the expression is evaluated and the +resulting string is used to look up the name of the FILEHANDLE at run time. +For more on formats, see the section on formats later on. +.Sp +Note that write is NOT the opposite of read. +''' Beginning of part 4 +''' $Header: perl.man,v 4.0 91/03/20 01:38:08 lwall Locked $ +''' +''' $Log: perl.man,v $ +''' Revision 4.0 91/03/20 01:38:08 lwall +''' 4.0 baseline. 
+''' +''' Revision 3.0.1.14 91/01/11 18:18:53 lwall +''' patch42: started an addendum and errata section in the man page +''' +''' Revision 3.0.1.13 90/11/10 01:51:00 lwall +''' patch38: random cleanup +''' +''' Revision 3.0.1.12 90/10/20 02:15:43 lwall +''' patch37: patch37: fixed various typos in man page +''' +''' Revision 3.0.1.11 90/10/16 10:04:28 lwall +''' patch29: added @###.## fields to format +''' +''' Revision 3.0.1.10 90/08/09 04:47:35 lwall +''' patch19: added require operator +''' patch19: added numeric interpretation of $] +''' +''' Revision 3.0.1.9 90/08/03 11:15:58 lwall +''' patch19: Intermediate diffs for Randal +''' +''' Revision 3.0.1.8 90/03/27 16:19:31 lwall +''' patch16: MSDOS support +''' +''' Revision 3.0.1.7 90/03/14 12:29:50 lwall +''' patch15: man page falsely states that you can't subscript array values +''' +''' Revision 3.0.1.6 90/03/12 16:54:04 lwall +''' patch13: improved documentation of *name +''' +''' Revision 3.0.1.5 90/02/28 18:01:52 lwall +''' patch9: $0 is now always the command name +''' +''' Revision 3.0.1.4 89/12/21 20:12:39 lwall +''' patch7: documented that package'filehandle works as well as $package'variable +''' patch7: documented which identifiers are always in package main +''' +''' Revision 3.0.1.3 89/11/17 15:32:25 lwall +''' patch5: fixed some manual typos and indent problems +''' patch5: clarified difference between $! and $@ +''' +''' Revision 3.0.1.2 89/11/11 04:46:40 lwall +''' patch2: made some line breaks depend on troff vs. 
nroff +''' patch2: clarified operation of ^ and $ when $* is false +''' +''' Revision 3.0.1.1 89/10/26 23:18:43 lwall +''' patch1: documented the desirability of unnecessary parentheses +''' +''' Revision 3.0 89/10/18 15:21:55 lwall +''' 3.0 baseline +''' +.Sh "Precedence" +.I Perl +operators have the following associativity and precedence: +.nf + +nonassoc\h'|1i'print printf exec system sort reverse +\h'1.5i'chmod chown kill unlink utime die return +left\h'|1i', +right\h'|1i'= += \-= *= etc. +right\h'|1i'?: +nonassoc\h'|1i'.\|. +left\h'|1i'|| +left\h'|1i'&& +left\h'|1i'| ^ +left\h'|1i'& +nonassoc\h'|1i'== != <=> eq ne cmp +nonassoc\h'|1i'< > <= >= lt gt le ge +nonassoc\h'|1i'chdir exit eval reset sleep rand umask +nonassoc\h'|1i'\-r \-w \-x etc. +left\h'|1i'<< >> +left\h'|1i'+ \- . +left\h'|1i'* / % x +left\h'|1i'=~ !~ +right\h'|1i'! ~ and unary minus +right\h'|1i'** +nonassoc\h'|1i'++ \-\|\- +left\h'|1i'\*(L'(\*(R' + +.fi +As mentioned earlier, if any list operator (print, etc.) or +any unary operator (chdir, etc.) +is followed by a left parenthesis as the next token on the same line, +the operator and arguments within parentheses are taken to +be of highest precedence, just like a normal function call. 
+Examples: +.nf + + chdir $foo || die;\h'|3i'# (chdir $foo) || die + chdir($foo) || die;\h'|3i'# (chdir $foo) || die + chdir ($foo) || die;\h'|3i'# (chdir $foo) || die + chdir +($foo) || die;\h'|3i'# (chdir $foo) || die + +but, because * is higher precedence than ||: + + chdir $foo * 20;\h'|3i'# chdir ($foo * 20) + chdir($foo) * 20;\h'|3i'# (chdir $foo) * 20 + chdir ($foo) * 20;\h'|3i'# (chdir $foo) * 20 + chdir +($foo) * 20;\h'|3i'# chdir ($foo * 20) + + rand 10 * 20;\h'|3i'# rand (10 * 20) + rand(10) * 20;\h'|3i'# (rand 10) * 20 + rand (10) * 20;\h'|3i'# (rand 10) * 20 + rand +(10) * 20;\h'|3i'# rand (10 * 20) + +.fi +In the absence of parentheses, +the precedence of list operators such as print, sort or chmod is +either very high or very low depending on whether you look at the left +side of the operator or the right side of it. +For example, in +.nf + + @ary = (1, 3, sort 4, 2); + print @ary; # prints 1324 + +.fi +the commas on the right of the sort are evaluated before the sort, but +the commas on the left are evaluated after. +In other words, list operators tend to gobble up all the arguments that +follow them, and then act like a simple term with regard to the preceding +expression. +Note that you have to be careful with parens: +.nf + +.ne 3 + # These evaluate exit before doing the print: + print($foo, exit); # Obviously not what you want. + print $foo, exit; # Nor is this. + +.ne 4 + # These do the print before evaluating exit: + (print $foo), exit; # This is what you want. + print($foo), exit; # Or this. + print ($foo), exit; # Or even this. + +Also note that + + print ($foo & 255) + 1, "\en"; + +.fi +probably doesn't do what you expect at first glance: +the parenthesized ($foo & 255) is taken as the complete argument list of print, +so the + 1 and the newline are never printed. +.Sh "Subroutines" +A subroutine may be declared as follows: +.nf + + sub NAME BLOCK + +.fi +.PP +Any arguments passed to the routine come in as array @_, +that is ($_[0], $_[1], .\|.\|.). +The array @_ is a local array, but its values are references to the +actual scalar parameters.
+The return value of the subroutine is the value of the last expression +evaluated, and can be either an array value or a scalar value. +Alternatively, a return statement may be used to specify the returned value and +exit the subroutine. +To create local variables see the +.I local +operator. +.PP +A subroutine is called using the +.I do +operator or the & operator. +.nf + +.ne 12 +Example: + + sub MAX { + local($max) = pop(@_); + foreach $foo (@_) { + $max = $foo \|if \|$max < $foo; + } + $max; + } + + .\|.\|. + $bestday = &MAX($mon,$tue,$wed,$thu,$fri); + +.ne 21 +Example: + + # get a line, combining continuation lines + # that start with whitespace + sub get_line { + $thisline = $lookahead; + line: while ($lookahead = <STDIN>) { + if ($lookahead \|=~ \|/\|^[ \^\e\|t]\|/\|) { + $thisline \|.= \|$lookahead; + } + else { + last line; + } + } + $thisline; + } + + $lookahead = <STDIN>; # get first line + while ($_ = do get_line(\|)) { + .\|.\|. + } + +.fi +.nf +.ne 6 +Use array assignment to a local list to name your formal arguments: + + sub maybeset { + local($key, $value) = @_; + $foo{$key} = $value unless $foo{$key}; + } + +.fi +This also has the effect of turning call-by-reference into call-by-value, +since the assignment copies the values. +.Sp +Subroutines may be called recursively. +If a subroutine is called using the & form, the argument list is optional. +If omitted, no @_ array is set up for the subroutine; the @_ array at the +time of the call is visible to the subroutine instead. +.nf + + do foo(1,2,3); # pass three arguments + &foo(1,2,3); # the same + + do foo(); # pass a null list + &foo(); # the same + &foo; # pass no arguments\*(--more efficient + +.fi +.Sh "Passing By Reference" +Sometimes you don't want to pass the value of an array to a subroutine but +rather the name of it, so that the subroutine can modify the global copy +of it rather than working with a local copy.
+In perl you can refer to all the objects of a particular name by prefixing +the name with a star: *foo. +When evaluated, it produces a scalar value that represents all the objects +of that name, including any filehandle, format or subroutine. +When assigned to within a local() operation, it causes the name mentioned +to refer to whatever * value was assigned to it. +Example: +.nf + + sub doubleary { + local(*someary) = @_; + foreach $elem (@someary) { + $elem *= 2; + } + } + do doubleary(*foo); + do doubleary(*bar); + +.fi +Assignment to *name is currently recommended only inside a local(). +You can actually assign to *name anywhere, but the previous referent of +*name may be stranded forever. +This may or may not bother you. +.Sp +Note that scalars are already passed by reference, so you can modify scalar +arguments without using this mechanism by referring explicitly to the $_[nnn] +in question. +You can modify all the elements of an array by passing all the elements +as scalars, but you have to use the * mechanism to push, pop or change the +size of an array. +The * mechanism will probably be more efficient in any case. +.Sp +Since a *name value contains unprintable binary data, if it is used as +an argument in a print, or as a %s argument in a printf or sprintf, it +then has the value '*name', just so it prints out pretty. +.Sp +Even if you don't want to modify an array, this mechanism is useful for +passing multiple arrays in a single LIST, since normally the LIST mechanism +will merge all the array values so that you can't extract out the +individual arrays. +.Sh "Regular Expressions" +The patterns used in pattern matching are regular expressions such as +those supplied in the Version 8 regexp routines. +(In fact, the routines are derived from Henry Spencer's freely redistributable +reimplementation of the V8 routines.) +In addition, \ew matches an alphanumeric character (including \*(L"_\*(R") and \eW a nonalphanumeric. 
+Word boundaries may be matched by \eb, and non-boundaries by \eB. +A whitespace character is matched by \es, non-whitespace by \eS. +A numeric character is matched by \ed, non-numeric by \eD. +You may use \ew, \es and \ed within character classes. +Also, \en, \er, \ef, \et and \eNNN have their normal interpretations. +Within character classes \eb represents backspace rather than a word boundary. +Alternatives may be separated by |. +The bracketing construct \|(\ .\|.\|.\ \|) may also be used, in which case \e<digit> +matches the digit'th substring. +(Outside of the pattern, always use $ instead of \e in front of the digit. +The scope of $<digit> (and $\`, $& and $\') +extends to the end of the enclosing BLOCK or eval string, or to +the next pattern match with subexpressions. +The \e<digit> notation sometimes works outside the current pattern, but should +not be relied upon.) +You may have as many parentheses as you wish. If you have more than 9 +substrings, the variables $10, $11, ... refer to the corresponding +substring. Within the pattern, \e10, \e11, +etc. refer back to substrings if there have been at least that many left parens +before the backreference. Otherwise (for backward compatibility) \e10 +is the same as \e010, a backspace, +and \e11 the same as \e011, a tab. +And so on. +(\e1 through \e9 are always backreferences.) +.PP +$+ returns whatever the last bracket match matched. +$& returns the entire matched string. +($0 used to return the same thing, but not any more.) +$\` returns everything before the matched string. +$\' returns everything after the matched string.
+Examples: +.nf + + s/\|^\|([^ \|]*\|) \|*([^ \|]*\|)\|/\|$2 $1\|/; # swap first two words + +.ne 5 + if (/\|Time: \|(.\|.\|):\|(.\|.\|):\|(.\|.\|)\|/\|) { + $hours = $1; + $minutes = $2; + $seconds = $3; + } + +.fi +By default, the ^ character is only guaranteed to match at the beginning +of the string, +the $ character only at the end (or before the newline at the end) +and +.I perl +does certain optimizations with the assumption that the string contains +only one line. +The behavior of ^ and $ on embedded newlines will be inconsistent. +You may, however, wish to treat a string as a multi-line buffer, such that +the ^ will match after any newline within the string, and $ will match +before any newline. +At the cost of a little more overhead, you can do this by setting the variable +$* to 1. +Setting it back to 0 makes +.I perl +revert to its old behavior. +.PP +To facilitate multi-line substitutions, the . character never matches a newline +(even when $* is 0). +In particular, the following leaves a newline on the $_ string: +.nf + + $_ = <STDIN>; + s/.*(some_string).*/$1/; + +If the newline is unwanted, try one of + + s/.*(some_string).*\en/$1/; + s/.*(some_string)[^\e000]*/$1/; + s/.*(some_string)(.|\en)*/$1/; + chop; s/.*(some_string).*/$1/; + /(some_string)/ && ($_ = $1); + +.fi +Any item of a regular expression may be followed with digits in curly brackets +of the form {n,m}, where n gives the minimum number of times to match the item +and m gives the maximum. +The form {n} is equivalent to {n,n} and matches exactly n times. +The form {n,} matches n or more times. +(If a curly bracket occurs in any other context, it is treated as a regular +character.) +The * modifier is equivalent to {0,}, the + modifier to {1,} and the ? modifier +to {0,1}. +There is no limit to the size of n or m, but large numbers will chew up +more memory. +.Sp +You will note that all backslashed metacharacters in +.I perl +are alphanumeric, +such as \eb, \ew, \en. 
+Unlike some other regular expression languages, there are no backslashed +symbols that aren't alphanumeric. +So anything that looks like \e\e, \e(, \e), \e<, \e>, \e{, or \e} is always +interpreted as a literal character, not a metacharacter. +This makes it simple to quote a string that you want to use for a pattern +but that you are afraid might contain metacharacters. +Simply quote all the non-alphanumeric characters: +.nf + + $pattern =~ s/(\eW)/\e\e$1/g; + +.fi +.Sh "Formats" +Output record formats for use with the +.I write +operator may be declared as follows: +.nf + +.ne 3 + format NAME = + FORMLIST + . + +.fi +If NAME is omitted, format \*(L"STDOUT\*(R" is defined. +FORMLIST consists of a sequence of lines, each of which may be of one of three +types: +.Ip 1. 4 +A comment. +.Ip 2. 4 +A \*(L"picture\*(R" line giving the format for one output line. +.Ip 3. 4 +An argument line supplying values to plug into a picture line. +.PP +Picture lines are printed exactly as they look, except for certain fields +that substitute values into the line. +Each picture field starts with either @ or ^. +The @ field (not to be confused with the array marker @) is the normal +case; ^ fields are used +to do rudimentary multi-line text block filling. +The length of the field is supplied by padding out the field +with multiple <, >, or | characters to specify, respectively, left justification, +right justification, or centering. +As an alternate form of right justification, +you may also use # characters (with an optional .) to specify a numeric field. +(Use of ^ instead of @ causes the field to be blanked if undefined.) +If any of the values supplied for these fields contains a newline, only +the text up to the newline is printed. +The special field @* can be used for printing multi-line values. +It should appear by itself on a line. +.PP +The values are specified on the following line, in the same order as +the picture fields. +The values should be separated by commas.
+.PP +Picture fields that begin with ^ rather than @ are treated specially. +The value supplied must be a scalar variable name which contains a text +string. +.I Perl +puts as much text as it can into the field, and then chops off the front +of the string so that the next time the variable is referenced, +more of the text can be printed. +Normally you would use a sequence of fields in a vertical stack to print +out a block of text. +If you like, you can end the final field with .\|.\|., which will appear in the +output if the text was too long to appear in its entirety. +You can change which characters are legal to break on by changing the +variable $: to a list of the desired characters. +.PP +Since use of ^ fields can produce variable length records if the text to be +formatted is short, you can suppress blank lines by putting the tilde (~) +character anywhere in the line. +(Normally you should put it in the front if possible, for visibility.) +The tilde will be translated to a space upon output. +If you put a second tilde contiguous to the first, the line will be repeated +until all the fields on the line are exhausted. +(If you use a field of the @ variety, the expression you supply had better +not give the same value every time forever!) +.PP +Examples: +.nf +.lg 0 +.cs R 25 +.ft C + +.ne 10 +# a report on the /etc/passwd file +format top = +\& Passwd File +Name Login Office Uid Gid Home +------------------------------------------------------------------ +\&. +format STDOUT = +@<<<<<<<<<<<<<<<<<< @||||||| @<<<<<<@>>>> @>>>> @<<<<<<<<<<<<<<<<< +$name, $login, $office,$uid,$gid, $home +\&. + +.ne 29 +# a report from a bug report form +format top = +\& Bug Reports +@<<<<<<<<<<<<<<<<<<<<<<< @||| @>>>>>>>>>>>>>>>>>>>>>>> +$system, $%, $date +------------------------------------------------------------------ +\&. 
+format STDOUT = +Subject: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< +\& $subject +Index: @<<<<<<<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<< +\& $index, $description +Priority: @<<<<<<<<<< Date: @<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<< +\& $priority, $date, $description +From: @<<<<<<<<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<< +\& $from, $description +Assigned to: @<<<<<<<<<<<<<<<<<<<<<< ^<<<<<<<<<<<<<<<<<<<<<<<<<<<< +\& $programmer, $description +\&~ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<< +\& $description +\&~ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<< +\& $description +\&~ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<< +\& $description +\&~ ^<<<<<<<<<<<<<<<<<<<<<<<<<<<< +\& $description +\&~ ^<<<<<<<<<<<<<<<<<<<<<<<... +\& $description +\&. + +.ft R +.cs R +.lg +.fi +It is possible to intermix prints with writes on the same output channel, +but you'll have to handle $\- (lines left on the page) yourself. +.PP +If you are printing lots of fields that are usually blank, you should consider +using the reset operator between records. +Not only is it more efficient, but it can prevent the bug of adding another +field and forgetting to zero it. +.Sh "Interprocess Communication" +The IPC facilities of perl are built on the Berkeley socket mechanism. +If you don't have sockets, you can ignore this section. +The calls have the same names as the corresponding system calls, +but the arguments tend to differ, for two reasons. +First, perl file handles work differently than C file descriptors. +Second, perl already knows the length of its strings, so you don't need +to pass that information. 
+Here is a sample client (untested): +.nf + + ($them,$port) = @ARGV; + $port = 2345 unless $port; + $them = 'localhost' unless $them; + + $SIG{'INT'} = 'dokill'; + sub dokill { kill 9,$child if $child; } + + require 'sys/socket.ph'; + + $sockaddr = 'S n a4 x8'; + chop($hostname = `hostname`); + + ($name, $aliases, $proto) = getprotobyname('tcp'); + ($name, $aliases, $port) = getservbyname($port, 'tcp') + unless $port =~ /^\ed+$/; +.ie t \{\ + ($name, $aliases, $type, $len, $thisaddr) = gethostbyname($hostname); +'br\} +.el \{\ + ($name, $aliases, $type, $len, $thisaddr) = + gethostbyname($hostname); +'br\} + ($name, $aliases, $type, $len, $thataddr) = gethostbyname($them); + + $this = pack($sockaddr, &AF_INET, 0, $thisaddr); + $that = pack($sockaddr, &AF_INET, $port, $thataddr); + + socket(S, &PF_INET, &SOCK_STREAM, $proto) || die "socket: $!"; + bind(S, $this) || die "bind: $!"; + connect(S, $that) || die "connect: $!"; + + select(S); $| = 1; select(stdout); + + if ($child = fork) { + while (<>) { + print S; + } + sleep 3; + do dokill(); + } + else { + while (<S>) { + print; + } + } + +.fi +And here's a server: +.nf + + ($port) = @ARGV; + $port = 2345 unless $port; + + require 'sys/socket.ph'; + + $sockaddr = 'S n a4 x8'; + + ($name, $aliases, $proto) = getprotobyname('tcp'); + ($name, $aliases, $port) = getservbyname($port, 'tcp') + unless $port =~ /^\ed+$/; + + $this = pack($sockaddr, &AF_INET, $port, "\e0\e0\e0\e0"); + + select(NS); $| = 1; select(stdout); + + socket(S, &PF_INET, &SOCK_STREAM, $proto) || die "socket: $!"; + bind(S, $this) || die "bind: $!"; + listen(S, 5) || die "listen: $!"; + + select(S); $| = 1; select(stdout); + + for (;;) { + print "Listening again\en"; + ($addr = accept(NS,S)) || die $!; + print "accept ok\en"; + + ($af,$port,$inetaddr) = unpack($sockaddr,$addr); + @inetaddr = unpack('C4',$inetaddr); + print "$af $port @inetaddr\en"; + + while (<NS>) { + print; + print NS; + } + } + +.fi +.Sh "Predefined Names" +The following names have
special meaning to +.IR perl . +I could have used alphabetic symbols for some of these, but I didn't want +to take the chance that someone would say reset \*(L"a\-zA\-Z\*(R" and wipe them all +out. +You'll just have to suffer along with these silly symbols. +Most of them have reasonable mnemonics, or analogues in one of the shells. +.Ip $_ 8 +The default input and pattern-searching space. +The following pairs are equivalent: +.nf + +.ne 2 + while (<>) {\|.\|.\|. # only equivalent in while! + while ($_ = <>) {\|.\|.\|. + +.ne 2 + /\|^Subject:/ + $_ \|=~ \|/\|^Subject:/ + +.ne 2 + y/a\-z/A\-Z/ + $_ =~ y/a\-z/A\-Z/ + +.ne 2 + chop + chop($_) + +.fi +(Mnemonic: underline is understood in certain operations.) +.Ip $. 8 +The current input line number of the last filehandle that was read. +Readonly. +Remember that only an explicit close on the filehandle resets the line number. +Since <> never does an explicit close, line numbers increase across ARGV files +(but see examples under eof). +(Mnemonic: many programs use . to mean the current line number.) +.Ip $/ 8 +The input record separator, newline by default. +Works like +.IR awk 's +RS variable, including treating blank lines as delimiters +if set to the null string. +You may set it to a multicharacter string to match a multi-character +delimiter. +(Mnemonic: / is used to delimit line boundaries when quoting poetry.) +.Ip $, 8 +The output field separator for the print operator. +Ordinarily the print operator simply prints out the comma separated fields +you specify. +In order to get behavior more like +.IR awk , +set this variable as you would set +.IR awk 's +OFS variable to specify what is printed between fields. +(Mnemonic: what is printed when there is a , in your print statement.) +.Ip $"" 8 +This is like $, except that it applies to array values interpolated into +a double-quoted string (or similar interpreted string). +Default is a space. +(Mnemonic: obvious, I think.) 
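The difference between the two separators just described, $, for print and $" for interpolated arrays, is easy to mix up. The following is a minimal sketch; the array @fields and the scalar $joined are hypothetical names used only for illustration.

```perl
# Sketch: output field separator ($,) versus list separator ($").
@fields = ('root', 'wheel', 'daemon');

$, = '-';    # printed between the comma separated arguments of print
$" = ':';    # placed between array elements interpolated into a string

$joined = "@fields";    # interpolation uses $", not $,

print @fields;          # print uses $,: root-wheel-daemon
print "\n";
print "$joined\n";      # root:wheel:daemon
```

Each print in the sketch is given a single string or a single array so that $, only shows up where it is meant to; a call such as print $joined, "\n" would also place a - before the newline.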
+.Ip $\e 8 +The output record separator for the print operator. +Ordinarily the print operator simply prints out the comma separated fields +you specify, with no trailing newline or record separator assumed. +In order to get behavior more like +.IR awk , +set this variable as you would set +.IR awk 's +ORS variable to specify what is printed at the end of the print. +(Mnemonic: you set $\e instead of adding \en at the end of the print. +Also, it's just like /, but it's what you get \*(L"back\*(R" from +.IR perl .) +.Ip $# 8 +The output format for printed numbers. +This variable is a half-hearted attempt to emulate +.IR awk 's +OFMT variable. +There are times, however, when +.I awk +and +.I perl +have differing notions of what +is in fact numeric. +Also, the initial value is %.20g rather than %.6g, so you need to set $# +explicitly to get +.IR awk 's +value. +(Mnemonic: # is the number sign.) +.Ip $% 8 +The current page number of the currently selected output channel. +(Mnemonic: % is page number in nroff.) +.Ip $= 8 +The current page length (printable lines) of the currently selected output +channel. +Default is 60. +(Mnemonic: = has horizontal lines.) +.Ip $\- 8 +The number of lines left on the page of the currently selected output channel. +(Mnemonic: lines_on_page \- lines_printed.) +.Ip $~ 8 +The name of the current report format for the currently selected output +channel. +(Mnemonic: brother to $^.) +.Ip $^ 8 +The name of the current top-of-page format for the currently selected output +channel. +(Mnemonic: points to top of page.) +.Ip $| 8 +If set to nonzero, forces a flush after every write or print on the currently +selected output channel. +Default is 0. +Note that +.I STDOUT +will typically be line buffered if output is to the +terminal and block buffered otherwise. +Setting this variable is useful primarily when you are outputting to a pipe, +such as when you are running a +.I perl +script under rsh and want to see the +output as it's happening. 
+(Mnemonic: when you want your pipes to be piping hot.) +.Ip $$ 8 +The process number of the +.I perl +running this script. +(Mnemonic: same as shells.) +.Ip $? 8 +The status returned by the last pipe close, backtick (\`\`) command or +.I system +operator. +Note that this is the status word returned by the wait() system +call, so the exit value of the subprocess is actually ($? >> 8). +$? & 255 gives which signal, if any, the process died from, and whether +there was a core dump. +(Mnemonic: similar to sh and ksh.) +.Ip $& 8 4 +The string matched by the last pattern match (not counting any matches hidden +within a BLOCK or eval enclosed by the current BLOCK). +(Mnemonic: like & in some editors.) +.Ip $\` 8 4 +The string preceding whatever was matched by the last pattern match +(not counting any matches hidden within a BLOCK or eval enclosed by the current +BLOCK). +(Mnemonic: \` often precedes a quoted string.) +.Ip $\' 8 4 +The string following whatever was matched by the last pattern match +(not counting any matches hidden within a BLOCK or eval enclosed by the current +BLOCK). +(Mnemonic: \' often follows a quoted string.) +Example: +.nf + +.ne 3 + $_ = \'abcdefghi\'; + /def/; + print "$\`:$&:$\'\en"; # prints abc:def:ghi + +.fi +.Ip $+ 8 4 +The last bracket matched by the last search pattern. +This is useful if you don't know which of a set of alternative patterns +matched. +For example: +.nf + + /Version: \|(.*\|)|Revision: \|(.*\|)\|/ \|&& \|($rev = $+); + +.fi +(Mnemonic: be positive and forward looking.) +.Ip $* 8 2 +Set to 1 to do multiline matching within a string, 0 to tell +.I perl +that it can assume that strings contain a single line, for the purpose +of optimizing pattern matches. +Pattern matches on strings containing multiple newlines can produce confusing +results when $* is 0. +Default is 0. +(Mnemonic: * matches multiple things.) +Note that this variable only influences the interpretation of ^ and $. 
+A literal newline can be searched for even when $* == 0. +.Ip $0 8 +Contains the name of the file containing the +.I perl +script being executed. +Assigning to $0 modifies the argument area that the ps(1) program sees. +(Mnemonic: same as sh and ksh.) +.Ip $<digit> 8 +Contains the subpattern from the corresponding set of parentheses in the last +pattern matched, not counting patterns matched in nested blocks that have +been exited already. +(Mnemonic: like \edigit.) +.Ip $[ 8 2 +The index of the first element in an array, and of the first character in +a substring. +Default is 0, but you could set it to 1 to make +.I perl +behave more like +.I awk +(or Fortran) +when subscripting and when evaluating the index() and substr() functions. +(Mnemonic: [ begins subscripts.) +.Ip $] 8 2 +The string printed out when you say \*(L"perl -v\*(R". +It can be used to determine at the beginning of a script whether the perl +interpreter executing the script is in the right range of versions. +If used in a numeric context, returns the version + patchlevel / 1000. +Example: +.nf + +.ne 8 + # see if getc is available + ($version,$patchlevel) = + $] =~ /(\ed+\e.\ed+).*\enPatch level: (\ed+)/; + print STDERR "(No filename completion available.)\en" + if $version * 1000 + $patchlevel < 2016; + +or, used numerically, + + warn "No checksumming!\en" if $] < 3.019; + +.fi +(Mnemonic: Is this version of perl in the right bracket?) +.Ip $; 8 2 +The subscript separator for multi-dimensional array emulation. +If you refer to an associative array element as +.nf + $foo{$a,$b,$c} + +it really means + + $foo{join($;, $a, $b, $c)} + +But don't put + + @foo{$a,$b,$c} # a slice\*(--note the @ + +which means + + ($foo{$a},$foo{$b},$foo{$c}) + +.fi +Default is "\e034", the same as SUBSEP in +.IR awk . +Note that if your keys contain binary data there might not be any safe +value for $;. +(Mnemonic: comma (the syntactic subscript separator) is a semi-semicolon. 
+Yeah, I know, it's pretty lame, but $, is already taken for something more +important.) +.Ip $! 8 2 +If used in a numeric context, yields the current value of errno, with all the +usual caveats. +(This means that you shouldn't depend on the value of $! to be anything +in particular unless you've gotten a specific error return indicating a +system error.) +If used in a string context, yields the corresponding system error string. +You can assign to $! in order to set errno +if, for instance, you want $! to return the string for error n, or you want +to set the exit value for the die operator. +(Mnemonic: What just went bang?) +.Ip $@ 8 2 +The perl syntax error message from the last eval command. +If null, the last eval parsed and executed correctly (although the operations +you invoked may have failed in the normal fashion). +(Mnemonic: Where was the syntax error \*(L"at\*(R"?) +.Ip $< 8 2 +The real uid of this process. +(Mnemonic: it's the uid you came FROM, if you're running setuid.) +.Ip $> 8 2 +The effective uid of this process. +Example: +.nf + +.ne 2 + $< = $>; # set real uid to the effective uid + ($<,$>) = ($>,$<); # swap real and effective uid + +.fi +(Mnemonic: it's the uid you went TO, if you're running setuid.) +Note: $< and $> can only be swapped on machines supporting setreuid(). +.Ip $( 8 2 +The real gid of this process. +If you are on a machine that supports membership in multiple groups +simultaneously, gives a space separated list of groups you are in. +The first number is the one returned by getgid(), and the subsequent ones +by getgroups(), one of which may be the same as the first number. +(Mnemonic: parentheses are used to GROUP things. +The real gid is the group you LEFT, if you're running setgid.) +.Ip $) 8 2 +The effective gid of this process. +If you are on a machine that supports membership in multiple groups +simultaneously, gives a space separated list of groups you are in. 
+The first number is the one returned by getegid(), and the subsequent ones +by getgroups(), one of which may be the same as the first number. +(Mnemonic: parentheses are used to GROUP things. +The effective gid is the group that's RIGHT for you, if you're running setgid.) +.Sp +Note: $<, $>, $( and $) can only be set on machines that support the +corresponding set[re][ug]id() routine. +$( and $) can only be swapped on machines supporting setregid(). +.Ip $: 8 2 +The current set of characters after which a string may be broken to +fill continuation fields (starting with ^) in a format. +Default is "\ \en-", to break on whitespace or hyphens. +(Mnemonic: a \*(L"colon\*(R" in poetry is a part of a line.) +.Ip $^D 8 2 +The current value of the debugging flags. +(Mnemonic: value of +.B \-D +switch.) +.Ip $^I 8 2 +The current value of the inplace-edit extension. +Use undef to disable inplace editing. +(Mnemonic: value of +.B \-i +switch.) +.Ip $^P 8 2 +The name that Perl itself was invoked as, from argv[0]. +.Ip $^T 8 2 +The time at which the script began running, in seconds since the epoch. +The values returned by the +.B \-M , +.B \-A +and +.B \-C +filetests are based on this value. +.Ip $^W 8 2 +The current value of the warning switch. +(Mnemonic: related to the +.B \-w +switch.) +.Ip $ARGV 8 3 +Contains the name of the current file when reading from <>. +.Ip @ARGV 8 3 +The array ARGV contains the command line arguments intended for the script. +Note that $#ARGV is generally the number of arguments minus one, since +$ARGV[0] is the first argument, NOT the command name. +See $0 for the command name. +.Ip @INC 8 3 +The array INC contains the list of places to look for +.I perl +scripts to be +evaluated by the \*(L"do EXPR\*(R" command or the \*(L"require\*(R" command. 
+It initially consists of the arguments to any +.B \-I +command line switches, followed +by the default +.I perl +library, probably \*(L"/usr/local/lib/perl\*(R", +followed by \*(L".\*(R", to represent the current directory. +.Ip %INC 8 3 +The associative array INC contains entries for each filename that has +been included via \*(L"do\*(R" or \*(L"require\*(R". +The key is the filename you specified, and the value is the location of +the file actually found. +The \*(L"require\*(R" command uses this array to determine whether +a given file has already been included. +.Ip $ENV{expr} 8 2 +The associative array ENV contains your current environment. +Setting a value in ENV changes the environment for child processes. +.Ip $SIG{expr} 8 2 +The associative array SIG is used to set signal handlers for various signals. +Example: +.nf + +.ne 12 + sub handler { # 1st argument is signal name + local($sig) = @_; + print "Caught a SIG$sig\-\|\-shutting down\en"; + close(LOG); + exit(0); + } + + $SIG{\'INT\'} = \'handler\'; + $SIG{\'QUIT\'} = \'handler\'; + .\|.\|. + $SIG{\'INT\'} = \'DEFAULT\'; # restore default action + $SIG{\'QUIT\'} = \'IGNORE\'; # ignore SIGQUIT + +.fi +The SIG array only contains values for the signals actually set within +the perl script. +.Sh "Packages" +Perl provides a mechanism for alternate namespaces to protect packages from +stomping on each other's variables. +By default, a perl script starts compiling into the package known as \*(L"main\*(R". +By use of the +.I package +declaration, you can switch namespaces. +The scope of the package declaration is from the declaration itself to the end +of the enclosing block (the same scope as the local() operator). +Typically it would be the first declaration in a file to be included by +the \*(L"require\*(R" operator. +You can switch into a package in more than one place; it merely influences +which symbol table is used by the compiler for the rest of that block. 
+You can refer to variables and filehandles in other packages by prefixing +the identifier with the package name and a single quote. +If the package name is null, the \*(L"main\*(R" package is assumed. +.PP +Only identifiers starting with letters are stored in the package's symbol +table. +All other symbols are kept in package \*(L"main\*(R". +In addition, the identifiers STDIN, STDOUT, STDERR, ARGV, ARGVOUT, ENV, INC +and SIG are forced to be in package \*(L"main\*(R", even when used for +other purposes than their built-in one. +Note also that, if you have a package called \*(L"m\*(R", \*(L"s\*(R" +or \*(L"y\*(R", then you can't use the qualified form of an identifier since it +will be interpreted instead as a pattern match, a substitution +or a translation. +.PP +Eval'ed strings are compiled in the package in which the eval was compiled. +(Assignments to $SIG{}, however, assume the signal handler specified is in the +main package. +Qualify the signal handler name if you wish to have a signal handler in +a package.) +For an example, examine perldb.pl in the perl library. +It initially switches to the DB package so that the debugger doesn't interfere +with variables in the script you are trying to debug. +At various points, however, it temporarily switches back to the main package +to evaluate various expressions in the context of the main package. +.PP +The symbol table for a package happens to be stored in the associative array +of that name prepended with an underscore. +The value in each entry of the associative array is +what you are referring to when you use the *name notation. +In fact, the following have the same effect (in package main, anyway), +though the first is more +efficient because it does the symbol table lookups at compile time: +.nf + +.ne 2 + local(*foo) = *bar; + local($_main{'foo'}) = $_main{'bar'}; + +.fi +You can use this to print out all the variables in a package, for instance. 
+Here is dumpvar.pl from the perl library: +.nf +.ne 11 + package dumpvar; + + sub main'dumpvar { + \& ($package) = @_; + \& local(*stab) = eval("*_$package"); + \& while (($key,$val) = each(%stab)) { + \& { + \& local(*entry) = $val; + \& if (defined $entry) { + \& print "\e$$key = '$entry'\en"; + \& } +.ne 7 + \& if (defined @entry) { + \& print "\e@$key = (\en"; + \& foreach $num ($[ .. $#entry) { + \& print " $num\et'",$entry[$num],"'\en"; + \& } + \& print ")\en"; + \& } +.ne 10 + \& if ($key ne "_$package" && defined %entry) { + \& print "\e%$key = (\en"; + \& foreach $key (sort keys(%entry)) { + \& print " $key\et'",$entry{$key},"'\en"; + \& } + \& print ")\en"; + \& } + \& } + \& } + } + +.fi +Note that, even though the subroutine is compiled in package dumpvar, the +name of the subroutine is qualified so that its name is inserted into package +\*(L"main\*(R". +.Sh "Style" +Each programmer will, of course, have his or her own preferences in regards +to formatting, but there are some general guidelines that will make your +programs easier to read. +.Ip 1. 4 4 +Just because you CAN do something a particular way doesn't mean that +you SHOULD do it that way. +.I Perl +is designed to give you several ways to do anything, so consider picking +the most readable one. +For instance + + open(FOO,$foo) || die "Can't open $foo: $!"; + +is better than + + die "Can't open $foo: $!" unless open(FOO,$foo); + +because the second way hides the main point of the statement in a +modifier. +On the other hand + + print "Starting analysis\en" if $verbose; + +is better than + + $verbose && print "Starting analysis\en"; + +since the main point isn't whether the user typed -v or not. +.Sp +Similarly, just because an operator lets you assume default arguments +doesn't mean that you have to make use of the defaults. +The defaults are there for lazy systems programmers writing one-shot +programs. +If you want your program to be readable, consider supplying the argument. 
+.Sp +Along the same lines, just because you +.I can +omit parentheses in many places doesn't mean that you ought to: +.nf + + return print reverse sort num values array; + return print(reverse(sort num (values(%array)))); + +.fi +When in doubt, parenthesize. +At the very least it will let some poor schmuck bounce on the % key in vi. +.Sp +Even if you aren't in doubt, consider the mental welfare of the person who +has to maintain the code after you, and who will probably put parens in +the wrong place. +.Ip 2. 4 4 +Don't go through silly contortions to exit a loop at the top or the +bottom, when +.I perl +provides the "last" operator so you can exit in the middle. +Just outdent it a little to make it more visible: +.nf + +.ne 7 + line: + for (;;) { + statements; + last line if $foo; + next line if /^#/; + statements; + } + +.fi +.Ip 3. 4 4 +Don't be afraid to use loop labels\*(--they're there to enhance readability as +well as to allow multi-level loop breaks. +See the previous example. +.Ip 4. 4 4 +For portability, when using features that may not be implemented on every +machine, test the construct in an eval to see if it fails. +If you know the version or patchlevel in which a particular feature was implemented, +you can test $] to see if it will be there. +.Ip 5. 4 4 +Choose mnemonic identifiers. +.Ip 6. 4 4 +Be consistent. +.Sh "Debugging" +If you invoke +.I perl +with a +.B \-d +switch, your script will be run under a debugging monitor. +It will halt before the first executable statement and ask you for a +command, such as: +.Ip "h" 12 4 +Prints out a help message. +.Ip "T" 12 4 +Stack trace. +.Ip "s" 12 4 +Single step. +Executes until it reaches the beginning of another statement. +.Ip "n" 12 4 +Next. +Executes over subroutine calls, until it reaches the beginning of the +next statement. +.Ip "f" 12 4 +Finish. +Executes statements until it has finished the current subroutine. +.Ip "c" 12 4 +Continue. +Executes until the next breakpoint is reached. 
+.Ip "c line" 12 4 +Continue to the specified line. +Inserts a one-time-only breakpoint at the specified line. +.Ip "<CR>" 12 4 +Repeat last n or s. +.Ip "l min+incr" 12 4 +List incr+1 lines starting at min. +If min is omitted, starts where last listing left off. +If incr is omitted, previous value of incr is used. +.Ip "l min-max" 12 4 +List lines in the indicated range. +.Ip "l line" 12 4 +List just the indicated line. +.Ip "l" 12 4 +List next window. +.Ip "-" 12 4 +List previous window. +.Ip "w line" 12 4 +List window around line. +.Ip "l subname" 12 4 +List subroutine. +If it's a long subroutine it just lists the beginning. +Use \*(L"l\*(R" to list more. +.Ip "/pattern/" 12 4 +Regular expression search forward for pattern; the final / is optional. +.Ip "?pattern?" 12 4 +Regular expression search backward for pattern; the final ? is optional. +.Ip "L" 12 4 +List lines that have breakpoints or actions. +.Ip "S" 12 4 +Lists the names of all subroutines. +.Ip "t" 12 4 +Toggle trace mode on or off. +.Ip "b line condition" 12 4 +Set a breakpoint. +If line is omitted, sets a breakpoint on the +line that is about to be executed. +If a condition is specified, it is evaluated each time the statement is +reached and a breakpoint is taken only if the condition is true. +Breakpoints may only be set on lines that begin an executable statement. +.Ip "b subname condition" 12 4 +Set breakpoint at first executable line of subroutine. +.Ip "d line" 12 4 +Delete breakpoint. +If line is omitted, deletes the breakpoint on the +line that is about to be executed. +.Ip "D" 12 4 +Delete all breakpoints. +.Ip "a line command" 12 4 +Set an action for line. +A multi-line command may be entered by backslashing the newlines. +.Ip "A" 12 4 +Delete all line actions. +.Ip "< command" 12 4 +Set an action to happen before every debugger prompt. +A multi-line command may be entered by backslashing the newlines. 
+.Ip "> command" 12 4 +Set an action to happen after the prompt when you've just given a command +to return to executing the script. +A multi-line command may be entered by backslashing the newlines. +.Ip "V package" 12 4 +List all variables in package. +Default is main package. +.Ip "! number" 12 4 +Redo a debugging command. +If number is omitted, redoes the previous command. +.Ip "! -number" 12 4 +Redo the command that was that many commands ago. +.Ip "H -number" 12 4 +Display last n commands. +Only commands longer than one character are listed. +If number is omitted, lists them all. +.Ip "q or ^D" 12 4 +Quit. +.Ip "command" 12 4 +Execute command as a perl statement. +A missing semicolon will be supplied. +.Ip "p expr" 12 4 +Same as \*(L"print DB'OUT expr\*(R". +The DB'OUT filehandle is opened to /dev/tty, regardless of where STDOUT +may be redirected to. +.PP +If you want to modify the debugger, copy perldb.pl from the perl library +to your current directory and modify it as necessary. +(You'll also have to put -I. on your command line.) +You can do some customization by setting up a .perldb file which contains +initialization code. +For instance, you could make aliases like these: +.nf + + $DB'alias{'len'} = 's/^len(.*)/p length($1)/'; + $DB'alias{'stop'} = 's/^stop (at|in)/b/'; + $DB'alias{'.'} = + 's/^\e./p "\e$DB\e'sub(\e$DB\e'line):\et",\e$DB\e'line[\e$DB\e'line]/'; + +.fi +.Sh "Setuid Scripts" +.I Perl +is designed to make it easy to write secure setuid and setgid scripts. +Unlike shells, which are based on multiple substitution passes on each line +of the script, +.I perl +uses a more conventional evaluation scheme with fewer hidden \*(L"gotchas\*(R". +Additionally, since the language has more built-in functionality, it +has to rely less upon external (and possibly untrustworthy) programs to +accomplish its purposes. +.PP +In an unpatched 4.2 or 4.3bsd kernel, setuid scripts are intrinsically +insecure, but this kernel feature can be disabled. 
+If it is, +.I perl +can emulate the setuid and setgid mechanism when it notices the otherwise +useless setuid/gid bits on perl scripts. +If the kernel feature isn't disabled, +.I perl +will complain loudly that your setuid script is insecure. +You'll need to either disable the kernel setuid script feature, or put +a C wrapper around the script. +.PP +When perl is executing a setuid script, it takes special precautions to +prevent you from falling into any obvious traps. +(In some ways, a perl script is more secure than the corresponding +C program.) +Any command line argument, environment variable, or input is marked as +\*(L"tainted\*(R", and may not be used, directly or indirectly, in any +command that invokes a subshell, or in any command that modifies files, +directories or processes. +Any variable that is set within an expression that has previously referenced +a tainted value also becomes tainted (even if it is logically impossible +for the tainted value to influence the variable). +For example: +.nf + +.ne 5 + $foo = shift; # $foo is tainted + $bar = $foo,\'bar\'; # $bar is also tainted + $xxx = <>; # Tainted + $path = $ENV{\'PATH\'}; # Tainted, but see below + $abc = \'abc\'; # Not tainted + +.ne 4 + system "echo $foo"; # Insecure + system "/bin/echo", $foo; # Secure (doesn't use sh) + system "echo $bar"; # Insecure + system "echo $abc"; # Insecure until PATH set + +.ne 5 + $ENV{\'PATH\'} = \'/bin:/usr/bin\'; + $ENV{\'IFS\'} = \'\' if $ENV{\'IFS\'} ne \'\'; + + $path = $ENV{\'PATH\'}; # Not tainted + system "echo $abc"; # Is secure now! + +.ne 5 + open(FOO,"$foo"); # OK + open(FOO,">$foo"); # Not OK + + open(FOO,"echo $foo|"); # Not OK, but... 
+ open(FOO,"-|") || exec \'echo\', $foo; # OK + + $zzz = `echo $foo`; # Insecure, zzz tainted + + unlink $abc,$foo; # Insecure + umask $foo; # Insecure + +.ne 3 + exec "echo $foo"; # Insecure + exec "echo", $foo; # Secure (doesn't use sh) + exec "sh", \'-c\', $foo; # Considered secure, alas + +.fi +The taintedness is associated with each scalar value, so some elements +of an array can be tainted, and others not. +.PP +If you try to do something insecure, you will get a fatal error saying +something like \*(L"Insecure dependency\*(R" or \*(L"Insecure PATH\*(R". +Note that you can still write an insecure system call or exec, +but only by explicitly doing something like the last example above. +You can also bypass the tainting mechanism by referencing +subpatterns\*(--\c +.I perl +presumes that if you reference a substring using $1, $2, etc, you knew +what you were doing when you wrote the pattern: +.nf + + $ARGV[0] =~ /^\-P(\ew+)$/; + $printer = $1; # Not tainted + +.fi +This is fairly secure since \ew+ doesn't match shell metacharacters. +Use of .+ would have been insecure, but +.I perl +doesn't check for that, so you must be careful with your patterns. +This is the ONLY mechanism for untainting user supplied filenames if you +want to do file operations on them (unless you make $> equal to $<). +.PP +It's also possible to get into trouble with other operations that don't care +whether they use tainted values. +Make judicious use of the file tests in dealing with any user-supplied +filenames. +When possible, do opens and such after setting $> = $<. +.I Perl +doesn't prevent you from opening tainted filenames for reading, so be +careful what you print out. +The tainting mechanism is intended to prevent stupid mistakes, not to remove +the need for thought. +.SH ENVIRONMENT +.I Perl +uses PATH in executing subprocesses, and in finding the script if \-S +is used. +HOME or LOGDIR are used if chdir has no argument. 
+.PP +Apart from these, +.I perl +uses no environment variables, except to make them available +to the script being executed, and to child processes. +However, scripts running setuid would do well to execute the following lines +before doing anything else, just to keep people honest: +.nf + +.ne 3 + $ENV{\'PATH\'} = \'/bin:/usr/bin\'; # or whatever you need + $ENV{\'SHELL\'} = \'/bin/sh\' if $ENV{\'SHELL\'} ne \'\'; + $ENV{\'IFS\'} = \'\' if $ENV{\'IFS\'} ne \'\'; + +.fi +.SH AUTHOR +Larry Wall <lwall@jpl-devvax.Jpl.Nasa.Gov> +.br +MS-DOS port by Diomidis Spinellis <dds@cc.ic.ac.uk> +.SH FILES +/tmp/perl\-eXXXXXX temporary file for +.B \-e +commands. +.SH SEE ALSO +a2p awk to perl translator +.br +s2p sed to perl translator +.SH DIAGNOSTICS +Compilation errors will tell you the line number of the error, with an +indication of the next token or token type that was to be examined. +(In the case of a script passed to +.I perl +via +.B \-e +switches, each +.B \-e +is counted as one line.) +.PP +Setuid scripts have additional constraints that can produce error messages +such as \*(L"Insecure dependency\*(R". +See the section on setuid scripts. +.SH TRAPS +Accustomed +.IR awk +users should take special note of the following: +.Ip * 4 2 +Semicolons are required after all simple statements in +.IR perl . +Newline +is not a statement delimiter. +.Ip * 4 2 +Curly brackets are required on ifs and whiles. +.Ip * 4 2 +Variables begin with $ or @ in +.IR perl . +.Ip * 4 2 +Arrays index from 0 unless you set $[. +Likewise string positions in substr() and index(). +.Ip * 4 2 +You have to decide whether your array has numeric or string indices. +.Ip * 4 2 +Associative array values do not spring into existence upon mere reference. +.Ip * 4 2 +You have to decide whether you want to use string or numeric comparisons. +.Ip * 4 2 +Reading an input line does not split it for you. You get to split it yourself +to an array. +And the +.I split +operator has different arguments. 
+.Ip * 4 2 +The current input line is normally in $_, not $0. +It generally does not have the newline stripped. +($0 is the name of the program executed.) +.Ip * 4 2 +$<digit> does not refer to fields\*(--it refers to substrings matched by the last +match pattern. +.Ip * 4 2 +The +.I print +statement does not add field and record separators unless you set +$, and $\e. +.Ip * 4 2 +You must open your files before you print to them. +.Ip * 4 2 +The range operator is \*(L".\|.\*(R", not comma. +(The comma operator works as in C.) +.Ip * 4 2 +The match operator is \*(L"=~\*(R", not \*(L"~\*(R". +(\*(L"~\*(R" is the one's complement operator, as in C.) +.Ip * 4 2 +The exponentiation operator is \*(L"**\*(R", not \*(L"^\*(R". +(\*(L"^\*(R" is the XOR operator, as in C.) +.Ip * 4 2 +The concatenation operator is \*(L".\*(R", not the null string. +(Using the null string would render \*(L"/pat/ /pat/\*(R" unparsable, +since the third slash would be interpreted as a division operator\*(--the +tokener is in fact slightly context sensitive for operators like /, ?, and <. +And in fact, . itself can be the beginning of a number.) +.Ip * 4 2 +.IR Next , +.I exit +and +.I continue +work differently. +.Ip * 4 2 +The following variables work differently +.nf + + Awk \h'|2.5i'Perl + ARGC \h'|2.5i'$#ARGV + ARGV[0] \h'|2.5i'$0 + FILENAME\h'|2.5i'$ARGV + FNR \h'|2.5i'$. \- something + FS \h'|2.5i'(whatever you like) + NF \h'|2.5i'$#Fld, or some such + NR \h'|2.5i'$. + OFMT \h'|2.5i'$# + OFS \h'|2.5i'$, + ORS \h'|2.5i'$\e + RLENGTH \h'|2.5i'length($&) + RS \h'|2.5i'$/ + RSTART \h'|2.5i'length($\`) + SUBSEP \h'|2.5i'$; + +.fi +.Ip * 4 2 +When in doubt, run the +.I awk +construct through a2p and see what it gives you. +.PP +Cerebral C programmers should take note of the following: +.Ip * 4 2 +Curly brackets are required on ifs and whiles. 
+.Ip * 4 2 +You should use \*(L"elsif\*(R" rather than \*(L"else if\*(R" +.Ip * 4 2 +.I Break +and +.I continue +become +.I last +and +.IR next , +respectively. +.Ip * 4 2 +There's no switch statement. +.Ip * 4 2 +Variables begin with $ or @ in +.IR perl . +.Ip * 4 2 +Printf does not implement *. +.Ip * 4 2 +Comments begin with #, not /*. +.Ip * 4 2 +You can't take the address of anything. +.Ip * 4 2 +ARGV must be capitalized. +.Ip * 4 2 +The \*(L"system\*(R" calls link, unlink, rename, etc. return nonzero for success, not 0. +.Ip * 4 2 +Signal handlers deal with signal names, not numbers. +.PP +Seasoned +.I sed +programmers should take note of the following: +.Ip * 4 2 +Backreferences in substitutions use $ rather than \e. +.Ip * 4 2 +The pattern matching metacharacters (, ), and | do not have backslashes in front. +.Ip * 4 2 +The range operator is .\|. rather than comma. +.PP +Sharp shell programmers should take note of the following: +.Ip * 4 2 +The backtick operator does variable interpretation without regard to the +presence of single quotes in the command. +.Ip * 4 2 +The backtick operator does no translation of the return value, unlike csh. +.Ip * 4 2 +Shells (especially csh) do several levels of substitution on each command line. +.I Perl +does substitution only in certain constructs such as double quotes, +backticks, angle brackets and search patterns. +.Ip * 4 2 +Shells interpret scripts a little bit at a time. +.I Perl +compiles the whole program before executing it. +.Ip * 4 2 +The arguments are available via @ARGV, not $1, $2, etc. +.Ip * 4 2 +The environment is not automatically made available as variables. +.SH ERRATA\0AND\0ADDENDA +The Perl book, +.I Programming\0Perl , +has the following omissions and goofs. +.PP +On page 5, the examples which read +.nf + + eval "/usr/bin/perl + +should read + + eval "exec /usr/bin/perl + +.fi +.PP +On page 195, the equivalent to the System V sum program only works for +very small files. 
To do larger files, use +.nf + + undef $/; + $checksum = unpack("%32C*",<>) % 32767; + +.fi +.PP +The +.B \-0 +switch to set the initial value of $/ was added to Perl after the book +went to press. +.PP +The +.B \-l +switch now does automatic line ending processing. +.PP +The qx// construct is now a synonym for backticks. +.PP +$0 may now be assigned to set the argument displayed by +.I ps (1). +.PP +The new @###.## format was omitted accidentally from the description +of formats. +.PP +It wasn't known at press time that s///ee caused multiple evaluations of +the replacement expression. This is to be construed as a feature. +.PP +(LIST) x $count now does array replication. +.PP +There is now no limit on the number of parentheses in a regular expression. +.PP +In double-quote context, more escapes are supported: \ee, \ea, \ex1b, \ec[, +\el, \eL, \eu, \eU, \eE. The latter five control upper/lower case translation. +.PP +The +.B $/ +variable may now be set to a multi-character delimiter. +.SH BUGS +.PP +.I Perl +is at the mercy of your machine's definitions of various operations +such as type casting, atof() and sprintf(). +.PP +If your stdio requires a seek or eof between reads and writes on a particular +stream, so does +.IR perl . +.PP +While none of the built-in data types have any arbitrary size limits (apart +from memory size), there are still a few arbitrary limits: +a given identifier may not be longer than 255 characters; +sprintf is limited on many machines to 128 characters per field (unless the format +specifier is exactly %s); +and no component of your PATH may be longer than 255 characters if you use \-S. +.PP +.I Perl +actually stands for Pathologically Eclectic Rubbish Lister, but don't tell +anyone I said that. +.rn }` '' |