diff options
author | Jarkko Hietaniemi <jhi@iki.fi> | 2001-10-18 00:43:18 +0000 |
---|---|---|
committer | Jarkko Hietaniemi <jhi@iki.fi> | 2001-10-18 00:43:18 +0000 |
commit | bfe16a1a9da3d6943a43492946f841e0774c6221 (patch) | |
tree | d300de2882cef84d66bad1ddab63df42c3a91f0b /pod | |
parent | 8305e449a259649641f455b333f66bc0de7f3b62 (diff) | |
download | perl-bfe16a1a9da3d6943a43492946f841e0774c6221.tar.gz |
Add perlintro, from Skud.
p4raw-id: //depot/perl@12487
Diffstat (limited to 'pod')
-rw-r--r-- | pod/perlintro.pod | 645 |
1 files changed, 645 insertions, 0 deletions
diff --git a/pod/perlintro.pod b/pod/perlintro.pod new file mode 100644 index 0000000000..db30810685 --- /dev/null +++ b/pod/perlintro.pod @@ -0,0 +1,645 @@ +=head1 NAME + +perlintro -- a brief introduction and overview of Perl + +=head1 DESCRIPTION + +This document is intended to give you a quick overview of the Perl +programming language, along with pointers to further documentation. It +is intended as a "bootstrap" guide for those who are new to the +language, and provides just enough information for you to be able to +read other peoples' Perl and understand roughly what it's doing, or +write your own simple scripts. + +This introductory document does not aim to be complete. It does not +even aim to be entirely accurate. In some cases perfection has been +sacrificed in the goal of getting the general idea across. You are +B<strongly> advised to follow this introduction with more information +from the full Perl manual, the table of contents to which can be found +in L<perltoc>. + +Throughout this document you'll see references to other parts of the +Perl documentation. You can read that documentation using the C<perldoc> +command or using whatever method you're using to read this document. + +=head2 What is Perl? + +Perl is a general-purpose programming language originally developed for +text manipulation and now used for a wide range of tasks including +system administration, web development, network programming, GUI +development, and more. + +The language is intended to be practical (easy to use, efficient, +complete) rather than beautiful (tiny, elegant, minimal). Its major +features are that it's easy to use, supports both procedural and OO +programming, has powerful built-in support for text processing, and +has one of the world's most impressive collections of third-party +modules. + +Different definitions of Perl are given in L<perl>, L<perlfaq1> and +no doubt other places. From this we can determine that Perl is different +things to different people, but that lots of people think it's at least +worth writing about. + +=head2 Running Perl programs + +To run a Perl program from the Unix command line: + + perl progname.pl + +Alternatively, put this as the first line of your script: + + #!/usr/bin/env perl + +... and run the script as C</path/to/script.pl>. Of course, it'll need +to be executable first, so C<chmod 755 script.pl> (under Unix). + +For more information, including instructions for other platforms such as +Windows and MacOS, read L<perlrun>. + +=head2 Basic syntax overview + +A Perl script or program consists of one or more statements. These +statements are simply written in the script in a straightforward +fashion. There is no need to have a main() function or anything of that +kind. + +Perl statements end in a semi-colon: + + print "Hello, world"; + +Comments start with a hash symbol and run to the end of the line + + # This is a comment + +Whitespace is irrelevant: + + print + "Hello, world" + ; + +... except inside quoted strings: + + # this would print with a linebreak in the middle + print "Hello + world"; + +Double quotes or single quotes may be used around literal strings: + + print "Hello, world"; + print 'Hello, world'; + +However, only double quotes "interpolate" variables and special +characters such as newlines (C<\n>): + + print "Hello, $name\n"; # works fine + print 'Hello, $name\n'; # prints $name\n literally + +Numbers don't need quotes around them: + + print 42; + +You can use parentheses for functions' arguments or omit them +according to your personal taste. They are only required +occasionally to clarify issues of precedence. + + print("Hello, world\n"); + print "Hello, world\n"; + +More detailed information about Perl syntax can be found in L<perlsyn>. + +=head2 Perl variable types + +Perl has three main variable types: scalars, arrays, and hashes. + +=over 4 + +=item Scalars + +A scalar represents a single value: + + my $animal = "camel"; + my $answer = 42; + +Scalar values can be strings, integers or floating point numbers, and Perl +will automatically convert between them as required. There is no need +to pre-declare your variable types. + +Scalar values can be used in various ways: + + print $animal; + print "The animal is $animal\n"; + print "The square of $answer is ", $answer * $answer, "\n"; + +There are a number of "magic" scalars with names that look like +punctuation or line noise. These special variables are used for all +kinds of purposes, and are documented in L<perlvar>. The only one you +need to know about for now is C<$_> which is the "default variable". +It's used as the default argument to a number of functions in Perl, and +it's set implicitly by certain looping constructs. + + print; # prints contents of $_ by default + +=item Arrays + +An array represents a list of values: + + my @animals = ("camel", "llama", "owl"); + my @numbers = (23, 42, 69); + my @mixed = ("camel", 42, 1.23); + +Arrays are zero-indexed. Here's how you get at elements in an array: + + print $animals[0]; # prints "camel" + print $animals[1]; # prints "llama" + +The special variable C<$#array> tells you the index of the last element +of an array: + + print $mixed[$#mixed]; # last element, prints 1.23 + +You might be tempted to use C<$#array + 1> to tell you how many items there +are in an array. Don't bother. As it happens, using C<@array> where Perl +expects to find a scalar value ("in scalar context") will give you the number +of elements in the array: + + if (@animals < 5) { ... } + +The elements we're getting from the array start with a C<$> because +we're getting just a single value out of the array -- you ask for a scalar, +you get a scalar. + +To get multiple values from a array: + + @animals[0,1]; # gives ("camel", "llama"); + @animals[0..2]; # gives ("camel", "llama", "owl"); + @animals[1..$#animals]; # gives all except the first element + +This is called an "array slice". + +You can do various useful things to lists: + + my @sorted = sort @animals; + my @backwards = reverse @numbers; + +There are a couple of special arrays too, such as C<@ARGV> (the command +line arguments to your script) and C<@_> (the arguments passed to a +subroutine). These are documented in L<perlvar>. + +=item Hashes + +A hash represents a set of key/value pairs: + + my %fruit_color = ("apple", "red", "banana", "yellow"); + +You can use whitespace and the C<< => >> operator to lay them out more +nicely: + + my %fruit_color = ( + apple => "red", + banana => "yellow", + ); + +To get at hash elements: + + $fruit_color{"apple"}; # gives "red" + +You can get at lists of keys and values with C<keys()> and +C<values()>. + + my @fruits = keys %fruit_colors; + my @colors = values %fruit_colors; + +Hashes have no particular internal order, though you can sort the keys +and loop through them. + +Just like special scalars and arrays, there are also special hashes. +The most well known of these is C<%ENV> which contains environment +variables. Read all about it (and other special variables) in +L<perlvar>. + +=back + +Scalars, arrays and hashes are documented more fully in L<perldata>. + +More complex data types can be constructed using references, which allow +you to build lists and hashes within lists and hashes. + +A reference is a scalar value and can refer to any other Perl data +type. So by storing a reference as the value of an array or hash +element, you can easily create lists and hashes within lists and +hashes. The following example shows a 2 level hash of hash +structure using anonymous hash references. + + my $variables = { + scalar => { + description => "single item", + sigil => '$', + }, + array => { + description => "ordered list of items", + sigil => '@', + }, + hash => { + description => "key/value pairs", + sigil => '%', + }, + }; + + print "Scalars begin with a $variables->{'scalar'}->{'sigil'}\n"; + +Exhaustive information on the topic of references can be found in +L<perlreftut>, L<perllol>, L<perlref> and L<perldsc>. + +=head2 Variable scoping + +Throughout the previous section all the examples have used the syntax: + + my $var = "value"; + +The C<my> is actually not required; you could just use: + + $var = "value"; + +However, the above usage will create global variables throughout your +program, which is bad programming practice. C<my> creates lexically +scoped variables instead. The variables are scoped to the block +(i.e. a bunch of statements surrounded by curly-braces) in which they +are defined. + + my $a = "foo"; + if ($some_condition) { + my $b = "bar"; + print $a; # prints "foo" + print $b; # prints "bar" + } + print $a; # prints "foo" + print $b; # prints nothing; $b has fallen out of scope + +Using C<my> in combination with a C<use strict;> at the top of +your Perl scripts means that the interpreter will pick up certain common +programming errors. For instance, in the example above, the final +C<print $b> would cause a compile-time error and prevent you from +running the program. Using C<strict> is highly recommended. + +=head2 Conditional and looping constructs + +Perl has most of the usual conditional and looping constructs except for +case/switch (but you can find a Switch module on CPAN, if you really +want one -- see the section on modules, below, for more information +about modules and CPAN). + +The conditions can be any Perl expression. See the list of operators in +the next section for information on comparison and boolean logic operators, +which are commonly used in conditional statements. + +=over 4 + +=item if + + if ( condition ) { + ... + } elsif ( other condition ) { + ... + } else { + ... + } + +There's also a negated version of it: + + unless ( condition ) { + ... + } + +This is provided as a more readable version of C<if (! condition)>. + +Note that the braces are required in Perl, even if you've only got one +line in the block. However, there is a clever way of making your one-line +conditional blocks more English like: + + # the traditional way + if ($zippy) { + print "Yow!"; + } + + # the Perlish post-condition way + print "Yow!" if $zippy; + print "We have no bananas" unless $bananas; + +=item while + + while ( condition ) { + ... + } + +There's also a negated version, for the same reason we have C<unless>: + + until ( condition ) { + ... + } + +You can also use C<while> in a post-condition: + + print "LA LA LA\n" while 1; # loops forever + +=item for + +Exactly like C: + + for ($i=0; $i <= $max; $i++) { + ... + } + +The C style for loop is rarely needed in Perl since Perl provides +the the more friendly list scanning C<foreach> loop. + +=item foreach + + foreach (@array) { + print "This element is $_\n"; + } + + # you don't have to use the default $_ either... + foreach my $key (keys %hash) { + print "The value of $key is $hash{$key}\n"; + } + +=back + +For more detail on looping constructs (and some that weren't mentioned in +this overview) see L<perlsyn>. + +=head2 Builtin operators and functions + +Perl comes with a wide selection of builtin functions. Some of the ones +we've already seen include C<print>, C<sort> and C<reverse>. A list of +them is given at the start of L<perlfunc> and you can easily read +about any given function by using C<perldoc -f functionname>. + +Perl operators are documented in full in L<perlop>, but here are a few +of the most common ones: + +=over 4 + +=item Arithmetic + + + addition + - subtraction + * multiplication + / division + +=item Numeric comparison + + == equality + != inequality + < less than + > greater than + <= less than or equal + >= greater than or equal + +=item String comparison + + eq equality + ne inequality + lt less than + gt greater than + le less than or equal + ge greater than or equal + +(Why do we have separate numeric and string comparisons? Because we don't +have special variable types, and Perl needs to know whether to sort +numerically (where 99 is less than 100) or alphabetically (where 100 comes +before 99). + +=item Boolean logic + + && and + || or + ! not + +(C<and>, C<or> and C<not> aren't just in the above table as descriptions +of the operators -- they're also supported as operators in their own +right. They're more readable than the C-style operators, but have +different precedence to C<&&> and friends. Check L<perlop> for more +detail.) + +=item Miscellaneous + + = assignment + . string concatenation + x string multiplication + .. range operator (creates a list of numbers) + +=back + +Many operators can be combined with a C<=> as follows: + + $a += 1; # same as $a = $a + 1 + $a -= 1; # same as $a = $a - 1 + $a .= "\n"; # same as $a = $a . "\n"; + +=head2 Files and I/O + +You can open a file for input or output using the C<open()> function. +It's documented in extravagant detail in L<perlfunc> and L<perlopentut>, +but in short: + + open(INFILE, "input.txt") or die "Can't open input.txt: $!"; + open(OUTFILE, ">output.txt") or die "Can't open output.txt: $!"; + open(LOGFILE, ">>my.log") or die "Can't open logfile: $!"; + +You can read from an open filehandle using the C<< <> >> operator. In +scalar context it reads a single line from the filehandle, and in list +context it reads the whole file in, assigning each line to an element of +the list: + + my $line = <INFILE>; + my @lines = <INFILE>; + +Reading in the whole file at one time is called slurping. It can +be useful but it may be a memory hog. Most text file processing +can be done a line at a time with Perl's looping constructs. + +The C<< <> >> operator is most often seen in a C<while> loop: + + while (<INFILE>) { # assigns each line in turn to $_ + print "Just read in this line: $_"; + } + +We've already seen how to print to standard output using C<print()>. +However, C<print()> can also take an optional first argument specifying +which filehandle to print to: + + print STDERR "This is your final warning.\n"; + print OUTFILE $record; + print LOGFILE $logmessage; + +When you're done with your filehandles, you should C<close()> them +(though to be honest, Perl will clean up after you if you forget): + + close INFILE; + +=head2 Regular expressions + +Perl's regular expression support is both broad and deep, and is the +subject of lengthy documentation in L<perlrequick>, L<perlretut>, and +elsewhere. However, in short: + +=over 4 + +=item Simple matching + + if (/foo/) { ... } # true if $_ contains "foo" + if ($a =~ /foo/) { ... } # true if $a contains "foo" + +The C<//> matching operator is documented in L<perlop>. It operates on +C<$_> by default, or can be bound to another variable using the C<=~> +binding operator (also documented in L<perlop>). + +=item Simple substitution + + s/foo/bar/; # replaces foo with bar in $_ + $a =~ s/foo/bar/; # replaces foo with bar in $a + $a =~ s/foo/bar/g; # replaces ALL INSTANCES of foo with bar in $a + +The C<s///> substitution operator is documented in L<perlop>. + +=item More complex regular expressions + +You don't just have to match on fixed strings. In fact, you can match +on just about anything you could dream of by using more complex regular +expressions. These are documented at great length in L<perlre>, but for +the meantime, here's a quick cheat sheet: + + . a single character + \s a whitespace character (space, tab, newline) + \S non-whitespace character + \d a digit (0-9) + \D a non-digit + \w a word character (a-z, A-Z, 0-9, _) + \W a non-word character + [aeiou] matches a single character in the given set + [^aeiou] matches a single character outside the given set + (foo|bar|baz) matches any of the alternatives specified + + ^ start of string + $ end of string + +Quantifiers can be used to specify how many of the previous thing you +want to match on, where "thing" means either a literal character, one +of the metacharacters listed above, or a group of characters or +metacharacters in parentheses. + + * zero or more of the previous thing + + one or more of the previous thing + ? zero or one of the previous thing + {3} matches exactly 3 of the previous thing + {3,6} matches between 3 and 6 of the previous thing + {3,} matches 3 or more of the previous thing + +Some brief examples: + + /^\d+/ string starts with one or more digits + /^$/ nothing in the string (start and end are adjacent) + /(\d\s){3}/ a three digits, each followed by a whitespace + character (eg "3 4 5 ") + /(a.)+/ matches a string in which every odd-numbered letter + is a (eg "abacadaf") + + # This loop reads from STDIN, and prints non-blank lines: + while (<>) { + next if /^$/; + print; + } + +=item Parentheses for capturing + +As well as grouping, parentheses serve a second purpose. They can be +used to capture the results of parts of the regexp match for later use. +The results end up in C<$1>, C<$2> and so on. + + # a cheap and nasty way to break an email address up into parts + + if ($email =~ /([^@]+@(.+)/) { + print "Username is $1\n"; + print "Hostname is $2\n"; + } + +=item Other regexp features + +Perl regexps also support backreferences, lookaheads, and all kinds of +other complex details. Read all about them in L<perlrequick>, +L<perlretut>, and L<perlre>. + +=back + +=head2 Writing subroutines + +Writing subroutines is easy: + + sub log { + my $logmessage = shift; + print LOGFILE $logmessage; + } + +What's that C<shift>? Well, the arguments to a subroutine are available +to us as a special array called C<@_> (see L<perlvar> for more on that). +The default argument to the C<shift> function just happens to be C<@_>. +So C<my $logmessage = shift;> shifts the first item off the list of +arguments and assigns it to C<$logmessage>. + +We can manipulate C<@_> in other ways too: + + my ($logmessage, $priority) = @_; # common + my $logmessage = $_[0]; # uncommon, and ugly + +Subroutines can also return values: + + sub square { + my $num = shift; + my $result = $num * $num; + return $result; + } + +For more information on writing subroutines, see L<perlsub>. + +=head2 OO Perl + +OO Perl is relatively simple and is implemented using references which +know what sort of object they are based on Perl's concept of packages. +However, OO Perl is largely beyond the scope of this document. +Read L<perlboot>, L<perltoot>, L<perltooc> and L<perlobj>. + +As a beginning Perl programmer, your most common use of OO Perl will be +in using third-party modules, which are documented below. + +=head2 Using Perl modules + +Perl modules provide a range of features to help you avoid reinventing +the wheel, and can be downloaded from CPAN (http://www.cpan.org). A +number of popular modules are included with the Perl distribution +itself. + +Categories of modules range from text manipulation to network protocols +to database integration to graphics. A categorized list of modules is +also available from CPAN. + +To learn how to install modules you download from CPAN, read +L<perlmodinstall> + +To learn how to use a particular module, use C<perldoc Module::Name>. +Typically you will want to C<use Module::Name>, which will then give you +access to exported functions or an OO interface to the module. + +L<perlfaq> contains questions and answers related to many common +tasks, and often provides suggestions for good CPAN modules to use. + +L<perlmod> describes Perl modules in general. L<perlmodlib> lists the +modules which came with your Perl installation. + +If you feel the urge to write Perl modules, L<perlnewmod> will give you +good advice. + +=head1 AUTHOR + +Kirrily "Skud" Robert <skud@cpan.org> |