perl 5.000 (perl-5.000)

[editor's note: this commit combines approximately 4 months of furious releases of Andy Dougherty and Larry Wall - see pod/perlhist.pod for details. Andy notes that: Alas neither my "Irwin AccuTrack" nor my DC 600A quarter-inch cartridge backup tapes from that era seem to be readable anymore. I guess 13 years exceeds the shelf life for that backup technology :-(. ]
author: Larry Wall <lwall@netlabs.com> 1994-10-17 23:00:00 +0000
committer: Larry Wall <lwall@netlabs.com> 1994-10-17 23:00:00 +0000
commit: a0d0e21ea6ea90a22318550944fe6cb09ae10cda (patch)
tree: faca1018149b736b1142f487e44d1ff2de5cc1fa /pod/perlref.pod
parent: 85e6fe838fb25b257a1b363debf8691c0992ef71 (diff)
download: perl-a0d0e21ea6ea90a22318550944fe6cb09ae10cda.tar.gz
1 files changed, 332 insertions, 0 deletions
diff --git a/pod/perlref.pod b/pod/perlref.pod
new file mode 100644
index 0000000000..0ad25dfe66
--- /dev/null
+++ b/pod/perlref.pod
@@ -0,0 +1,332 @@
+=head1 NAME
+
+perlref - Perl references and nested data structures
+
+=head1 DESCRIPTION
+
+In Perl 4 it was difficult to represent complex data structures, because
+all references had to be symbolic, and even that was difficult to do when
+you wanted to refer to a variable rather than a symbol table entry.  Perl
+5 not only makes it easier to use symbolic references to variables, but
+lets you have "hard" references to any piece of data.  Any scalar may hold
+a hard reference.  Since arrays and hashes contain scalars, you can now
+easily build arrays of arrays, arrays of hashes, hashes of arrays, arrays
+of hashes of functions, and so on.
+
+Hard references are smart--they keep track of reference counts for you,
+automatically freeing the thing referred to when its reference count
+goes to zero.  If that thing happens to be an object, the object is
+destructed.  See L<perlobj> for more about objects.  (In a sense,
+everything in Perl is an object, but we usually reserve the word for
+references to objects that have been officially "blessed" into a class package.)
+
+A symbolic reference contains the name of a variable, just as a
+symbolic link in the filesystem merely contains the name of a file.  
+The C<*glob> notation is a kind of symbolic reference.  Hard references
+are more like hard links in the file system: merely another way
+of getting at the same underlying object, irrespective of its name.
+
+"Hard" references are easy to use in Perl.  There is just one
+overriding principle:  Perl does no implicit referencing or
+dereferencing.  When a scalar is holding a reference, it always behaves
+as a scalar.  It doesn't magically start being an array or a hash
+unless you tell it so explicitly by dereferencing it.
+
+References can be constructed several ways.
+
+=over 4
+
+=item 1.
+
+By using the backslash operator on a variable, subroutine, or value.
+(This works much like the & (address-of) operator works in C.)  Note
+that this typically creates I<ANOTHER> reference to a variable, since
+there's already a reference to the variable in the symbol table.  But
+the symbol table reference might go away, and you'll still have the
+reference that the backslash returned.  Here are some examples:
+
+    $scalarref = \$foo;
+    $arrayref  = \@ARGV;
+    $hashref   = \%ENV;
+    $coderef   = \&handler;
+
+=item 2.
+
+A reference to an anonymous array can be constructed using square
+brackets:
+
+    $arrayref = [1, 2, ['a', 'b', 'c']];
+
+Here we've constructed a reference to an anonymous array of three elements
+whose final element is itself a reference to another anonymous array of three
+elements.  (The multidimensional syntax described later can be used to
+access this.  For example, after the above, $arrayref->[2][1] would have
+the value "b".)
+
+=item 3.
+
+A reference to an anonymous hash can be constructed using curly
+brackets:
+
+    $hashref = {
+	'Adam'  => 'Eve',
+	'Clyde' => 'Bonnie',
+    };
+
+Anonymous hash and array constructors can be intermixed freely to
+produce as complicated a structure as you want.  The multidimensional
+syntax described below works for these too.  The values above are
+literals, but variables and expressions would work just as well, because
+assignment operators in Perl (even within local() or my()) are executable
+statements, not compile-time declarations.
+
+Because curly brackets (braces) are used for several other things
+including BLOCKs, you may occasionally have to disambiguate braces at the
+beginning of a statement by putting a C<+> or a C<return> in front so
+that Perl realizes the opening brace isn't starting a BLOCK.  The economy and
+mnemonic value of using curlies is deemed worth this occasional extra
+hassle.
+
+For example, if you wanted a function to make a new hash and return a
+reference to it, you have these options:
+
+    sub hashem {        { @_ } }   # silently wrong
+    sub hashem {       +{ @_ } }   # ok
+    sub hashem { return { @_ } }   # ok
+
+=item 4.
+
+A reference to an anonymous subroutine can be constructed by using
+C<sub> without a subname:
+
+    $coderef = sub { print "Boink!\n" };
+
+Note the presence of the semicolon.  Except for the fact that the code
+inside isn't executed immediately, a C<sub {}> is not so much a
+declaration as it is an operator, like C<do{}> or C<eval{}>.  (However, no
+matter how many times you execute that line (unless you're in an
+C<eval("...")>), C<$coderef> will still have a reference to the I<SAME>
+anonymous subroutine.)
+
+For those who worry about these things, the current implementation 
+uses shallow binding of local() variables; my() variables are not
+accessible.  This precludes true closures.  However, you can work 
+around this with a run-time (rather than a compile-time) eval():
+
+    {
+	my $x = time;
+	$coderef = eval "sub { \$x }";
+    }
+
+Normally--if you'd used just C<sub{}> or even C<eval{}>--your new sub
+would only have been able to access the global $x.  But because you've
+used a run-time eval(), this will not only generate a brand new subroutine
+reference each time it's called, it will also grant access to the my() variable
+lexically above it rather than the global one.  The particular $x 
+accessed will be different for each new sub you create.  This mechanism
+yields deep binding of variables.  (If you don't know what closures, deep
+binding, or shallow binding are, don't worry too much about it.)
+
+=item 5.
+
+References are often returned by special subroutines called constructors.
+Perl objects are just references to a special kind of object that happens to know
+which package it's associated with.  Constructors are just special
+subroutines that know how to create that association.  They do so by
+starting with an ordinary reference, and it remains an ordinary reference
+even while it's also being an object.  Constructors are customarily
+named new(), but don't have to be:
+
+    $objref = new Doggie (Tail => 'short', Ears => 'long');
+
+=item 6.
+
+References of the appropriate type can spring into existence if you
+dereference them in a context that assumes they exist.  Since we haven't
+talked about dereferencing yet, we can't show you any examples yet.
+
+=back
+
+That's it for creating references.  By now you're probably dying to
+know how to use references to get back to your long-lost data.  There
+are several basic methods.
+
+=over 4
+
+=item 1.
+
+Anywhere you'd put an identifier as part of a variable or subroutine
+name, you can replace the identifier with a simple scalar variable
+containing a reference of the correct type:
+
+    $bar = $$scalarref;
+    push(@$arrayref, $filename);
+    $$arrayref[0] = "January";
+    $$hashref{"KEY"} = "VALUE";
+    &$coderef(1,2,3);
+
+It's important to understand that we are specifically I<NOT> dereferencing
+C<$arrayref[0]> or C<$hashref{"KEY"}> there.  The dereference of the
+scalar variable happens I<BEFORE> it does any key lookups.  Anything more
+complicated than a simple scalar variable must use methods 2 or 3 below.
+However, a "simple scalar" includes an identifier that itself uses method
+1 recursively.  Therefore, the following prints "howdy".
+
+    $refrefref = \\\"howdy";
+    print $$$$refrefref;
+
+=item 2.
+
+Anywhere you'd put an identifier as part of a variable or subroutine
+name, you can replace the identifier with a BLOCK returning a reference
+of the correct type.  In other words, the previous examples could be
+written like this:
+
+    $bar = ${$scalarref};
+    push(@{$arrayref}, $filename);
+    ${$arrayref}[0] = "January";
+    ${$hashref}{"KEY"} = "VALUE";
+    &{$coderef}(1,2,3);
+
+Admittedly, it's a little silly to use the curlies in this case, but
+the BLOCK can contain any arbitrary expression, in particular,
+subscripted expressions:
+
+    &{ $dispatch{$index} }(1,2,3);	# call correct routine 
+
+Because of being able to omit the curlies for the simple case of C<$$x>,
+people often make the mistake of viewing the dereferencing symbols as
+proper operators, and wonder about their precedence.  If they were,
+though, you could use parens instead of braces.  That's not the case.
+Consider the difference below; case 0 is a short-hand version of case 1,
+I<NOT> case 2:
+
+    $$hashref{"KEY"}   = "VALUE";	# CASE 0
+    ${$hashref}{"KEY"} = "VALUE";	# CASE 1
+    ${$hashref{"KEY"}} = "VALUE";	# CASE 2
+    ${$hashref->{"KEY"}} = "VALUE";	# CASE 3
+
+Case 2 is also deceptive in that you're accessing a variable
+called %hashref, not dereferencing through $hashref to the hash
+it's presumably referencing.  That would be case 3.
+
+=item 3.
+
+The case of individual array elements arises often enough that it gets
+cumbersome to use method 2.  As a form of syntactic sugar, the two
+element assignments shown under method 2 can be written:
+
+    $arrayref->[0] = "January";
+    $hashref->{"KEY"} = "VALUE";
+
+The left side of the arrow can be any expression returning a reference,
+including a previous dereference.  Note that C<$array[$x]> is I<NOT> the
+same thing as C<$array-E<gt>[$x]> here:
+
+    $array[$x]->{"foo"}->[0] = "January";
+
+This is one of the cases we mentioned earlier in which references could
+spring into existence when in an lvalue context.  Before this
+statement, C<$array[$x]> may have been undefined.  If so, it's
+automatically defined with a hash reference so that we can look up
+C<{"foo"}> in it.  Likewise C<$array[$x]-E<gt>{"foo"}> will automatically get
+defined with an array reference so that we can look up C<[0]> in it.
+
+One more thing here.  The arrow is optional I<BETWEEN> bracket
+subscripts, so you can shrink the above down to
+
+    $array[$x]{"foo"}[0] = "January";
+
+Which, in the degenerate case of using only ordinary arrays, gives you
+multidimensional arrays just like C's:
+
+    $score[$x][$y][$z] += 42;
+
+Well, okay, not entirely like C's arrays, actually.  C doesn't know how
+to grow its arrays on demand.  Perl does.
+
+=item 4.
+
+If a reference happens to be a reference to an object, then there are
+probably methods to access the things referred to, and you should probably
+stick to those methods unless you're in the class package that defines the
+object's methods.  In other words, be nice, and don't violate the object's
+encapsulation without a very good reason.  Perl does not enforce
+encapsulation.  We are not totalitarians here.  We do expect some basic
+civility though.
+
+=back
+
+The ref() operator may be used to determine what type of thing the
+reference is pointing to.  See L<perlfunc>.
+
+The bless() operator may be used to associate a reference with a package
+functioning as an object class.  See L<perlobj>.
+
+A type glob may be dereferenced the same way a reference can, since
+the dereference syntax always indicates the kind of reference desired.
+So C<${*foo}> and C<${\$foo}> both indicate the same scalar variable.
+
+Here's a trick for interpolating a subroutine call into a string:
+
+    print "My sub returned ${\mysub(1,2,3)}\n";
+
+The way it works is that when the C<${...}> is seen in the double-quoted
+string, it's evaluated as a block.  The block executes the call to
+C<mysub(1,2,3)>, and then takes a reference to that.  So the whole block
+returns a reference to a scalar, which is then dereferenced by C<${...}>
+and stuck into the double-quoted string.
+
+=head2 Symbolic references
+
+We said that references spring into existence as necessary if they are
+undefined, but we didn't say what happens if a value used as a
+reference is already defined, but I<ISN'T> a hard reference.  If you
+use it as a reference in this case, it'll be treated as a symbolic
+reference.  That is, the value of the scalar is taken to be the I<NAME>
+of a variable, rather than a direct link to a (possibly) anonymous
+value.
+
+People frequently expect it to work like this.  So it does.
+
+    $name = "foo";
+    $$name = 1;			# Sets $foo
+    ${$name} = 2;		# Sets $foo
+    ${$name x 2} = 3;		# Sets $foofoo
+    $name->[0] = 4;		# Sets $foo[0]
+    @$name = ();		# Clears @foo
+    &$name();			# Calls &foo() (as in Perl 4)
+    $pack = "THAT";
+    ${"${pack}::$name"} = 5;	# Sets $THAT::foo without eval
+
+This is very powerful, and slightly dangerous, in that it's possible
+to intend (with the utmost sincerity) to use a hard reference, and
+accidentally use a symbolic reference instead.  To protect against
+that, you can say
+
+    use strict 'refs';
+
+and then only hard references will be allowed for the rest of the enclosing
+block.  An inner block may countermand that with 
+
+    no strict 'refs';
+
+Only package variables are visible to symbolic references.  Lexical
+variables (declared with my()) aren't in a symbol table, and thus are
+invisible to this mechanism.  For example:
+
+    local($value) = 10;
+    $ref = \$value;
+    {
+	my $value = 20;
+	print $$ref;
+    } 
+
+This will still print 10, not 20.  Remember that local() affects package
+variables, which are all "global" to the package.
+
+=head2 Further Reading
+
+Besides the obvious documents, source code can be instructive.
+Some rather pathological examples of the use of references can be found
+in the F<t/op/ref.t> regression test in the Perl source directory.
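The constructors and dereferencing forms described in the patch above can be tried out with a short script. What follows is a minimal sketch and is not part of the commit: it assumes a present-day perl run under strict and warnings, and the names @months, $nested, @array and mysub are illustrative only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Construct references with the backslash operator and with the
# anonymous array, hash and subroutine constructors.
my @months   = ("January", "February");
my $arrayref = \@months;
my $nested   = [1, 2, ['a', 'b', 'c']];
my $hashref  = { 'Adam' => 'Eve', 'Clyde' => 'Bonnie' };
my $coderef  = sub { return "Boink: @_" };

# Dereference the same element three equivalent ways:
# bare sigil (method 1), BLOCK (method 2) and arrow (method 3).
my @first = ($$arrayref[0], ${$arrayref}[0], $arrayref->[0]);

# The arrow is optional between subscripts.
my $inner = $nested->[2][1];

# Used in an lvalue context, the intermediate hash and array
# references spring into existence on their own.
my @array;
$array[3]{"foo"}[0] = "January";

# ref() reports what kind of thing each reference points to.
my @kinds = map { ref($_) } ($arrayref, $hashref, $coderef, \"howdy");

# Interpolate a subroutine call into a string via ${\ ...}.
sub mysub { return $_[0] + $_[1] + $_[2] }
my $line = "My sub returned ${\ mysub(1,2,3)}";

print "@first\n";                                        # January January January
print "$inner\n";                                        # b
print ref($array[3]), " ", ref($array[3]{"foo"}), "\n";  # HASH ARRAY
print "@kinds\n";                                        # ARRAY HASH CODE SCALAR
print &$coderef(1, 2, 3), "\n";                          # Boink: 1 2 3
print "$line\n";                                         # My sub returned 6
```

Because the script enables strict 'refs', none of these dereferences can silently fall back to a symbolic reference; every one goes through a hard reference.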