diff options
author | Ilya Zakharevich <ilya@math.berkeley.edu> | 1998-07-25 17:28:16 -0400 |
---|---|---|
committer | Gurusamy Sarathy <gsar@cpan.org> | 1998-08-01 19:52:19 +0000 |
commit | ee239bfe47dc5d504cf50bb8f48401031aa791d7 (patch) | |
tree | d639b7f5512d058ca4c8b24cbcd2f0fc5efab604 /lib/overload.pm | |
parent | 3654eb6c94c503df3bbf29cfeb2429609f7a0879 (diff) | |
download | perl-ee239bfe47dc5d504cf50bb8f48401031aa791d7.tar.gz |
fixes for overloading bugs and docs, tweaked some
Message-Id: <199807260128.VAA10543@monk.mps.ohio-state.edu>
Subject: [PATCH 5.004_76] better overloading
p4raw-id: //depot/maint-5.005/perl@1677
Diffstat (limited to 'lib/overload.pm')
-rw-r--r-- | lib/overload.pm | 543 |
1 files changed, 531 insertions, 12 deletions
diff --git a/lib/overload.pm b/lib/overload.pm index dfcdb02b1e..43fef8ae5e 100644 --- a/lib/overload.pm +++ b/lib/overload.pm @@ -62,7 +62,10 @@ sub OverloadedStringify { my $package = shift; $package = ref $package if ref $package; #$package->can('(""') - ov_method mycan($package, '(""'), $package; + ov_method mycan($package, '(""'), $package + or ov_method mycan($package, '(0+'), $package + or ov_method mycan($package, '(bool'), $package + or ov_method mycan($package, '(nomethod'), $package; } sub Method { @@ -108,6 +111,18 @@ sub mycan { # Real can would leave stubs. 'qr' => 0x10000, ); +%ops = ( with_assign => "+ - * / % ** << >> x .", + assign => "+= -= *= /= %= **= <<= >>= x= .=", + str_comparison => "< <= > >= == !=", + '3way_comparison'=> "<=> cmp", + num_comparison => "lt le gt ge eq ne", + binary => "& | ^", + unary => "neg ! ~", + mutators => '++ --', + func => "atan2 cos sin exp abs log sqrt", + conversion => 'bool "" 0+', + special => 'nomethod fallback ='); + sub constant { # Arguments: what, sub while (@_) { @@ -220,7 +235,8 @@ the arguments are reversed. the current operation is an assignment variant (as in C<$a+=7>), but the usual function is called instead. This additional -information can be used to generate some optimizations. +information can be used to generate some optimizations. Compare +L<Calling Conventions for Mutators>. =back @@ -230,9 +246,67 @@ Unary operation are considered binary operations with the second argument being C<undef>. Thus the functions that overloads C<{"++"}> is called with arguments C<($a,undef,'')> when $a++ is executed. +=head2 Calling Conventions for Mutators + +Two types of mutators have different calling conventions: + +=over + +=item C<++> and C<--> + +The routines which implement these operators are expected to actually +I<mutate> their arguments. So, assuming that $obj is a reference to a +number, + + sub incr { my $n = $ {$_[0]}; ++$n; $_[0] = bless \$n} + +is an appropriate implementation of overloaded C<++>. Note that + + sub incr { ++$ {$_[0]} ; shift } + +is OK if used with preincrement and with postincrement. (In the case +of postincrement a copying will be performed, see L<Copy Constructor>.) + +=item C<x=> and other assignment versions + +There is nothing special about these methods. They may change the +value of their arguments, and may leave it as is. The result is going +to be assigned to the value in the left-hand-side if different from +this value. + +This allows for the same method to be used as averloaded C<+=> and +C<+>. Note that this is I<allowed>, but not recommended, since by the +semantic of L<"Fallback"> Perl will call the method for C<+> anyway, +if C<+=> is not overloaded. + +=back + +B<Warning.> Due to the presense of assignment versions of operations, +routines which may be called in assignment context may create +self-referencial structures. Currently Perl will not free self-referential +structures until cycles are C<explicitly> broken. You may get problems +when traversing your structures too. + +Say, + + use overload '+' => sub { bless [ \$_[0], \$_[1] ] }; + +is asking for trouble, since for code C<$obj += $foo> the subroutine +is called as C<$obj = add($obj, $foo, undef)>, or C<$obj = [\$obj, +\$foo]>. If using such a subroutine is an important optimization, one +can overload C<+=> explicitly by a non-"optimized" version, or switch +to non-optimized version if C<not defined $_[2]> (see +L<Calling Conventions for Binary Operations>). + +Even if no I<explicit> assignment-variants of operators are present in +the script, they may be generated by the optimizer. Say, C<",$obj,"> or +C<',' . $obj . ','> may be both optimized to + + my $tmp = ',' . $obj; $tmp .= ','; + =head2 Overloadable Operations -The following symbols can be specified in C<use overload>: +The following symbols can be specified in C<use overload> directive: =over 5 @@ -247,6 +321,10 @@ the assignment variant is not available. Methods for operations "C<+>", increment and decrement methods. The operation "C<->" can be used to autogenerate missing methods for unary minus or C<abs>. +See L<"MAGIC AUTOGENERATION">, L<"Calling Conventions for Mutators"> and +L<"Calling Conventions for Binary Operations">) for details of these +substitutions. + =item * I<Comparison operations> "<", "<=", ">", ">=", "==", "!=", "<=>", @@ -298,7 +376,23 @@ see L<SPECIAL SYMBOLS FOR C<use overload>>. =back -See L<"Fallback"> for an explanation of when a missing method can be autogenerated. +See L<"Fallback"> for an explanation of when a missing method can be +autogenerated. + +A computer-readable form of the above table is available in the hash +%overload::ops, with values being space-separated lists of names: + + with_assign => '+ - * / % ** << >> x .', + assign => '+= -= *= /= %= **= <<= >>= x= .=', + str_comparison => '< <= > >= == !=', + '3way_comparison'=> '<=> cmp', + num_comparison => 'lt le gt ge eq ne', + binary => '& | ^', + unary => 'neg ! ~', + mutators => '++ --', + func => 'atan2 cos sin exp abs log sqrt', + conversion => 'bool "" 0+', + special => 'nomethod fallback =' =head2 Inheritance and overloading @@ -401,15 +495,15 @@ to a reference that shares its object with some other reference, such as $a=$b; - $a++; + ++$a; To make this change $a and not change $b, a copy of C<$$a> is made, and $a is assigned a reference to this new object. This operation is -done during execution of the C<$a++>, and not during the assignment, +done during execution of the C<++$a>, and not during the assignment, (so before the increment C<$$a> coincides with C<$$b>). This is only -done if C<++> is expressed via a method for C<'++'> or C<'+='>. Note -that if this operation is expressed via C<'+'> a nonmutator, i.e., as -in +done if C<++> is expressed via a method for C<'++'> or C<'+='> (or +C<nomethod>). Note that if this operation is expressed via C<'+'> +a nonmutator, i.e., as in $a=$b; $a=$a+1; @@ -443,6 +537,9 @@ C<'='> was overloaded with C<\&clone>. =back +Same behaviour is triggered by C<$b = $a++>, which is consider a synonim for +C<$b = $a; ++$a>. + =head1 MAGIC AUTOGENERATION If a method for an operation is not found, and the value for C<"fallback"> is @@ -499,7 +596,7 @@ value is a scalar and not a reference. =back -=head1 WARNING +=head1 Losing overloading The restriction for the comparison operation is that even if, for example, `C<cmp>' should return a blessed reference, the autogenerated `C<lt>' @@ -661,6 +758,416 @@ behavior by defining your own copy constructor (see L<"Copy Constructor">). It is expected that arguments to methods that are not explicitly supposed to be changed are constant (but this is not enforced). +=head1 Metaphor clash + +One may wonder why the semantic of overloaded C<=> is so counterintuive. +If it I<looks> counterintuive to you, you are subject to a metaphor +clash. + +Here is a Perl object metaphor: + +I< object is a reference to blessed data> + +and an arithmetic metaphor: + +I< object is a thing by itself>. + +The I<main> problem of overloading C<=> is the fact that these metaphors +imply different actions on the assignment C<$a = $b> if $a and $b are +objects. Perl-think implies that $a becomes a reference to whatever +$b was referencing. Arithmetic-think implies that the value of "object" +$a is changed to become the value of the object $b, preserving the fact +that $a and $b are separate entities. + +The difference is not relevant in the absence of mutators. After +a Perl-way assignment an operation which mutates the data referenced by $a +would change the data referenced by $b too. Effectively, after +C<$a = $b> values of $a and $b become I<indistinguishable>. + +On the other hand, anyone who has used algebraic notation knows the +expressive power of the arithmetic metaphor. Overloading works hard +to enable this metaphor while preserving the Perlian way as far as +possible. Since it is not not possible to freely mix two contradicting +metaphors, overloading allows the arithmetic way to write things I<as +far as all the mutators are called via overloaded access only>. The +way it is done is described in L<Copy Constructor>. + +If some mutator methods are directly applied to the overloaded values, +one may need to I<explicitly unlink> other values which references the +same value: + + $a = new Data 23; + ... + $b = $a; # $b is "linked" to $a + ... + $a = $a->clone; # Unlink $b from $a + $a->increment_by(4); + +Note that overloaded access makes this transparent: + + $a = new Data 23; + $b = $a; # $b is "linked" to $a + $a += 4; # would unlink $b automagically + +However, it would not make + + $a = new Data 23; + $a = 4; # Now $a is a plain 4, not 'Data' + +preserve "objectness" of $a. But Perl I<has> a way to make assignments +to an object do whatever you want. It is just not the overload, but +tie()ing interface (see L<perlfunc/tie>). Adding a FETCH() method +which returns the object itself, and STORE() method which changes the +value of the object, one can reproduce the arithmetic metaphor in its +completeness, at least for variables which were tie()d from the start. + +(Note that a workaround for a bug may be needed, see L<"BUGS">.) + +=head1 Cookbook + +Please add examples to what follows! + +=head2 Two-face scalars + +Put this in F<two_face.pm> in your Perl library directory: + + package two_face; # Scalars with separate string and + # numeric values. + sub new { my $p = shift; bless [@_], $p } + use overload '""' => \&str, '0+' => \&num, fallback => 1; + sub num {shift->[1]} + sub str {shift->[0]} + +Use it as follows: + + require two_face; + my $seven = new two_face ("vii", 7); + printf "seven=$seven, seven=%d, eight=%d\n", $seven, $seven+1; + print "seven contains `i'\n" if $seven =~ /i/; + +(The second line creates a scalar which has both a string value, and a +numeric value.) This prints: + + seven=vii, seven=7, eight=8 + seven contains `i' + +=head2 Symbolic calculator + +Put this in F<symbolic.pm> in your Perl library directory: + + package symbolic; # Primitive symbolic calculator + use overload nomethod => \&wrap; + + sub new { shift; bless ['n', @_] } + sub wrap { + my ($obj, $other, $inv, $meth) = @_; + ($obj, $other) = ($other, $obj) if $inv; + bless [$meth, $obj, $other]; + } + +This module is very unusual as overloaded modules go: it does not +provide any usual overloaded operators, instead it provides the L<Last +Resort> operator C<nomethod>. In this example the corresponding +subroutine returns an object which encupsulates operations done over +the objects: C<new symbolic 3> contains C<['n', 3]>, C<2 + new +symbolic 3> contains C<['+', 2, ['n', 3]]>. + +Here is an example of the script which "calculates" the side of +circumscribed octagon using the above package: + + require symbolic; + my $iter = 1; # 2**($iter+2) = 8 + my $side = new symbolic 1; + my $cnt = $iter; + + while ($cnt--) { + $side = (sqrt(1 + $side**2) - 1)/$side; + } + print "OK\n"; + +The value of $side is + + ['/', ['-', ['sqrt', ['+', 1, ['**', ['n', 1], 2]], + undef], 1], ['n', 1]] + +Note that while we obtained this value using a nice little script, +there is no simple way to I<use> this value. In fact this value may +be inspected in debugger (see L<perldebug>), but ony if +C<bareStringify> B<O>ption is set, and not via C<p> command. + +If one attempts to print this value, then the overloaded operator +C<""> will be called, which will call C<nomethod> operator. The +result of this operator will be stringified again, but this result is +again of type C<symbolic>, which will lead to an infinite loop. + +Add a pretty-printer method to the module F<symbolic.pm>: + + sub pretty { + my ($meth, $a, $b) = @{+shift}; + $a = 'u' unless defined $a; + $b = 'u' unless defined $b; + $a = $a->pretty if ref $a; + $b = $b->pretty if ref $b; + "[$meth $a $b]"; + } + +Now one can finish the script by + + print "side = ", $side->pretty, "\n"; + +The method C<pretty> is doing object-to-string conversion, so it +is natural to overload the operator C<""> using this method. However, +inside such a method it is not necessary to pretty-print the +I<components> $a and $b of an object. In the above subroutine +C<"[$meth $a $b]"> is a catenation of some strings and components $a +and $b. If these components use overloading, the catenation operator +will look for an overloaded operator C<.>, if not present, it will +look for an overloaded operator C<"">. Thus it is enough to use + + use overload nomethod => \&wrap, '""' => \&str; + sub str { + my ($meth, $a, $b) = @{+shift}; + $a = 'u' unless defined $a; + $b = 'u' unless defined $b; + "[$meth $a $b]"; + } + +Now one can change the last line of the script to + + print "side = $side\n"; + +which outputs + + side = [/ [- [sqrt [+ 1 [** [n 1 u] 2]] u] 1] [n 1 u]] + +and one can inspect the value in debugger using all the possible +methods. + +Something is is still amiss: consider the loop variable $cnt of the +script. It was a number, not an object. We cannot make this value of +type C<symbolic>, since then the loop will not terminate. + +Indeed, to terminate the cycle, the $cnt should become false. +However, the operator C<bool> for checking falsity is overloaded (this +time via overloaded C<"">), and returns a long string, thus any object +of type C<symbolic> is true. To overcome this, we need a way to +compare an object to 0. In fact, it is easier to write a numeric +conversion routine. + +Here is the text of F<symbolic.pm> with such a routine added (and +slightly modifed str()): + + package symbolic; # Primitive symbolic calculator + use overload + nomethod => \&wrap, '""' => \&str, '0+' => \# + + sub new { shift; bless ['n', @_] } + sub wrap { + my ($obj, $other, $inv, $meth) = @_; + ($obj, $other) = ($other, $obj) if $inv; + bless [$meth, $obj, $other]; + } + sub str { + my ($meth, $a, $b) = @{+shift}; + $a = 'u' unless defined $a; + if (defined $b) { + "[$meth $a $b]"; + } else { + "[$meth $a]"; + } + } + my %subr = ( n => sub {$_[0]}, + sqrt => sub {sqrt $_[0]}, + '-' => sub {shift() - shift()}, + '+' => sub {shift() + shift()}, + '/' => sub {shift() / shift()}, + '*' => sub {shift() * shift()}, + '**' => sub {shift() ** shift()}, + ); + sub num { + my ($meth, $a, $b) = @{+shift}; + my $subr = $subr{$meth} + or die "Do not know how to ($meth) in symbolic"; + $a = $a->num if ref $a eq __PACKAGE__; + $b = $b->num if ref $b eq __PACKAGE__; + $subr->($a,$b); + } + +All the work of numeric conversion is done in %subr and num(). Of +course, %subr is not complete, it contains only operators used in teh +example below. Here is the extra-credit question: why do we need an +explicit recursion in num()? (Answer is at the end of this section.) + +Use this module like this: + + require symbolic; + my $iter = new symbolic 2; # 16-gon + my $side = new symbolic 1; + my $cnt = $iter; + + while ($cnt) { + $cnt = $cnt - 1; # Mutator `--' not implemented + $side = (sqrt(1 + $side**2) - 1)/$side; + } + printf "%s=%f\n", $side, $side; + printf "pi=%f\n", $side*(2**($iter+2)); + +It prints (without so many line breaks) + + [/ [- [sqrt [+ 1 [** [/ [- [sqrt [+ 1 [** [n 1] 2]]] 1] + [n 1]] 2]]] 1] + [/ [- [sqrt [+ 1 [** [n 1] 2]]] 1] [n 1]]]=0.198912 + pi=3.182598 + +The above module is very primitive. It does not implement +mutator methods (C<++>, C<-=> and so on), does not do deep copying +(not required without mutators!), and implements only those arithmetic +operations which are used in the example. + +To implement most arithmetic operattions is easy, one should just use +the tables of operations, and change the code which fills %subr to + + my %subr = ( 'n' => sub {$_[0]} ); + foreach my $op (split " ", $overload::ops{with_assign}) { + $subr{$op} = $subr{"$op="} = eval "sub {shift() $op shift()}"; + } + my @bins = qw(binary 3way_comparison num_comparison str_comparison); + foreach my $op (split " ", "@overload::ops{ @bins }") { + $subr{$op} = eval "sub {shift() $op shift()}"; + } + foreach my $op (split " ", "@overload::ops{qw(unary func)}") { + print "defining `$op'\n"; + $subr{$op} = eval "sub {$op shift()}"; + } + +Due to L<Calling Conventions for Mutators>, we do not need anything +special to make C<+=> and friends work, except filling C<+=> entry of +%subr, and defining a copy constructor (needed since Perl has no +way to know that the implementation of C<'+='> does not mutate +the argument, compare L<Copy Constructor>). + +To implement a copy constructor, add C<'=' => \&cpy> to C<use overload> +line, and code (this code assumes that mutators change things one level +deep only, so recursive copying is not needed): + + sub cpy { + my $self = shift; + bless [@$self], ref $self; + } + +To make C<++> and C<--> work, we need to implement actual mutators, +either directly, or in C<nomethod>. We continue to do things inside +C<nomethod>, thus add + + if ($meth eq '++' or $meth eq '--') { + @$obj = ($meth, (bless [@$obj]), 1); # Avoid circular reference + return $obj; + } + +after the first line of wrap(). This is not a most effective +implementation, one may consider + + sub inc { $_[0] = bless ['++', shift, 1]; } + +instead. + +As a final remark, note that one can fill %subr by + + my %subr = ( 'n' => sub {$_[0]} ); + foreach my $op (split " ", $overload::ops{with_assign}) { + $subr{$op} = $subr{"$op="} = eval "sub {shift() $op shift()}"; + } + my @bins = qw(binary 3way_comparison num_comparison str_comparison); + foreach my $op (split " ", "@overload::ops{ @bins }") { + $subr{$op} = eval "sub {shift() $op shift()}"; + } + foreach my $op (split " ", "@overload::ops{qw(unary func)}") { + $subr{$op} = eval "sub {$op shift()}"; + } + $subr{'++'} = $subr{'+'}; + $subr{'--'} = $subr{'-'}; + +This finishes implementation of a primitive symbolic calculator in +50 lines of Perl code. Since the numeric values of subexpressions +are not cached, the calculator is very slow. + +Here is the answer for the exercise: In the case of str(), we need no +explicit recursion since the overloaded C<.>-operator will fall back +to an existing overloaded operator C<"">. Overloaded arithmetic +operators I<do not> fall back to numeric conversion if C<fallback> is +not explicitly requested. Thus without an explicit recursion num() +would convert C<['+', $a, $b]> to C<$a + $b>, which would just rebuild +the argument of num(). + +If you wonder why defaults for conversion are different for str() and +num(), note how easy it was to write the symbolic calculator. This +simplicity is due to an appropriate choice of defaults. One extra +note: due to teh explicit recursion num() is more fragile than sym(): +we need to explicitly check for the type of $a and $b. If componets +$a and $b happen to be of some related type, this may lead to problems. + +=head2 I<Really> symbolic calculator + +One may wonder why we call the above calculator symbolic. The reason +is that the actual calculation of the value of expression is postponed +until the value is I<used>. + +To see it in action, add a method + + sub STORE { + my $obj = shift; + $#$obj = 1; + @$obj->[0,1] = ('=', shift); + } + +to the package C<symbolic>. After this change one can do + + my $a = new symbolic 3; + my $b = new symbolic 4; + my $c = sqrt($a**2 + $b**2); + +and the numeric value of $c becomes 5. However, after calling + + $a->STORE(12); $b->STORE(5); + +the numeric value of $c becomes 13. There is no doubt now that the module +symbolic provides a I<symbolic> calculator indeed. + +To hide the rough edges under the hood, provide a tie()d interface to the +package C<symbolic> (compare with L<Metaphor clash>). Add methods + + sub TIESCALAR { my $pack = shift; $pack->new(@_) } + sub FETCH { shift } + sub nop { } # Around a bug + +(the bug is described in L<"BUGS">). One can use this new interface as + + tie $a, 'symbolic', 3; + tie $b, 'symbolic', 4; + $a->nop; $b->nop; # Around a bug + + my $c = sqrt($a**2 + $b**2); + +Now numeric value of $c is 5. After C<$a = 12; $b = 5> the numeric value +of $c becomes 13. To insulate the user of the module add a method + + sub vars { my $p = shift; tie($_, $p), $_->nop foreach @_; } + +Now + + my ($a, $b); + symbolic->vars($a, $b); + my $c = sqrt($a**2 + $b**2); + + $a = 3; $b = 4; + printf "c5 %s=%f\n", $c, $c; + + $a = 12; $b = 5; + printf "c13 %s=%f\n", $c, $c; + +shows that the numeric value of $c follows changes to the values of $a +and $b. + =head1 AUTHOR Ilya Zakharevich E<lt>F<ilya@math.mps.ohio-state.edu>E<gt>. @@ -676,7 +1183,7 @@ this overloading). Say, if C<eq> is overloaded, then the method C<(eq> is shown by debugger. The method C<()> corresponds to the C<fallback> key (in fact a presence of this method shows that this package has overloading enabled, and it is what is used by the C<Overloaded> -function). +function of module C<overload>). =head1 BUGS @@ -689,9 +1196,21 @@ C<fallback> is present (possibly undefined). This may create interesting effects if some package is not overloaded, but inherits from two overloaded packages. +Relation between overloading and tie()ing is broken. Overloading is +triggered or not basing on the I<previous> class of tie()d value. + +This happens because the presence of overloading is checked too early, +before any tie()d access is attempted. If the FETCH()ed class of the +tie()d value does not change, a simple workaround is to access the value +immediately after tie()ing, so that after this call the I<previous> class +coincides with the current one. + +B<Needed:> a way to fix this without a speed penalty. + Barewords are not covered by overloaded string constants. -This document is confusing. +This document is confusing. There are grammos and misleading language +used in places. It would seem a total rewrite is needed. =cut |