.. -*- mode: rst -*-

:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
OBJECT ORIENTED PROGRAMMING IN PYTHON
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

:Version: 0.5
:Author: Michele Simionato
:E-mail: mis6@pitt.edu
:Home-page: http://www.phyast.pitt.edu/~micheles/

:Disclaimer: I release this book to the general public. 
  It can be freely distributed if unchanged.
  As usual, I don't give any warranty: while I have tried hard to ensure the
  correctness of what follows, I disclaim any responsability in case of
  errors . Use it at your own risk and peril !

.. contents::

.. raw:: latex

  \setcounter{chapter}{-1}

Preface
============

 .. line-block::

  *There is only one way to learn: trough examples*

The philosophy of this book
---------------------------

This book is written with the intent to help the programmer going trough
the fascinating concepts of Object Oriented Programming (OOP), in their
Python incarnation. Notice that I say to help, not to teach. Actually,
I do not think that a book can teach OOP or any other non-trivial matter 
in Computer Science or other disciplines. Only the
practice can teach: practice, then practice, and practice again. 
You must learn yourself from your experiments, not from the books. 
Nevertheless, books are useful. They cannot teach, but they can help. 
They should give you new ideas that you was not thinking about, they should
show tricks you do not find in the manual, and in general they should be of
some guidance in the uphill road to knowledge. That is the philosophy
of this book. For this reason

1. It is not comprehensive, not systematic; 
it is intended to give ideas and basis: from
that the reader is expected to cover the missing part on his own,
browsing the documentation, other sources and other books, and finally
the definite autority, the source itself.

2. It will not even try to teach the *best* practices. I will show what you can
do with Python, not what you "should" do. Often I will show solutions that are
not recommended. I am not a mammy saying this is
good, this is bad, do this do that.  


3. You can only learn from your failures. If you think "it should work, if I do
X and Y" and it works, then you have learned nothing new. 
You have merely verified
that your previous knowledge was correct, but you haven't create a new
knowledge. On the other hand, when you think "it should work, if I do
X and Y" and it doesn't, then you have learned that your previous knowlegde
was wrong or incomplete, and you are forced to learn something new to
overcome the difficulty. For this reason, I think it is useful to report
not only how to do something, but also to report how not to do something, 
showing the pitfalls of wrong approaches.

That's in my opinion is the goal of a good book. I don't know if have
reached this goal or not (the decision is up to the reader), but at least
I have tried to follow these guidelines.

Moreover, this is not a book on OOP, 
it is a book on OOP *in Python*. 

In other words, the point of view of this book is not 
to emphasize general topics of OOP that are exportable to other languages, 
but exactly the opposite: I want to emphasize specific techniques that one
can only use in Python, or that are difficult to translate to other 
languages. Moreover, I will not provide comparisons with other 
languages (except for the section "Why Python?" in this introduction and
in few selected other places), 
in order to keep the discussion focused. 

This choice comes from the initial motivation for this book, which was 
to fulfill a gap in the (otherwise excellent) Python documentation. 
The problem is that the available documentation still lacks an accessible 
reference of the new Python 2.2+ object-oriented features.
Since myself I have learned Python and OOP from scratch, 
I have decided to write this book in order to fill that gap and
help others. 

The emphasis in this book is not in giving 
solutions to specific problems (even if most of the recipes of this book
can easily be tailored to solve real life concrete problems), it is in 
teaching  how does it work, why it does work in some cases and why does 
not work in some other cases. Avoiding too specific problems has an
additional bonus, since it allows me to use *short* examples (the majority 
of the scripts presented here is under 20-30 lines) which I think are 
best suited to teach a new matter [#]_ . Notice, however, that whereas
the majority of the scripts in this book are short, it is also true
that they are pretty *dense*. The density is due to various reasons:

1. I am defining a lot of helper functions and classes, that are
   reused and enhanced during all the book.

2. I am doing a strong use of inheritance, therefore a script at the
   end of the book can inherits from the classes defined through all
   the book;

3. A ten line script involving metaclasses can easily perform the equivalent 
   of generating hundreds of lines of code in a language without metaclasses 
   such as Java or C++.

To my knowledge, there are no other books covering the same topics with
the same focus (be warned, however, that I haven't read so many Python 
books ;-). The two references that come closest to the present book are
the ``Python Cookbook`` by Alex Martelli and David Ascher, and
Alex Martelli's ``Python in a Nutshell``. They are quite recent books and 
therefore it covers (in much less detail) some of the 2.2 features that are 
the central topics to this book. 
However, the Cookbook reserves to OOP only one chapter and has a quite 
different philosophy from the present book, therefore there is 
practically no overlapping. Also ``Python in a Nutshell`` covers 
metaclasses in few pages, whereas half of this book is essentially
dedied to them. This means that you can read both ;-)
 

.. [#] Readers that prefer the  opposite philosophy of using longer, 
       real life-like, examples, have already the excellent "Dive into 
       Python" book http://diveintopython.org/ at their disposal. This is 
       a very good book that I certainly recommend to any (experienced) 
       Python programmer; it is also freely available (just like this ;-).
       However, the choice of arguments is quite different and there is 
       essentially no overlap between my book and "Dive into Python" 
       (therefore you can read both ;-).

For who this book in intended
-----------------------------

I have tried to make this tutorial useful to a large public of Pythonistas, 
i.e. both people with no previous experience of Object Oriented Programming
and people with experience on OOP, but unfamiliar with the most
recent Python 2.2-2.3 features (such as attribute descriptors,
metaclasses, change of the MRO in multiple inheritance, etc). 
However, this is not a book for beginners: the non-experienced reader should 
check (at least) the Internet sites www.python.org/newbies.com and 
www.awaretek.com, that provide a nice collection of resources for Python 
newbies.

These are my recommendations for the reader, according to her/his level:

1. If you are an absolute beginner, with no experience on programming,
   this book is *not* for you (yet ;-). Go to 
   http://www.python.org/doc/Newbies.html and read one of the introductive 
   texts listed there, then come back here. I recommend "How to Think Like 
   a Computer Scientist", available for free on the net (see 
   http://www.ibiblio.org/obp/thinkCSpy/); I found it useful myself when 
   I started learning Python; be warned, however, that it refers to the rather 
   old Python version 1.5.2. There are also excellent books 
   on the market (see http://www.awaretek.com/plf.html). 
   http://www.uselesspython.com/ is a good resource to find recensions 
   about available Python books. For free books, look at
   http://www.tcfb.com/freetechbooks/bookphyton.html .
   This is *not* another Python tutorial.

2. If you know already (at least) another programming language, but you don't
   know Python, then this book is *not* for you (again ;-). Read the FAQ, the
   Python Tutorial and play a little with the Standard Library (all this
   material can be downloaded for free from  http://www.python.org), then
   come back here. 

3. If you have passed steps 1 and 2, and you are confortable with Python
   at the level of simple procedural programming, but have no clue about
   objects and classes, *then* this book is for you. Read this book till
   the end and your knowledge of OOP will pass from zero to a quite advanced 
   level (hopefully). Of course, you will have to play with the code in 
   this book and write a lot of code on your own, first ;-)    

4. If you are confortable with Python and you also known OOP from other
   languages or from earlier version of Python, then this book is for
   you, too: you are ready to read the more advanced chapters.

5. If you are a Python guru, then you should read the book, too. I expect
   you will find the errors and send me feedback, helping me to improve
   this tutorial.

About the scripts in this book
-----------------------------------------------------------------------------

All the scripts  in this book are free. You are expected to play
with them, to modify them and to improve them. 

In order to facilitate the extraction of the scripts from the main text, both
visually for the reader and automatically for Python, I use the
convention of sandwiching the body of the example scripts in blocks like this

 ::

  #<myfirstscript.py>
 
  print "Here Starts the Python Way to Object Oriented Programming !"

  #</myfirstscript.py>

You may extract the source of this script with the a Python program
called "test.py" and provided in the distribution. Simply give the 
following command:

 ::

  $ python test.py myfirstscript.py

This will create a file called "myfirstscript.py", containing the
source of ``myfirstscript.py``; moreover it will execute the script 
and write its output in a file called "output.txt". I have tested
all the scripts in this tutorial under Red Hat Linux 7.x and 
Windows 98SE. You should not have any problem in running them,
but if a problem is there, "test.py" will probably discover it,
even if, unfortunately, it will not provide the solution :-(.
Notice that test.py requires Python 2.3+ to work, since most of
the examples in this book heavily depends on the new features
introduced in Python 2.2-2.3. Since the installation of Python 
2.3 is simple, quick and free, I think I am requiring to my readers
who haven't upgraded yet a very little effort. This is well worth
the pain since Python 2.3 fixes few bugs of 2.2 (notably in the subject of
attribute descriptors and the ``super`` built-in) that makes

You may give more arguments to test.py, as in this example:

 ::

  $ python test.py myfirstscript.py mysecondscript.py

The output of both scripts will still be placed in the file "output.txt".
Notice that if you give an argument which is not the name of a script in the
book, it will be simply ignored. Morever, if you will not give any argument,
"test.py" will automatically executes all the tutorial scripts, writing their 
output in "output.txt" [#]_ . You may want to give a look at this file, once 
you have finished the tutorial. It also contains the source code of 
the scripts, for better readability.

Many examples of this tutorial depend on utility functions defined
in a external module called ``oopp`` (``oopp`` is an obvious abbreviation 
for the title of the tutorial). The module ``oopp`` is automatically generated 
by "test.py", which works by extracting from the tutorial 
text blocks of code of the form ``#<oopp.py> something #</oopp.py>`` 
and saving them in a file called "oopp.py". 
Let me give an example. A very recent enhancement to Python (in 
Python 2.3) has been the addition of a built-in boolean type with
values True and False:

 ::

  $ python
  Python 2.3a1 (#1, Jan  6 2003, 10:31:14)
  [GCC 2.96 20000731 (Red Hat Linux 7.2 2.96-108.7.2)] on linux2
  Type "help", "copyright", "credits" or "license" for more information.
  >>> 1+1==2
  True
  >>> 1+1==3
  False
  >>> type(True)
  <type 'bool'>
  >>> type(False)
  <type 'bool'>


However, previous version of Python use the integers 1 and 0 for 
True and False respectively. 

 ::

  $ python
  Python 2.2 (#1, Apr 12 2002, 15:29:57)
  [GCC 2.96 20000731 (Red Hat Linux 7.2 2.96-109)] on linux2
  Type "help", "copyright", "credits" or "license" for more information.
  >>> 1+1==2
  1
  >>> 1+1==3 
  0

Following the 2.3 convension, in this tutorial I will use the names 
``True`` and ``False`` to denotes the numbers 1 and 0 respectively. 
This is automatic in Python 2.2.1+, but not in Python 2.2. Therefore, 
for sake of compatibility, it is convenient to set the values ``True`` 
and ``False`` in our utility module:

 ::

  #<oopp.py>

  import __builtin__
  try: 
      __builtin__.True   #look if True is already defined
  except AttributeError: # if not add True and False to the builtins
      __builtin__.True = 1
      __builtin__.False = 0

  #</oopp.py>


Here there is an example of usage:

 ::

  #<mysecondscript.py>

  import oopp
  print "True =",True,
  print "False =",False

  #</mysecondscript.py>

The output is "True = 1 False = 0" under Python 2.2 and 
"True = True False = False" under Python 2.3+.

.. [#] "test.py", invoked without arguments, does not create '.py' files, 
       since I don't want to kludge the distribution with dozens of ten-line
       scripts. I expect you may want to save only few scripts as standalone
       programs, and cut and paste the others.
       
Conventions used in this book
----------------------------------------------------------------------

Python expressions are denoted with monospaced fonts when in the text.
Sections marked with an asterisk can be skipped in a first reading.
Typically they have the purpose of clarifying some subtle point and
are not needed for the rest of the book. These sections are intended
for the advanced reader, but could confuse the beginner.
An example is the section about the difference between methods and
functions, or the difference between the inheritance constraint and
the metaclass constraint.

Introduction
===========================================================================
       
 .. line-block::

  *A language that doesn't affect the way you think about programming,
  is not worth knowing.* -- Alan Perlis


Why OOP ?
----------------------------

I guess some of my readers, like me, have started programming in the mid-80's,
when traditional (i.e. non object-oriented) Basic and Pascal where popular as 
first languages. At the time OOP was not as pervasive in software development 
how it is now, most of the mainstream languages were non-object-oriented and 
C++ was just being released. That was a time when the transition from 
spaghetti-code to structured code was already well accomplished, but 
the transition from structured programming to (the first phase of) 
OOP was at the beginning. 

Nowaydays, we live in a similar time of transition . Today, the transition 
to (the first phase of) OOP is well accomplished and essentially all 
mainstream
languages support some elementary form of OOP. To be clear, when I say
mainstream langauges, I have in mind Java and C++: C is a remarkable 
exception to the  rule, since it is mainstream but not object-oriented. 

However, both Java an C++ (I mean standard Java and C++, not special
extension like DTS C++, that have quite powerful object oriented features)
are quite poor object-oriented languages: they provides only the most 
elementary aspects of OOP, the features of the *first phase* of OOP.

Hence, today the transition to the *second phase* of OOP is only at the 
beginning, i.e mainstream language are not yet really OO, but they will
become OOP in the near future. 

By second phase of OOP I mean the phase in which the primary
objects of concern for the programmer are no more the objects, but the
metaobjects. In elementary OOP one works on objects, which have attributes
and methods (the evolution of old-fashioned data and functions)  defined
by their classes; in the second phase of OOP one works on classes 
which behavior is described by metaclasses. We no more modify objects 
trough classes: nowadays we modify classes and class hierarchies 
through metaclasses and multiple inheritance. 

It would be tempting to represent the history of programming in the last
quarter of century with an evolutionary table like that:

======================== ==================== ====================== =======
         ~1975                  ~1985                 ~1995           ~2005
======================== ==================== ====================== =======
  procedural programming   OOP1                 OOP2                    ?
  data,functions           objects,classes      classes,metaclasses     ?
======================== ==================== ====================== =======

The problem is that table would be simply wrong, since in truth
Smalltalk had metaclasses already 25 years ago! And also Lisp
had *in nuce* everything a long *long* time ago.
The truth is that certains languages where too much ahead of their 
time ;-)

Therefore, today we already have all the ideas 
and the conceptual tools to go beyond the first phase of OOP 
(they where invented 20-30 years ago), nevertheless those ideas are  
not yet universally known, nor implemented in mainstream languages. 

Fortunately, there are good languages
where you can access the bonus of the second phase of OOP (Smalltalk, CLOS,
Dylan, ...): unfortunately
most of them are academic and/or little known in the real world
(often for purely commercial reasons, since typically languages are not
chosen accordingly to their merits, helas!). Python is an exception to this
rule, in the sense that it is an eminently practical language (it started
as a scripting language to do Operating System administrative jobs), 
which is relatively known and used in that application niche (even if some
people *wrongly* think that should not be used for 'serious' things).

There are various reasons why most mainstream languages are rather
poor languages, i.e. underfeatured languages (as Java) or powerful, but too
tricky to use, as C++. Some are good reasons (for instance *efficiency*: if
efficiency is the first concern, then poor languages can be much
better suited to the goal: for instance Fortran for number crunching
and C for system programming), some are less good (economical 
monopoly). There is nothing to do against these reasons: if you
need efficiency, or if you are forced to use a proprietary language
because it is the language used by your employer. However, if you
are free from these restrictions, there is another reason why you
could not choose to use a poweful language. The reason is that, 
till now, programmers working in the industrial world mostly had simple 
problems (I mean conceptually simple problems). In order to solve
simple problems one does not need a powerful language, and the effort
spent in learning it is not worth.

However, nowadays the situations has changed. Now, with Internet and graphics
programming everywhere, and object-oriented languages so widespread,
now it is the time when actually people *needs* metaprogramming, the
ability to changing classes and programs. Now everybody is programming
in the large.

In this situation, it is justified to spend some time to learn better
way of programming. And of course, it is convenient to start from
the language with the flattest learning curve of all.

Why Python ?
-----------------------------------------------------------------------

 .. line-block::

  *In many ways, it's a dull language, borrowing solid old concepts from 
  many other languages & styles:  boring syntax, unsurprising semantics,
  few automatic coercions, etc etc.  But that's one of the things I like
  about it.*  --Tim Peters on Python, 16 Sep 93

If you are reading this book, I assume you already  have some experience
with Python. If this is the case, you already know the obvious advantages
of Python such as readability, easy of use and short development time.
Nevertheless, you could only have used Python as a fast and simple
scripting language. If you are in this situation, then your risk to
have an incorrect opinion on the language like "it is a nice little
language, but too simple to be useful in 'real' applications". The
truth is that Python is designed to be *simple*, and actually it
is; but by no means it is a "shallow" language. Actually, it goes
quite *deep*, but it takes some time to appreciate this fact.

Let me contrast Python with Lisp, for instance. From the beginning,
Lisp was intended to be a language for experts, for people with difficult 
problems to solve. The first
users of Lisp were academicians, professors of CS and scientists.
On the contrary, from the beginning Python 
was intended to be language for everybody (Python predecessor was ABC, 
a language invented to teach CS to children). Python makes great a first 
language for everybody, whereas Lisp would require especially
clever and motivated students (and we all know that there is lack
of them ;-)

From this difference of origins, Python inherits an easy to learn syntax,
whereas Lisp syntax is horrible for the beginner (even if not as
horrible as C++ syntax ;-)


 .. line-block::
   
    *Macros are a powerful extension to weak languages.
    Powerful languages don't need macros by definition.* 
    -- Christian Tismer on c.l.p. (referring to C)

Despite the differences, Python borrows quite a lot from Lisp and it
is nearly as expressive as it (I say nearly since Python is 
not as powerful as Lisp: by tradition, Lisp has always been on the top of 
hierarchy of programming language with respect to power of abstraction).
It is true that Python lacks some powerful Lisp features: for instance 
Python object model lacks multiple dispatching (for the time being ;-) 
and the language lacks Lisp macros (but this unlikely to change in the 
near future since Pythonistas see the lack of macro as a Good Thing [#]_): 
nevertheless, the point is that Python is much *much* easier to learn. 
You have (nearly) all the power, but without the complexity.

One of the reasons, is that Python
try to be as *less* innovative as
possible: it takes the proven good things from others, more innovative
languages, and avoids their pitfalls. If you are an experienced
programmer , it will be even  easier to you to learn Python, since
there is more or less nothing which is really original to Python.
For instance:

1. the object model is took from languages that are good at it, such
   as Smalltalk; 
2. multiple inheritance has been modeled from languages good in it. such
   as CLOS and Dylan;
3. regular expression follows the road opened by Perl;
4. functional features are borrowed from functional languages;
5. the idea of documentation strings come from Lisp;
6. list comprehension come from Haskell; 
7. iterators and generators come from Icon;
8. etc. etc. (many other points here)

I thinks the really distinctive feature of Python with respect to
any other serious language I know, is that Python is *easy*. You have the 
power (I mean power in conceptual sense, not computational power: in
the sense of computational power the best languages are
non-object-oriented ones) 
of the most powerful languages with a very little investement.
In addition to that, Python has a relatively large user base 
(as compared to Smalltalk or Ruby, or the various fragmented Lisp
communities). Of course, 
there is quite a difference between the user base of Python with
respect to the user base of, let say, VisualBasic or Perl. But 
I would never take in consideration VisualBasic for anything serious, 
whereas Perl is too ugly for my taste ;-).  
Finally, Python is *practical*. With this I mean the fact that 
Python has libraries that
allow the user to do nearly everything, since you can access all the C/C++ 
libraries with little or no effort, and all the Java libraries, though the
Python implementation known as Jython. In particular, one has the choice
between many excellent GUI's trough PyQt, wxPython, Tkinter, etc.

Python started as an Object Oriented Programming
Languages from the beginning, nevertheless is was never intended to be
a *pure* OOPL as SmallTalk or, more recently, Ruby. Python is a 
*multiparadigm*
language such a Lisp, that you choose your programming style according
to your problem: spaghetti-code, structured programming, functional
programming, object-oriented programming are all supported. You can
even write bad code in Python, even if it is less simple than in other
languages ;-). Python is a language which has quite evolved in its twelve
years of life (the first public release was released in February 1991)
and many new features have been integrated in the language with time. 
In particular, Python 2.2 (released in 2002) was a major breakthrough 
in the history of the language
for what concerns support to Object Oriented Programming (OOP). 
Before the 2.2 revolution, Python Object
Orientation was good; now it is *excellent*. All the fundamental features
of OOP, including pretty sophisticated ones, as metaclasses and multiple
inheritance, have now a very good support (the only missing thing is
multiple dispatching).

.. [#] 
       Python lacks macros for an intentional design choice: many people
       in the community (including Guido itself) feel that macros are 
       "too powerful". If you give the user the freedom to create her
       own language, you must face at least three problems: i) the risk
       to split the original language in dozens of different dialects;
       ii) in collaborative projects, the individual programmer must 
       spend an huge amount of time and effort would be spent in learning
       macro systems written by others; iii) not all users are good
       language designers: the programmer will have to fight with badly
       designed macro systems. Due to these problems, it seems unlikely
       that macros will be added to Python in the future.

.. [#] 
       For a good comparison between Python and Lisp I remind the reader to
       the excellent Peter Norvig's article in
       http://www.norvig.com/python-lisp.html

Further thoughts 
---------------------------------------------------------------------------

Actually, the principal reasons why I begun studying 
Python was the documentation and the newsgroup: Python has an outstanding 
freely available documentation and an incredibly helpful newsgroup that
make extremely easy to learn the language. If I had found a comparable 
free documentation/newsgroup for C++ or Lisp, I would have studied that
languages instead. 

Unfortunately, the enormous development at the software level, had no
correspondence with with an appropriate development of documentation.
As a consequence, the many beatiful, powerful and extremely *useful*
new features of Python 2.2+ object orientation are mostly remained
confined to developers and power users: the average Python programmer
has remained a little a part from the rapid development and she
*wrongly* thinks she has no use for the new features. There have
also been *protestations* of the users against developers of the
kind "please, stop adding thousands of complicated new extensions
to the language for which we have no use" !

Extending a language is always a delicate thing to do, for a whole
bunch of reasons:

1. once one extension is done, it is there *forever*.


My experience has been the following.

When I first read about metaclasses, in Guido's essay
"Unifying types and classes in Python 2.2", I thought "Wow,
classes of classes, cool concept, but how useful is it?  
Are metaclasses really providing some new functionality?  
What can I do with metaclasses that I cannot do without?"

Clearly, in these terms, the question is rather retorical, since in principle
any Turing-complete programming languages contains all the features provided 
by metaclasses. Python metaclasses themselves are implemented in C, that has 
no metaclasses. Therefore, my real question was not "What can I do 
with metaclasses that I cannot do without?" but "How big is the convenience 
provided by metaclasses, with respect to my typical applications?".
 
The answer depends on the kind of problem you are considering. For certain
classes of problems it can be *very* large, as I will show in this and in
the next chapters.

I think the biggest advantage of metaclasses is *elegance*. Altough it
is true that most of what you can do with metaclasses, can be done without 
metaclasses, not using metaclasses can result in a much *uglier* solution. 


One needs difficult problems in order to appreciate the advantage
of powerful methods.


If all you need is to write few scripts for copying two or three files,
there is no point in learning OOP.On the other hand, if you only
write simple programs where you define only one of two classes, there
is no point in using metaclasses. Metaclasses becomes relevant only
when you have many classes, whole classes of classes with similar
features that you want to modify.

In this sense, metaprogramming is for experts only, i.e. with people
with difficult problems. The point however, is that nowaydays,
many persons have difficult problems.

Finally, let me conclude this preface by recalling the
gist of Python wisdom.

  >>> import this
  The Zen of Python, by Tim Peters
  .
  Beautiful is better than ugly.
  Explicit is better than implicit.
  Simple is better than complex.
  Complex is better than complicated.
  Flat is better than nested.
  Sparse is better than dense.
  Readability counts.
  Special cases aren't special enough to break the rules.
  Although practicality beats purity.
  Errors should never pass silently.
  Unless explicitly silenced.
  In the face of ambiguity, refuse the temptation to guess.
  There should be one-- and preferably only one --obvious way to do it.
  Although that way may not be obvious at first unless you're Dutch.
  Now is better than never.
  Although never is often better than *right* now.
  If the implementation is hard to explain, it's a bad idea.
  If the implementation is easy to explain, it may be a good idea.
  Namespaces are one honking great idea -- let's do more of those!


FIRST THINGS, FIRST
==============================================================================

This is an introductory chapter, with the main purpose of fixing the 
terminology used in the sequel. In particular, I give the definitions 
of objects, classes, attributes and methods. I discuss a few examples 
and I show some of the most elementary Python introspection features.

What's an object?
----------------------------------------------------------------------------

 .. line-block::

                     *So Everything Is An object.  
                     I'm sure the Smalltalkers are very happy :)*

                     -- Michael Hudson on comp.lang.python

"What's an object" is the obvious question raised by anybody starting 
to learn Object Oriented Programming. The answer is simple: in Python, 
everything in an object!

An operative definition is the following: an *object*
is everything that can be labelled with an *object reference*.

In practical terms, the object reference is implemented as 
the object memory address, that is an integer number which uniquely
specify the object. There is a simple way to retrieve the object reference:
to use the builtin ``id`` function. Informations on ``id`` can be retrieved 
via the ``help`` function [#]_:

  >>> help(id)
  Help on built-in function id:
  id(...)
  id(object) -> integer
  Return the identity of an object. This is guaranteed to be unique among
  simultaneously existing objects. (Hint: it's the object's memory address.)

The reader is strongly encouraged to try the help function on everything
(including help(help) ;-). This is the best way to learn how Python works,
even *better* than reading the standard documentation, since the on-line
help is often more update.

Suppose for instance we wonder if the number ``1`` is an object: 
it is easy enough to ask Python for the answer:

  >>> id(1)
  135383880

Therefore the number 1 is a Python object and it is stored at the memory 
address 135383880, at least in my computer and during the current session.
Notice that the object reference is a dynamic thing; nevertheless it
is guaranteed to be unique and constant for a given object during its 
lifetime (two objects whose lifetimes are disjunct may have the same id() 
value, though).

Here there are other examples of built-in objects:

  >>> id(1L) # long
  1074483312
  >>> id(1.0) #float
  135682468
  >>> id(1j) # complex
  135623440
  >>> id('1') #string
  1074398272
  >>> id([1]) #list
  1074376588
  >>> id((1,)) #tuple
  1074348844
  >>> id({1:1}) # dict
  1074338100

Even functions are objects:
   
  >>> def f(x): return x #user-defined function
  >>> id(f)
  1074292020
  >>> g=lambda x: x #another way to define functions
  >>> id(g)
  1074292468
  >>> id(id) #id itself is a built-in function
  1074278668

Modules are objects, too:

  >>> import math 
  >>> id(math) #module of the standard library
  1074239068
  >>> id(math.sqrt) #function of the standard library
  1074469420

``help`` itself is an object:

  >>> id(help)
  1074373452

Finally, we may notice that the reserved keywords are not objects:

  >>> id(print) #error
  File "<string>", line 1
    id(print)       ^
  SyntaxError: invalid syntax

The operative definition is convenient since it gives a practical way
to check if something is an object and, more importantly, if two
objects are the same or not:

  .. doctest

  >>> s1='spam'
  >>> s2='spam'
  >>> s1==s2
  True
  >>> id(s1)==id(s2)
  True

A more elegant way of spelling ``id(obj1)==id(obj2)`` is to use the
keyword ``is``:

  >>> s1 is s2
  True

However, I should warn the reader that sometimes ``is`` can be surprising:

  >>> id([]) == id([])
  True
  >>> [] is []
  False

This is happening because writing ``id([])`` dynamically creates an unique
object (a list) which goes away when you're finished with it.  So when an
expression needs both at the same time (``[] is []``), two unique objects
are created, but when an expression doesn't need both at the same time
(``id([]) == id([])``), an object gets created with an ID, is destroyed,
and then a second object is created with the same ID (since the last one
just got reclaimed) and their IDs compare equal. In other words, "the 
ID is guaranteed to be unique *only* among simultaneously existing objects".

Another surprise is the following:

  >>> a=1
  >>> b=1
  >>> a is b
  True
  >>> a=556
  >>> b=556
  >>> a is b
  False

The reason is that integers between 0 and 99 are pre-instantiated by the
interpreter, whereas larger integers are recreated each time.

Notice the difference between '==' and 'is':

  >>> 1L==1
  True

but

  >>> 1L is 1
  False 

since they are different objects:

  >>> id(1L) # long 1 
  135625536
  >>> id(1)  # int 1
  135286080


The disadvantage of the operative definition is that it gives little 
understanding of what an object can be used for. To this aim, I must
introduce the concept of *class*.

.. [#] Actually ``help`` is not a function but a callable object. The
       difference will be discussed in a following chapter.

Objects and classes
---------------------------------------------------------------------------

It is convenient to think of an object as an element of a set.

It you think a bit, this is the most general definition that actually 
grasps what we mean by object in the common language.
For instance, consider this book, "Object Oriented Programming in Python":
this book is an object, in the sense that it is a specific representative 
of the *class* of all possible books.
According to this definition, objects are strictly related to classes, and
actually we say that objects are *instances* of classes.

Classes are nested: for
instance this book belongs to the class of books about programming
language, which is a subset of the class of all possible books;
moreover we may further specify this book as a Python book; moreover
we may specify this book as a Python 2.2+ book. There is no limit
to the restrictions we may impose to our classes.
On the other hand. it is convenient to have a "mother" class,
such that any object belongs to it. All strongly Object Oriented
Language have such a class [#]_; in Python it is called *object*. 

The relation between objects and classes in Python can be investigated
trough the built-in function ``type`` [#]_ that gives the class of any 
Python object. 

Let me give some example:

1. Integers numbers are instances of the class ``int`` or ``long``:

  >>> type(1)
  <type 'int'>
  >>> type(1L)
  <type 'long'>

2. Floating point numbers are instances of the class ``float``:
  
  >>> type(1.0)
  <type 'float'>


3. Complex numbers are instances of the class ``complex``:

  >>> type(1.0+1.0j)
  <type 'complex'>

4. Strings are instances of the class ``str``:

  >>> type('1')
  <type 'str'>


5. List, tuples and dictionaries are instances of ``list``, ``tuple`` and
   ``dict`` respectively:

  >>> type('1')
  <type 'str'>
  >>> type([1])
  <type 'list'>
  >>> type((1,))
  <type 'tuple'>
  >>> type({1:1})
  <type 'dict'>

6. User defined functions are instances of the ``function`` built-in type

  >>> type(f)
  <type 'function'>
  >>> type(g)
  <type 'function'>

All the previous types are subclasses of object:

  >>> for cl in int,long,float,str,list,tuple,dict: issubclass(cl,object)
  True
  True
  True
  True
  True
  True
  True

However, Python is not a 100% pure Object
Oriented Programming language and its object model has still some minor
warts, due to historical accidents.

Paraphrasing George Orwell, we may say that in Python 2.2-2.3, 
all objects are equal, but some objects are more equal than others.
Actually, we may distinguish Python objects in new style objects, 
or rich man objects, and old style objects, or poor man objects. 
New style objects are instances of new style classes whereas old
style objects are instances of old style classes.
The difference is that new style classes are subclasses of object whereas
old style classes are not.

Old style classes are there for sake of compatibility with previous 
releases of Python, but starting from Python 2.2 practically all built-in 
classes are new style classes. 

Instance of old style classes are called old style objects. I will give
few examples of old style objects in the future.

In this tutorial with the term
object *tout court* we will mean new style objects, unless the contrary 
is explicitely stated. 


.. [#] one may notice that C++ does not have such a class, but C++
       is *not* a strongly object oriented language ;-)

.. [#] Actually ``type`` is not a function, but a metaclass; nevertheless, 
       since this is an advanced concept, discussed in the fourth chapter; 
       for the time being it is better to think of ``type`` as a built-in 
       function analogous to ``id``.

Objects have attributes
----------------------------------------------------------------------------

All objects have attributes describing their characteristics, that may
be accessed via the dot notation

 ::

  objectname.objectattribute

The dot notation is common to most Object Oriented programming languages,
therefore the reader with a little of experience should find it not surprising 
at all (Python strongly believes in the Principle of Least Surprise). However,
Python objects also have special attributes denoted by the double-double
underscore notation

 ::

  objectname.__specialattribute__

with the aim of helping the wonderful Python introspection features, that
does not have correspondence in all OOP language.

Consider for example the string literal "spam". We may discover its
class by looking at its special attribute *__class__*:
   
  >>> 'spam'.__class__
  <type 'str'>


Using the ``__class__`` attribute is not always equivalent to using the 
``type`` function, but it works for all built-in types. Consider for instance 
the number *1*: we may extract its class as follows:

  >>> (1).__class__
  <type 'int'>

Notice that the parenthesis are needed to avoid confusion between the integer
1 and the float (1.).

The non-equivalence type/class is the key to distinguish new style objects from
old style, since for old style objects ``type(obj)<>obj.__class__``.
We may use this knowledge to make and utility function that discovers
if an object is a "real" object (i.e. new style) or a poor man object:

 ::

  #<oopp.py>

  def isnewstyle(obj):
      try: #some objects may lack a __class__ attribute 
          obj.__class__
      except AttributeError:
          return False
      else: #look if there is unification type/class
          return type(obj) is obj.__class__
  #</oopp.py>

Let us check this with various examples:

  >>> from oopp import isnewstyle
  >>> isnewstyle(1)
  True
  >>> isnewstyle(lambda x:x)
  True
  >>> isnewstyle(id)
  True
  >>> isnewstyle(type)
  True
  >>> isnewstyle(isnewstyle)
  True
  >>> import math
  >>> isnewstyle(math)
  True
  >>> isnewstyle(math.sqrt)
  True
  >>> isnewstyle('hello')
  True

It is not obvious to find something which is not a real object,
between the built-in objects, however it is possible. For instance, 
the ``help`` "function" is an old style object:

  >>> isnewstyle(help)
  False

since

  >>> help.__class__
  <class site._Helper at 0x8127c94>

is different from 

  >>> type(help)
  <type 'instance'>

Regular expression objects are even poorer objects with no ``__class__`` 
attribute:

  >>> import re
  >>> reobj=re.compile('somestring')
  >>> isnewstyle(reobj)
  False
  >>> type(reobj)
  <type '_sre.SRE_Pattern'>
  >>> reobj.__class__ #error
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
  AttributeError: __class__

There other special attributes other than ``__class__``; a particularly useful
one is ``__doc__``, that contains informations on the class it
refers to. Consider for instance the ``str`` class: by looking at its
``__doc__`` attribute we can get information on the usage of this class:

  >>> str.__doc__
  str(object) -> string
  Return a nice string representation of the object.
  If the argument is a string, the return value is the same object.

From that docstring we learn how to convert generic objects in strings;
for instance we may convert numbers, lists, tuples and dictionaries:

  >>> str(1)
  '1'
  >>> str([1])
  '[1]'
  >>> str((1,))
  (1,)'
  >>> str({1:1})
  '{1: 1}'

``str`` is implicitely called each time we use the ``print`` statement, since
``print obj`` is actually syntactic sugar for ``print str(obj)``.

Classes and modules have another interesting special attribute, the 
``__dict__`` attribute that gives the content of the class/module.
For instance, the contents of the standard ``math`` module can be retrieved 
as follows:

  >>> import math
  >>> for key in math.__dict__: print key,
  ...
  fmod atan pow __file__ cosh ldexp hypot sinh __name__ tan ceil asin cos 
  e log fabs floor tanh sqrt __doc__ frexp atan2 modf exp acos pi log10 sin

Alternatively, one can use the built-in function ``vars``:

  >>> vars(math) is math.__dict__
  True

This identity is true for any object with a ``__dict__`` attribute. 
Two others interesting special attributes are ``__doc__``

  >>> print math.__doc__
  This module is always available.  It provides access to the
  mathematical functions defined by the C standard. 

and ``__file__``:

  >>> math.__file__ #gives the file associated with the module
  '/usr/lib/python2.2/lib-dynload/mathmodule.so'
    
Objects have methods 
----------------------------------------------------------------------------

In addition to attributes, objects also have *methods*, i.e. 
functions attached to their classes [#]_.
Methods are also invoked with the dot notation, but
they can be distinguished by attributes because they are typically
called with parenthesis (this is a little simplistic, but it is enough for
an introductory chapter). As a simple example, let me show the
invocation of the ``split`` method for a string object:

  >>> s='hello world!'
  >>> s.split()
  ['hello', 'world!']

In this example ``s.split`` is called a *bount method*, since it is
applied to the string object ``s``:

  >>> s.split
  <built-in method split of str object at 0x81572b8>

An *unbound method*, instead, is applied to the class: in this case the
unbound version of ``split`` is applied to the ``str`` class:

  >>> str.split
  <method 'split' of 'str' objects>

A bound method is obtained from its corresponding unbound 
method by providing the object to the unbound method: for instance 
by providing ``s`` to ``str.split`` we obtain the same effect of `s.split()`:

  >>> str.split(s)
  ['hello', 'world!']

This operation is called *binding*  in the Python literature: when write
``str.split(s)`` we bind the unbound method ``str.split`` to the object ``s``.
It is interesting to recognize that the bound and unbound methods are
*different* objects:

  >>> id(str.split) # unbound method reference
  135414364
  >>> id(s.split) # this is a different object!
  135611408

The unbound method (and therefore the bound method) has a ``__doc__`` 
attribute explaining how it works:

  >>> print str.split.__doc__
  S.split([sep [,maxsplit]]) -> list of strings
  Return a list of the words in the string S, using sep as the
  delimiter string.  If maxsplit is given, at most maxsplit
  splits are done. If sep is not specified or is None, any
  whitespace string is a separator.


.. [#] A precise definition will be given in chapter 5 that introduces the
       concept of attribute descriptors. There are subtle
       differences between functions and methods.

Summing objects
--------------------------------------------------------------------------

In a pure object-oriented world, there are no functions and everything is 
done trough methods. Python is not a pure OOP language, however quite a
lot is done trough methods. For instance, it is quite interesting to analyze 
what happens when an apparently trivial statement such as

  >>> 1+1
  2

is executed in an object-oriented world. 

The key to understand, is to notice that the number 1 is an object, specifically
an instance of class ``int``: this means that that 1 inherits all the methods
of the ``int`` class. In particular it inherits a special method called 
``__add__``: this means 1+1 is actually syntactic sugar for

  >>> (1).__add__(1)
  2

which in turns is syntactic sugar for

  >>> int.__add__(1,1)
  2

The same is true for subtraction, multiplication, division and other 
binary operations.

  >>> 'hello'*2
  'hellohello'
  >>> (2).__mul__('hello')
  'hellohello'
  >>> str.__mul__('hello',2)
  'hellohello'

However, notice that

  >>> str.__mul__(2,'hello') #error
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
  TypeError: descriptor '__mul__' requires a 'str' object but received a 'int'

The fact that operators are implemented as methods, is the key to
*operator overloading*: in Python (as well as in other OOP languages)
the user can redefine the operators. This is already done by default
for some operators: for instance the operator ``+`` is overloaded
and works both for integers, floats, complex  numbers and for strings.

Inspecting objects
---------------------------------------------------------------------------

In Python it is possible to retrieve most of the attributes and methods 
of an object by using the built-in function ``dir()``
(try ``help(dir)`` for more information).

Let me consider the simplest case of a generic object:

  >>> obj=object()
  >>> dir(obj)
  ['__class__', '__delattr__', '__doc__', '__getattribute__', 
   '__hash__', '__init__', '__new__', '__reduce__', '__repr__', 
   '__setattr__', '__str__']

As we see, there are plenty of attributes available
even to a do nothing object; many of them are special attributes
providing introspection capabilities which are not 
common to all programming languages. We have already discussed the
meaning of some of the more obvious special attributes.
The meaning of some of  the others is quite non-obvious, however.
The docstring is invaluable in providing some clue.

Notice that  there are special *hidden* attributes that cannot be retrieved
with ``dir()``. For instance the ``__name__`` attribute, returning the 
name of the object (defined for classes, modules and functions) 
and the ``__subclasses__`` method, defined for classes and returning the 
list of immediate subclasses of a class:

  >>> str.__name__
  'str'
  >>> str.__subclasses__.__doc__
  '__subclasses__() -> list of immediate subclasses'
  >>> str.__subclasses__() # no subclasses of 'str' are currently defined
  []

For instance by doing

  >>> obj.__getattribute__.__doc__
  "x.__getattribute__('name') <==> x.name"

we discover that the expression ``x.name`` is syntactic sugar for

  ``x.__getattribute__('name')``

Another equivalent form which is more often used is

   ``getattr(x,'name')``

We may use this trick to make a function that retrieves all the
attributes of an object except the special ones:

 ::

  #<oopp.py>

  def special(name): return name.startswith('__') and name.endswith('__')

  def attributes(obj,condition=lambda n,v: not special(n)):
      """Returns a dictionary containing the accessible attributes of 
      an object. By default, returns the non-special attributes only."""
      dic={}
      for attr in dir(obj):
          try: v=getattr(obj,attr)
          except: continue #attr is not accessible
          if condition(attr,v): dic[attr]=v
      return dic

  getall = lambda n,v: True

  #</oopp.py>

Notice that certain attributes may be unaccessible (we will see how
to make attributes unaccessible in a following chapter) 
and in this case they are simply ignored.
For instance you may retrieve the regular (i.e. non special)
attributes of the built-in functions:

  >>> from oopp import attributes
  >>> attributes(f).keys()
  ['func_closure', 'func_dict', 'func_defaults', 'func_name', 
   'func_code', 'func_doc', 'func_globals']

In the same vein of the ``getattr`` function, there is a built-in
``setattr`` function (that actually calls the ``__setattr__`` built-in
method), that allows the user to change the attributes and methods of
and object. Informations on ``setattr`` can be retrieved from the help 
function:

 ::

  >>> help(setattr)
  Help on built-in function setattr:
  setattr(...)
  setattr(object, name, value)
  Set a named attribute on an object; setattr(x, 'y', v) is equivalent to
  ``x.y = v''.

``setattr`` can be used to add attributes to an object:

 ::

  #<oopp.py>
  
  import sys

  def customize(obj,errfile=None,**kw):
      """Adds attributes to an object, if possible. If not, writes an error
      message on 'errfile'. If errfile is None, skips the exception."""
      for k in kw:
          try: 
              setattr(obj,k,kw[k])
          except: # setting error
              if errfile:
                  print >> errfile,"Error: %s cannot be set" % k

  #</oopp.py>

The attributes of built-in objects cannot be set, however:

  >>> from oopp import customize,sys
  >>> customize(object(),errfile=sys.stdout,newattr='hello!') #error
  AttributeError: newattr cannot be set

On the other hand, the attributes of modules can be set:

  >>> import time
  >>> customize(time,newattr='hello!')
  >>> time.newattr
  'hello!'

Notice that this means we may enhances modules at run-time, but adding
new routines, not only new data attributes.

The ``attributes`` and ``customize`` functions work for any kind of objects; 
in particular, since classes are a special kind of objects, they work 
for classes, too. Here are the attributes of the ``str``, ``list`` and 
``dict`` built-in types:

  >>> from oopp import attributes
  >>> attributes(str).keys()
  ['startswith', 'rjust', 'lstrip', 'swapcase', 'replace','encode',
   'endswith', 'splitlines', 'rfind', 'strip', 'isdigit', 'ljust', 
   'capitalize', 'find', 'count', 'index', 'lower', 'translate','join', 
   'center', 'isalnum','title', 'rindex', 'expandtabs', 'isspace', 
   'decode', 'isalpha', 'split', 'rstrip', 'islower', 'isupper', 
   'istitle', 'upper']
  >>> attributes(list).keys()
  ['append', 'count', 'extend', 'index', 'insert', 'pop', 
   'remove', 'reverse', 'sort']
  >>> attributes(dict).keys()
  ['clear','copy','fromkeys', 'get', 'has_key', 'items','iteritems',
   'iterkeys', 'itervalues', 'keys', 'pop', 'popitem', 'setdefault', 
   'update', 'values']

Classes and modules have a special attribute ``__dict__`` giving the 
dictionary of their attributes. Since it is often a quite large dictionary, 
it is convenient to define an utility function printing this dictionary in a 
nice form:

 ::

  #<oopp.py>

  def pretty(dic):
      "Returns a nice string representation for the dictionary"
      keys=dic.keys(); keys.sort() # sorts the keys
      return '\n'.join(['%s = %s' % (k,dic[k]) for k in keys])

  #</oopp.py>

I encourage the use of this function in order to retrieve more 
information about the modules of the standard library:

  >>> from oopp import pretty
  >>> import time #look at the 'time' standard library module
  >>> print pretty(vars(time))
  __doc__ = This module provides various functions to manipulate time values.
  There are two standard representations of time.  One is the number
  of seconds since the Epoch, in UTC (a.k.a. GMT).  It may be an integer
  or a floating point number (to represent fractions of seconds).
  The Epoch is system-defined; on Unix, it is generally January 1st, 1970.
  The actual value can be retrieved by calling gmtime(0).
  The other representation is a tuple of 9 integers giving local time.
  The tuple items are:
    year (four digits, e.g. 1998)
    month (1-12)
    day (1-31)
    hours (0-23)
    minutes (0-59)
    seconds (0-59)
    weekday (0-6, Monday is 0)
    Julian day (day in the year, 1-366)
    DST (Daylight Savings Time) flag (-1, 0 or 1)
  If the DST flag is 0, the time is given in the regular time zone;
  if it is 1, the time is given in the DST time zone;
  if it is -1, mktime() should guess based on the date and time.
  Variables:
  timezone -- difference in seconds between UTC and local standard time
  altzone -- difference in  seconds between UTC and local DST time
  daylight -- whether local time should reflect DST
  tzname -- tuple of (standard time zone name, DST time zone name)
  Functions:
  time() -- return current time in seconds since the Epoch as a float
  clock() -- return CPU time since process start as a float
  sleep() -- delay for a number of seconds given as a float
  gmtime() -- convert seconds since Epoch to UTC tuple
  localtime() -- convert seconds since Epoch to local time tuple
  asctime() -- convert time tuple to string
  ctime() -- convert time in seconds to string
  mktime() -- convert local time tuple to seconds since Epoch
  strftime() -- convert time tuple to string according to format specification
  strptime() -- parse string to time tuple according to format specification
  __file__ = /usr/local/lib/python2.3/lib-dynload/time.so
  __name__ = time
  accept2dyear = 1
  altzone = 14400
  asctime = <built-in function asctime>
  clock = <built-in function clock>
  ctime = <built-in function ctime>
  daylight = 1
  gmtime = <built-in function gmtime>
  localtime = <built-in function localtime>
  mktime = <built-in function mktime>
  newattr = hello!
  sleep = <built-in function sleep>
  strftime = <built-in function strftime>
  strptime = <built-in function strptime>
  struct_time = <type 'time.struct_time'>
  time = <built-in function time>
  timezone = 18000
  tzname = ('EST', 'EDT')

The list of the built-in Python types can be found in the ``types`` module:

  >>> import types
  >>> t_dict=dict([(k,v) for (k,v) in vars(types).iteritems() 
  ... if k.endswith('Type')])
  >>> for t in t_dict: print t,
  ...
  DictType IntType TypeType FileType CodeType XRangeType EllipsisType 
  SliceType BooleanType ListType MethodType TupleType ModuleType FrameType 
  StringType LongType BuiltinMethodType BufferType FloatType ClassType 
  DictionaryType BuiltinFunctionType UnboundMethodType UnicodeType 
  LambdaType DictProxyType ComplexType GeneratorType ObjectType 
  FunctionType InstanceType NoneType TracebackType

For a pedagogical account of the most elementary 
Python introspection features,
Patrick O' Brien:
http://www-106.ibm.com/developerworks/linux/library/l-pyint.html

Built-in objects: iterators and generators
---------------------------------------------------------------------------

At the end of the last section , I have used the ``iteritems`` method 
of the dictionary, which returns an iterator:

  >>> dict.iteritems.__doc__
  'D.iteritems() -> an iterator over the (key, value) items of D'

Iterators (and generators) are new features of Python 2.2 and could not be
familiar to all readers. However, since they are unrelated to OOP, they 
are outside the scope of this book and will not be discussed here in detail. 
Nevertheless, I will give a typical example of use of a generator, since
this construct will be used in future chapters.

At the syntactical level, a generator is a "function" with (at least one) 
``yield`` statement (notice that in Python 2.2 the ``yield`` statement is
enabled trough the ``from __future__ import generators`` syntax):


 ::

  #<oopp.py>

  import re

  def generateblocks(regexp,text):
      "Generator splitting text in blocks according to regexp"
      start=0
      for MO in regexp.finditer(text):
          beg,end=MO.span()
          yield text[start:beg] # actual text
          yield text[beg:end] # separator
          start=end
      lastblock=text[start:] 
      if lastblock: yield lastblock; yield ''

  #</oopp.py>

In order to understand this example, the reader my want to refresh his/her 
understanding of regular expressions; since this is not a subject for 
this book, I simply remind the meaning of ``finditer``:

  >>> import re
  >>> help(re.finditer)
  finditer(pattern, string)
      Return an iterator over all non-overlapping matches in the
      string.  For each match, the iterator returns a match object.
      Empty matches are included in the result.

Generators can be thought of as resumable functions that stop at the
``yield`` statement and resume from the point where they left. 

  >>> from oopp import generateblocks
  >>> text='Python_Rules!'
  >>> g=generateblocks(re.compile('_'),text)
  >>> g
  <generator object at 0x401b140c>
  >>> dir(g)
  ['__class__', '__delattr__', '__doc__', '__getattribute__', '__hash__', 
   '__init__', '__iter__', '__new__', '__reduce__', '__reduce_ex__', 
   '__repr__', '__setattr__', '__str__', 'gi_frame', 'gi_running', 'next']

Generator objects can be used as iterators in a ``for`` loop.
In this example the generator takes a text and a regular expression
describing a fixed delimiter; then it splits the text in blocks
according to the delimiter. For instance, if the delimiter is
'_', the text 'Python Rules!' is splitted as 'Python', '_' and 'Rules!':

  >>> for n, block in enumerate(g): print n, block
  ...
  0 Python
  1 
  2 Rules!
  3

This example also show the usage of the new Python 2.3 built-in ``enumerate``.

Under the hood the ``for`` loop is calling the generator via its 
``next`` method, until the ``StopIteration`` exception is raised.
For this reason a new call to the ``for`` loop will have no effect:

  >>> for n, block in enumerate(g): print n, block
  ...

The point is that the generator has already yield its last element:

  >>> g.next() # error
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
  StopIteration

``generateblocks`` always returns an even number of blocks; odd blocks
are delimiters whereas even blocks are the intertwining text; there may be 
empty blocks, corresponding to the null string ''.

It must be remarked the difference with the 'str.split' method

  >>> 'Python_Rules!'.split('_')
  ['Python', 'Rules!']

and the regular expression split method:

  >>> re.compile('_').split('Python_Rules!')
  ['Python', 'Rules!']

both returns lists with an odd number of elements and both miss the separator. 
The regular expression split method can catch the separator, if wanted, 

  >>> re.compile('(_)').split('Python_Rules!')
  ['Python', '_', 'Rules!']

but still is different from the generator, since it returns a list. The
difference is relevant if we want to split a very large text, since 
the generator avoids to build a very large list and thus it is much more
memory efficient (it is faster, too). Moreover, ``generateblocks``
works differently in the case of multiple groups:

  >>> delim=re.compile('(_)|(!)') #delimiter is space or exclamation mark
  >>> for n, block in enumerate(generateblocks(delim,text)): 
  ...     print n, block
  0 Python
  1 _
  2 Rules
  3 !

whereas

  >>> delim.split(text)
  ['Python', '_', None, 'Rules', None, '!', '']

gives various unwanted ``None`` (which could be skipped with 
``[x for x in delim.split(text) if x is not None]``); notice, that
there are no differences (apart from the fact that ``delim.split(text)``
has an odd number of elements) when one uses a single group regular expression:

  >>> delim=re.compile('(_|!)')
  >>> delim.split(text)
  ['Python', '_', 'Rules', '!', '']

The reader unfamiliar with iterators and generators is encouraged
to look at the standard documentation and other 
references. For instance, there are Alex Martelli's notes on iterators at 
http://www.strakt.com/dev_talks.html
and there is a good article on generators by David Mertz
http://www-106.ibm.com/developerworks/linux/library/l-pycon.html


THE CONVENIENCE OF FUNCTIONS
============================================================================

Functions are the most basic Python objects. They are also the simplest
objects where one can apply the  metaprogramming techniques that are
the subject of this book. The tricks used in this chapter and the utility
functions defined here will be used over all the book. Therefore this
is an *essential* chapter.

Since it is intended to be a gentle introduction, the tone will be
informal.

Introduction
-------------

One could be surprised that a text on OOP begins with a chapter on the
well known old-fashioned functions. In some sense, this is also
against the spirit of an important trend in OOP, which tries to
shift the focus from functions to data. In pure OOP languages,
there are no more functions, only methods. [#]_

However, there are good reasons for that:

1. In Python,  functions  *are* objects. And particularly useful ones.
2. Python functions are pretty powerful and all their secrets are probably
   *not* well known to the average Python programmer.
3. In the solutions of many problems, you don't need the full apparatus
   of OOP: good old functions can be enough.

Moreover, I am a believer in the multiparadigm approach to programming, 
in which you choose your tools according to your problem. 
With a  bazooka you can kill a mosquito, yes, but this does not mean 
that you must use the bazooka *always*. 
In certain languages, you have no choice, and you must define
a class (involving a lot of boiler plate code) even for the most trivial
application. Python's philosophy is to keep simple things simple, but
having the capability of doing even difficult things with a reasonable
amount of effort. The message of this chapter will be: "use functions when 
you don't need classes". Functions are good because:

1. They are easy to write (no boiler plate);
2. They are easy to understand;
3. They can be reused in your code;
4. Functions are an essential building block in the construction of objects.

Even if I think that OOP is an extremely effective strategy, with
enormous advantages on design, maintanibility and reusability of code,
nevertheless this book is *not* intended to be a panegyric of OOP. There
are cases in which you don't need OOP. I think the critical parameter is
the size of the program. These are the rules I follows usually (to be
taken as indicative):

1. If I have to write a short script of 20-30 lines, that copies two or 
   three files and prints some message, I use fast and dirty spaghetti-code; 
   there is no use for OOP.
2. If your script grows to one-hundred lines or more, I structure
   it  write a few routines and a main program: but still I can live
   without OOP.
3. If the script goes beyond the two hundred lines, I start
   collecting my routines in few classes.
4. If the script goes beyond the five hundred lines, I split the program
   in various files and modules and convert it to a package.
5. I never write a function longer than 50 lines, since 50 lines is more
   or less the size of a page in my editor, and I need to be able to
   see the entire function in a page.

Of course your taste could be different and you could prefer to write a 
monolitic program of five thousand lines; however the average size of 
the modules in the Python standard library is of 111 lines.
I think this is a *strong* suggestion towards 
a modular style of programming, which
is *very* well supported in Python.

The point is that OOP is especially useful for *large* programs: if you
only use Python for short system administration scripts you may well
live without OOP. Unfortunaly, as everybody knows, short scripts have
an evil tendency to become medium size scripts, and medium size scripts
have the even more evil tendency to become large scripts and possible
even full featured applications ! For this reason it is very probable
that at a certain moment you will feel the need for OOP.

I remember my first big program, a long time ago: I wrote a program
to draw mathematical functions in AmigaBasic. It was good and nice
until it had size of few hundred lines; but when it passed a thousand
of lines, it became rapidly unmanageable and unmaintenable. There where
three problems:

1. I could not split the program in modules, as I wanted, due to the
   limitations of AmigaBasic;

2. I was missing OOP to keep the logic of the program all together, but
   at the time I didn't know that;

3. I was missing effective debugging techniques.

4. I was missing effective refactoring tools.

I am sure anybody who has ever written a large program has run in these
limitations: and the biggest help of OOP is in overcoming these limitations.
Obviously, miracles are impossible, and even object oriented programs can
grow to a size where they become unmaintanable: the point is that the
critical limit is much higher than the thousand lines of structured programs.
I haven't yet reached the limit of unmanageability with Python. The fact
that the standard library is 66492 lines long (as result from the total
number of lines in ``/usr/local/lib/python2.2/``), but it is still manageable,
give me an hope ;-)

 .. [#] However, one could argue that having functions distinguished from 
        methods is the best thing to do, even in a strongly object-oriented 
        world. For instance, generic functions can be used to implement 
        multimethods. See for instance Lisp, Dylan and MultiJava. This latter 
        is forced to introduce the concept of function outside a class, 
        foreign to traditional Java, just to implement multimethods.
  
A few useful functions
------------------------------------------------------------------------------

It is always a good idea to have a set of useful function collected in
a user defined module. The first function we want to have in our module
is the ``do_nothing`` function:

 ::

  #<oopp.py>

  def do_nothing(*args,**kw): pass

  #</oopp.py>

This function accept a variable number of arguments and keywords (I
defer the reader to the standard documentation if she is unfamiliar
with these concept; this is *not* another Python tutorial ;-) and
return ``None``. It is very useful for debugging purposes, when in a
complex program you may want concentrate your attention to few crucial
functions and set the non-relevant functions to ``do_nothing`` functions.

A second function which is useful in developing programs is a timer
function. Very ofter indeed,  we may want to determine the bottleneck
parts of a program, we are interested in profiling them and in seeing 
if we can improve the speed by improving the algorithm, or by using
a Python "compiler" such as Psyco, or if really we need to write a C 
extension. In my experience, I never needed to write a C extension,
since Python is fast enough. Nevertheless, to profile a program is
always a good idea and Python provides a profiler module in the 
stardard library with this aim. Still, it is convenient to have
a set of user defined functions to test the execution speed of
few selected routines (whereas the standard profiler profiles everything).

We see from the standard library documentation that
the current time can be retrieved from the ``time`` module: [#]_

  >>> import time
  >>> time.asctime()
  'Wed Jan 15 12:46:03 2003'

Since we are not interested in the date but only in the time, we need
a function to extract it. This is easily implemented:

 ::

  #<oopp.py>

  import time

  def get_time():
      "Return the time of the system in the format HH:MM:SS"
      return time.asctime().split()[3]

  #</oopp.py>

  >>> from oopp import get_time
  >>> get_time()
  '13:03:49'

Suppose, for instance, we want to know how much it takes to Python
to write a Gigabyte of data. This can be a quite useful benchmark
to have an idea of the I/O bottlenecks in our system. Since to take in memory
a file of a Gigabyte can be quite problematic, let me compute the
time spent in writing 1024 files of one Megabyte each. To this
aim we need a ``writefile`` function

 ::

  #<oopp.py>

  def writefile(fname,data):
      f=file(fname,'w')
      f.write(data)
      f.close()

  #</oopp.py>

and timing function. The idea is to wrap the ``writefile`` function in
a ``with_clock`` function as follows:

 ::

  #<oopp.py>

  def with_clock(func,n=1):
      def _(*args,**kw): # this is a closure
          print "Process started on",get_time()
          print ' .. please wait ..'
          for i in range(n): func(*args,**kw)
          print "Process ended on",get_time()
      return _

  #</oopp.py>

The wrapper function ``with_clock`` has converted the function ``writefile`` 
in a function ``with_clock(writefile)`` which has the same arguments
of ``writefile``, but contains additional features: in this case
timing capabilities. Technically speaking, the internal function ``_``
is called a *closure*. Closures are very common in functional languages 
and can be used in Python too, with very little effort [#]_.

I will use closures very often in the following, and I will use
the convention of denoting with "_" the inner 
function in the closure, since there is no reason of giving to it a 
descriptive name (the name 'with_clock' in the outer function 
is descriptive enough). For the same, reason I do not use a 
docstring for "_". If Python would allow multistatement lambda
functions, "_" would be a good candidate for an anonymous function.

Here is an example of usage:

  >>> from oopp import *
  >>> data='*'*1024*1024 #one megabyte
  >>> with_clock(writefile,n=1024)('datafile',data) #.
  Process started on 21:20:01
   .. please wait ..
  Process ended on 21:20:57

This example shows that Python has written one Gigabyte of data (splitted in
1024 chunks of one Megabyte each) in less than a minute. However,the 
result depends very much on the filesystem. I always suggest people
to profile their programs, since one *always* find surprises.
For instance, I have checked the performance of my laptop, 
a dual machine Windows 98 SE/ Red Hat Linux 7.3.
The results are collected in the following table:

  ================= ===================== ========================
                        Laptop                                  
   Linux ext-3       FAT under Linux       FAT under Windows 98 
  ================= ===================== ========================
       24-25 s           56-58 s                86-88 s 
  ================= ===================== ========================            


We see that Linux is *much* faster: more than three times faster than
Windows, using the same machine! Notice that the FAT filesystem under
Linux (where it is *not* native) is remarkably faster than the FAT
under Windows 98, where it is native !! I think that now my readers
can begin to understand why this book has been written under Linux 
and why I *never* use Windows for programming (actually I use it only 
to see the DVD's ;-).

I leave as an exercise for the reader to check the results on this
script on their machine. Since my laptop is quite old, you will probably
have much better performances (for instance on my linux desktop I can
write a Gigabyte in less than 12 seconds!). However, there are *always*
surprises: my desktop is a dual Windows 2000 machine with three different
filesystems, Linux ext-2, FAT and NTFS. Surprisingly enough, the NT
filesystem is the more inefficient for writing, *ten times slower* 
than Linux!

  ================= ===================== ========================
                      Desktop
  Linux ext-2       FAT under Win2000      NTFS under Win2000
  ================= ===================== ========================
  11-12 s           95-97 s                117-120 s
  ================= ===================== ========================

.. [#] Users of Python 2.3 can give a look to the new ``datetime`` module,
       if they are looking for a sophisticated clock/calendar.

.. [#] There are good references on functional programming in Python; 
       I suggest the Python Cookbook and the articles by David Mertz
       www.IBM.dW.


Functions are objects
---------------------------------------------------------------------------

As we said in the first chapter, objects have attributes accessible with the 
dot notation. This is not surprising at all. However, it could be
surprising to realize that since Python functions are objects, they
can have attributes, too. This could be surprising since this feature is quite 
uncommon: typically or i) the language is
not object-oriented, and therefore functions are not objects, or ii)
the language is strongly object-oriented and does not have functions, only 
methods. Python is a multiparadigm language (which I prefer to the
term "hybrid" language), therefore it has functions that are objects,
as in Lisp and other functional languages. 
Consider for instance the ``get_time`` function.
That function has at least an useful attribute, its doctring:

  >>> from oopp import get_time
  >>> print get_time.func_doc
  Return the time of the system in the format HH:MM:SS

The docstring can also be obtained with the ``help`` function:

  >>> help(get_time)
  Help on function get_time in module oopp:
  get_time()
      Return the time of the system in the format HH:MM:SS

Therefore ``help`` works on user-defined functions, too, not only on
built-in functions. Notice that ``help`` also returns the argument list of 
the function. For instance, this is
the help message on the ``round`` function that we will use in the
following:

  >>> help(round)
  Help on built-in function round:
  round(...)
      round(number[, ndigits]) -> floating point number
      Round a number to a given precision in decimal digits (default 0 
      digits).This always returns a floating point number.  Precision may 
      be negative.

I strongly recommend Python programmers to use docstrings, not
only for clarity sake during the development, but especially because
it is possible to automatically generate nice HTML documentation from 
the docstrings, by using the standard tool "pydoc".

One can easily add attributes to a function. For instance:

  >>> get_time.more_doc='get_time invokes the function time.asctime'
  >>> print get_time.more_doc
  get_time invokes the function time.asctime

Attributes can be functions, too:

  >>> def IamAfunction(): print "I am a function attached to a function"
  >>> get_time.f=IamAfunction
  >>> get_time.f()
  I am a function attached to a function

This is a quite impressive potentiality of Python functions, which has
no direct equivalent in most other languages.

One possible application is to fake C "static" variables. Suppose
for instance we need a function remembering how may times it is
called: we can simply use

 ::

  #<double.py>

  def double(x):
      try: #look if double.counter is defined
          double.counter
      except AttributeError:
          double.counter=0 #first call
      double.counter+=1
      return 2*x

  double(double(2))
  print "double has been called %s times" % double.counter

  #</double.py>

with output ``double has been called 2 times``.
A more elegant approach involves closures. A closure can enhance an
ordinary function, providing to it the capability of remembering 
the results of its previous calls and avoiding the duplication of
computations:

::
 
  #<oopp.py>

  def withmemory(f):
      """This closure invokes the callable object f only if need there is"""
      argskw=[]; result=[]
      def _(*args,**kw): 
          akw=args,kw
          try: # returns a previously stored result
              i=argskw.index(akw)
          except ValueError: # there is no previously stored result
              res=f(*args,**kw)  # returns the new result
              argskw.append(akw) # update argskw
              result.append(res) # update result
              return res
          else:
              return result[i]  
      _.argskw=argskw #makes the argskw list accessible outside
      _.result=result #makes the result list accessible outside
      return _

  def memoize(f):
      """This closure remembers all f invocations"""
      argskw,result = [],[]
      def _(*args,**kw): 
          akw=args,kw
          try: # returns a previously stored result
              return result[argskw.index(akw)]
          except ValueError: # there is no previously stored result
              argskw.append(akw) # update argskw
              result.append(f(*args,**kw)) # update result
              return result[-1] # return the new result
      _.argskw=argskw #makes the argskw list accessible outside
      _.result=result #makes the result list accessible outside
      return _
 
  #</oopp.py>

Now, if we call the wrapped function ``f``  twice with the same arguments, 
Python can give the result without repeating the (possibly very long) 
computation.

  >>> def f(x):
  ...    print 'called f'
  ...    return x*x
  >>> wrapped_f=withmemory(f)
  >>> wrapped_f(2) #first call with the argument 2; executes the computation
  called f
  4
  >>> wrapped_f(2) #does not repeat the computation
  4
  >>> wrapped_f.result
  [4]
  >>> wrapped_f.argskw
  [((2,), {})]

Profiling functions
---------------------------------------------------------------------------

The ``with_clock`` function provided before was intended to be
pedagogical; as such it is a quite poor solution to the
problem of profiling a Python routine. A better solution involves
using two others functions in the time library, ``time.time()`` 
that gives that time in seconds elapsed from a given date, and 
``time.clock()`` that gives the time spent by the CPU in a given 
computation. Notice that ``time.clock()`` has not an infinite
precision (the precision depends on the system) and one 
should expect relatively big errors if the function runs in
a very short time. That's the reason why it is convenient
to execute multiple times short functions and divide the total
time by the number of repetitions. Moreover, one should subtract the
overhead do to the looping. This can be computed with the following
routine:

 :: 

  #<oopp.py>

  def loop_overhead(N):
      "Computes the time spent in empty loop of N iterations"
      t0=time.clock()
      for i in xrange(N): pass
      return time.clock()-t0

  #</oopp.py>

For instance, on my laptop an empty loop of one million of iterations
is performed in 1.3 seconds. Typically the loop overhead is negligible,
whereas the real problem is the function overhead.

Using the attribute trick discussed above, we may
define a ``with_timer`` function that enhances quite a bit 
``with_clock``:

 ::

  #<oopp.py>

  def with_timer(func, modulename='__main__', n=1, logfile=sys.stdout):
      """Wraps the function func and executes it n times (default n=1). 
      The average time spent in one iteration, express in milliseconds, 
      is stored in the attributes func.time and func.CPUtime, and saved 
      in a log file which defaults to the standard output.
      """
      def _(*args,**kw): # anonymous function
          time1=time.time()
          CPUtime1=time.clock()
          print 'Executing %s.%s ...' % (modulename,func.__name__),
          for i in xrange(n): res=func(*args,**kw) # executes func n times
          time2=time.time()
          CPUtime2=time.clock()
          func.time=1000*(time2-time1)/n
          func.CPUtime=1000*(CPUtime2-CPUtime1-loop_overhead(n))/n
          if func.CPUtime<10: r=3 #better rounding
          else: r=1 #default rounding
          print >> logfile, 'Real time: %s ms' % round(func.time,r),
          print >> logfile, ' CPU time: %s ms' % round(func.CPUtime,r)
          return res
      return _

  #</oopp.py>

Here it is an example of application:

  >>> from oopp import with_timer,writefile
  >>> data='*'*1024*1024 #one megabyte
  >>> with_timer(writefile,n=1024)('datafile',data) #.
  Executing writefile ... Real time: 60.0 ms  CPU time: 42.2 ms
  
The CPU time can be quite different from the real time, 
as you can see in the following example:

  >>> import time
  >>> def sleep(): time.sleep(1)
  ...
  >>> with_timer(sleep)() #.
  Executing sleep ... Real time: 999.7 ms  CPU time: 0.0 ms
 
We see that Python has run for 999.7 ms (i.e. 1 second, up to
approximation errors in the system clock) during which the CPU has 
worked for 0.0 ms (i.e. the CPU took a rest ;-).
The CPU time is the relevant time to use with the purpose of
benchmarking Python speed.

I should notice that the approach pursued in ``with_timer`` is still
quite simple. A better approach would be to
plot the time versus the number of iteration, do a linear interpolation
and extract the typical time for iteration from that. This allows
to check visually that the machine is not doing something strange
during the execution time and it is what
I do in my personal benchmark routine; doing something similar is
left as an exercise for the reader ;-).

Another approach is to use the ``timeit.py`` module (new in Python 2.3,
but works also with Python 2.2):

 ::

  #<oopp.py>

  import timeit,__main__,warnings

  warnings.filterwarnings('ignore',
  'import \* only allowed at module level',SyntaxWarning)

  def timeit_(stmt,setup='from __main__ import *',n=1000):
      t=timeit.Timer(stmt,setup)
      try: print t.repeat(number=n) # class timeit 3 times
      except: t.print_exc()

  #</oopp.py>

It is often stated that Python is slow and quite ineffective
in application involving hard computations. This is generally speaking
true, but how bad is the situation ? To test the (in)efficiency of
Python on number crunching, let me give a function to compute the
Mandelbrot set, which I have found in the Python Frequently Asked
Question (FAQ 4.15. *Is it possible to write obfuscated one-liners 
in Python?*).
This function is due to Ulf Bartelt and you should ask him to know how
does it work ;-)

 ::

  #<oopp.py>

  def mandelbrot(row,col):
      "Computes the Mandelbrot set in one line"
      return (lambda Ru,Ro,Iu,Io,IM,Sx,Sy:reduce(
          lambda x,y:x+y,map(lambda y,Iu=Iu,Io=Io,Ru=Ru,Ro=Ro,Sy=Sy,L=
          lambda yc,Iu=Iu,Io=Io,Ru=Ru,Ro=Ro,i=IM, Sx=Sx,Sy=Sy:reduce(
          lambda x,y:x+y,map(lambda x,xc=Ru,yc=yc,Ru=Ru,Ro=Ro, i=i,
          Sx=Sx,F=lambda xc,yc,x,y,k,f=lambda xc,yc,x,y,k,f:(k<=0)
          or (x*x+y*y>=4.0) or 1+f(xc,yc,x*x-y*y+xc,2.0*x*y+yc,k-1,f):
          f(xc,yc,x,y,k,f):chr(64+F(Ru+x*(Ro-Ru)/Sx,yc,0,0,i)),
          range(Sx))):L(Iu+y*(Io-Iu)/Sy),range(Sy))))(
          -2.1, 0.7, -1.2, 1.2, 30, col, row)
      #    \___ ___/  \___ ___/  |   |    |_ lines on screen
      #        V          V      |   |______ columns on screen
      #        |          |      |__________ maximum of "iterations"
      #        |          |_________________ range on y axis
      #        |____________________________ range on x axis
      
  #</oopp.py>

Here there is the benchmark on my laptop:

  >>> from oopp import mandelbrot,with_timer
  >>> row,col=24,75
  >>> output=with_timer(mandelbrot,n=1)(row,col)
  Executing __main__.mandelbrot ... Real time: 427.9 ms  CPU time: 410.0 ms
  >>> for r in range(row): print output[r*col:(r+1)*col]
  ...
  BBBBBBBBBBBBBBCCCCCCCCCCCCDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDCCCCCCCCCCCCCC
  BBBBBBBBBBBBCCCCCCCCCDDDDDDDDDDDDDDDDDDDDDDEEEEEEFGYLFFFEEEEEDDDDDCCCCCCCCC
  BBBBBBBBBBCCCCCCCDDDDDDDDDDDDDDDDDDDDDEEEEEEEEEFFFGIKNJLLGEEEEEEDDDDDDCCCCC
  BBBBBBBBBCCCCCDDDDDDDDDDDDDDDDDDDDDEEEEEEEEEFFFFGHJJR^QLIHGFFEEEEEEDDDDDDCC
  BBBBBBBBCCCDDDDDDDDDDDDDDDDDDDDDEEEEEEEEEFFFGGGHIK_______LHGFFFFFEEEEDDDDDD
  BBBBBBBCCDDDDDDDDDDDDDDDDDDDDEEEEEEEFFFGHILIIIJJKMS_____PLJJIHGGGHJFEEDDDDD
  BBBBBBCDDDDDDDDDDDDDDDDDDEEEEEFFFFFFGGGHMQ__T________________QLOUP[OGFEDDDD
  BBBBBCDDDDDDDDDDDDDDDEEEFFFFFFFFFGGGGHJNM________________________XLHGFFEEDD
  BBBBCDDDDDDDDDEEEEEFFGJKHHHHHHHHHHHHIKN[__________________________MJKGFEEDD
  BBBBDDDDEEEEEEEEFFFFGHIKPVPMNU_QMJJKKZ_____________________________PIGFEEED
  BBBCDEEEEEEEEFFFFFFHHHML___________PQ_______________________________TGFEEEE
  BBBDEEEEEEFGGGGHHHJPNQP^___________________________________________IGFFEEEE
  BBB_____________________________________________________________OKIHGFFEEEE
  BBBDEEEEEEFGGGGHHHJPNQP^___________________________________________IGFFEEEE
  BBBCDEEEEEEEEFFFFFFHHHML___________PQ_______________________________TGFEEEE
  BBBBDDDDEEEEEEEEFFFFGHIKPVPMNU_QMJJKKZ_____________________________PIGFEEED
  BBBBCDDDDDDDDDEEEEEFFGJKHHHHHHHHHHHHIKN[__________________________MJKGFEEDD
  BBBBBCDDDDDDDDDDDDDDDEEEFFFFFFFFFGGGGHJNM________________________XLHGFFEEDD
  BBBBBBCDDDDDDDDDDDDDDDDDDEEEEEFFFFFFGGGHMQ__T________________QLOUP[OGFEDDDD
  BBBBBBBCCDDDDDDDDDDDDDDDDDDDDEEEEEEEFFFGHILIIIJJKMS_____PLJJIHGGGHJFEEDDDDD
  BBBBBBBBCCCDDDDDDDDDDDDDDDDDDDDDEEEEEEEEEFFFGGGHIK_______LHGFFFFFEEEEDDDDDD
  BBBBBBBBBCCCCCDDDDDDDDDDDDDDDDDDDDDEEEEEEEEEFFFFGHJJR^QLIHGFFEEEEEEDDDDDDCC
  BBBBBBBBBBCCCCCCCDDDDDDDDDDDDDDDDDDDDDEEEEEEEEEFFFGIKNJLLGEEEEEEDDDDDDCCCCC
  BBBBBBBBBBBBCCCCCCCCCDDDDDDDDDDDDDDDDDDDDDDEEEEEEFGYLFFFEEEEEDDDDDCCCCCCCCC

I am willing to concede that this code is not typical Python code and
actually it could be an example of *bad* code, but I wanted a nice ASCII 
picture on my book ... :) Also, this prove that Python is not necessarily 
readable and easy to understand ;-)
I leave for the courageous reader to convert the previous algorithm to C and
measure the difference in speed ;-) 


About Python speed
---------------------------------------------------

The best way to improved the speed is to improve the algorithm; in
this sense Python is an ideal language since it allows you to test
many algorithms in an incredibly short time: in other words, the time you 
would spend fighting with the compiler in other languages, in Python
can be used to improve the algorithm. 
However in some cases, there is little to do: for instance, in many
problems one has to run lots of loops, and Python loops are horribly
inefficients as compared to C loops. In this case the simplest possibility 
is to use Psyco. Psyco is a specialing Python compiler written by Armin
Rigo. It works for 386 based processors and allows Python to run loops at 
C speed. Installing Psyco requires $0.00 and ten minutes of your time:
nine minutes to find the program, download it, and install it; one
minute to understand how to use it.

The following script explains both the usage and the advantages of Psyco:

 ::

  #<psyco1.py>

  import oopp,sys
  try: 
      import psyco
  except ImportError: 
      print "Psyco is not installed, sorry."
  else:
      n=1000000 # 1,000,000 loops

      without=oopp.loop_overhead(n) 
      print "Without Psyco:",without

      psyco.bind(oopp.loop_overhead) #compile the empty_loop

      with=oopp.loop_overhead(n) 
      print "With Psyco:",with

      print 'Speedup = %sx' % round(without/with,1)

  #</psyco1.py>

The output is impressive:

 ::

  Without Psyco: 1.3
  With Psyco: 0.02
  Speedup = 65.0x


Notice that repeating the test, you will obtain different speedups.
On my laptop, the speedup for an empty loop of 10,000,000 of
iteration is of the order of 70x, which is the same speed of a C loop, 
actually (I checked it). On my desktop, I have even found a speedup of
94x !

However, I must say that Psyco has some limitations. The problem is
the function call overhead. Psyco enhances the overhead and in some
programs it can even *worsen* the performance (this is way you should
*never* use the ``psyco.jit()`` function that wraps all the functions of
your program: you should only wrap the bottleneck loops). Generally speaking, 
you should expect a much more modest improvement, a factor of 2 or 3
is what I obtain usually in my programs.

Look at this second example, which essentially measure the function 
call overhead by invoking the ``do_nothing`` function:

 ::

  #<psyco2.py>

  import oopp
  try: 
      import psyco
  except ImportError:
      print "Psyco is not installed, sorry."
  else:
      n=10000 # 10,000 loops
   
      def do_nothing_loop():
          for i in xrange(n): oopp.do_nothing()

      print "Without Psyco:\n"
      oopp.with_timer(do_nothing_loop,n=5)() #50,000 times

      without=do_nothing_loop.CPUtime

      psyco.bind(do_nothing_loop) 
      print "With Psyco:\n"
      oopp.with_timer(do_nothing_loop,n=5)() #50,000 times

      with=do_nothing_loop.CPUtime

      print 'Speedup = %sx' % round(without/with,1)

  #</psyco2.py>

The output is less incredible:

 ::

  Without Psyco:
  Executing do_nothing_loop ... Real time: 138.2 ms  CPU time: 130.0 ms
  With Psyco:
  Executing do_nothing_loop ... Real time: 70.0 ms  CPU time: 68.0 ms
  Speedup = 1.9x


However, this is still impressive, if you think that you can double 
the speed of your program by adding *a line* of code! Moreover this
example is not fair since Psyco cannot improve very much the performance 
for loops invoking functions with a variable number of arguments. On the
other hand, it can do quite a lot for loops invoking functions with 
a fixed number of arguments. I have checked that you can easily reach 
speedups of 20x (!). The only disadvantage is that a program invoking
Psyco takes much more memory, than a normal Python program, but this
is not a problem for most applications in nowadays computers. 
Therefore, often Psyco
can save you the effort of going trough a C extension. In some cases,
however, there is no hope: I leave as an exercise for the reader
to check (at least the version 0.4.1 I am using now) is unable to
improve the performance on the Mandelbrot set example. This proves
that in the case bad code, there is no point in using a compiler:
you have to improve the algorithm first ! 

By the way, if you really want to go trough a C extension with a minimal
departure from Python, you can use Pyrex by Greg Ewing. A Pyrex program
is essentially a Python program with variable declarations that is
automatically converted to C code. Alternatively, you can inline
C functions is Python with ``weave`` of ... 
Finally, if you want to access C/C++ libraries, there tools
like Swig, Booster and others. 

Tracing functions
---------------------------------------------------------------------------

Typically, a script contains many functions that call themselves each
other when some conditions are satisfied. Also, typically during
debugging things do not work the way we would like and it is not
clear which functions are called, in which order they are called,
and which parameters are passed. The best way to know all these
informations, is to trace the functions in our script, and to write
all the relevant informations in a log file. In order to keep the
distinction between the traced functions and the original one, it
is convenient to collect all the wrapped functions in a separate dictionary. 
The tracing of a single function can be done with a closure 
like this:

::
  
  #<oopp.py>

  def with_tracer(function,namespace='__main__',output=sys.stdout, indent=[0]):
      """Closure returning traced functions. It is typically invoked
      trough an auxiliary function fixing the parameters of with_tracer."""
      def _(*args,**kw):
          name=function.__name__
          i=' '*indent[0]; indent[0]+=4 # increases indentation
          output.write("%s[%s] Calling '%s' with arguments\n" % 
                                                   (i,namespace,name))
          output.write("%s %s ...\n" % (i,str(args)+str(kw)))
          res=function(*args,**kw)
          output.write("%s[%s.%s] called with result: %s\n"
                           % (i,namespace,name,str(res)))
          indent[0]-=4 # restores indentation
          return res
      return _ # the traced function

  #</oopp.py>

Here is an example of usage:

  >>> from oopp import with_tracer
  >>> def fact(n): # factorial function
  ...     if n==1: return 1
  ...     else: return n*fact(n-1)
  >>> fact=with_tracer(fact)
  >>> fact(3)
  [__main__] Calling 'fact' with arguments
   (3,){} ...
      [__main__] Calling 'fact' with arguments
       (2,){} ...
          [__main__] Calling 'fact' with arguments
           (1,){} ...
          [__main__.fact] called with result: 1
      [__main__.fact] called with result: 2
  [__main__.fact] called with result: 6
  6

The logic behind ``with_tracer`` should be clear; the only trick is the
usage of a default list as a way to store a global indentation parameter.
Since ``indent`` is mutable, the value of ``indent[0]`` changes at any
recursive call of the traced function, resulting in a nested display.

Typically, one wants to trace all the functions in a given module; 
this can be done trough the following function:

 ::

  #<oopp.py>

  from types import *

  isfunction=lambda f: isinstance(f,(FunctionType,BuiltinFunctionType))

  def wrapfunctions(obj,wrapper,err=None,**options):
      "Traces the callable objects in an object with a dictionary"
      namespace=options.get('namespace',getattr(obj,'__name__',''))
      output=options.get('output',sys.stdout)
      dic=dict([(k,wrapper(v,namespace,output)) 
                for k,v in attributes(obj).items() if isfunction(v)])
      customize(obj,err,**dic)

  #</oopp.py>

Notice that 'wrapfunctions' accepts as first argument an object with 
a ``__dict__`` attribute (such as a module or a class) or with some 
explicit attributes (such as a simple object) and modifies it. One can 
trace a module as in this example:

 ::

  #<tracemodule.py>

  import oopp,random

  oopp.wrapfunctions(random,oopp.with_tracer) 

  random.random()

  #</tracemodule.py>

with output

 ::

  [random] Calling 'random' with arguments
  (){} ...
  -> 'random.random' called with result: 0.175450439202

The beauty of the present approach is its generality: 'wrap' can be
used to add any kind of capabilities to a pre-existing module.
For instance, we could time the functions in a module, with the
purpose of looking at the bottlenecks. To this aim, it is enough
to use a 'timer' nested closure:

An example of calling is  ``wrapfunction(obj,timer,iterations=1)``.

We may also compose our closures; for instance one could define a 
``with_timer_and_tracer`` closure:

  >>> with_timer_and_tracer=lambda f: with_timer(with_tracer(f))

It should be noticed that Python comes with a standard profiler
(in my system it is located in ``/usr/local/lib/python2.2/profile.py``)
that allows to profile a script or a module (try 
python /usr/local/lib/python2.2/profile.py oopp.py)

or 

  >>> import profile; help(profile)

and see the on-line documentation.

Tracing objects
----------------------------------------------------------------------

In this section, I will give a more sophisticated example, in which 
one can easily understand why the Python ability of changing methods and 
attributes during run-time, is so useful.
As a preparation to the real example, let me
first introduce an utility routine that allows the user
to add tracing capabilities to a given object.
Needless to say, this feature can be invaluable during debugging, or in trying
to understand the behaviour of a program written by others.

This routine is a little complex and needs some explanation.

1. The routine looks in the attributes of the object and try to access them.

2. If the access is possible, the routines looks for methods (methods
   are recognized trough the ``inspect.isroutine`` function in the
   standard library) and ignores regular attributes;

3. The routine try to override the original methods with improved ones,
   that possess tracing capabilities;

4. the traced method is obtained with the wrapping trick discussed before.

I give  now the real life example that I have anticipated before.
Improvements and elaborations of this example can be useful to the
professional programmer, too. Suppose you have an XML text you want
to parse. Python provides excellent support for this kind of operation
and various standard modules. One of the most common is the ``expat``
module (see the standard library documentation for more).

If you are just starting using the module, it is certainly useful
to have a way of tracing its behaviour; this is especially true if
you you find some unexpected error during the parsing of a document
(and this may happens even if you are an experience programmer ;-).

The tracing routine just defined can be used to trace the parser, as
it is exemplified in the following short script:

 ::

  #<expat.py>

  import oopp, xml.parsers.expat, sys

  # text to be parsed
  text_xml="""\
  <?xml version="1.0"?>
  <parent id="dad">
  <child name="kid">Text goes here</child>
  </parent>"""

  # a few do nothing functions
  def start(*args): pass
  def end(*args): pass
  def handler(*args): pass

  # a parser object
  p = xml.parsers.expat.ParserCreate()

  p.StartElementHandler = start
  p.EndElementHandler = end
  p.CharacterDataHandler = handler

  #adds tracing capabilities to p
  oopp.wrapfunctions(p,oopp.with_tracer, err=sys.stdout)

  p.Parse(text_xml)

  #</expat.py>

The output is:

 ::

  Error: SetBase cannot be set
  Error: Parse cannot be set
  Error: ParseFile cannot be set
  Error: GetBase cannot be set
  Error: SetParamEntityParsing cannot be set
  Error: ExternalEntityParserCreate cannot be set
  Error: GetInputContext cannot be set
  [] Calling 'start' with arguments
   (u'parent', {u'id': u'dad'}){} ...
  [.start] called with result: None
  [] Calling 'handler' with arguments
   (u'\n',){} ...
  [.handler] called with result: None
  [] Calling 'start' with arguments
   (u'child', {u'name': u'kid'}){} ...
  [.start] called with result: None
  [] Calling 'handler' with arguments
   (u'Text goes here',){} ...
  [.handler] called with result: None
  [] Calling 'end' with arguments
   (u'child',){} ...
  [.end] called with result: None
  [] Calling 'handler' with arguments
   (u'\n',){} ...
  [.handler] called with result: None
  [] Calling 'end' with arguments
   (u'parent',){} ...
  [.end] called with result: None


This is a case where certain methods cannot be managed with 
``getattr/setattr``, because they are internally coded in C: this
explain the error messages at the beginning. I leave as an exercise 
for the reader to understand the rest ;-)

Inspecting functions
----------------------------------------------------------------------

Python wonderful introspection features are really impressive when applied
to functions. It is possible to extract a big deal of informations
from a Python function, by looking at its associated *code object*.
For instance, let me consider  my, ``do_nothing`` function: its associated
code object can be extracted from the ``func_code`` attribute:

  >>> from oopp import *
  >>> co=do_nothing.func_code # extracts the code object
  >>> co
  <code object do_nothing at 0x402c5d20, file "oopp.py", line 48>
  >>> type(co)
  <type 'code'>

The code object is far being trivial: the docstring says it all:

  >>> print type(co).__doc__
  code(argcount, nlocals, stacksize, flags, codestring, constants, names,
        varnames, filename, name, firstlineno, lnotab[, freevars[, cellvars]])
  Create a code object.  Not for the faint of heart.

In the case of my ``do_nothing`` function, the code object 
possesses the following attributes:

  >>> print pretty(attributes(co))
  co_argcount = 0
  co_cellvars = ()
  co_code = dS
  co_consts = (None,)
  co_filename = oopp.py
  co_firstlineno = 48
  co_flags = 15
  co_freevars = ()
  co_lnotab =
  co_name = do_nothing
  co_names = ()
  co_nlocals = 2
  co_stacksize = 1
  co_varnames = ('args', 'kw')

Some of these arguments are pretty technical and implementation dependent;
however, some of these are pretty clear and useful:

 - co_argcount is the total number of arguments
 - co_filename is the name of the file where the function is defined
 - co_firstlineno is the line number where the function is defined
 - co_name is the name of the function
 - co_varnames are the names

The programmer that it is not a "faint of heart" can study
the built-in documentation on code objects; s/he should try

 ::

  for k,v in attributes(co).iteritems(): print k,':',v.__doc__,'\n'

 # does not work now !!

 ::

  add=[lambda x,i=i: x+i for i in range(10)]

  >>> def f(y):
  ...    return lambda x: x+y
  ...
  >>> f(1).func_closure #closure cell object
  (<cell at 0x402b56bc: int object at 0x811d6c8>,)

func.defaults, closure, etc.

#how to extract (non-default) arguments as help does.

print (lambda:None).func_code.co_filename

One cannot change the name of a function:

  >>> def f(): pass
  ...
  >>> f.__name__='ciao' # error
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
  TypeError: readonly attribute

However, one can create a copy with a different name:

 ::

  #<oopp.py>

  def copyfunc(f,newname=None): # works under Python 2.3
      if newname is None: newname=f.func_name # same name
      return FunctionType(f.func_code, globals(), newname, 
                           f.func_defaults, f.func_closure)

  #</oopp.py>

  >>> copyfunc(f,newname='f2')
  <function f2 at 0x403e233c>

Notice that the ``copy`` module would not do the job:

  >>> import copy
  >>> copy.copy(f) # error
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
    File "/usr/local/lib/python2.3/copy.py", line 84, in copy
      y = _reconstruct(x, reductor(), 0)
    File "/usr/local/lib/python2.3/copy_reg.py", line 57, in _reduce
      raise TypeError, "can't pickle %s objects" % base.__name__
  TypeError: can't pickle function objects


THE BEAUTY OF OBJECTS
===========================================================================

In this chapter I will show how to define generic objects in Python, and
how to manipulate them.

User defined objects
--------------------------------------------------------------------------

In Python, one cannot directly modify methods and attributes of built-in 
types, since this would be a potentially frightening source of bugs. 
Imagine for instance of changing the sort method of a list and invoking an 
external module expecting the standard sort: all kind of hideous outcome 
could happen. 

Nevertheless, in Python, as in all OOP languages, the user can define 
her own kind of objects, customized to satisfy her needs. In order to
define a new object, the user must define the class of the objects she 
needs. The simplest possible class is a do-nothing class:

 ::

  #<oopp.py>

  class Object(object):
      "A convenient Object class"

  #</oopp.py>

Elements of the ``Object`` class can be created (instantiated) quite
simply:
 
  >>> from oopp import Object
  >>> obj1=Object()
  >>> obj1
  <oopp.Object object at 0x81580ec>
  >>> obj2=Object()
  obj2
  <object.Object object at 0x8156704>

Notice that the hexadecimal number 0x81580ec is nothing else that the
unique object reference to ``obj1``

  >>> hex(id(obj1))
   '0x81580ec'

whereas 0x8156704 is the object reference of ``obj2``:

  >>> hex(id(obj2))
  '0x8156704'

However, at this point ``obj1`` and ``obj2`` are generic 
doing nothing objects . Nevertheless, they have 
at least an useful attribute, the class docstring:

  >>> obj1.__doc__ #obj1 docstring
  'A convenient Object class'
  >>> obj2.__doc__ # obj2 docstring: it's the same
  'A convenient Object class'

Notice that the docstring is associate to the class and therefore all
the instances share the same docstring, unless one explicitly assigns
a different docstring to some instance. ``__doc__``
is a class attribute (or a static attribute for readers familiar with the
C++/Java terminology) and the expression is actually syntactic sugar for 

  >>> class Object(object): # with explicit assignement to __doc__
  ...    __doc__ = "A convenient Object class"


Since instances of 'Object' can be modified, I can transform them in
anything I want. For instance, I can create a simple clock:

  >>> myclock=Object()
  >>> myclock
  <__main__.Object object at 0x8124614>

A minimal clock should at least print the current time 
on the system. This is given by the ``get_time`` function
we defined in the first chapter. We may "attach" that function 
to our clock as follows:

  >>> import oopp
  >>> myclock.get_time=oopp.get_time
  >>> myclock.get_time # this is a function, not a method
  <function get_time at 0x815c40c>
 
In other words, we have converted the ``oopp.get_time`` function to a
``get_time`` function of the object ``myclock``. The procedure works

  >>> myclock.get_time()
  '15:04:57'

but has a disadvantage: if we instantiate another
clock

  >>> from oopp import Object
  >>> otherclock=Object()

the other clock will ``not`` have a get_time method:

  >>> otherclock.get_time() #first attempt; error
  AttributeError: 'Object' object has no attribute 'get_time'

Notice instead that the docstring is a *class attribute*, i.e. it
is defined both for the class and *all instances* of the class,
therefore even for ``otherclock``:

  >>> Object.__doc__
  'A convenient Object class' 
  >>> otherclock.__doc__
  'A convenient Object class'

We would like to convert the ``get_time`` function to a 
``get_time`` method for the *entire* class 'Object', i.e. for all its
instances. Naively, one would be tempted to write the following:

  >>> Object.get_time=oopp.get_time

However this would not work:

  >>> otherclock.get_time() #second attempt; still error
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
  TypeError: oopp.get_time() takes no arguments (1 given)

This error message is something that all Python beginners encounter
(and sometimes even non-beginners ;-). The solution is to introduce
an additional argument:

  >>> Object.get_time=lambda self : oopp.get_time()
  >>> otherclock.get_time # this is method now, not a function
  <bound method Object.<lambda> of <__main__.Object object at 0x815881c>>
  >>> otherclock.get_time() #third attempt
  '15:28:41'

Why this works ? The explanation is the following:
when Python encounters an expression of the form
``objectname.methodname()`` it looks if there is a already a method 
*attached* to the object:

    a. if yes it invokes it with no arguments 
       (this is why our first example worked);
    b. if not it looks at the class of the object; if there is a method 
       bound to the class it invokes that method *by passing the 
       object as first argument*.

When we invoked ``otherclock.get_time()`` in our second attempt, Python
found that the function ``get_time`` was defined at the class level, 
and sent it the ``otherclock`` object as first argument: however ``get_time`` 
was bind to ``func_get_time``, which is function with *no* arguments: whence
the error message. The third attempt worked since, thanks to the 
lambda function trick, the ``get_time`` function has been converted to
a function accepting a first argument.

Therefore that's the rule: in Python, one can define methods 
at the class level, provided one explitely introduces a first argument
containing the object on which the method is invoked.

This first argument is traditionally called ``self``; the name 'self' is not
enforced, one could use any other valid Python identifier, however the
convention is so widespread that practically everybody uses it;
pychecker will even raise a warning in the case you don't follow the
convention.

I have just shown one the most interesting features of Python, its
*dynamicity*: you can create the class first and add methods to it later.
That logic cannot be followed in typical compiled language as C++. On the
other hand, one can also define methods in a static, more traditional way:

 ::

  #<clock1.py>

  "Shows how to define methods inside the class (statically)"

  import oopp

  class Clock(object):
      'Clock class; version 0.1'
      def get_time(self): # method defined inside the class
          return oopp.get_time()

  myclock=Clock() #creates a Clock instance
  print myclock.get_time() # print the current time

  #</clock1.py>

In this case we have defined the ``get_time`` method inside the class as a
normal function with an explicit first argument called self; this is
entirely equivalent to the use of a lambda function.

The syntax ``myclock.get_time()`` is actually syntactic sugar for
``Clock.get_time(myclock)``.

In this second form, it is clear the ``get_time`` is really "attached" to the
class, not to the instance.

Objects have static methods and classmethods
-----------------------------------------------------------------------------

 .. line-block::

  *There should be one--and preferably only one--obvious way to do it*
  -- Tim Peters, *The Zen of Python*.


For any rule, there is an exception, and despite the Python's motto    
there are many ways to define methods in classes. The way I presented
before was the obvious one before the Python 2.2 revolution; however, 
nowadays there is another possibility that, even if less obvious, has the 
advantage of some elegance (and it is also slightly more efficient too, even if
efficiency if never a primary concern for a Python programmer).
We see that the first argument in the ``get_time`` method is useless,
since the time is computed from the ``time.asctime()`` function which
does not require any information about the object that is calling
it. This waste is ugly, and since according to the Zen of Python
 
    *Beautiful is better than ugly.*

we should look for another way. The solution is to use a *static method*:
when a static method is invoked, the calling object is *not* implicitly passed
as first argument. Therefore we may use a normal function with no additional
first argument to define the ``get_time`` method:

 ::

  #<oopp.py>

  class Clock(object):
      'Clock with a staticmethod'
      get_time=staticmethod(get_time)

  #</oopp.py>

Here is how it works:

  >>> from oopp import Clock
  >>> Clock().get_time() # get_time is bound both to instances
  '10:34:23'
  >>> Clock.get_time() # and to the class
  '10:34:26'

The staticmethod idiom converts the lambda function to a
static method of the class 'Clock'. Notice that one can avoid the
lambda expression and use the (arguably more Pythonic) idiom

 ::

      def get_time()
          return oopp.get_time()
      get_time=staticmethod(oopp.get_time)

as the documentation suggests:

  >>> print  staticmethod.__doc__
  staticmethod(function) -> method
  Convert a function to be a static method.
  A static method does not receive an implicit first argument.
  To declare a static method, use this idiom:
     class C:
         def f(arg1, arg2, ...): ...
         f = staticmethod(f)
  It can be called either on the class (e.g. C.f()) or on an instance
  (e.g. C().f()).  The instance is ignored except for its class.
  Static methods in Python are similar to those found in Java or C++.
  For a more advanced concept, see the classmethod builtin.

At the present the notation for static methods is still rather ugly,
but it is expected to improve in future versions of Python (probably
in Python 2.4). Documentation for static methods can
be found in Guido's essay and in the PEP.. : however this is intended for
developers.

As the docstring says, static methods are also "attached" to the
class and may be called with the syntax ``Clock.get_time()``.

A similar remark applies for the so called *classmethods*:

  >>> print classmethod.__doc__
  classmethod(function) -> method
  Convert a function to be a class method.
  A class method receives the class as implicit first argument,
  just like an instance method receives the instance.
  To declare a class method, use this idiom:
  class C:
      def f(cls, arg1, arg2, ...): ...
      f = classmethod(f)
  It can be called either on the class (e.g. C.f()) or on an instance
  (e.g. C().f()).  The instance is ignored except for its class.
  If a class method is called for a derived class, the derived class
  object is passed as the implied first argument.
  Class methods are different than C++ or Java static methods.
  If you want those, see the staticmethod builtin.


#When a regular method is invoked, a reference to the calling object is 
#implicitely passed as first argument; instead, when a static method is 
#invoked, no reference to the calling object is passed.

As the docstring says, classmethods are convenient when one wants to pass 
to a method the calling *class*, not the calling object. Here there is an 
example:

  >>> class Clock(object): pass
  >>> Clock.name=classmethod(lambda cls: cls.__name__)
  >>> Clock.name() # called by the class
  'Clock'
  >>> Clock().name() # called by an instance
  'Clock'

Notice that classmethods (and staticmethods too) 
can only be attached to classes, not to objects:

  >>> class Clock(object): pass
  >>> c=Clock()
  >>> c.name=classmethod(lambda cls: cls.__name__) 
  >>> c.name() #error
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
  TypeError: 'classmethod' object is not callable

gives a TypeError. The reason is that classmethods and staticmethods 
are implemented 
trough *attribute descriptors*. This concept will be discussed in detail in a 
forthcoming in chapter 6.

Notice that classmethods are not proving any fundamental feature, since
one could very well use a normal method and retrieve the class with 
``self.__class__`` as we did in the first chapter. 
Therefore, we could live without (actually, I think they are a non-essential
complication to the language).
Nevertheless, now that we have them, we can use them, since
they come handy in various circumstances, as we will see in the following.

Objects have their privacy
---------------------------------------------------------------------------

In some situations, it is convenient to give to the developer
some information that should be hided to the final user. To this
aim Python uses private names (i.e. names starting with a single
underscore) and private/protected attributes (i.e. attributes starting with
a double underscore). 


Consider for instance the following script:

 ::

  #<privacy.py>

  import time

  class Clock(object):
      __secret="This Clock is quite stupid."

  myclock=Clock()
  try: print myclock.__secret
  except Exception,e: print "AttributeError:",e

  #</privacy.py>

The output of this script is

 ::

  AttributeError: 'Clock' object has no attribute '__secret'

Therefore, even if the Clock object *does* have a ``__secret`` attribute, 
the user cannot access it ! In this way she cannot discover that
actually "This Clock is quite stupid."

In other programming languages, attributes like ``__secret`` are
called "private" attributes. However, in Python private attributes
are not really private and their secrets can be accessed with very
little effort. 

First of all, we may notice that ``myclock`` really contains a secret
by using the builtin function ``dir()``:

 ::

   dir(myclock)
   ['_Clock__secret', '__class__', '__delattr__', '__dict__', '__doc__', 
    '__getattribute__', '__hash__', '__init__', '__module__', '__new__', 
    '__reduce__', '__repr__', '__setattr__', '__str__', '__weakref__']

We see that the first attribute of myclock is '_Clock__secret``, 
which we may access directly:

 ::

   print myclock._Clock__secret
   This clock is quite stupid.

We see here the secret of private variables in Python: the *name mangling*.
When Python sees a name starting with two underscores (and not ending
with two underscores, otherwise it would be interpreted as a special
attribute), internally it manage it as ``_Classname__privatename``.
Notice that if 'Classname' begins with underscores, the leading underscores
are stripped in such a way to guarantee that the private name starts with
only *one* underscore. For instance, the '__secret' private attribute 
of classes such as 'Clock', '_Clock', '__Clock', '___Clock', etc. is
mangled to '_Clock__secret'.

Private names in Python are *not* intended to keep secrets: they
have other uses. 

1. On one hand, private names are a suggestion to the developer. 
   When the Python programmer sees a name starting with one or two 
   underscores in a program written by others, she understands
   that name should not be of concern for the final user, but it 
   only concerns the internal implementation.

2. On the other hand, private names are quite useful in class
   inheritance, since they provides safety with respect to the overriding
   operation. This point we will discussed in the next chapter.

3. Names starting with one (or more) underscores are not imported by the 
   statement ``from module import *``

Remark: it makes no sense to define names with double underscores
outside classes, since the name mangling doesn't work in this case.
Let me show an example:

  >>> class Clock(object): __secret="This Clock is quite stupid"
  >>> def tellsecret(self): return  self.__secret
  >>> Clock.tellsecret=tellsecret
  >>> Clock().tellsecret() #error
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
    File "<stdin>", line 2, in tellsecret
  AttributeError: 'Clock' object has no attribute '__secret'

The explanation is that since ``tellsecret()`` is defined outside the class,
``__secret`` is not expanded to ``_Clock__secret`` and therefore cannot be
retrieved, whereas

  >>> class Clock(object): 
  ...     __secret="This Clock is quite stupid"
  ...     def tellsecret(self): return self.__secret
  >>> Clock().tellsecret()
  This Clock is quite stupid

will work. In other words, private variables are attached to classes,
not objects.

Objects have properties
-------------------------------------------------------------------------------

In the previous section we have shown that private variables are of
little use for keeping secrets: if a developer really wants to restrict 
the access to some methods or attributes, she has to resort to
*properties*.

Let me show an example:

 ::

  #<secret.py>

  import oopp

  class Clock(object): 
      'Clock class with a secret'
    
      you_know_the_pw=False #default
                   
      def give_pw(self,pw):
          """Check if your know the password. For security, one should crypt
          the password."""
          self.you_know_the_pw=(pw=="xyz")
        
      def get_secret(self):
          if self.you_know_the_pw:
              return "This clock doesn't work."
          else:
              return "You must give the right password to access 'secret'"
        
      secret=property(get_secret)
 
  c=Clock()
  print c.secret # => You must give the right password to access 'secret'
  c.give_pw('xyz') # gives the right password
  print c.secret # => This clock doesn't work.
  print Clock.secret # => <property object at 0x814c1b4>

  #</secret.py>

In this script, one wants to restrict the access to the attribute
'secret', which can be accessed only is the user provide the
correct password. Obviously, this example is not very secure,
since I have hard coded the password 'xyz' in the source code,
which is easily accessible. In reality, one should crypt the
password a perform a more sophisticated test than the trivial
check ``(pw=="xyz")``; anyway, the example is only intended to
shown the uses of properties, not to be really secure.

The key action is performed by the descriptor class ``property`` that
converts the function ``get_secret`` in a property object. Additional
informations on the usage of ``property`` can be obtained from the
docstring:

  >>> print property.__doc__
  property(fget=None, fset=None, fdel=None, doc=None) -> property attribute
  fget is a function to be used for getting an attribute value, and likewise
  fset is a function for setting, and fdel a function for del'ing, an
  attribute.  Typical use is to define a managed attribute x:
  class C(object):
      def getx(self): return self.__x
      def setx(self, value): self.__x = value
      def delx(self): del self.__x
      x = property(getx, setx, delx, "I'm the 'x' property.")

Properties are another example of attribute descriptors. 

Objects have special methods
---------------------------------------------------------------------------

From the beginning, we stressed that objects have special attributes that
may turn handy, as for instance the docstring ``__doc__`` and the class
name attribute ``__class__``. They have special methods, too.

With little doubt, the most useful special method is the ``__init__``
method, that *initializes* an object right after its creation. ``__init__``
is typically used to pass parameters to *object factories*. Let me an
example with geometric figures:

 ::

  #<oopp.py>
    
  class GeometricFigure(object): #an example of object factory
      """This class allows to define geometric figures according to their
      equation in the cartesian plane. It will be extended later."""
      def __init__(self,equation,**parameters):
          "Specify the cartesian equation of the object and its parameters"
          self.eq=equation
          self.par=parameters
          for k,v in self.par.items(): #replaces the parameters in the equation
              self.eq=self.eq.replace(k,str(v))
          self.contains=eval('lambda x,y : '+self.eq) 
          # dynamically creates the function 'contains'

  #</oopp.py>

Here it is how it works:

  >>> from oopp import *
  >>> disk=GeometricFigure('(x-x0)**2+(y-y0)**2 <= r**2', x0=0,y0=0,r=5)
  >>> # creates a disk of radius 5 centered in the origing
  >>> disk.contains(1,2) #asks if the point (1,2) is inside the disk
  True
  >>> disk.contains(4,4) #asks if the point (4,4) is inside the disk
  False


Let me continue the section on special methods with some some observations on
``__repr__`` and ``__str__``.Notice that  I
will not discuss all the subtleties; for a thought discussion, see the
thread "Using __repr__ or __str__" in c.l.p. (Google is your friend).
The following discussion applies to new style classes, old style classes 
are subtly different; moreover.

When one  writes

  >>> disk
  <oopp.GeometricFigure instance at 0x81b496c>

one obtains the *string representation* of the object. Actually, the previous
line is syntactic sugar for

  >>> print repr(disk)
  <oopp.GeometricFigure instance at 0x81b496c>

or

  >>> print disk.__repr__()
  <oopp.GeometricFigure instance at 0x81b496c>

The ``repr`` function extracts the string representation from the
the special method ``__repr__``, which can be redefined in order to 
have objects pretty printed. Notice that ``repr`` is conceptually
different from the ``str`` function that controls the output of the ``print`` 
statement. Actually, ``print o`` is syntactic sugar for ``print str(o)``
which is sugar for ``print o.__str__()``.

If for instance we define 

 ::

  #<oopp.py>

  class PrettyPrinted(object):
      formatstring='%s' # default
      def __str__(self):
          """Returns the name of self in quotes, possibly formatted via 
          self.formatstring. If self has no name, returns the name 
          of its class in angular brackets.""" 
          try: #look if the selfect has a name 
             name="'%s'" % self.__name__ 
          except AttributeError: #if not, use the name of its class
              name='<%s>' % type(self).__name__
          if hasattr(self,'formatstring'):
              return self.formatstring % name
          else: 
              return name

  #</oopp.py>

then we have

  >>> from oopp import PrettyPrinted
  >>> o=PrettyPrinted() # o is an instance of PrettyPrinted
  >>> print o #invokes o.__str__() which in this case returns o.__class__.name
  <PrettyPrinted>

whereas
  
  >>> o # i.e. print repr(o)
  <oopp.PrettyPrinted object at 0x400a006c>

However, in most cases ``__repr__`` and ``__str__`` gives the same
output, since if ``__str__`` is not explicitely defined it defaults
to ``__repr__``. Therefore, whereas modifying ``__str__`` 
does not change ``__repr__``, modifying ``__repr__`` changes ``__str__``,
if ``__str__`` is not explicitely given:

 ::

  #<fairytale1.py>

  "__repr__ can also be a regular method, not a classmethod"

  class Frog(object):
      attributes="poor, small, ugly"
      def __str__(self):
          return "I am a "+self.attributes+' '+self.__class__.__name__

  class Prince(object):
      attributes='rich, tall, beautiful'
      def __str__(self):
          return "I am a "+self.attributes+' '+self.__class__.__name__

  jack=Frog(); print repr(jack),jack
  charles=Prince(); print repr(charles),charles  

  #</fairytale1.py>

The output of this script is:

 ::

   <Frog object at 0x81866ec> I am a poor, small, ugly Frog
   <Prince object at 0x818670c> I am a rich, tall, beautiful Prince

for jack and charles respectively. 

``__str__``  and ``__repr__`` are also called by the formatting
operators "%s" and "%r".

Notice that  i) ``__str__`` can be most naturally
rewritten as a class method; ii) Python is magic:

 ::

  #<fairytale2.py>
    
  """Shows two things: 
      1) redefining __repr__ automatically changes the output of __str__
      2) the class of an object can be dinamically changed! """

  class Frog(object):
      attributes="poor, small, ugly"
      def __repr__(cls):
          return "I am a "+cls.attributes+' '+cls.__name__
      __repr__=classmethod(__repr__)

  class Prince(object):
      attributes='rich, tall, beautiful'
      def __repr__(cls):
          return "I am a "+cls.attributes+' '+cls.__name__
      __repr__=classmethod(__repr__)

  def princess_kiss(frog):
        frog.__class__=Prince

  jack=Frog()
  princess_kiss(jack)
  print jack # the same as repr(jack)

  #</fairytale2.py>

Now the output for jack is "I am a rich, tall, beautiful Prince" !
In Python you may dynamically change the class of an object!!

Of course, this is a feature to use with care ;-)

There are many others special methods, such as __new__, __getattr__,
__setattr__, etc. They will be discussed in the next chapters, in
conjunction with inheritance.

Objects can be called, added, subtracted, ...
---------------------------------------------------------------------------

Python provides a nice generalization of functions, via the concept
of *callable objects*. A callable object is an object with a ``__call__``
special method. They can be used to define "functions" that remember
how many times they are invoked:

 ::

  #<call.py>

  class MultiplyBy(object):
      def __init__(self,n):
          self.n=n
          self.counter=0
      def __call__(self,x):
          self.counter+=1
          return self.n*x

  double=MultiplyBy(2)
  res=double(double(3)) # res=12
  print "double is callable: %s" % callable(double)
  print "You have called double %s times." % double.counter

  #</call.py>

With output

 ::

  double is callable:  True
  You have called double 2 times.

The script also show that callable objects (including functions) 
can be recognized with the ``callable`` built-in function.

Callable object solves elegantly the problem of having "static" variables
inside functions (cfr. with the 'double' example in chapter 2).
A class with a ``__call__`` method can be used to generate an entire
set of customized "functions". For this reason, callable objects are 
especially useful in the conjunction with object factories. Let me show 
an application to my factory of geometric figures:

 ::

  #<oopp.py>

  class Makeobj(object):
      """A factory of object factories. Makeobj(cls) returns instances
       of cls"""
      def __init__(self,cls,*args):
          self.cls=cls
          self.args=args
      def __call__(self,**pars):
          return self.cls(*self.args,**pars)

  #</oopp.py>

  #<factory.py>

  from oopp import Makeobj,GeometricFigure

  makedisk=Makeobj(GeometricFigure,'(x-x0)**2+(y-y0)**2<r**2')
  makesquare=Makeobj(GeometricFigure,'abs(x-x0)<L and abs(y-y0)<L')
  disk=makedisk(x0=0,y0=0,r=10) # make a disk of radius 10
  square=makesquare(x0=0,y0=0,L=20) # make a disk of side 10

  print disk.contains(9,9) # => False
  print square.contains(9,9) # => True
  #etc.

  #</factory.py>

This factory generates callable objects, such as ``makedisk`` and
``makesquare`` that returns geometric objects. It gives a nicer interface
to the object factory provided by 'GeometricFigure'. 

Notice that the use of the expression ``disk.contains(9,9)`` in order to
know if the point of coordinates (9,9) is contained in the disk, it is
rather inelegant: it would be much better to be able to ask if 
``(9,9) in disk``. This is possibile, indeed: and the secrets is to
define the special method ``__contains__``. This is done in the next
example, that I think give a good taste of the beauty of objects

 ::

  #<funnyformatter.py>

  from oopp import Makeobj

  Nrow=50; Ncol=78
    
  class GeometricFigure(object):
      """This class allows to define geometric figures according to their
      equation in the cartesian plane. Moreover addition and subtraction
      of geometric figures are defined as union and subtraction of sets."""
      def __init__(self,equation,**parameters):
          "Initialize "
          self.eq=equation
          self.par=parameters
          for (k,v) in self.par.items(): #replaces the parameters
              self.eq=self.eq.replace(k,str(v))
          self.contains=eval('lambda x,y : '+self.eq)
      def combine(self,fig,operator):
          """Combine self with the geometric figure fig, using the
          operators "or" (addition) and "and not" (subtraction)"""
          comboeq="("+self.eq+")"+operator+"("+fig.eq+")"
          return GeometricFigure(comboeq)
      def __add__(self,fig):
          "Union of sets"
          return self.combine(fig,' or ')
      def __sub__(self,fig):
          "Subtraction of sets"
          return self.combine(fig,' and not')
      def __contains__(self,point): #point is a tuple (x,y)
          return self.contains(*point)

  makedisk=Makeobj(GeometricFigure,'(x-x0)**2/4+(y-y0)**2 <= r**2')
  upperdisk=makedisk(x0=38,y0=7,r=5)
  smalldisk=makedisk(x0=38,y0=30,r=5)
  bigdisk=makedisk(x0=38,y0=30,r=14)

  def format(text,shape):
      "Format the text in the shape given by figure"
      text=text.replace('\n',' ')
      out=[]; i=0; col=0; row=0; L=len(text)
      while 1:
          if (col,row) in shape:
              out.append(text[i]); i+=1
              if i==L: break
          else:
              out.append(" ")
          if col==Ncol-1:
              col=0; out.append('\n') # starts new row
              if row==Nrow-1: row=0   # starts new page
              else: row+=1
          else: col+=1 
      return ''.join(out)

  composition=bigdisk-smalldisk+upperdisk
  print format(text='Python Rules!'*95,shape=composition)

  #</funnyformatter.py>

I leave as an exercise for the reader to understand how does it work and to
play with other geometric figures (he can also generate them trough the
'Makeobj' factory). I think it is nicer to show its output:

  ::
                                                                     
                                                                              
                                     Pyt                                      
                                hon Rules!Pyt                                 
                              hon Rules!Python                                
                             Rules!Python Rules!                              
                             Python Rules!Python                              
                             Rules!Python Rules!P                             
                             ython Rules!Python                               
                             Rules!Python Rules!                              
                              Python Rules!Pyth                               
                                on Rules!Pyth                                 
                                     on                                       
                                                                              
                                                                              
                                     Rul                                      
                            es!Python Rules!Pytho                             
                        n Rules!Python Rules!Python R                         
                     ules!Python Rules!Python Rules!Pyth                      
                   on Rules!Python Rules!Python Rules!Pyth                    
                 on Rules!Python Rules!Python Rules!Python R                  
               ules!Python Rules!Python Rules!Python Rules!Pyt                
              hon Rules!Python Rules!Python Rules!Python Rules!               
             Python Rules!Python Rules!Python Rules!Python Rules              
            !Python Rules!Python Rule   s!Python Rules!Python Rul             
            es!Python Rules!Pyth             on Rules!Python Rule             
           s!Python Rules!Pyth                 on Rules!Python Rul            
           es!Python Rules!Py                   thon Rules!Python             
           Rules!Python Rules                   !Python Rules!Pyth            
          on Rules!Python Ru                     les!Python Rules!P           
           ython Rules!Python                    Rules!Python Rule            
           s!Python Rules!Pyt                   hon Rules!Python R            
           ules!Python Rules!P                 ython Rules!Python             
            Rules!Python Rules!P             ython Rules!Python R             
            ules!Python Rules!Python    Rules!Python Rules!Python             
              Rules!Python Rules!Python Rules!Python Rules!Pytho              
              n Rules!Python Rules!Python Rules!Python Rules!Py               
               thon Rules!Python Rules!Python Rules!Python Rul                
                 es!Python Rules!Python Rules!Python Rules!P                  
                   ython Rules!Python Rules!Python Rules!P                    
                     ython Rules!Python Rules!Python Rul                      
                        es!Python Rules!Python Rules!                         
                            Python Rules!Python R                             
                                     ule                                      
                                                                              
                                                                              
                                     s!

Remark.

Unfortunately, "funnyformatter.py" does not reuse old code: in spite of the
fact that we already had in our library the 'GeometricFigure' class, with 
an "__init__" method that is exactly the same of the "__init__" method in 
"funnyformatter.py", we did not reuse that code. We simply did a cut
and paste. This means that if we later find a bug in the ``__init__`` method,
we will have to fix it twice, both in the script and in the library. Also,
if we plan to extend the method later, we will have to extend it twice.
Fortunately, this nasty situation can be avoided: but this requires the
power of inheritance.


THE POWER OF CLASSES
==========================================================================

This chapter is devoted to the concept of class inheritance. I will discuss
single inheritance, cooperative methods, multiple inheritance and more.


The concept of inheritance
----------------------------------------------------------------------

Inheritance is perhaps the most important basic feature in OOP, since it
allows the reuse and incremental improvement of old code.
To show this point, let me come back to one of the 
examples I have introduced in the last chapter, 'fairytale1.py' script, 
where I defined the classes 'Frog' and 'Prince' as

 ::

  class Frog(object):
      attributes="poor, small, ugly"
      def __str__(self):
          return "I am a "+self.attributes+' '+self.__class__.__name__

  class Prince(object):
      attributes='rich, tall, beautiful'
      def __str__(self):
          return "I am a "+self.attributes+' '+self.__class__.__name__

We see that the way we followed here was very bad since:

1. The ``__str__`` method is duplicated both in Frog and in Prince: that
   means that if we find a bug a later, we have to fix it twice!

2. The ``__str__`` was already defined in the PrettyPrinted class (actually 
   more elegantly), therefore we have triplicated the work and worsened the 
   situation!

This is very much against the all philosophy of OOP:

  *never cut and paste!*

We should *reuse* old code, not paste it!

The solution is *class inheritance*. The idea behind inheritance is to 
define new classes as subclasses of a *parent* classes, in such a way that 
the *children* classes possess all the features of the parents. 
That means that we do not need to 
redefine the properties of the parents explicitely.
In this example, we may derive both 'Frog' and 'Prince' from
the 'PrettyPrinted' class, thus providing to both 'Frog' and  'Prince'
the ``PrettyPrinted.__str__`` method with no effort:

  >>> from oopp import PrettyPrinted
  >>> class Frog(PrettyPrinted): attributes="poor, small, ugly"
  ...
  >>> class Prince(PrettyPrinted):  attributes="rich, tall, beautiful"
  ...
  >>> print repr(Frog()), Frog()
  <__main__.Frog object at 0x401cbeac> <Frog>
  >>> print Prince()
  >>> print repr(Prince()),Prince()
  <__main__.Prince object at 0x401cbaac> <Prince>

Let me show explicitly that both 'Frog' and 'Prince' share the 
'PrettyPrinted.__str__' method:

  >>> id(Frog.__str__) # of course, YMMV
  1074329476
  >>> id(Prince.__str__)
  1074329476
  >>> id(PrettyPrinted.__str__)
  1074329476

The method is always the same, since the object reference is the same
(the precise value of the reference is not guaranteed to be  1074329476,
however!).

This example is good to show the first advantage of inheritance: 
*avoiding duplication of code*.
Another advantage of inheritance, is *extensibility*: one can very easily
improve existing code. For instance, having written the ``Clock`` class once, 
I can reuse it in many different ways. for example I can build a ``Timer`` 
to be used for benchmarks. It is enough to reuse the function ``with_timer``
introduced in the first chapter (functions are good for reuse of code, too ;):

 ::

  #<oopp.py>

  class Timer(Clock):
      "Inherits the get_time staticmethod from Clock"
      execute=staticmethod(with_timer)
      loop_overhead=staticmethod(loop_overhead)


  #</oopp.py>

Here there is an example of application:

  >>> from oopp import Timer
  >>> Timer.get_time()
  '16:07:06'

Therefore 'Timer' inherits 'Clock.get_time'; moreover it has the additional 
method ``execute``:

  >>> def square(x): return x*x
  ...
  >>> Timer.execute(square,n=100000)(1)
  executing square ...
    Real time: 0.01 ms  CPU time: 0.008 ms

The advantage of putting the function ``execute`` in a class is that
now we may *inherit* from that class and improve out timer *ad
libitum*.

Inheritance versus run-time class modifications
-------------------------------------------------------------------------

Naively, one could think of substituting inheritance with run-time 
modification of classes, since this is allowed by Python. However,
this is not such a good idea, in general. Let me give a simple example.
Suppose we want to improve our previous clock, to show the date, too.
We could reach that goal with the following script:

 ::

  #<clock2.py>

  "Shows how to modify and enhances classes on the fly"

  from oopp import *

  clock=Clock() #creates a Clock instance
  print clock.get_time() # print the current time

  get_data=lambda : ' '.join(time.asctime().split()[0:3])+ \
                         ' '+time.asctime().split()[-1]

  get_data_and_time=lambda : "Today is: %s \nThe time is: %s" % (
                         get_data(),get_time()) # enhances get_time

  Clock.get_time=staticmethod(get_data_and_time)

  print clock.get_time() # print the current time and data

  #</clock2.py>

The output of this script is:

  12:51:25
  Today is: Sat Feb 22 2003 
  The time is: 12:51:25

Notice that:

1. I instantiated the ``clock`` object *before* redefining the ``get_time``
   method, when it only could print the time and *not* the date.
2. However, after the redefinition of the class, the behaviour of all its 
   instances is changed, *including the behaviour of objects instantiated 
   before the change!*. Then ``clock`` *can* print the date, too.

This is not so surprising, once you recognize that Guido own a very famous
time-machine ... ;-)

Seriously, the reason is that an object does not contains a reserved copy
of the attributes and methods of its class: it only contains *references*
to them. If we change them in the class, the references to them in the
object will stay the same, but the contents will change.

In this example, I have solved the problem of enhancing the 'Clock' class
without inheritance, but dynamically replaceing its ``get_time`` 
(static) method with the `get_data_and_time`` (static) method. 
The dynamics modification of methods can be cool, but it should be avoided 
whenever possible, at least for two reasons [#]_:

1. having a class and therefore all its instances (including the instances
   created before the modification !) changed during the life-time of the
   program can be very confusing to the programmer, if not to the interpreter.

2. the modification is destructive: I cannot have the old ``get_time`` method
   and the new one at the same time, unless one explicitly gives to it 
   a new name (and giving new names increases the pollution of the namespace).

Both these disadvantages can be solved by resorting to the mechanism of
inheritance. For instance, in this example, we can derive a new class 
``NewClock`` from ``Clock`` as follows:

 ::

  #<newclock.py>

  import oopp,time
  
  get_data=lambda : ' '.join(time.asctime().split()[0:3])+ \
                         ' '+time.asctime().split()[-1]

  get_data_and_time=lambda : "Today is: %s \nThe time is: %s" % (
                         get_data(),oopp.get_time()) # enhances get_time

  class NewClock(oopp.Clock):
         """NewClock is a class that inherits from Clock, provides get_data
          and overrides get_time."""
         get_data=staticmethod(get_data)
         get_time=staticmethod(get_data_and_time)

  clock=oopp.Clock(); print 'clock output=',clock.get_time() 
  newclock=NewClock(); print 'newclock output=',newclock.get_time()

  #</newclock.py>

The output of this script is:

 ::

  clock output= 16:29:17
  newclock output= Today is: Sat Feb 22 2003 
  The time is: 16:29:17

We see that the two problems previously discussed are solved since:

i) there is no cut and paste: the old method ``Clock.get_time()`` is used
   in the definition of the new method ``NewClock.get_time()``;
ii) the old method is still accessible as ``Clock.get_time()``; there is
    no need to invent a new name like ``get_time_old()``.

We say that the method ``get_time`` in ``NewClock`` *overrides* the method
``get_time`` in Clock.


This simple example shows the power of inheritance in code
reuse, but there is more than that.

Inheritance is everywhere in Python, since
all  classes inherit from object. This means that all classes
inherit the methods and attributes of the object class, such as ``__doc__``,
``__class__``, ``__str__``, etc.


 .. [#] There are cases when run-time modifications of classes is useful
        anyway: particularly when one wants to modify the behavior of
        classes written by others without changing the source code. I
        will show an example in next chapter.

Inheriting from built-in types
-----------------------------------------------------------------------

However, one can subclass a built-in type, effectively creating an 
user-defined type with all the feature of a built-in type, and modify it.

Suppose for instance one has a keyword dictionary such as

  >>> kd={'title': "OOPP", 'author': "M.S.", 'year': 2003}

it would be nice to be able to access the attributes without
excessive quoting, i.e. using ``kd.author`` instead of ``kd["author"]``.
This can be done by subclassing the built-in class ``dict`` and
by overriding the ``__getattr__`` and ``__setattr__`` special methods:

 ::

  #<oopp/py>

  class kwdict(dict):
      "Keyword dictionary base class"
      def __getattr__(self,attr): 
          return self[attr]
      def __setattr__(self,key,val): 
          self[key]=val
      __str__ = pretty 

  #</oopp/py>


Here there is an example of usage:

  >>> from oopp import kwdict
  >>> book=kwdict({'title': "OOPP", 'author': "M.S."})
  >>> book.author #it works
  'M.S.'
  >>> book["author"] # this also works
  'M.S.'
  >>> book.year=2003 #you may also add new fields on the fly
  >>> print book
  author = M.S.
  title = OOPP
  year = 2003
  
The advantage of subclassing the built-in 'dict', it that you have for free 
all the standard dictionary methods, without having to reimplement them.

However, to subclass built-in it is not always a piece of cake. In
many cases there are complications, indeed. Suppose for instance
one wants to create an enhanced string type, with
the ability of indent and dedent a block of text provided by 
the following functions:

 ::

  #<oopp.py>

  def indent(block,n):
       "Indent a block of code by n spaces"
       return '\n'.join([' '*n+line for line in block.splitlines()])

  def dedent(block):
      "Dedent a block of code, if need there is"""
      lines=block.splitlines()
      for line in lines:
          strippedline=line.lstrip()
          if strippedline: break
      spaces=len(line)-len(strippedline)
      if not spaces: return block
      return '\n'.join([line[spaces:] for line in lines])

  #</oopp.py>

The solution is to inherit from the built-in string type ``str``, and to 
add to the new class the ``indent`` and ``dedent`` methods:

  >>> from oopp import indent,dedent
  >>> class Str(str):
  ...    indent=indent
  ...    dedent=dedent
  >>> s=Str('spam\neggs')
  >>> type(s)
  <class '__main__.Str'>
  >>> print s.indent(4)
      spam
      eggs

However, this approach has a disadvantage, since the output of ``indent`` is
not a ``Str``, but a normal ``str``, therefore without the additional 
``indent`` and ``dedent`` methods:

  >>> type(s.indent(4))
  <type 'str'>
  >>> s.indent(4).indent(4) #error
  Traceback (most recent call last):
    File "<stdin>", line 9, in ?
  AttributeError: 'str' object has no attribute 'indent'
  >>> s.indent(4).dedent(4) #error
  Traceback (most recent call last):
    File "<stdin>", line 9, in ?
  AttributeError: 'str' object has no attribute 'dedent'

We would like ``indent`` to return a ``Str`` object. To solve this problem
it is enough to rewrite the class as follows:

 ::

  #<Str.py>

  from oopp import indent,dedent

  class Str(str):
     def indent(self,n):
         return Str(indent(self,n))
     def dedent(self):
         return Str(dedent(self))

  s=Str('spam\neggs').indent(4)
  print type(s)
  print s # indented s
  s=s.dedent()
  print type(s)
  print s # non-indented s
  
  #</Str.py>

Now, everything works and the output of the previous script is

 ::

  <class 'Str'>
      spam
      eggs
  <class 'Str'>
  spam
  eggs

The solution works because now ``indent()`` returns an instance
of ``Str``, which therefore has an ``indent`` method. Unfortunately,
this is not the end. Suppose we want to add another food to our list:

  >>> s2=s+Str("\nham")
  >>> s2.indent(4) #error
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
  AttributeError: 'str' object has no attribute 'indent'

The problem is the same, again: the type of ``s2`` is ``str``

  >>> type(s2)
  <type 'str'>

and therefore there is no ``indent`` method available. There is a
solution to this problem, i.e. to redefine the addition operator
for objects of the class ``Str``. This can be done directly by hand,
but it is *ugly* for the following reasons:

1. If you derive a new class from ``Str`` you have to redefine the
   addition operator (both the left addition and the right addition [#]_)
   again (ughh!);
2. There are others operators you must redefine, in particular the
   the augumented assignement operator ``+=``, the repetition operator ``*`` 
   and its augmented version ``*=``; 
3. In the case of numeric types, one must redefine, ``+,-,*,/,//, mod,``,
   possibily ``<<,>>`` and others, including  the corresponding 
   augumented assignement operators and the left and the right form of
   the operators.

This is a mess, especially since due to point 1, one has to redefined
all the operators each time she defines a new subclass. I short, one has
to write a lot of boilerplate for a stupid job that the language
should be able to perform itself automatically. But here are the
good news: Python *can* do all that automatically, in an elegant
and beautiful way, which works for all types, too.

But this requires the magic of metaclasses.

 .. [#] The right addition works this way. Python looks at the expression x+y
        and if x has an explicit__add__ method invokes it; on the other hand, 
        if x does not define  an __add__ method, Python considers y+x. 
        If y defines a __radd__ method, it invokes it, otherwise
        raises an exception. The same is done for right multiplication, etc.

Controlling the creation of objects
---------------------------------------------------------------------------

Before introducing multiple inheritance, let me make a short digression on
the mechanism of object creation in Python 2.2+. The important point is
that new style classes have a ``__new__`` static method that allows
the user to take complete control of object creation. To understand how
``__new__`` works, I must explain what happens when an object is instantiated
with a statement like

 ::
  
  s=Str("spam") #object creation

What happens under the hood, is that the special static method  ``__new__`` 
of the class ``Str`` (inherited from the built-in ``str`` class)
is invoked ``before`` the ``Str.__init__`` method. This means that
the previous line should really be considered syntactic sugar for:

 ::

  s=Str.__new__(Str,"spam") # Str.__new__ is actually str.__new__
  assert isinstance(s,Str)
  Str.__init__(s,"spam")  # Str.__init__ is actually str.__init__

Put it more verbosely, what happens during the object creation is the
following:

1. the static method ``__new__``  is invoked with the class of the created 
   object as first argument [#]_;

2. ``__new__`` returns an instance of that class. 

3. the instance is then initialized by the ``__init__`` method.

Notice that both ``__new__`` and ``__init__`` are called with the same 
argument list, therefore one must make sure that they have a compatible 
signature.

Let me discuss now why ``__new__`` must be a static method.
First of all, it cannot be a normal method with a first argument which is an
instance of the calling class, since at the time of ``__new__`` invocation
that instance (``myclock`` in the example) has still to be created
Since ``__new__`` needs information about the class calling it, one
could think of implementing ``__new__`` as a class method. However,
this would implicitly pass the caller class and return an instance
of it. It is more convenient, to have the ability of creating
instances of any class directly from C.__new__(B,*args,**kw)

For this reasons, ``__new__`` must be a static method and pass explicitly
the class which is calling it.
 
Let me now show an important application of the ``__new__`` static method:
forbidding object creation. For instance, sometimes it is useful to have 
classes that cannot be instantiated. This kind of classes can be
obtained by inheriting from a ``NonInstantiable`` class:

 ::

  #<oopp.py>

  class NonInstantiableError(Exception): 
      pass

  class NonInstantiable(object): 
      def __new__(cls,*args,**kw):
          raise NonInstantiableError("%s cannot be instantiated" % cls)

  #</oopp.py>

Here there is an example of usage:

  >>> from oopp import NonInstantiable,get_time
  >>> class Clock(NonInstantiable): 
  ...    get_time=staticmethod(get_time)
  >>> Clock.get_time() # works
  '18:48:08'
  Clock() #error
  Traceback (most recent call last):
    File "<pyshell#6>", line 1, in ?
      Clock()
    File "oopp.py", line 257, in __new__
      raise NonInstantiableError("%s cannot be instantiated" % cls)
  NonInstantiableError: <class '__main__.Clock'> cannot be instantiated

However, the approach pursued here has a disadvantage:``Clock`` was already 
defined as a subclass of ``object`` and I has to change the source code
to make it a subclass of 'NonInstantiable'. But what happens if 
I cannot change the sources? How can I *reuse* the old code?

The solution is provided by multiple inheritance.

Notice that '__new__' is a staticmethod: [#]_

  >>> type(NonInstantiable.__dict__['__new__'])
  <type 'staticmethod'>

 .. [#] This is how ``type(s)`` or ``s.__class__`` get to know that 
        ``s`` is an instance of ``Str``, since the class information is 
        explicitely passed to the newborn object trough ``__new__``.

 .. [#] However ``object.__dict__['__new__']`` is not a staticmethod

  >>> type(object.__dict__['__new__']) # special case
  <type 'builtin_function_or_method'>


Multiple Inheritance
----------------------------------------------------------------------------

Multiple Inheritance (often abbreviated as MI) is often 
considered one of the most advanced topic in Object Oriented Programming. 
It is also one of the most difficult features to implement
in an Object Oriented Programming language. Even, some languages by design
decided to avoid it. This is for instance the case of Java, that avoided
MI having seen its implementation in C++ (which is not for the faint of
heart ;-) and uses a poorest form of it trough interfaces.
For what concerns the scripting languages, of which
the most famous are Perl, Python and Ruby (in this order, even if
the right order would be Python, Ruby and Perl), only Python
implements Multiple Inheritance well (Ruby has a restricted form
of it trough mix-ins, whereas Perl implementation is too difficult
for me to understand what it does ;).

The fact that Multiple Inheritance can be hairy, does not mean that it
is *always* hairy, however. Multiple Inheritance is used with success
in Lisp derived languages (including Dylan).

The aims of this chapter is to discuss the
Python support for MI in the most recent version (2.2 and 2.3), which
has considerably improved with respect to previous versions.
The message is the following: if Python 1.5 had a basic support for
MI inheritance (basic but nevertheless with nice features, dynamic),
Python 2.2 has *greatly* improved that support and with the
change of the Method Resolution Order in Python 2.3, we may say
that support for MI is now *excellent*.

I strongly encourage Python programmers to use MI a lot: this will
allows even a stronger reuse of code than in single inheritance.

Often, inheritance is used when one has a complicate class B, and she wants 
to modify (or enhance) its behavior, by deriving a child class C, which is 
only slightly different from B. In this situation,  B is already a standalone 
class, providing some non-trivial functionality, independently from 
the existence of C.  This kind of design it typical of the so called
*top-down* philosophy, where one builds the 
all structure as a monolithic block, leaving room only for minor improvements.
An alternative approach is the so called *bottom-up* programming, in 
which one builds complicate things starting from very simple building blocks. 
In this logic, it is very appealing the idea of creating classes with the
only purpose of being derived. The 'NonInstantiable' just defined is a
perfect example of this kind of classes, though with multiple inheritance
in mind and often called *mixin* classes.
It can be used to create a new class ``NonInstantiableClock`` 
that inherits from ``Clock`` and from ``NonInstantiable``.

 ::

  #<oopp.py>

  class NonInstantiableClock(Clock,NonInstantiable): 
      pass

  #</oopp.py>

Now ``NonInstantiableClock`` is both a clock

  >>> from oopp import NonInstantiableClock
  >>> NonInstantiableClock.get_time() # works
  '12:57:00' 

and a non-instantiable class:

  >>> NonInstantiableClock() # as expected, give an error
  Traceback (most recent call last):
    File "<pyshell#2>", line 1, in ?
      NonInstantiableClock() # error
    File "oopp.py", line 245, in __new__
      raise NonInstantiableError("%s cannot be instantiated" % cls)
  NonInstantiableError: <class 'oopp.NonInstantiableClock'> 
  cannot be instantiated

Let me give a simple example of a situation where the mixin approach 
comes handy. Suppose that the owner of a 'Pizza-shop' needs a program to 
take care of all the pizzas to-go he sell. Pizzas are distinguished 
according to their size (small, medium or large) and their toppings. 
The problem can be solved by inheriting from a generic pizza factory
like this:

 ::

  #<oopp.py>

  class GenericPizza(object): # to be customized
      toppinglist=[] # nothing, default 
      baseprice=1 # one dollar, default
      topping_unit_price=0.5 # half dollar for each topping, default
      sizefactor={'small':1, 'medium':2, 'large':3} 
      # a medium size pizza costs twice a small pizza, 
      # a large pizza costs three times
      def __init__(self,size):
          self.size=size
      def price(self):
          return (self.baseprice+
                 self.toppings_price())*self.sizefactor[self.size]
      def toppings_price(self):
          return len(self.toppinglist)*self.topping_unit_price
      def __str__(self):
          return '%s pizza with %s, cost $ %s' % (self.size,
                                            ','.join(self.toppinglist),
                                            self.price())

  #</oopp.py>

Here the base class 'GenericPizza' is written with inheritance in mind: one
can derives many pizza classes from it by overriding the ``toppinglist``;
for instance one could define

  >>> from oopp import GenericPizza
  >>> class Margherita(GenericPizza): 
  ...     toppinglist=['tomato']

The problem of this approach is that one must define dozens of 
different pizza subclasses (Marinara, Margherita, Capricciosa, QuattroStagioni, 
Prosciutto, ProsciuttoFunghi, PizzaDellaCasa, etc. etc. [#]_). In such a 
situation, it is better to perform the generation of subclasses in a smarter 
way, i.e. via a customizable class factory.
A simpler approach is to use always the same class and to customize
its instances just after creation. Both approaches can be implemented via
the following 'Customizable' mixin class, not meant to be instantiated, 
but rather to be *inherited*:

 ::

  #<oopp.py>
  
  class Customizable(object):
      """Classes inhering from 'Customizable' have a 'with' method acting as
      an object modifier and 'With' classmethod acting as a class factory"""
      def with(self,**kw):
          customize(self,**kw)# customize the instance
          return self # returns the customized instance
      def With(cls,**kw):
          class ChildOf(cls): pass # a new class inheriting from cls
          ChildOf.__name__=cls.__name__ # by default, with the same name
          customize(ChildOf,**kw)       # of the original class
          return ChildOf
      With=classmethod(With) 

  #</oopp.py>

Descendants of 'Customizable' can be customized by using
'with', that directly acts on the instances, or 'With', that returns
new classes. Notice that one could make 'With' to customize the
original class, without returning a new one; however, in practice,
this would not be safe: I remind that changing a class modifies
automatically all its instances, even instances created *before*
the modification. This could produce bad surprises: it is better to 
returns new classes, that may have the same name of the original one, 
but are actually completely independent from it.

In order to solve the pizza shop problem we may define a 'CustomizablePizza' 
class

 ::

  #<oopp.py>

  class CustomizablePizza(GenericPizza,Customizable):
      pass

  #</oopp.py>

which can be used in two ways: i) to customize instances just after creation:

  >>> from oopp import CustomizablePizza
  >>> largepizza=CustomizablePizza('large') # CustomizablePizza instance
  >>> largemarinara=largepizza.with(toppinglist=['tomato'],baseprice=2)
  >>> print largemarinara
  large pizza with tomato mozzarella, cost $ 7.0

and ii) to generated customized new classes:

  >>> Margherita=CustomizablePizza.With(
  ...    toppinglist=['tomato','mozzarella'], __name__='Margherita')
  >>> print Margherita('medium')
  medium pizza with tomato,mozzarella, cost $ 4.0

The advantage of the bottom-up approach, is that the 'Customizable' class
can be reused in completely different problems; for instance, it could
be used as a class factory. For instance we could use it to generate a 
'CustomizableClock' class as in this example:

  >>> from oopp import *
  >>> CustomizableClock=Customizable.With(get_time=staticmethod(Clock.get_time),
  ... __name__='CustomizableClock') #adds get_time
  >>> CustomizableClock.get_time() # now it works
  '09:57:50'

Here 'Customizable' "steal" the 'get_time' method from 'Clock'.
However that would be a rather perverse usage ;) I wrote it to show
the advantage of classmethods, more than to suggest to the reader that
this is an example of good programming. 

 .. [#]  In Italy, you can easily find "pizzerie" with more than 50 different 
         kinds of pizzas (once I saw a menu with something like one hundred 
         different combinations ;)

Cooperative hierarchies
-----------------------------------------------------------------------

The examples of multiple inheritance hierarchies given until now were pretty 
easy. The reason is that there was no interaction between the methods of the 
children and of the parents. However, things get more complicated (and
interesting ;) when the methods in the hierarchy call each other. 
Let me consider an example coming from paleoantropology:

 ::

  #<paleoanthropology1.py>

  class HomoHabilis(object):
      def can(self):
          print self,'can:'
          print " - make tools"

  class HomoSapiens(HomoHabilis):
      def can(self): #overrides HomoHabilis.can
          HomoHabilis.can(self)
          print " - make abstractions"
        
  class HomoSapiensSapiens(HomoSapiens):
      def can(self): #overrides HomoSapiens.can
          HomoSapiens.can(self)
          print " - make art"

  modernman=HomoSapiensSapiens()
  modernman.can()

  #</paleoanthropology1.py>

In this example children methods call parent methods:  
'HomoSapiensSapiens.can' calls 'HomoSapiens.can' that in turns calls
'HomoHabilis.can' and the final output is:

 ::

  <__main__.HomoSapiensSapiens object at 0x814e1fc> can:
   - make tools
   - make abstractions
   - make art

The script works, but it is far from ideal,  if code reuse and refactoring
are considered important requirements. The point is that (very likely, as the
research in paleoanthropology progresses) we may want to extend the 
hierarchy, for instance by adding a class on the top or in the middle. 
In the present form, this would require a non-trivial modification of 
the source code (especially
if one think that the hierarchy could be fleshed out with dozens of others
methods and attributes). However, the aim of OOP is to avoid as 
much as possible source code modifications. This goal can be attained in
practice, if the source code is written to be friendly to extensions and 
improvements as much as possible. I think it is worth to spend some time 
in improving this example, since what can be learn here, 
can be lifted to real life cases.

First of all, let me define a generic *Homo* class, to be used
as first ring of the inheritance chain (actually the first ring is
'object'):

 ::
 
  #<oopp.py>
 
  class Homo(PrettyPrinted): 
      """Defines the method 'can', which is intended to be overriden 
      in the children classes, and inherits '__str__' from PrettyPrinted,
      ensuring a nice printing representation for all children."""
      def can(self): 
          print self,'can:'
      
  #</oopp.py>

Now, let me point out one of the shortcomings of the previous code: in each
subclass, we explicitly call its parent class (also called super class)
by its name. This is inconvenient, both because a change of name in
later stages of the project would require a lot of search and replace
(actually not a lot in this toy example, but you can imagine having
a very big projects with dozens of named method calls) and because it makes 
difficult to insert a new element in the inheritance hierarchy. 
The solution to this problems is the
``super`` built-in, which provides an easy access to the methods
of the superclass.
``super`` objects comes in two flavors: ``super(cls,obj)`` objects return 
bound methods whereas ``super(cls)`` objects return unbound methods.
In the next code we will use the first form. The hierarchy can more elegantly 
be rewritten as [#]_ :

 ::

  #<paleo2.py>
  
  from oopp import Homo

  class HomoHabilis(Homo):
      def can(self):
          super(HomoHabilis,self).can()
          print " - make tools"

  class HomoSapiens(HomoHabilis):
      def can(self):
          super(HomoSapiens,self).can()
          print " - make abstractions"
        
  class HomoSapiensSapiens(HomoSapiens):
      def can(self):
          super(HomoSapiensSapiens,self).can()
          print " - make art"


  HomoSapiensSapiens().can()

  #</paleo2.py>

with output
  
 ::

  <HomoSapiensSapiens> can:
   - make tools
   - make abstractions
   - make art
  
This is not yet the most elegant form, since even
if ``super`` avoids naming the base class explicitely, still it
requires to explicitely name the class where it is defined. This is
rather annoying.
Removing that restriction, i.e. implementing really anonymous 
``super`` calls, is possible but requires a good understand of
private variables in inheritance.

Inheritance and privacy
----------------------------------------------------------------------

In order to define anonymous cooperative super calls,  we need classes 
that know themselves, i.e. containing a reference to themselves. This
is not an obvious problem as it could seems, since it cannot be solved
without incurring in the biggest annoyance in inheritance: 
*name clashing*. Name clashing happens when names and attributes defined 
in different ancestors overrides each other in a unwanted order.
Name clashing is especially painful in the case of cooperative 
hierarchies and particularly in in the problem at hand. 


A naive solution would be to attach a plain (i.e. non-private) 
attribute '.this' to the class, containing a reference
to itself, that can be invoked by the methods of the class.
Suppose, for instance, that I want to use that attribute in the ``__init__``
method of that class. A naive attempt would be to write something like:

  >>> class B(object):
  ...     def __init__(self): 
  ...         print self.this,'.__init__' # .this defined later
  >>> B.this=B # B.this can be set only after B has been created
  >>> B()
  <class '__main__.B'>

Unfortunately, this approach does not work with cooperative hierarchies. 
Consider, for instance, extending 'B' with a cooperative children 
class 'C' as follows:

  >>> class C(B):
  ...    def __init__(self):
  ...       super(self.this,self).__init__() # cooperative call
  ...       print type(self).this,'.__init__'
  >>> C.this=C

``C.__init__`` calls ``B.__init__`` by passing a 'C' instance, therefore
``C.this`` is printed and not ``B.this``:

  >>> C()
  <class '__main__.C'> .__init__
  <class '__main__.C'> .__init__
  <__main__.C object at 0x4042ca6c>

The problem is that the ``C.this`` overrides ``B.this``. The only
way of avoiding the name clashing is to use a private attribute 
``.__this``, as in the following script:

 ::

  #<privateh.py>

  class B(object):
       def __init__(self): 
          print self.__this,'.__init__'
  B._B__this=B

  class C(B):
      def __init__(self):
         super(self.__this,self).__init__() # cooperative __init__ 
         print self.__this,'.__init__'
  C._C__this=C

  C()

  # output:
  # <class '__main__.B'> .__init__
  # <class '__main__.C'> .__init__

  #</privateh.py>

The script works since, due to the magic of the mangling mechanism,
in ``B.__init__``, ``self._B__this`` i.e. ``B`` is retrieved, whereas in 
``C.__init__`` ``self._C__this`` i.e. ``C`` is retrieved.

The elegance of the mechanism can be improved with an helper function
that makes its arguments reflective classes, i.e. classes with a
``__this`` private attribute:

 ::

  #<oopp.py>

  def reflective(*classes):
      """Reflective classes know themselves, i.e. they possess a private
      attribute __this containing a reference to themselves. If the class
      name starts with '_', the underscores are stripped."""
      for c in classes:
          name=c.__name__ .lstrip('_')  # in 2.3
          setattr(c,'_%s__this' % name,c) 

  #</oopp.py>

It is trivial to rewrite the paleonthropological hierarchy in terms of 
anonymous cooperative super calls by using this trick.

 ::

  #<oopp.py>

  class HomoHabilis(Homo):
      def can(self):
          super(self.__this,self).can()
          print " - make tools"

  class HomoSapiens(HomoHabilis):
      def can(self):
          super(self.__this,self).can()
          print " - make abstractions"
        
  class HomoSapiensSapiens(HomoSapiens):
      def can(self):
          super(self.__this,self).can()
          print " - make art"

  reflective(HomoHabilis,HomoSapiens,HomoSapiensSapiens)

  #</oopp.py>

Here there is an example of usage:

  >>> from oopp import *
  >>> man=HomoSapiensSapiens(); man.can()
  <HomoSapiensSapiens> can:
   - make tools
   - make abstractions
   - make art

We may understand why it works by looking at the attributes of man:

  >>> print pretty(attributes(man))
  _HomoHabilis__this = <class 'oopp.HomoHabilis'>
  _HomoSapiensSapiens__this = <class 'oopp.HomoSapiensSapiens'>
  _HomoSapiens__this = <class 'oopp.HomoSapiens'>
  can = <bound method HomoSapiensSapiens.can of 
        <oopp.HomoSapiensSapiens object at 0x404292ec>>
  formatstring = %s

It is also interesting to notice that the hierarchy can be entirely
rewritten without using cooperative methods, but using private attributes,
instead. This second approach is simpler, as the following script shows:

 ::

  #<privatehierarchy.py>

  from oopp import PrettyPrinted,attributes,pretty

  class Homo(PrettyPrinted):
      def can(self):
          print self,'can:'
          for attr,value in attributes(self).iteritems(): 
              if attr.endswith('__attr'): print value
  class HomoHabilis(Homo): 
      __attr=" - make tools"
  class HomoSapiens(HomoHabilis): 
      __attr=" - make abstractions"
  class HomoSapiensSapiens(HomoSapiens): 
      __attr=" - make art"

  modernman=HomoSapiensSapiens()
  modernman.can()
  print '----------------------------------\nAttributes of',modernman
  print pretty(attributes(modernman))

  #</privatehierarchy.py>

Here I have replaced the complicate chain of cooperative methods with 
much simpler private attributes. Only the 'can' method in the 'Homo' 
class survives, and it is modified to print the value of the '__attr' 
attributes. Moreover, all the classes of the hierarchy have been made
'Customizable', in view of future extensions.

The second script is much shorter and much more elegant than the original
one, however its logic can be a little baffling, at first. The solution
to the mistery is provided by the attribute dictionary of 'moderman',
given by the second part of the output:

 ::

  <HomoSapiensSapiens> can:
   - make abstractions  
   - make art  
   - make tools 
  ------------------------------------------
  Attributes of <HomoSapiensSapiens>:
  _HomoHabilis__attr =  - make tools
  _HomoSapiensSapiens__attr =  - make art
  _HomoSapiens__attr =  - make abstractions
  can = <bound method HomoSapiensSapiens.can of 
        <__main__.HomoSapiensSapiens object at 0x402d892c>>
  formatstring = %s

We see that, in addition to the 'can' method inherited from 'Homo',
the 'with' and 'With' method inherited from 'Customizable' and
the 'formatstring' inherited from 'PrettyPrinted', 
``moderman`` has the attributes

 ::

  _HomoHabilis__attr:' - make tools' # inherited from HomoHabilis
  _HomoSapiens__attr:' - make abstractions'# inherited from HomoSapiens
  _HomoSapiensSapiens__attr: ' - make art' # inherited from HomoSapiensSapiens

which origin is obvious, once one reminds the mangling mechanism associated
with private variables. The important point is that the trick would *not*
have worked for normal attributes. Had I used as variable name
'attr' instead of '__attr', the name would have been overridden: the only
attribute of 'HomoSapiensSapiens' would have been ' - make art'.

This example explains the advantages of private variables during inheritance:
they cannot be overridden. Using private name guarantees the absence of 
surprises due to inheritance. If a class B has only private variables,
deriving a class C from B cannot cause name clashes. 

Private variables have a drawbacks, too. The most obvious disadvantages is
the fact that in order to customize private variables outside their 
defining class, one needs to pass explicitly the name of the class. 

For instance we could not change an attribute with the syntax 
``HomoHabilis.With(__attr=' - work the stone')``, we must write the 
more verbose, error prone and redundant 
``HomoHabilis.With(_HomoHabilis__attr=' - work the stone')``

A subtler drawback will be discussed in chapter 6.

 .. [#] In single inheritance hierarchies, ``super`` can be dismissed
        in favor of ``__base__``: for instance, 
        ``super(HomoSapiens,self).can()`` is equivalent to
        ``HomoSapiens.__base__.can(self)``. Nevertheless, in view
        of possible extensions to multiple inheritance, using ``super`` is a
        much preferable choice.


THE SOPHISTICATION OF DESCRIPTORS
===========================================================================

Attribute descriptors are important metaprogramming tools that allows 
the user to customize the behavior of attributes in custom classes.
For instance, attribute descriptors (or descriptors for short) 
can be used as method wrappers, 
to modify or enhance methods (this is the case for the well
known staticmethods and classmethods attribute descriptors); they
can also be used as attribute wrappers, to change or restrict the access to 
attributes (this is the case for properties). Finally, descriptors 
allows the user to play with the resolution order of attributes: 
for instance, the ``super`` built-in object used in (multiple) inheritance 
hierarchies, is implemented as an attribute descriptor.

In this chapter, I will show how the user can define its own attribute 
descriptors and I will give some example of useful things you can do with 
them (in particular to add tracing and timing capabilities).

Motivation
---------------------------------------------------------------------------
Attribute descriptors are a recent idea (they where first introduced in 
Python 2.2) nevertheless, under the hood, are everywhere in Python. It is 
a tribute to Guido's ability of hiding Python complications that
the average user can easily miss they existence.
If you need to do simple things, you can very well live without 
the knowledge of descriptors. On the other hand, if you need difficult 
things (such as tracing all the attribute access of your modules) 
attribute descriptors, allow you to perform 
impressive things. 
Let me start by showing why the knowledge of attribute descriptors is 
essential for any user seriously interested  in metaprogramming applications.
Suppose I  want to trace the methods of a clock:

  >>> import oopp
  >>> clock=oopp.Clock()

This is easily done with the ``with_tracer`` closure of chapter 2:

  >>> oopp.wrapfunctions(clock,oopp.with_tracer)
  <oopp.Clock object at 0x4044c54c>
  >>> clock.get_time()
  [] Calling 'get_time' with arguments
  (){} ...
  -> '.get_time' called with result: 19:55:07
  '19:55:07'

However, this approach fails if I try to trace the entire class:

  >>> oopp.wrapfunctions(oopp.Clock,oopp.with_tracer)
  <class 'oopp.Clock'>
  >>> oopp.Clock.get_time() # error
  Traceback (most recent call last):
    File "<stdin>", line 6, in ?
  TypeError: unbound method _() must be called with Clock instance 
  as first argument (got nothing instead)

The reason is that ``wrapfunctions`` sets the attributes of 'Clock'
by invoking ``customize``, which uses ``setattr``. This converts
'_' (i.e. the traced version of ``get_time``) in a regular method, not in 
a staticmethod!
In order to trace staticmethods, one has to understand the nature 
of attribute descriptors.

Functions versus methods
----------------------------------------------------------------------

Attribute descriptors are essential for the implementation 
of one of the most basic Python features: the automatic conversion 
of functions in methods. As I already anticipated in chapter 1, there is 
a sort of magic when one writes ``Clock.get_time=lambda self: get_time()``
and Python automagically converts the right hand side, that is a
function, to a left hand side that is a (unbound) method. In order to 
understand this magic, one needs a better comprehension of the
relation between functions and methods.
Actually, this relationship is quite subtle 
and has no analogous in mainstream programming languages.
For instance, C is not OOP and has only functions, lacking the concept
of method, whereas Java (as other OOP languages) 
has no functions,  only methods.
C++ has functions and methods, but functions are completely
different from methods On the other hand, in Python, 
functions and methods can be transformed both ways.

To show how it works, let me start by defining a simple printing 
function:

 ::

  #<oopp.py>

  import __main__ # gives access to the __main__ namespace from the module

  def prn(s):
      """Given an evaluable string, print its value and its object reference.
      Notice that the evaluation is done in the __main__ dictionary."""
      try: obj=eval(s,__main__.__dict__)
      except: print 'problems in evaluating',s
      else: print s,'=',obj,'at',hex(id(obj))

  #</oopp.py>

Now, let me define a class with a method ``m`` equals to the identity
function ``f``:

  >>> def f(x): "Identity function"; return x
  ...
  >>> class C(object):
  ...    m=f
  ...    print m #here m is the function f
  <function f at 0x401c2b1c>

We see that *inside* its defining class, ``m`` coincides with the function 
``f`` (the object reference is the same):

  >>> f
  <function f at 0x401c2b1c>

We may retrieve ``m`` from *outside* the class via the class dictionary [#]_:

  >>> C.__dict__['m']
  <function prn at 0x401c2b1c>

However, if we invoke ``m`` with
the syntax ``C.m``, then it (magically) becomes a (unbound) method:

  >>> C.m #here m has become a method!
  <unbound method C.f>

But why it is so? How comes that in the second syntax the function 
``f`` is transformed in a (unbound) method? To answer that question, we have
to understand how attributes are really invoked in Python, i.e. via
attribute descriptors.

Methods versus functions
-----------------------------------------------------------------------------

First of all, let me point out the differences between methods and
functions. Here, ``C.m`` does *not* coincides with ``C.__dict__['m']``
i.e. ``f``, since its object reference is different:

  >>> from oopp import prn,attributes
  >>> prn('C.m')
  C.m = <unbound method C.prn> at 0x81109b4

The difference is clear since methods and functions have different attributes:

  >>> attributes(f).keys()
  ['func_closure', 'func_dict', 'func_defaults', 'func_name', 
  'func_code', 'func_doc', 'func_globals']

whereas

  >>> attributes(C.m).keys()
  ['im_func', 'im_class', 'im_self']

We discussed few of the functions attributes in the chapter
on functions. The instance method attributes are simpler: ``im_self`` 
returns the object to which the method is attached,

  >>> print C.m.im_self #unbound method, attached to the class
  None
  >>> C().m.im_self #bound method, attached to C()
  <__main__.C object at 0x81bf4ec> 

``im_class`` returns the class to which the
method is attached 

  >>> C.m.im_class #class of the unbound method
  <class '__main__.C'>
  >>> C().m.im_class #class of the bound method,
  <class '__main__.C'>

and ``im_func`` returns the function equivalent to
the method.

  >>> C.m.im_func
  <function m at 0x8157f44>
  >>> C().m.im_func # the same
  <function m at 0x8157f44>

As the reference manual states, calling 
``m(*args,**kw)`` is completely equivalent to calling 
``m.im_func(m.im_self, *args,**kw)``". 

As a general rule, an attribute descriptor is an object with a ``__get__`` 
special method. The most used descriptors are the good old functions:
they have a ``__get__`` special  method returning a *method-wrapper object*

  >>> f.__get__
  <method-wrapper object at 0x815cdc4>

method-wrapper objects can be transformed in (both bound and unbound) methods:

  >>> f.__get__(None,C)
  <unbound method C.f>
  >>> f.__get__(C(),C)
  <bound method C.f of <__main__.C object at 0x815cdc4>>

The general calling syntax for method-wrapper objects is 
``.__get__(obj,cls=None)``, where the first argument is an
instance object or None and the second (optional) argument is the class (or a
generic superclass) of the first one. 

Now we see what happens when we use the syntax ``C.m``: Python interprets
this as a shortcut for ``C.__dict['m'].__get__(None,C)`` (if ``m`` is
in the 'C' dictionary, otherwise it looks for ancestor dictionaries). 
We may check that everything is correct by observing that
``f.__get__(None,C)`` has exactly the same object reference than ``C.m``,
therefore they are the same object:

  >>> hex(id(f.__get__(None,C))) # same as hex(id(C.m))
  '0x811095c'

The process works equally well for the syntax ``getattr``:

  >>> print getattr(C,'m'), hex(id(getattr(C,'m')))
  <unbound method C.f> 0x811095c

and for bound methods: if

  >>> c=C()

is an instance of the class C, then the syntax

  >>> getattr(c,'m') #same as c.m
  <bound method C.f of <__main__.C object at 0x815cdc4>>

is a shortcut for

  >>> type(c).__dict__['m'].__get__(c,C) # or f.__get__(c,C)
  <bound method C.f of <__main__.C object at 0x815cdc4>>

(notice that the object reference for ``c.m`` and ``f.__get__(c,C)`` is
the same, they are *exactly* the same object).

Both the unbound method C.m and the bound method c.m refer to the same 
object at hexadecimal address 0x811095c. This object is common to all other
instances of C:

  >>> c2=C()
  >>> print c2.m,hex(id(c2.m)) #always the same method
  <bound method C.m of <__main__.C object at 0x815768c>> 0x811095c

One can also omit the second argument:

  >>> c.m.__get__(c)
  <bound method ?.m of <__main__.C object at 0x81597dc>>

Finally, let me point out that methods are attribute descriptors too,
since they have a ``__get__`` attribute returning a method-wrapper
object:

  >>> C.m.__get__
  <method-wrapper object at 0x815d51c>

Notice that this method wrapper is *not* the same than the ``f.__get__``
method wrapper.

 .. [#] If ``C.__dict['m']`` is not defined, Python looks if ``m`` is defined
        in some ancestor of C. For instance if `B` is the base of `C`, it
        looks in ``B.__dict['m']``, etc., by following the MRO.

Static methods and class methods
--------------------------------------------------------------------------

Whereas functions and methods are implicit attribute descriptors,
static methods and class methods are examples of explicit
descriptors. They allow to convert regular functions to 
specific descriptor objects. Let me show a trivial example. 
Given the identity function

  >>> def f(x): return x

we may convert it to a staticmethod object

  >>> sm=staticmethod(f)
  >>> sm
  <staticmethod object at 0x4018a0a0>

or to a classmethod object

  >>> cm=classmethod(f)
  >>> cm
  <classmethod object at 0x4018a0b0>

In both cases the ``__get__`` special method returns a method-wrapper object

  >>> sm.__get__
  <method-wrapper object at 0x401751ec>
  >>> cm.__get__
  <method-wrapper object at 0x4017524c>

However the static method wrapper is quite different from the class
method wrapper. In the first case the wrapper returns a function:

  >>> sm.__get__(C(),C)
  <function f at 0x4027a8b4>
  >>> sm.__get__(C())
  <function f at 0x4027a8b4>

in the second case it returns a method

  >>> cm.__get__(C(),C)
  <bound method type.f of <class '__main__.C'>>

Let me discuss more in detail the static methods, first. 

It is always possible to extract the function from the static method
via the syntaxes ``sm.__get__(a)`` and ``sm.__get__(a,b)`` with *ANY* valid
a and b, i.e. the result does not depend on a and b. This is correct,
since static methods are actually function that have nothing to do
with the class and the instances to which they are bound.

This behaviour of the method wrapper makes clear why the relation between 
methods and functions is inversed for static methods with respect to
regular methods:

  >>> class C(object):
  ...     s=staticmethod(lambda : None)
  ...     print s
  ...
  <staticmethod object at 0x8158ec8>

Static methods are non-trivial objects *inside* the class, whereas 
they are regular functions *outside* the class:

  >>> C.s
  <function <lambda> at 0x8158e7c>
  >>> C().s
  <function <lambda> at 0x8158e7c>

The situation is different for classmethods: inside the class they
are non-trivial objects, just as static methods,

  >>> class C(object):
  ...     cm=classmethod(lambda cls: None)
  ...     print cm
  ...
  <classmethod object at 0x8156100>

but outside the class they are methods bound to the class,

  >>> c=C()
  >>> prn('c.cm') 
  <bound method type.<lambda> of <class '__main__.C'>> 
  0x811095c

and not to the instance 'c'. The reason is that the ``__get__`` wrapper method
can be invoked with the syntax  ``__get__(a,cls)`` which 
is only sensitive to the second argument or with the syntax
``__get__(obj)`` which is only sensitive to the type of the first
argument:

  >>> cm.__get__('whatever',C) # the first argument is ignored
  <bound method type.f of <class '__main__.C'>>

sensitive to the type of 'whatever':

  >>> cm.__get__('whatever') # in Python 2.2 would give a serious error
  <bound method type.f of <type 'str'>>

Notice that the class method is actually bound to C's class, i.e.
to 'type'.

Just as regular methods (and differently
from static methods) classmethods have attributes ``im_class``, ``im_func``, 
and ``im_self``. In particular one can retrieve the function wrapped inside
the classmethod with

  >>> cm.__get__('whatever','whatever').im_func 
  <function f at 0x402c2534>

The difference with regular methods is that ``im_class`` returns the
class of 'C' whereas ``im_self`` returns 'C' itself.

  >>> C.cm.im_self # a classmethod is attached to the class
  <class '__main__.C'>
  >>> C.cm.im_class #the class of C
  <type 'type'>

Remark: Python 2.2.0 has a bug in classmethods (fixed in newer versions):
when the first argument of __get__ is None, then one must specify 
the second argument (otherwise segmentation fault :-() 

Properties
----------------------------------------------------------------------

Properties are a more general kind of attribute descriptors than 
staticmethods and classmethods, since their effect can be customized
trough arbitrary get/set/del functions. Let me give an example:

  >>> def getp(self): return 'property' # get function
  ...
  >>> p=property(getp) # property object
  >>> p
  <property object at 0x815855c>

``p`` has a ``__get__`` special method returning a method-wrapper
object, just as it happens for other descriptors:

  >>> p.__get__
  <method-wrapper object at 0x8158a7c>

The difference is that 

  >>> p.__get__(None,type(p))
  <property object at 0x4017016c>
  >>> p.__get__('whatever')
  'property'
  >>> p.__get__('whatever','whatever')
  'property'

As for static methods, the ``__get__`` method wrapper is independent from
its arguments, unless the first one is None: in such a case it returns
the property object, in all other circumstances it returns the result
of ``getp``. This explains the behavior

  >>> class C(object): p=p
  >>> C.p
  <property object at 0x815855c>
  >>> C().p
  'property'

Properties are a dangerous feature, since they change the semantics
of the language. This means that apparently trivial operations can have 
any kind of side effects:

  >>> def get(self):return 'You gave me the order to destroy your hard disk!!'
  >>> class C(object): x=property(get) 
  >>> C().x
  'You gave me the order to destroy your hard disk!!'

Invoking 'C.x' could very well invoke an external program who is going
to do anything! It is up to the programmer to not abuse properties.
The same is true for user defined attribute descriptors.

There are situations in which they are quite handy, however. For
instance, properties can be used to trace the access data attributes.
This can be especially useful during debugging, or for logging
purposes.

Notice that this approach has the problem that now data attributes cannot 
no more be called trough their class, but only though their instances.
Moreover properties do not work well with ``super`` in cooperative
methods.

User-defined attribute descriptors
----------------------------------------------------------------------

As we have seen, there are plenty of predefined attribute descriptors,
such as staticmethods, classmethods and properties (the built-in
``super`` is also an attribute descriptor which, for sake of
convenience, will be discussed in the next section).
In addition to them, the user can also define customized attribute 
descriptors, simply trough classes with a ``__get__`` special method.
Let me give an example:

 ::

  #<simpledescr.py>

  class ChattyAttr(object):
      """Chatty descriptor class; descriptor objects are intended to be 
      used as attributes in other classes"""
      def __get__(self, obj, cls=None):
          binding=obj is not None
          if  binding:
              return 'You are binding %s to %s' % (self,obj)
          else:
              return 'Calling %s from %s' % (self,cls)

  class C(object):
      d=ChattyAttr()

  c=C()

  print c.d # <=> type(c).__dict__['d'].__get__(c,type(c))
  print C.d # <=> C.__dict__['d'].__get__(None,C)

  #</simpledescr.py>

with output:

 ::

  You are binding <ChattyAttr object at 0x401bc1cc> to 
  <C object at 0x401bc2ec>
  Calling <ChattyAttr object at 0x401bc1cc> from <class 'C'>


Invoking a method with the syntax ``C.d`` or ``c.d`` involves calling
``__get__``. The ``__get__`` signature is fixed: it is
`` __get__=__get__(self,obj,cls=None)``, since the notation
``self.descr_attr`` automatically passes ``self`` and ``self.__class__`` to 
``__get__``.

Custom descriptors can be used to restrict the access to objects in a
more general way than trough properties. For instance, suppose one
wants to raise an error if a given attribute 'a' is accessed, both
from the class and from the instance: a property cannot help here,
since it works only from the instance. The solution is the following
custom descriptor:

 ::

  #<oopp.py>

  class AccessError(object):
      """Descriptor raising an AttributeError when the attribute is 
      accessed""" #could be done with a property
      def __init__(self,errormessage):
          self.msg=errormessage
      def __get__(self,obj,cls=None):
          raise AttributeError(self.msg)

  #</oopp.py>
  
  >>> from oopp import AccessError
  >>> class C(object):
  ...    a=AccessError("'a' cannot be accessed")
  >>> c=C()
  >>> c.a #error
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
    File "oopp.py", line 313, in __get__
      raise AttributeError(self.msg)
  AttributeError: 'a' cannot be accessed
  >>> C.a #error
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
    File "oopp.py", line 313, in __get__
      raise AttributeError(self.msg)
  AttributeError: 'a' cannot be accessed

It is always possibile to convert plain attributes (i.e. attributes
without a "__get__" method) to descriptor objects:

 ::

  #<oopp.py>

  class convert2descriptor(object):
      """To all practical means, this class acts as a function that, given an
      object, adds to it a __get__ method if it is not already there. The 
      added __get__ method is trivial and simply returns the original object, 
      independently from obj and cls."""
      def __new__(cls,a):
          if hasattr(a,"__get__"): # do nothing
              return a # a is already a descriptor
          else: # creates a trivial attribute descriptor
              cls.a=a
              return object.__new__(cls)
      def __get__(self,obj,cls=None):
          "Returns self.a independently from obj and cls"
          return self.a
  
  #</oopp.py>

This example also shows the magic of ``__new__``, that allows to use a
class as a function. The output of 'convert2descriptor(a)' can be both 
an instance of 'convert2descriptor' (in this case 'convert2descriptor' acts as 
a normal class, i.e. as an object factory) or 'a' itself 
(if 'a' is already a descriptor): in this case 'convert2descriptor' acts 
as a function.

For instance, a string is converted to a descriptor

  >>> from oopp import convert2descriptor
  >>> a2=convert2descriptor('a')
  >>> a2
  <oopp.convert2descriptor object at 0x4017506c>
  >>> a2.__get__('whatever')
  'a'

whereas a function is untouched:

  >>> def f(): pass
  >>> f2=convert2descriptor(f) # does nothing
  >>> f2
  <function f at 0x4019110c>

Data descriptors
-------------------------------------------------------------------------

It is also possible to specify a ``__set__`` method (descriptors
with a ``__set__`` method are typically data descriptors) with
the signature ``__set__(self,obj,value)`` as in the following
example:

 ::

  #<datadescr.py>

  class DataDescriptor(object):
      value=None
      def __get__(self, obj, cls=None):
          if obj is None: obj=cls
          print "Getting",obj,"value =",self.value
          return self.value
      def __set__(self, obj, value):
          self.value=value
          print "Setting",obj,"value =",value

  class C(object):
      d=DataDescriptor()

  c=C()
 
  c.d=1 #calls C.__dict__['d'].__set__(c,1)
  c.d   #calls C.__dict__['d'].__get__(c,C)
  C.d   #calls C.__dict__['d'].__get__(None,C)
  C.d=0 #does *not* call __set__
  print "C.d =",C.d

  #</datadescr.py>

With output:

 ::

  Setting <C object at 0x401bc1ec> value = 1
  Getting <C object at 0x401bc42c> value = 1
  Getting <class 'C'> value = 1      
  C.d = 0

With this knowledge, we may now reconsider the clock example given 
in chapter 3. #NO!??

  >>> import oopp
  >>> class Clock(object): pass
  >>> myclock=Clock()
  ...
  >>> myclock.get_time=oopp.get_time # this is a function
  >>> Clock.get_time=lambda self : oopp.get_time() # this is a method 

In this example, ``myclock.get_time``, which is attached to the ``myclock`` 
object, is a function, whereas ``Clock.get_time``, which is attached to 
the ``Clock`` class is a method. We may also check this by using the ``type`` 
function:

  >>> type(myclock.get_time)
  <type 'function'>

whereas

  >>> type(Clock.get_time) 
  <type 'instance method'>


It must be remarked that user-defined attribute descriptors, just as
properties, allow to arbitrarily change the semantics of the language
and should be used with care.

The ``super`` attribute descriptor
------------------------------------------------------------------------


super has also a second form, where it is more used as a descriptor.

``super`` objects are attribute descriptors, too, with a ``__get__``
method returning a method-wrapper object:

  >>> super(C,C()).__get__
  <method-wrapper object at 0x8161074>

Here I give some example of acceptable call:

  >>> super(C,C()).__get__('whatever')
  <super: <class 'C'>, <C object>>
  >>> super(C,C()).__get__('whatever','whatever')
  <super: <class 'C'>, <C object>>


Unfortunately, for the time being 
(i.e. for Python 2.3), the ``super`` mechanism  has various limitations. 
To show the issues, let me start by  considering the following base class:

 ::

  #<oopp.py>

  class ExampleBaseClass(PrettyPrinted):
      """Contains a regular method 'm', a staticmethod 's', a classmethod 
      'c', a property 'p' and a data attribute 'd'."""
      m=lambda self: 'regular method of %s' % self
      s=staticmethod(lambda : 'staticmethod')
      c=classmethod(lambda cls: 'classmethod of %s' % cls)
      p=property(lambda self: 'property of %s' % self)
      a=AccessError('Expected error')
      d='data'

  #</oopp.py>

Now, let me derive a new class C from ExampleBaseClass:

  >>> from oopp import ExampleBaseClass
  >>> class C(ExampleBaseClass): pass
  >>> c=C()

Ideally, we would like to retrieve the methods and attributes of 
ExampleBaseClass from C, by using the ``super`` mechanism.

1. We see that ``super`` works without problems for regular methods, 
   staticmethods and classmethods:

  >>> super(C,c).m()
  'regular method of <C>'
  >>> super(C,c).s()
  'staticmethod'
  >>> super(C,c).c()
  "classmethod of <class '__main__.C'>"

It also works for user defined attribute descriptors:

  >>> super(C,c).a # access error
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
    File "oopp.py", line 340, in __get__
      raise AttributeError(self.msg)
  AttributeError: Expected error
  
and for properties (only for Python 2.3+):

  >>> ExampleBaseClass.p
  <property object at 0x81b30fc>

In Python 2.2 one would get an error, instead

  >>> super(C,c).p #error
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
  AttributeError: 'super' object has no attribute 'p'

3. Moreover, certain attributes of the superclass, such as its
``__name__``, cannot be retrieved:

  >>> ExampleBaseClass.__name__
  'ExampleBaseClass'
  >>> super(C,c).__name__ #error
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
  AttributeError: 'super' object has no attribute '__name__'

4. There is no direct way to retrieve the methods of the super-superclass 
   (i.e. the grandmother class, if you wish) or in general the furthest 
   ancestors, since ``super`` does not chain.

5. Finally, there are some subtle issues with the ``super(cls)`` syntax:


  >>> super(C).m #(2) error
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
  AttributeError: 'super' object has no attribute 'm'

means ``super(C).__get__(None,C)``, but only 
``super(C).__get__(c,C).m==super(C,c)`` works.

   On the other hand,

  >>> super(C).__init__  #(1) 
  <built-in method __init__ of type object at 0x80e6fc0>
  >>> super(C).__new__  #(1) 
  <built-in method __init__ of type object at 0x80e6fc0>


   seems to work, whereas in reality does not. The reason is that since 
   ``super`` objects are instances 
   of ``object``, they inherit object's methods, and in particular 
   ``__init__`` ; therefore the ``__init__`` method in (1) is *not* 
   the ``ExampleBaseClass.__init__`` method. The point is that ``super`` 
   objects are attribute descriptors and not references to the superclass.

Probably, in future versions of Python the ``super`` mechanism will be 
improved. However, for the time being, one must provide a workaround for 
dealing with these issues. This will be discussed in the next chapter.

Method wrappers
----------------------------------------------------------------------

One of the most typical applications of attribute descriptors is their
usage as *method wrappers*.

Suppose, for instance, one wants to add tracing capabilities to 
the methods of a class for debugging purposes. The problem
can be solved with a custom descriptor class:

 ::

  #<oopp.py>
 
  import inspect

  class wrappedmethod(Customizable):
      """Customizable method factory intended for derivation.
      The wrapper method is overridden in the children."""

      logfile=sys.stdout # default
      namespace='' # default

      def __new__(cls,meth): # meth is a descriptor
          if isinstance(meth,FunctionType):
              kind=0 # regular method
              func=meth
          elif isinstance(meth,staticmethod):
              kind=1 # static method
              func=meth.__get__('whatever')
          elif isinstance(meth,classmethod):
              kind=2 # class method
              func=meth.__get__('whatever','whatever').im_func 
          elif isinstance(meth,wrappedmethod): # already wrapped
              return meth # do nothing
          elif inspect.ismethoddescriptor(meth):
              kind=0; func=meth # for many builtin methods 
          else:
              return meth # do nothing
          self=super(wrappedmethod,cls).__new__(cls)
          self.kind=kind; self.func=func # pre-initialize
          return self

      def __init__(self,meth): # meth not used
          self.logfile=self.logfile # default values
          self.namespace=self.namespace # copy the current

      def __get__(self,obj,cls): # closure 
          def _(*args,**kw):
              if obj is None: o=() # unbound method call
              else: o=(obj,) # bound method call
              allargs=[o,(),(cls,)][self.kind]+args 
              return self.wrapper()(*allargs,**kw)
          return _ # the wrapped function
          # allargs is the only nontrivial line in _; it adds
          # 0 - obj if meth is a regular method
          # 1 - nothing if meth is a static method
          # 2 - cls if meth is a class method

      def wrapper(self): return self.func # do nothing, to be overridden
  
  #</oopp.py>

This class is intended for derivation: the wrapper method has to be overridden
in the children in order to introduce the wanted feature. If I want to 
implement the capability of tracing methods, I can reuse the ``with_tracer``
closure introduced in chapter 2:

 ::

  #<oopp.py>

  class tracedmethod(wrappedmethod):
      def wrapper(self):
          return with_tracer(self.func,self.namespace,self.logfile)
          
  #</oopp.py>

Nothing prevents me from introducing timing features by reusing the 
``with_timer`` closure:

 ::

  #<oopp.py>

  class timedmethod(wrappedmethod):
      iterations=1 # additional default parameter

      def __init__(self,meth):
          super(timedmethod,self).__init__(self,meth)
          self.iterations=self.iterations # copy

      def wrapper(self):
          return with_timer(self.func,self.namespace,
                            self.iterations,self.logfile)

  #</oopp.py>

Here there is an example of usage:

The dictionary of wrapped functions is then built from an utility function

 ::

  #<oopp.py>

  def wrap(obj,wrapped,condition=lambda k,v: True, err=None):
      "Retrieves obj's dictionary and wraps it"
      if isinstance(obj,dict): # obj is a dictionary 
          dic=obj
      else: 
          dic=getattr(obj,'__dict__',{}).copy() # avoids dictproxy objects
          if not dic: dic=attributes(obj) # for simple objects
      wrapped.namespace=getattr(obj,'__name__','')
      for name,attr in dic.iteritems(): # modify dic
          if condition(name,attr): dic[name]=wrapped(attr)
      if not isinstance(obj,dict): # modify obj
          customize(obj,err,**dic) 

  #</oopp.py>

 ::

  #<tracingmethods.py>

  from oopp import *
  
  class C(object): 
      "Class with traced methods"

      def f(self): return self 
      f=tracedmethod(f)

      g=staticmethod(lambda:None)
      g=tracedmethod(g)
  
      h=classmethod(do_nothing)
      h=tracedmethod(h)
  
  c=C()

  #unbound calls
  C.f(c) 
  C.g()
  C.h()
  
  #bound calls
  c.f()  
  c.g()
  c.h()

  #</tracingmethods.py>

Output:

 ::

  [C] Calling 'f' with arguments
  (<C object at 0x402042cc>,){} ...
  -> 'C.f' called with result: <C object at 0x402042cc>

  [C] Calling '<lambda>' with arguments
  (){} ...
  -> 'C.<lambda>' called with result: None

  [C] Calling 'do_nothing' with arguments
  (<class 'C'>,){} ...
  -> 'C.do_nothing' called with result: None

  [C] Calling 'f' with arguments
  (<C object at 0x402042cc>,){} ...
  -> 'C.f' called with result: <C object at 0x402042cc>

  [C] Calling '<lambda>' with arguments
  (){} ...
  -> 'C.<lambda>' called with result: None

  [C] Calling 'do_nothing' with arguments
  (<class 'C'>,){} ...
  -> 'C.do_nothing' called with result: None


The approach in 'tracingmethods.py' works, but it is far from
being elegant, since I had to explicitly wrap each method in the
class by hand. 

Both problems can be avoided.

  >>> from oopp import *
  >>> wrap(Clock,tracedmethod)
  >>> Clock.get_time()
  [Clock] Calling 'get_time' with arguments
  (){} ...
  -> 'Clock.get_time' called with result: 21:56:52
  '21:56:52'


THE SUBTLETIES OF MULTIPLE INHERITANCE
==========================================================================

In chapter 4 we introduced the concept of multiple inheritance and discussed
its simplest applications in absence of name collisions. When with methods
with different names are derived from different classes multiple inheritance
is pretty trivial. However, all kind of subtilites comes in presence of name 
clashing, i.e. when we multiply inherits different methods defined in different
classes but with the *same* name.
In order to understand what happens in this situation, it is essential to 
understand the concept of Method Resolution Order (MRO). For reader's
convenience, I collect in this chapter some of the information
reported in http://www.python.org/2.3/mro.html.

A little bit of history: why Python 2.3 has changed the MRO
------------------------------------------------------------------------------

Everything started with a post by Samuele Pedroni to the Python
development mailing list [#]_.  In his post, Samuele showed that the
Python 2.2 method resolution order is not monotonic and he proposed to
replace it with the C3 method resolution order.  Guido agreed with his
arguments and therefore now Python 2.3 uses C3.  The C3 method itself
has nothing to do with Python, since it was invented by people working
on Dylan and it is described in a paper intended for lispers [#]_.  The
present paper gives a (hopefully) readable discussion of the C3
algorithm for Pythonistas who want to understand the reasons for the
change.

First of all, let me point out that what I am going to say only applies
to the *new style classes* introduced in Python 2.2:  *classic classes*
maintain their old method resolution order, depth first and then left to
right.  Therefore, there is no breaking of old code for classic classes;
and even if in principle there could be breaking of code for Python 2.2
new style classes, in practice the cases in which the C3 resolution
order differs from the Python 2.2 method resolution order are so rare
that no real breaking of code is expected.  Therefore: don't be scared!

Moreover, unless you make strong use of multiple inheritance and you
have non-trivial hierarchies, you don't need to understand the C3
algorithm, and you can easily skip this paper.  On the other hand, if
you really want to know how multiple inheritance works, then this paper
is for you.  The good news is that things are not as complicated as you
might expect.

Let me begin with some basic definitions.

1) Given a class C in a complicated multiple inheritance hierarchy, it
   is a non-trivial task to specify the order in which methods are
   overridden, i.e. to specify the order of the ancestors of C.

2) The list of the ancestors of a class C, including the class itself,
   ordered from the nearest ancestor to the furthest, is called the
   class precedence list or the *linearization* of C.

3) The *Method Resolution Order* (MRO) is the set of rules that
   construct the linearization.  In the Python literature, the idiom
   "the MRO of C" is also used as a synonymous for the linearization of
   the class C.

4) For instance, in the case of single inheritance hierarchy, if C is a
   subclass of C1, and C1 is a subclass of C2, then the linearization of
   C is simply the list [C, C1 , C2].  However, with multiple
   inheritance hierarchies, it is more difficult to construct a
   linearization that respects *local precedence ordering* and
   *monotonicity*.

5) I will discuss the local precedence ordering later, but I can give
   the definition of monotonicity here.  A MRO is monotonic when the
   following is true:  *if C1 precedes C2 in the linearization of C,
   then C1 precedes C2 in the linearization of any subclass of C*.
   Otherwise, the innocuous operation of deriving a new class could
   change the resolution order of methods, potentially introducing very
   subtle bugs.  Examples where this happens will be shown later.

6) Not all classes admit a linearization.  There are cases, in
   complicated hierarchies, where it is not possible to derive a class
   such that its linearization respects all the desired properties.

Here I give an example of this situation. Consider the hierarchy 

  >>> O = object
  >>> class X(O): pass
  >>> class Y(O): pass
  >>> class A(X,Y): pass
  >>> class B(Y,X): pass

which can be represented with the following inheritance graph, where I
have denoted with O the ``object`` class, which is the beginning of any
hierarchy for new style classes:

 ::

          -----------
         |           |
         |    O      |
         |  /   \    |
          - X    Y  /
            |  / | /
            | /  |/
            A    B
            \   /
              ?

In this case, it is not possible to derive a new class C from A and B,
since X precedes Y in A, but Y precedes X in B, therefore the method
resolution order would be ambiguous in C.

Python 2.3 raises an exception in this situation (TypeError:  MRO
conflict among bases Y, X) forbidding the naive programmer from creating
ambiguous hierarchies.  Python 2.2 instead does not raise an exception,
but chooses an *ad hoc* ordering (CABXYO in this case).

The C3 Method Resolution Order
------------------------------

Let me introduce a few simple notations which will be useful for the
following discussion.  I will use the shortcut notation

  C1 C2 ... CN

to indicate the list of classes [C1, C2, ... , CN].

The *head* of the list is its first element:

  head = C1

whereas the *tail* is the rest of the list:

  tail = C2 ... CN.

I shall also use the notation

  C + (C1 C2 ... CN) = C C1 C2 ... CN

to denote the sum of the lists [C] + [C1, C2, ... ,CN].

Now I can explain how the MRO works in Python 2.3.

Consider a class C in a multiple inheritance hierarchy, with C
inheriting from the base classes B1, B2, ...  , BN.  We want to compute
the linearization L[C] of the class C. In order to do that, we need the
concept of *merging* lists, since the rule says that

  *the linearization of C is the sum of C plus the merge of a) the
  linearizations of the parents and b) the list of the parents.*

In symbolic notation:

   L[C(B1 ... BN)] = C + merge(L[B1] ... L[BN], B1 ... BN)

How is the merge computed? The rule is the following:

  *take the head of the first list, i.e L[B1][0]; if this head is not in
  the tail of any of the other lists, then add it to the linearization
  of C and remove it from the lists in the merge, otherwise look at the
  head of the next list and take it, if it is a good head.  Then repeat
  the operation until all the class are removed or it is impossible to
  find good heads.  In this case, it is impossible to construct the
  merge, Python 2.3 will refuse to create the class C and will raise an
  exception.*

This prescription ensures that the merge operation *preserves* the
ordering, if the ordering can be preserved.  On the other hand, if the
order cannot be preserved (as in the example of serious order
disagreement discussed above) then the merge cannot be computed.

The computation of the merge is trivial if:

1. C is the ``object`` class, which has no parents; in this case its
   linearization coincides with itself,

       L[object] = object.

2. C has only one parent (single inheritance); in this case

       L[C(B)] = C + merge(L[B],B) = C + L[B]

However, in the case of multiple inheritance things are more cumbersome
and I don't expect you can understand the rule without a couple of
examples ;-)

Examples
--------

First example. Consider the following hierarchy:

  >>> O = object
  >>> class F(O): pass
  >>> class E(O): pass
  >>> class D(O): pass
  >>> class C(D,F): pass
  >>> class B(D,E): pass
  >>> class A(B,C): pass

In this case the inheritance graph can be drawn as 

 ::
 
                            6
                           ---
  Level 3                 | O |                  (more general)
                        /  ---  \
                       /    |    \                      |
                      /     |     \                     |
                     /      |      \                    |
                    ---    ---    ---                   |
  Level 2        3 | D | 4| E |  | F | 5                |
                    ---    ---    ---                   |
                     \  \ _ /       |                   |
                      \    / \ _    |                   |
                       \  /      \  |                   |
                        ---      ---                    |
  Level 1            1 | B |    | C | 2                 |
                        ---      ---                    |
                          \      /                      |
                           \    /                      \ /
                             ---
  Level 0                 0 | A |                (more specialized)
                             ---


The linearizations of O,D,E and F are trivial:

 ::

  L[O] = O
  L[D] = D O
  L[E] = E O
  L[F] = F O

The linearization of B can be computed as

 ::

  L[B] = B + merge(DO, EO, DE)

We see that D is a good head, therefore we take it and we are reduced to
compute merge(O,EO,E).  Now O is not a good head, since it is in the
tail of the sequence EO.  In this case the rule says that we have to
skip to the next sequence.  Then we see that E is a good head; we take
it and we are reduced to compute merge(O,O) which gives O. Therefore

 ::

  L[B] =  B D E O

Using the same procedure one finds:

 ::

  L[C] = C + merge(DO,FO,DF)
       = C + D + merge(O,FO,F)
       = C + D + F + merge(O,O)
       = C D F O

Now we can compute:

 ::

  L[A] = A + merge(BDEO,CDFO,BC)
       = A + B + merge(DEO,CDFO,C)
       = A + B + C + merge(DEO,DFO)
       = A + B + C + D + merge(EO,FO)
       = A + B + C + D + E + merge(O,FO)
       = A + B + C + D + E + F + merge(O,O)
       = A B C D E F O

In this example, the linearization is ordered in a pretty nice way
according to the inheritance level, in the sense that lower levels (i.e.
more specialized classes) have higher precedence (see the inheritance
graph).  However, this is not the general case.

I leave as an exercise for the reader to compute the linearization for
my second example:

  >>> O = object
  >>> class F(O): pass
  >>> class E(O): pass
  >>> class D(O): pass
  >>> class C(D,F): pass
  >>> class B(E,D): pass
  >>> class A(B,C): pass

The only difference with the previous example is the change B(D,E) -->
B(E,D); however even such a little modification completely changes the
ordering of the hierarchy

 ::

                             6
                            ---
  Level 3                  | O |
                         /  ---  \
                        /    |    \
                       /     |     \
                      /      |      \
                    ---     ---    ---
  Level 2        2 | E | 4 | D |  | F | 5
                    ---     ---    ---
                     \      / \     /
                      \    /   \   /
                       \  /     \ /
                        ---     ---
  Level 1            1 | B |   | C | 3
                        ---     ---
                         \       /
                          \     /
                            ---
  Level 0                0 | A |
                            ---


Notice that the class E, which is in the second level of the hierarchy,
precedes the class C, which is in the first level of the hierarchy, i.e.
E is more specialized than C, even if it is in a higher level.

A lazy programmer can obtain the MRO directly from Python 2.2, since in
this case it coincides with the Python 2.3 linearization.  It is enough
to invoke the .mro() method of class A:

  >>> A.mro()
  (<class '__main__.A'>, <class '__main__.B'>, <class '__main__.E'>,
  <class '__main__.C'>, <class '__main__.D'>, <class '__main__.F'>,
  <type 'object'>)

Finally, let me consider the example discussed in the first section,
involving a serious order disagreement.  In this case, it is
straightforward to compute the linearizations of O, X, Y, A and B:

 ::

  L[O] = 0
  L[X] = X O
  L[Y] = Y O
  L[A] = A X Y O
  L[B] = B Y X O

However, it is impossible to compute the linearization for a class C
that inherits from A and B:

 ::

  L[C] = C + merge(AXYO, BYXO, AB)
       = C + A + merge(XYO, BYXO, B)
       = C + A + B + merge(XYO, YXO)

At this point we cannot merge the lists XYO and YXO, since X is in the
tail of YXO whereas Y is in the tail of XYO:  therefore there are no
good heads and the C3 algorithm stops.  Python 2.3 raises an error and
refuses to create the class C.

Bad Method Resolution Orders
----------------------------

A MRO is *bad* when it breaks such fundamental properties as local
precedence ordering and monotonicity.  In this section, I will show
that both the MRO for classic classes and the MRO for new style classes
in Python 2.2 are bad.

It is easier to start with the local precedence ordering.  Consider the
following example:

  >>> F=type('Food',(),{'remember2buy':'spam'})
  >>> E=type('Eggs',(F,),{'remember2buy':'eggs'})
  >>> G=type('GoodFood',(F,E),{}) #under Python 2.3 this is an error

with inheritance diagram

 ::

                O
                |
   (buy spam)   F
                | \
                | E   (buy eggs)
                | /
                G

         (buy eggs or spam ?)


We see that class G inherits from F and E, with F *before* E:  therefore
we would expect the attribute *G.remember2buy* to be inherited by
*F.rembermer2buy* and not by *E.remember2buy*:  nevertheless Python 2.2
gives

  >>> G.remember2buy #under Python 2.3 this is an error
  'eggs'

This is a breaking of local precedence ordering since the order in the
local precedence list, i.e. the list of the parents of G, is not
preserved in the Python 2.2 linearization of G:

 ::

  L[G,P22]= G E F object   # F *follows* E

One could argue that the reason why F follows E in the Python 2.2
linearization is that F is less specialized than E, since F is the
superclass of E; nevertheless the breaking of local precedence ordering
is quite non-intuitive and error prone.  This is particularly true since
it is a different from old style classes:

  >>> class F: remember2buy='spam'
  >>> class E(F): remember2buy='eggs'
  >>> class G(F,E): pass
  >>> G.remember2buy
  'spam'

In this case the MRO is GFEF and the local precedence ordering is
preserved.

As a general rule, hierarchies such as the previous one should be
avoided, since it is unclear if F should override E or viceversa.
Python 2.3 solves the ambiguity by raising an exception in the creation
of class G, effectively stopping the programmer from generating
ambiguous hierarchies.  The reason for that is that the C3 algorithm
fails when the merge

 ::

   merge(FO,EFO,FE)

cannot be computed, because F is in the tail of EFO and E is in the tail
of FE.

The real solution is to design a non-ambiguous hierarchy, i.e. to derive
G from E and F (the more specific first) and not from F and E; in this
case the MRO is GEF without any doubt.

 ::

                O
                |
                F (spam)
              / |
     (eggs)   E |
              \ |
                G
                  (eggs, no doubt)


Python 2.3 forces the programmer to write good hierarchies (or, at
least, less error-prone ones).

On a related note, let me point out that the Python 2.3 algorithm is
smart enough to recognize obvious mistakes, as the duplication of
classes in the list of parents:

  >>> class A(object): pass
  >>> class C(A,A): pass # error
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
  TypeError: duplicate base class A

Python 2.2 (both for classic classes and new style classes) in this
situation, would not raise any exception.

Finally, I would like to point out two lessons we have learned from this
example:

1. despite the name, the MRO determines the resolution order of
   attributes, not only of methods;

2. the default food for Pythonistas is spam !  (but you already knew
   that ;-)

Having discussed the issue of local precedence ordering, let me now
consider the issue of monotonicity.  My goal is to show that neither the
MRO for classic classes nor that for Python 2.2 new style classes is
monotonic.

To prove that the MRO for classic classes is non-monotonic is rather
trivial, it is enough to look at the diamond diagram:

 ::


                   C
                  / \
                 /   \
                A     B
                 \   /
                  \ /
                   D

One easily discerns the inconsistency:

 ::

  L[B,P21] = B C        # B precedes C : B's methods win
  L[D,P21] = D A C B C  # B follows C  : C's methods win!

On the other hand, there are no problems with the Python 2.2 and 2.3
MROs, they give both

 ::

  L[D] = D A B C

Guido points out in his essay [#]_ that the classic MRO is not so bad in
practice, since one can typically avoids diamonds for classic classes.
But all new style classes inherit from object, therefore diamonds are
unavoidable and inconsistencies shows up in every multiple inheritance
graph.

The MRO of Python 2.2 makes breaking monotonicity difficult, but not
impossible.  The following example, originally provided by Samuele
Pedroni, shows that the MRO of Python 2.2 is non-monotonic:

  >>> class A(object): pass
  >>> class B(object): pass
  >>> class C(object): pass
  >>> class D(object): pass
  >>> class E(object): pass
  >>> class K1(A,B,C): pass
  >>> class K2(D,B,E): pass
  >>> class K3(D,A):   pass
  >>> class Z(K1,K2,K3): pass

Here are the linearizations according to the C3 MRO (the reader should
verify these linearizations as an exercise and draw the inheritance
diagram ;-)

 ::

  L[A] = A O
  L[B] = B O
  L[C] = C O
  L[D] = D O
  L[E] = E O
  L[K1]= K1 A B C O
  L[K2]= K2 D B E O
  L[K3]= K3 D A O
  L[Z] = Z K1 K2 K3 D A B C E O

Python 2.2 gives exactly the same linearizations for A, B, C, D, E, K1,
K2 and K3, but a different linearization for Z:

 ::

  L[Z,P22] = Z K1 K3 A K2 D B C E O

It is clear that this linearization is *wrong*, since A comes before D
whereas in the linearization of K3 A comes *after* D. In other words, in
K3 methods derived by D override methods derived by A, but in Z, which
still is a subclass of K3, methods derived by A override methods derived
by D!  This is a violation of monotonicity.  Moreover, the Python 2.2
linearization of Z is also inconsistent with local precedence ordering,
since the local precedence list of the class Z is [K1, K2, K3] (K2
precedes K3), whereas in the linearization of Z K2 *follows* K3.  These
problems explain why the 2.2 rule has been dismissed in favor of the C3
rule.

.. [#] The thread on python-dev started by Samuele Pedroni:
       http://mail.python.org/pipermail/python-dev/2002-October/029035.html

.. [#] The paper *A Monotonic Superclass Linearization for Dylan*:
       http://www.webcom.com/haahr/dylan/linearization-oopsla96.html

.. [#] Guido van Rossum's essay, *Unifying types and classes in Python 2.2*:
       http://www.python.org/2.2.2/descrintro.html

.. [#] The (in)famous book on metaclasses, *Putting Metaclasses to Work*:
       Ira R. Forman, Scott Danforth, Addison-Wesley 1999 (out of print,
       but probably still available on http://www.amazon.com)


Understanding the Method Resolution Order
--------------------------------------------------------------------------

The MRO of any given (new style) Python class is given
by the special attribute ``__mro__``. Notice that since
Python is an extremely dynamic language it is possible
to delete and to generate whole classes at run time, therefore the MRO
is a dynamic concept. For instance, let me show how it is possibile to 
remove a class from my 
paleoanthropological hierarchy: for instance I can
replace the last class 'HomoSapiensSapiens' with 'HomoSapiensNeardenthalensis'
(changing a class in the middle of the hierarchy would be more difficult). The
following lines do the job dynamically:

  >>> from oopp import *
  >>> del HomoSapiensSapiens
  >>> class HomoSapiensNeardenthalensis(HomoSapiens):
  ...     def can(self):
  ...         super(self.__this,self).can()
  ...         print " - make something"
  >>> reflective(HomoSapiensNeardenthalensis)
  >>> HomoSapiensNeardenthalensis().can()
  HomoSapiensNeardenthalensis can:
   - make tools
   - make abstractions
   - make something

In this case the MRO of 'HomoSapiensNeardenthalensis', i.e. the list of
all its ancestors, is

  >>> HomoSapiensNeardenthalensis.__mro__
  [<class '__main__.HomoSapiensNeardenthalensis'>,<class 'oopp.HomoSapiens'>, 
   <class 'oopp.HomoHabilis'>, <class 'oopp.Homo'>, 
   <class 'oopp.PrettyPrinted'>, <class 'oopp.object'>]

The ``__mro__`` attribute gives the *linearization* of the class, i.e. the
ordered list of its ancestors, starting from the class itself and ending
with object. The linearization of a class is essential in order to specify
the resolution order of methods and attributes, i.e. the Method Resolution
Order (MRO). In the case of single inheritance hierarchies, such the
paleonthropological example, the MRO is pretty obvious; on the contrary
it is a quite non-trivial concept in the case of multiple inheritance 
hierarchies. 

For instance, let me reconsider my first example of multiple inheritance,
the ``NonInstantiableClock`` class, inheriting from 'NonInstantiable' and 
'Clock'. I may represent the hierarchy with the following inheritance graph:

 ::


                     --   object   -- 
                   /     (__new__)    \
                  /                    \
                 /                      \
              Clock                NonInstantiable
           (get_time)                 (__new__)
                \                         /
                 \                       /
                  \                     /
                   \                   /
                    \                 /
                    NonInstantiableClock   
                     (get_time,__new__)


The class ``Clock`` define a ``get_time`` method, whereas the class 
``NonInstantiable`` overrides the ``__new__`` method of the ``object`` class; 
the class ``NonInstantiableClock`` inherits ``get_time`` from 'Clock' and 
``__new__`` from 'NonInstantiable'.

The linearization of 'NonInstantiableClock' is 

  >>> NonInstantiableClock.mro()
  [<class '__main__.NonInstantiableClock'>, <class 'oopp.Clock'>, 
   <class 'oopp.NonInstantiable'>, <type 'object'>]


In particular, since 'NonInstantiable' precedes 'object', its ``__new__`` 
method overrides the ``object`` new method. However, with the MRO used before
Python 2.2, the linearization would have been ``NonInstantiableClock, Clock, 
object, NonInstantiable, object`` and the ``__new__`` method of object would 
have (hypothetically, of course, since before Python 2.2 there was not 
``__new__`` method! ;-)  overridden the ``__new__``
method of ``NonInstantiable``, therefore ``NonInstantiableClock`` would 
have lost the property of being non-instantiable!

This simple example shows that the choice of a correct Method Resolution 
Order is far from being obvious in general multiple inheritance hierarchies. 
After a false start in Python 2.2, (with a MRO failing in some subtle cases)
Python 2.3 decided to adopt the so-called C3 MRO, invented by people working 
on Dylan (even if Dylan itself uses the MRO of Common Lisp CLOS). Since this 
is quite a technical matter, I defer the interested reader to appendix 2 
for a full discussion of the C3 algorithm. 

Here, I prefer to point out how the built-in
``super`` object works in multiple inheritance situations. To this aim, it 
is convenient to define an utility function that retrieves the ancestors
of a given class with respect to the MRO of one of its subclasses:

 ::

  #<oopp.py>

  def ancestor(C,S=None):
      """Returns the ancestors of the first argument with respect to the 
      MRO of the second argument. If the second argument is None, then 
      returns the MRO of the first argument."""
      if C is object:
          raise TypeError("There is no superclass of object")
      elif S is None or S is C:
          return list(C.__mro__)
      elif issubclass(S,C): # typical case
          mro=list(S.__mro__)
          return mro[mro.index(C):] # compute the ancestors from the MRO of S
      else:
          raise TypeError("S must be a subclass of C")

  #</oopp.py>

Let me show how the function ``ancestor`` works. 
Consider the class ``Clock`` in isolation: then 
its direct superclass, i.e. the first ancestor, is ``object``,

  >>> from oopp import *
  >>> ancestor(Clock)[1]
  <type 'object'>

therefore ``super(Clock).__new__`` retrieves the ``object.__new__`` method:

  >>> super(Clock).__new__
  <built-in method __new__ of type object at 0x80e6fc0>

Consider now the ``Clock`` class together with its subclass 
``NonInstantiableClock``:
in this case the first ancestor of ``Clock``, *with respect to the MRO of 
'NonInstantiableClock'* is ``NonInstantiable``

  >>> ancestor(Clock,NonInstantiableClock)[1] 
  <class 'oopp.NonInstantiable'>

Therefore ``super(Clock,NonInstantiableClock).__new__`` retrieves the 
``NonInstantiable.__new__`` method:

  >>> super(Clock,NonInstantiableClock).__new__
  <function __new__ at 0x81b293c>
  >>> NonInstantiable.__new__
  <function __new__ at 0x81b293c>

It must be pointed out that ``super(C,S)`` is equivalent but not the same 
than ``ancestor(C,S)[1]``, since it does not return the superclass: 
it returns a super object, instead:

  >>> super(Clock,NonInstantiableClock)
  <super: <class 'Clock'>, <type object>>

  #<oopp.py>

  #class Super(super):
  #    def __init__(self,C,S=None):
  #        super(Super,self).__init__(C,S)
  #        self.__name__="Super(%s)" % C.__name__

  #</oopp.py>

Finally, there is little quirk of super:

  >>> class C(PrettyPrinted): pass
  >>> s=super(C,C())
  >>> s.__str__()

but 

  >>> str(s) # idem for print s
  "<super: <class 'C'>, <C object>>"

Idem for non-pre-existing methods:

  >>> class D(list): pass
  ...
  >>> s=super(D,D())
  >>> s.__len__()
  0
  >>> len(s) #error
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
  TypeError: len() of unsized object

The same problem comes with ``__getattr__``:

  >>> class E(object):
  ...     def __getattr__(self,name):
  ...             if name=='__len__': return lambda:0
  ...
  >>> e=E()
  >>> e.__len__()
  0
  >>> len(e) # error
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
  TypeError: len() of unsized object

Counting instances
----------------------------------------------------------------------
 
 .. line-block::

  *Everything should be built top-down, except the first time.*
  -- Alan Perlis

Multiple inheritance adds a step further to the bottom-up philosophy and
it makes appealing the idea of creating classes with the only 
purpose of being derived. Whereas in the top-down approach one starts
with full featured standalone classes, to be further refined, in the
mix-in approach one starts with bare bone classes, providing very simple 
or even trivial features, with the purpose of providing 
basic reusable components in multiple inheritance hierarchies.
At the very end, the idea is to generate a library of *mixin* classes, to be
composed with other classes. We already saw a couple of examples of
mixin classes: 'NonInstantiable' and 'Customizable'. In this paragraph
I will show three other examples: 'WithCounter','Singleton' and
'AvoidDuplication'.

A common requirement for a class is the ability to count the number of its
instances. This is a quite easy problem: it is enough to increments a counter 
each time an instance of that class is initialized. However, this idea can
be implemented in the wrong way. i.e. naively one could implement
counting capabilities in a class without such capabilities by modifying the
``__init__`` method explicitly in the original source code. 
A better alternative is to follow the bottom-up approach and to implement 
the counting feature in a separate mix-in class: then the feature can be 
added to the original class via multiple inheritance, without touching 
the source.
Moreover, the counter class becomes a reusable components that can be
useful for other problems, too. In order to use the mix-in approach, the 
``__new__`` method of the counter class must me cooperative, and preferably
via an anonymous super call.

 ::

  #<oopp.py>

  class WithCounter(object): 
      """Mixin class counting the total number of its instances and storing 
       it in the class attribute counter."""
  
      counter=0 # class attribute (or static attribute in C++/Java terms)
   
      def __new__(cls,*args,**kw):
          cls.counter+=1 # increments the class attribute
          return super(cls.__this,cls).__new__(cls,*args,**kw)  
          #anonymous cooperative call to the superclass's method __new__

  reflective(WithCounter)

  #</oopp.py>

Each time an instance of 'WithCounter' is initialized, the counter 'count' is
incremented and when 'WithCounter' is composed trough multiple inheritance, 
its '__new__'  method cooperatively invokes the ``__new__`` method 
of the other components.

For instance, I can use 'WithCounter' to implement a 'Singleton', i.e. 
a class that can have only one instance. This kind of classes can be
obtained as follows:

 ::

  #<oopp.py>

  class Singleton(WithCounter):
      "If you inherit from me, you can only have one instance"
      def __new__(cls,*args,**kw):
          if cls.counter==0: #first call
              cls.instance=super(cls.__this,cls).__new__(cls,*args,**kw)
          return cls.instance

  reflective(Singleton)

  #</oopp.py>

As an application, I can create a 
class ``SingleClock`` that inherits from ``Clock`` 
*and* from ``Singleton``. This means that ``SingleClock`` is both a 
'Clock' and a 'Singleton', i.e. there can be only a clock:

  >>> from oopp import Clock,Singleton
  >>> class SingleClock(Clock,Singleton): pass
  ...
  >>> clock1=SingleClock()
  >>> clock2=SingleClock()
  >>> clock1 is clock2
  True

Instantiating many clocks is apparently possible (i.e. no error
message is given) but you always obtain the same instance. This makes
sense, since there is only one time on the system and a single
clock is enough.

A variation of the 'Singleton' is a class that generates a new
instance only when a certain condition is satisfied. Suppose for instance
one has a 'Disk' class, to be instantiated with the syntax 
``Disk(xpos,ypos,radius)``.
It is clear that two disks with the same radius and the same position in
the cartesian plane, are essentially the same disk (assuming there are no
additional attributes such as the color). Therefore it is a vaste of memory
to instantiate two separate objects to describe the same disk. To solve
this problem, one possibility is to store in a list the calling arguments.
When it is time to instanciate a new objects with arguments args = xpos,ypos,
radius, Python should check if a disk with these arguments has already
been instanciated: in this case that disk should be returned, not a new
one. This logic can be elegantly implemented in a mix-in class such as the 
following (compare with the ``withmemory`` wrapper in chapter 2):

 ::

  #<oopp.py>

  class AvoidDuplication(object):
      def __new__(cls,*args,**kw):
          return super(cls.__this,cls).__new__(cls,*args,**kw) 
      __new__=withmemory(__new__) # collects the calls in __new__.result

  reflective(AvoidDuplication)

  #</oopp.py>

Notice that 'AvoidDuplication' is introduced with the only purpose of
giving its functionality to 'Disk': in order to reach this goal, it is enough 
to derive 'Disk' from this class and our previously
introduced 'GeometricFigure' class by writing something like

  >>> from oopp import *
  >>> class Disk(GeometricFigure,AvoidDuplication): 
  ...     def __init__(self,xpos,ypos,radius):
  ...         return super(Disk,self).__init__('(x-x0)**2+(y-y0)**2 <= r**2', 
  ...                                          x0=xpos,y0=ypos,r=radius)

Now, if we create a disk

  >>> c1=Disk(0,0,10) #creates a disk of radius 10

it is easy enough to check that trying to instantiate a new disk with the
*same* arguments return the old disk:

  >>> c2=Disk(0,0,10) #returns the *same* old disk
  >>> c1 is c2
  True

Here, everything works, because through the 
cooperative ``super`` mechanism, ``Disk.__init__`` calls 
``AvoidDuplication.__init__`` that calls ``GeometricFigure.__init__`` 
that in turns initialize the disk. Inverting the order of
'AvoidDuplication' and 'GeometricFigure' would case a disaster, since
``GeometricFigure.__init__`` would override ``AvoidDuplication.__init__``.

Alternatively, one could use the object factory 'Makeobj' implemented in 
chapter 3:
  
  >>> class NonDuplicatedFigure(GeometricFigure,AvoidDuplication): pass
  >>> makedisk=Makeobj(NonDuplicatedFigure,'(x-x0)**2/4+(y-y0)**2 <= r**2')
  >>> disk1=makedisk(x0=38,y0=7,r=5)
  >>> disk2=makedisk(x0=38,y0=7,r=5)
  >>> disk1 is disk2
  True

Remark: it is interesting to notice that the previous approach would not work
for keyword arguments, directly, since dictionary are unhashable.

The pizza-shop example 
----------------------------------------------------------------

Now it is time to give a non-trivial example of multiple inheritance with
cooperative and non-cooperative classes. The point is that multiple 
inheritance can easily leads to complicated hierarchies: where the
resolution order of methods is far from being obvious and actually
can give bad surprises.

To explain the issue, let me extend the program for the pizza-shop owner of
chapter 4, by following the bottom-up approach and using anonymous
cooperative super calls.
In this approach, one starts from the simplest thing. 
It is clear that the pizza-shop owner has interest in recording all the 
pizzas he sell. 
To this aim, he needs a class providing logging capabilities: 
each time a new instance is created, its features are stored in a log file. In
order to count the total number of instances, 'WithLogger' must derive from
the 'WithCounter' class. In order to have a nicely printed message,
'WithLogger' must derive from 'PrettyPrinted'. Finally, 
since 'WithLogger' must be a general purpose 
class that I will reuse in other problem as a mixin class, it must be 
cooperative. 'WithLogger' can be implemented as follows:

 ::

  #<oopp.py>

  class WithLogger(WithCounter,PrettyPrinted):
      """WithLogger inherits from WithCounter the 'count' class attribute; 
      moreover it inherits '__str__' from PrettyPrinted"""
      logfile=sys.stdout #default
      verboselog=False #default
      def __init__(self,*args,**kw): 
          super(self.__this,self).__init__(*args,**kw) # cooperative
          dic=attributes(self) # non-special attributes dictionary
          print >> self.logfile,'*'*77
          print >> self.logfile, time.asctime()
          print >> self.logfile, "%s. Created %s" % (type(self).counter,self)
          if self.verboselog:
              print >> self.logfile,"with accessibile non-special attributes:"
              if not dic: print >> self.logfile,"<NOTHING>",
              else: print >> self.logfile, pretty(dic)

  reflective(WithLogger)

  #</oopp.py>

Here I could well use ``super(self.__this,self).__init__(*args,**kw)`` 
instead of ``super(self.__this,self).__init__(*args,**kw)``, nevertheless 
the standard ``super`` works in this case and I can use it with better 
performances.
Thanks to the power of multiple inheritance, we may give logging features
to the 'CustomizablePizza' class defined in chapter 4 
with just one line of code:

  >>> from oopp import *
  >>> class Pizza(WithLogger,CustomizablePizza):
  ...     "Notice, WithLogger is before CustomizablePizza"
  >>> Pizza.With(toppinglist=['tomato'])('small')
  ****************************************************************************
  Sat Feb 22 14:54:44 2003
  1. Created <Pizza>
  <__main__.Pizza object at 0x816927c>

It is also possible to have a more verbose output:

  >>> Pizza.With(verboselog=True)
  <class '__main__.Pizza'>
  >>> Pizza('large')
  ****************************************************************************
  Sat Feb 22 14:59:51 2003
  1. Created <Pizza>
  with accessibile non-special attributes: 
  With = <bound method type.customized of <class '__main__.Pizza'>>
  baseprice = 1
  count = 2
  formatstring = %s
  logfile = <open file '<stdout>', mode 'w' at 0x402c2058>
  price = <bound method Pizza.price of <__main__.Pizza object at 0x402f6c8c>>
  size = large
  sizefactor = {'small': 1, 'large': 3, 'medium': 2}
  topping_unit_price = 0.5
  toppinglist = ['tomato']
  toppings_price = <bound method Pizza.toppings_price of 
    <__main__.Pizza object at 0x402f6c8c>>
  verboselog = True
  with = <bound method Pizza.customized of 
  <__main__.Pizza object at 0x402f6c8c>>
  <__main__.Pizza object at 0x401ce7ac>

However, there is a problem here, since the output is '<Pizza>' and
not the nice 'large pizza with tomato, cost $ 4.5' that we would
expect from a child of 'CustomizablePizza'. The solution to the
puzzle is given by the MRO:

  >>> Pizza.mro()
  [<class '__main__.Pizza'>, <class 'oopp.WithLogger'>, 
   <class 'oopp.WithCounter'>, <class 'oopp.PrettyPrinted'>,
   <class 'oopp.CustomizablePizza'>, <class 'oopp.GenericPizza'>, 
   <class 'oopp.Customizable'>,  <type 'object'>]

The inheritance graph is rather complicated:

 ::
                                 
                              object  7

                           /     /   \     \
                         /      /     \      \
                       /       /       \       \
                     /        /         \        \
                   /         /           \         \
                 /          /             \          \
               /           /               \           \
             /            /                 \            \
           /             /                   \             \
    2  WithCounter   PrettyPrinted 3    GenericPizza 5  Customizable 6
      (__new__)    (__str__,__init__)      (__str__)       /              
          \            /                       /        /    
           \          /                       /      /         
            \        /                       /    /
             \      /                       /  /
              \    /            CustomizablePizza  4
               \  /                      /       
       1     WithLogger                /
             (__init__)              /        
                  \                /
                   \             /
                    \          /
                     \       /
                      \    /
  
                      Pizza  O
                              

As we see, the precedence in the resolution of methods is far from being 
trivial. It is denoted in the graph with numbers
from 0 to 7: first the methods of 'Pizza' (level 0), then the methods of 
'WithLogger' (level 1), then the methods of 'WithCounter' (level 2),  then 
the methods of 'PrettyPrinted' (level 3), then the methods of
'CustomizablePizza' (level 4), then the methods of 'GenericPizza' (level 5),
then the level of 'Customizable' (level 6), finally the 'object' methods 
(level 7). 

The reason why the MRO is so, can be understood by studying 
appendix 1.

We see that the ``__init__`` methods of 'WithLogger' and 
the ``__new__`` method of 'WithCounter' are cooperative. 
``WithLogger.__init__`` 
calls ``WithCounter.__init__`` that is
inherited from ``CustomizablePizza.__init__`` which is not cooperative, 
but this is not dangerous since ``CustomizablePizza.__init__`` does not need 
to call any other ``__init__``. 

However, ``PrettyPrinted.__str__`` and ``GenericPizza.__str__`` are not
cooperative and since 'PrettyPrinted' precedes 'GenericPizza', the
``GenericPizza.__str__`` method is overridden, which is bad.

If  ``WithLogger.__init__``  and ``WithCounter.__new__`` were not 
cooperative, they would therefore badly breaking the program.

The message is: when you inherit from both cooperative and non-cooperative
classes, put cooperative classes first. The will be fair and will not
blindly override methods of the non-cooperative classes.


With multiple inheritance you can reuse old code a lot,
however the price to pay, is to have a non-trivial hierarchy. If from
the beginning we knew that 'Pizza' was needing a 'WithLogger', 
a 'WithCounter' and the
ability to be 'Customizable' we could have put everything in an unique
class. The problem is that in real life one never knows ;) 
Fortunately, Python dynamism allows to correct design mistakes

Remark: in all text books about inheritance, the authors always stress
that inheritance should be used as a "is-a" relation, not
and "has-a" relation. In spite of this fact, I have decided to implement
the concept of having a logger (or a counter) via a mixin class. One 
should not blindly believe text books ;)

Fixing wrong hierarchies
-----------------------------------------------------------------------------

A typical metaprogramming technique, is the run-time modification of classes.
As I said in a previous chapter, this feature can confuse the programmer and 
should not be abused (in particular it should not be used as a replacement 
of inheritance!); nevertheless, there applications where the ability of 
modifying classes at run time is invaluable: for instance, 
it can be used to correct design mistakes. 

In this case we would like the ``__str__ method`` of 'PrettyPrinted' to be
overridden by ``GenericPizza.__str__``. Naively, this can be solved by
putting 'WithLogger' after 'GenericPizza'. Unfortunately, doing so 
would cause ``GenericPizza.__init__`` to override ``WithLogger.__init__``,
therefore by loosing logging capabilitiesr, unless countermeasures
are taken.

A valid countermeasure could be to replace the non-cooperative
``GenericPizza.__init__`` with a cooperative one. This can miraculously
done at run time in few lines of code:

 ::

  #<oopp.py>

  def coop_init(self,size): # cooperative __init__ for GenericPizza
      self.size=size
      super(self._GenericPizza__this,self).__init__(size)

  GenericPizza.__init__=coop_init # replace the old __init__

  reflective(GenericPizza) # define GenericPizza.__this

  #</oopp.py>

Notice the usage of the fully qualified private attribute
``self._GenericPizza__this`` inside ``coop_init``: since this function 
is defined outside any class, the automatica mangling mechanism cannot 
work and has to be implemented by hand. Notice also that 
``super(self._GenericPizza__this,self)`` could be replaced by
``super(GenericPizza,self)``; however the simpler approach is
less safe against possible future manipulations of the hierarchy. 
Suppose, for example, we want to create a copy of the hierarchy
with the same name but slightly different features (actually,
in chapter 8 we will implement a traced copy of the pizza hierarchy, 
useful for debugging purposes): then, using ``super(GenericPizza,self)``
would raise an error, since self would be an instance of the traced
hierarchy and ``GenericPizza``  the original nontraced class. Using
the form ``super(self._GenericPizza__this,self)`` and making 
``self._GenericPizza__this`` pointing to the traced 'GenericPizza'
class (actually this will happen automatically) the problems goes
away.

Now everything works if 'WithLogger' is put after 'CustomizablePizza'

  >>> from oopp import *
  >>> class PizzaWithLog(CustomizablePizza,WithLogger): pass 
  >>> PizzaWithLog.With(toppinglist=['tomato'])('large')
  ****************************************************************************
  Sun Apr 13 16:19:12 2003
  1. Created large pizza with tomato, cost $ 4.5
  <class '__main__.PizzaWithLog'>

The log correctly  says ``Created large pizza with tomato, cost $ 4.5`` and not
``Created <Pizza>`` as before since now ``GenericPizza.__str__``
overrides ``PrettyPrinted.__str__``. Moreover, the hierarchy is logically
better organized:

  >>> PizzaWithLog.mro()
  [<class '__main__.PizzaWithLog'>, <class 'oopp.CustomizablePizza'>, 
  <class 'oopp.GenericPizza'>,  <class 'oopp.Customizable'>, 
  <class 'oopp.WithLogger'>, <class 'oopp.WithCounter'>, 
  <class 'oopp.PrettyPrinted'>, <type 'object'>]

I leave as an exercise for the reader to make the ``__str__`` methods
cooperative ;)
 
Obviously, in this example it would have been better to correct the
original hierarchy, by leaving 'Beautiful' instantiable from the beginning
(that's why I said the 'Beautiful' is an example of wrong mix-in class): 
nevertheless, sometimes, one has do to with wrong hierarchies written by 
others, and it can be a pain to fix them, both directly by modifying the 
original source code, and indirectly
by inheritance, since one must change all the names, in order to distinghish
the original classes from the fixed ones. In those cases Python
dynamism can save your life. This also allows you enhance original
classes which are not wrong, but that simply don't do something you want
to implement.

Modifying classes at run-time can be trivial, as in the examples I have
shown here, but can also be rather tricky, as in this example

  >>> from oopp import PrettyPrinted
  >>> class PrettyPrintedWouldBe(object): __str__ = PrettyPrinted.__str__
  >>> print PrettyPrintedWouldBe() #error
  Traceback (most recent call last):
   File "<stdin>", line 1, in ?
  TypeError: unbound method __str__() must be called with PrettyPrinted 
  instance as first argument (got nothing instead)

As the error message says, the problem here, is that the 
``PrettyPrinted.__str__`` unbound method, has not received any argument. 
This is because in this
form ``PrettyPrintedWouldBe.__str__`` has been defined as an attribute, 
not as a real method. The solution is to write

  >>> class PrettyPrintedWouldBe(object): 
  ...     __str__ = PrettyPrinted.__dict__['__str__']
  ...
  >>> print PrettyPrintedWouldBe() # now it works
  <PrettyPrintedWouldBe>

This kind of run-time modifications does not work when private variables
are involved:

 ::

  #<changewithprivate.py>

  class C(object):
      __x='C.__init__'
      def __init__(self): 
          print self.__x # okay

  class D(object):
      __x='D.__init__'
      __init__=C.__dict__['__init__'] # error

  class New:
      class C(object):
          __x='New.C.__init__'
          __init__=C.__dict__['__init__'] # okay

  C()
  try: D()
  except AttributeError,e: print e

  #</changewithprivate.py>

Gives as result

 ::

  C.__init__
  'D' object has no attribute '_C__x'
  New.C.__init__

The problem is that when ``C.__dict__['__init__']`` is compiled 
(to byte-code) ``self.__x`` is expanded to ``self._C__x``. However,
when one invokes ``D.__init__``, a D-object is passed, which has
a ``self._D__x`` attribute, but not a ``self._C__x`` attribute (unless
'D' is a subclass of 'C'. Fortunately, Python wisdom

  *Namespaces are one honking great idea -- let's do more of those!*

suggests the right solution: to use a new class with the *same name*
of the old one, but in a different namespace, in order to avoid
confusion. The simplest way to generate a new namespace is to
declare a new class (the class 'New' in this example): then 'New.C'
becomes an inner class of 'New'. Since it has the same name of the
original class, private variables are correctly expanded and one
can freely exchange methods from 'C' to 'New.C' (and viceversa, too).

Modifying hierarchies
-------------------------------------------------------------------------

 ::

  def mod(cls): return cls

  class New: pass

  for c in HomoSapiensSapiens.__mro__:
      setattr(New,c.__name__,mod(c))

Inspecting Python code
-------------------------------------------------------------------------

how to inspect a class, by retrieving useful informations about its
information.

A first possibility is to use the standard ``help`` function.
The problem of this approach is that ``help`` gives too much 
information.

 ::

  #<oopp.py>

  #plaindata=
  plainmethod=lambda m:m #identity function

  class Get(object):
      """Invoked as Get(cls)(xxx) where xxx = staticmethod, classmethod,
      property, plainmethod, plaindata, returns the corresponding 
      attributes as a keyword dictionary. It works by internally calling 
      the routine inspect.classify_class_attrs. Notice that data
      attributes with double underscores are not retrieved 
      (this is by design)."""
      def __init__(self,cls):
          self.staticmethods=kwdict()
          self.classmethods=kwdict()
          self.properties=kwdict()
          self.methods=kwdict()
          self.data=kwdict()
          for name, kind, klass, attr in inspect.classify_class_attrs(cls):
              if kind=='static method':
                  self.staticmethods[name]=attr
              elif kind=='class method':
                  self.classmethods[name]=attr
              elif kind=='property':
                  self.properties[name]=attr
              elif kind=='method':
                  self.methods[name]=attr
              elif kind=='data':
                 if not special(name): self.data[name]=attr
      def __call__(self,descr): #could be done with a dict
          if descr==staticmethod: return self.staticmethods 
          elif descr==classmethod: return self.classmethods
          elif descr==property: return self.properties 
          elif descr==plainmethod: return self.methods
          elif descr==plaindata: return self.data
          else: raise SystemExit("Invalid descriptor")

  #</oopp.py>

With similar tricks one can automatically recognize cooperative methods:
#it is different, (better NOT to use descriptors)

 ::

  #<oopp.py>
  
  #class Cooperative(Class):
  #    __metaclass__ = WithWrappingCapabilities
  #
  #    def cooperative(method):
  #         """Calls both the superclass method and the class
  #         method (if the class has an explicit method). 
  #         Works for methods returning None."""
  #         name,cls=Cooperative.parameters # fixed by the meta-metaclass
  #         def _(*args,**kw):
  #            getattr(super(cls,args[0]),name)(*args[1:],**kw) 
  #            if method: method(*args,**kw) # call it
  #         return _
  #    
  #    cooperative=staticmethod(cooperative)

  
  #</oopp.py>

 ::

  #<oopp.py>

  def wrapH(cls):
      for c in cls.__mro__[:-2]:
          tracer.namespace=c.__name__
          new=vars(c).get('__new__',None)
          if new: c.__new__=tracedmethod(new)

  #</oopp.py>


THE MAGIC OF METACLASSES - PART I
==========================================================================

 .. line-block::

    *Metaclasses are deeper magic than 99% of users should ever
    worry about.  If you wonder whether you need them, you don't
    (the people who actually need them know with certainty that
    they need them, and don't need an explanation about why).*
    --Tim Peters

Python always had metaclasses, since they are inherent to its object
model. However, before Python 2.2, metaclasses where tricky and their
study could cause the programmer's brain to explode [#]_. Nowadays, 
the situation has changed, and the reader should be able to understand 
this chapter without risk for his/her brain (however I do not give any 
warranty ;)

Put it shortly, metaclasses give to the Python programmer
complete control on the creation of classes. This simple statement
has far reaching consequences, since the ability of interfering with
the process of class creation, enable the programmer to make miracles.

In this and in the following chapters, I will show some of these
miracles.

This chapter will focus on subtle problems of metaclasses in inheritance
and multiple inheritance, including multiple inheritance of metaclasses
with classes and metaclasses with metaclasses.

The next chapter will focus more on applications.


.. [#] Metaclasses in Python 1.5 [A.k.a the killer joke] 
       http://www.python.org/doc/essays/metaclasses/

There is very little documentation about metaclasses, except Guido's
essays and the papers by David Mertz and myself published in IBMdeveloperWorks

  http://www-106.ibm.com/developerworks/library/l-pymeta.html

Metaclasses as class factories
------------------------------------------------------------------------

In the Python object model (inspired from the Smalltalk, that had metaclasses 
a quarter of century ago!) classes themselves are objects. 
Now, since objects are instances of classes, that means that classes 
themselves can be seen as instances of special classes called *metaclasses*.
Notice that things get hairy soon, since by following this idea, one could 
say the metaclasses themselves are classes and therefore objects;  that 
would mean than even metaclasses can be seen as 
instances of special classes called meta-metaclasses. On the other hand,
meta-meta-classes can be seen as instances of meta-meta-metaclasses,
etc. Now, it should be obvious why metaclasses have gained such a 
reputation of brain-exploders ;). However, fortunately, the situation 
is not so bad in practice, since the infinite recursion of metaclasses is 
avoided because there is a metaclass that is the "mother of all metaclasses": 
the built-in metaclass *type*. 'type' has the property of being its own 
metaclass, therefore the recursion stops. Consider for instance the following
example:

  >>> class C(object): pass # a generic class
  >>> type(C) #gives the metaclass of C
  <type 'type'>
  >>> type(type(C)) #gives the metaclass of type
  <type 'type'>

The recursion stops, since the metaclass of 'type' is 'type'.
One cool consequence of classes being instances of 'type', 
is that since *type* is a subclass of object,

  >>> issubclass(type,object)
  True 

any Python class is not only a subclass of ``object``, but also
an instance of 'object':

  >>> isinstance(C,type)
  True
  >>> isinstance(C,object) 
  True
  >>> issubclass(C,object) 
  True

Notice that 'type' is an instance of itself (!) and therefore of 'object':

  >>> isinstance(type,type) # 'type' is an instance of 'type'
  True
  >>> isinstance(type,object) # therefore 'type' is an instance of 'object'
  True

As it is well known, ``type(X)`` returns the type of ``X``; however, 
``type`` has also a second form in which it acts as a class factory.
The form is ``type(name,bases,dic)`` where ``name`` is the name of
the new class to be created, bases is the tuple of its bases and dic
is the class dictionary. Let me give a few examples:

  >>> C=type('C',(),{})
  >>> C
  <class '__main__.C'>
  >>> C.__name__
  'C'
  >>> C.__bases__
  (<type 'object'>,)
  >>> C.__dict__
  <dict-proxy object at 0x8109054>

Notice that since all metaclasses inherits from ``type``, as a consequences
all metaclasses can be used as class factories. 

A fairy tale example will help in understanding the concept
and few subtle points on how attributes are transmitted from metaclasses
to their instances.

Let me start by defining a 'Nobility' metaclass :

  >>> class Nobility(type): attributes="Power,Richness,Beauty"

instances of 'Nobility' are classes such 'Princes', 'Dukes', 'Barons', etc.

  >>> Prince=Nobility("Prince",(),{})

Instances of 'Nobility' inherits its attributes, just as instances of normal
classes inherits the class docstring:

  >>> Prince.attributes
  'Power,Richness,Beauty'

Nevertheless, 'attributes' will not be retrieved by the ``dir`` function:

  >>> print dir(Prince)
  ['__class__', '__delattr__', '__dict__', '__doc__', '__getattribute__', 
   '__hash__', '__init__', '__module__', '__new__', '__reduce__', '__repr__', 
   '__setattr__', '__str__', '__weakref__']

However, this is a limitation of ``dir``, in reality ``Prince.attributes``
is there. On the other hand, the situation is different for a specific 
'Prince' object

  >>> charles=Prince()
  >>> charles.attributes #error
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
  AttributeError: 'Prince' object has no attribute 'attributes'

The transmission of metaclass attributes is not transitive:
instances of the metaclass inherits the attributes, but not the instances 
of the instances. This behavior is by design and is needed in order to avoid 
troubles with special methods. This point will be throughly 
explained in the last paragraph. For the moment, I my notice that the
behaviour is reasonable, since the abstract qualities  'Power,Richness,Beauty'
are more qualities of the 'Prince' class than of one specific representative.
They can always be retrieved via the ``__class__`` attribute:

  >>> charles.__class__.attributes
  'Power,Richness,Beauty'

Le me now define a metaclass 'Froggyness':

  >>> class Frogginess(type): attributes="Powerlessness,Poverty,Uglyness"

Instances of 'Frogginess' are classes like 'Frog', 'Toad', etc.

  >>> Frog=Frogginess("Frog",(),{})
  >>> Frog.attributes
  'Powerlessness,Poverty,Uglyness'

However, in Python miracles can happen:

  >>> def miracle(Frog): Frog.__class__=Nobility
  >>> miracle(Frog); Frog.attributes
  'Powerlessness,Richness,Beauty'

In this example a miracle happened on the class 'Frog', by changing its
(meta)class to 'Nobility'; therefore its attributes have changed accordingly.

However, there is subtle point here. Suppose we explicitly specify the 'Frog'
attributes, in such a way that it can be inherited by one of its specific
representative:

  >>> Frog.attributes="poor, small, ugly"
  >>> jack=Frog(); jack.attributes
  'poor, small, ugly'

Then the miracle cannot work:

 ::

  #<fairytale2.py>

  class Nobility(type): attributes="Power, Richness, Beauty"
  Prince=Nobility("Prince",(),{})
  charles=Prince()

  class Frogginess(type): attributes="Inpuissance, Poverty, Uglyness"
  Frog=Frogginess("Frog",(),{})
  Frog.attributes="poor, small, ugly"
  jack=Frog()

  def miracle(Frog): Frog.__class__=Nobility

  miracle(Frog)

  print "I am",Frog.attributes,"even if my class is",Frog.__class__

  #</fairytale2.py>

Output:

 ::

  I am poor, small, ugly even if my class is <class '__main__.Nobility'>

The reason is that Python first looks at specific attributes of an object
(in this case the object is the class 'Frog') an only if they are not found, 
it looks at the attributes of its class (here the metaclass 'Nobility').Since 
in this example the 'Frog' class has explicit attributes, the
result is ``poor, small, ugly``. If you think a bit, it makes sense.

Remark:

In Python 2.3 there are restrictions when changing the ``__class__`` 
attribute for classes:

  >>> C=type('C',(),{})
  >>> C.__class__ = Nobility #error
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
  TypeError: __class__ assignment: only for heap types

Here changing ``C.__class__`` is not allowed, since 'C' is an instance
of the built-in metaclass 'type'. This restriction, i.e. the fact that 
the built-in metaclass cannot be changed, has been imposed for
security reasons, in order to avoid dirty tricks with the built-in
classes. For instance, if it was possible to change the metaclass
of the 'bool' class, we could arbitrarily change the behavior of
boolean objects. This could led to abuses. 
Thanks to this restriction,
the programmer is always sure that built-in classes behaves as documented.
This is also the reason why 'bool' cannot be subclassed:

  >>> print bool.__doc__ # in Python 2.2 would give an error
  bool(x) -> bool
  Returns True when the argument x is true, False otherwise.
  The builtins True and False are the only two instances of the class bool.
  The class bool is a subclass of the class int, and cannot be subclassed.

In any case, changing the class of a class is not a good idea, since it
does not play well with inheritance, i.e. changing the metaclass of a base 
class does not change the metaclass of its children:

  >>> class M1(type): f=lambda cls: 'M1.f' #metaclass1
  >>> class M2(type): f=lambda cls: 'M2.f' #metaclass2
  >>> B=M1('B',(),{}) # B receives M1.f
  >>> class C(B): pass #C receives M1.f
  >>> B.f()
  'M1.f'
  B.__class__=M2 #change the metaclass
  >>> B.f() #B receives M2.f
  'M2.f'
  C.f() #however C does *not* receive M2.f
  >>> C.f()
  'M1.f'
  >>> type(B)
  <class '__main__.M2'>
  >>> type(C)
  <class '__main__.M1'>

Metaclasses as class modifiers
----------------------------------------------------------------------  

The interpretation of metaclasses in terms of class factories is quite
straightforward and I am sure that any Pythonista will be at home 
with the concept. However, metaclasses have such a reputation of black 
magic since their typical usage is *not* as class factories, but as 
*class modifiers*. This means that metaclasses are typically
used to modify *in fieri* classes. The trouble is that the
modification can be utterly magical.
Here there is another fairy tale example showing the syntax
(via the ``__metaclass__`` hook) and the magic of the game:

 ::

  #<oopp.py>

  class UglyDuckling(PrettyPrinted):
      "A plain, regular class"
      formatstring="Not beautiful, I am %s"

  class MagicallyTransformed(type):
      "Metaclass changing the formatstring of its instances"
      def __init__(cls,*args):
          cls.formatstring="Very beautiful, since I am %s"
          
  class TransformedUglyDuckling(PrettyPrinted):
      "A class metamagically modified"
      __metaclass__ = MagicallyTransformed
      formatstring="Not beautiful, I am %s" # will be changed

  #</oopp.py>

  >>> from oopp import *
  >>> print UglyDuckling()
  Not beautiful, I am <UglyDuckling>

In this example, even if in 'TransformedUglyDuckling' we explicitely
set the formatstring to  "Not beautiful, I am %s", the metaclass changes 
it to "Very beautiful, even if I am %s" and thus

  >>> print TransformedUglyDuckling() # gives
  Very beautiful, since I am <TransformedUglyDuckling>

Notice that the ``__metaclass__`` hook passes to the metaclass 
``MagicallyTransformed`` the name, bases and dictionary of the class 
being created, i.e. 'TransformedUglyDucking'.

Metaclasses, when used as class modifiers, act *differently*
from functions, when inheritance is
involved. To clarify this subtle point, consider a subclass 'Swan' 
of 'UglyDuckling':


  >>> from oopp import *
  >>> class Swan(UglyDuckling): 
  ...     formatstring="Very beautiful, I am %s"
  >>> print Swan()
  Very beautiful, I am <Swan>

Now, let me define a simple function acting as a class modifier:

  >>> def magicallyTransform(cls): 
  ...    "Modifies the class formatstring"
  ...    customize(cls,formatstring="Very beautiful, even if I am %s")
  ...    return cls

The function works:

  >>> magicallyTransform(UglyDuckling)
  >>> print UglyDuckling()
  Very beautiful, even if I am <UglyDuckling>

This approach is destructive, since we cannot have the original 
and the transformed class at the same time, and has potentially bad side
effects in the derived classes. Nevertheless, in this case it works
and it is not dangereous for the derived class 'Swan', since 'Swan' 
explicitly overrides the 'formatstring' attribute and doesn't care about 
the change in 'UglyDuckling.formatstring'. Therefore the output
of 

  >>> print Swan()
  Very beautiful, I am <Swan>

is still the same as before the action of the function ``magicallyTransform``.
The situation is quite different if we use the 'MagicallyTransformed'
metaclass:

  >>> from oopp import *
  >>> class Swan(TransformedUglyDuckling): 
  ...     formatstring="Very beautiful, I am %s"

  >>> print TransformedUglyDuckling() 
  Very beautiful, since I am <UglyDuckling>
  >>> print Swan() # does *not* print "Very beautiful, I am <Swan>"
  Very beautiful, since I am <Swan> 

Therefore,  not only the metaclass has magically transformed the 
'TransformedUglyDuckling.formatstring', it has also transformed the 
'Swan.formatstring'! And that, despite the fact that 
'Swan.formatstring' is explicitly set. 

The reason for this behaviour is that since 'UglyDuckling' is a base 
class with metaclass 'MagicallyTransformed', and since 'Swan' inherits from
'UglyDuckling', then 'Swan' inherits the metaclass 'MagicallyTransformed',
which is automatically called at 'Swan' creation time.
That's the reason why metaclasses are much more magical and much 
more dangerous than
functions: functions do not override attributes in the derived classes,
metaclasses do, since they are automagically called at the time of
creation of the subclass. In other words, functions are explicit,
metaclasses are implicit. Nevertheless, this behavior can be pretty
useful in many circumstances, and it is a feature, not a bug. In the
situations where this behavior is not intended, one should use a function,
not a metaclass. In general, metaclasses are better than functions,
since metaclasses are classes and as such they can inherit one from each 
other. This means that one can improve a basic metaclass trough 
(multiple) inheritance, with *reuse* of code.

A few caveats about the usage of metaclasses
------------------------------------------------------------------------

Let me start with some caveats about the ``__metaclass__`` hook, which
commonly used and quite powerful, but also quite dangereous. 

Let's imagine a programmer not
knowing about metaclasses and looking at the 'TransformedUglyDuckling'
code (assuming there are no comments): she would probably think
that "__metaclass__" is some special attribute used for introspection
purposes only, with no other effects, and she would probably expect
the output of the script to be "Not much, I am the class 
TransformedUglyDucking" whereas it is exacly the contrary! In other
words, when metaclasses are involved,  *what you see, is not what you get*.
The situation is even more implicit when the metaclass is inherited
from some base class, therefore lacking also the visual clue of the hook.

For these reasons, metaclasses are something to be used with great care; 
they can easily make your code unreadable and confuse inexpert programmers. 
Moreover, it is more difficult to debug programs involving metaclasses, since
methods are magically transformed by routines defined in the metaclass,
and the code you see in the class is *not* what Python sees. I think
the least confusing way of using metaclasses, is to concentrate all
the dynamics on them and to write empty classes except for the
metaclass hook. If you write a class with no methods such as

 ::

  class TransformedUglyDuckling(object):
      __metaclass__=MagicallyTransformed

then the only place to look at, is the metaclass. I have found extremely
confusing to have some of the methods defined in the class and some in
the metaclass, especially during debugging. 

Another point to make, is that the ``__metaclass__``
hook should not be used to modify pre-existing classes, 
since it requires modifying the source code (even if it is enough to 
change one line only). Moreover, it is confusing, since adding a 
``__metaclass__`` attribute *after* the class creation would not do the job:

  >>> from oopp import UglyDuckling, MagicallyTransformed
  >>> UglyDuckling.__metaclass__=MagicallyTransformed
  >>> print UglyDuckling()
  "Not much, I am the class UglyDuckling"

The reason is that we have to think of UglyDuckling as an instance of 
``type``, the built-in metaclasses; merely adding a ``__metaclass__`` 
attribute does not re-initialize the class.
The problem is elegantly solved by avoiding the hook and creating
an enhanced copy of the original class trough ``MagicallyTransformed``
used as a class factory.

  >>> name=UglyDuckling.__name__
  >>> bases=UglyDuckling.__bases__
  >>> dic=UglyDuckling.__dict__.copy()
  >>> UglyDuckling=MagicallyTransformed(name,bases,dic)

Notice that I have recreated 'UglyDuckling', giving to the new class
the old identifier. 

  >>> print UglyDuckling()
  Very beautiful, since I am <UglyDuckling>>

The metaclass of this new 'UglyDuckling' has been specified and will 
accompanies all future children of 'UglyDuckling':

  >>> class Swan(UglyDuckling): pass
  ...
  >>> type(Swan)
  <class '__main__.MagicallyTransformed'>

Another caveat, is in the overridding of `` __init__`` in the metaclass.
This is quite common in the case of metaclasses called trough the
``__metaclass__`` hook mechanism, since in this case the class
has been already defined (if not created) in the class statement,
and we are interested in initializing it, more than in recreating
it (which is still possible, by the way). 
The problem is that overriding ``__init__`` has severe limitations 
with respect to overriding ``__new__``,
since the 'name', 'bases' and 'dic' arguments cannot be directly
changed. Let me show an example:

 ::

  #<init_in_metaclass.py>

  from oopp import *

  class M(type):
      "Shows that dic cannot be modified in __init__, only in __new__"
      def __init__(cls,name,bases,dic):
          name='C name cannot be changed in __init__'
          bases='cannot be changed'
          dic['changed']=True

  class C(object):
      __metaclass__=M
      changed=False

  print C.__name__  # => C
  print C.__bases__ # => (<type 'object'>,)
  print C.changed   # => False

  #</init_in_metaclass.py>

The output of this script is ``False``: the dictionary cannot be changed in 
``__init__`` method. However, replacing ``dic['changed']=True`` with 
``cls.changed=True`` would work. Analougously, changing  ``cls.__name__`` 
would work. On the other hand, ``__bases__`` is a read-only attribute and 
cannot be changed once the class has been created, therefore there is no 
way it can be touched in ``__init__``. However, ``__bases__`` could be
changed in ``__new__`` before the class creation.

Metaclasses and inheritance 
-------------------------------------------------------------------------

It is easy to get confused about the difference between a metaclass
and a mix-in class in multiple inheritance, since 
both are denoted by adjectives and both share the same idea of 
enhancing a hierarchy. Moreover, both mix-in classes and metaclasses 
can be inherited in the whole hierarchy.
Nevertheless, they behaves differently
and there are various subtle point to emphasize. We have already
noticed in the first section that attributes of a metaclass
are transmitted to its instances, but not to the instances of the
instances, whereas the normal inheritance is transitive: the 
grandfather transmits its attributes to the children and to the grandchild 
too. The difference can be represented with the following picture, where
'M' is the metaclass, 'B' a base class, 'C' a children of 'B'
and c an instance of 'C':

 ::

           M (attr)         B (attr)   
           :                |
           C (attr)         C (attr)    
           :                :
           c ()             c (attr)    

Notice that here the relation of instantiation is denoted by a dotted line. 

This picture is valid when C has metaclass M but not base class, on when C
has base class but not metaclass. However, what happens whrn the class C has 
both a metaclass M and a base class B ?

  >>> class M(type): a='M.a'
  >>> class B(object): a='B.a'
  >>> class C(B): __metaclass__=M
  >>> c=C()

The situation can be represented by in the following graph,

 ::

      (M.a)   M   B  (B.a)
              :  /
              : /
         (?)  C
              :
              :
         (?)  c


Here the metaclass M and the base class B are fighting one against the other.
Who wins ? C should inherit the attribute 'B.a' from its base B, however,
the metaclass would like to induce an attribute 'M.a'.
The answer is that the inheritance constraint wins on the metaclass contraint:

  >>> C.a
  'B.a'
  >>> c.a
  'B.a'

The reason is the same we discussed in the fairy tale example: 'M.a' is
an attribute of the metaclass, if its instance C has already a specified
attributed C.a (in this case specified trough inheritance from B), then
the attribute is not modified. However, one could *force* the modification:

  >>> class M(type):
  ...     def __init__(cls,*args): cls.a='M.a'
  >>> class C(B): __metaclass__=M
  >>> C.a
  'M.a'

In this case the metaclass M would win on the base class B. Actually,
this is not surprising, since it is explicit. What could be surprising,
had we not explained why inheritance silently wins, is that

  >>> c.a
  'B.a'

This explain the behaviour for special methods like  
``__new__,__init__,__str__``, 
etc. which are defined both in the class and the metaclass with the same 
name (in both cases,they are inherited from ``object``).

In the chapter on objects, we learned that the printed representation of
an object can be modified by overring the ``__str__`` methods of its
class. In the same sense, the printed representation of a class can be 
modified by overring the ``__str__`` methods of its metaclass. Let me show an 
example:

 ::

  #<oopp.py>

  class Printable(PrettyPrinted,type):
     """Apparently does nothing, but actually makes PrettyPrinted acting as
        a metaclass."""

  #</oopp.py>

Instances of 'Printable' are classes with a nice printable representation:

  >>> from oopp import Printable 
  >>> C=Printable('Classname',(),{})
  >>> print C
  Classname

However, the internal string representation stays the same:

  >>> C # invokes Printable.__repr__
  <class '__main__.Classname'>

Notice that the name of class 'C' is ``Classname`` and not 'C' !

Consider for instance the following code:

  >>> class M(type):
  ...    def __str__(cls):
  ...        return cls.__name__
  ...    def method(cls):
  ...        return cls.__name__
  ...
  >>> class C(object):
  ...    __metaclass__=M
  >>> c=C()

In this case the ``__str__`` method in ``M`` cannot override the 
``__str__`` method in C, which is inherited from ``object``.
Moreover, if you experiment a little, you will see that
 
  >>> print C # is equivalent to print M.__str__(C)
  C
  >>> print c # is equivalent to print C.__str__(c)
  <__main__.C object at 0x8158f54>


The first ``__str__`` is "attached" to the metaclass and the
second to the class. 

Consider now the standard method "method". It is both attached to the
metaclass 

  >>> print M.method(C)
  C

and to the class

  >>> print C.method() #in a sense, this is a class method, i.e. it receives 
  C                    #the class as first argument

Actually it can be seen as a class method of 'C' (cfr. Guido van Rossum
"Unifying types and classes in Python 2.2". When he discusses
classmethods he says: *"Python also has real metaclasses, and perhaps 
methods defined in a metaclass have more right to the name "class method"; 
but I expect that most programmers won't be using metaclasses"*). Actually,
this is the SmallTalk terminology, Unfortunately, in Python the word
``classmethod`` denotes an attribute descriptor, therefore it is better
to call the methods defined in a metaclass *metamethods*, in order to avoid
any possible confusion.

The difference between ``method`` and ``__str__`` is that you cannot use the
syntax

  >>> print C.__str__() #error
  TypeError: descriptor '__str__' of 'object' object needs an argument

because of the confusion with the other __str__; you can only use the
syntax

  >>> print M.__str__(C)

Suppose now I change C's definition by adding a method called "method":

 ::

  class C(object):
      __metaclass__=M
      def __str__(self):
          return "instance of %s" % self.__class__
      def method(self):
          return "instance of %s" % self.__class__

If I do so, then there is name clashing and the previously working
statement print C.method() gives now an error:

 ::

  Traceback (most recent call last):
    File "<stdin>", line 24, in ?
  TypeError: unbound method method() must be called with C instance as
  first argument (got nothing instead)

Conclusion: ``__str__, __new__, __init__`` etc. defined in the metaclass
have name clashing with the standard methods defined in the class, therefore
they must be invoked with the extended syntax (ex. ``M.__str__(C)``),
whereas normal methods in the metaclass with no name clashing with the methods
of the class can be used as class methods (ex. ``C.method()`` instead of
``M.method(C)``).
Metaclass methods are always bound to the metaclass, they bind to the class 
(receiving the class as first argument) only if there is no name clashing with 
already defined methods in the class. Which is the case for ``__str__``,
``___init__``, etc. 

Conflicting metaclasses
----------------------------------------------------------------------------

Consider a class 'A' with metaclass 'M_A' and a class 'B' with 
metaclass 'M_B'; suppose I derive 'C' from 'A' and 'B'. The question is: 
what is the metaclass of 'C' ? Is it 'M_A' or 'M_B' ?

The correct answer (see "Putting metaclasses to work" for a thought 
discussion) is 'M_C', where 'M_C' is a metaclass that inherits from 
'M_A' and 'M_B', as in the following graph:


 .. figure:: fig1.gif

However, Python is not yet that magic, and it does not automatically create 
'M_C'. Instead, it will raise a ``TypeError``, warning the programmer of
the possible confusion:

  >>> class M_A(type): pass
  >>> class M_B(type): pass
  >>> A=M_A('A',(),{})
  >>> B=M_B('B',(),{})
  >>> class C(A,B): pass #error
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
  TypeError: metatype conflict among bases

This is an example where the metaclasses 'M_A' and 'M_B' fight each other
to generate 'C' instead of cooperating. The metatype conflict can be avoided 
by assegning the correct metaclass to 'C' by hand:

  >>> class C(A,B): __metaclass__=type("M_AM_B",(M_A,M_B),{})
  >>> type(C)
  <class '__main__.M_AM_B'>

In general, a class A(B, C, D , ...) can be generated without conflicts only
if type(A) is a  subclass of each of type(B), type(C), ...

In order to avoid conflicts, the following function, that generates
the correct metaclass by looking at the metaclasses of the base
classes, is handy:

 ::

  #<oopp.py>

  metadic={}

  def _generatemetaclass(bases,metas,priority):
      trivial=lambda m: sum([issubclass(M,m) for M in metas],m is type)
      # hackish!! m is trivial if it is 'type' or, in the case explicit
      # metaclasses are given, if it is a superclass of at least one of them
      metabs=tuple([mb for mb in map(type,bases) if not trivial(mb)])
      metabases=(metabs+metas, metas+metabs)[priority]
      if metabases in metadic: # already generated metaclass
          return metadic[metabases]
      elif not metabases: # trivial metabase
          meta=type 
      elif len(metabases)==1: # single metabase
          meta=metabases[0]
      else: # multiple metabases
          metaname="_"+''.join([m.__name__ for m in metabases])
          meta=makecls()(metaname,metabases,{})
      return metadic.setdefault(metabases,meta)

  #</oopp.py>

This function is particularly smart since:

  1. Avoid duplications ..

  2. Remember its results.

We may generate the child of a tuple of base classes with a given metaclass 
and avoiding metatype conflicts thanks to the following ``child`` function:

 ::

  #<oopp.py>

  def makecls(*metas,**options):
      """Class factory avoiding metatype conflicts. The invocation syntax is
      makecls(M1,M2,..,priority=1)(name,bases,dic). If the base classes have 
      metaclasses conflicting within themselves or with the given metaclasses,
      it automatically generates a compatible metaclass and instantiate it. 
      If priority is True, the given metaclasses have priority over the 
      bases' metaclasses"""

      priority=options.get('priority',False) # default, no priority
      return lambda n,b,d: _generatemetaclass(b,metas,priority)(n,b,d)

  #</oopp.py>

Here is an example of usage:

  >>> class C(A,B): __metaclass__=makecls()
  >>> print C,type(C)
  <class 'oopp.AB_'> <class 'oopp._M_AM_B'>

Notice that the automatically generated metaclass does not pollute the 
namespace:

  >>> _M_A_M_B #error
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
  NameError: name '_M_A_M_B' is not defined

It can only be accessed as ``type(C)``.

Put it shortly, the ``child`` function allows to generate a child from bases 
enhanced by different custom metaclasses, by generating under the hood a 
compatibile metaclass via multiple inheritance from the original metaclasses. 
However, this logic can only work if the original metaclasses are
cooperative, i.e. their methods are written in such a way to avoid
collisions. This can be done by using the cooperative the ``super`` call 
mechanism discussed in chapter 4.

Cooperative metaclasses 
----------------------------------------------------------------------------

In this section I will discuss how metaclasses can be composed with
classes and with metaclasses, too. Since we will discusss even
complicated hierarchies, it is convenient to have an utility 
routine printing the MRO of a given class:

 ::

  #<oopp.py>

  def MRO(cls):
      count=0; out=[]
      print "MRO of %s:" % cls.__name__
      for c in cls.__mro__:
          name=c.__name__
          bases=','.join([b.__name__ for b in c.__bases__])
          s="  %s - %s(%s)" % (count,name,bases)
          if type(c) is not type: s+="[%s]" % type(c).__name__
          out.append(s); count+=1
      return '\n'.join(out)

  #</oopp.py>

Notice that ``MRO`` also prints the metaclass' name in square brackets, for
classes enhanced by a non-trivial metaclass.

Consider for instance the following hierarchy:

  >>> from oopp import MRO
  >>> class B(object): pass
  >>> class M(B,type): pass
  >>> class C(B): __metaclass__=M

Here 'M' is a metaclass that inherits from 'type' and the base class 'B'
and 'C' is both an instance of 'M' and a child of 'B'. The inheritance
graph can be draw as

 ::

                 object
                 /   \
                B    type
                | \  /
                |  M
                \  :
                 \ :     
                   C
  
Suppose now we want to retrieve the ``__new__`` method of B's superclass
with respect to the MRO of C: obviously, this is ``object.__new__``, since

  >>> print MRO(C)
  MRO of C:
    0 - C(B)[M]
    1 - B(object)
    2 - object()

This allows to create an instance of 'C' in this way:

  >>> super(B,C).__new__(C) 
  <__main__.C object at 0x4018750c>

It is interesting to notice that this would not work in Python 2.2,
due to a bug in the implementation of ``super``, therefore do not
try this trick with older version of Python. 

Notice that everything works 
only because ``B`` inherits the ``object.__new__`` staticmethod that 
is cooperative and it turns out that it calls ``type.__new__``. However, 
if I give to 'B' a non-cooperative method

  >>> B.__new__=staticmethod(lambda cls,*args: object.__new__(cls))

things do not work:

  >>> M('D',(),{}) #error
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
    File "<stdin>", line 1, in <lambda>
  TypeError: object.__new__(M) is not safe, use type.__new__()

A cooperative method would solve the problem:

  >>> B.__new__=staticmethod(lambda m,*args: super(B,m).__new__(m,*args))
  >>> M('D',(),{}) # calls B.__new__(M,'D',(),{})
  <class '__main__.D'>

Metamethods vs class methods
-------------------------------------------------------------------

Meta-methods, i.e. methods defined in
a metaclass. 

Python has already few built-in metamethods: ``.mro()`` 
and ``__subclass__``. These are methods of the metaclass 'type' and
there of any of its sub-metaclasses.

  >>> dir(type)
  ['__base__', '__bases__', '__basicsize__', '__call__', '__class__', 
   '__cmp__', '__delattr__', '__dict__', '__dictoffset__', '__doc__', 
   '__flags__', '__getattribute__', '__hash__', '__init__', '__itemsize__', 
   '__module__', '__mro__', '__name__', '__new__', '__reduce__', '__repr__', 
   '__setattr__', '__str__', '__subclasses__', '__weakrefoffset__', 'mro']


  >>> print type.mro.__doc__
  mro() -> list
  return a type's method resolution order
  >>> print type.__subclasses__.__doc__
  __subclasses__() -> list of immediate subclasses

  >>> class A(object): pass
  >>> class B(A): pass
  >>> B.mro()
  [<class '__main__.B'>, <class '__main__.A'>, <type 'object'>]
  >>> A.__subclasses__()
  [<class '__main__.B'>]

Notice that ``mro()`` and ``__subclasses__`` are not retrieved by ``dir``.

Let me constrast metamethods with the more traditional classmethods.
In many senses, the to concepts are akin:

  >>> class M(type): 
  ...    "Metaclass with a (meta)method mm"
  ...    def mm(cls): return cls
  >>> D=M('D',(),{'cm':classmethod(lambda cls: cls)}) 
  >>> # instance of M with a classmethod cm
  >>> D.mm # the metamethod
  <bound method M.mm of <class '__main__.C'>>
  >>> D.cm # the classmethod
  <unbound method D.<lambda>>

Notice the similarities between the classmethod and the metamethod:

  >>> D.mm.im_self, D.cm.im_self # the same
  (<class '__main__.D'>, <class '__main__.D'>)
  >>> D.mm.im_class, D.cm.im_class # still the same
  (<class '__main__.M'>, <class '__main__.M'>)

There are no surprises for ``im_func``:

  >>> D.mm.im_func, D.cm.im_func
  (<function mm at 0x402c272c>, <function <lambda> at 0x402c280c>)

Nevertheless, there are differences: metamethods are not bounded to
instances of the class 

  >>> D().cm() # the classmethod works fine
  <class '__main__.D'>
  >>> D().mm() # the metamethod does not: error
  Traceback (most recent call last):
    File "<stdin>", line 1, in ?
  AttributeError: 'D' object has no attribute 'mm'

and they are not retrieved by ``dir``:

  >>> from oopp import *
  >>> attributes(D).keys() # mm is not retrieved, only cm
  ['cm']

  >>> cm.__get__('whatever') #under Python 2.2.0 would give a serious error
  Segmentation fault
  >>> cm.__get__(None) #under Python 2.3 there is no error
  <bound method type.<lambda> of <type 'NoneType'>>

Moreover metamethods behaves differently with respect to multiple
inheritance. If a class A define a classmethod cA and a class B
defines a classmethod cB, then the class C(A,B) inherits both the
classmethods cA and cB. In the case of metamethods defined in M_A
and M_B, the same is true only if one resolves the meta-type
conflict by hand, by generating the metaclass M_C(M_A,M_B). In this
sense, classmethods are simpler to use than metamethods.


THE MAGIC OF METACLASSES - PART 2 
===========================================================================

Metaclasses are so powerful that a single chapter is not enough to make
justice to them ;) In this second chapter on metaclasses I will 
unravel their deepest secrets, covering topics such as meta-metaclasses,
anonymous inner metaclasses, global metaclasses and advanced class factories.

Moreover, I will give various magical applications of metaclasses,
in the realm of enhancing the Python language itself. Actually, this is
probably the most idiomatic application of metaclasses (Guido's examples
on the metaclass usage are all in this area). I will show
how metaclasses can be used to enhance the ``super`` cooperatice call
mechanism.

This is not a chapter for the faint of heart.

The secrets of the ``__metaclass__`` hook
------------------------------------------------------------------------

In the previous chapter we have seen how the ``__metaclass__`` hook can
be used as a way of metaclass enhancing pre-existing classes 
with a minimal change of the sourcecode. 

But it has much deeper secrets.

The first and simplest of them, 
is the fact that the hook can be used it can also be defined 
at the module level, *outside* the class. This allows a number of neat 
tricks, since in presence of a  ``__metaclass__`` hook at the module
level *all* the old style classes in the module (including nested ones!)
acquire that hook. A first application is to rejuvenate old style classes 
to new style classes.

I remind that old style classes are retained with compability with old 
code, but they are a pain in the back, if you want to use features 
intended for new style classes only (for instance properties etc.). 
Naively, one would expect the conversion from old style classes
to new style to be long and error prone: suppose you have a very large 
application with hundreds of old style classes defined in dozens of modules. 
Suppose you want to update your application to Python 2.2+ classes in order
to take advantage of the new features I have discussed extensively in this
book: the naive way to go would be to go trough the source, look for
all classes definitions and change 

 ::

  Classname: --> Classname(object)

One could solve this problem with a regular expression search and replace
in all modules, but this would require to change *all* the source.
This is againt the spirit of OOP, we must *reuse* old code. 

Metaclasses are particularly handy to solve this problem: actually it is
enough to add to your modules the following line as first line: 

 ::

   __metaclass__ = type

Then, all your old style classes will have 'type' as their metaclass: this
is akin to say that all the old style classes  are *automagically* rejuvenate 
to new style classes! And this also works for *nested* classes!!

 ::

  #<rejuvenate.py>

  __metaclass__ = type # this rejuvanate all the class in the module

  class C:
     class D: pass

  print dir(C)   # both C and C.D
  print dir(C.D) # are now new style classes

  #</rejuvenate.py>

This very first example add consistence (if needed) to the
widespread belief that metaclasses have a well deserved reputation of magic.

The explanation is that defining a global metaclass called ``__metaclass__``
automatically makes all old style classes (new style class simply ignore 
the existence of the global ``__metaclass__``)  defined in you module 
instances of the given metaclass; this automatically converts them to 
new style classes.

Anonymous inner metaclasses
---------------------------------------------------------------------------

A second, deeper secret of the ``__metaclass__`` hook is that it can be
used to define anonymous *inner metaclasses*. The following example
explain what I mean:

 ::

  #<oopp.py>

  def totuple(arg):
      "Converts the argument to a tuple, if need there is"
      if isinstance(arg,tuple): return arg # do nothing
      else: return (arg,) # convert to tuple

  class BracketCallable(object):
      """Any subclass C(BracketCallable) can be called with the syntax C[t], 
      where t is a tuple of arguments stored in bracket_args;  returns the 
      class or an instance of it, depending on the flag 'returnclass'."""

      returnclass=True
      class __metaclass__(type): # anonymous inner metaclass
          def __getitem__(cls,args): # non cooperative metamethod
              if cls.returnclass: 
                  c=type(cls.__name__,(cls,),{'bracket_args':totuple(args)})
                  return c # a customized copy of the original class
              else:
                  self=cls(); self.bracket_args=totuple(args)
                  return self

  #</oopp.py>

In this code 'BracketCallable.__metaclass__' is the anonymous (actually 
it has a special name, ``__metaclass__``) inner metaclass of 'BracketCallable'.

The effect of 'BracketCallable.__metaclass__' is the following: it makes
'BracketCallable' and its descendants callable with brackets. Since
the 'returnclass' flag is set, ``__getitem__`` returns the class
with an attribute 'bracket_args' containing the tuple of the passed
arguments (otherwise it returns an instance of the class).
This works since when 
Python encounters an expression of kind ``cls[arg]`` it interprets it 
as  ``type(cls).__getitem__(cls,arg)``. Therefore, if ``cls`` is a subclass 
of  'BracketCallable', this means that

 ::

  cls[arg] <=> BracketCallable.__metaclass__.__getitem__(cls,arg)

Let me give few examples:

  >>> from oopp import BracketCallable
  >>> type(BracketCallable)
  <class 'oopp.__metaclass__'>
  >>> print type(BracketCallable).__name__ # not really anonymous
  __metaclass__
  >>> print BracketCallable['a1'].bracket_args
  ('a1',)
  >>> print BracketCallable['a1','a2'].bracket_args
  ('a1', 'a2')

This syntactical feature is an example of a thing that can be done 
*trough metaclasses only*: it cannot be emulated by functions.


Anonymous inner metaclasses are the least verbose manner 
of defining metamethods. Moreover, they are a neat trick to define 
mix-in classes that, when inherited, can metamagically enhance 
an entire multiple inheritance hierarchy. 

In the previous example ``__getitem__`` is noncooperative, but nothing
forbids anonymous inner metaclasses from being made cooperative. However,
there is some subtlety one must be aware of.
Let me give an example. My 'WithCounter' class counts how many instances
of 'WithCounter' and its subclasses are generated. However, it does not
distinguishes bewteen different subclasses. 
This was correct in the pizza shop example, simple only the total
number of produced pizzas mattered, however, in other situations,
one may want to reset the counter each time a new subclass is created. 
This can be done automagically by a cooperative inner metaclass:

 ::

  class WithMultiCounter(WithCounter):
      """Each time a new subclass is derived, the counter is reset"""
      class __metaclass__(type):
          def __init__(cls,*args):
              cls.counter=0
              super(cls.__this,cls).__init__(*args)
      reflective(__metaclass__)

Notice that the order of execution of this code is subtle:

1) first, the fact that WithMulticounter has a non-trivial metaclass is
   registered, but nothing else is done;
2) then, the line ``reflective(__metaclass__)`` is executed: this means
   that the inner metaclass (and therefore its instances) get an
   attribute ``._metaclass__this`` containing a reference to the
   inner metaclass;
3) then, the outer class is passed to its inner metaclass and created
   by the inherited metaclass' ``__new__`` method;
4) at this point  ``cls`` exists and ``cls.__this`` is inherited from
   ``__metaclass__._metaclass__this``; this means that the expression
   ``super(cls.__this,cls).__init__(*args)`` is correctly recognized and
   'WithMultiCounter' can be initialized;
5) only after that, the name 'WithMultiCounter' enters in the global namespace
   and can be recognized.

Notice in particular that inside ``super``, we could also
use ``cls.__metaclass__`` instead of ``cls.__this``, but this
would not work inside ``__new__``, whereas ``__this`` would be
recognized even in ``__new__``.

  >>> from oopp import *
  >>> print MRO(WithMultiCounter)
  1 - WithMultiCounter(WithCounter)[__metaclass__]
  2 - WithCounter(object)
  3 - object()

For sake of readability, often it is convenient
to give a name even to inner classes:

::

  #<oopp.py>

  class WithMultiCounter(WithCounter):
      """Each time a new subclass is derived, the counter is reset"""
      class ResetsCounter(type):
          def __init__(cls,*args):
              cls.counter=0
              super(cls.ResetsCounter,cls).__init__(*args)
      __metaclass__=ResetsCounter

  #</oopp.py>

Notice that inside super we used the expression ``cls.ResetsCounter`` and
not ``WithMultiCounter.ResetsCounter``: doing that would generate a
``NameError: global name 'WithMultiCounter' is not defined`` since at the
time when  ``ResetsCounter.__init__`` is called for the first time, 
the class ``WithMultiCounter`` exists but is has not yet entered the global
namespace: this will happens only after the initialization in the
``ResetsCounter`` metaclass, as we said before.

Without the metaclass one can reset the counter by hand each time, or
can reset the counter on all the classes of the hierarchy with a
convenient function (akin to the 'traceH' routine defined in chapter 6).

Example:

  >>> from oopp import *
  >>> class GrandFather(WithMultiCounter): pass
  >>> class Father(GrandFather): pass
  >>> class Child(Father): pass
  >>> GrandFather()
  <__main__.GrandFather object at 0x402f7f6c> # first GrandFather instance
  >>> Father()
  <__main__.Father object at 0x402f79ec> # first Father instance
  >>> Father()
  <__main__.Father object at 0x402f7d4c> # second Father instance
  >>> Child.counter # zero instances
  0
  >>> Father.counter # two instances
  2
  >>> GrandFather.counter # one instance
  1

I leave as an exercise for the reader to show that the original 'WithCounter'
would fail to count correctly the different subclasses and would put the 
total number of instances in 'Child'.

Passing parameters to (meta) classes
-------------------------------------------------------------------------

Calling a class with brackets is a way of passing parameters to it (or
to its instances, if the 'returnclass' flag is not set). 
There additional ways for of doing that. 
One can control the instantiation syntax of classes by redefining the 
``__call__`` method of the metaclass. The point is that when we instantiate 
an object with the syntax ``c=C()``, Python looks
at the ``__call__`` method of the metaclass of 'C'; the default behaviour
it is to call ``C.__new__`` and ``C.__init__`` in succession, however, that
behavior can be overridden. Let me give an example without using
anonymous metaclasses (for sake of clarity only).

 ::

  #<metacall.py>

  class M(type): # this is C metaclass
      def __call__(cls):
          return "Called M.__call__" 

  C=M('C',(),{}) # calls type(M).__call__
  c=C() # calls type(C).__call__
  # attention: c is a string!
  print c #=> Called M.__call__

  #</metacall.py>

In this example, ``M.__call__`` simply 
returns the string ``Called M.__call__``, and the class
'C' is *not* instantiated. Overriding the metaclass
``__call__ `` method therefore provides another way to implement
the ``Singleton`` pattern. However, savage overridings as the one in
this example, are not a good idea, since it will confuse everybody.
This is an example where metaclasses change the semantics: whereas
usually the notation ``C()`` means "creates a C instance", the
metaclass can give to the syntax ``C()`` any meaning we want.
Here there is both the power and the danger of metaclasses: they
allows to make both miracles and disasters. Nevertheless, used with
a grain of salt, they provide a pretty nice convenience.

Anyway, overriding the '__call__' method of the metaclass can be 
confusing, since parenthesis are usually reserved to mean instantion,
therefore I will prefere to pass arguments trough brackets.

The beauty and the magic of metaclasses stays in the fact that this mechanism 
is completely general: since metaclasses themselves are classes, we can
'CallableWithBrackets' to pass arguments to a metaclass, i.e.
'CallableWithBrackets' can also be used as a meta-metaclass!

I leave as an exercise for the reader to figure out
how to define meta-meta-metaclasses, meta-meta-meta-metaclasses, etc.
etc. (there is no limit to the abstraction level you can reach with 
metaclasses;-)

Let me show an example: a magical way of making methods cooperative.
This can be done trough a 'Cooperative' metaclass that inherits from
'BracketCallable' and therefore has 'BracketCallable.__metaclass__'
as (meta)metaclass:
 
 ::

  #<oopp.py>

  class Cooperative(BracketCallable,type):
      """Bracket-callable metaclass implementing cooperative methods. Works
      well for plain methods returning None, such as __init__"""
      def __init__(cls,*args):
          methods=cls.bracket_args
          for meth in methods: 
              setattr(cls,meth,cls.coop_method(meth,vars(cls).get(meth)))
      def coop_method(cls,name,method): # method can be None
          """Calls both the superclass method and the class method (if the 
          class has an explicit method). Implemented via a closure"""
          def _(self,*args,**kw):
              getattr(super(cls,self),name)(*args,**kw) # call the supermethod
              if method: method(self,*args,**kw) # call the method
          return _

  #</oopp.py>

The code above works for methods returing ``None``, such as ``__init__``.
Here I give a first example of application: a hierarchy where the ``__init__``
methods are automatically called (similar to automatic initialization
in Java).

 ::

  #<cooperative.py>

  from oopp import Cooperative

  class B(object):
      """Cooperative base class; all its descendants will automagically 
      invoke their ancestors __init__ methods in chain."""
      __metaclass__=Cooperative['__init__']
      def __init__(self,*args,**kw):
          print "This is B.__init__"
  
  class C(B):
      "Has not explicit __init__"
  
  class D(C):
      """The metaclass makes D.__init__ to call C.__init__ and 
      therefore B.__init__"""
      def __init__(self,*args,**kw):
          print "This is D.__init__"

  d=D()
  
  print "The metaclass of B is",type(B)
  print "The meta-metaclass of B is", type(type(B))
  
  #</cooperative.py>

Output:

 ::

  This is B.__init__
  This is D.__init__
  The metaclass of B is <class 'oopp.Cooperative'>
  The meta-metaclass of B  is <class 'oopp.__metaclass__'>

A second example, is the following, an alternative way of
making the paleoanthropological hierarchy of chapter 4 cooperative:

 ::

  #<paleo.py>

  from oopp import Cooperative,Homo

  class HomoHabilis(Homo):
      __metaclass__=Cooperative['can']
      def can(self):
          print " - make tools"

  class HomoSapiens(HomoHabilis):
      def can(self):
          print " - make abstractions"
      
  class HomoSapiensSapiens(HomoSapiens):
      def can(self):
          print " - make art"

  HomoSapiensSapiens().can()

  # Output:

  # <HomoSapiensSapiens> can:
  #  - make tools
  #  - make abstractions
  #  - make art

  #</paleo.py>

Metaclasses can be used to violate the old good rule "explicit is
better than implicit". Looking at the source code for 'HomoSapiens'
and 'HomoSapiensSapiens' one would never imagine the ``can`` is
somewhat special. That is why in the following I will prefer to
use the anonymous super call mechanism, which is explicit, instead
of the implicit cooperative mechanism.

Meta-functions
---------------------------------------------------------------------

The third and deepest secret of the ``__metaclass__`` hook is that, even if
it is typically used in conjunction with metaclasses, actually the hook 
can refer to generic class factories callable with the signature
``(name,bases,dic)``. Let me show a few examples 
where ``__metaclass__`` is a function or a generic callable object
instead of being a metaclass:

 ::

  #<metafun.py>

  from oopp import kwdict

  class Callable(object):
      def __call__(self,name,bases,dic):
          print name,bases,'\n',kwdict(dic)
          return type(name,bases,dic)

  callableobj=Callable()

  class C: __metaclass__=callableobj

  print "type of C:",C.__class__

  def f(name,bases,dic):
      print name,bases,'\n',kwdict(dic)
      return type(name,bases,dic)

  class D: __metaclass__=f

  print "type of D:",D.__class__

  class B(object):
      def __metaclass__(name,bases,dic):
          """In this form, the __metaclass__ attribute is a function. 
          In practice, it works as a special static method analogous 
          to __new__"""
          print "name: ", name
          print "bases:", bases
          print "dic:\n",kwdict(dic)
          return type(name,bases,dic)

  class E(B): pass

  print "type of E:",E.__class__
  print "Non-called E.__metaclass__:", E.__metaclass__

  #</metafun.py>

With output

 ::

  C () 
  __metaclass__ = <Callable object at 0x401c964c>
  __module__ = __builtin__
  type of C: <type 'type'>
  D () 
  __metaclass__ = <function f at 0x401c4994>
  __module__ = __builtin__
  type of D: <type 'type'>
  name:  B
  bases: (<type 'object'>,)
  dic: 
  __metaclass__ = <function __metaclass__ at 0x401c4a3c>
  __module__ = __builtin__
  type of E: <type 'type'>
  Non-called E.__metaclass__: <unbound method E.__metaclass__>

The advantage/disadvantage of this solution is that the ``__metaclass__`` 
hook is called only once, i.e. it is not called again if a new class
is derived from the original one. For instance in this example 'E' is
derived from 'B', but the function ``B.__metaclass__`` is *not* called
during the creation of 'E'.

Metafunctions can also be used when one does not want to transmit the
metaclass contraint. Therefore they usage is convenient in exactly
the opposite situation of a cooperative metaclass.

Anonymous cooperative super calls
-----------------------------------------------------------------------

As I noticed in the previous chapters, the ``super`` 
mechanism has an annoying 
problem: one needs to pass explicitely the name of the base class. Typically, 
this is simply an 
inelegance since it is annoying to be forced to retype the name of the base 
class. However, in particular
cases, it can be a problem. This happens for instance if we try to
pass the class's methods to a different class: one cannot do that,
since the methods contains an explicit reference to the original class
and would not work with the new one. Moreover, having named super calls
is annoying in view of refactoring. Consider for
instance the previous ``supernew.py`` script: in the ``__new__`` method
defined inside the class 'B', we called ``Super`` with the syntax
``Super(B,cls)`` by repeating the name of the class 'B'. Now,
if in the following I decide to give to 'B' a more descriptive
name, I have to go trough the source, search all the ``super``
calls, and change them accordingly to the new name. It would be
nice having Python do the job for me. A first solution is to call
``super`` (or ``Super``) with the syntax ``super(self.__this,obj)``,
where the special name ``__this`` is explicitly replaced by the name 
of the class where the call is defined by the 'reflective' function
of last chapter. This approach has the disadvantage that each time we
derive a new class, we need to invoke *explicitely* the routine 
``reflective``. It would be marvelous to instruct Python to invoke
``reflective`` automatically at each class creation. Actually, this
seems to be deep magic and indeed it is: fortunately, a custom metaclass 
can perform this deep magic in few lines:

 ::

  #<oopp.py>
    
  class Reflective(type):
      """Cooperative metaclass that defines the private variable __this in
      its instances. __this contains a reference to the class, therefore
      it allows anonymous cooperative super calls in the class."""
      def __init__(cls,*args):
          super(Reflective,cls).__init__(*args)
          reflective(cls)

  #</oopp.py>

Now, let me show how 'Reflective' can be used in a practical example. 

By deriving new metaclasses from 'Reflective', one can easily 
create powerful class factories that generate reflective classes. 
 
Suppose I want to define a handy class 
factory with the abilitity of counting the number of its instances.

This can be done by noticing that metaclasses are just classes, therefore 
they can be composed with regular classes in multiple inheritance. In 
particular one can derive a 'Logged' metaclass from 'WithLogger': in
this way we send a message to a log file each time a new class is created.
This can be done by composing 'WithLogger' with 'WithMultiCounter.__metaclass__'
and with 'Reflective':

 ::

  #<oopp.py>

  class Logged(WithLogger,Reflective): 
      """Metaclass that reuses the features provided by WithLogger. In particular
      the classes created by Logged are Reflective, PrettyPrinted 
      and Customizable.""" #WithLogger provides logfile and verboselog
      def __init__(cls,*args,**kw):
          super(Logged,cls).__init__(*args,**kw) 
          bases=','.join([c.__name__ for c in cls.__bases__])
          print >> cls.logfile, "%s is a child of %s" % (cls,bases)
          print >> cls.logfile,'and an instance of %s' % type(cls).__name__

  #</oopp.py>

The MRO is

  >>> print MRO(Logged)
  MRO of Logged:
    0 - Logged(WithLogger,Reflective)
    1 - WithLogger(WithCounter,PrettyPrinted)
    2 - WithCounter(object)
    3 - PrettyPrinted(object)
    4 - Reflective(type)
    5 - type(object)
    6 - object()

and the inheritance graph can be drawn as follows:

 ::


          _____________________ object 6 ___
         /                        /         \
   2 WithCounter        3 PrettyPrinted        type 5
            \                /               /
             \              /               /
              \            /               /
               \          /               /
                \        /               /
                 \      /               /
                1 WithLogger       Reflective 4
                     \           /
                      \         /
                       \       /
                        \     /
                   Logged 0
                           :
                           :
                           C1

'WithCounter' acts now as a metaclass, since WithCounter.__new__ invokes
type.__new__. Since ``type.__new__`` is non-cooperative,
in the composition of a metaclass with a regular class, the metaclass
should be put first: this guarantees that ``__new__`` derives from
``type.__new__``, thus avoiding the error message.

  >>> Logged.verboselog=True
  >>> C1=Logged('C1',(),{})
  *****************************************************************************
  Tue Apr 22 18:47:05 2003
  1. Created 'C1'
  with accessibile non-special attributes:
  _C1__this = 'C1'
  'C1' is a child of object
  and an instance of Logged

Notice that any instance of 'WithCounterReflective' inherits the 'WithCounter' 
attribute ``counter``, that counts the number of classes that have been 
instantiated (however it is not retrieved by ``dir``; moreover the 
instances of 'WithCounterReflective' instances have no ``counter`` attribute).

  >>> C1.counter
  1

More on metaclasses as class factories 
----------------------------------------------------------------------------

A slight disadvantage of the approach just described, 
is that 'Logged'  cooperatively invokes the ``type.__new__`` 
static method, therefore, when we invoke the metaclass, we must explicitly 
provide a name, a tuple of base classes and a dictionary, since the 
``type.__new__`` staticmethod requires that signature. Actually, 
the expression  

 ::

  C=Logged(name,bases,dic)

is roughly syntactic sugar for 

 ::

  C=Logged.__new__(Logged,name,bases,dic) 
  assert isinstance(C,Logged)
  Logged.__init__(C,name,bases,dic)

If a different interface is desired, the best way is to use a class
factory 'ClsFactory' analogous to the object factory 'Makeobj' 
defined in chapter 4. It is convenient to make 'ClsFactory'
bracket-callable.

 ::

  #<oopp.py>
 
  class ClsFactory(BracketCallable):
      """Bracket callable non-cooperative class acting as 
      a factory of class factories.

      ClsFactory instances are class factories accepting 0,1,2 or 3 arguments. 
    . They automatically converts functions to static methods 
      if the input object is not a class. If an explicit name is not passed
      the name of the created class is obtained by adding an underscore to 
      the name of the original object."""
      
      returnclass=False # ClsFactory[X] returns an *instance* of ClsFactory

      def __call__(self, *args):
          """Generates a new class using self.meta and avoiding conflicts.
          The first metaobject can be a dictionary, an object with a
          dictionary (except a class), or a simple name."""
          
          # default attributes
          self.name="CreatedWithClsFactory"    
          self.bases=()
          self.dic={}
          self.metas=self.bracket_args

          if len(args)==1:
              arg=args[0]
              if isinstance(arg,str): # is a name 
                  self.name=arg
              elif hasattr(arg,'__name__'): # has a name
                  self.name=arg.__name__+'_'
              self.setbasesdic(arg)
          elif len(args)==2: 
              self.name=args[0] 
              assert isinstance(self.name,str) # must be a name
              self.setbasesdic(args[1])
          elif len(args)==3: # must be name,bases,dic
              self.name=args[0]
              self.bases+=args[1]
              self.dic.update(args[2])
          if len(args)<3 and not self.bases: # creating class from a non-class
              for k,v in self.dic.iteritems():
                  if isfunction(v): self.dic[k]=staticmethod(v)
          #return child(*self.bases,**vars(self))
          return makecls(*self.metas)(self.name,self.bases,self.dic)

      def setbasesdic(self,obj):
          if isinstance(obj,tuple): # is a tuple
              self.bases+=obj
          elif hasattr(obj,'__bases__'): # is a class
              self.bases+=obj.__bases__
          if isinstance(obj,dict): # is a dict
              self.dic.update(obj)
          elif hasattr(obj,"__dict__"): # has a dict
              self.dic.update(obj.__dict__)

  #</oopp.py>

'ClsFactory[X]' where 'X' is a metaclass returns callable objects acting as
class factories. For instance

 ::

  #<oopp.py>

  Class=ClsFactory[type] # generates non-conflicting classes
  Mixin=ClsFactory[Reflective] # generates reflective classes
    
  #</oopp.py>

can be used as a class factories that automatically provides a default name,
base classes and dictionary, and avoids meta-type conflicts.
'Mixin' generates reflective classes that can be used as mixin in multiple
inheritance hierarchies. Here I give few example of usage of 'Class':

  >>> from oopp import *
  >>> C1,C2,C3=[Class('C'+str(i+1)) for i in range(3)]
  >>> C1
  <class 'oopp.C1'>
  >>> C2
  <class 'oopp.C2'>
  >>> C3
  <class 'oopp.C3'>]
 
  >>> Clock=Class('Clock',{'get_time':get_time})
  >>> Clock
  <class 'oopp.Clock'>
  >>> Clock.get_time()
  16:01:02

Another typical usage of 'Class' is the conversion of a module in a class: 
for instance

  >>> time_=Class(time)
  >>> time_
  <class 'oopp.time_'>

Notice the convention of adding an underscore to the name of the class 
generated from the 'time' module.
  
  >>> time_.asctime()
  'Mon Jan 20 16:33:21 2003'

Notice that all the functions in the module ``time`` has been magically 
converted in staticmethods of the class ``time_``. An advantage of this 
approach is that now the module is a class and can be enhanced with 
metaclasses: for instance we could add tracing capabilities, debugging 
features, etc. 

By design, 'Class' and 'Reflective' also works when the first argument 
is a class or a tuple of base classes:

  >>> ClsFactory_=Class(ClsFactory)
  >>> type(ClsFactory_)
  <class 'oopp.__metaclass__'>
  >>> ClsFactory_=Mixin(ClsFactory) 
  >>> type(ClsFactory_) # automagically generated metaclass
  <class 'oopp._Reflective__metaclass__'>

Programming with metaclasses
--------------------------------------------------------------------------
In order to how a non-trivial application of metaclasses in real life,
let me come back to the pizza shop example discussed in chapter 4 and 6.

 ::

  #<oopp.py>

  def Pizza(toppings,**dic): 
       """This function produces classes inheriting from GenericPizza and 
       WithLogger, using a metaclass inferred from Logged"""
       toppinglist=toppings.split()
       name='Pizza'+''.join([n.capitalize() for n in toppinglist])
       dic['toppinglist']=toppinglist
       return ClsFactory[Logged](name,
              (GenericPizza,WithLogger,WithMultiCounter),dic)

  #</oopp.py>

  >>> from oopp import *
  >>> Margherita=Pizza('tomato mozzarella',verboselog=True)
  *****************************************************************************
  Tue May 13 14:42:17 2003
  1. Created 'PizzaTomatoMozzarella'
  with accessibile non-special attributes:
  ResetsCounter = <class 'oopp.ResetsCounter'>
  _GenericPizza__this = <class 'oopp.GenericPizza'>
  _WithCounter__this = <class 'oopp.WithCounter'>
  _WithLogger__this = <class 'oopp.WithLogger'>
  baseprice = 1
  counter = 0
  formatstring = %s
  logfile = <open file '<stdout>', mode 'w' at 0x402c2058>
  price = <unbound method PizzaTomatoMozzarella.price>
  sizefactor = {'small': 1, 'large': 3, 'medium': 2}
  topping_unit_price = 0.5
  toppinglist = ['tomato', 'mozzarella']
  toppings_price = <unbound method PizzaTomatoMozzarella.toppings_price>
  verboselog = True
  'PizzaTomatoMozzarella' is a child of GenericPizza,WithLogger,
  WithMultiCounter and an instance of _LoggedResetsCounter

Notice the *deep* magic: ``Pizza`` invokes ``ClsFactory[Logged]`` which in 
turns calls the class factory ``child`` that creates 'Margherita' from 
'GenericPizza', 'WithLogger' and 'WithMultiCounter' by using the
metaclass 'Logged': however, since 'WithMultiCounter', has the internal
metaclass 'ResetsCounter' , there is a metatype conflict:
``child`` *automagically* solves the conflict by creating the metaclass
'_LoggedResetsCounter' that inherits both from 'Logged' and 'ResetsCounter'. 
At this point, 'Margherita' can be safely created
by '_LoggedResetsCounter'. As such, the creation of 'Margherita'
will be registered in the log file and 'Margherita' (with all its
children) will continue to be able to recognize the special identifier 
``this``. 

  >>> print Margherita('large')
  *****************************************************************************
  Tue May 13 14:47:03 2003
  1. Created large pizza with tomato,mozzarella, cost $ 6.0
  with accessibile non-special attributes:
  ResetsCounter = <class 'oopp.ResetsCounter'>
  _GenericPizza__this = <class 'oopp.GenericPizza'>
  _WithCounter__this = <class 'oopp.WithCounter'>
  _WithLogger__this = <class 'oopp.WithLogger'>
  baseprice = 1
  counter = 1
  formatstring = %s
  logfile = <open file '<stdout>', mode 'w' at 0x402c2058>
  price = <bound method PizzaTomatoMozzarella.price of 
  <oopp.PizzaTomatoMozzarella object at 0x4032764c>>
  size = large
  sizefactor = {'small': 1, 'large': 3, 'medium': 2}
  topping_unit_price = 0.5
  toppinglist = ['tomato', 'mozzarella']
  toppings_price = <bound method PizzaTomatoMozzarella.toppings_price of 
  <oopp.PizzaTomatoMozzarella object at 0x4032764c>>
  verboselog = True
  large pizza with tomato,mozzarella, cost $ 6.0
  >>> print MRO(Margherita)
  MRO of PizzaTomatoMozzarella:
    0 - PizzaTomatoMozzarella(GenericPizza,WithLogger)[_LoggedResetsCounter]
    1 - GenericPizza(object)
    2 - WithLogger(WithCounter,Customizable,PrettyPrinted)
    3 - WithMultiCounter(WithCounter)[ResetsCounter]
    4 - WithCounter(object)
    5 - PrettyPrinted(object)
    6 - object()

Notice that

  >>> print Margherita
  'PizzaTomatoMozzarella'

The power of inheritance in this example is quite impressive, since
I have reused the same class 'WithLogger' (and its children) both in the 
metaclass hierarchy and in the regular hierarchy: this means that I have added
logging capabilities both to classes and their instances in a
strike! And there is no confusion between the two. For instance,
there is a ``counter`` attribute for the metaclass 'Logged' 
and many independent ``counter`` attributes for any generated class,
i.e. for any kind of pizza.

 It is interesting to notice that '' itself is an instance of
 its inner metaclass, as ``type()`` would show. This technique
 avoids the need for inventing a new name for the metaclass. The inner
 metaclass is automatically inherited by classes inheriting from the outer
 class. 

Metaclass-aided operator overloading 
---------------------------------------------------------------------------

As we discussed in chapter 4, inheriting from built-in types is generally
painful. The problem is that if P is a primitive class, i.e. a 
Python built-in type, and D=D(P) is a derived class, then the 
primitive methods returning P-objects have to be modified (wrapped) in 
such a way to return D-objects. 

The problem is expecially clear in the context of operator overloading.

Consider for instance the problem of defining a 'Vector' class in the 
mathematical sense. Mathematically-speaking, vectors are defined as objects 
that can be summed each other and multiplied by numbers; they can be represented 
by (finite or infinite) sequences. In the case of finite sequences, vectors 
can be represented with lists and a vector class can be naturally
implemented by subclassing ``list``:

 ::

  #<vector.py>

  class Vector(list):
      """Implements finite dimensional vectors as lists. Can be instantiated
      as Vector([a,b,c,..]) or as Vector(a,b,c ..)"""
      def __add__(self,other):
          return [el+other[i] for i,el in enumerate(self)]
      __radd__=__add__
      def __mul__(self,scalar):
          return [el*scalar for el in self]
      def __rmul__(self,scalar):
          return [scalar*el for el in self]

  v=Vector([1,0])
  w=Vector([0,1])

  print v+w, type(v+w) 
  print 2*v, type(2*v) 
  print v*2, type(v*2) 

  #</vector.py>

With output

 ::

  [1, 1] <type 'list'>
  [2, 0] <type 'list'>
  [2, 0] <type 'list'>

The problem is that the overloaded methods must be wrapped in such a way
to return ``Vector`` object and not ``list`` object; moreover, if
``Vector`` is subclassed (for instance by defining a ``NumericVector``),
the overloaded methods must return instances of the subclass. There is
only one way of doing that automatically: trough the magic of metaclasses.

Here is the solution, involving an ``autowrappedmethod`` descriptor class,
that wraps the overloaded operators and is automatically invoked by
the metaclass ``AutoWrapped``.
 
 ::

  #<oopp.py>

  class autowrappedmethod(wrappedmethod):
      """Makes the method returning cls instances, by wrapping its
      output with cls"""
      klass=None # has to be fixed dynamically from outside
      def __init__(self,meth):
          super(autowrappedmethod,self).__init__(meth) # cooperative
          self.klass=self.klass # class variable -> instance variable
      def wrapper(self): # closure
          return lambda *args,**kw: self.klass(self.func(*args,**kw))

  class AutoWrapped(type):
      """Metaclass that looks at the methods declared in the attributes 
      builtinlist and wraplist of its instances and wraps them with
      autowrappedmethod."""
      def __init__(cls,name,bases,dic):
          super(AutoWrapped,cls).__init__(name,bases,dic) # cooperative
          cls.builtinlist=getattr(cls,'builtinlist',[])
          if not hasattr(cls,'diclist') : # true only at the first call
              cls.diclist=[(a,vars(bases[0])[a]) for a in cls.builtinlist]
          if dic.has_key('wraplist'): # can be true at any call
              cls.diclist+=[(a,dic[a]) for a in cls.wraplist] 
          wrapper=autowrappedmethod.With(klass=cls)
          d=dict([(a,wrapper(v)) for a,v in cls.diclist])
          customize(cls,**d)
      
  #</oopp.py>

Now the ``Vector`` class can be written as

 ::

  #<oopp.py>

  class Vector(list):
      """Implements finite dimensional vectors as lists. Can be instantiated
      as Vector([a,b,c,..]) or as Vector(a,b,c ..)"""
      __metaclass__=AutoWrapped
      wraplist='__add__ __radd__ __mul__ __rmul__'.split()
      def __add__(self,other):
          return [el+other[i] for i,el in enumerate(self)]
      __radd__=__add__
      def __mul__(self,scalar):
          return [scalar*el for el in self]
      def __rmul__(self,scalar):
          return [el*scalar for el in self]

  #</oopp.py>

Here the ``AutoWrapped`` metaclass wraps the output of ``__add__, 
__radd__, __mul__, __rmul__``, guaranteeing that they returns ``Vector``
instances or instances of some subclass of ``Vector``, if ``Vector`` is
subclassed. This is an example of usage:

  .. doctest

  >>> from oopp import Vector
  >>> v=Vector([1,0])
  >>> v
  <oopp.Vector object at 0x4032858c>
  >>> w=Vector([0,1])
  >>> v+2*w
  <oopp.Vector object at 0x403190ac>
  >>> print v+2*w
  [1, 2]

It should be clear by now that metaclasses are the natural framework where
to discuss operator overloading
(at least in languages that have metaclasses ;-). After all, operator
overloading is another kind of (very nice) syntactic sugar and we know
already that metaclasses are very good when we need syntactic sugar.


ADVANCED METAPROGRAMMING TECHNIQUES
========================================================================

In elementary OOP, the programmer works with objects; in advanced OOP,
the programmer works with classes, taking full advantage of
(multiple) inheritance and metaclasses. Metaprograming is the activity of 
building, composing and modifying classes.

I will give various examples of metaprogramming techniques
using run-time class modifications
multiple inheritance, metaclasses, attribute descriptors and
even simple functions.

Moreover, I will show show metaclasses can change the
semantics of Python programs: hence theire reputation of *black* magic. 
That is to say that the techniques explained here are dangerous!

On code processing
--------------------------------------------------------------------

It is a good programming practice to avoid the direct modification
of source code. Nevertheless, there are situations where the ability of 
modifying the source code *dynamically* is invaluable. Python has the
capability of 

1) generating new code from scratch;

2) modifying pre-existing source code;

3) executing the newly created/modified code at run-time.

The capability of creating source code and executing it *immediately* has
no equivalent in static languages such as C/C++/Java and it is maybe the 
most poweful feature of dynamics languages such as Java/Python/Perl.
This feature has been exploited to its ultimate consequences in the languages
of the Lisp family, in which one can use incredibly poweful macros, which
in a broad sense, are programs that write themselves

In this chapter I will discuss how to implement macros in Python and I will
present some of the miracles you may perform with this technique. To this
aim, I will discuss various ways of manipulating Python source code, by
using regular expressions and state machines.


Regular expressions
-------------------------------------------------------------------------

 .. line-block::

  *Some people, when confronted with a problem, 
  think "I know, I'll use regular expressions." 
  Now they have two problems.*
     -- Jamie Zawinski


Python source code is a kind of text and can manipulated with the same
techniques that are used to manipulate text:

1. the trivial search and replace;

2. regular expressions;

3. state machines;

4. parsers

There is not very much to say about the search and replace methods: it
is fast, efficient and it works. It should always be used whenever
possible. However, in this chapter I will only be interested in cases
where something more sophisticated than a plain search and replace is
needed. Cases that can be managed with regular expressions
or with something even more sophisticated than them: a state machine or
even a full featured parser.


I will *not* give a primer on regular expression here, since they are
already well documented in the standard documentation (see Andrew's
Kuchling 'Howto') as well in many books (for instance 'Mastering Regular
Expression' first edition and 'Python in a Nutshell'). 
Instead, I will give various practical examples of usage.

  >>> import re
  >>> reobj=re.compile(r'x')


More on metaclasses and subclassing built-in types 
-------------------------------------------------------------------------

Subclassing ``list`` is easy since there are no methods returning lists
except the methods correspondings to the '+' and '*' operators. 
Subclassing ``str`` is more complicated, since one has many methods
that return strings. Nevertheless, it can be done with the ``AutoWrapped``
metaclass, simply by specifying the list of the builtins to be wrapped.

 ::

  #<oopp.py>

  class Str(str):
      __metaclass__=AutoWrapped
      builtinlist="""__add__ __mod__ __mul__ __rmod__ __rmul__ capitalize
          center expandtabs join ljust lower lstrip replace rjust rstrip strip
          swapcase title translate upper zfill""".split()

  #</oopp.py>
 
Here I show various tests.

  .. doctest

  >>> from oopp import Str
  >>> sum=Str('a')+Str('b') # check the sum
  >>> print sum, type(sum)
  ab <class 'oopp.Str'>
  >>> rprod=Str('a')*2 # check the right product 
  >>> print rprod,type(rprod)
  aa <class 'oopp.Str'>
  >>> lprod=2*Str('a') # check the left product 
  >>> print lprod,type(lprod)
  aa <class 'oopp.Str'>
  >>> r=Str('a').replace('a','b') # check replace
  >>> print r,type(r)
  b <class 'oopp.Str'>
  >>> r=Str('a').capitalize() # check capitalize
  >>> print r,type(r)
  A <class 'oopp.Str'>

``Str`` acts as a nice base class to built abstractions based on strings.
In particular, regular expressions can be built on top of strings describing
their representation (I remind that if ``x`` is a regular expression object, 
``x.pattern`` is its string representation). Then, the sum of two regular 
expressions ``x`` and ``y`` can be defined as the sum of their string 
representation, ``(x+y).pattern=x.pattern+y.pattern``. Moreover, it is
convenient to define the ``__or__`` method of two regular expression in
such a way that ``(x | y).pattern=x.pattern+'|'+y.pattern``.
All this can be achieved trough the following class:

 ::

  #<oopp.py>
 
  class BaseRegexp(Str):

      builtinlist=['__radd__', '__ror__']
      wraplist=['__add__','__or__']

      __add__ = lambda self,other: self.pattern + other 
      __or__  = lambda self,other: self.pattern+'|'+other

      def __init__ (self,regexp):
          "Adds to str methods the regexp methods"
          reobj=re.compile(regexp)
          for attr in dir(reobj)+['pattern']:
              setattr(self,attr,getattr(reobj,attr))

  #</oopp.py>
 
  >>> from oopp import *
  >>> aob=BaseRegexp('a')|BaseRegexp('b'); print aob
  a|b
  >>> print pretty(attributes(aob))
  encode = <built-in method encode of BaseRegexp object at 0x401b25cc>
  endswith = <built-in method endswith of BaseRegexp object at 0x401b25cc>
  expandtabs = <function _ at 0x401b69cc>
  find = <built-in method find of BaseRegexp object at 0x401b25cc>
  findall = <built-in method findall of _sre.SRE_Pattern object at 0x4019b890>
  finditer = <built-in method finditer of _sre.SRE_Pattern object at 
  0x4019b890>
  index = <built-in method index of BaseRegexp object at 0x401b25cc>
  isalnum = <built-in method isalnum of BaseRegexp object at 0x401b25cc>
  isalpha = <built-in method isalpha of BaseRegexp object at 0x401b25cc>
  isdigit = <built-in method isdigit of BaseRegexp object at 0x401b25cc>
  islower = <built-in method islower of BaseRegexp object at 0x401b25cc>
  isspace = <built-in method isspace of BaseRegexp object at 0x401b25cc>
  istitle = <built-in method istitle of BaseRegexp object at 0x401b25cc>
  isupper = <built-in method isupper of BaseRegexp object at 0x401b25cc>
  join = <function _ at 0x401b6a74>
  ljust = <function _ at 0x401b6b1c>
  lower = <function _ at 0x401b6cdc>
  lstrip = <function _ at 0x401b6d84>
  match = <built-in method match of _sre.SRE_Pattern object at 0x4019b890>
  pattern = ba
  replace = <function _ at 0x401b6ed4>
  rfind = <built-in method rfind of BaseRegexp object at 0x401b25cc>
  rindex = <built-in method rindex of BaseRegexp object at 0x401b25cc>
  rjust = <function _ at 0x401ba0d4>
  rstrip = <function _ at 0x401ba10c>
  scanner = <built-in method scanner of _sre.SRE_Pattern object at 0x4019b890>
  search = <built-in method search of _sre.SRE_Pattern object at 0x4019b890>
  split = <built-in method split of _sre.SRE_Pattern object at 0x4019b890>
  splitlines = <built-in method splitlines of BaseRegexp object at 0x401b25cc>
  startswith = <built-in method startswith of BaseRegexp object at 0x401b25cc>
  strip = <function _ at 0x401ba25c>
  sub = <built-in method sub of _sre.SRE_Pattern object at 0x4019b890>
  subn = <built-in method subn of _sre.SRE_Pattern object at 0x4019b890>
  swapcase = <function _ at 0x401ba294>
  title = <function _ at 0x401ba4fc>
  translate = <function _ at 0x401ba534>
  upper = <function _ at 0x401ba5dc>
  wraplist = ['__add__', '__radd__', '__or__', '__ror__']
  zfill = <function _ at 0x401ba614>

  #<oopp.py>

  class Regexp(BaseRegexp):
      class __metaclass__(BaseRegexp.__metaclass__):
          def __setattr__(cls,name,value):
              if name==name.upper(): # all caps means regexp constant
                  if not isinstance(value,cls): value=cls(value)
                  value.name=name # set regexp name
              BaseRegexp.__metaclass__.__setattr__(cls,name,value)
              # basic setattr

      def named(self,name=None):
          name=getattr(self,'name',name)
          if name is None: raise 'Unnamed regular expression'
          return self.__class__('(?P<%s>%s)' % (name,self.pattern))

      generateblocks=generateblocks

  #</oopp.py>

The magic of ``Regexp.__metaclass__`` allows to generate a library of 
regular expressions in an elegant way:

 ::

  #<oopp.py>

  r=Regexp

  customize(r,
      DOTALL =r'(?s)' ,      # starts the DOTALL mode; must be at the beginning
      NAME   =r'\b[a-zA-Z_]\w*', # plain Python name
      EXTNAME=r'\b[a-zA-Z_][\w\.]*', # Python name with or without dots
      DOTNAME=r'\b[a-zA-Z_]\w*\.[\w\.]*',# Python name with (at least one) dots
      COMMENT=r"#.*?(?=\n)", # Python comment
      QUOTED1="'.+?'",       # single quoted string '
      QUOTED2='".+?"',       # single quoted string "
      TRIPLEQ1="'''.+?'''",  # triple quoted string '
      TRIPLEQ2='""".+?"""'   # triple quoted string " 
    )

  r.STRING=r.TRIPLEQ1|r.TRIPLEQ2|r.QUOTED1|r.QUOTED2
  r.CODESEP=r.DOTALL+r.COMMENT.named()|r.STRING.named()

  #</oopp.py>

The trick is in the redefinition of ``__setattr__``, which magically converts
all caps attributes in ``Regexp`` objects.

The features of ``Regexp`` can be tested with the following code:

 ::

  #<test_re.py>

  """This script looks at its own source code and extracts dotted names,
  i.e. names containing at least one dot, such as object.attribute or
  more general one, such as obj.attr.subattr."""

  # Notice that dotted.names in comments and literal strings are ignored

  from oopp import *
  import __main__

  text=inspect.getsource(__main__)

  regexp=Regexp.CODESEP| Regexp.DOTNAME.named()

  print 'Using the regular expression',regexp

  print "I have found the following dotted names:\n%s" % [
      MO.group() for MO in regexp.finditer(text) if MO.lastgroup=='DOTNAME']
  
  #</test_re.py>

with output:

 ::

  Using the regular expression (?s)(?P<COMMENT>#.*?(?=\n))|(?P<STRING>
  '''.+?'''|""".+?"""|'.+?'|".+?")|(?P<DOTNAME>[a-zA-Z_]\w*\.[\w\.]*)
  I have found the following dotted names:
  ['inspect.getsource', 'Regexp.CODESEP', 'Regexp.DOTNAME.named', 'MO.group', 
   'dotname.finditer', 'MO.lastgroup']

Now one can define a good ``CodeStr`` class with replacing features

Let me consider for instance the solution to the problem discussed in chapter
4, i.e. the definition of a ``TextStr`` class able to indent and dedent
blocks of text.

 ::

  #<oopp.py>
 
  def codeprocess(code,TPO): # TPO=text processing operator
      code=code.replace("\\'","\x01").replace('\\"','\x02')
      genblock,out = Regexp.CODESEP.generateblocks(code),[]
      for block in genblock:
          out.append(TPO(block))
          out.append(genblock.next())
      return ''.join(out).replace("\x01","\\'").replace('\x02','\\"')

  def quotencode(text):
      return text.replace("\\'","\x01").replace('\\"','\x02')

  def quotdecode(text):
      return text.replace("\x01","\\'").replace('\x02','\\"')

  #</oopp.py>

Here is an example of usage: replacing 'Print' with 'print' except in
comments and literal strings.

 ::

  #<codeproc.py>

  from oopp import codeprocess

  wrongcode=r'''
  """Code processing example: replaces 'Print' with 'print' except in
  comments and literal strings"""
  Print "This program prints \"Hello World!\"" # look at this line!
  '''

  fixPrint=lambda s: s.replace('Print','print')
  validcode=codeprocess(wrongcode,fixPrint)

  print 'Source code:\n',validcode
  print 'Output:\n'; exec validcode
  
  #</codeproc.py>
 
with output

 ::

  Source code:

  """Code processing example: replaces 'Print' with 'print' except in
  comments and literal strings"""
  print "Prints \"Hello World!\"" # look at this line!

  Output:

  This program prints "Hello World!"

A simple state machine
---------------------------------------------------------------------------

Regular expression, however powerful, are limited in scope since they
cannot recognize recursive structures. For instance, they cannot parse
parenthesized expression.

The simplest way to parse a parenthesized expression is to use a state
machine.

 ::

  (?:...) non-grouping
  (?P<name>...) 

  (?=...) look-ahead 
  (?!...) negative
  (?<=...) look-behind 
  (?<!...) negative

  dec=\
  """
  paren   = ( .. )
  bracket = [ .. ]
  brace   = { .. }
  comment = # .. \n
  """

  actions=\
  """
  reobj1  : R' .. (?<!\\)' -> Regexp(r''' .. ''')
  reobj2  : R" .. (?<!\\)" -> Regexp(r""" .. """)
  string1 : (?<!')'(?!') .. (?<!\\)'(?!') -> ''' .. '''
  string2 : (?<!")"(?!") .. (?<!\\)"(?!") -> """ .. """
  """

  beg=0; end=1

  string1[beg]=r"(?<!')'(?!')" # an isolated single quote
  string2[beg]=r'(?<!")"(?!")' # an isolated double quote
  string1[end]=r"(?<!\\)'(?!')" # ending single quote
  string2[end]=r'(?<!\\)"(?!")' # ending double quote

  reobj1[beg]=r"R'"       
  reobj2[beg]=r'R"' 
  reobj1[end]=string1[end] # ending single quote
  reobj2[end]=string2[end] # ending double quote

  actions=\
  """
  reobj1  : R' .. (?<!\\)' -> Regexp(r''' .. ''')
  reobj2  : R" .. (?<!\\)" -> Regexp(r""" .. """)
  string1 : (?<!')'(?!') .. (?<!\\)'(?!') -> ''' .. '''
  string2 : (?<!")"(?!") .. (?<!\\)"(?!") -> """ .. """
  """

  beg={}; end={}; ls=[]
  for line in decl.splitlines():
     mode,rest=line.split(' : ')
     s,r=rest.split(' -> ')

   beg[mode],end[mode]=s.split(' .. ')
   ls.append('(?P<beg_%s>%s)' % (mode,beg[mode]))
   ls.append('(?P<end_%s>%s)' % (mode,end[mode]))

   beg2[mode],end2[mode]=r.split(' .. ')
   ls.append(beg2[mode])
   ls.append(end2[mode])

  delimiters='(%s)' % re.compile('|'.join(ls))
  splitlist=['']+delimiters.split(source)
  for delim,text in splitlist:
      delimiters.match(delim).lastgroup

Creating classes
----------------------------------------------------------------------

TODO

Modifying modules
-----------------------------------------------------------

Metaclasses are extremely
useful since they allows to change the behaviour of the code without
changing the sources. For instance, suppose you have a large library written 
by others that you want to enhance in some way.

Typically, it is always a bad idea to modify the sources, for many reasons:

+ touching code written by others, you may introduce new bugs;
+ you may have many scripts that requires the original version 
  of the library, not the modified one;
+ if you change the sources and then you buy the new version of the
  library, you have to change the sources again!

The solution is to enhance the proprierties of the library at run
time, when the module is imported, by using metaclasses.

To show a concrete example, let me consider the case of the module
*commands* in the Standard Library. This module is Unix-specific,
and cannot be used under Windows. It would be nice to have a
metaclass able to enhance the module in such a way that
when it is invoked on a Windows platform, Windows specific replacement
of the Unix functions provided in the module are used. However,
for sake of brevity, I will only give a metaclasses that display
a nice message in the case we are in a Window platform, without
raising an error (one could easily implement such a behaviour,
however).

 ::

  #<recognizewindows.py>

  import oopp,sys,commands

  class WindowsAware(type):
      def __init__(cls,*args): 
          if sys.platform=='win32': 
              for key,val in vars(cls).iteritems():
                  if isinstance(val,staticmethod):
                      setattr(cls,key,staticmethod(lambda *args: 
                           "Sorry, you are (or I pretend you are) on Windows,"
                           " you cannot use the %s.module" % cls.__name__))
            
  sys.platform="win32" #just in case you are not on Windows

  commands=oopp.ClsFactory[WindowsAware](commands)

  print commands.getoutput('date') #cannot be executed on Windows

  #</recognizewindows.py>

The output of this script is 

 ::

  Sorry, you are on Windows, you cannot use the commands.module

However, if you are on Linux and you comment out the line 

 ::

  sys.platform="win32" 

you will see that the script works.

Notice that the line ``commands=WindowsAware(commands)`` actually
converts the 'commands' module in a 'commands' class, but since
the usage is the same, this will fool all programs using the
commands module. In this case the class factory 'WindowsAware'
can also be thought as a module modifier. In this sense, it is
very useful to denote the metaclass with an *adjective*.

Metaclasses and attribute descriptors 
----------------------------------------------------------------------

Descriptors are especially useful in conjunction with metaclasses, since
a custom metaclass can use them as low level tools to modify the methods 
and the attributes of its instances. This allows to implement very 
sophisticated features with few lines of code.


Notice, anyway, that
even plain old function can be thought of as of descriptors. 

Descriptors share at least two features with metaclasses:

1. as metaclasses, descriptors are best used as adjectives, since they
   are intended to modify and enhance standard methods and attributes, in the
   same sense metaclasses modify and enhance standard classes;

2. as metaclasses, descriptors can change the *semantics* of Python, i.e.
   what you see is not necessarely what you get. As such, they are a 
   dangerous feature. Use them with judgement!

Now I will show a possible application of properties.
Suppose one has a given class with various kind of
attributes (plain methods, regular methods, static methods,
class methods, properties and data attributes) and she wants
to trace to access to the data attributes (notice that the motivation
for the following problem come from a real question asked in
comp.lang.python). Then one needs to retrieve data
attributes from the class and convert them in properties
controlling their access syntax. The first problem is solved
by a simple function

 ::

  #<oopp.py>

  def isplaindata(a):
      """A data attribute has no __get__ or __set__ attributes, is not
      a built-in function, nor a built-in method.""" 
      return not(hasattr(a,'__get__') or hasattr(a,'__set__')
                 or isinstance(a,BuiltinMethodType) or
                 isinstance(a,BuiltinFunctionType))

  #</oopp.py>

whereas the second problem is elegantly solved by a custom metaclass:

 ::

  #<tracedaccess.py>

  from oopp import isplaindata,inspect

  class TracedAccess(type):
      "Metaclass converting data attributes to properties"
      def __init__(cls,name,bases,dic):
          cls.datadic={}
          for a in dic:
              if isplaindata(a):
                  cls.datadic[a]=dic[a]
                  def get(self,a=a):
                      v=cls.datadic[a]
                      print "Accessing %s, value=%s" % (a,v)
                      return v
                  def set(self,v,a=a):
                      print "Setting %s, value=%s" % (a,v)
                      cls.datadic[a]=v
                  setattr(cls,a,property(get,set))

  class C(object):
      __metaclass__ = TracedAccess
      a1='x'

  class D(C): # shows that the approach works well with inheritance
      a2='y'

  i=D()
  i.a1 # => Accessing a1, value=x
  i.a2 # => Accessing a2, value=y
  i.a1='z' # => Setting a1, value=z
  i.a1 # => Accessing a1, value=z

  #</tracedaccess.py>

In this example the metaclass looks at the plain data attributes (recognized
thanks ot the ``isplaindata`` function) of its instances and put them
in the dictionary ``cls.datadic``. Then the original attributes are replaced
with property objects tracing the access to them. The solution is a 4-line 
custom metaclass doing the boring job for me:

 ::

  #<oopp.py>
  
  class Wrapped(Customizable,type):
      """A customizable metaclass to wrap methods with a given wrapper and
      a given condition"""
      __metaclass__=Reflective
      wrapper=wrappedmethod
      condition=lambda k,v: True # wrap all
      def __init__(cls,*args):
          super(cls.__this,cls).__init__(*args)
          wrap(cls,cls.wrapper,cls.condition.im_func)

  Traced=Wrapped.With(wrapper=tracedmethod,__name__='Traced')
  Timed=Wrapped.With(wrapper=timedmethod,__name__='Timed')

  #</oopp.py>

Here is an example of usage:

  >>> from oopp import *
  >>> time_=ClsFactory[Traced](time)
  >>> print time_.asctime()
  [time_] Calling 'asctime' with arguments
  (){} ...
  -> 'time_.asctime' called with result: Sun May  4 07:30:51 2003
  Sun May  4 07:30:51 2003

Another is

 ::

  #<tracemain.py>

  from oopp import ClsFactory,Traced,Reflective

  def f1(x): return x     # nested functions 
  def f2(x): return f1(x) # we want to trace

  f1orf2=lambda k,v : v is f1 or v is f2
  make=ClsFactory[Reflective,Traced.With(condition=f1orf2)]
  traced=make('traced',globals())

  traced.f2('hello!') # call traced.f2

  #</tracemain.py>

with output

 ::

  [__main__] Calling 'f2' with arguments
  ('hello!',){} ...
  [__main__] Calling 'f1' with arguments
  ('hello!',){} ...
  -> '__main__.f1' called with result: hello!
  -> '__main__.f2' called with result: hello!

Modifying hierarchies
---------------------------------------------------------

Suppose one wants to enhance a pre-existing class, for instance
by adding tracing capabilities to it. The problem is non-trivial
since it is not enough to derive a new class from the original
class using the 'Traced' metaclass. For instance, we could imagine of 
tracing the 'Pizza' class introduced in chapter 4 by defining

  >>> from oopp import *
  >>> class TracedTomatoPizza(GenericPizza,WithLogger):
  ...     __metaclass__=ClsFactory[Traced] 
  ...     toppinglist=['tomato']

However, this would only trace the methods of the newly defined class,
not of the original one. Since the new class does not introduce any 
non-trivial method, the addition of 'Traced' is practically without
any effect:

  >>> marinara=TracedTomatoPizza('small') # nothing happens
  *****************************************************************************
  Tue Apr 15 11:00:17 2003
  1. Created small pizza with tomato, cost $ 1.5

Tracing hierarchies
------------------------------------------------------------------------------

 ::

  #<traceH.py>

  from oopp import *

  def wrapMRO(cls,wrapped):
      for c in cls.__mro__[:-1]:
          wrap(c,wrapped)

  tracing=tracedmethod.With(logfile=file('trace.txt','w'))
  wrapMRO(HomoSapiensSapiens,tracing)
  HomoSapiensSapiens().can()
  
  #</traceH.py>

with output in trace.txt

 ::

  [HomoSapiensSapiens] Calling 'can' with arguments
   (<oopp.HomoSapiensSapiens object at 0x4020364c>,){} ...
      [HomoSapiens] Calling 'can' with arguments
       (<oopp.HomoSapiensSapiens object at 0x4020364c>,){} ...
          [HomoHabilis] Calling 'can' with arguments
           (<oopp.HomoSapiensSapiens object at 0x4020364c>,){} ...
              [Homo] Calling 'can' with arguments
               (<oopp.HomoSapiensSapiens object at 0x4020364c>,){} ...
                  [PrettyPrinted] Calling '__str__' with arguments
                   (<oopp.HomoSapiensSapiens object at 0x4020364c>,){} ...
                  [PrettyPrinted.__str__] called with result: 
                   <HomoSapiensSapiens>
              [Homo.can] called with result: None
          [HomoHabilis.can] called with result: None
      [HomoSapiens.can] called with result: None
  [HomoSapiensSapiens.can] called with result: None

Modifying source code
------------------------------------------------------------------------

The real solution would be to derive the original class 'GenericPizza'
from 'Traced' and not from 'object'. One could imagine of creating
a new class inhering from 'Traced' and with all the methods of the
original 'GenericPizza' class; then one should create copies of
all the classes in the whole multiple inheritance hierarchy.
This would be a little annoying, but feasable; the real problem is
that this approach would not work with cooperative methods, since 
cooperative calls in the derived classes would invoked methods in 
the original classes, which are not traced. 

This is a case where the modification of the original source code is 
much more appealing and simpler that any other method: it is enough 
to perform a search and replace in the original source code, by adding
the metaclass 'Traced', to enhance the whole multiple inheritance hierarchy.
Let me assume that the hierarchy is contained in a module (which is
typical case). The idea, is to generate *dynamically* a new module from the
modified source code, with a suitable name to avoid conflicts with the
original module. Incredibily enough, this can be done in few lines:

 ::

  #<oopp.py>

  def modulesub(s,r,module):
      "Requires 2.3"
      name=module.__name__
      source=inspect.getsource(module).replace(s,r)
      dic={name: module}; exec source in dic # exec the modified module
      module2=ModuleType(name+'2') # creates an an empty module 
      customize(module2,**dic) # populates it with dic
      return module2
    
  #</oopp.py>

Notice that the ``sub`` function, that modifies the source code of
a given module and returns a modified module, requires Python 2.3
to work. This is a due to a subtle bug in ``exec`` in Python 2.2.
Anyway, the restriction to Python 2.3 allows me to take advantage
of one of the most elegant convenience of Python 2.3: the name in
the ``types`` module acts are type factories and in particular
``ModuleType(s)`` returns an (empty) module named ``s``.
Here is an example of usage:
 
  >>> import oopp
  >>> s='GenericPizza(object):'
  >>> oopp2=oopp.modulesub(s,s+'\n    __metaclass__=oopp.Traced',oopp) 

Name clashes are avoided, being 'oopp2' a different module from 
'oopp'; we have simultaneously access to both the original hierarchy
in 'oopp' (non-traced) and the modified one in 'oopp2' (traced).
In particular 'oopp2.CustomizablePizza' is traced and therefore

  >>> class PizzaLog(oopp2.CustomizablePizza,oopp2.WithLogger):
  ...     __metaclass__=makecls()
  >>> marinara=PizzaLog.With(toppinglist=['tomato'])('small')

gives the output

 ::

  [PizzaLog] Calling '__init__' with arguments
  (<oopp.PizzaLog object at 0x40470dac>, 'small'){} ...
  -> 'PizzaLog.__init__' called with result: None

  *****************************************************************************
  Thu Mar 27 09:18:28 2003
  [PizzaLog] Calling '__str__' with arguments
  (<oopp.PizzaLog object at 0x40470dac>,){} ...
  [PizzaLog] Calling 'price' with arguments
  (<oopp.PizzaLog object at 0x40470dac>,){} ...
  [PizzaLog] Calling 'toppings_price' with arguments
  (<oopp.PizzaLog object at 0x40470dac>,){} ...
  -> 'PizzaLog.toppings_price' called with result: 0.5

  -> 'PizzaLog.price' called with result: 1.5

  -> 'PizzaLog.__str__' called with result: small pizza with tomato, cost $ 1.5

  1. Created small pizza with tomato, cost $ 1.5

From that we understand what is happening:

- ``PizzaLog.__init__`` calls ``GenericPizza.__init__`` that defines size and
  cooperatively calls ``WithLogger.__init__`` 

- WithLogger.__init__ cooperatively calls ``WithCounter.__init__`` 
  that increments the count attribute;

- at this point, the instruction 'print self' in ``WithLogger.__init__`` calls 
  ``PizzaLog.__str__`` (inherited from ``GenericPizza.__str__``);

- ``GenericPizza.__str__`` calls 'price' that in turns calls 
  'toppings_price'.

On top of that, notice that the metaclass of 'PizzaLog' is
``_TracedReflective`` that has been automagically generated by 
``makecls`` from the metaclasses of 'CustomizablePizza' (i.e. 'Traced')
and of 'WithLogger' (i.e. 'Reflective'); the leading underscore helps 
to understand the dynamical origin of '_TracedReflective'.
It turns out that '_TracedReflective' has a dynamically
generated (meta-meta)class:

  >>> print type(type(PizzaLog)) #meta-metaclass
  <class 'oopp._WithWrappingCapabilitiesReflective'>

Therefore this example has a non-trivial class hierarchy

  >>> print oopp.MRO(PizzaLog)
  MRO of PizzaLog:
    0 - PizzaLog(CustomizablePizza,WithLogger)[Traced]
    1 - CustomizablePizza(GenericPizza,Customizable)[Traced]
    2 - GenericPizza(object)[Traced]
    3 - WithLogger(WithCounter,Customizable,PrettyPrinted)
    4 - WithCounter(object)
    5 - Customizable(object)
    6 - PrettyPrinted(object)
    7 - object()

a non-trivial metaclass hierarchy,

  >>> print oopp.MRO(type(PizzaLog)) # the metaclass hierarchy
  MRO of Traced:
    0 - Traced(Reflective)[WithWrappingCapabilities]
    1 - Reflective(type)
    2 - type(object)
    3 - object()

and a non-trivial meta-metaclass hierarchy:

  >>> print oopp.MRO(type(type(PizzaLog))) # the meta-metaclass hierarchy
  MRO of WithWrappingCapabilities:
    0 - WithWrappingCapabilities(BracketCallable)
    1 - CallableWithBrackets(type)
    2 - type(object)
    3 - object()

Pretty much complicated, isn't it ? ;)

This example is there to show what kind of maintenance one can have
with programs doing a large use of metaclasses, particularly, when
they should be understood by somebody else than the autor ...

Metaclass regenerated hierarchies
--------------------------------------------------------------------------

 ::

  import types
    
  def hierarchy(self,cls):
      d=dict([(t.__name__,t) for t in vars(types).itervalues()
              if isinstance(t,type)])
      def new(c):
          bases=tuple([d[b.__name__] for b in c.__bases__])
          return self(c.__name__, bases, c.__dict__.copy())
      mro=list(cls.__mro__[:-1])
      mro.reverse()
      for c in mro:
          if not c.__name__ in d:
              d[c.__name__]=new(c)
      customize(self,**d)

  ClsFactory.hierarchy=hierarchy
  traced=ClsFactory[Traced,Reflective]

Unfortunately, this approach does not work if the original hierarchy makes
named cooperative super calls.

Therefore the source-code run-time modification has its advantages.


THE PROGRAMMABLE PROGRAMMING LANGUAGE
=========================================================================

  *I think that lisp is a better applications language than Python.
  However, Python is close enough, or at least so much better than the
  alternatives, that Python's social and glue language advantages are
  often decisive.*  -- Andy Freeman on c.l.p.

I go in *really* DEEP BLACK MAGIC here.

Lisp has been called the *programmable programming language* [#]_
since its macros allow the  programmer to change the *syntax* of
the language. Python has no macros and the syntax of the language
cannot be changed. Nevertheless, Python metaclasses allows
to change the *semantics* of the language. In this sense, they
are even more powerful and more dangerous than Lisp macros.
Python metaclass allow the user to customize the language (if not
its syntax). This is cool enough, however it can make your programs
unreadable by others. The techniques explained in this
chapter should be used with care. Nevertheless, I trust the judgement
of the programmer who has been able to reach this chapter, and I don't
mind providing him further rope to shoot in his/her foot ;)

.. [#] Paul Graham, 'OnLisp'
       citing 

Enhancing the Python language
--------------------------------------------------------------------------

Let me start with some minor usage of metaclasses. In this section I
will show how the user can implement in few lines features that are
built-in in other languages, through a minimal usage of metaclasses. 

For instance, suppose one wants to define a class which cannot be 
derived: in Java this can be done with the "final" keyword. 
In Python there is no need to add a new keyword to the language: 

 ::

  #<oopp.py>

  class NonDerivableError(Exception): pass

  class Final(type): # better derived from WithCounter,type
      "Instances of Final cannot be derived"
      def __new__(meta,name,bases,dic):
          try:
              meta.already_called is True
          except AttributeError: # not already called
              meta.already_called=True
              return super(Final,meta).__new__(meta,name,bases,dic)
          else: #if already called
              raise NonDerivableError("I cannot derive from %s" % bases)

  #</oopp.py>

Here there is an example of usage:

  >>> from oopp import Final
  >>> class C:
  ...    __metaclass__=Final
  ...
  >>> class D(C): pass #error
  ...
  NonDerivableError: D not created from (<class 'oopp.C'>,)

It is interesting to notice that a similar effect can be reached
with a ``singletonClass`` class factory: a 'MetaSingleton' inherits
from ``Singleton`` and from 'type' (therefore it is a metaclass):

 ::

  #<oopp.py>

  class S(Singleton,type): pass
  singletonClass=ClsFactory[S]

  #</oopp.py>

If we write

  >>> from oopp import singletonClass
  >>> C=singletonClass()
  >>> class D(C):
  ...    pass


we see that actually 'D' is not a new instance of 'Singleton', but
it coincides with 'C', instead:

  >>> id(C),id(D)
  (135622140, 135622140)
  >>> C is D
  True
  >>> type(C)
  <class '__main__._Singleton'>
  >>> type(C).__bases__
  (<class 'oopp.Singleton'>, <type 'type'>)
  >>> c=C(); d=D()
  >>> id(c),id(d)
  (1075378028, 1075378924)

Notice the order: 'SingletonClass' must inherit from 'Singleton' 
first and from ``Class`` second, otherwise the ``Class.__new__`` method would
override the  ``Singleton.__new__``, therefore losing the 'Singleton'
basic property of having only one instance. On the other hand, in
the correct order, 'Singleton' first and 'Class' second, the inheritance
diagram is

 ::


                       object   5
                     (__new__)
                    /          \
                   /            \
        2      WithCounter          type    4
              (__new__)       (__new__)
                  |              |
                  |              |
        1     Singleton         Class    3
              (__new__)       (__new__)
                   \             /
                    \           /
                    SingletonClass    0
                 (Singleton.__new__)


 ::

                        object
                       /     \
                      /       |
                  WithCounter     | 
                      |       |
                  Singleton  type
                       \     /
                        \   /
                     MetaSingleton
                          :
                          :       
                          :   instantiation
                          :
                          :
                        C = D


whereas 'SingletonClass' inherits ``Singleton.__new__`` which, trough
the ``super`` mechanism, calls 'type.__new__' and therefore creates
the class 'C'. Notice that class 'D' is never created, it is simply
an alias for 'C'.

I think it is simpler to write down the class 'Final' explicitely
(explicit is better than implicit) as I did; however a fanatic of code
reuse could derive it from 'SingletonClass':

 ::

  #<final.py>

  from oopp import *

  class Final(Singleton,type):
      "Inherits the 'instance' attribute from Singleton (default None)"
      def __new__(meta,name,bases,dic):
          if meta.counter==0: # first call
              return super(Final,meta).__new__(meta,name,bases,dic)
          else:
              raise NonDerivableError("I cannot derive from %s" % bases)
    
  class C:  __metaclass__=Final

  try:
      class D(C): pass
  except NonDerivableError,e:
      print e
  
  #</final.py>

The reader can check that this script has the correct output 
"I cannot derive from <class 'oopp.C'>". I leave to the reader
to understand the issues with trying to implement 'NonDerivable'
from 'NonInstantiable'. #And why an inner metaclass would not work.

Restricting Python dynamism
-----------------------------------------------------------

 ::

  #<oopp.py>

  def frozen(self,name,value):
      if hasattr(self,name):
          type(self).__bases__[0].__setattr__(self,name,value) 
      else:
          raise AttributeError("You cannot add attributes to %s" % self)

  class Frozen(object):
      """Subclasses of Frozen are frozen, i.e. it is impossibile to add
       new attributes to them and their instances"""
      __setattr__ = frozen
      class __metaclass__(type):
          __setattr__ = frozen

  #</oopp.py>


  #<frozen.py>

  from oopp import *

  class C(Frozen):
      c=1
      def __init__(self): 
          #self.x=5 # won't work anymore, __new__ will be okay
          pass

  class D(C):
      d=2
    
  C.c=2

  print D().d

  #</frozen.py>

Changing the language without changing the language
--------------------------------------------------------------------------

In Lisp the user has the possibility of changing the syntax of the 
language to suit her purposes (or simply to fit her taste). 
In Python, the user cannot change the basic grammar of the language,
nevertheless, to a great extent, metaclasses allows to emulate this effect. 
Notice that using metaclasses to this aim is not necessarely 
a good idea, since once you start
changing the Python standard behaviour, it will become impossible for
others  to understand your programs (which is what happened to Lisp ;).

Let me show how metaclasses can be used to provide notational convenience 
(i.e. syntactic sugar) for Python.

As first example, I will show how we may use metaclasses to provide some
convenient notation for staticmethods and classmethods:

 ::

  class MetaSugar(type):
      def __init__(cls,name,bases,clsdict):
          for key,value in clsdict.iteritems():
              if key.startswith("static_"):
                  setattr(cls,key[7:],staticmethod(value))
              elif key.startwith("class_"):
                  setattr(cls,key[6:],classmethod(value))
        
The same effect can be obtained trough normal inheritance

 ::

  class SyntacticSugar(object):
      def __init__(self):
          for k,v in self.__class__.__dict__.iteritems():
              if k.startswith('static_'):
                  self.__class__.__dict__[k[7:]] = staticmethod(v)
              if k.startswith('static_'):
                  self.__class__.__dict__[k[7:]] = staticmethod(v)

Let me now implement some syntactic sugar for the __metaclass__ hook.

 ::

  #<oopp.py>

  import re
  squarednames=re.compile('\[([A-Za-z_][\w\., ]*)\]')

  def inferredfromdocstring(name,bases,dic):
      docstring=dic['__doc__']
      match=squarednames.match(docstring)
      if not match: return ClsFactory[Reflective](name,bases,dic)
      metanames=[name.strip() for name in match.group(1).split(',')]
      metaname=''.join(metanames)  
      if len(metanames)>1: # creates a new metaclass
          metaclass=type(metaname,tuple(map(eval,metanames)),{})
      else:
          metaclass=eval(metaname)
      return ClsFactory[metaclass](name,bases,dic)

  #</oopp.py>

  #<sugar.py>

  from oopp import *
  __metaclass__ = inferredfromdocstring
  class B:
      "Do nothing class"

  class C: 
      "[Reflective]"
      " Do nothing class"

  class D:
      "[WithLogger,Final]"
      "Do nothing class"

  class E(C):
      pass

  #</sugar.py>

With output:

 ::

  *****************************************************************************
  Fri Feb 21 09:35:58 2003
  Creating class Logged_C descending from (),
  instance of <class 'oopp.Logged'>

  Logged_C dictionary:
   __doc__ = Do nothing class
  *****************************************************************************
  Fri Feb 21 09:35:58 2003
  Creating class Logged_Final_D descending from (),
  instance of <class 'oopp.LoggedFinal'>

  Logged_Final_D dictionary:
  __doc__ = Do nothing class
  *****************************************************************************
  Fri Feb 21 09:35:58 2003
  Creating class E descending from (<class 'oopp.Logged_C'>,),
  instance of <class 'oopp.Logged'>

  E dictionary:
  <EMPTY> 

At the end, let me point out few observations:
Metaclasses can be used to provide syntactic sugar, as I have shown
in the previous example. However, I have given the previous
routines as a proof of concept: I do *not* use these routines in
my actual code for many good reasons:

1. At the end a convenient notation will be provided in Python 2.4
2. I don't want to use magic tricks on my code, I want others to
   be able to understand what the code is doing;
3. I want to be able myself to understand my own code in six months
   from today ;)

Anyway, I think it is a good thing to know about this potentiality
of metaclasses, that can turn out to be very convenient in certain
applications: but this does not mean that should be blindly used
and/or abused. In other words: with great powers come 
great responsabilities ;)

Recognizing magic comments
--------------------------------------------------------------------------

In this section, I will begin to unravel the secrets of the black magic art 
of changing Python semantics and I will show that with few lines 
involving metaclasses
and the standard library 'inspect' module, even comments can be made
significant! (let me continue with my series "how to do what should not
be done").

To this aim, I need a brief digression on regular expressions.

 ::

  class RecognizesMagicComments(object):
     form=r'def %s(NAME)(args):#!\s?staticmethod'
     class __metaclass__(type):
         def __new__(meta,name,bases,dic):
             code=[]
             for attr in dic:
                 source=inspect.getsource(dic[attr]).splitlines()
                 for line in source:
                     split=line.split('#!')
                     if len(split)==2:
                         descriptor=split[1]; code.append(split[0])
                     else: code.append(line)
               
  class C(RecognizesMagicComments):
      #!staticmethod
      def f(x): #!staticmethod
          return x

Interpreting Python source code on the fly
---------------------------------------------------------------------------

At this point, I can really go *DEEP* in black magic.

 ::

  import sys, inspect, linecache, re

  def cls_source(name,module):
      lines = linecache.getlines(inspect.getsourcefile(module))
      if not lines: raise IOError, 'could not get source code'
      pat = re.compile(r'^\s*class\s*' + name + r'\b')
      for i in range(len(lines)):
          if pat.match(lines[i]): break
      else: raise IOError, 'could not find class definition'
      lines, lnum = inspect.getblock(lines[i:]), i + 1
      return ''.join(lines)

  class Interpreter(object):
      def __init__(self,CPO): # possible composition of code processing opers
          self.repl=CPO
      def __call__(self,name,bases,dic):
          try:
             modulename=dic['__module__'] # module where the class is defined
          except KeyError: # no __module__ attribute
             raise IOError("Class %s cannot be defined dynamically or in the\n"
             "interpreter and the source code cannot came from a pipe"% name)
          module=sys.modules[modulename] 
          source=self.repl(cls_source(name,module))
          source=re.sub('__metaclass__=.*','__metaclass__=type',source)
          #print source
          loc={}; exec source in vars(module),loc
          return loc[name]

  regexp_expand=Interpreter(regexp)

Implementing lazy evaluation
---------------------------------------------------------------------------

At this point of our knowledge, it becomes trivial to implement lazy 
evaluation and then a ternary operator. (My original, simpler, implementation
is posted on c.l.p.; see the thread 'PEP 312 (and thus 308) implemented 
with a black magic trick')

Implementing a ternary operator
---------------------------------------------------------------------------

 ::

  # module ternary.py

  "PEP 308 and 312 implemented via a metaclass-powered dirty trick"

  import inspect,__main__

  # the ternary operator:

  def if_(cond,f,g):
      "Short circuiting ternary operator implemented via lambdas"
      if cond: return f()
      else: return g()

  # the metaclass black magic:

  class DirtyTrick(type):
      """Cooperative metaclass that looks at the source code of its instances 
      and replaces the string '~' with 'lambda :' before the class creation"""
      def __new__(meta,name,bases,dic):
          for attr in dic.values():
              if inspect.isfunction(attr): 
                  code=inspect.getsource(attr)
                  if code.find('~')==-1: continue # no '~' found, skip
                  code=code.replace('~','lambda :')
                  code=dedent(code)+'\n'
                  exec code in __main__.__dict__,dic # modifies dic
          return super(DirtyTrick,meta).__new__(meta,name,bases,dic)

  # a convenient base class:

  class RecognizesImplicitLambdas:
      "Children of this class do recognize implicit lambdas"
      __metaclass__=DirtyTrick

Here there is an example of usage:

 ::

  from ternary import if_, RecognizesImplicitLambdas
  from math import sqrt

  class C(RecognizesImplicitLambdas):
     def safesqrt(self,x):
          return if_( x>0, ~sqrt(x), ~0) #short-circuiting ternary operator

  c=C()
  print c.safesqrt(4), c.safesqrt(-4)