First version of my paper mro+super

author: micheles <micheles@micheles-mac> 2010-03-28 17:48:06 +0200
committer: micheles <micheles@micheles-mac> 2010-03-28 17:48:06 +0200
commit: 0a1cac61367eeb4eae8829655cf361a10d6c5418 (patch)
tree: 3f5b311515634ccb7976baefeb5e9c28edaa61eb /pypers
parent: ecf277c15bda9e6284c2fff02ae5917e7ce159f2 (diff)
download: micheles-0a1cac61367eeb4eae8829655cf361a10d6c5418.tar.gz
1 files changed, 1033 insertions, 0 deletions
diff --git a/pypers/mro/mro+super.txt b/pypers/mro/mro+super.txt
new file mode 100644
index 0000000..2ac1381
--- /dev/null
+++ b/pypers/mro/mro+super.txt
@@ -0,0 +1,1033 @@
+The Method Resolution Order and Super
+======================================
+
+Most object oriented languages feature single inheritance.  Python
+instead features multiple inheritance: every Python class can have
+more than one direct parent. That means that a class inherits methods
+(and attributes) from all of its parents; moreover, it means that some
+rule must be given to manage the situation when two or more parents
+provide the same method with different implementations. Whereas some
+languages solve the conflict in a simple way (for instance in Eiffel
+it is an error to have two parents providing different implementations
+of the same method), Python adopts a relatively sophisticated
+algorithm to decide which method should have the precedence. Such an
+algorithm is called the *Method Resolution Order* or MRO for short.
+The reason why Python is using the most sophisticated approach is that
+the language wants to support a pattern called *method cooperation*.
+The goal of this chapter is to explain how the MRO works and what
+method cooperation is.
+
+Some basic definitions
+---------------------------------------------------------------
+
+Given a class in a complicated multiple inheritance hierarchy,
+it is a non-trivial task to specify the order in which methods are
+overridden.  The approach taken by Python - and other languages - is
+to order the ancestors of every given class (i.e. the set of the
+parents of the class, together with the parents of the parents up to
+the final base class ``object``) in a well determined way.
+
+Formally speaking, the list of the ancestors of a class ``C``,
+including the class itself, ordered from the nearest ancestor to the
+furthest, is called the class precedence list or the *linearization*
+of ``C``.  A method defined in a near ancestor has the precedence over
+a method defined in a remote ancestor.
+
+The Method Resolution Order is the set of rules that specifies how to
+build the linearization of a class. Different languages uses different
+MROs: Python, starting from release 2.3, adopts the so-called C3
+Method Resolution Order which was first discussed in `a paper about
+the Dylan language`_ and is arguably the best MRO you can think of.
+In the Python literature, the idiom *the MRO of C* is also used as a
+synonymous for the linearization of the class ``C``.
+
+In the case of single inheritance hierarchy, if ``C`` is a subclass of
+``C1``, and ``C1`` is a subclass of ``C2``, then the linearization of
+``C`` is simply the list ``[C, C1 , C2]``.  However, with multiple
+inheritance hierarchies, the construction of the linearization is more
+cumbersome.  In particular, not all classes admit a linearization: there
+are cases where it is not possible to derive a class such that its 
+linearization is non-ambiguous.
+
+Here I give an example of this situation. Consider the hierarchy 
+
+  >>> O = object
+  >>> class X(O): pass
+  >>> class Y(O): pass
+  >>> class A(X,Y): pass
+  >>> class B(Y,X): pass
+
+which can be represented with the following inheritance graph, where I
+have denoted with O the ``object`` class, which is the beginning of any
+hierarchy for new style classes:
+
+.. code-block:: python
+
+          -----------
+         |           |
+         |    O      |
+         |  /   \    |
+          - X    Y  /
+            |  / | /
+            | /  |/
+            A    B
+            \   /
+              ?
+
+In this case, it is not possible to derive a new class ``C`` from
+``A`` and ``B``, since ``X`` precedes ``Y`` in ``A``, but ``Y``
+precedes ``X`` in ``B``, therefore the method resolution order would
+be ambiguous in ``C``.
+
+Python (starting from version 2.3) raises an exception in this
+situation (``TypeError: MRO conflict among bases Y, X``) forbidding
+the naive programmer from creating ambiguous hierarchies.
+
+The C3 Method Resolution Order
+------------------------------
+
+The MRO used by Python is called C3 because it satifies three
+consistency constraints: consistency with the *extended precedence
+graph*, consistency with *local precedence ordering* and consistency
+with the *monotonicity* property. I will not discuss the extended
+precedence graph here, since it is too technical (you can read the
+Dylan paper if you are curious), whereas I will discuss local precedence
+ordering and monotonicity later on.
+
+But first let me introduce a few simple notations which will be useful for the
+following discussion.  I will use the shortcut notation
+
+  ``C1 C2 ... CN``
+
+to indicate the list of classes ``[C1, C2, ... , CN]``.
+
+The *head* of the list is its first element:
+
+  ``head = C1``
+
+whereas the *tail* is the rest of the list:
+
+  ``tail = C2 ... CN``.
+
+I will also use the notation
+
+  ``C + (C1 C2 ... CN) = C C1 C2 ... CN``
+
+to denote the sum of the lists ``[C] + [C1, C2, ... ,CN]``.
+
+Now I can explain how the MRO works in modern Python.
+
+Consider a class C in a multiple inheritance hierarchy, with ``C``
+inheriting from the base classes ``B1, B2, ...  , BN``.  We want to 
+compute the linearization ``L[C]`` of the class ``C``. The rule is the
+following:
+
+  *the linearization of C is the sum of C plus the merge of the
+  linearizations of the parents with the list of the parents.*
+
+In symbolic notation::
+
+   L[C(B1 ... BN)] = C + merge(L[B1] ... L[BN], B1 ... BN)
+
+In particular, if C is the ``object`` class, which has no parents, the
+linearization is trivial::
+
+       L[object] = object.
+
+However, in general one has to compute the merge according to the following 
+prescription:
+
+  *take the head of the first list, i.e L[B1][0]; if this head is not in
+  the tail of any of the other lists, then add it to the linearization
+  of C and remove it from the lists in the merge, otherwise look at the
+  head of the next list and take it, if it is a good head.  Then repeat
+  the operation until all the class are removed or it is impossible to
+  find good heads.  In this case, it is impossible to construct the
+  merge, Python 2.3 will refuse to create the class C and will raise an
+  exception.*
+
+This prescription ensures that the merge operation *preserves* the
+ordering, if the ordering can be preserved.  On the other hand, if the
+order cannot be preserved (as in the example of serious order
+disagreement discussed above) then the merge cannot be computed.
+
+The computation of the merge is trivial if ``C`` has only one parent 
+(single inheritance); in this case
+
+       ``L[C(B)] = C + merge(L[B],B) = C + L[B]``
+
+However, in the case of multiple inheritance things are more cumbersome
+and I don't expect you can understand the rule without a couple of
+examples ;-)
+
+Examples
+--------
+
+First example. Consider the following hierarchy:
+
+  >>> O = object
+  >>> class F(O): pass
+  >>> class E(O): pass
+  >>> class D(O): pass
+  >>> class C(D,F): pass
+  >>> class B(D,E): pass
+  >>> class A(B,C): pass
+
+In this case the inheritance graph can be drawn as 
+
+ ::
+ 
+                            6
+                           ---
+  Level 3                 | O |                  (more general)
+                        /  ---  \
+                       /    |    \                      |
+                      /     |     \                     |
+                     /      |      \                    |
+                    ---    ---    ---                   |
+  Level 2        3 | D | 4| E |  | F | 5                |
+                    ---    ---    ---                   |
+                     \  \ _ /       |                   |
+                      \    / \ _    |                   |
+                       \  /      \  |                   |
+                        ---      ---                    |
+  Level 1            1 | B |    | C | 2                 |
+                        ---      ---                    |
+                          \      /                      |
+                           \    /                      \ /
+                             ---
+  Level 0                 0 | A |                (more specialized)
+                             ---
+
+
+The linearizations of O,D,E and F are trivial:
+
+ ::
+
+  L[O] = O
+  L[D] = D O
+  L[E] = E O
+  L[F] = F O
+
+The linearization of B can be computed as
+
+ ::
+
+  L[B] = B + merge(DO, EO, DE)
+
+We see that D is a good head, therefore we take it and we are reduced to
+compute ``merge(O,EO,E)``.  Now O is not a good head, since it is in the
+tail of the sequence EO.  In this case the rule says that we have to
+skip to the next sequence.  Then we see that E is a good head; we take
+it and we are reduced to compute ``merge(O,O)`` which gives O. Therefore
+
+ ::
+
+  L[B] =  B D E O
+
+Using the same procedure one finds:
+
+ ::
+
+  L[C] = C + merge(DO,FO,DF)
+       = C + D + merge(O,FO,F)
+       = C + D + F + merge(O,O)
+       = C D F O
+
+Now we can compute:
+
+ ::
+
+  L[A] = A + merge(BDEO,CDFO,BC)
+       = A + B + merge(DEO,CDFO,C)
+       = A + B + C + merge(DEO,DFO)
+       = A + B + C + D + merge(EO,FO)
+       = A + B + C + D + E + merge(O,FO)
+       = A + B + C + D + E + F + merge(O,O)
+       = A B C D E F O
+
+In this example, the linearization is ordered in a pretty nice way
+according to the inheritance level, in the sense that lower levels (i.e.
+more specialized classes) have higher precedence (see the inheritance
+graph).  However, this is not the general case.
+
+I leave as an exercise for the reader to compute the linearization for
+my second example:
+
+  >>> O = object
+  >>> class F(O): pass
+  >>> class E(O): pass
+  >>> class D(O): pass
+  >>> class C(D, F): pass
+  >>> class B(E, D): pass
+  >>> class A(B, C): pass
+
+The only difference with the previous example is the change 
+``B(D, E) --> B(E, D)``; however even such a little modification 
+completely changes the ordering of the hierarchy
+
+ ::
+
+                             6
+                            ---
+  Level 3                  | O |
+                         /  ---  \
+                        /    |    \
+                       /     |     \
+                      /      |      \
+                    ---     ---    ---
+  Level 2        2 | E | 4 | D |  | F | 5
+                    ---     ---    ---
+                     \      / \     /
+                      \    /   \   /
+                       \  /     \ /
+                        ---     ---
+  Level 1            1 | B |   | C | 3
+                        ---     ---
+                         \       /
+                          \     /
+                            ---
+  Level 0                0 | A |
+                            ---
+
+
+Notice that the class ``E``, which is in the second level of the hierarchy,
+precedes the class ``C``, which is in the first level of the hierarc, i.e.
+``E`` is more specialized than ``C``, even if it is in a higher level.
+
+A lazy programmer can obtain the MRO directly from Python: it is enough
+to invoke the .mro() method of class A:
+
+  >>> A.mro()
+  (<class '__main__.A'>, <class '__main__.B'>, <class '__main__.E'>,
+  <class '__main__.C'>, <class '__main__.D'>, <class '__main__.F'>,
+  <type 'object'>)
+
+Finally, let me consider the example discussed in the first section,
+involving a serious order disagreement.  In this case, it is
+straightforward to compute the linearizations of O, X, Y, A and B:
+
+ ::
+
+  L[O] = 0
+  L[X] = X O
+  L[Y] = Y O
+  L[A] = A X Y O
+  L[B] = B Y X O
+
+However, it is impossible to compute the linearization for a class C
+that inherits from A and B:
+
+ ::
+
+  L[C] = C + merge(AXYO, BYXO, AB)
+       = C + A + merge(XYO, BYXO, B)
+       = C + A + B + merge(XYO, YXO)
+
+At this point we cannot merge the lists XYO and YXO, since X is in the
+tail of YXO whereas Y is in the tail of XYO:  therefore there are no
+good heads and the C3 algorithm stops.  Python raises an error and
+refuses to create the class C.
+
+Bad Method Resolution Orders
+----------------------------
+
+A MRO is *bad* when it breaks such fundamental properties as local
+precedence ordering (keeping the ordering of the direct parents) 
+and monotonicity (preserving the ordering of the linearization). 
+Bad MRO do exists and actually
+Python 2.2 had a bad MRO breaking both local
+precedence ordering and monotonicity. 
+
+For what concerns local precedence ordering consider the following example:
+
+  >>> F=type('Food',(),{'remember2buy':'spam'})
+  >>> E=type('Eggs',(F,),{'remember2buy':'eggs'})
+  >>> G=type('GoodFood',(F,E),{}) # under Python 2.3 this is an error!
+
+Notice that despite the name, the MRO determines also the resolution
+order of the attributes, not only of the methods.  
+Here is the inheritance diagram:
+
+ ::
+
+                O
+                |
+   (buy spam)   F
+                | \
+                | E   (buy eggs)
+                | /
+                G
+
+         (buy eggs or spam ?)
+
+
+
+In this example, class ``G`` inherits from ``F`` and ``E``, with F
+*before* ``E``: therefore we would expect the attribute
+*G.remember2buy* to be inherited by *F.rembermer2buy* and not by
+*E.remember2buy*: nevertheless Python 2.2 gives
+
+  >>> G.remember2buy
+  'eggs'
+
+This is a breaking of local precedence ordering since the order in the
+local precedence list, i.e. the list of the parents of ``G``, is not
+preserved in the Python 2.2 linearization of ``G``:
+
+ ::
+
+  L[G,P22]= G E F object   # F *follows* E
+
+One could argue that the reason why ``F`` follows ``E`` in the Python 2.2
+linearization is that ``F`` is less specialized than ``E``, since ``F`` is the
+superclass of ``E``; nevertheless the breaking of local precedence ordering
+is quite non-intuitive and error prone.  This is particularly true since
+it is a different from old style classes:
+
+  >>> class F: remember2buy='spam'
+  >>> class E(F): remember2buy='eggs'
+  >>> class G(F,E): pass
+  >>> G.remember2buy
+  'spam'
+
+In this case the MRO is ``GFEF`` and the local precedence ordering is
+preserved.
+
+As a general rule, hierarchies such as the previous one should be
+avoided, since it is unclear if ``F`` should override ``E`` or viceversa.
+Python solves the ambiguity by raising an exception in the creation
+of class ``G``, effectively stopping the programmer from generating
+ambiguous hierarchies.  The reason for that is that the C3 algorithm
+fails when the merge
+
+ ::
+
+   merge(FO,EFO,FE)
+
+cannot be computed, because ``F`` is in the tail of ``EFO`` and ``E``
+is in the tail of ``FE``.
+
+The real solution is to design a non-ambiguous hierarchy, i.e. to
+derive ``G`` from ``E`` and ``F`` (the more specific first) and not
+from ``F`` and ``E``; in this case the MRO is ``GEF`` without any
+doubt.
+
+ ::
+
+                O
+                |
+                F (spam)
+              / |
+     (eggs)   E |
+              \ |
+                G
+                  (eggs, no doubt)
+
+
+Python, starting from release 2.3, forces the programmer to write good
+hierarchies (or, at least, less error-prone ones).
+
+On a related note, let me point out that the MRO implementation is
+smart enough to recognize obvious mistakes, as the duplication of
+classes in the list of parents:
+
+  >>> class A(object): pass
+  >>> class C(A,A): pass # error
+  Traceback (most recent call last):
+    File "<stdin>", line 1, in ?
+  TypeError: duplicate base class A
+
+Python 2.2 (both for classic classes and new style classes) in this
+situation, would not raise any exception.
+
+
+Examples of non-monotonic MROs
+-------------------------------------------
+
+Having discussed the issue of local precedence ordering, let me now
+consider the issue of monotonicity.  A MRO is monotonic when the
+following is true: *if C1 precedes C2 in the linarization of 
+C, then C1 precedes C2 in the linearization of every subclass of C*.
+Otherwise, the innocuous operation of deriving a new class could
+change the resolution order of methods, potentially introducing very
+subtle bugs.
+
+My goal in this paragraph is to show that neither the
+MRO for classic classes nor that for Python 2.2 new style classes is
+monotonic.
+
+To prove that the MRO for classic classes is non-monotonic is rather
+trivial, it is enough to look at the diamond diagram:
+
+ ::
+
+
+                   C
+                  / \
+                 /   \
+                A     B
+                 \   /
+                  \ /
+                   D
+
+One easily discerns the inconsistency:
+
+ ::
+
+  L[B,P21] = B C        # B precedes C : B's methods win
+  L[D,P21] = D A C B C  # B follows C  : C's methods win!
+
+On the other hand, there are no problems with the Python 2.2 and 2.3
+MROs, they give both
+
+ ::
+
+  L[D] = D A B C
+
+Guido points out in his essay about `new style classes`_ that the
+classic MRO is not so bad in practice, since one can typically avoids
+diamonds for classic classes.  But all new style classes inherit from
+``object``, therefore diamonds are unavoidable and inconsistencies
+shows up in every multiple inheritance graph.
+
+The MRO of Python 2.2 makes breaking monotonicity difficult, but not
+impossible.  The following example, originally provided by Samuele
+Pedroni, shows that the MRO of Python 2.2 is non-monotonic:
+
+  >>> class A(object): pass
+  >>> class B(object): pass
+  >>> class C(object): pass
+  >>> class D(object): pass
+  >>> class E(object): pass
+  >>> class K1(A,B,C): pass
+  >>> class K2(D,B,E): pass
+  >>> class K3(D,A):   pass
+  >>> class Z(K1,K2,K3): pass
+
+Here are the linearizations according to the C3 MRO (the reader should
+verify these linearizations as an exercise and draw the inheritance
+diagram ;-)
+
+ ::
+
+  L[A] = A O
+  L[B] = B O
+  L[C] = C O
+  L[D] = D O
+  L[E] = E O
+  L[K1]= K1 A B C O
+  L[K2]= K2 D B E O
+  L[K3]= K3 D A O
+  L[Z] = Z K1 K2 K3 D A B C E O
+
+Python 2.2 gives exactly the same linearizations for A, B, C, D, E, K1,
+K2 and K3, but a different linearization for Z:
+
+ ::
+
+  L[Z,P22] = Z K1 K3 A K2 D B C E O
+
+It is clear that this linearization is *wrong*, since A comes before D
+whereas in the linearization of K3 A comes *after* D. In other words, in
+K3 methods derived by D override methods derived by A, but in Z, which
+still is a subclass of K3, methods derived by A override methods derived
+by D!  This is a violation of monotonicity.  Moreover, the Python 2.2
+linearization of Z is also inconsistent with local precedence ordering,
+since the local precedence list of the class Z is [K1, K2, K3] (K2
+precedes K3), whereas in the linearization of Z K2 *follows* K3.  These
+problems explain why the 2.2 rule has been dismissed in favor of the C3
+rule.
+
+Method cooperation
+-----------------------------------------------
+
+Having discussed the MRO, we are now ready to discuss method
+cooperation. We will see that even if Python has a good MRO algorithm
+method cooperation is still tricky.
+
+Let me define as *method cooperation* the ability for a
+method to "magically" call its *next* method in the inheritance
+hierarchy.  Everybody is familiar with method cooperation in single
+inheritance situations, when a method can call its parent method.  In
+Python method cooperation is enabled by the ``super`` built-in.
+
+Here is an example using Python 3 syntax:
+
+.. code-block:: python
+
+  class A(object):
+      def __init__(self):
+          print('A.__init__')
+          super().__init__()
+
+It is clear that the ``__init__`` method in the class ``A`` is calling
+the ``__init__`` method of the parent of ``A``,
+i.e. ``object.__init__``.  However, things are
+much subtler in multiple inheritance situations and it is non-obvious to
+figure out which is the next method to call.
+For instance, suppose we define the following additional classes:
+ 
+.. code-block:: python
+
+  class B(object):
+      def __init__(self):
+          print('B.__init__')
+          super().__init__()
+  
+  class C(A, B):
+      def __init__(self):
+          print('C.__init__')
+          super().__init__()
+ 
+When I create an instance of ``A``, which method will be called by
+``super().__init__()``?  Notice that I am considering here generic
+instances of ``A``, not only direct instances: in particular, an
+instance of ``C`` is also an instance of ``A`` and instantiating ``C``
+will call ``super().__init__()`` in ``A.__init__`` at some point.
+
+In a single inheritance language there is an unique answer both for
+direct and indirect instances (``object`` is the super class of ``A``
+and ``object.__init__`` is the method called by
+``super().__init__()``).  On the other hand, in a multiple inheritance
+language there is no easy answer. It is better to say that there is no
+super class and it is impossible to know which method will be called
+by ``super().__init__()`` unless the subclass from wich ``super`` is 
+called is known. In particular, this is what happens when we instantiate ``C``:
+
+>>> c = C()
+C.__init__
+A.__init__
+B.__init__
+
+As you see the super call  in ``C`` dispatches to ``A.__init__`` and then
+the super call there dispatches to  ``B.__init__`` which in turns dispatches to
+``object.__init__``. The important point to stress is that
+*the same super call can dispatch to different methods*: 
+when ``super().__init__()`` is called directly by instantiating 
+``A`` it dispatches to ``object.__init__`` whereas when it is called indirectly
+by instantiating ``C`` it dispatches to ``B.__init__``. If somebody 
+extends the hierarchy, adds subclasses of ``A`` and instantiated them, 
+then the super call in ``A.__init__``
+can dispatch to an entirely different method: *the next method called 
+depends on the instance I am starting from*.
+
+If you are considering a direct instance of ``A``,
+``object`` is the only class the super call can dispatch to:
+
+.. code-block:: python
+
+ >>> A.mro()
+ [<class '__main__.A'>, <class 'object'>]
+
+If you are considering a direct instance of ``C``, ``super`` looks at the
+linearization of ``C``:
+
+.. code-block:: python
+
+ >>> C.mro()
+ [<class '__main__.C'>, <class '__main__.A'>, <class '__main__.B'>, <class 'object'>]
+
+A super call in ``C`` will look first at ``A``, then at ``B`` and finally at
+``object``. Finding out the linearization is non-trivial; just to give
+an example suppose we add to our hierarchy three classes ``D``, ``E`` and ``F``
+in this way:
+
+.. code-block:: python
+
+ >>> class D: pass
+ >>> class E(A, D): pass
+ >>> class F(E, C): pass
+
+At this point the MRO is FECADBO:
+
+ >>> for c in F.mro():
+ ...    print(c.__name__)
+ F
+ E
+ C
+ A
+ D
+ B
+ object
+
+As you see, for an instance of ``F`` a super call in ``A.__init__`` 
+will dispatch at ``D.__init__`` and not directly at ``B.__init__``! 
+
+The problem with incompatible signatures
+----------------------------------------------------
+
+I have just shown that one cannot tell in advance
+where the supercall will dispatch. In this example when you design the
+original hierarchy you will expect that
+``A.__init__`` will call ``B.__init__``, but adding classes (and such
+classes may be added by a third party) may change the method chain. In the
+example ``A.__init__`` when invoked by an ``F`` instance will call
+``D.__init__``. This is clearly dangerous: for instance, 
+if the behavior of your code depends on the ordering of the
+methods you may get in trouble. Things are worse if one of the methods
+in the cooperative chain does not have a compatible signature, since the
+chain will break.
+
+This problem is not theoretical and it happens even in very trivial
+hierarchies.  For instance, here is an example of incompatible
+signatures in the ``__init__`` method (this problem 
+affects even Python 2.6, not only Python 3.X):
+
+.. code-block:: python
+
+ class X(object):
+    def __init__(self, a):
+        super().__init__()
+
+ class Y(object):
+    def __init__(self, a):
+        super().__init__()
+
+ class Z(X, Y):
+    def __init__(self, a):
+        super().__init__(a)
+
+Here instantiating ``X`` and ``Y`` works fine, but as soon as you
+introduce ``Z`` you get in trouble since ``super().__init__(a)`` in
+``Z.__init__`` will call ``super().__init__()`` in ``X`` which in
+turns will call ``Y.__init__`` with no arguments, resulting in a
+``TypeError``!  In older Python versions (from 2.2 to 2.5) such
+problem can be avoided by leveraging on the fact that
+``object.__init__`` accepts any number of arguments (ignoring them), by
+replacing ``super().__init__()`` with ``super().__init__(a)``. In Python
+2.6+ instead there is no real solution for this problem, except avoiding
+``super`` in the constructor or avoiding multiple inheritance.
+
+In general if you want to support multiple inheritance you should *use
+super only when the methods in a cooperative chain 
+have consistent signature*: that means that you
+will not use super in ``__init__`` and ``__new__`` since likely your
+constructors will have custom arguments whereas ``object.__init__``
+and ``object.__new__`` have no arguments.  However, in practice, you may
+inherits from third party classes which do not obey this rule, or
+others could derive from your classes without following this rule and
+breakage may occur. For instance, I have used ``super`` for years in my
+``__init__`` methods and I never had problems because in older Python
+versions ``object.__init__`` accepted any number of arguments: but in Python 3
+all that code is fragile under multiple inheritance. I am left with 
+two choices: removing ``super`` or telling people that
+those classes are not intended to be used in multiple inheritance
+situations, i.e. the constructors will break if they do that.
+Nowadays I tend to favor the second choice. 
+
+Luckily, usually multiple inheritance is used with mixin classes, and mixins do
+not have constructors, so that in practice the problem is mitigated.
+
+Method cooperation done right
+----------------------------------------------------
+
+Even if ``super`` has its shortcomings, there are meaningful use cases for
+it, assuming you think multiple inheritance is a legitimate design technique.
+For instance, if you use metaclasses and you want to support multiple 
+inheritance, you *must* use ``super`` in the ``__new__`` and ``__init__``
+methods: there is no problem, since the constructor for 
+metaclasses has a fixed signature *(name, bases, dictionary)*. But metaclasses
+are extremely rare, so let me give a more meaningful example for an application
+programmer where a design bases on cooperative
+multiple inheritances could be reasonable.
+
+Suppose you have a bunch of ``Manager`` classes which
+share many common methods and which are intended to manage different resources,
+such as databases, FTP sites, etc. To be concrete, suppose there are
+two common methods: ``getinfolist`` which returns a list of strings
+describing the managed resorce (containing infos such as the URI, the
+tables in the database or the files in the site, etc.) and ``close``
+which closes the resource (the database connection or the FTP connection).
+You can model the hierarchy with a ``Manager`` abstract base class
+
+.. code-block:: python
+ 
+  class Manager(object):
+      def close(self):
+          pass
+      def getinfolist(self):
+          return []  
+
+and two concrete classes ``DbManager`` and ``FtpManager``:
+
+.. code-block:: python
+ 
+  class DbManager(Manager):
+      def __init__(self, dsn):
+          self.conn = DBConn(dsn)
+      def close(self):
+          super().close()
+          self.conn.close()
+      def getinfolist(self):
+          return super().getinfolist() + ['db info']
+  
+  class FtpManager(Manager):
+      def __init__(self, url):
+          self.ftp = FtpSite(url)
+      def close(self):
+          super().close()
+          self.ftp.close()
+      def getinfolist(self):
+          return super().getinfolist() + ['ftp info']
+  
+Now suppose you need to manage both a database and an FTP site:
+then you can define a ``MultiManager`` as follows:
+
+.. code-block:: python
+
+  class MultiManager(DbManager, FtpManager):
+      def __init__(self, dsn, url):
+          DbManager.__init__(dsn)
+          FtpManager.__init__(url)
+
+Everything works: calling ``MultiManager.close`` will in turn call 
+``DbManager.close`` and ``FtpManager.close``. There is no risk of
+running in trouble with the signature since the ``close`` and ``getinfolist``
+methods have all the same signature (actually they take no arguments at all).
+Notice also that I did not use ``super`` in the constructor. 
+You see that ``super`` is *essential* in this design: without it, 
+only ``DbManager.close`` would be called and your FTP connection would leak. 
+The ``getinfolist`` method works similarly: forgetting ``super`` would
+mean losing some information. An alternative not using ``super`` would require
+defining an explicit method ``close`` in the ``MultiManager``, calling
+``DbManager.close`` and ``FtpManager.close`` explicitly, and an explicit
+method ``getinfolist`` calling ```DbManager.getinfolist`` and 
+``FtpManager.getinfolist``:
+
+.. code-block:: python
+
+  def close(self):
+      DbManager.close(self)
+      FtpManager.close(self)
+  
+  def getinfolist(self):
+      return DbManager.getinfolist(self) + FtpManager.getinfolist(self)
+
+This would be less elegant but probably clearer and safer so you can always
+decide not to use ``super`` if you really hate it. However, if you have
+``N`` common methods, there is some boiler plate to write; moreover, every time
+you add a ``Manager`` class you must add it to the ``N`` common methods, which
+is ugly. Here ``N`` is just 2, so not using ``super`` may work well, 
+but in general it is clear that the cooperative approach is more effective.
+Actually, I strongly believe (and always had) that ``super`` and the
+MRO are the *right* way to do multiple inheritance: but I also believe
+that multiple inheritance itself is *wrong*. For instance, in the
+``MultiManager`` example I would not use multiple
+inheritance but composition and I would probably use a generalization
+such as the following:
+
+.. code-block:: python
+
+  class MyMultiManager(Manager):
+      def __init__(self, *managers):
+          self.managers = managers
+      def close(self):
+          for mngr in self.managers:
+              mngr.close()
+      def getinfolist(self):
+          return sum(mngr.getinfolist() for mngr in self.managers)
+
+There are languages that do not provide inheritance (even single
+inheritance!)  and are perfectly fine, so you should always question
+if you should use inheritance or not. There are always many options
+and the design space is rather large.  Personally, I always use
+``super`` but I use single-inheritance only, so that my cooperative
+hierarchies are trivial.
+
+The magic of super in Python 3
+----------------------------------------------------------------------
+
+I will devolve this last paragraph to point out a few subtilities of
+``super`` in Python 3. In Python 3.X ``super`` is smart enough to
+figure out the class it is invoked from and the first argument of the
+containing method. Actually it is so smart that it works also for
+inner classes and even if the first argument is not called
+``self``. However, you can always pass the current class and the first
+argument of the method explicitly: for instance our first example
+could be written
+
+.. code-block:: python
+
+ class A(object):
+     def __init__(self):
+         print('A.__init__')
+         super(A, self).__init__()
+
+In other words, ``super()`` is just a shortcut for 
+``super(A, self)``. In Python 3 the (bytecode) compiler is able
+to recognize that the supercall is performed inside the class ``A`` so
+that it inserts the reference to ``A`` automagically; moreover it inserts
+the reference to the first argument of the current method too. Typically
+the first argument of the current method is ``self``, but it may be
+``cls`` or any identifier: ``super`` will work fine in any case. 
+
+Since ``super()`` knows the class it is invoked from and the class of
+the original caller, it can walk the MRO correctly. Such information
+is stored in the attributes ``.__thisclass__`` and ``.__self_class__``
+and you may understand how it works from the following example:
+
+.. code-block:: python
+
+  class Mother(object):
+      def __init__(self):
+          sup = super()
+          print(sup.__thisclass__)
+          print(sup.__self_class__)
+          sup.__init__()
+  
+  class Child(Mother):
+      pass  
+
+.. code-block:: python
+
+ >>> child = Child()
+ <class '__main__.Mother'>
+ <class '__main__.Child'>
+
+Here ``.__self__class__`` is just the class of the first argument (``self``)
+but this is not always the case. The exception is the case of classmethods and
+staticmethods taking a class as first argument, such as ``__new__``.
+Specifically, ``super(cls, x)`` checks if ``x`` is an instance
+of ``cls`` and then sets ``.__self_class__`` to ``x.__class__``; otherwise
+(and that happens for classmethods and for ``__new__``) it checks if ``x`` 
+is a subclass of ``cls`` and then sets  ``.__self_class__`` to ``x`` directly.
+For instance, in the following example
+
+.. code-block:: python
+
+  class C0(object):
+      @classmethod
+      def c(cls):
+          print('called classmethod C0.c')
+  
+  class C1(C0):
+      @classmethod
+      def c(cls):
+          sup = super()
+          print('__thisclass__', sup.__thisclass__)
+          print('__selfclass__', sup.__self_class__)
+          sup.c()
+  
+  class C2(C1):
+      pass
+  
+the attribute ``.__self_class__`` is *not* the class of the first argument
+(which would be ``type`` the metaclass of all classes) but simply the first
+argument:
+
+.. code-block:: python
+
+ >>> C2.c()
+ __thisclass__ <class '__main__.C1'>
+ __selfclass__ <class '__main__.C2'>
+ called classmethod C0.c
+
+There more magic going on in Python 3 ``super`` than you may expect.
+For instance, this is a syntax that cannot work:
+
+.. code-block:: python
+
+  def __init__(self):
+      print('calling __init__')
+      super().__init__()
+  
+  class C(object):
+      __init__ = __init__
+  
+  if __name__ == '__main__':
+      c = C()
+  
+If you try to run this code you will get a
+``SystemError: super(): __class__ cell not found`` and the reason is
+obvious: since the ``__init__`` method is external to the class the
+compiler cannot infer to which class it will be attached at runtime.
+On the other hand, if you are completely explicit and you use the full
+syntax, by writing the external method as
+
+.. code-block:: python
+
+   def __init__(self):
+       print('calling __init__')
+       super(C, self).__init__()
+   
+everything will work because you are explicitly telling than the method
+will be attached to the class ``C``.
+
+I will close this section by noticing a wart of ``super`` in Python 3,
+pointed out by `Armin Ronacher`_ and others: the fact that ``super``
+should be a keyword but it is not. Therefore horrors like the
+following are possible:
+
+.. code-block:: python
+
+  def super():
+      print("I am evil, you are NOT calling the supermethod!")
+  
+  class C(object):
+      def __init__(self):
+          super().__init__()
+  
+  if __name__ == '__main__':
+      c = C() # prints "I am evil, you are NOT calling the supermethod!"
+
+Don't do that! Here the called ``__init__`` is the ``__init__`` method
+of the object ``None``!
+
+Of course, only an evil programmer would shadow ``super`` on purpose,
+but that may happen accidentally. Consider for instance this use case:
+you are refactoring an old code base written before the existence of
+``super`` and using ``from mod import *`` (this is ugly but we know
+that there are code bases written this way), with ``mod`` defining a
+function ``super`` which has nothing to do with the ``super``
+builtin. If in this code you replace ``Base.method(self, *args)`` with
+``super().method(*args)`` you will introduce a bug. This is not common
+(it never happened to me), but still it is bug that could not happen if
+``super`` were a keyword.
+
+Moreover, ``super`` is special and it will not work if
+you change its name as in this example:
+
+.. code-block:: python
+
+ # from http://lucumr.pocoo.org/2010/1/7/pros-and-cons-about-python-3
+ _super = super
+ class Foo(Bar):
+     def foo(self):
+         _super().foo()
+
+Here the bytecode compiler will not treat specially ``_super``, only
+``super``. It is unfortunate that we missed the opportunity to make ``super``
+a keyword in Python 3, without good reasons (Python 3 was expected 
+to break compatibility anyway).
+
+Resources
+---------
+
+The analysis of the MRO given in this chapter is based on a `classic essay`_ I
+wrote when Python 2.3 came out. You can find it on the Python web site.
+It also contains a Python implementation of the MRO algorithm for people
+who wants to see code. In turns my essay is based on
+`a paper about the Dylan language`_ and on a `thread on python-dev`_
+started by Samuele Pedroni.
+
+The part about super is based on a `recent blog
+post` of mine.
+There is plenty of material about super and multiple inheritance. You
+should probably start from `Super considered harmful`_ by James
+Knight. A lot of the issues with ``super``, especially in old versions
+of Python are covered in `Things to know about super`_. I did spent
+some time thinking about ways to avoid multiple inheritance; you may
+be interested in reading my series `Mixins considered harmful`_.
+
+..  _thread on python-dev: http://mail.python.org/pipermail/python-dev/2002-October/029035.html
+.. _a paper about the Dylan language: http://www.webcom.com/haahr/dylan/linearization-oopsla96.html
+.. _classic essay: http://www.python.org/download/releases/2.3/mro/
+.. _new style classes: http://www.python.org/download/releases/2.2.3/descrintro/
+.. _Super considered harmful: http://fuhm.net/super-harmful/
+.. _Menno Smits: http://freshfoo.com/blog/object__init__takes_no_parameters
+.. _Things to know about super: http://www.phyast.pitt.edu/~micheles/python/super.pdf
+.. _Mixins considered harmful: http://www.artima.com/weblogs/viewpost.jsp?thread=246341
+.. _Armin Ronacher: http://lucumr.pocoo.org/2008/4/30/how-super-in-python3-works-and-why-its-retarded
+.. _warts in Python 3: http://lucumr.pocoo.org/2010/1/7/pros-and-cons-about-python-3
+.. _recent blog post: http://www.artima.com/weblogs/viewpost.jsp?thread=281127
author	micheles <micheles@micheles-mac>	2010-03-28 17:48:06 +0200
committer	micheles <micheles@micheles-mac>	2010-03-28 17:48:06 +0200
commit	0a1cac61367eeb4eae8829655cf361a10d6c5418 (patch)
tree	3f5b311515634ccb7976baefeb5e9c28edaa61eb /pypers
parent	ecf277c15bda9e6284c2fff02ae5917e7ce159f2 (diff)
download	micheles-0a1cac61367eeb4eae8829655cf361a10d6c5418.tar.gz