---------------------------------------------------------------------------
CHAPTER 4. THE SECRETS OF MULTIPLE INHERITANCE
---------------------------------------------------------------------------

Introduction
---------------------------------------------------------------------------

In my opinion, Multiple Inheritance (often abbreviated as MI) is the most
advanced topic in Object Oriented Programming. It is also true that
Multiple Inheritance is one of the most difficult features to implement
in an Object Oriented Programming language. Some languages even avoid it
by design. This is for instance the case of Java, which avoided
MI having seen its implementation in C++ (which is not for the faint of
heart ;-). As for the so-called scripting languages, of which
the most famous are Perl, Python and Ruby (in this order, even if
the right order would be Python, Ruby and Perl), Ruby does without it
(relying on mixins instead), whereas Python supports Multiple Inheritance
directly.
The fact that Multiple Inheritance can be hairy does not mean that it
is *always* hairy, however. Multiple Inheritance is used with success
in Lisp derived languages (CLOS, Dylan, ...), for instance.
The aim of this chapter is to show how much the
Python support for MI has improved in the most recent versions (2.2 and 2.3).
The message is the following: if Python 1.5 had basic support for
MI (basic but nevertheless dynamic and with nice features),
Python 2.2 has *greatly* improved that support and with the
change of the Method Resolution Order in Python 2.3, we may say
that support for MI is now *excellent*.

I strongly encourage Python programmers to use MI a lot: it
allows even stronger reuse of code than single inheritance does.

Why Multiple Inheritance ?
--------------------------

I think the best way to explain the power of MI is through a simple example.

#singleton, noninstantiable


The Python 2.3 Method Resolution Order
--------------------------------------

                *Felix qui potuit rerum cognoscere causas* -- Virgilius

Everything started with a post by Samuele Pedroni to the Python development
mailing list [[#]_]. In his post, Samuele showed that the Python 2.2 method
resolution order is not monotonic and he proposed to replace it with the
C3 method resolution order. Guido agreed with his arguments and therefore
Python 2.3 now uses C3. The C3 method itself has nothing to do with Python,
since it was invented by people working on Dylan and it is described in
a paper intended for Lispers [[#]_]. The present paper gives a
(hopefully) readable discussion of the C3 algorithm for Pythonistas
who want to understand the reasons for the change.

First of all, let me point out that what I am going to say only applies to
the *new style classes* introduced in Python 2.2: *classic classes* maintain
their old method resolution order, depth first and then left to right.
Therefore, no old code based on classic classes breaks; and even
if in principle code using Python 2.2 new style classes could break,
in practice the cases in which the C3 resolution order
differs from the Python 2.2 method resolution order are so rare
that no real breakage of Python 2.2 code is expected.
Therefore:

   *don't be scared !*

Moreover, unless you make strong use of multiple inheritance and you have
non-trivial hierarchies, you don't need to understand the C3 algorithm, 
and you can easily skip this paper. 
On the other hand, if you really want to know how multiple
inheritance works, then this paper is for you. The good news is
that things are not as complicated as you might expect.

Let me begin with some basic definitions.

1) Given a class in a complicated multiple inheritance hierarchy, it 
   is a non-trivial task to specify the order in which methods are 
   overridden, i.e. to specify the order of the ancestors of the class.
 
2) The list of the ancestors of a class C, including
   the class itself, ordered from the nearest ancestor to
   the furthest, is called the class precedence list or the 
   *linearization* of the class C.

3) The *Method Resolution Order* (MRO) is the set of rules used to
   construct the linearization. In the Python literature,
   the idiom "the MRO of C" is also used as a synonym for the
   linearization of the class C.

4) In the case of a 
   single inheritance hierarchy in which C is a subclass of C1 
   which in turn is a subclass of C2, the linearization of C 
   is the list [C, C1 , C2]. In multiple inheritance hierarchies,
   the construction of the linearization is cumbersome, since it
   has to respect the essential constraints of *local precedence
   ordering* and *monotonicity*.

5) I will discuss the local precedence ordering later, but
   I can give the definition of monotonicity here. An MRO is monotonic
   when the following is true: *if C1 precedes C2 in the linearization of C,
   then C1 precedes C2 in the linearization of any subclass of C*. Otherwise,
   the innocuous operation of deriving a new class could change the
   resolution order of methods, potentially introducing very subtle bugs.
   Examples where this happens will be shown later.

6) Not all classes admit a linearization. There are cases, in 
   complicated hierarchies, where it is not possible to derive a
   class such that its linearization respects all the good properties.

Here I give an example of this situation. Consider the hierarchy ::

     O = object
     class X(O): pass
     class Y(O): pass
     class A(X,Y): pass
     class B(Y,X): pass

which can be represented with the following inheritance graph,
where I have denoted with O the class object, which is the beginning
of any hierarchy for new style classes:

 ::

          -----------
         |           |
         |    O      |
         |  /   \    |
          - X    Y  /
            |  / | /
            | /  |/
            A    B
            \   /
              ?

In this case, it is not possible to derive a new class C from A and B,
since X precedes Y in A, but Y precedes X in B, therefore the method
resolution order would be ambiguous in C. 

Python 2.3 raises an exception in this situation (TypeError: 
MRO conflict among bases Y, X), preventing the naive programmer from 
creating ambiguous hierarchies. Python 2.2 instead does not raise 
an exception, but chooses an *ad hoc* ordering (CABXYO in this case).
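
The conflict is easy to reproduce. The following is only a sketch, assuming
a Python 3 interpreter (where every class is new style, C3 is always used,
and the wording of the error message differs from the Python 2.3 message
quoted above); the helper name ``try_to_create_C`` is hypothetical:

```python
# Sketch for Python 3: all classes are new style and use the C3 MRO.
O = object
class X(O): pass
class Y(O): pass
class A(X, Y): pass   # X precedes Y in A
class B(Y, X): pass   # Y precedes X in B

def try_to_create_C():
    """Return the TypeError raised when deriving C from A and B, or None."""
    try:
        class C(A, B): pass
    except TypeError as exc:
        return exc
    return None

error = try_to_create_C()
print(type(error).__name__, error)
```

The class statement itself fails, so no ambiguous class C ever comes into
existence.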


The C3 Method Resolution Order
------------------------------
 
Let me introduce a few simple notations which will be useful for the
following discussion. I use the shortcut notation 

  C1 C2 ... CN 

to indicate the list of classes [C1, C2, ... , CN]. 

The *head* of the list is its first element: 

  head = C1 

whereas the *tail* is the rest of the list:

  tail = C2 ... CN.

I also use the notation

  C + (C1 C2 ...  CN) = C C1 C2 ... CN

to denote the sum of the lists [C] + [C1, C2, ... ,CN].

Now I can explain how the MRO works in Python 2.3, i.e. I give
the rules to compute the linearization of any given class C.

Rule 0: 
  If C is the "object" class, which has no parents, its linearization 
  coincides with itself:

     L(O) = O

Rule 1: 
  If C = C(B) is a class with a unique parent B (single inheritance),
  its linearization is simply the sum of the class with the 
  linearization of the parent:

     L(C(B)) = C + L(B)

Rule 2:
  In the case of multiple inheritance things are more cumbersome
  and one has to introduce the concept of *merge* of lists.
  Consider for instance the case of two parents B1 and B2.
  Then the linearization of C is given by C plus the merge
  of the three sequences:

   - linearization of the first parent;
   - linearization of the second parent;
   - list of the two parents

In symbolic notation,

     L(C(B1,B2)) = C + merge(L(B1), L(B2), B1 B2)

How is the merge computed? The rule is the following:

  *take the head of the first sequence, i.e. L(B1)[0]; if this 
  head is not in the tail of any of the other sequences (a "good head"),
  then add it to the linearization of C and remove it from the sequences
  in the merge, otherwise look at the head 
  of the next sequence and take it, if it is a good head. 
  Then repeat the operation, starting again from the first sequence,
  until all the classes are removed. If at some point there 
  are no good heads, then it is impossible to construct the merge. 
  In this case Python 2.3 will raise an error and will refuse 
  to create the class C.*

This prescription ensures that the merge operation *preserves* the 
ordering, if the ordering can be preserved. On the other hand,
if the order cannot be preserved (as in the example of 
serious order disagreement discussed before) then the merge cannot be
computed.

The generalization to the case of a class with
three or more parents is obvious:

   L(C(B1, ... , BN)) = C + merge(L(B1), ... ,L(BN), B1 ... BN) 

The merge rule is the heart of the C3 algorithm: once you have 
understood that, you have understood the C3 Method Resolution Order. 
But I don't expect that you can understand the rule without a couple 
of examples ;-)
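
For readers who prefer code to prose, the merge rule can also be written
down in a few lines. The snippet below is a minimal sketch, assuming
Python 3 (not the Python 2.2/2.3 interpreters discussed in this paper);
the names ``merge`` and ``linearize`` are illustrative and say nothing
about how the interpreter implements C3 internally:

```python
# A sketch of the merge rule for Python 3; `merge` and `linearize` are
# illustrative names, not part of the standard library.

def merge(seqs):
    """Merge several linearizations following the C3 rule."""
    seqs = [list(s) for s in seqs]        # work on copies
    result = []
    while True:
        seqs = [s for s in seqs if s]     # drop exhausted sequences
        if not seqs:
            return result
        for seq in seqs:                  # always restart from the first one
            head = seq[0]
            # A good head does not appear in the tail of any sequence.
            if not any(head in s[1:] for s in seqs):
                break
        else:
            raise TypeError("no good head: inconsistent hierarchy")
        result.append(head)
        for s in seqs:                    # remove the chosen head everywhere
            if s[0] == head:
                del s[0]

def linearize(cls):
    """L(C) = C + merge(L(B1), ..., L(BN), B1 ... BN)."""
    bases = list(cls.__bases__)
    return [cls] + merge([linearize(b) for b in bases] + [bases])

class D: pass
class E: pass
class B(D, E): pass

print([c.__name__ for c in linearize(B)])   # ['B', 'D', 'E', 'object']
```

Since Python 3 uses C3 natively, the result of ``linearize(B)`` can be
compared with ``list(B.__mro__)`` as a sanity check.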

Examples
--------

First example. Consider the following hierarchy:

>>> O = object
>>> class F(O): pass
>>> class E(O): pass
>>> class D(O): pass
>>> class C(D,F): pass
>>> class B(D,E): pass
>>> class A(B,C): pass

In this case the inheritance graph can be drawn as ::


                           6
                          ---
 Level 3                 | O |                  (more general)
                       /  ---  \
                      /    |    \                      |
                     /     |     \                     |
                    /      |      \                    |
                   ---    ---    ---                   |
 Level 2        3 | D | 4| E |  | F | 5                |
                   ---    ---    ---                   |
                    \  \ _ /       |                   |
                     \    / \ _    |                   |
                      \  /      \  |                   |
                       ---      ---                    |
 Level 1            1 | B |    | C | 2                 |
                       ---      ---                    |
                         \      /                      |
                          \    /                      \ /
                            ---
 Level 0                 0 | A |                (more specialized)
                            ---


The linearizations of O,D,E and F are trivial::

  L(O) = O
  L(D) = D O
  L(E) = E O
  L(F) = F O

The linearization of B can be computed as 

  L(B) = B + merge(DO, EO, DE) 

We see that D is a good head, therefore we take it and we are reduced
to computing merge(O,EO,E). Now O is not a good head, since it is in 
the tail of the sequence EO. In this case the rule says that we have to
skip to the next sequence. Then we see that E is a good head; we take it 
and we are reduced to computing merge(O,O) which gives O. Therefore

  L(B) =  B D E O

With the same argument one finds::

  L(C) = C + merge(DO,FO,DF) 
       = C + D + merge(O,FO,F)
       = C + D + F + merge(O,O)
       = C D F O

Now we can compute ::

  L(A) = A + merge(BDEO,CDFO,BC)
       = A + B + merge(DEO,CDFO,C)
       = A + B + C + merge(DEO,DFO)
       = A + B + C + D + merge(EO,FO)
       = A + B + C + D + E + merge(O,FO)
       = A + B + C + D + E + F + merge(O,O) 
       = A B C D E F O

In this example, the linearization is ordered in a pretty nice way
according to the inheritance level, in the sense that lower levels
(i.e. more specialized classes) have higher precedence (see the 
inheritance graph). However, this is not the general case. 
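
The hand computation can be checked against the interpreter. This is a
sketch assuming Python 3, which uses C3 for every class; the helper
``names`` is a hypothetical convenience that prints ``object`` as O, to
match the notation of this paper:

```python
# Python 3 sketch: compare the hand-computed linearizations with __mro__.
O = object
class F(O): pass
class E(O): pass
class D(O): pass
class C(D, F): pass
class B(D, E): pass
class A(B, C): pass

def names(cls):
    """Return the linearization of cls as a string of class names."""
    return " ".join("O" if c is object else c.__name__ for c in cls.__mro__)

print(names(B))  # B D E O
print(names(C))  # C D F O
print(names(A))  # A B C D E F O
```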

I leave it as an exercise for the reader to compute the linearization 
for my second example:

>>> O = object
>>> class F(O): pass
>>> class E(O): pass
>>> class D(O): pass
>>> class C(D,F): pass
>>> class B(E,D): pass
>>> class A(B,C): pass

The only difference from the previous example is the change
B(D,E) --> B(E,D); however even such a small modification
completely changes the ordering of the hierarchy::
  

                            6
                           ---
 Level 3                  | O |
                        /  ---  \
                       /    |    \
                      /     |     \
                     /      |      \
                   ---     ---    ---
 Level 2        2 | E | 4 | D |  | F | 5 
                   ---     ---    ---
                    \      / \     /
                     \    /   \   / 
                      \  /     \ /  
                       ---     ---
 Level 1            1 | B |   | C | 3
                       ---     ---
                        \       /
                         \     /
                           ---
 Level 0                0 | A |
                           ---


Notice that the class E, which is in the second level of the hierarchy,
precedes the class C, which is in the first level of the hierarchy, i.e.
E is more specialized than C, even though it is in a higher level.

The lazy programmer can obtain the MRO directly from Python 2.2, since
in this case it coincides with the Python 2.3 linearization. It is
enough to invoke the .mro() method of class A:

>>> A.mro()
(<class '__main__.A'>, <class '__main__.B'>, <class '__main__.E'>, 
<class '__main__.C'>, <class '__main__.D'>, <class '__main__.F'>, 
<type 'object'>)

Finally, let me consider the example discussed in the first section,
involving a serious order disagreement. In this case, it is
straightforward to compute the linearizations of the classes O, X, Y, A and B::

  L(O) = O
  L(X) = X O
  L(Y) = Y O
  L(A) = A X Y O
  L(B) = B Y X O

However, it is impossible to compute the linearization for a class
C that inherits from A and B::

  L(C) = C + merge(AXYO, BYXO, AB)
       = C + A + merge(XYO, BYXO, B)
       = C + A + B + merge(XYO, YXO)

At this point we cannot merge the lists XYO and YXO, since X is
in the tail of YXO whereas Y is in the tail of XYO: therefore there
are no good heads and the C3 algorithm stops. Python 2.3 raises an 
error and refuses to create the class C.



Bad Method Resolution Orders
----------------------------

An MRO is *bad* when it breaks such fundamental properties as local
precedence ordering and/or monotonicity. In this section, I will show
that both the MRO for classic classes and the MRO for new style classes 
in Python 2.2 are bad.

It is easier to start with the local precedence ordering.
Consider the following example:

>>> F=type('Food',(),{'remember2buy':'spam'})
>>> E=type('Eggs',(F,),{'remember2buy':'eggs'})
>>> G=type('GoodFood',(F,E),{})

We see that the class G inherits from F and E, with F *before* E:
therefore we would expect the attribute *G.remember2buy* to be
inherited from *F.remember2buy* and not from
*E.remember2buy*: nevertheless Python 2.2 gives

>>> G.remember2buy
'eggs' 

This is a violation of local precedence ordering since the order in the
local precedence list, i.e. the list of the parents of G, is not
preserved in the Python 2.2 linearization of G:

  L(G,P22)= G E F object   # F *follows* E

One could argue that the reason why F follows E in the Python 2.2
linearization is that F is less specialized than E, since F is
the superclass of E; nevertheless the breaking of local
precedence ordering is quite non-intuitive and error prone.
This is particularly true since there is a difference with old style
classes:

>>> class F: remember2buy='spam'
>>> class E(F): remember2buy='eggs'
>>> class G(F,E): pass
>>> G.remember2buy
'spam'

In this case the MRO is GFEF and the local precedence ordering is
preserved.


As a general rule, hierarchies such as

 ::

                O
                |
                F
                | \
                | E
                | /
                G 

should be avoided, since it is unclear if F should override E or
vice versa. Python 2.3 solves the ambiguity by raising an exception
in the creation of class G, effectively stopping the programmer
from generating ambiguous hierarchies.
The reason for that is that the C3 algorithm fails, since the
merge

   merge(FO,EFO,FE)

cannot be computed, because F is in the tail of EFO and E is in the tail
of FE.

The real solution is to design a non-ambiguous hierarchy, i.e. to derive
G from E and F (the more specific first) and not from F and E; in this 
case the MRO is GEF without any doubt. 
Python 2.3 forces the programmer to write good hierarchies (or,
at least, less error-prone ones). 
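
Both behaviours can be observed directly. The following is a sketch
assuming Python 3, which applies the same C3 rule as Python 2.3 (the
error message is worded differently); the names ``GoodFood``,
``BetterFood``, ``created`` and ``mro_names`` are chosen only for this
illustration:

```python
# Python 3 sketch of the Food/Eggs hierarchy under the C3 rule.
class Food:
    remember2buy = 'spam'

class Eggs(Food):
    remember2buy = 'eggs'

# Listing the less specific class first cannot be linearized by C3.
try:
    GoodFood = type('GoodFood', (Food, Eggs), {})
    created = True
except TypeError:
    created = False

# Listing the more specific class first works.
BetterFood = type('BetterFood', (Eggs, Food), {})
mro_names = [c.__name__ for c in BetterFood.__mro__]

print(created)                  # False
print(mro_names)                # ['BetterFood', 'Eggs', 'Food', 'object']
print(BetterFood.remember2buy)  # eggs
```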

On a related note, let me point out that the Python 2.3 algorithm
is smart enough to recognize obvious mistakes, such as the duplication
of classes in the list of parents:

>>> class A(object): pass
>>> class C(A,A): pass
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: duplicate base class A

Python 2.2 (both for classic classes and new style classes) in this
situation would not raise any error message.
 

Finally, I would like to point out two lessons we have
learned from this example:

1. despite the name, the MRO determines the resolution order 
   of the attributes, not only of the methods;

2. the default food for Pythonistas is spam ! (but you already knew that ;-)


Having discussed the issue with the local precedence ordering, let
me now consider the issue of monotonicity. My goal is to show that 
both the MRO for classic classes and 
for new style classes for Python 2.2 are not monotonic.

Proving that the MRO for classic classes is non-monotonic 
is rather trivial: it is enough to look at the diamond diagram:

 ::


                   C
                  / \
                 /   \
                A     B
                 \   /
                  \ /
                   D

One easily sees the inconsistency::

  L(B,P21) = B C        # B precedes C : B's methods win
  L(D,P21) = D A C B C  # B follows C  : C's methods win!

On the other hand, there are no problems with the Python 2.2 and 2.3 MROs, 
which both give

  L(D) = D A B C
                    
Guido points out in his essay [[#]_] that the classic MRO is not so bad in
practice, since one can typically avoid diamonds for classic classes.
But all new style classes inherit from object, therefore diamonds are
unavoidable and inconsistencies show up in every multiple inheritance
graph.
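
The depth first, left to right rule of classic classes is simple enough
to emulate. The following is a sketch assuming Python 3, where classic
classes no longer exist: the hypothetical function ``classic_order``
walks ``__bases__`` by hand and skips ``object``, to mimic the fact that
classic classes had no common root:

```python
# Python 3 sketch emulating the classic (pre-2.2) lookup order.
def classic_order(cls):
    """Depth first, left to right search order, duplicates included."""
    order = [cls]
    for base in cls.__bases__:
        if base is not object:      # classic classes had no common root
            order.extend(classic_order(base))
    return order

class C: pass
class A(C): pass
class B(C): pass
class D(A, B): pass

print([c.__name__ for c in classic_order(B)])  # ['B', 'C']
print([c.__name__ for c in classic_order(D)])  # ['D', 'A', 'C', 'B', 'C']
print([c.__name__ for c in D.__mro__])         # ['D', 'A', 'B', 'C', 'object']
```

In the emulated classic order for D the first occurrence of C comes
before B, which is exactly the inconsistency described above; the last
line shows the C3 order for comparison.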

The MRO of Python 2.2 makes breaking monotonicity difficult, but
not impossible. The following example, originally provided by Samuele Pedroni,
shows that the MRO of Python 2.2 is non-monotonic::

  class A(object): pass
  class B(object): pass
  class C(object): pass
  class D(object): pass
  class E(object): pass
  class K1(A,B,C): pass
  class K2(D,B,E): pass
  class K3(D,A):   pass
  class Z(K1,K2,K3): pass

Here are the linearizations according to the C3 MRO (the reader
should verify these linearizations as an exercise and draw the
inheritance diagram ;-)::

  L(A) = A O
  L(B) = B O
  L(C) = C O
  L(D) = D O
  L(E) = E O
  L(K1)= K1 A B C O 
  L(K2)= K2 D B E O
  L(K3)= K3 D A O
  L(Z) = Z K1 K2 K3 D A B C E O 
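
Readers who want to check their answers can ask an interpreter that uses
C3. This is a sketch assuming Python 3 (so it says nothing about the
Python 2.2 ordering); the helper ``names`` is hypothetical and prints
``object`` as O:

```python
# Python 3 sketch: print the C3 linearizations of the example hierarchy.
class A: pass
class B: pass
class C: pass
class D: pass
class E: pass
class K1(A, B, C): pass
class K2(D, B, E): pass
class K3(D, A): pass
class Z(K1, K2, K3): pass

def names(cls):
    """Return the linearization of cls as a string of class names."""
    return " ".join("O" if c is object else c.__name__ for c in cls.__mro__)

for cls in (K1, K2, K3, Z):
    print("L(%s) = %s" % (cls.__name__, names(cls)))
```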

Python 2.2 gives exactly the same linearizations for A, B, C, D, E, K1, K2
and K3, but a different linearization for Z:

  L(Z,P22) = Z K1 K3 A K2 D B C E O

It is clear that this linearization is *wrong*, since A comes before D
whereas in the linearization of K3 A comes *after* D. In other words,
in K3 methods derived from D override methods derived from A, but
in Z, which still is a subclass of K3, methods derived from A override 
methods derived from D!
This is again a violation of monotonicity, which cannot be allowed.
On top of that, the Python 2.2 linearization of Z is also inconsistent with
local precedence ordering, since the local
precedence list of the class Z is [K1, K2, K3] where K2 precedes K3,
whereas in the linearization of Z we have that K2 *follows* K3. 
These problems explain why the 2.2 rule has been dropped in favor of the C3 
rule.
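
The monotonicity argument can be made mechanical. The sketch below
(Python 3) simply takes the linearizations quoted in this section as
lists of names and checks whether the order of L(K3) survives in each
linearization of Z; the function name ``preserves_order`` is
hypothetical, and the Python 2.2 list is copied from the text, not
computed:

```python
# Sketch: check whether a linearization keeps the order of a parent's one.
def preserves_order(linearization, parent_linearization):
    """True if the parent's classes keep their relative order."""
    positions = [linearization.index(c) for c in parent_linearization]
    return positions == sorted(positions)

l_k3 = "K3 D A O".split()
l_z_c3 = "Z K1 K2 K3 D A B C E O".split()    # C3 linearization of Z
l_z_p22 = "Z K1 K3 A K2 D B C E O".split()   # Python 2.2, as quoted above

print(preserves_order(l_z_c3, l_k3))   # True
print(preserves_order(l_z_p22, l_k3))  # False
```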

Here I give a short Python 2.2 script that allows anybody to compute the 
2.3 MRO without risk to their brain. 
Simply change the last line to play with the various examples 
I have discussed in this paper.

 ::

  #<c3.py>
 
  """C3 algorithm by Samuele Pedroni (with readability enhanced by me).
  Remark: it only works with Python 2.2 (not 2.3+ nor 2.1-).""" 
 
  class __metaclass__(type): 
      "All classes are metamagically modified to be nicely printed" 
      __repr__ = lambda cls: cls.__name__
  
  class ex_2:
      "Serious order disagreement" #From Guido
      class O: pass
      class X(O): pass
      class Y(O): pass
      class A(X,Y): pass
      class B(Y,X): pass
      try:
          class Z(A,B): pass #creates Z(A,B)
      except TypeError:
          pass # Z(A,B) cannot be created in Python 2.3+
      
  class ex_5:
      "My first example"
      class O: pass
      class F(O): pass
      class E(O): pass
      class D(O): pass
      class C(D,F): pass
      class B(D,E): pass
      class A(B,C): pass
  
  class ex_6:
      "My second example"
      class O: pass
      class F(O): pass
      class E(O): pass
      class D(O): pass
      class C(D,F): pass
      class B(E,D): pass
      class A(B,C): pass
  
  class ex_9:
      "Difference between Python 2.2 MRO and C3" #From Samuele
      class O: pass
      class A(O): pass
      class B(O): pass
      class C(O): pass
      class D(O): pass
      class E(O): pass
      class K1(A,B,C): pass
      class K2(D,B,E): pass
      class K3(D,A): pass
      class Z(K1,K2,K3): pass
  
  def merge(seqs):
      print '\n\nCPL[%s]=%s' % (seqs[0][0],seqs),
      res = []; i=0
      while 1:
        nonemptyseqs=[seq for seq in seqs if seq]
        if not nonemptyseqs: return res
        i+=1; print '\n',i,'round: candidates...',
        for seq in nonemptyseqs: # find merge candidates among seq heads
            cand = seq[0]; print ' ',cand,
            nothead=[s for s in nonemptyseqs if cand in s[1:]]
            if nothead: cand=None #reject candidate
            else: break
        if not cand: raise "Inconsistent hierarchy"
        res.append(cand)
        for seq in nonemptyseqs: # remove cand
            if seq[0] == cand: del seq[0]
  
  def mro(C):
      "Compute the class precedence list (mro) according to C3"
      return merge([[C]]+map(mro,C.__bases__)+[list(C.__bases__)])
  
  def print_mro(C):
      print '\nMRO[%s]=%s' % (C,mro(C))
      print '\nP22 MRO[%s]=%s' % (C,C.mro())
  
  print_mro(ex_9.Z)
  
  #</c3.py>
 
 
Resources
---------

.. [#] The thread on python-dev started by Samuele Pedroni:
       http://mail.python.org/pipermail/python-dev/2002-October/029035.html

.. [#] The paper *A Monotonic Superclass Linearization for Dylan*:
       http://www.webcom.com/haahr/dylan/linearization-oopsla96.html

.. [#] Guido van Rossum's essay, *Unifying types and classes in Python 2.2*:
       http://www.python.org/2.2.2/descrintro.html

.. [#] The (in)famous book on metaclasses, *Putting Metaclasses to Work*:
       Ira R. Forman, Scott Danforth, Addison-Wesley 1999.