summaryrefslogtreecommitdiff
path: root/pypers/meta/meta1.txt
blob: f2d041f1ad643ddc9bc2beab85e4497358f8a542 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
Metaclass programming in Python
--------------------------------

.. contents:

Pushing object-oriented programming to the next level

Level: Introductory

David Mertz, Ph.D. (mertz@gnosis.cx), Developer, Gnosis Software, Inc.
Michele Simionato, Ph.D. (mis6+@pitt.edu), Physicist, University of Pittsburgh

February 26, 2003

Most readers are already familiar with the concepts of object-oriented 
programming: inheritance, encapsulation, polymorphism. But the creation 
of objects of a given class, with certain parents, is usually thought of 
as a "just so" operation. It turns out that a number of new programming 
constructs become either easier, or possible at all, when you can 
customize the process of object creation. Metaclasses enable certain 
types of "aspect-oriented programming," for example, you can enhance 
classes with features like tracing capabilities, object persistence, 
exception logging, and more.

Review of object-oriented programming Let's start with a 30-second
review of just what OOP is. In an object-oriented programming
language, you can define classes, whose purpose is to bundle together
related data and behaviors. These classes can inherit some or all of
their qualities from their parents, but they can also define
attributes (data) or methods (behaviors) of their own. At the end of
the process, classes generally act as templates for the creation of
instances (at times also called simply objects). Different instances
of the same class will typically have different data, but it will come
in the same shape -- for example, the Employee objects bob and jane
both have a .salary and a .room_number, but not the same room and
salary as each other.

Some OOP languages, including Python, allow for objects to be
introspective (also called reflective). That is, an introspective
object is able to describe itself: What class does the instance belong
to? What ancestors does that class have? What methods and attributes
are available to the object? Introspection lets a function or method
that handles objects make decisions based on what kind of object it is
passed. Even without introspection, functions frequently branch based
on instance data -- for example, the route to jane.room_number differs
from that to bob.room_number because they are in different rooms. With
introspection, you can also safely calculate the bonus jane gets,
while skipping the calculation for bob, for example, because jane has
a .profit_share attribute, or because bob is an instance of the
subclass Hourly(Employee).

A metaprogramming rejoinder The basic OOP system sketched above is
quite powerful. But there is one element brushed over in the
description: in Python (and other languages), classes are themselves
objects that can be passed around and introspected. Since objects, as
stated, are produced using classes as templates, what acts as a
template for producing classes? The answer, of course, is metaclasses.

Python has always had metaclasses. But the machinery involved in
metaclasses became much better exposed with Python 2.2. Specifically,
with version 2.2, Python stopped being a language with just one
special (mostly hidden) metaclass that created every class object. Now
programmers can subclass the aboriginal metaclass type and even
dynamically generate classes with varying metaclasses. Of course, just
because you can manipulate metaclasses in Python 2.2, that does not
explain why you might want to.

Moreover, you do not need to use custom metaclasses to manipulate the
production of classes. A slightly less brain-melting concept is a
class factory: An ordinary function can return a class that was
dynamically created within the function body. In traditional Python
syntax, you can write:

        
Python 1.5.2 (#0, Jun 27 1999, 11:23:01) [...]
Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
>>> def class_with_method(func):
...     class klass: pass
...     setattr(klass, func.__name__, func)
...     return klass
...
>>> def say_foo(self): print 'foo'
...
>>> Foo = class_with_method(say_foo)
>>> foo = Foo()
>>> foo.say_foo()
foo

The factory function class_with_method() dynamically creates and
returns a class that contains the method/function passed into the
factory. The class itself is manipulated within the function body
before being returned. The new module provides a more concise
spelling, but without the same options for custom code within the body
of the class factory, for example:


        
>>> from new import classobj
>>> Foo2 = classobj('Foo2',(Foo,),{'bar':lambda self:'bar'})
>>> Foo2().bar()
'bar'
>>> Foo2().say_foo()
foo

In all these cases, the behaviors of the class (Foo, Foo2) are not
directly written as code, but are instead created by calling functions
at runtime, with dynamic arguments. And it should be emphasized that
it is not merely the instances that are so dynamically created, but
the classes themselves.

Metaclasses: a solution looking for a problem?  

Metaclasses are deeper
magic than 99% of users should ever worry about. If you wonder whether
you need them, you don't (the people who actually need them know with
certainty that they need them, and don't need an explanation about
why). -- Python Guru Tim Peters

Methods (of classes), like plain functions, can return objects. So in
that sense it is obvious that class factories can be classes just as
easily as they can be functions. In particular, Python 2.2+ provides a
special class called type that is just such a class factory. Of
course, readers will recognize type() as a less ambitious built-in
function of older Python versions -- fortunately, the behaviors of the
old type() function are maintained by the type class (in other words,
type(obj) returns the type/class of the object obj). The new type
class works as a class factory in just the same way that the function
new.classobj long has:


        
>>> X = type('X',(),{'foo':lambda self:'foo'})
>>> X, X().foo()
(<class '__main__.X'>, 'foo')

But since type is now a (meta)class, you are free to subclass it:


        
>>> class ChattyType(type):
...     def __new__(cls, name, bases, dct):
...         print "Allocating memory for class", name
...         return type.__new__(cls, name, bases, dct)
...     def __init__(cls, name, bases, dct):
...         print "Init'ing (configuring) class", name
...         super(ChattyType, cls).__init__(name, bases, dct)
...
>>> X = ChattyType('X',(),{'foo':lambda self:'foo'})
Allocating memory for class X
Init'ing (configuring) class X
>>> X, X().foo()
(<class '__main__.X'>, 'foo')

The magic methods .__new__() and .__init__() are special, but in
conceptually the same way they are for any other class. The
.__init__() method lets you configure the created object; the
.__new__() method lets you customize its allocation. The latter, of
course, is not widely used, but exists for every Python 2.2 new-style
class (usually inherited but not overridden).

There is one feature of type descendents to be careful about; it
catches everyone who first plays with metaclasses. The first argument
to methods is conventionally called cls rather than self, because the
methods operate on the produced class, not the metaclass. Actually,
there is nothing special about this; all methods attach to their
instances, and the instance of a metaclass is a class. A non-special
name makes this more obvious:


        
>>> class Printable(type):
...     def whoami(cls): print "I am a", cls.__name__
...
>>> Foo = Printable('Foo',(),{})
>>> Foo.whoami()
I am a Foo
>>> Printable.whoami()
Traceback (most recent call last):
TypeError:  unbound method whoami() [...]

All this surprisingly non-remarkable machinery comes with some syntax
sugar that both makes working with metaclasses easier, and confuses
new users. There are several elements to the extra syntax. The
resolution order of these new variations is tricky though. Classes can
inherit metaclasses from their ancestors -- notice that this is not
the same thing as having metaclasses as ancestors (another common
confusion). For old-style classes, defining a global _metaclass_
variable can force a custom metaclass to be used. But most of the
time, and the safest approach, is to set a _metaclass_ class attribute
for a class that wants to be created via a custom metaclass. You must
set the variable in the class definition itself since the metaclass is
not used if the attribute is set later (after the class object has
already been created). For example:


        
>>> class Bar:
...     __metaclass__ = Printable
...     def foomethod(self): print 'foo'
...
>>> Bar.whoami()
I am a Bar
>>> Bar().foomethod()
foo

Solving problems with magic

So far, we have seen the basics of metaclasses. But putting
metaclasses to work is more subtle. The challenge with utilizing
metaclasses is that in typical OOP design, classes do not really do
much. The inheritance structure of classes is useful to encapsulate
and package data and methods, but it is typically instances that one
works with in the concrete.

There are two general categories of programming tasks where we think
metaclasses are genuinely valuable.

The first, and probably more common category is where you do not know
at design time exactly what a class needs to do. Obviously, you will
have some idea about it, but some particular detail might depend on
information that is not available until later. "Later" itself can be
of two sorts: (a) When a library module is used by an application; (b)
At runtime when some situation exists. This category is close to what
is often called "Aspect-Oriented Programming" (AOP). We'll show what
we think is an elegant example:


        
% cat dump.py
#!/usr/bin/python
import sys
if len(sys.argv) > 2:
    module, metaklass  = sys.argv[1:3]
    m = __import__(module, globals(), locals(), [metaklass])
    __metaclass__ = getattr(m, metaklass)

class Data:
    def __init__(self):
        self.num = 38
        self.lst = ['a','b','c']
        self.str = 'spam'
    dumps   = lambda self: `self`
    __str__ = lambda self: self.dumps()

data = Data()
print data

% dump.py
<__main__.Data instance at 1686a0>

As you would expect, this application prints out a rather generic description of the data object (a conventional instance object). But if runtime arguments are passed to the application, we can get a rather different result:


        
% dump.py gnosis.magic MetaXMLPickler
<?xml version="1.0"?>
<!DOCTYPE PyObject SYSTEM "PyObjects.dtd">
<PyObject module="__main__" class="Data" id="720748">
<attr name="lst" type="list" id="980012" >
  <item type="string" value="a" />
  <item type="string" value="b" />
  <item type="string" value="c" />
</attr>
<attr name="num" type="numeric" value="38" />
<attr name="str" type="string" value="spam" />
</PyObject>

The particular example uses the serialization style of
gnosis.xml.pickle, but the most current gnosis.magic package also
contains metaclass serializers MetaYamlDump, MetaPyPickler, and
MetaPrettyPrint. Moreover, a user of the dump.py "application" can
impose the use of any "MetaPickler" desired, from any Python package
that defines one. Writing an appropriate metaclass for this purpose
will look something like:


        
class MetaPickler(type):
    "Metaclass for gnosis.xml.pickle serialization"
    def __init__(cls, name, bases, dict):
        from gnosis.xml.pickle import dumps
        super(MetaPickler, cls).__init__(name, bases, dict)
        setattr(cls, 'dumps', dumps)

The remarkable achievement of this arrangement is that the application
programmer need have no knowledge about what serialization will be
used -- nor even whether serialization or some other cross-sectional
capability will be added at the command-line.

Perhaps the most common use of metaclasses is similar to that of
MetaPicklers: adding, deleting, renaming, or substituting methods for
those defined in the produced class. In our example, a "native"
Data.dump() method is replaced by a different one from outside the
application, at the time the class Data is created (and therefore in
every subsequent instance).

More ways to solve problems with magic

There is a programming niche where classes are often more important
than instances. For example, declarative mini-languages are Python
libraries whose program logic is expressed directly in class
declarations. David examines them in his article "Create declarative
mini-languages". In such cases, using metaclasses to affect the
process of class creation can be quite powerful.

One class-based declarative framework is gnosis.xml.validity. Under
this framework, you declare a number of "validity classes" that
express a set of constraints about valid XML documents. These
declarations are very close to those contained in DTDs. For example, a
"dissertation" document can be configured with the code:


        
from gnosis.xml.validity import *
class figure(EMPTY):      pass
class _mixedpara(Or):     _disjoins = (PCDATA, figure)
class paragraph(Some):    _type = _mixedpara
class title(PCDATA):      pass
class _paras(Some):       _type = paragraph
class chapter(Seq):       _order = (title, _paras)
class dissertation(Some): _type = chapter

If you try to instantiate the dissertation class without the right
component subelements, a descriptive exception is raised; likewise for
each of the subelements. The proper subelements will be generated from
simpler arguments when there is only one unambiguous way of "lifting"
the arguments to the correct types.

Even though validity classes are often (informally) based on a
pre-existing DTD, instances of these classes print themselves as
unadorned XML document fragments, for example:


        
>>> from simple_diss import *
>>> ch = LiftSeq(chapter, ('It Starts','When it began'))
>>> print ch
<chapter><title>It Starts</title>
<paragraph>When it began</paragraph>
</chapter>

By using a metaclass to create the validity classes, we can generate a
DTD out of the class declarations themselves (and add an extra method
to the classes while we do it):


        
>>> from gnosis.magic import DTDGenerator, \
...                          import_with_metaclass, \
...                          from_import
>>> d = import_with_metaclass('simple_diss',DTDGenerator)
>>> from_import(d,'**')
>>> ch = LiftSeq(chapter, ('It Starts','When it began'))
>>> print ch.with_internal_subset()
<?xml version='1.0'?>
<!DOCTYPE chapter [
<!ELEMENT figure EMPTY>
<!ELEMENT dissertation (chapter)+>
<!ELEMENT chapter (title,paragraph+)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT paragraph ((#PCDATA|figure))+>
]>
<chapter><title>It Starts</title>
<paragraph>When it began</paragraph>
</chapter>

The package gnosis.xml.validity knows nothing about DTDs and internal
subsets. Those concepts and capabilities are introduced entirely by
the metaclass DTDGenerator, without any change made to either
gnosis.xml.validity or simple_diss.py. DTDGenerator does not
substitute its own .__str__() method into classes it produces -- you
can still print the unadorned XML fragment -- but it a metaclass could
easily modify such magic methods.

Meta conveniences

The package gnosis.magic contains several utilities for working with 
metaclasses, as well as some sample metaclasses you can use in 
aspect-oriented programming. The most important of these utilities 
is import_with_metaclass(). This function, utilized in the above 
example, lets you import a third-party module, but create all the 
module classes using a custom metaclass rather than type. Whatever 
new capability you might want to impose on that third-party module 
can be defined in a metaclass that you create (or get from somewhere 
else altogether). gnosis.magic contains some pluggable serialization 
metaclasses; some other package might contain tracing capabilities, 
or object persistence, or exception logging, or something else.

The import_with_metclass() function illustrates several qualities 
of metaclass programming::


        
def import_with_metaclass(modname, metaklass):
    "Module importer substituting custom metaclass"
    class Meta(object): __metaclass__ = metaklass
    dct = {'__module__':modname}
    mod = __import__(modname)
    for key, val in mod.__dict__.items():
        if inspect.isclass(val):
            setattr(mod, key, type(key,(val,Meta),dct))
    return mod

One notable style in this function is that an ordinary class Meta is 
produced using the specified metaclass. But once Meta is added as an 
ancestor, its descendent is also produced using the custom metaclass. 
In principle, a class like Meta could carry with it both a metaclass 
producer and a set of inheritable methods -- the two aspects of its 
bequest are orthogonal.

Resources

    * A useful book on metaclasses is Putting Metaclasses to Work by 
Ira R. Forman and Scott Danforth (Addison-Wesley; 1999).

    * For metaclasses in Python specifically, Guido van Rossum's 
essay, "Unifying types and classes in Python 2.2" is useful as well.

    * Also by David on developerWorks, read:
          o "Guide to Python introspection"
          o "Create declarative mini-languages"
          o "XML Matters: Enforcing validity with the gnosis.xml.
validity library"


    * Don't know Tim Peters? You should! Begin with Tim's wiki page 
and end with reading news:comp.lang.python more regularly.

    * New to AOP? You may find this "Introduction to Aspect-Oriented 
Programming" (PDF) by Ken Wing Kuen Lee of the Hong Kong University 
of Science and Technology interesting.

    * Gregor Kiczales and his team at Xerox PARC coined the term 
aspect-oriented programming in the 1990s and championed it as a 
way to allow software programmers to spend more time writing code 
and less time correcting it.

    * "Connections between Demeter/Adaptive Programming and 
Aspect-Oriented Programming (AOP)" by Karl J. Lieberherr also 
describes AOP.

    * You'll also find subject-oriented programming interesting. As 
described by the folks at IBM Research, it's essentially the same 
thing as aspect-oriented programming.

    * Find and download the Gnosis utils, mentioned several times in 
this article, at David's site.

    * Find more resources for Linux developers in the developerWorks 
Linux zone.

About the authors

David Mertz 
  David Mertz thought his brain would melt when he wrote 
  about continuations or semi-coroutines, but he put the gooey mess 
  back in his skull cavity and moved on to metaclasses. David may be 
  reached at mertz@gnosis.cx; his life pored over at his personal 
  Web page. Suggestions and recommendations on this, past, or future
  columns are welcome. Learn about his forthcoming book, Text 
  Processing in Python.


Michele Simionato 
  Michele Simionato is a plain, ordinary, theoretical 
  physicist who was driven to Python by a quantum fluctuation that could 
  well have passed without consequences had he not met David Mertz. He 
  will let his readers judge the final outcome. Michele can be reached 
  at mis6+@pitt.edu.