author: Georg Brandl <georg@python.org> 2009-10-11 21:25:26 +0000
committer: Georg Brandl <georg@python.org> 2009-10-11 21:25:26 +0000
commit: 1dc4ca29f9b5a6a3cea0eaff656cdbf111f9d744
tree: c3fa33135ab58c9ac0acbf0026067d4f226a787d /Doc/faq/design.rst
parent: 2bc09ad58be544ed71d330a4a61bb391d5556e27
Merged revisions 75363 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/trunk

........
  r75363 | georg.brandl | 2009-10-11 20:31:23 +0200 (So, 11 Okt 2009) | 1 line

  Add the Python FAQ lists to the documentation. Copied from sandbox/faq. Many thanks to AMK for the preparation work.
........
Diffstat (limited to 'Doc/faq/design.rst')
-rw-r--r-- Doc/faq/design.rst 924
1 files changed, 924 insertions, 0 deletions
diff --git a/Doc/faq/design.rst b/Doc/faq/design.rst
new file mode 100644
index 0000000000..aacb476cdc
--- /dev/null
+++ b/Doc/faq/design.rst
@@ -0,0 +1,924 @@
+======================
+Design and History FAQ
+======================
+
+Why does Python use indentation for grouping of statements?
+-----------------------------------------------------------
+
+Guido van Rossum believes that using indentation for grouping is extremely
+elegant and contributes a lot to the clarity of the average Python program.
+Most people learn to love this feature after a while.
+
+Since there are no begin/end brackets there cannot be a disagreement between
+grouping perceived by the parser and the human reader. Occasionally C
+programmers will encounter a fragment of code like this::
+
+   if (x <= y)
+           x++;
+           y--;
+   z++;
+
+Only the ``x++`` statement is executed if the condition is true, but the
+indentation leads you to believe otherwise. Even experienced C programmers will
+sometimes stare at it a long time wondering why ``y`` is being decremented even
+for ``x > y``.
+
+Because there are no begin/end brackets, Python is much less prone to
+coding-style conflicts. In C there are many different ways to place the braces.
+If you're used to reading and writing code that uses one style, you will feel at
+least slightly uneasy when reading (or being required to write) another style.
+
+Many coding styles place begin/end brackets on a line by themselves. This makes
+programs considerably longer and wastes valuable screen space, making it harder
+to get a good overview of a program. Ideally, a function should fit on one
+screen (say, 20-30 lines). 20 lines of Python can do a lot more work than 20
+lines of C. This is not solely due to the lack of begin/end brackets -- the
+lack of declarations and the high-level data types are also responsible -- but
+the indentation-based syntax certainly helps.
+
+
+Why am I getting strange results with simple arithmetic operations?
+-------------------------------------------------------------------
+
+See the next question.
+
+
+Why are floating point calculations so inaccurate?
+--------------------------------------------------
+
+People are often very surprised by results like this::
+
+   >>> 1.2-1.0
+   0.19999999999999996
+
+and think it is a bug in Python. It's not. This has nothing to do with Python,
+but with how the underlying C platform handles floating point numbers, and
+ultimately with the inaccuracies introduced when writing down numbers as a
+string of a fixed number of digits.
+
+The internal representation of floating point numbers uses a fixed number of
+binary digits to represent a decimal number. Some decimal numbers can't be
+represented exactly in binary, resulting in small roundoff errors.
+
+In decimal math, there are many numbers that can't be represented with a fixed
+number of decimal digits, e.g. 1/3 = 0.3333333333.......
+
+In base 2, 1/2 = 0.1, 1/4 = 0.01, 1/8 = 0.001, etc. .2 equals 2/10 equals 1/5,
+resulting in the binary fractional number 0.001100110011001...
+
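+Floating point numbers only have 32 or 64 bits of precision, so the digits are
+cut off at some point, and the stored number is only an approximation of the
+intended value. The rounding error in 1.2 carries over into the subtraction
+``1.2-1.0``, whose result is 0.19999999999999996 in decimal, not 0.2.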
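+
+As a rough illustration of the cut-off (a sketch that is not part of the
+original FAQ, and that assumes a Python version in which
+``fractions.Fraction`` accepts a float directly), the exact value stored for
+the literal ``0.2`` can be compared with the fraction 1/5:
+
```python
from fractions import Fraction

stored = Fraction(0.2)       # exact value of the binary double nearest to 0.2
intended = Fraction(1, 5)    # the value the literal 0.2 was meant to denote

# The stored value has a power-of-two denominator, so it cannot equal 1/5
# exactly; the difference is tiny but nonzero.
error = stored - intended
print(stored == intended)    # False
print(float(error))          # a very small positive number
```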
+
+A floating point number's ``repr()`` function prints as many digits as are
+necessary to make ``eval(repr(f)) == f`` true for any float f. The ``str()``
+function prints fewer digits and this often results in the more sensible number
+that was probably intended::
+
+ >>> 0.2
+ 0.20000000000000001
+ >>> print 0.2
+ 0.2
+
+One of the consequences of this is that it is error-prone to compare the result
+of some computation to a float with ``==``. Tiny inaccuracies may mean that
+``==`` fails. Instead, you have to check that the difference between the two
+numbers is less than a certain threshold::
+
+   epsilon = 0.0000000000001  # Tiny allowed error
+   expected_result = 0.4
+
+   if expected_result-epsilon <= computation() <= expected_result+epsilon:
+       ...
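+
+The same threshold test can be wrapped in a small helper. This is only a
+sketch: ``nearly_equal`` is a hypothetical name, and an absolute tolerance like
+this one suits values of moderate size, while very large or very small values
+usually call for a relative tolerance instead.
+
```python
def nearly_equal(a, b, epsilon=1e-13):
    """Return True if a and b differ by at most epsilon (absolute error)."""
    return abs(a - b) <= epsilon


total = 0.1 + 0.2
print(total == 0.3)               # False: the two doubles are not identical
print(nearly_equal(total, 0.3))   # True
```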
+
+Please see the chapter on :ref:`floating point arithmetic <tut-fp-issues>` in
+the Python tutorial for more information.
+
+
+Why are Python strings immutable?
+---------------------------------
+
+There are several advantages.
+
+One is performance: knowing that a string is immutable means we can allocate
+space for it at creation time, and the storage requirements are fixed and
+unchanging. This is also one of the reasons for the distinction between tuples
+and lists.
+
+Another advantage is that strings in Python are considered as "elemental" as
+numbers. No amount of activity will change the value 8 to anything else, and in
+Python, no amount of activity will change the string "eight" to anything else.
+
+
+.. _why-self:
+
+Why must 'self' be used explicitly in method definitions and calls?
+-------------------------------------------------------------------
+
+The idea was borrowed from Modula-3. It turns out to be very useful, for a
+variety of reasons.
+
+First, it's more obvious that you are using a method or instance attribute
+instead of a local variable. Reading ``self.x`` or ``self.meth()`` makes it
+absolutely clear that an instance variable or method is used even if you don't
+know the class definition by heart. In C++, you can sort of tell by the lack of
+a local variable declaration (assuming globals are rare or easily recognizable)
+-- but in Python, there are no local variable declarations, so you'd have to
+look up the class definition to be sure. Some C++ and Java coding standards
+call for instance attributes to have an ``m_`` prefix, so this explicitness is
+still useful in those languages, too.
+
+Second, it means that no special syntax is necessary if you want to explicitly
+reference or call the method from a particular class. In C++, if you want to
+use a method from a base class which is overridden in a derived class, you have
+to use the ``::`` operator -- in Python you can write
+``baseclass.methodname(self, <argument list>)``. This is particularly useful
+for :meth:`__init__` methods, and in general in cases where a derived class
+method wants to extend the base class method of the same name and thus has to
+call the base class method somehow.
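+
+A minimal sketch of this calling convention, using the hypothetical classes
+``Base`` and ``Derived``:
+
```python
class Base(object):
    def __init__(self, name):
        self.name = name

    def describe(self):
        return "Base(%s)" % self.name


class Derived(Base):
    def __init__(self, name, extra):
        # Explicitly call the overridden base-class method, passing self.
        Base.__init__(self, name)
        self.extra = extra

    def describe(self):
        # Extend, rather than replace, the base-class behaviour.
        return Base.describe(self) + " + " + self.extra


d = Derived("spam", "eggs")
print(d.describe())   # Base(spam) + eggs
```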
+
+Finally, for instance variables it solves a syntactic problem with assignment:
+since local variables in Python are (by definition!) those variables to which a
+value is assigned in a function body (and that aren't explicitly declared global),
+there has to be some way to tell the interpreter that an assignment was meant to
+assign to an instance variable instead of to a local variable, and it should
+preferably be syntactic (for efficiency reasons). C++ does this through
+declarations, but Python doesn't have declarations and it would be a pity having
+to introduce them just for this purpose. Using the explicit "self.var" solves
+this nicely. Similarly, for using instance variables, having to write
+"self.var" means that references to unqualified names inside a method don't have
+to search the instance's directories. To put it another way, local variables
+and instance variables live in two different namespaces, and you need to tell
+Python which namespace to use.
+
+
+Why can't I use an assignment in an expression?
+-----------------------------------------------
+
+Many people used to C or Perl complain that they want to use this C idiom:
+
+.. code-block:: c
+
+   while (line = readline(f)) {
+       // do something with line
+   }
+
+where in Python you're forced to write this::
+
+   while True:
+       line = f.readline()
+       if not line:
+           break
+       ... # do something with line
+
+The reason for not allowing assignment in Python expressions is a common,
+hard-to-find bug in those other languages, caused by this construct:
+
+.. code-block:: c
+
+   if (x = 0) {
+       // error handling
+   }
+   else {
+       // code that only works for nonzero x
+   }
+
+The error is a simple typo: ``x = 0``, which assigns 0 to the variable ``x``,
+was written while the comparison ``x == 0`` is certainly what was intended.
+
+Many alternatives have been proposed. Most are hacks that save some typing but
+use arbitrary or cryptic syntax or keywords, and fail the simple criterion for
+language change proposals: it should intuitively suggest the proper meaning to a
+human reader who has not yet been introduced to the construct.
+
+An interesting phenomenon is that most experienced Python programmers recognize
+the ``while True`` idiom and don't seem to be missing the assignment in
+expression construct much; it's only newcomers who express a strong desire to
+add this to the language.
+
+There's an alternative way of spelling this that seems attractive but is
+generally less robust than the "while True" solution::
+
+   line = f.readline()
+   while line:
+       ... # do something with line...
+       line = f.readline()
+
+The problem with this is that if you change your mind about exactly how you get
+the next line (e.g. you want to change it into ``sys.stdin.readline()``) you
+have to remember to change two places in your program -- the second occurrence
+is hidden at the bottom of the loop.
+
+The best approach is to use iterators, making it possible to loop through
+objects using the ``for`` statement. For example, in the current version of
+Python file objects support the iterator protocol, so you can now write simply::
+
+   for line in f:
+       ... # do something with line...
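+
+A runnable sketch of the iterator form, in which an in-memory ``io.StringIO``
+object stands in for an open file (an assumption made only so that the example
+is self-contained):
+
```python
import io

f = io.StringIO(u"first\nsecond\nthird\n")   # stands in for an open file

lines = []
for line in f:                    # the file object yields one line at a time
    lines.append(line.rstrip(u"\n"))

print(len(lines))                 # 3
```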
+
+
+
+Why does Python use methods for some functionality (e.g. list.index()) but functions for other (e.g. len(list))?
+----------------------------------------------------------------------------------------------------------------
+
+The major reason is history. Functions were used for those operations that were
+generic for a group of types and which were intended to work even for objects
+that didn't have methods at all (e.g. tuples). It is also convenient to have a
+function that can readily be applied to an amorphous collection of objects when
+you use the functional features of Python (``map()``, ``apply()`` et al).
+
+In fact, implementing ``len()``, ``max()``, ``min()`` as built-in functions is
+actually less code than implementing them as methods for each type. One can
+quibble about individual cases but it's a part of Python, and it's too late to
+make such fundamental changes now. The functions have to remain to avoid massive
+code breakage.
+
+.. XXX talk about protocols?
+
+Note that for string operations Python has moved from external functions (the
+``string`` module) to methods. However, ``len()`` is still a function.
+
+
+Why is join() a string method instead of a list or tuple method?
+----------------------------------------------------------------
+
+Strings became much more like other standard types starting in Python 1.6, when
+methods were added which give the same functionality that has always been
+available using the functions of the string module. Most of these new methods
+have been widely accepted, but the one which appears to make some programmers
+feel uncomfortable is::
+
+ ", ".join(['1', '2', '4', '8', '16'])
+
+which gives the result::
+
+ "1, 2, 4, 8, 16"
+
+There are two common arguments against this usage.
+
+The first runs along the lines of: "It looks really ugly using a method of a
+string literal (string constant)", to which the answer is that it might, but a
+string literal is just a fixed value. If the methods are to be allowed on names
+bound to strings there is no logical reason to make them unavailable on
+literals.
+
+The second objection is typically cast as: "I am really telling a sequence to
+join its members together with a string constant". Sadly, you aren't. For some
+reason there seems to be much less difficulty with having :meth:`~str.split` as
+a string method, since in that case it is easy to see that ::
+
+ "1, 2, 4, 8, 16".split(", ")
+
+is an instruction to a string literal to return the substrings delimited by the
+given separator (or, by default, arbitrary runs of white space). In this case a
+Unicode string returns a list of Unicode strings, an ASCII string returns a list
+of ASCII strings, and everyone is happy.
+
+:meth:`~str.join` is a string method because in using it you are telling the
+separator string to iterate over a sequence of strings and insert itself between
+adjacent elements. This method can be used with any argument which obeys the
+rules for sequence objects, including any new classes you might define yourself.
+
+Because this is a string method it can work for Unicode strings as well as plain
+ASCII strings. If ``join()`` were a method of the sequence types then the
+sequence types would have to decide which type of string to return depending on
+the type of the separator.
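+
+To illustrate that the argument need not be a list, here is a sketch with a
+hypothetical user-defined sequence class, ``Powers``, that implements only
+``__len__`` and ``__getitem__``:
+
```python
class Powers(object):
    """Hypothetical sequence of the first n powers of two, as strings."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, i):
        if not 0 <= i < self.n:
            raise IndexError(i)   # signals the end of the sequence
        return str(2 ** i)


joined = ", ".join(Powers(5))
print(joined)   # 1, 2, 4, 8, 16
```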
+
+.. XXX remove next paragraph eventually
+
+If none of these arguments persuade you, then for the moment you can continue to
+use the ``join()`` function from the string module, which allows you to write ::
+
+ string.join(['1', '2', '4', '8', '16'], ", ")
+
+
+How fast are exceptions?
+------------------------
+
+A try/except block is extremely efficient. Actually catching an exception is
+expensive. In versions of Python prior to 2.0 it was common to use this idiom::
+
+   try:
+       value = dict[key]
+   except KeyError:
+       dict[key] = getvalue(key)
+       value = dict[key]
+
+This only made sense when you expected the dict to have the key almost all the
+time. If that wasn't the case, you coded it like this::
+
+   if dict.has_key(key):
+       value = dict[key]
+   else:
+       dict[key] = getvalue(key)
+       value = dict[key]
+
+(In Python 2.0 and higher, you can code this as ``value = dict.setdefault(key,
+getvalue(key))``, but only if the ``getvalue()`` call is cheap enough, because it
+is evaluated in all cases.)
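+
+The difference between the two spellings shows up in how often ``getvalue()``
+runs. In the sketch below, ``getvalue``, ``cache`` and ``calls`` are
+hypothetical names used only for illustration:
+
```python
calls = []


def getvalue(key):
    """Hypothetical stand-in for an expensive computation."""
    calls.append(key)
    return key.upper()


cache = {"a": "A"}

# setdefault() evaluates its arguments first, so getvalue() runs even
# though "a" is already present.
value = cache.setdefault("a", getvalue("a"))

# The try/except form only calls getvalue() when the key is missing.
try:
    value2 = cache["b"]
except KeyError:
    cache["b"] = getvalue("b")
    value2 = cache["b"]

print(calls)   # ['a', 'b']
```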
+
+
+Why isn't there a switch or case statement in Python?
+-----------------------------------------------------
+
+You can do this easily enough with a sequence of ``if... elif... elif... else``.
+There have been some proposals for switch statement syntax, but there is no
+consensus (yet) on whether and how to do range tests. See :pep:`275` for
+complete details and the current status.
+
+For cases where you need to choose from a very large number of possibilities,
+you can create a dictionary mapping case values to functions to call. For
+example::
+
+   def function_1(...):
+       ...
+
+   functions = {'a': function_1,
+                'b': function_2,
+                'c': self.method_1, ...}
+
+   func = functions[value]
+   func()
+
+For calling methods on objects, you can simplify yet further by using the
+:func:`getattr` built-in to retrieve methods with a particular name::
+
+   def visit_a(self, ...):
+       ...
+   ...
+
+   def dispatch(self, value):
+       method_name = 'visit_' + str(value)
+       method = getattr(self, method_name)
+       method()
+
+It's suggested that you use a prefix for the method names, such as ``visit_`` in
+this example. Without such a prefix, if values are coming from an untrusted
+source, an attacker would be able to call any method on your object.
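+
+A self-contained sketch of prefix-based dispatch follows. The ``Visitor`` class
+and its methods are hypothetical, and raising ``ValueError`` for unknown values
+is one possible choice, not something the technique prescribes:
+
```python
class Visitor(object):
    def visit_a(self):
        return "handled a"

    def visit_b(self):
        return "handled b"

    def secret(self):
        return "not reachable through dispatch()"

    def dispatch(self, value):
        # Only methods named visit_<value> can be selected.
        method = getattr(self, "visit_" + str(value), None)
        if method is None:
            raise ValueError("no handler for %r" % (value,))
        return method()


v = Visitor()
print(v.dispatch("a"))   # handled a
```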
+
+
+Can't you emulate threads in the interpreter instead of relying on an OS-specific thread implementation?
+--------------------------------------------------------------------------------------------------------
+
+Answer 1: Unfortunately, the interpreter pushes at least one C stack frame for
+each Python stack frame. Also, extensions can call back into Python at almost
+random moments. Therefore, a complete threads implementation requires thread
+support for C.
+
+Answer 2: Fortunately, there is `Stackless Python <http://www.stackless.com>`_,
+which has a completely redesigned interpreter loop that avoids the C stack.
+It's still experimental but looks very promising. Although it is binary
+compatible with standard Python, it's still unclear whether Stackless will make
+it into the core -- maybe it's just too revolutionary.
+
+
+Why can't lambda forms contain statements?
+------------------------------------------
+
+Python lambda forms cannot contain statements because Python's syntactic
+framework can't handle statements nested inside expressions. However, in
+Python, this is not a serious problem. Unlike lambda forms in other languages,
+where they add functionality, Python lambdas are only a shorthand notation if
+you're too lazy to define a function.
+
+Functions are already first class objects in Python, and can be declared in a
+local scope. Therefore the only advantage of using a lambda form instead of a
+locally-defined function is that you don't need to invent a name for the
+function -- but that's just a local variable to which the function object (which
+is exactly the same type of object that a lambda form yields) is assigned!
+
+
+Can Python be compiled to machine code, C or some other language?
+-----------------------------------------------------------------
+
+Not easily. Python's high level data types, dynamic typing of objects and
+run-time invocation of the interpreter (using :func:`eval` or :keyword:`exec`)
+together mean that a "compiled" Python program would probably consist mostly of
+calls into the Python run-time system, even for seemingly simple operations like
+``x+1``.
+
+Several projects described in the Python newsgroup or at past `Python
+conferences <http://python.org/community/workshops/>`_ have shown that this approach is feasible,
+although the speedups reached so far are only modest (e.g. 2x). Jython uses the
+same strategy for compiling to Java bytecode. (Jim Hugunin has demonstrated
+that in combination with whole-program analysis, speedups of 1000x are feasible
+for small demo programs. See the proceedings from the `1997 Python conference
+<http://python.org/community/workshops/1997-10/proceedings/>`_ for more information.)
+
+Internally, Python source code is always translated into a bytecode
+representation, and this bytecode is then executed by the Python virtual
+machine. In order to avoid the overhead of repeatedly parsing and translating
+modules that rarely change, this byte code is written into a file whose name
+ends in ".pyc" whenever a module is parsed. When the corresponding .py file is
+changed, it is parsed and translated again and the .pyc file is rewritten.
+
+There is no performance difference once the .pyc file has been loaded, as the
+bytecode read from the .pyc file is exactly the same as the bytecode created by
+direct translation. The only difference is that loading code from a .pyc file
+is faster than parsing and translating a .py file, so the presence of
+precompiled .pyc files improves the start-up time of Python scripts. If
+desired, the Lib/compileall.py module can be used to create valid .pyc files for
+a given set of modules.
+
+Note that the main script executed by Python, even if its filename ends in .py,
+is not compiled to a .pyc file. It is compiled to bytecode, but the bytecode is
+not saved to a file. Usually main scripts are quite short, so this doesn't cost
+much speed.
+
+.. XXX check which of these projects are still alive
+
+There are also several programs which make it easier to intermingle Python and C
+code in various ways to increase performance. See, for example, `Psyco
+<http://psyco.sourceforge.net/>`_, `Pyrex
+<http://www.cosc.canterbury.ac.nz/~greg/python/Pyrex/>`_, `PyInline
+<http://pyinline.sourceforge.net/>`_, `Py2Cmod
+<http://sourceforge.net/projects/py2cmod/>`_, and `Weave
+<http://www.scipy.org/site_content/weave>`_.
+
+
+How does Python manage memory?
+------------------------------
+
+The details of Python memory management depend on the implementation. The
+standard C implementation of Python uses reference counting to detect
+inaccessible objects, and another mechanism to collect reference cycles,
+periodically executing a cycle detection algorithm which looks for inaccessible
+cycles and deletes the objects involved. The :mod:`gc` module provides functions
+to perform a garbage collection, obtain debugging statistics, and tune the
+collector's parameters.
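+
+The following sketch shows a reference cycle that reference counting alone
+cannot free, and an explicit call to the cycle collector. It assumes CPython;
+``Node`` and ``probe`` are hypothetical names:
+
```python
import gc
import weakref


class Node(object):
    pass


a, b = Node(), Node()
a.partner, b.partner = b, a     # reference cycle: a -> b -> a
probe = weakref.ref(a)          # lets us observe whether the object is alive

del a, b                        # the reference counts do not drop to zero
gc.collect()                    # ask the cycle detector to run now

print(probe() is None)          # True once the cycle has been collected
```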
+
+Jython relies on the Java runtime so the JVM's garbage collector is used. This
+difference can cause some subtle porting problems if your Python code depends on
+the behavior of the reference counting implementation.
+
+Sometimes objects get stuck in tracebacks temporarily and hence are not
+deallocated when you might expect. Clear the tracebacks with::
+
+ import sys
+ sys.exc_clear()
+ sys.exc_traceback = sys.last_traceback = None
+
+Tracebacks are used for reporting errors, implementing debuggers and related
+things. They contain a portion of the program state extracted during the
+handling of an exception (usually the most recent exception).
+
+In the absence of circularities and tracebacks, Python programs need not
+explicitly manage memory.
+
+Why doesn't Python use a more traditional garbage collection scheme? For one
+thing, this is not a C standard feature and hence it's not portable. (Yes, we
+know about the Boehm GC library. It has bits of assembler code for *most*
+common platforms, not for all of them, and although it is mostly transparent, it
+isn't completely transparent; patches are required to get Python to work with
+it.)
+
+Traditional GC also becomes a problem when Python is embedded into other
+applications. While in a standalone Python it's fine to replace the standard
+malloc() and free() with versions provided by the GC library, an application
+embedding Python may want to have its *own* substitute for malloc() and free(),
+and may not want Python's. Right now, Python works with anything that
+implements malloc() and free() properly.
+
+In Jython, the following code (which is fine in CPython) will probably run out
+of file descriptors long before it runs out of memory::
+
+   for file in <very long list of files>:
+       f = open(file)
+       c = f.read(1)
+Using the current reference counting and destructor scheme, each new assignment
+to f closes the previous file. Using GC, this is not guaranteed. If you want
+to write code that will work with any Python implementation, you should
+explicitly close the file; this will work regardless of GC::
+
+   for file in <very long list of files>:
+       f = open(file)
+       c = f.read(1)
+       f.close()
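+
+Another way to make the close explicit is the ``with`` statement, which closes
+the file when the block exits. The sketch below is an illustration only; the
+temporary files it creates exist just to make it runnable:
+
```python
import os
import shutil
import tempfile

# Create a few small throwaway files to read back (illustration only).
directory = tempfile.mkdtemp()
paths = []
for i in range(3):
    path = os.path.join(directory, "file%d.txt" % i)
    with open(path, "w") as out:
        out.write("line %d\n" % i)
    paths.append(path)

first_chars = []
for path in paths:
    with open(path) as f:               # f is closed when the block exits,
        first_chars.append(f.read(1))   # whatever the GC strategy is

shutil.rmtree(directory)                # remove the throwaway files
print(first_chars)                      # ['l', 'l', 'l']
```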
+
+
+Why isn't all memory freed when Python exits?
+---------------------------------------------
+
+Objects referenced from the global namespaces of Python modules are not always
+deallocated when Python exits. This may happen if there are circular
+references. There are also certain bits of memory that are allocated by the C
+library that are impossible to free (e.g. a tool like Purify will complain about
+these). Python is, however, aggressive about cleaning up memory on exit and
+does try to destroy every single object.
+
+If you want to force Python to delete certain things on deallocation use the
+:mod:`atexit` module to run a function that will force those deletions.
+
+
+Why are there separate tuple and list data types?
+-------------------------------------------------
+
+Lists and tuples, while similar in many respects, are generally used in
+fundamentally different ways. Tuples can be thought of as being similar to
+Pascal records or C structs; they're small collections of related data which may
+be of different types which are operated on as a group. For example, a
+Cartesian coordinate is appropriately represented as a tuple of two or three
+numbers.
+
+Lists, on the other hand, are more like arrays in other languages. They tend to
+hold a varying number of objects all of which have the same type and which are
+operated on one-by-one. For example, ``os.listdir('.')`` returns a list of
+strings representing the files in the current directory. Functions which
+operate on this output would generally not break if you added another file or
+two to the directory.
+
+Tuples are immutable, meaning that once a tuple has been created, you can't
+replace any of its elements with a new value. Lists are mutable, meaning that
+you can always change a list's elements. Only immutable elements can be used as
+dictionary keys, and hence only tuples and not lists can be used as keys.
+
+
+How are lists implemented?
+--------------------------
+
+Python's lists are really variable-length arrays, not Lisp-style linked lists.
+The implementation uses a contiguous array of references to other objects, and
+keeps a pointer to this array and the array's length in a list head structure.
+
+This makes indexing a list ``a[i]`` an operation whose cost is independent of
+the size of the list or the value of the index.
+
+When items are appended or inserted, the array of references is resized. Some
+cleverness is applied to improve the performance of appending items repeatedly;
+when the array must be grown, some extra space is allocated so the next few
+times don't require an actual resize.
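+
+The over-allocation can be observed indirectly with ``sys.getsizeof()``. This
+sketch assumes CPython; the exact sizes are an implementation detail and vary
+between versions and platforms, so none are shown here:
+
```python
import sys

items = []
sizes = []
for i in range(64):
    items.append(i)
    sizes.append(sys.getsizeof(items))   # bytes used by the list object itself

# The reported size grows in occasional jumps, not on every append.
distinct_sizes = sorted(set(sizes))
print(len(sizes), len(distinct_sizes))
```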
+
+
+How are dictionaries implemented?
+---------------------------------
+
+Python's dictionaries are implemented as resizable hash tables. Compared to
+B-trees, this gives better performance for lookup (the most common operation by
+far) under most circumstances, and the implementation is simpler.
+
+Dictionaries work by computing a hash code for each key stored in the dictionary
+using the :func:`hash` built-in function. The hash code varies widely depending
+on the key; for example, "Python" hashes to -539294296 while "python", a string
+that differs by a single bit, hashes to 1142331976. The hash code is then used
+to calculate a location in an internal array where the value will be stored.
+Assuming that you're storing keys that all have different hash values, this
+means that dictionaries take constant time -- O(1), in computer science notation
+-- to retrieve a key. It also means that no sorted order of the keys is
+maintained, and traversing the array as the ``.keys()`` and ``.items()``
+methods do will output the dictionary's content in some arbitrary jumbled order.
+
+
+Why must dictionary keys be immutable?
+--------------------------------------
+
+The hash table implementation of dictionaries uses a hash value calculated from
+the key value to find the key. If the key were a mutable object, its value
+could change, and thus its hash could also change. But since whoever changes
+the key object can't tell that it was being used as a dictionary key, it can't
+move the entry around in the dictionary. Then, when you try to look up the same
+object in the dictionary it won't be found because its hash value is different.
+If you tried to look up the old value it wouldn't be found either, because the
+value of the object found in that hash bin would be different.
+
+If you want a dictionary indexed with a list, simply convert the list to a tuple
+first; the function ``tuple(L)`` creates a tuple with the same entries as the
+list ``L``. Tuples are immutable and can therefore be used as dictionary keys.
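+
+A short sketch of the conversion, with hypothetical names ``point`` and
+``distances``:
+
```python
point = [3, 4]                   # a list cannot be used as a key directly
distances = {}
distances[tuple(point)] = 5.0    # the tuple (3, 4) is hashable

point.append(0)                  # changing the list does not disturb the key
print(distances[(3, 4)])         # 5.0
```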
+
+Some unacceptable solutions that have been proposed:
+
+- Hash lists by their address (object ID). This doesn't work because if you
+  construct a new list with the same value it won't be found; e.g.::
+
+     d = {[1,2]: '12'}
+     print d[[1,2]]
+
+  would raise a KeyError exception because the id of the ``[1,2]`` used in the
+  second line differs from that in the first line. In other words, dictionary
+  keys should be compared using ``==``, not using :keyword:`is`.
+
+- Make a copy when using a list as a key. This doesn't work because the list,
+  being a mutable object, could contain a reference to itself, and then the
+  copying code would run into an infinite loop.
+
+- Allow lists as keys but tell the user not to modify them. This would allow a
+  class of hard-to-track bugs in programs when you forgot or modified a list by
+  accident. It also invalidates an important invariant of dictionaries: every
+  value in ``d.keys()`` is usable as a key of the dictionary.
+
+- Mark lists as read-only once they are used as a dictionary key. The problem
+  is that it's not just the top-level object that could change its value; you
+  could use a tuple containing a list as a key. Entering anything as a key into
+  a dictionary would require marking all objects reachable from there as
+  read-only -- and again, self-referential objects could cause an infinite loop.
+
+There is a trick to get around this if you need to, but use it at your own risk:
+You can wrap a mutable structure inside a class instance which has both a
+:meth:`__cmp__` and a :meth:`__hash__` method. You must then make sure that the
+hash value for all such wrapper objects that reside in a dictionary (or other
+hash based structure) remains fixed while the object is in the dictionary (or
+other structure). ::
+
+   class ListWrapper:
+       def __init__(self, the_list):
+           self.the_list = the_list
+       def __cmp__(self, other):
+           # Must return 0 for equal objects, so delegate to cmp().
+           return cmp(self.the_list, other.the_list)
+       def __hash__(self):
+           l = self.the_list
+           result = 98767 - len(l)*555
+           for i in range(len(l)):
+               try:
+                   result = result + (hash(l[i]) % 9999999) * 1001 + i
+               except TypeError:
+                   # l[i] is unhashable; fall back to a position-based value.
+                   result = (result % 7777777) + i * 333
+           return result
+
+Note that the hash computation is complicated by the possibility that some
+members of the list may be unhashable and also by the possibility of arithmetic
+overflow.
+
+Furthermore it must always be the case that if ``o1 == o2`` (i.e. ``o1.__cmp__(o2)
+== 0``) then ``hash(o1) == hash(o2)`` (i.e. ``o1.__hash__() == o2.__hash__()``),
+regardless of whether the object is in a dictionary or not. If you fail to meet
+these restrictions dictionaries and other hash based structures will misbehave.
+
+In the case of ListWrapper, whenever the wrapper object is in a dictionary the
+wrapped list must not change to avoid anomalies. Don't do this unless you are
+prepared to think hard about the requirements and the consequences of not
+meeting them correctly. Consider yourself warned.
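+
+On Python versions that no longer consult ``__cmp__`` (Python 3), the same idea
+can be sketched with ``__eq__`` and ``__hash__``. This variant is an assumption
+of this example and not the code above: it hashes a tuple copy of the list and
+therefore requires every member of the list to be hashable. The same warning
+applies: the wrapped list must not change while the wrapper is used as a key.
+
```python
class ListWrapper(object):
    """Sketch: wrap a list so it can be used as a dictionary key."""

    def __init__(self, the_list):
        self.the_list = the_list

    def __eq__(self, other):
        return (isinstance(other, ListWrapper)
                and self.the_list == other.the_list)

    def __ne__(self, other):
        return not self == other

    def __hash__(self):
        # Assumes every member of the list is itself hashable.
        return hash(tuple(self.the_list))


d = {ListWrapper([1, 2]): "12"}
print(d[ListWrapper([1, 2])])    # 12
```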
+
+
+Why doesn't list.sort() return the sorted list?
+-----------------------------------------------
+
+In situations where performance matters, making a copy of the list just to sort
+it would be wasteful. Therefore, :meth:`list.sort` sorts the list in place. In
+order to remind you of that fact, it does not return the sorted list. This way,
+you won't be fooled into accidentally overwriting a list when you need a sorted
+copy but also need to keep the unsorted version around.
+
+In Python 2.4 a new builtin -- :func:`sorted` -- has been added. This function
+creates a new list from a provided iterable, sorts it and returns it. For
+example, here's how to iterate over the keys of a dictionary in sorted order::
+
+   for key in sorted(dict.iterkeys()):
+       ... # do whatever with dict[key]...
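+
+The contrast between the two can be seen in a short sketch (``names`` is a
+hypothetical list used only for illustration):
+
```python
names = ["pear", "apple", "fig"]

in_order = sorted(names)     # new list; names is left untouched
snapshot = list(names)       # copy taken before the in-place sort
result = names.sort()        # sorts names in place and returns None

print(in_order)              # ['apple', 'fig', 'pear']
print(result)                # None
```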
+
+
+How do you specify and enforce an interface spec in Python?
+-----------------------------------------------------------
+
+An interface specification for a module as provided by languages such as C++ and
+Java describes the prototypes for the methods and functions of the module. Many
+feel that compile-time enforcement of interface specifications helps in the
+construction of large programs.
+
+Python 2.6 adds an :mod:`abc` module that lets you define Abstract Base Classes
+(ABCs). You can then use :func:`isinstance` and :func:`issubclass` to check
+whether an instance or a class implements a particular ABC. The
+:mod:`collections` module defines a set of useful ABCs such as
+:class:`Iterable`, :class:`Container`, and :class:`MutableMapping`.
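+
+A sketch of such checks follows. The import fallback is an assumption of this
+example: later Python versions expose these ABCs from ``collections.abc``,
+while older ones keep them in ``collections``. ``Bag`` is a hypothetical class.
+
```python
try:
    from collections.abc import Container, Iterable, MutableMapping
except ImportError:          # older Pythons keep the ABCs in collections
    from collections import Container, Iterable, MutableMapping

print(isinstance([1, 2, 3], Iterable))      # True
print(isinstance({}, MutableMapping))       # True
print(issubclass(tuple, MutableMapping))    # False


class Bag(object):
    """Hypothetical class that supports only the ``in`` operator."""

    def __init__(self, *items):
        self.items = items

    def __contains__(self, item):
        return item in self.items


# Container recognizes any class that defines __contains__.
print(isinstance(Bag(1, 2), Container))     # True
```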
+
+For Python, many of the advantages of interface specifications can be obtained
+by an appropriate test discipline for components. There is also a tool,
+PyChecker, which can be used to find problems due to subclassing.
+
+A good test suite for a module can both provide a regression test and serve as a
+module interface specification and a set of examples. Many Python modules can
+be run as a script to provide a simple "self test." Even modules which use
+complex external interfaces can often be tested in isolation using trivial
+"stub" emulations of the external interface. The :mod:`doctest` and
+:mod:`unittest` modules or third-party test frameworks can be used to construct
+exhaustive test suites that exercise every line of code in a module.
+
+An appropriate testing discipline can help build large complex applications in
+Python as well as having interface specifications would. In fact, it can be
+better because an interface specification cannot test certain properties of a
+program. For example, the :meth:`append` method is expected to add new elements
+to the end of some internal list; an interface specification cannot test that
+your :meth:`append` implementation will actually do this correctly, but it's
+trivial to check this property in a test suite.
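+
+As an illustration of such a check, here is a small :mod:`unittest` sketch. The
+``Journal`` class and its ``entries()`` method are hypothetical names invented
+for this example; the test verifies the ordering behaviour that a prototype
+alone cannot express:
+
```python
import unittest


class Journal(object):
    """Hypothetical container used only for this illustration."""

    def __init__(self):
        self._entries = []

    def append(self, entry):
        self._entries.append(entry)

    def entries(self):
        return list(self._entries)


class JournalAppendTest(unittest.TestCase):
    def test_append_adds_to_the_end(self):
        journal = Journal()
        journal.append("first")
        journal.append("second")
        self.assertEqual(journal.entries(), ["first", "second"])


# Run the test case programmatically and collect the outcome.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(JournalAppendTest)
result = unittest.TestResult()
suite.run(result)
```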
+
+Writing test suites is very helpful, and you might want to design your code with
+an eye to making it easily tested. One increasingly popular technique,
+test-driven development, calls for writing parts of the test suite first,
+before you write any of the actual code. Of course, Python allows you to be
+sloppy and not write test cases at all.
+
+
+Why are default values shared between objects?
+----------------------------------------------
+
+This type of bug commonly bites neophyte programmers. Consider this function::
+
+   def foo(D={}):  # Danger: shared reference to one dict for all calls
+       ... compute something ...
+       D[key] = value
+       return D
+
+The first time you call this function, ``D`` contains a single item. The second
+time, ``D`` contains two items because when ``foo()`` begins executing, ``D``
+starts out with an item already in it.
+
+It is often expected that a function call creates new objects for default
+values. This is not what happens. Default values are created exactly once, when
+the function is defined. If that object is changed, like the dictionary in this
+example, subsequent calls to the function will refer to this changed object.
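+
+A runnable sketch of this behaviour, using a hypothetical function ``collect``
+with a list as its default value:
+
```python
def collect(item, seen=[]):  # hypothetical; the default list is built once
    seen.append(item)
    return seen


first = collect("a")
second = collect("b")

same_object = first is second  # both names refer to the one default list
```
+
+After the second call the shared list holds both items, and ``first`` shows
+them too because it is the same object as ``second``.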
+
+By definition, immutable objects such as numbers, strings, tuples, and ``None``
+are safe from change. Changes to mutable objects such as dictionaries, lists,
+and class instances can lead to confusion.
+
+Because of this feature, it is good programming practice to not use mutable
+objects as default values. Instead, use ``None`` as the default value and
+inside the function, check if the parameter is ``None`` and create a new
+list/dictionary/whatever if it is. For example, don't write::
+
+   def foo(mydict={}):
+       ...
+
+but::
+
+   def foo(mydict=None):
+       if mydict is None:
+           mydict = {}  # create a new dict for local namespace
+
+This feature can be useful. When you have a function that's time-consuming to
+compute, a common technique is to cache the parameters and the resulting value
+of each call to the function, and return the cached value if the same value is
+requested again. This is called "memoizing", and can be implemented like this::
+
+   # Callers will never provide a third parameter for this function.
+   def expensive(arg1, arg2, _cache={}):
+       if (arg1, arg2) in _cache:
+           return _cache[(arg1, arg2)]
+
+       # Calculate the value
+       result = ... expensive computation ...
+       _cache[(arg1, arg2)] = result  # Store result in the cache
+       return result
+
+You could use a global variable containing a dictionary instead of the default
+value; it's a matter of taste.
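+
+To see the cache at work, here is a self-contained sketch. The function
+``slow_add`` and the ``calls`` list are hypothetical, and the addition merely
+stands in for a genuinely expensive computation:
+
```python
calls = []  # records every time the real computation runs


def slow_add(arg1, arg2, _cache={}):
    if (arg1, arg2) in _cache:
        return _cache[(arg1, arg2)]

    calls.append((arg1, arg2))
    result = arg1 + arg2  # stand-in for the expensive computation
    _cache[(arg1, arg2)] = result
    return result


first = slow_add(2, 3)
second = slow_add(2, 3)  # served from the cache, no new computation
third = slow_add(4, 5)
```
+
+Only two entries end up in ``calls``, because the repeated call with ``(2, 3)``
+never reaches the computation.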
+
+
+Why is there no goto?
+---------------------
+
+You can use exceptions to provide a "structured goto" that even works across
+function calls. Many feel that exceptions can conveniently emulate all
+reasonable uses of the "go" or "goto" constructs of C, Fortran, and other
+languages. For example::
+
+   class label(Exception): pass  # declare a label
+
+   try:
+       ...
+       if condition: raise label()  # goto label
+       ...
+   except label:  # where to goto
+       pass
+   ...
+
+This doesn't allow you to jump into the middle of a loop, but that's usually
+considered an abuse of goto anyway. Use sparingly.
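+
+One place this pattern is sometimes used is leaving nested loops in a single
+step. The sketch below assumes a hypothetical ``Found`` exception and a small
+made-up ``grid``:
+
```python
class Found(Exception):
    """Hypothetical exception acting as the jump target."""


grid = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]  # made-up data
position = None

try:
    for row_index, row in enumerate(grid):
        for col_index, value in enumerate(row):
            if value == 5:
                position = (row_index, col_index)
                raise Found()  # leave both loops at once
except Found:
    pass  # execution resumes here, after the loops
```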
+
+
+Why can't raw strings (r-strings) end with a backslash?
+-------------------------------------------------------
+
+More precisely, they can't end with an odd number of backslashes: the unpaired
+backslash at the end escapes the closing quote character, leaving an
+unterminated string.
+
+Raw strings were designed to ease creating input for processors (chiefly regular
+expression engines) that want to do their own backslash escape processing. Such
+processors consider an unmatched trailing backslash to be an error anyway, so
+raw strings disallow that. In return, they allow you to pass on the string
+quote character by escaping it with a backslash. These rules work well when
+r-strings are used for their intended purpose.
+
+If you're trying to build Windows pathnames, note that all Windows system calls
+accept forward slashes too::
+
+   f = open("/mydir/file.txt")  # works fine!
+
+If you're trying to build a pathname for a DOS command, try e.g. one of ::
+
+   dir = r"\this\is\my\dos\dir" "\\"
+   dir = r"\this\is\my\dos\dir\ "[:-1]
+   dir = "\\this\\is\\my\\dos\\dir\\"
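+
+A short check, using made-up variable names, that the three spellings build
+the same string and that an escaped quote in a raw string keeps its backslash:
+
```python
quote_kept = r"\""  # two characters: a backslash and a double quote

dir_a = r"\this\is\my\dos\dir" "\\"    # raw part plus an ordinary literal
dir_b = r"\this\is\my\dos\dir\ "[:-1]  # slice off the trailing space
dir_c = "\\this\\is\\my\\dos\\dir\\"   # no raw string at all

all_equal = dir_a == dir_b == dir_c
```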
+
+
+Why doesn't Python have a "with" statement for attribute assignments?
+---------------------------------------------------------------------
+
+Python has a 'with' statement that wraps the execution of a block, calling code
+on the entrance and exit from the block. Some languages have a construct that
+looks like this::
+
+   with obj:
+       a = 1               # equivalent to obj.a = 1
+       total = total + 1   # obj.total = obj.total + 1
+
+In Python, such a construct would be ambiguous.
+
+Other languages, such as Object Pascal, Delphi, and C++, use static types, so
+it's possible to know, in an unambiguous way, what member is being assigned
+to. This is the main point of static typing -- the compiler *always* knows the
+scope of every variable at compile time.
+
+Python uses dynamic types. It is impossible to know in advance which attribute
+will be referenced at runtime. Member attributes may be added or removed from
+objects on the fly. This makes it impossible to know, from a simple reading,
+what attribute is being referenced: a local one, a global one, or a member
+attribute?
+
+For instance, take the following incomplete snippet::
+
+   def foo(a):
+       with a:
+           print x
+
+The snippet assumes that "a" must have a member attribute called "x". However,
+there is nothing in Python that tells the interpreter this. What should happen
+if "a" is, let us say, an integer? If there is a global variable named "x",
+will it be used inside the with block? As you see, the dynamic nature of Python
+makes such choices much harder.
+
+The primary benefit of "with" and similar language features (reduction of code
+volume) can, however, easily be achieved in Python by assignment. Instead of::
+
+   function(args).dict[index][index].a = 21
+   function(args).dict[index][index].b = 42
+   function(args).dict[index][index].c = 63
+
+write this::
+
+   ref = function(args).dict[index][index]
+   ref.a = 21
+   ref.b = 42
+   ref.c = 63
+
+This also has the side-effect of increasing execution speed because name
+bindings are resolved at run-time in Python, and the second version only needs
+to perform the resolution once. If the referenced object does not have a, b and
+c attributes, of course, the end result is still a run-time exception.
+
+
+Why are colons required for the if/while/def/class statements?
+--------------------------------------------------------------
+
+The colon is required primarily to enhance readability (one of the results of
+the experimental ABC language). Consider this::
+
+   if a == b
+       print a
+
+versus ::
+
+   if a == b:
+       print a
+
+Notice how the second one is slightly easier to read. Notice further how a
+colon sets off the example in this FAQ answer; it's a standard usage in English.
+
+Another minor reason is that the colon makes it easier for editors with syntax
+highlighting; they can look for colons to decide when indentation needs to be
+increased instead of having to do a more elaborate parsing of the program text.
+
+
+Why does Python allow commas at the end of lists and tuples?
+------------------------------------------------------------
+
+Python lets you add a trailing comma at the end of lists, tuples, and
+dictionaries::
+
+   [1, 2, 3,]
+   ('a', 'b', 'c',)
+   d = {
+       "A": [1, 5],
+       "B": [6, 7],  # last trailing comma is optional but good style
+   }
+
+
+There are several reasons to allow this.
+
+When you have a literal value for a list, tuple, or dictionary spread across
+multiple lines, it's easier to add more elements because you don't have to
+remember to add a comma to the previous line. The lines can also be sorted in
+your editor without creating a syntax error.
+
+Accidentally omitting the comma can lead to errors that are hard to diagnose.
+For example::
+
+   x = [
+       "fee",
+       "fie"
+       "foo",
+       "fum"
+   ]
+
+This list looks like it has four elements, but it actually contains three:
+"fee", "fiefoo" and "fum". Always adding the comma avoids this source of error.
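+
+The effect is easy to confirm. In this sketch (the variable names are
+illustrative), the adjacent string literals ``"fie"`` and ``"foo"`` are joined
+by implicit concatenation, while the version with a comma after every element
+keeps all four:
+
```python
missing_comma = [
    "fee",
    "fie"
    "foo",
    "fum"
]

with_commas = [
    "fee",
    "fie",
    "foo",
    "fum",
]
```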
+
+Allowing the trailing comma may also make programmatic code generation easier.