-rw-r--r-- | doc/build/content/datamapping.txt.old | 1053
-rw-r--r-- | doc/build/content/dbengine.txt | 2
-rw-r--r-- | doc/build/content/mappers.txt | 13
-rw-r--r-- | doc/build/content/metadata.txt | 2
-rw-r--r-- | doc/build/content/session.txt | 538
-rw-r--r-- | doc/build/content/unitofwork.txt | 493
-rw-r--r-- | doc/build/genhtml.py | 2
-rw-r--r-- | lib/sqlalchemy/orm/session.py | 64
8 files changed, 578 insertions, 1589 deletions
diff --git a/doc/build/content/datamapping.txt.old b/doc/build/content/datamapping.txt.old deleted file mode 100644 index 13dff70d3..000000000 --- a/doc/build/content/datamapping.txt.old +++ /dev/null @@ -1,1053 +0,0 @@ -[alpha_api]: javascript:alphaApi() -[alpha_implementation]: javascript:alphaImplementation() - -Data Mapping {@name=datamapping} -============ - -### Basic Data Mapping {@name=datamapping} - -Data mapping describes the process of defining `Mapper` objects, which associate `Table` objects with user-defined classes. - -When a `Mapper` is created to associate a `Table` object with a class, all of the columns defined in the `Table` object are associated with the class via property accessors, which add overriding functionality to the normal process of setting and getting object attributes. These property accessors keep track of changes to object attributes, so that they may be stored to the database when the application "flushes" the current state of objects. This is known as the *Unit of Work* pattern. - -### Synopsis {@name=synopsis} - -Starting with a `Table` definition and a minimal class construct, the two are associated with each other via the `mapper()` function [[api](rel:docstrings_sqlalchemy.orm.mapper_Mapper)], which generates an object called a `Mapper`. SA associates the class and all instances of that class with this particular `Mapper`, which is stored in a registry such that SQLAlchemy knows how to find it automatically. - - {python} - from sqlalchemy import * - - # metadata - meta = MetaData() - - # table object - users_table = Table('users', meta, - Column('user_id', Integer, primary_key=True), - Column('user_name', String(16)), - Column('fullname', String(100)), - Column('password', String(20)) - ) - - # class definition - class User(object): - pass - - # create a mapper and associate it with the User class. - mapper(User, users_table) - -That's all for configuration. 
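Conceptually, the instrumentation that `mapper()` applies can be pictured as plain Python descriptors that record which instances have changed. The following is a highly simplified, hypothetical sketch of the idea only — `TrackedAttribute` and `ToySession` are illustrative names, not part of SQLAlchemy:

```python
class TrackedAttribute(object):
    """Descriptor that records, in a session, which instances were modified."""
    def __init__(self, name):
        self.name = name

    def __get__(self, obj, owner):
        if obj is None:
            return self
        return obj.__dict__.get(self.name)

    def __set__(self, obj, value):
        obj.__dict__[self.name] = value
        # mark the owning instance as "dirty" in its session, if any
        session = getattr(obj, '_session', None)
        if session is not None:
            session.dirty.add(obj)

class ToySession(object):
    """Tracks dirty instances; a real flush() would emit SQL, then clear."""
    def __init__(self):
        self.dirty = set()

    def save(self, obj):
        obj._session = self

    def flush(self):
        # a real flush would issue an UPDATE per dirty instance here
        self.dirty.clear()

class User(object):
    user_name = TrackedAttribute('user_name')

session = ToySession()
user = User()
session.save(user)
user.user_name = 'fred jones'   # the instance is now marked dirty
```

After `flush()`, the toy session's dirty set is empty again, mirroring the behavior described in the synopsis below.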
Next, we will create an `Engine` and bind it to a `Session`, which represents a local collection of mapped objects to be operated upon. - - {python} - # engine - engine = create_engine("sqlite:///mydb.db") - - # session - session = create_session(bind=engine) - -The `session` represents a "workspace" which can load objects and persist changes to the database. Note also that the `bind` parameter is optional; if the underlying `Table` objects are bound as described in [metadata_tables_binding](rel:metadata_tables_binding), it's not needed. A `Session` [[doc](rel:unitofwork)] [[api](rel:docstrings_sqlalchemy.orm.session_Session)] is best created as local to a particular set of related data operations, such as scoped within a function call, or within a single application request cycle. Next we illustrate a rudimentary query which will load a single object instance. We will modify one of its attributes and persist the change back to the database. - - {python} - # select - {sql}user = session.query(User).filter_by(user_name='fred')[0] - SELECT users.user_id AS users_user_id, users.user_name AS users_user_name, - users.fullname AS users_fullname, users.password AS users_password - FROM users - WHERE users.user_name = :users_user_name ORDER BY users.oid LIMIT 1 - {'users_user_name': 'fred'} - - # modify - user.user_name = 'fred jones' - - # flush - saves everything that changed - # within the scope of our Session - {sql}session.flush() - BEGIN - UPDATE users SET user_name=:user_name - WHERE users.user_id = :user_id - [{'user_name': 'fred jones', 'user_id': 1}] - COMMIT - -Things to note from the above include that the loaded `User` object has an attribute named `user_name` on it, which corresponds to the `user_name` column in `users_table`; this attribute was configured at the class level by the `Mapper`, as part of its post-initialization process (this process occurs normally when the mapper is first used). 
Our modify operation on this attribute caused the object to be marked as "dirty", which was picked up automatically within the subsequent `flush()` process. The `flush()` is the point at which all changes to objects within the `Session` are persisted to the database; afterwards, the `User` object is no longer marked as "dirty" until it is again modified. - -### The Query Object {@name=query} - -The method `session.query(class_or_mapper)` returns a `Query` object [[api](rel:docstrings_sqlalchemy.orm.query_Query)]. `Query` implements methods which are used to produce and execute select statements tailored for loading object instances. It always returns object instances: usually as a list, sometimes as a single scalar object, and in some cases as lists of tuples containing multiple kinds of objects or individual scalar values. - -A `Query` is created from the `Session`, relative to a particular class we wish to load. - - {python} - # get a query from a Session based on class: - query = session.query(User) - -Alternatively, an actual `Mapper` instance can be specified instead of a class: - - {python} - # locate the mapper corresponding to the User class - usermapper = class_mapper(User) - - # create query against the User mapper - query = session.query(usermapper) - -A query which joins across multiple tables may also be used to request multiple entities, such as: - - {python} - query = session.query(User, Address) - -Once we have a query, we can start loading objects. The Query object, when first created, represents all the instances of its main class. You can iterate through it directly: - - {python} - {sql}for user in session.query(User): - print user.user_name - SELECT users.user_id AS users_user_id, users.user_name AS users_user_name, - users.fullname AS users_fullname, users.password AS users_password - FROM users - ORDER BY users.oid - {} - -...and the SQL will be issued at the point where the query is evaluated as a list. 
To narrow results, the two main methods are `filter()` and `filter_by()`. `filter_by()` uses keyword arguments, which translate into equality clauses joined together via 'AND': - - {python} - {sql}for user in session.query(User).filter_by(user_name='john', fullname='John Smith'): - print user.user_name - SELECT users.user_id AS users_user_id, users.user_name AS users_user_name, - users.fullname AS users_fullname, users.password AS users_password - FROM users - WHERE users.user_name = :users_user_name AND users.fullname = :users_fullname - ORDER BY users.oid - {'users_user_name': 'john', 'users_fullname': 'John Smith'} - -`filter()`, on the other hand, works with constructed SQL expressions, like those described in [sql](rel:sql): - - {python} - {sql}for user in session.query(User).filter(users_table.c.user_name=='john'): - print user.user_name - SELECT users.user_id AS users_user_id, users.user_name AS users_user_name, - users.fullname AS users_fullname, users.password AS users_password - FROM users - WHERE users.user_name = :users_user_name - ORDER BY users.oid - {'users_user_name': 'john'} - -Evaluating the query using an array slice returns a new Query which will apply LIMIT and OFFSET clauses when iterated: - - {python} - {sql}for u in session.query(User)[1:3]: - print u - SELECT users.user_id AS users_user_id, users.user_name AS users_user_name, - users.fullname AS users_fullname, users.password AS users_password - FROM users ORDER BY users.oid - LIMIT 2 OFFSET 1 - {} - -A single array index adds LIMIT/OFFSET and returns a result immediately: - - {python} - {sql}user = session.query(User).filter(users_table.c.user_name=='john')[2] - SELECT users.user_id AS users_user_id, users.user_name AS users_user_name, - users.fullname AS users_fullname, users.password AS users_password - FROM users - WHERE users.user_name = :users_user_name - ORDER BY users.oid - LIMIT 1 OFFSET 2 - {'users_user_name': 'john'} - -There are also methods which will immediately issue the SQL represented by a `Query` 
without using an iterative context or array index; these methods are `all()`, `one()`, and `first()`. `all()` returns a list of all instances, `one()` returns exactly one instance as a scalar, and `first()` returns the first instance also as a scalar: - - {python} - query = session.query(User).filter(users_table.c.user_name=='john') - - # get all results into a list - allusers = query.all() - - # get the first user - user = query.first() - - # get exactly one user; raises an exception if not exactly one result is returned - user = query.one() - -Note that most methods on `Query` are *generative*, in that they return a new `Query` instance that is a modified version of the previous one. It's only when you evaluate the query in an iterative context, use an array index, or call `all()`, `first()`, or `one()` (as well as some other methods we'll cover later), that SQL is issued. For example, you can issue `filter()` or `filter_by()` as many times as needed; the various criteria are joined together using `AND`: - - {python} - result = session.query(User).filter(users_table.c.user_id>224).filter_by(user_name='john'). - {sql} filter(users_table.c.fullname=='John Smith').all() - SELECT users.user_id AS users_user_id, users.user_name AS users_user_name, - users.fullname AS users_fullname, users.password AS users_password - FROM users - WHERE users.user_id>:users_user_id AND users.user_name = :users_user_name - AND users.fullname = :users_fullname - ORDER BY users.oid - {'users_user_name': 'john', 'users_fullname': 'John Smith', 'users_user_id': 224} - -If you need to use other conjunctions besides `AND`, all SQL conjunctions are available explicitly within expressions, such as `and_()` and `or_()`, when using `filter()`: - - {python} - result = session.query(User).filter( - and_(users_table.c.user_id>224, or_(users_table.c.user_name=='john', users_table.c.user_name=='ed')) - ).all() - -Sometimes, constructing criteria via expressions can be cumbersome. 
For a quick, string-based expression, the `filter()` method can also accommodate straight text: - - {python} - {sql}result = session.query(User).filter("user_id>224").all() - SELECT users.user_id AS users_user_id, users.user_name AS users_user_name, - users.fullname AS users_fullname, users.password AS users_password - FROM users - WHERE users.user_id>224 - ORDER BY users.oid - {} - -When using text, bind parameters can be specified the same way as in a `text()` clause, using a colon. To specify the bind parameter values, use the `params()` method: - - {python} - {sql}result = session.query(User).filter("user_id>:value and user_name=:name").params(value=224, name='jack').all() - SELECT users.user_id AS users_user_id, users.user_name AS users_user_name, - users.fullname AS users_fullname, users.password AS users_password - FROM users - WHERE users.user_id>:value and user_name=:name - ORDER BY users.oid - {'value': 224, 'name': 'jack'} - -It's also straightforward to use an entirely string-based statement, using `from_statement()`; just ensure that the columns clause of the statement contains the column names normally used by the mapper (below illustrated using an asterisk): - - {python} - {sql}result = session.query(User).from_statement("SELECT * FROM users").all() - SELECT * FROM users - {} - -`from_statement()` can also accommodate full `select()` constructs: - - {python} - result = session.query(User).from_statement( - select([users_table], users_table.c.user_name<'e', having=users_table.c.user_name==func.max(users_table.c.user_name), group_by=[c for c in users_table.c]) - {sql} ).all() - SELECT users.user_id AS users_user_id, users.user_name AS users_user_name, - users.fullname AS users_fullname, users.password AS users_password - FROM users - WHERE users.user_name < :users_user_name HAVING users.user_name = max(users.user_name) - GROUP BY users.user_id, users.user_name, users.fullname, users.password - ORDER BY users.oid - {'users_user_name': 'e'} - -The current criterion represented by a `Query` can be 
distilled into a count of rows using `count()`. This is another function which executes SQL immediately, returning an integer result: - - {python} - {sql}num = session.query(User).filter(users_table.c.user_id>224).count() - SELECT count(users.user_id) FROM users WHERE users.user_id>:users_user_id - {'users_user_id': 224} - -To add limit and offset values explicitly at any time, you can use `limit()` and `offset()`: - - {python} - {sql}result = session.query(User).limit(20).offset(5).all() - SELECT users.user_id AS users_user_id, users.user_name AS users_user_name, - users.fullname AS users_fullname, users.password AS users_password - FROM users ORDER BY users.oid - LIMIT 20 OFFSET 5 - {} - -Ordering is applied, using `Column` objects and related SQL constructs, with `order_by()`: - - {python} - query = session.query(User).order_by(desc(users_table.c.user_name)) - {sql}for user in query: - print user - SELECT users.user_id AS users_user_id, users.user_name AS users_user_name, - users.fullname AS users_fullname, users.password AS users_password - FROM users ORDER BY users.user_name DESC - {} - -There's also a way to combine scalar results with objects, using `add_column()`. This is often used for functions and aggregates. When `add_column()` (or its cousin `add_entity()`, described later) is used, tuples are returned: - - {python} - for r in session.query(User).add_column(func.max(users_table.c.user_name)).group_by([c for c in users_table.c]): - print "user:", r[0] - print "max name:", r[1] - -Later in this chapter, we'll discuss how to configure relations between mapped classes. Once that's done, we'll discuss how to return multiple objects at once, as well as how to join, in [datamapping_joins](rel:datamapping_joins). 
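The generative behavior described above — each call returning a new, further-restricted query, with SQL rendered only at evaluation time — can be sketched in a few lines of plain Python. This is a toy illustration under assumed names (`ToyQuery` is not SQLAlchemy's `Query`):

```python
class ToyQuery(object):
    """Accumulates criteria generatively; renders its SQL only when evaluated."""
    def __init__(self, table, criteria=()):
        self.table = table
        self.criteria = tuple(criteria)

    def filter(self, clause):
        # generative: return a *new* query with the clause appended;
        # self is left unchanged
        return ToyQuery(self.table, self.criteria + (clause,))

    def all(self):
        # the SQL string is only rendered here, at evaluation time
        sql = "SELECT * FROM %s" % self.table
        if self.criteria:
            sql += " WHERE " + " AND ".join(self.criteria)
        return sql

q = ToyQuery("users")
q2 = q.filter("user_id > 224").filter("user_name = 'john'")
```

Because each `filter()` returns a fresh object, the original `q` is still unrestricted after `q2` is built, just as with the real `Query`.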
- -#### Loading by Primary Key {@name=primarykey} - -The `get()` method loads a single instance, given the primary key value of the desired entity: - - {python} - # load user with primary key 15 - user = query.get(15) - -The `get()` method, because it has the actual primary key value of the instance, can return an already-loaded instance from the `Session` without performing any SQL. It is the only result-returning method on `Query` that does not issue SQL to the database in all cases. - -To issue a composite primary key to `get()`, use a tuple. The order of the arguments matches that of the primary key columns of the table: - - {python} - myobj = query.get((27, 3, 'receipts')) - -Another special method on `Query` is `load()`. This method has the same signature as `get()`, except it always **refreshes** the returned instance with the latest data from the database. This is in fact a unique behavior, since as we will see in the [unitofwork](rel:unitofwork) chapter, most `Query` methods do not reload the contents of instances which are already present in the session. - -#### Column Objects Available via their Mapped Class {@name=columnsonclass} - -Some of the examples above illustrate the use of the mapper's Table object to provide the columns for a WHERE clause. These columns are also accessible from the mapped class directly. When a mapper is assigned to a class, it also attaches a special property accessor `c` to the class itself, which can be used just like that of a `Table` object to access the columns of the table: - - {python} - userlist = session.query(User).filter(User.c.user_id==12).first() - -In version 0.4 of SQLAlchemy, the "c" prefix will no longer be needed. - -### Saving Objects {@name=saving} - -When objects corresponding to mapped classes are created or manipulated, all changes are logged by the `Session` object. The changes are then written to the database when an application calls `flush()`. 
This pattern is known as a *Unit of Work*, and has many advantages over saving individual objects or attributes on those objects with individual method invocations. Domain models can be built with far greater complexity with no concern over the order of saves and deletes, excessive database round-trips and write operations, or deadlocking issues. The `flush()` operation batches its SQL statements into a transaction, and can also perform optimistic concurrency checks (using a version id column) to ensure the proper number of rows were in fact affected. - -The Unit of Work is a powerful tool, and has some important concepts that should be understood in order to use it effectively. See the [unitofwork](rel:unitofwork) section for a full description of all its operations. - -When a mapper is created, the target class has its mapped properties decorated by specialized property accessors that track changes. New objects by default must be explicitly added to the `Session` using the `save()` method: - - {python} - mapper(User, users_table) - - # create a new User - myuser = User() - myuser.user_name = 'jane' - myuser.password = 'hello123' - - # create another new User - myuser2 = User() - myuser2.user_name = 'ed' - myuser2.password = 'lalalala' - - # create a Session and save them - sess = create_session() - sess.save(myuser) - sess.save(myuser2) - - # load a third User from the database - {sql}myuser3 = sess.query(User).filter_by(user_name='fred').all()[0] - SELECT users.user_id AS users_user_id, - users.user_name AS users_user_name, users.password AS users_password - FROM users WHERE users.user_name = :users_user_name - {'users_user_name': 'fred'} - - myuser3.user_name = 'fredjones' - - # save all changes - {sql}sess.flush() - UPDATE users SET user_name=:user_name - WHERE users.user_id =:users_user_id - [{'users_user_id': 1, 'user_name': 'fredjones'}] - INSERT INTO users (user_name, password) VALUES (:user_name, :password) - {'password': 'hello123', 'user_name': 'jane'} - 
INSERT INTO users (user_name, password) VALUES (:user_name, :password) - {'password': 'lalalala', 'user_name': 'ed'} - -The mapped class can also specify whatever methods and/or constructor it wants: - - {python} - class User(object): - def __init__(self, user_name, password): - self.user_id = None - self.user_name = user_name - self.password = password - def get_name(self): - return self.user_name - def __repr__(self): - return "User id %s name %s password %s" % (repr(self.user_id), - repr(self.user_name), repr(self.password)) - mapper(User, users_table) - - sess = create_session() - u = User('john', 'foo') - sess.save(u) - {sql}sess.flush() - INSERT INTO users (user_name, password) VALUES (:user_name, :password) - {'password': 'foo', 'user_name': 'john'} - - >>> u - User id 1 name 'john' password 'foo' - -Note that the **__init__() method is not called when the instance is loaded**. This is so that classes can define operations that are specific to their initial construction which are not re-called when the object is restored from the database, and is similar in concept to how Python's `pickle` module calls `__new__()` when deserializing instances. To allow `__init__()` to be called at object load time, or to define any other sort of on-load operation, create a `MapperExtension` which supplies the `create_instance()` method (see [advdatamapping_extending](rel:advdatamapping_extending), as well as the example in the FAQ). - -### Defining and Using Relationships {@name=relations} - -So that covers how to map the columns in a table to an object, how to load objects, create new ones, and save changes. The next step is how to define an object's relationships to other database-persisted objects. This is done via the `relation` function [[doc](rel:advdatamapping_properties_relationoptions)][[api](rel:docstrings_sqlalchemy.orm_modfunc_relation)] provided by the `orm` module. 
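Before moving on to relationships: the `__init__()` bypass noted above corresponds, in plain Python, to allocating the object via `__new__()` and populating its attributes directly, much as `pickle` does. A small self-contained illustration (no ORM involved):

```python
class User(object):
    def __init__(self, user_name, password):
        self.user_name = user_name
        self.password = password
        self.init_called = True

# normal construction runs __init__()
u1 = User('john', 'foo')

# what an ORM load (or pickle) conceptually does: allocate the
# instance without calling __init__(), then populate it from the row
u2 = User.__new__(User)
u2.__dict__.update({'user_name': 'jane', 'password': 'bar'})
```

Here `u2` never gets the `init_called` flag, which is exactly why load-time setup logic has to go somewhere other than `__init__()`.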
- -#### One to Many {@name=onetomany} - -With our User class, let's also define the User as having one or more mailing addresses. First, the table metadata: - - {python} - from sqlalchemy import * - - metadata = MetaData() - - # define user table - users_table = Table('users', metadata, - Column('user_id', Integer, primary_key=True), - Column('user_name', String(16)), - Column('password', String(20)) - ) - - # define user address table - addresses_table = Table('addresses', metadata, - Column('address_id', Integer, primary_key=True), - Column('user_id', Integer, ForeignKey("users.user_id")), - Column('street', String(100)), - Column('city', String(80)), - Column('state', String(2)), - Column('zip', String(10)) - ) - -Of importance here is the addresses table's definition of a *foreign key* relationship to the users table, linking the two tables in a parent-child relationship via the user_id column. When a `Mapper` wants to indicate a relation of one object to another, the `ForeignKey` relationships are the default method by which the relationship is determined (options also exist to describe the relationships explicitly). - -So let's define two classes, the familiar `User` class, as well as an `Address` class: - - {python} - class User(object): - def __init__(self, user_name, password): - self.user_name = user_name - self.password = password - - class Address(object): - def __init__(self, street, city, state, zip): - self.street = street - self.city = city - self.state = state - self.zip = zip - -And then a `Mapper` that will define a relationship of the `User` and the `Address` classes to each other as well as their table metadata. 
We will add an additional mapper keyword argument `properties` which is a dictionary relating the names of class attributes to database relationships, in this case a `relation` object against a newly defined mapper for the Address class: - - {python} - mapper(Address, addresses_table) - mapper(User, users_table, properties = { - 'addresses' : relation(Address) - } - ) - -Let's do some operations with these classes and see what happens: - - {python} - engine = create_engine('sqlite:///mydb.db') - - # create tables - metadata.create_all(engine) - - session = create_session(bind=engine) - - u = User('jane', 'hihilala') - u.addresses.append(Address('123 anywhere street', 'big city', 'UT', '76543')) - u.addresses.append(Address('1 Park Place', 'some other city', 'OK', '83923')) - - session.save(u) - session.flush() - {opensql}INSERT INTO users (user_name, password) VALUES (:user_name, :password) - {'password': 'hihilala', 'user_name': 'jane'} - INSERT INTO addresses (user_id, street, city, state, zip) VALUES (:user_id, :street, :city, :state, :zip) - {'city': 'big city', 'state': 'UT', 'street': '123 anywhere street', 'user_id':1, 'zip': '76543'} - INSERT INTO addresses (user_id, street, city, state, zip) VALUES (:user_id, :street, :city, :state, :zip) - {'city': 'some other city', 'state': 'OK', 'street': '1 Park Place', 'user_id':1, 'zip': '83923'} - -A lot just happened there! The `Mapper` figured out how to relate rows in the addresses table to the users table, and also upon flush had to determine the proper order in which to insert rows. After the insert, all the `User` and `Address` objects have their new primary and foreign key attributes populated. - -Also notice that when we created a `Mapper` on the `User` class which defined an `addresses` relation, the newly created `User` instance magically had an "addresses" attribute which behaved like a list. 
This list is in reality a Python `property` which will return an instance of `sqlalchemy.orm.attributes.InstrumentedList`. This is a generic collection-bearing object which can represent lists, sets, dictionaries, or any user-defined collection class. By default it represents a list: - - {python} - del u.addresses[1] - u.addresses.append(Address('27 New Place', 'Houston', 'TX', '34839')) - - session.flush() - - {opensql}UPDATE addresses SET user_id=:user_id - WHERE addresses.address_id = :addresses_address_id - [{'user_id': None, 'addresses_address_id': 2}] - INSERT INTO addresses (user_id, street, city, state, zip) - VALUES (:user_id, :street, :city, :state, :zip) - {'city': 'Houston', 'state': 'TX', 'street': '27 New Place', 'user_id': 1, 'zip': '34839'} - -Note that when creating a relation with the `relation()` function, the target can either be a class, in which case the primary mapper for that class is used as the target, or a `Mapper` instance itself, as returned by the `mapper()` function. - -#### Lifecycle Relations {@name=lifecycle} - -In the previous example, a single address was removed from the `addresses` attribute of a `User` object, resulting in the corresponding database row being updated to have a user_id of `None`. But now there's a mailing address with no user_id floating around in the database, of no use to anyone. How can we avoid this? 
This is achieved by using the `cascade` parameter of `relation`: - - {python} - clear_mappers() # clear mappers from the previous example - mapper(Address, addresses_table) - mapper(User, users_table, properties = { - 'addresses' : relation(Address, cascade="all, delete-orphan") - } - ) - - del u.addresses[1] - u.addresses.append(Address('27 New Place', 'Houston', 'TX', '34839')) - - session.flush() - {opensql}INSERT INTO addresses (user_id, street, city, state, zip) - VALUES (:user_id, :street, :city, :state, :zip) - {'city': 'Houston', 'state': 'TX', 'street': '27 New Place', 'user_id': 1, 'zip': '34839'} - DELETE FROM addresses WHERE addresses.address_id = :address_id - [{'address_id': 2}] - -In this case, with the `delete-orphan` **cascade rule** set, the element that was removed from the addresses list was also removed from the database. Specifying `cascade="all, delete-orphan"` means that every persistence operation performed on the parent object will be *cascaded* to the child object or objects handled by the relation, and additionally that each child object cannot exist without being attached to a parent. Such a relationship indicates that the **lifecycle** of the `Address` objects is bounded by that of their parent `User` object. - -Cascading is described fully in [unitofwork_cascade](rel:unitofwork_cascade). - -#### Backreferences {@name=backreferences} - -By creating relations with the `backref` keyword, a bi-directional relationship can be created which will keep both ends of the relationship updated automatically, independently of database operations. 
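The bookkeeping a backreference performs can be pictured in plain Python: appending to the collection side sets the scalar side, and setting the scalar side appends to the collection. The following toy sketch (these classes are illustrative only, not SQLAlchemy's mechanism) shows the idea:

```python
class BackrefList(list):
    """Toy collection: appending an Address also sets its 'user' reference."""
    def __init__(self, owner):
        list.__init__(self)
        self._owner = owner

    def append(self, address):
        list.append(self, address)
        # write the instance dict directly to avoid re-triggering the property
        address.__dict__['user'] = self._owner

class User(object):
    def __init__(self):
        self.addresses = BackrefList(self)

class Address(object):
    def _get_user(self):
        return self.__dict__.get('user')

    def _set_user(self, user):
        # setting the scalar side also appends to the collection side
        if self not in user.addresses:
            user.addresses.append(self)

    user = property(_get_user, _set_user)

u = User()
a1 = Address()
a2 = Address()
u.addresses.append(a1)  # collection side: a1.user is set automatically
a2.user = u             # scalar side: a2 is appended to u.addresses
```

Either end can be manipulated and the other stays in sync, without any database involvement — which is the behavior the `backref` keyword provides.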
Below, the `User` mapper is created with an `addresses` property, and the corresponding `Address` mapper receives a "backreference" to the `User` object via the property name `user`: - - {python} - mapper(Address, addresses_table) - mapper(User, users_table, properties = { - 'addresses' : relation(Address, backref='user') - } - ) - - u = User('fred', 'hi') - a1 = Address('123 anywhere street', 'big city', 'UT', '76543') - a2 = Address('1 Park Place', 'some other city', 'OK', '83923') - - # append a1 to u - u.addresses.append(a1) - - # attach u to a2 - a2.user = u - - # the bi-directional relation is maintained - >>> u.addresses == [a1, a2] - True - >>> a1.user is u and a2.user is u - True - -The backreference feature also works with many-to-many relationships, which are described later. When creating a backreference, a corresponding property (i.e. a second `relation()`) is placed on the child mapper. The default arguments to this property can be overridden using the `backref()` function: - - {python} - mapper(User, users_table) - mapper(Address, addresses_table, properties={ - 'user':relation(User, backref=backref('addresses', cascade="all, delete-orphan")) - }) - - -The `backref()` function is often used to set up a bi-directional one-to-one relationship. This is because the `relation()` function by default creates a "one-to-many" relationship when presented with a primary key/foreign key relationship, but the `backref()` function can redefine the `uselist` property to make it a scalar: - - {python} - mapper(User, users_table) - mapper(Address, addresses_table, properties={ - 'user' : relation(User, backref=backref('address', uselist=False)) - }) - -### Querying with Joins {@name=joins} - -When using mappers that have relationships to other mappers, the need to specify query criteria across multiple tables arises. SQLAlchemy provides several core techniques which offer this functionality. 
- -One way is just to build up the join criteria yourself. This is easy to do using `filter()`: - - {python} - {sql}l = session.query(User).filter(users_table.c.user_id==addresses_table.c.user_id). - filter(addresses_table.c.street=='123 Green Street').all() - SELECT users.user_id AS users_user_id, - users.user_name AS users_user_name, users.password AS users_password - FROM users, addresses - WHERE users.user_id=addresses.user_id - AND addresses.street=:addresses_street - ORDER BY users.oid - {'addresses_street': '123 Green Street'} - -Above, we specified selection criteria that included columns from both the `users` and the `addresses` table. Note that in this case, we had to specify not just the matching condition to the `street` column on `addresses`, but also the join condition between the `users` and `addresses` table. If we didn't do that, we'd get a *Cartesian product* of both tables. The `Query` object never "guesses" what kind of join you'd like to use, but makes it easy using the `join()` method which we'll get to in a moment. - -A way to specify joins very explicitly, using the SQL `join()` construct, is possible via the `select_from()` method on `Query`: - - {python} - {sql}l = session.query(User).select_from(users_table.join(addresses_table)). - filter(addresses_table.c.street=='123 Green Street').all() - SELECT users.user_id AS users_user_id, - users.user_name AS users_user_name, users.password AS users_password - FROM users JOIN addresses ON users.user_id=addresses.user_id - WHERE addresses.street=:addresses_street - ORDER BY users.oid - {'addresses_street': '123 Green Street'} - -But the easiest way to join is automatically, using the `join()` method on `Query`. Just give this method the path from A to B, using the name of a mapped relationship directly: - - {python} - {sql}l = session.query(User).join('addresses'). 
- filter(addresses_table.c.street=='123 Green Street').all() - SELECT users.user_id AS users_user_id, - users.user_name AS users_user_name, users.password AS users_password - FROM users JOIN addresses ON users.user_id=addresses.user_id - WHERE addresses.street=:addresses_street - ORDER BY users.oid - {'addresses_street': '123 Green Street'} - -Each time `join()` is called on `Query`, the **joinpoint** of the query is moved to be that of the endpoint of the join. As above, when we joined from `users_table` to `addresses_table`, all subsequent criteria used by `filter()` or `filter_by()` are applied against the `addresses` table. If we want to filter back on the starting table again, we can use the `reset_joinpoint()` method: - - {python} - l = session.query(User).join('addresses'). - filter_by(street='123 Green Street'). - reset_joinpoint().filter_by(user_name='ed').all() - -With `reset_joinpoint()`, we can also issue new `join()`s which will start back from the root table. - -In all cases, we can get the `User` and the matching `Address` objects back at the same time, by telling the session we want both. This returns the results as a list of tuples: - - {python} - result = session.query(User, Address).join('addresses'). - filter(addresses_table.c.street=='123 Green Street').all() - for r in result: - print "User:", r[0] - print "Address:", r[1] - -The above syntax is shorthand for using the `add_entity()` method: - - {python} - session.query(User).add_entity(Address).join('addresses').all() - -To join across multiple relationships, specify them in a list. Below, we load a `ShoppingCart`, limiting its `cartitems` collection to the single item which has a `price` object whose `amount` column is 47.95: - - {python} - cart = session.query(ShoppingCart).join(['cartitems', 'price']).filter_by(amount=47.95).one() - -`filter_by()` can also generate joins in some cases, such as when comparing to an object instance: - - {python} - # get an instance of Address. 
assume its primary key identity - # is 12. - someaddress = session.query(Address).filter_by(street='123 Green Street').one() - - # look for User instances which have the - # "someaddress" instance in their "addresses" collection - {sql}l = session.query(User).filter_by(addresses=someaddress).all() - SELECT users.user_id AS users_user_id, - users.user_name AS users_user_name, users.password AS users_password - FROM users, addresses - WHERE users.user_id=addresses.user_id - AND addresses.address_id=:addresses_address_id - ORDER BY users.oid - {'addresses_address_id': 12} - -You can also create joins in "reverse", that is, to find an object with a certain parent. This is accomplished using `with_parent()`: - - {python} - # load a user - someuser = session.query(User).get(2) - - # load an address with that user as a parent and email address foo@bar.com - {sql}someaddresses = session.query(Address).with_parent(someuser). - filter_by(email_address="foo@bar.com").all() - SELECT addresses.address_id AS addresses_address_id, - addresses.user_id AS addresses_user_id, addresses.street AS addresses_street, - addresses.city AS addresses_city, addresses.state AS addresses_state, - addresses.zip AS addresses_zip FROM addresses - WHERE addresses.email_address = :addresses_email_address AND - addresses.user_id = :users_user_id ORDER BY addresses.oid - {'users_user_id': 2, 'addresses_email_address': 'foo@bar.com'} - -Sometimes it's necessary to create repeated joins that are independent of each other, even though they reference the same tables. Using our one-to-many setup, an example is to locate users who have two particular email addresses. 
We can do this using table aliases: - - {python} - ad1 = addresses_table.alias('ad1') - ad2 = addresses_table.alias('ad2') - {sql}result = session.query(User).filter(and_( - ad1.c.user_id==users.c.user_id, - ad1.c.email_address=='foo@bar.com', - ad2.c.user_id==users.c.user_id, - ad2.c.email_address=='lala@yahoo.com' - )).all() - SELECT users.user_id AS users_user_id, - users.user_name AS users_user_name, users.password AS users_password - FROM users, addresses AS ad1, addresses AS ad2 - WHERE users.user_id=ad1.user_id - AND ad1.email_address=:ad1_email_address - AND users.user_id=ad2.user_id - AND ad2.email_address=:ad2_email_address - ORDER BY users.oid - {'ad1_email_address': 'foo@bar.com', 'ad2_email_address': 'lala@yahoo.com'} - -Version 0.4 of SQLAlchemy will include the ability to issue queries like the above with less verbosity. - -### Loading Relationships {@name=selectrelations} - -We've seen how the `relation` specifier affects the saving of an object and its child items, and also how it allows us to build joins. How do we get the actual related items loaded ? By default, the `relation()` function indicates that the related property should have a *lazy loader* attached when instances of the parent object are loaded from the database; this is just a callable function which, when accessed, will invoke a second SQL query to load the child objects of the parent. 
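The lazy-loader mechanism can be pictured in plain Python. The sketch below is purely illustrative, not SQLAlchemy's internal code; `LazyAttribute` and `load_addresses` are hypothetical names. It shows an attribute whose loader callable fires once, on first access, and caches the result:

```python
class LazyAttribute(object):
    """A descriptor which invokes a loader callable on first access."""
    def __init__(self, name, loader):
        self.name = name        # attribute name, e.g. 'addresses'
        self.loader = loader    # callable standing in for the second SQL query

    def __get__(self, instance, owner):
        if instance is None:
            return self
        if self.name not in instance.__dict__:
            # first access: run the loader and cache the result on the instance
            instance.__dict__[self.name] = self.loader(instance)
        return instance.__dict__[self.name]

class User(object):
    def __init__(self, user_id):
        self.user_id = user_id

queries_issued = []

def load_addresses(user):
    # stands in for "SELECT * FROM addresses WHERE user_id = :user_id"
    queries_issued.append(user.user_id)
    return ['123 Green Street']

User.addresses = LazyAttribute('addresses', load_addresses)

u = User(1)
assert queries_issued == []     # nothing loaded yet
addrs = u.addresses             # triggers the "second query"
addrs2 = u.addresses            # cached; no additional query
assert queries_issued == [1]
```

The mapper configuration below shows the real thing, using `relation()` in its default lazy mode.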
- - {python} - # define a user mapper - mapper(User, users_table, properties = { - 'addresses' : relation(Address) - }) - - # define an address mapper - mapper(Address, addresses_table) - - # select users where username is 'jane', get the first element of the list - # this will incur a load operation for the parent table - {sql}user = session.query(User).filter(User.c.user_name=='jane')[0] - SELECT users.user_id AS users_user_id, - users.user_name AS users_user_name, users.password AS users_password - FROM users WHERE users.user_name = :users_user_name ORDER BY users.oid - {'users_user_name': 'jane'} - - # iterate through the User object's addresses. this will incur an - # immediate load of those child items - {sql}for a in user.addresses: - SELECT addresses.address_id AS addresses_address_id, - addresses.user_id AS addresses_user_id, addresses.street AS addresses_street, - addresses.city AS addresses_city, addresses.state AS addresses_state, - addresses.zip AS addresses_zip FROM addresses - WHERE addresses.user_id = :users_user_id ORDER BY addresses.oid - {'users_user_id': 1} - - print repr(a) - -#### Eager Loading {@name=eagerload} - -Eager Loading is another way for relationships to be loaded. It describes the loading of parent and child objects across a relation using a single query. The purpose of eager loading is strictly one of performance enhancement; eager loading has **no impact** on the results of a query, except that when traversing child objects within the results, lazy loaders will not need to issue separate queries to load those child objects. - -With just a single parameter `lazy=False` specified to the relation object, the parent and child SQL queries can be joined together. 
- - {python} - mapper(Address, addresses_table) - mapper(User, users_table, properties = { - 'addresses' : relation(Address, lazy=False) - } - ) - - {sql}users = session.query(User).filter(User.c.user_name=='jane').all() - SELECT users.user_name AS users_user_name, users.password AS users_password, - users.user_id AS users_user_id, addresses_4fb8.city AS addresses_4fb8_city, - addresses_4fb8.address_id AS addresses_4fb8_address_id, addresses_4fb8.user_id AS addresses_4fb8_user_id, - addresses_4fb8.zip AS addresses_4fb8_zip, addresses_4fb8.state AS addresses_4fb8_state, - addresses_4fb8.street AS addresses_4fb8_street - FROM users LEFT OUTER JOIN addresses AS addresses_4fb8 ON users.user_id = addresses_4fb8.user_id - WHERE users.user_name = :users_user_name ORDER BY users.oid, addresses_4fb8.oid - {'users_user_name': 'jane'} - - for u in users: - print repr(u) - for a in u.addresses: - print repr(a) - -Above, a pretty ambitious query is generated just by specifying that the User should be loaded with its child Addresses in one query. When the mapper processes the results, it uses an *Identity Map* to keep track of objects that were already loaded, based on their primary key identity. Through this method, the redundant rows produced by the join are organized into the distinct object instances they represent. - -Recall that eager loading has no impact on the results of the query. What if our query included our own join criterion? The eager loading query accommodates this using aliases, and is immune to the effects of additional joins being specified in the original query. 
Joining against the "addresses" table to locate users with a certain street results in this behavior: - - {python} - {sql}users = session.query(User).join('addresses').filter_by(street='123 Green Street').all() - - SELECT users.user_name AS users_user_name, - users.password AS users_password, users.user_id AS users_user_id, - addresses_6ca7.city AS addresses_6ca7_city, - addresses_6ca7.address_id AS addresses_6ca7_address_id, - addresses_6ca7.user_id AS addresses_6ca7_user_id, - addresses_6ca7.zip AS addresses_6ca7_zip, addresses_6ca7.state AS addresses_6ca7_state, - addresses_6ca7.street AS addresses_6ca7_street - FROM users JOIN addresses ON users.user_id = addresses.user_id - LEFT OUTER JOIN addresses AS addresses_6ca7 ON users.user_id = addresses_6ca7.user_id - WHERE addresses.street = :addresses_street ORDER BY users.oid, addresses_6ca7.oid - {'addresses_street': '123 Green Street'} - -The join resulting from `join('addresses')` is separate from the join produced by the eager load, which is "aliasized" to prevent conflicts. - -#### Using Options to Change the Loading Strategy {@name=options} - -The `options()` method on the `Query` object allows modifications to the underlying querying methodology. 
The most common use of this feature is to change the "eager/lazy" loading behavior of a particular mapper, via the functions `eagerload()`, `lazyload()` and `noload()`: - - {python} - # user mapper with lazy addresses - mapper(User, users_table, properties = { - 'addresses' : relation(mapper(Address, addresses_table)) - } - ) - - # query object - query = session.query(User) - - # make an eager loading query - eagerquery = query.options(eagerload('addresses')) - u = eagerquery.all() - - # make another query that won't load the addresses at all - plainquery = query.options(noload('addresses')) - - # multiple options can be specified - myquery = oldquery.options(lazyload('tracker'), noload('streets'), eagerload('members')) - - # to specify a relation on a relation, separate the property names by a "." - myquery = oldquery.options(eagerload('orders.items')) - -### More Relationships {@name=morerelations} - -Previously, we've discussed how to set up a one-to-many relationship. This section will go over the remaining major types of relationships that can be configured. More detail on relationships as well as more advanced patterns can be found in [advdatamapping](rel:advdatamapping). - -#### One to One/Many to One {@name=manytoone} - -The above examples focused on the "one-to-many" relationship. 
To do other forms of relationship is easy, as the `relation()` function can usually figure out what you want: - - {python} - metadata = MetaData() - - # a table to store a user's preferences for a site - prefs_table = Table('user_prefs', metadata, - Column('pref_id', Integer, primary_key = True), - Column('stylename', String(20)), - Column('save_password', Boolean, nullable = False), - Column('timezone', CHAR(3), nullable = False) - ) - - # user table with a 'preference_id' column - users_table = Table('users', metadata, - Column('user_id', Integer, primary_key = True), - Column('user_name', String(16), nullable = False), - Column('password', String(20), nullable = False), - Column('preference_id', Integer, ForeignKey("user_prefs.pref_id")) - ) - - # engine and some test data - engine = create_engine('sqlite:///', echo=True) - metadata.create_all(engine) - engine.execute(prefs_table.insert(), dict(pref_id=1, stylename='green', save_password=1, timezone='EST')) - engine.execute(users_table.insert(), dict(user_name = 'fred', password='45nfss', preference_id=1)) - - # classes - class User(object): - def __init__(self, user_name, password): - self.user_name = user_name - self.password = password - - class UserPrefs(object): - pass - - mapper(UserPrefs, prefs_table) - - mapper(User, users_table, properties = { - 'preferences':relation(UserPrefs, lazy=False, cascade="all, delete-orphan"), - }) - - # select - session = create_session(bind=engine) - {sql}user = session.query(User).filter_by(user_name='fred').one() - SELECT users.preference_id AS users_preference_id, users.user_name AS users_user_name, - users.password AS users_password, users.user_id AS users_user_id, - user_prefs_4eb2.timezone AS user_prefs_4eb2_timezone, user_prefs_4eb2.stylename AS user_prefs_4eb2_stylename, - user_prefs_4eb2.save_password AS user_prefs_4eb2_save_password, user_prefs_4eb2.pref_id AS user_prefs_4eb2_pref_id - FROM (SELECT users.user_id AS users_user_id FROM users WHERE users.user_name = 
:users_user_name ORDER BY users.oid - LIMIT 1 OFFSET 0) AS rowcount, - users LEFT OUTER JOIN user_prefs AS user_prefs_4eb2 ON user_prefs_4eb2.pref_id = users.preference_id - WHERE rowcount.users_user_id = users.user_id ORDER BY users.oid, user_prefs_4eb2.oid - {'users_user_name': 'fred'} - - save_password = user.preferences.save_password - - # modify - user.preferences.stylename = 'bluesteel' - - # flush - {sql}session.flush() - UPDATE user_prefs SET stylename=:stylename - WHERE user_prefs.pref_id = :pref_id - [{'stylename': 'bluesteel', 'pref_id': 1}] - -#### Many to Many {@name=manytomany} - -The `relation()` function handles a basic many-to-many relationship when you specify an association table using the `secondary` argument: - - {python} - metadata = MetaData() - - articles_table = Table('articles', metadata, - Column('article_id', Integer, primary_key = True), - Column('headline', String(150), key='headline'), - Column('body', TEXT, key='body'), - ) - - keywords_table = Table('keywords', metadata, - Column('keyword_id', Integer, primary_key = True), - Column('keyword_name', String(50)) - ) - - itemkeywords_table = Table('article_keywords', metadata, - Column('article_id', Integer, ForeignKey("articles.article_id")), - Column('keyword_id', Integer, ForeignKey("keywords.keyword_id")) - ) - - engine = create_engine('sqlite:///') - metadata.create_all(engine) - - # class definitions - class Keyword(object): - def __init__(self, name): - self.keyword_name = name - - class Article(object): - pass - - mapper(Keyword, keywords_table) - - # define a mapper that does many-to-many on the 'itemkeywords' association - # table - mapper(Article, articles_table, properties = { - 'keywords':relation(Keyword, secondary=itemkeywords_table, lazy=False) - } - ) - - session = create_session(bind=engine) - - article = Article() - article.headline = 'a headline' - article.body = 'this is the body' - article.keywords.append(Keyword('politics')) - 
article.keywords.append(Keyword('entertainment')) - session.save(article) - - {sql}session.flush() - INSERT INTO keywords (keyword_name) VALUES (:keyword_name) - {'keyword_name': 'politics'} - INSERT INTO keywords (keyword_name) VALUES (:keyword_name) - {'keyword_name': 'entertainment'} - INSERT INTO articles (headline, body) VALUES (:headline, :body) - {'body': 'this is the body', 'headline': 'a headline'} - INSERT INTO article_keywords (article_id, keyword_id) VALUES (:article_id, :keyword_id) - [{'keyword_id': 1, 'article_id': 1}, {'keyword_id': 2, 'article_id': 1}] - - # select articles based on a keyword. - {sql}articles = session.query(Article).join('keywords').filter_by(keyword_name='politics').all() - SELECT keywords_e2f2.keyword_id AS keywords_e2f2_keyword_id, keywords_e2f2.keyword_name AS keywords_e2f2_keyword_name, - articles.headline AS articles_headline, articles.body AS articles_body, articles.article_id AS articles_article_id - FROM keywords, article_keywords, articles - LEFT OUTER JOIN article_keywords AS article_keyword_3da2 ON articles.article_id = article_keyword_3da2.article_id - LEFT OUTER JOIN keywords AS keywords_e2f2 ON keywords_e2f2.keyword_id = article_keyword_3da2.keyword_id - WHERE (keywords.keyword_name = :keywords_keyword_name AND articles.article_id = article_keywords.article_id) - AND keywords.keyword_id = article_keywords.keyword_id ORDER BY articles.oid, article_keyword_3da2.oid - {'keywords_keyword_name': 'politics'} - - a = articles[0] - - # clear out keywords with a new list - a.keywords = [] - a.keywords.append(Keyword('topstories')) - a.keywords.append(Keyword('government')) - - # flush - {sql}session.flush() - INSERT INTO keywords (keyword_name) VALUES (:keyword_name) - {'keyword_name': 'topstories'} - INSERT INTO keywords (keyword_name) VALUES (:keyword_name) - {'keyword_name': 'government'} - DELETE FROM article_keywords - WHERE article_keywords.article_id = :article_id - AND article_keywords.keyword_id = :keyword_id - [{'keyword_id': 1, 'article_id': 1}, {'keyword_id': 
2, 'article_id': 1}] - INSERT INTO article_keywords (article_id, keyword_id) VALUES (:article_id, :keyword_id) - [{'keyword_id': 3, 'article_id': 1}, {'keyword_id': 4, 'article_id': 1}] - -#### Association Object {@name=association} - -Many to Many can also be done with an association object, which adds additional information about how two items are related. In this pattern, the "secondary" option to `relation()` is no longer used; instead, the association object becomes a mapped entity itself, mapped to the association table. If the association table has no explicit primary key columns defined, you also have to tell the mapper what columns will compose its "primary key", which are typically the two (or more) columns involved in the association. Also, the relation between the parent and association mapping is typically set up with a cascade of `all, delete-orphan`. This ensures that when an association object is removed from its parent collection, it is deleted (otherwise, the unit of work tries to null out one of the foreign key columns, which raises an error condition since that column is also part of its "primary key"). 
- - {python} - from sqlalchemy import * - metadata = MetaData() - - users_table = Table('users', metadata, - Column('user_id', Integer, primary_key = True), - Column('user_name', String(16), nullable = False), - ) - - articles_table = Table('articles', metadata, - Column('article_id', Integer, primary_key = True), - Column('headline', String(150), key='headline'), - Column('body', TEXT, key='body'), - ) - - keywords_table = Table('keywords', metadata, - Column('keyword_id', Integer, primary_key = True), - Column('keyword_name', String(50)) - ) - - # add "attached_by" column which will reference the user who attached this keyword - itemkeywords_table = Table('article_keywords', metadata, - Column('article_id', Integer, ForeignKey("articles.article_id")), - Column('keyword_id', Integer, ForeignKey("keywords.keyword_id")), - Column('attached_by', Integer, ForeignKey("users.user_id")) - ) - - engine = create_engine('sqlite:///', echo=True) - metadata.create_all(engine) - - # class definitions - class User(object): - pass - class Keyword(object): - def __init__(self, name): - self.keyword_name = name - class Article(object): - pass - class KeywordAssociation(object): - pass - - # Article mapper, relates to Keyword via KeywordAssociation - mapper(Article, articles_table, properties={ - 'keywords':relation(KeywordAssociation, lazy=False, cascade="all, delete-orphan") - } - ) - - # mapper for KeywordAssociation - # specify "primary key" columns manually - mapper(KeywordAssociation, itemkeywords_table, - primary_key=[itemkeywords_table.c.article_id, itemkeywords_table.c.keyword_id], - properties={ - 'keyword' : relation(Keyword, lazy=False), - 'user' : relation(User, lazy=False) - } - ) - - # user mapper - mapper(User, users_table) - - # keyword mapper - mapper(Keyword, keywords_table) - - session = create_session(bind=engine) - # select by keyword - {sql}alist = session.query(Article).join(['keywords', 'keyword']).filter_by(keyword_name='jacks_stories').all() - SELECT 
article_keyword_f9af.keyword_id AS article_keyword_f9af_key_b3e1, - article_keyword_f9af.attached_by AS article_keyword_f9af_att_95d4, - article_keyword_f9af.article_id AS article_keyword_f9af_art_fd49, - users_9c30.user_name AS users_9c30_user_name, users_9c30.user_id AS users_9c30_user_id, - keywords_dc54.keyword_id AS keywords_dc54_keyword_id, keywords_dc54.keyword_name AS keywords_dc54_keyword_name, - articles.headline AS articles_headline, articles.body AS articles_body, articles.article_id AS articles_article_id - FROM keywords, article_keywords, articles - LEFT OUTER JOIN article_keywords AS article_keyword_f9af ON articles.article_id = article_keyword_f9af.article_id - LEFT OUTER JOIN users AS users_9c30 ON users_9c30.user_id = article_keyword_f9af.attached_by - LEFT OUTER JOIN keywords AS keywords_dc54 ON keywords_dc54.keyword_id = article_keyword_f9af.keyword_id - WHERE (keywords.keyword_name = :keywords_keywords_name AND keywords.keyword_id = article_keywords.keyword_id) - AND articles.article_id = article_keywords.article_id - ORDER BY articles.oid, article_keyword_f9af.oid, users_9c30.oid, keywords_dc54.oid - {'keywords_keywords_name': 'jacks_stories'} - - # user is available - for a in alist: - for k in a.keywords: - if k.keyword.keyword_name == 'jacks_stories': - print k.user.user_name - -Keep in mind that the association object works a little differently from a plain many-to-many relationship. 
Members have to be added to the list via instances of the association object, which in turn point to the associated object: - - {python} - user = User() - user.user_name = 'some user' - - article = Article() - - assoc = KeywordAssociation() - assoc.keyword = Keyword('blue') - assoc.user = user - - assoc2 = KeywordAssociation() - assoc2.keyword = Keyword('green') - assoc2.user = user - - article.keywords.append(assoc) - article.keywords.append(assoc2) - - session.save(article) - - session.flush() - -SQLAlchemy includes an extension module which can be used in some cases to decrease the explicitness of the association object pattern; this extension is described in [plugins_associationproxy](rel:plugins_associationproxy). - -Note that you should **not** combine the usage of a `secondary` relationship with an association object pattern against the same association table. This is because SQLAlchemy's unit of work will regard rows in the table tracked by the `secondary` argument as distinct from entities mapped into the table by the association mapper, causing unexpected behaviors when rows are changed by one mapping and not the other. diff --git a/doc/build/content/dbengine.txt b/doc/build/content/dbengine.txt index c88a6cd6b..4dddf3ec9 100644 --- a/doc/build/content/dbengine.txt +++ b/doc/build/content/dbengine.txt @@ -276,7 +276,7 @@ Implicit execution is also connectionless, and calls the `execute()` method on t {python} engine = create_engine('sqlite:///file.db') - meta.connect(engine) + meta.bind = engine result = users_table.select().execute() for row in result: # .... 
diff --git a/doc/build/content/mappers.txt b/doc/build/content/mappers.txt index b1919ccbd..005eae98f 100644 --- a/doc/build/content/mappers.txt +++ b/doc/build/content/mappers.txt @@ -4,12 +4,17 @@ Mapper Configuration {@name=advdatamapping} ====================== -This section references most major usage patterns involving the [mapper()](rel:docstrings_sqlalchemy.orm_modfunc_mapper) and [relation()](rel:docstrings_sqlalchemy.orm_modfunc_relation) functions. It assumes you've worked through the [datamapping](rel:datamapping) and know how to construct and use rudimental mappers and relations. +This section references most major configurational patterns involving the [mapper()](rel:docstrings_sqlalchemy.orm_modfunc_mapper) and [relation()](rel:docstrings_sqlalchemy.orm_modfunc_relation) functions. It assumes you've worked through the [datamapping](rel:datamapping) and know how to construct and use rudimental mappers and relations. ### Mapper Configuration -Full API documentation for the ORM is at [docstrings_sqlalchemy.orm](rel:docstrings_sqlalchemy.orm). -The overall list of options to the `mapper()` function can be viewed at: [Options for mapper()](rel:docstrings_sqlalchemy.orm_modfunc_mapper). +Full API documentation for the ORM: + +[docstrings_sqlalchemy.orm](rel:docstrings_sqlalchemy.orm). + +Options for the `mapper()` function: + +[docstrings_sqlalchemy.orm_modfunc_mapper](rel:docstrings_sqlalchemy.orm_modfunc_mapper). 
#### Customizing Column Properties {@name=columns} @@ -696,7 +701,7 @@ Multiple extensions will be chained together and processed in order; they are sp The full list of options for the `relation()` function: -[Options for relation()](rel:docstrings_sqlalchemy.orm_modfunc_relation) +[docstrings_sqlalchemy.orm_modfunc_relation](rel:docstrings_sqlalchemy.orm_modfunc_relation) #### Basic Relational Patterns {@name=patterns} diff --git a/doc/build/content/metadata.txt b/doc/build/content/metadata.txt index 39e276d68..068d1af9f 100644 --- a/doc/build/content/metadata.txt +++ b/doc/build/content/metadata.txt @@ -121,7 +121,7 @@ And `Table` provides an interface to the table's properties as well as that of i A `MetaData` object can be associated with an `Engine` or an individual `Connection`; this process is called **binding**. The term used to describe "an engine or a connection" is often referred to as a **connectable**. Binding allows the `MetaData` and the elements which it contains to perform operations against the database directly, using the connection resources to which it's bound. Common operations which are made more convenient through binding include being able to generate SQL constructs which know how to execute themselves, creating `Table` objects which query the database for their column and constraint information, and issuing CREATE or DROP statements. -To bind `MetaData` to an `Engine`, use the `connect()` method: +To bind `MetaData` to an `Engine`, use the `bind` attribute: {python} engine = create_engine('sqlite://', **kwargs) diff --git a/doc/build/content/session.txt b/doc/build/content/session.txt new file mode 100644 index 000000000..6fdcd9d8b --- /dev/null +++ b/doc/build/content/session.txt @@ -0,0 +1,538 @@ +Using the Session {@name=unitofwork} +============ + +The [Mapper](rel:advdatamapping) is the entrypoint to the configurational API of the SQLAlchemy object relational mapper. 
But the primary object one works with when using the ORM is the [Session](rel:docstrings_sqlalchemy.orm.session_Session). + +## What does the Session do ? + +In the most general sense, the `Session` establishes all conversations with the database and represents a "holding zone" for all the mapped instances which you've loaded or created during its lifespan. It implements the [Unit of Work](http://martinfowler.com/eaaCatalog/unitOfWork.html) pattern, which means it keeps track of all changes which occur, and is capable of **flushing** those changes to the database as appropriate. Another important facet of the `Session` is that it also maintains **unique** copies of each instance, where "unique" means "only one object with a particular primary key" - this pattern is called the [Identity Map](http://martinfowler.com/eaaCatalog/identityMap.html). + +Beyond that, the `Session` implements an interface which lets you move objects into or out of the session in a variety of ways; it provides the entryway to a `Query` object, which is used to query the database for data; it is commonly used to provide transactional boundaries (though this is optional); and it can also serve as a configurational "home base" for one or more `Engine` objects, allowing various vertical and horizontal partitioning strategies to be achieved. + +## Getting a Session + +[This section is under construction !] + +### Using a sessionmaker() Configuration {@name=sessionmaker} + + {python} + Session = sessionmaker(autoflush=True, transactional=True, **kwargs) + + sess = Session() + # work with sess + + sess.close() + +[Generated docstrings for sessionmaker()](rel:docstrings_sqlalchemy.orm_modfunc_sessionmaker) + +### Binding Session to an Engine or Connection {@name=binding} + +Bind to an engine: + + {python} + Session = sessionmaker(...) + Session.configure(bind=engine) + +Bind to a connection: + + {python} + Session = sessionmaker(...) 
+ + connection = engine.connect() + + sess = Session(bind=connection) + +### Using create_session() {@name=createsession} + +This function is a "lower level" version of `sessionmaker()`. It creates a new `Session` directly: + + {python} + session = create_session(**kwargs) + +[Generated docstrings for create_session()](rel:docstrings_sqlalchemy.orm_modfunc_create_session) + + +## Using the Session + +A typical session conversation starts with creating a new session, or acquiring one from an ongoing context. You save new objects and load existing ones, make changes, mark some as deleted, and then persist your changes to the database. If your session is transactional, you use `commit()` to persist any remaining changes and to commit the transaction. If not, you call `flush()`, which will flush any remaining data to the database. + +Below, we open a new `Session` using a configured `sessionmaker()`, make some changes, and commit: + + {python} + # configured Session class + Session = sessionmaker(autoflush=True, transactional=True) + + sess = Session() + d = Data(value=10) + sess.save(d) + d2 = sess.query(Data).filter(Data.value==15).one() + d2.value = 19 + sess.commit() + +### Quickie Intro to Object States {@name=states} + +It's helpful to know the states which an instance can have within a session: + +* *Transient* - an instance that's not in a session, and is not saved to the database; i.e. it has no database identity. The only relationship such an object has to the ORM is that its class has a `mapper()` associated with it. + +* *Pending* - when you `save()` a transient instance, it becomes pending. It hasn't actually been flushed to the database yet, but it will be when the next flush occurs. + +* *Persistent* - An instance which is present in the session and has a record in the database. 
You get persistent instances by either flushing so that the pending instances become persistent, or by querying the database for existing instances (or moving persistent instances from other sessions into your local session). + +* *Detached* - an instance which has a record in the database, but is not in any session. There's nothing wrong with this, and you can use objects normally when they're detached, **except** they will not be able to issue any SQL in order to load collections or attributes which are not yet loaded, or were marked as "expired". + +Knowing these states is important, since the `Session` tries to be strict about ambiguous operations (such as trying to save the same object to two different sessions at the same time). + +### Frequently Asked Questions {@name=faq} + +* When do I make a `sessionmaker` ? + + Just one time, somewhere in your application's global scope. It should be looked upon as part of your application's configuration. If your application has three .py files in a package, you could, for example, place the `sessionmaker` line in your `__init__.py` file; from that point on your other modules say "from mypackage import Session". That way, everyone else just uses `Session()`, and the configuration of that session is controlled by that central point. + + If your application starts up, does imports, but does not know what database it's going to be connecting to, you can bind the `Session` at the "class" level to the engine later on, using `configure()`. + +In the examples in this section, we will frequently show the `sessionmaker` being created right above the line where we actually create the `Session`. But that's just for example's sake ! In reality, the `sessionmaker` would be somewhere at the module level, and your individual `Session()` calls would be sprinkled all throughout your app, such as in a web application within each controller method. + +* When do I make a `Session` ? 
+ +You typically create a `Session` when you first need to talk to your database, and want to save some objects or load some existing ones. Then, you work with it, save your changes, and then dispose of it....or at the very least `close()` it. It's not a "global" kind of object, and should be handled more like a "local variable", as it's generally **not** safe to use with concurrent threads. Sessions are very inexpensive to make, and don't use any resources whatsoever until they are first used...so create some ! + +There is also a pattern whereby you're using a **contextual session**; this is described later in [unitofwork_contextual](rel:unitofwork_contextual). In this pattern, a helper object is maintaining a `Session` for you, most commonly one that is local to the current thread (and sometimes also local to an application instance). SQLAlchemy 0.4 has worked this pattern out such that it still *looks* like you're creating a new session as you need one...so in that case, it's still a guaranteed win to just say `Session()` whenever you want a session. + +* Is the Session a cache ? + +Yeee...no. It's somewhat used as a cache, in that it implements the identity map pattern, and stores objects keyed to their primary key. However, it doesn't do any kind of query caching. This means, if you say `session.query(Foo).filter_by(name='bar')`, even if `Foo(name='bar')` is right there, in the identity map, the session has no idea about that. It has to issue SQL to the database, get the rows back, and then when it sees the primary key in the row, *then* it can look in the local identity map and see that the object is already there. It's only when you say `query.get({some primary key})` that the `Session` doesn't have to issue a query. + +Also, the `Session` is not designed to be a global object from which everyone consults as a "registry" of objects. That is the job of a **second level cache**. 
A good library for implementing second level caching is [Memcached](http://www.danga.com/memcached/). It *is* possible to "sort of" use the `Session` in this manner, if you set it to be non-transactional and it never flushes any SQL, but it's not a terrific solution, since if concurrent threads load the same objects at the same time, you may have multiple copies of the same objects present in collections.
+
+* How can I get the `Session` for a certain object ?
+
+    {python}
+    from sqlalchemy.orm import object_session
+    session = object_session(someobject)
+
+* Is the session threadsafe ?
+
+Nope. It has no thread synchronization of any kind built in, and particularly when you do a flush operation, it definitely is not open to concurrent threads accessing it, because it holds onto a single database connection at that point. If you use a session which is non-transactional for read operations only, it's still not thread-"safe", but you also won't get any catastrophic failures either, since it opens and closes connections on an as-needed basis; it's just that different threads might load the same objects independently of each other, but only one will wind up in the identity map (however, the other one might still live in a collection somewhere).
+
+But the bigger point here is, you should not *want* to use the session with multiple concurrent threads. That would be like having everyone at a restaurant all eat from the same plate. The session is a local "workspace" that you use for a specific set of tasks; you don't want to, or need to, share that session with other threads who are doing some other task. If, on the other hand, there are other threads participating in the same task you are, such as in a desktop graphical application, then you would be sharing the session with those threads, but you also will have implemented a proper locking scheme (or your graphical framework does) so that those threads do not collide.
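The identity map behavior described in the "Is the Session a cache ?" answer above can be sketched in plain Python. This is a hypothetical toy, not SQLAlchemy's actual implementation; it only shows why a get-by-primary-key can skip the database entirely, while an attribute-based query must issue SQL first and only then consult the identity map:

```python
# Toy identity map, keyed the way the FAQ answer describes:
# (class, (primary_key,)) -> instance.  Hypothetical sketch only.

class ToySession(object):
    def __init__(self, database):
        self.database = database      # simulated table: pk -> row dict
        self.identity_map = {}
        self.sql_issued = 0           # count of simulated SQL statements

    def get(self, cls, pk):
        # get() checks the identity map first; no SQL if the key is present
        key = (cls, (pk,))
        if key in self.identity_map:
            return self.identity_map[key]
        return self._load(cls, pk)

    def query_by_name(self, cls, name):
        # an attribute-based query always "issues SQL" ...
        self.sql_issued += 1
        for pk, row in self.database.items():
            if row['name'] == name:
                # ... and only after seeing the primary key in the row
                # does it consult the identity map
                key = (cls, (pk,))
                return self.identity_map.get(key) or self._load(cls, pk)

    def _load(self, cls, pk):
        self.sql_issued += 1
        obj = cls(self.database[pk]['name'])
        self.identity_map[(cls, (pk,))] = obj
        return obj

class Foo(object):
    def __init__(self, name):
        self.name = name

sess = ToySession({1: {'name': 'bar'}})
a = sess.query_by_name(Foo, 'bar')   # issues "SQL", loads, stores in map
b = sess.query_by_name(Foo, 'bar')   # issues "SQL" again, returns mapped instance
c = sess.get(Foo, 1)                 # identity map hit: no "SQL" at all
assert a is b is c
```

Note the invariant: however an object arrives, all three lookups return the very same instance, which is the one guarantee the identity map provides.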
+ +### Session Attributes {@name=attributes} + +The session provides a set of attributes and collection-oriented methods which allow you to view (but not to change) the current state of the session. + +The **identity map** is accessed by the `identity_map` attribute, which provides a dictionary interface. The keys are "identity keys", which are attached to all persistent objects by the attribute `_instance_key`: + + {python} + >>> myobject._instance_key + (<class 'test.tables.User'>, (7,)) + + >>> myobject._instance_key in session.identity_map + True + + >>> session.identity_map.values() + [<__main__.User object at 0x712630>, <__main__.Address object at 0x712a70>] + +The identity map is a `dict` by default. This means that objects stay in the session until they are explicitly removed, or the session is cleared. Some people prefer objects to be automatically garbage collected instead. For this, you can specify the flag `weak_identity_map=True` to the `create_session` or `sessionmaker` functions so that the `Session` will use a `weakref.WeakValueDictionary`. But note that an object that has changes marked on it (i.e. "dirty"), and is then discarded, will not have its changes saved to the database. It's because of this misleading behavior that SQLAlchemy's identity map is by default strong-referencing. + +The Session also supports an iterator interface in order to see all objects in the identity map: + + {python} + for obj in session: + print obj + +As well as `__contains__()`: + + {python} + if obj in session: + print "Object is present" + +The session is also keeping track of all newly created (i.e. pending) objects, all objects which have had changes since they were last loaded or saved (i.e. "dirty"), and everything that's been marked as deleted. 
+
+    {python}
+    # pending objects recently added to the Session
+    session.new
+
+    # persistent objects which currently have changes detected
+    # (this collection is now created on the fly each time the property is called)
+    session.dirty
+
+    # persistent objects that have been marked as deleted via session.delete(obj)
+    session.deleted
+
+Note that if a session is created with the `weak_identity_map` flag, an item which is marked as "dirty" will be silently removed from the session if the item falls out of scope in the user application. This is because the unit of work does not look for "dirty" changes except for within a flush operation (or any time the session.dirty collection is accessed).
+
+As for objects inside of `new` and `deleted`, if you abandon all references to new or modified objects within a session, *they are still present* in either of those two lists, and will be saved on the next flush operation, unless they are removed from the Session explicitly or the session is cleared.
+
+### Querying
+
+The `query()` function takes one or more classes and/or mappers, along with an optional `entity_name` parameter, and returns a new `Query` object which will issue mapper queries within the context of this Session. For each mapper passed, the Query uses that mapper. For each class, the Query will locate the primary mapper for the class using `class_mapper()`.
+
+    {python}
+    # query from a class
+    session.query(User).filter_by(name='ed').all()
+
+    # query with multiple classes, returns tuples
+    session.query(User).add_entity(Address).join('addresses').filter_by(name='ed').all()
+
+    # query from a mapper
+    query = session.query(usermapper)
+    x = query.get(1)
+
+    # query from a class mapped with entity name 'alt_users'
+    q = session.query(User, entity_name='alt_users')
+    y = q.options(eagerload('orders')).all()
+
+`entity_name` is an optional keyword argument sent with a class object, in order to further qualify which primary mapper is to be used; this only applies if there was a `Mapper` created with that particular class/entity name combination, else an exception is raised. All of the methods on Session which take a class or mapper argument also take the `entity_name` argument, so that a given class can be properly matched to the desired primary mapper.
+
+All instances retrieved by the returned `Query` object will be stored as persistent instances within the originating `Session`.
+
+### Saving New Instances
+
+`save()` is called with a single transient instance as an argument, which is then added to the Session and becomes pending. When the session is next flushed, the instance will be saved to the database. If the given instance is not transient, meaning it is either attached to an existing Session or it has a database identity, an exception is raised.
+
+    {python}
+    user1 = User(name='user1')
+    user2 = User(name='user2')
+    session.save(user1)
+    session.save(user2)
+
+    session.commit() # write changes to the database
+
+There are also other ways to have objects saved to the session automatically; one is by using cascade rules, and the other is by using a contextual session. Both of these are described later.
+
+### Updating/Merging Existing Instances
+
+The `update()` method is used when you have a detached instance, and you want to put it back into a `Session`.
Recall that "detached" means the object has a database identity; `update()` will raise an error if the given instance is transient, or is already attached to a session.
+
+Since `update()` is a little picky that way, most people use `save_or_update()`, which checks for an `_instance_key` attribute, and based on whether it's there or not, calls either `save()` or `update()`:
+
+    {python}
+    # load user1 using session 1
+    user1 = sess1.query(User).get(5)
+
+    # remove it from session 1
+    sess1.expunge(user1)
+
+    # move it into session 2
+    sess2.save_or_update(user1)
+
+`update()` is also an operation that can happen automatically using cascade rules, just like `save()`.
+
+`merge()` on the other hand is a little like `update()`, except it creates a **copy** of the given instance in the session, and returns that instance to you; the instance you send it never goes into the session. `merge()` is much fancier than `update()`; it will actually look to see if an object with the same primary key is already present in the session, and if not, will load it by primary key. Then, it will merge the attributes of the given object into the one which it just located.
+
+This method is useful for bringing in objects which may have been restored from a serialization, such as those stored in an HTTP session, where the object may be present in the session already:
+
+    {python}
+    # deserialize an object
+    myobj = pickle.loads(mystring)
+
+    # "merge" it. if the session already had this object in the
+    # identity map, then you get back the one from the current session.
+    myobj = session.merge(myobj)
+
+### Deleting
+
+The `delete` method places an instance into the Session's list of objects to be marked as deleted:
+
+    {python}
+    # mark two objects to be deleted
+    session.delete(obj1)
+    session.delete(obj2)
+
+    # commit (or flush)
+    session.commit()
+
+The big gotcha with `delete()` is that **nothing is removed from collections**.
For example, if a `User` has a collection of three `Addresses`, deleting an `Address` will not remove it from `user.addresses`:
+
+    {python}
+    >>> address = user.addresses[1]
+    >>> session.delete(address)
+    >>> session.flush()
+    >>> address in user.addresses
+    True
+
+The solution is to use proper cascading:
+
+    {python}
+    mapper(User, users_table, properties={
+        'addresses':relation(Address, cascade="all, delete")
+    })
+    del user.addresses[1]
+    session.flush()
+
+### Flushing
+
+This is the main gateway to what the `Session` does best, which is save everything ! It should be clear by now what a flush looks like:
+
+    {python}
+    session.flush()
+
+It also can be called with a list of objects; in this form, the flush operation will be limited only to the objects specified in the list:
+
+    {python}
+    # saves only user1 and address2. all other modified
+    # objects remain present in the session.
+    session.flush([user1, address2])
+
+This second form of flush should be used carefully, as it will not necessarily locate other dependent objects within the session, whose database representation may have foreign constraint relationships with the objects being operated upon.
+
+There's also a way to have `flush()` called automatically before each query; this is called "autoflush" and is described below.
+
+Note that flush **does not change** the state of any collections or entity relationships in memory; for example, if you set a foreign key attribute `b_id` on object `A` with the identifier `B.id`, the change will be flushed to the database, but `A` will not have `B` added to its collection. If you want to manipulate foreign key attributes directly, `refresh()` or `expire()` the objects whose state needs to be refreshed subsequent to flushing.
+
+### Autoflush
+
+A session can be configured to issue `flush()` calls before each query. This allows you to immediately have DB access to whatever has been saved to the session.
It's recommended to use autoflush with `transactional=True`, so that an unexpected flush call won't permanently save to the database:
+
+    {python}
+    Session = sessionmaker(autoflush=True, transactional=True)
+    sess = Session()
+    u1 = User(name='jack')
+    sess.save(u1)
+
+    # reload user1
+    u2 = sess.query(User).filter_by(name='jack').one()
+    assert u2 is u1
+
+    # commit session, flushes whatever is remaining
+    sess.commit()
+
+### Expunge / Clear
+
+Expunge removes an object from the Session, sending persistent instances to the detached state, and pending instances to the transient state:
+
+    {python}
+    session.expunge(obj1)
+
+Use `expunge` when you'd like to remove an object altogether from memory, such as before calling `del` on it, which will prevent any "ghost" operations occurring when the session is flushed.
+
+The `clear()` method is equivalent to `expunge()`-ing everything from the Session:
+
+    {python}
+    session.clear()
+
+However, note that the `clear()` method does not reset any transactional state or connection resources; therefore what you usually want to call instead of `clear()` is `close()`. Also note that objects which are still referenced by collections elsewhere can end up being re-saved to the session again.
+
+### Closing
+
+The `close()` method issues a `clear()`, and releases any transactional/connection resources. When connections are returned to the connection pool, whatever transactional state exists is rolled back.
+
+When `close()` is called, the `Session` is in the same state as when it was first created, and is safe to be used again. `close()` is especially important when using a contextual session, which remains in memory after usage. By issuing `close()`, the session will be clean for the next request that makes use of it.
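Since `close()` always returns the `Session` to its freshly-created state, a common pattern is to wrap the work in a try/finally block so that resources are released even if an error occurs. The sketch below uses a minimal, hypothetical stand-in class (not SQLAlchemy's `Session`) purely to illustrate the shape of the pattern and the `close()` contract:

```python
class ToySession(object):
    """Hypothetical stand-in illustrating the close() contract."""
    def __init__(self):
        self.new = []   # pending objects, by analogy with session.new

    def save(self, obj):
        self.new.append(obj)

    def close(self):
        # release connection resources and clear all state; the
        # session is back to the state it had when first created
        del self.new[:]

sess = ToySession()
try:
    sess.save('user1')   # do some work with the session
finally:
    sess.close()         # always runs, even if the work above raises

assert sess.new == []    # pristine again, safe to reuse
```

With a real `Session`, the same shape applies: create it, do your work, commit or roll back, and `close()` it in the finally block.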
+
+### Refreshing / Expiring
+
+To assist with the Session's "sticky" behavior of instances which are present, individual objects can have all of their attributes immediately re-loaded from the database, or marked as "expired" which will cause a re-load to occur upon the next access of any of the object's mapped attributes. This includes all relationships, so lazy-loaders will be re-initialized and eager relationships will be repopulated. Any changes marked on the object are discarded:
+
+    {python}
+    # immediately re-load attributes on obj1, obj2
+    session.refresh(obj1)
+    session.refresh(obj2)
+
+    # expire objects obj1, obj2, attributes will be reloaded
+    # on the next access:
+    session.expire(obj1)
+    session.expire(obj2)
+
+## Cascades
+
+Mappers support the concept of configurable *cascade* behavior on `relation()`s. This behavior controls how the Session should treat the instances that have a parent-child relationship with another instance that is operated upon by the Session. Cascade is indicated as a comma-separated list of string keywords, with the possible values `all`, `delete`, `save-update`, `refresh-expire`, `merge`, `expunge`, and `delete-orphan`.
+
+Cascading is configured by setting the `cascade` keyword argument on a `relation()`:
+
+    {python}
+    mapper(Order, order_table, properties={
+        'items' : relation(Item, items_table, cascade="all, delete-orphan"),
+        'customer' : relation(User, users_table, user_orders_table, cascade="save-update"),
+    })
+
+The above mapper specifies two relations, `items` and `customer`. The `items` relationship specifies "all, delete-orphan" as its `cascade` value, indicating that all `save`, `update`, `merge`, `expunge`, `refresh`, `delete` and `expire` operations performed on a parent `Order` instance should also be performed on the child `Item` instances attached to it (`save` and `update` are cascaded using the `save_or_update()` method, so that the database identity of the instance doesn't matter).
The `delete-orphan` cascade value additionally indicates that if an `Item` instance is no longer associated with an `Order`, it should also be deleted. The "all, delete-orphan" cascade argument allows a so-called *lifecycle* relationship between an `Order` and an `Item` object.
+
+The `customer` relationship specifies only the "save-update" cascade value, indicating most operations will not be cascaded from a parent `Order` instance to a child `User` instance, except when the `Order` is attached to a particular session, either via the `save()`, `update()`, or `save_or_update()` method.
+
+Additionally, when a child item is attached to a parent item that specifies the "save-update" cascade value on the relationship, the child is automatically passed to `save_or_update()` (and the operation is further cascaded to the child item).
+
+Note that cascading doesn't do anything that isn't possible by manually calling Session methods on individual instances within a hierarchy; it merely automates common operations on a group of associated instances.
+
+The default value for `cascade` on `relation()`s is `save-update, merge`.
+
+## Managing Transactions
+
+The Session can manage transactions automatically, including across multiple engines. When the Session is in a transaction, as it receives requests to execute SQL statements, it adds each individual Connection/Engine encountered to its transactional state. At commit time, all unflushed data is flushed, and each individual transaction is committed. If the underlying databases support two-phase semantics, this may be used by the Session as well if two-phase transactions are enabled.
+
+The easiest way to use a Session with transactions is just to declare it as transactional.
The session will remain in a transaction at all times:
+
+    {python}
+    # transactional session
+    Session = sessionmaker(transactional=True)
+    sess = Session()
+    item1 = sess.query(Item).get(1)
+    item2 = sess.query(Item).get(2)
+    item1.foo = 'bar'
+    item2.bar = 'foo'
+
+    # commit - will immediately go into a new transaction afterwards
+    sess.commit()
+
+Alternatively, a transaction can be begun explicitly using `begin()`:
+
+    {python}
+    # non transactional session
+    Session = sessionmaker(transactional=False)
+    sess = Session()
+    sess.begin()
+    try:
+        item1 = sess.query(Item).get(1)
+        item2 = sess.query(Item).get(2)
+        item1.foo = 'bar'
+        item2.bar = 'foo'
+    except:
+        sess.rollback()
+        raise
+    sess.commit()
+
+The Session also supports Python 2.5's `with` statement, so that the example above can be written as:
+
+    {python}
+    Session = sessionmaker(transactional=False)
+    sess = Session()
+    with sess.begin():
+        item1 = sess.query(Item).get(1)
+        item2 = sess.query(Item).get(2)
+        item1.foo = 'bar'
+        item2.bar = 'foo'
+
+For MySQL and Postgres (and soon Oracle), "nested" transactions, which make use of SAVEPOINT behavior, can be created via the `begin_nested()` method:
+
+    {python}
+    Session = sessionmaker(transactional=False)
+    sess = Session()
+    sess.begin()
+    sess.save(u1)
+    sess.save(u2)
+    sess.flush()
+
+    sess.begin_nested() # establish a savepoint
+    sess.save(u3)
+    sess.rollback()  # rolls back u3, keeps u1 and u2
+
+    sess.commit() # commits u1 and u2
+
+Finally, for MySQL, Postgres, and soon Oracle as well, the session can be instructed to use two-phase commit semantics using the flag `twophase=True`, which coordinates transactions across multiple databases:
+
+    {python}
+    engine1 = create_engine('postgres://db1')
+    engine2 = create_engine('postgres://db2')
+
+    Session = sessionmaker(twophase=True, transactional=True)
+
+    # bind User operations to engine 1, Account operations to engine 2
+    Session.configure(binds={User:engine1, Account:engine2})
+
+    sess = Session()
+
+    # ....
work with accounts and users
+
+    # commit. session will issue a flush to all DBs, and a prepare step to all DBs,
+    # before committing both transactions
+    sess.commit()
+
+## Using SQL Expressions with Sessions {@name=sql}
+
+SQL constructs and string statements can be executed via the `Session`. You'd want to do this normally when your `Session` is transactional and you'd like your free-standing SQL statements to participate in the same transaction.
+
+The two ways to do this are to use the connection/execution services of the Session, or to have your Session participate in a regular SQL transaction.
+
+First, a Session that's associated with an Engine or Connection can execute statements immediately (whether or not it's transactional):
+
+    {python}
+    Session = sessionmaker(bind=engine, transactional=True)
+    sess = Session()
+    result = sess.execute("select * from table where id=:id", {'id':7})
+    result2 = sess.execute(select([mytable], mytable.c.id==7))
+
+To get at the current connection used by the session, which will be part of the current transaction if one is in progress, use `connection()`:
+
+    {python}
+    connection = sess.connection()
+
+A second scenario is that of a Session which is not directly bound to a connectable. This session executes statements relative to a particular `Mapper`, since the mappers are bound to tables which are in turn bound to connectables via their `MetaData` (either the session or the mapped tables need to be bound).
In this case, the Session can conceivably be associated with multiple databases through different mappers; so it wants you to send along a `mapper` argument, which can be any mapped class or mapper instance:
+
+    {python}
+    # session is *not* bound to an engine or connection
+    Session = sessionmaker(transactional=True)
+    sess = Session()
+
+    # need to specify mapper or class when executing
+    result = sess.execute("select * from table where id=:id", {'id':7}, mapper=MyMappedClass)
+    result2 = sess.execute(select([mytable], mytable.c.id==7), mapper=MyMappedClass)
+
+    # need to specify mapper or class when you call connection()
+    connection = sess.connection(MyMappedClass)
+
+The third scenario is when you are using `Connection` and `Transaction` yourself, and want the `Session` to participate. This is easy, as you just bind the `Session` to the connection:
+
+    {python}
+    # non-transactional session
+    Session = sessionmaker(transactional=False)
+
+    # non-ORM connection + transaction
+    conn = engine.connect()
+    trans = conn.begin()
+
+    # bind the Session *instance* to the connection
+    sess = Session(bind=conn)
+
+    # ... etc
+
+    trans.commit()
+
+It's safe to use a `Session` which is transactional or autoflushing, and even to call `begin()`/`commit()` on the session as well; the outermost Transaction object, the one we declared explicitly, controls the scope of the transaction.
+
+When using the `threadlocal` engine context, things are that much easier; the `Session` uses the same connection/transaction as everyone else in the current thread, whether or not you explicitly bind it:
+
+    {python}
+    engine = create_engine('postgres://mydb', strategy="threadlocal")
+    engine.begin()
+
+    sess = Session() # session takes place in the transaction like everyone else
+
+    # ...
go nuts + + engine.commit() # commit the transaction + +## Contextual/Thread-local Sessions {@name=contextual} + +[this section is TODO] + + {python} + Session = scoped_session(sessionmaker(autoflush=True, transactional=True)) + +[Generated docstrings for scoped_session()](rel:docstrings_sqlalchemy.orm_modfunc_scoped_session) + +## Partitioning Strategies + +this section is TODO + +### Vertical Partitioning + +Vertical partitioning places different kinds of objects, or different tables, across multiple databases. + + {python} + engine1 = create_engine('postgres://db1') + engine2 = create_engine('postgres://db2') + + Session = sessionmaker(twophase=True, transactional=True) + + # bind User operations to engine 1, Account operations to engine 2 + Session.configure(binds={User:engine1, Account:engine2}) + + sess = Session() + +### Horizontal Partitioning + +Horizontal partitioning partitions the rows of a single table (or a set of tables) across multiple databases. + +See the "sharding" example in [attribute_shard.py](rel:http://www.sqlalchemy.org/trac/browser/trunk/examples/sharding/attribute_shard.py) diff --git a/doc/build/content/unitofwork.txt b/doc/build/content/unitofwork.txt deleted file mode 100644 index ef0118901..000000000 --- a/doc/build/content/unitofwork.txt +++ /dev/null @@ -1,493 +0,0 @@ -[alpha_api]: javascript:alphaApi() -[alpha_implementation]: javascript:alphaImplementation() - -Session / Unit of Work {@name=unitofwork} -============ - -### Overview {@name=overview} - -The concept behind Unit of Work is to track modifications to a field of objects, and then be able to flush those changes to the database in a single operation. Theres a lot of advantages to this, including that your application doesn't need to worry about individual save operations on objects, nor about the required order for those operations, nor about excessive repeated calls to save operations that would be more efficiently aggregated into one step. 
It also simplifies database transactions, providing a neat package with which to insert into the traditional database begin/commit phase.
-
-SQLAlchemy's unit of work includes these functions:
-
-* The ability to monitor scalar and list attributes on object instances, as well as object creation. This is handled via the attributes package.
-* The ability to maintain and process a list of modified objects, and based on the relationships set up by the mappers for those objects as well as the foreign key relationships of the underlying tables, figure out the proper order of operations so that referential integrity is maintained, and also so that on-the-fly values such as newly created primary keys can be propagated to dependent objects that need them before they are saved. The central algorithm for this is the *topological sort*.
-* The ability to define custom functionality that occurs within the unit-of-work flush phase, such as "before insert", "after insert", etc. This is accomplished via MapperExtension.
-* An Identity Map, which is a dictionary storing the one and only instance of an object for a particular table/primary key combination. This allows many parts of an application to get a handle to a particular object without any chance of modifications going to two different places.
-* The sole interface to the unit of work is provided via the `Session` object. Transactional capability is included.
-
-### Object States {@name=states}
-
-When dealing with mapped instances with regards to Sessions, an instance may be *attached* or *unattached* to a particular Session. An instance also may or may not correspond to an actual row in the database. These conditions break up into four distinct states:
-
-* *Transient* - a transient instance exists within memory only and is not associated with any Session. It also has no database identity and does not have a corresponding record in the database.
When a new instance of a class is constructed, and no default session context exists with which to automatically attach the new instance, it is a transient instance. The instance can then be saved to a particular session in which case it becomes a *pending* instance. If a default session context exists, new instances are added to that Session by default and therefore become *pending* instances immediately. - -* *Pending* - a pending instance is a Session-attached object that has not yet been assigned a database identity. When the Session is flushed (i.e. changes are persisted to the database), a pending instance becomes persistent. - -* *Persistent* - a persistent instance has a database identity and a corresponding record in the database, and is also associated with a particular Session. By "database identity" we mean the object is associated with a table or relational concept in the database combined with a particular primary key in that table. Objects that are loaded by SQLAlchemy in the context of a particular session are automatically considered persistent, as are formerly pending instances which have been subject to a session `flush()`. - -* *Detached* - a detached instance is an instance which has a database identity and corresponding row in the database, but is not attached to any Session. This occurs when an instance has been removed from a Session, either because the session itself was cleared or closed, or the instance was explicitly removed from the Session. The object can be re-attached to a session in which case it becomes Persistent again; any un-persisted changes that exist on the instance, whether they occurred during its previous persistent state or during its detached state will be detected and maintained by the new session. 
Detached instances are useful when an application needs to represent a long-running operation across multiple Sessions, needs to store an object in a serialized state and then restore it later (such as within an HTTP "session" object), or in some cases where code needs to load instances locally which will later be associated with some other Session.
-
-### Acquiring a Session {@name=getting}
-
-A new Session object is constructed via the `create_session()` function:
-
-    {python}
-    session = create_session()
-
-A common option used with `create_session()` is to specify a specific `Engine` or `Connection` to be used for all operations performed by this Session:
-
-    {python}
-    # create an engine
-    e = create_engine('postgres://some/url')
-
-    # create a Session that will use this engine for all operations.
-    # it will open and close Connections as needed.
-    session = create_session(bind=e)
-
-    # open a Connection
-    conn = e.connect()
-
-    # create a Session that will use this specific Connection for all operations
-    session = create_session(bind=conn)
-
-
-The session to which an object is attached can be acquired via the `object_session()` function, which returns the appropriate `Session` if the object is pending or persistent, or `None` if the object is transient or detached:
-
-    {python}
-    session = object_session(obj)
-
-Session Facts:
-
- * the Session object is **not threadsafe**. For thread-local management of Sessions, the recommended approach is to use the [plugins_sessioncontext](rel:plugins_sessioncontext) extension module.
-
-We will now cover some of the key concepts used by Sessions and its underlying Unit of Work.
-
-### Introduction to the Identity Map {@name=identitymap}
-
-A primary concept of the Session's underlying Unit of Work is that it keeps track of all persistent instances; recall that a persistent instance has a database identity and is attached to a Session.
In particular, the Unit of Work must ensure that only *one* copy of a particular persistent instance exists within the Session at any given time. The UOW accomplishes this task using a dictionary known as an *Identity Map*.
-
-When a `Query` is used to issue `select` or `get` requests to the database, it will in nearly all cases result in an actual SQL execution to the database, and a corresponding traversal of rows received from that execution. However, when the underlying mapper actually *creates* objects corresponding to the result set rows it receives, it will check the session's identity map first before instantiating a new object, and return the same instance already present in the identity map if it already exists, essentially *ignoring* the object state represented by that row. There are several ways to override this behavior and truly refresh an already-loaded instance which are described later, but the main idea is that once your instance is loaded into a particular Session, it will *never change* its state without your explicit approval, regardless of what the database says about it.
-
-For example: below, two separate calls to load an instance with database identity "15" are issued, and the results assigned to two separate variables. However, since the same `Session` was used, the two instances are the same instance:
-
-    {python}
-    mymapper = mapper(MyClass, mytable)
-
-    session = create_session()
-    obj1 = session.query(MyClass).filter(mytable.c.id==15).first()
-    obj2 = session.query(MyClass).filter(mytable.c.id==15).first()
-
-    >>> obj1 is obj2
-    True
-
-The Identity Map is an instance of `dict` by default. As an option, you can specify the flag `weak_identity_map=True` to the `create_session` function so that it will use a `weakref.WeakValueDictionary`, so that when an in-memory object falls out of scope, it will be removed automatically, thereby providing some automatic management of memory.
However, this may not be instant if there are circular references on the object. To guarantee that an instance is removed from the identity map before removing references to it, use the `expunge()` method, described later, to remove it. Additionally, note that an object that has changes marked on it (i.e. "dirty") can still fall out of scope when using `weak_identity_map`.
-
-The Session supports an iterator interface in order to see all objects in the identity map:
-
-    {python}
-    for obj in session:
-        print obj
-
-As well as `__contains__()`:
-
-    {python}
-    if obj in session:
-        print "Object is present"
-
-The identity map itself is accessible via the `identity_map` accessor:
-
-    {python}
-    >>> session.identity_map.values()
-    [<__main__.User object at 0x712630>, <__main__.Address object at 0x712a70>]
-
-The identity of each object instance is available via the `_instance_key` property attached to each object instance, and is a tuple consisting of the object's class and an additional tuple of primary key values, in the order that they appear within the table definition:
-
-    {python}
-    >>> obj._instance_key
-    (<class 'test.tables.User'>, (7,))
-
-At the moment that an object is assigned this key within a `flush()` operation, it is also added to the session's identity map.
-
-The `get()` method on `Query`, which retrieves an object based on primary key identity, also checks in the Session's identity map first to save a database round-trip if possible. In the case of an object lazy-loading a single child object, the `get()` method is used as well, so scalar-based lazy loads may in some cases not query the database; this is particularly important for backreference relationships as it can save a lot of queries.
-
-### What's Changed ? {@name=changed}
-
-The next concept is that in addition to the `Session` storing a record of all objects loaded or saved, it also stores lists of all *newly created* (i.e.
pending) objects and lists of all persistent objects that have been marked as *deleted*. These lists are used when a `flush()` call is issued to save all changes. During a flush operation, the Session also scans its list of persistent instances for changes which are marked as dirty. - -These records are all tracked by collection functions that are also viewable off the `Session` as properties: - - {python} - # pending objects recently added to the Session - session.new - - # persistent objects which currently have changes detected - # (this collection is now created on the fly each time the property is called) - session.dirty - - # persistent objects that have been marked as deleted via session.delete(obj) - session.deleted - -Note that if a session is created with the `weak_identity_map` flag, an item which is marked as "dirty" will be silently removed from the session if the item falls out of scope in the user application. This is because the unit of work does not look for "dirty" changes except within a flush operation (or any time the session.dirty collection is accessed). - -As for objects inside of `new` and `deleted`, if you abandon all references to new or modified objects within a session, *they are still present* in either of those two lists, and will be saved on the next flush operation, unless they are removed from the Session explicitly (more on that later). - -### The Session API {@name=api} - -#### query() {@name=query} - -The `query()` function takes one or more classes and/or mappers, along with an optional `entity_name` parameter, and returns a new `Query` object which will issue mapper queries within the context of this Session. For each mapper passed, the Query uses that mapper directly; for each class, the Query locates the primary mapper for the class using `class_mapper()`.
- - {python} - # query from a class - session.query(User).filter_by(name='ed').all() - - # query with multiple classes, returns tuples - session.query(User, Address).join('addresses').filter_by(name='ed').all() - - # query from a mapper - query = session.query(usermapper) - x = query.get(1) - - # query from a class mapped with entity name 'alt_users' - q = session.query(User, entity_name='alt_users') - y = q.options(eagerload('orders')).all() - -`entity_name` is an optional keyword argument sent with a class object, in order to further qualify which primary mapper is to be used; this only applies if there was a `Mapper` created with that particular class/entity name combination, else an exception is raised. All of the methods on Session which take a class or mapper argument also take the `entity_name` argument, so that a given class can be properly matched to the desired primary mapper. - -All instances retrieved by the returned `Query` object will be stored as persistent instances within the originating `Session`. - -#### get() {@name=get} - -Given a class or mapper, a scalar or tuple-based identity, and an optional `entity_name` keyword argument, this method creates a `Query` corresponding to the given mapper or class/entity_name combination, and calls the `get()` method with the given identity value. If the object already exists within this Session, it is simply returned, else it is queried from the database. If the instance is not found, the method returns `None`. - - {python} - # get Employer primary key 5 - employer = session.get(Employer, 5) - - # get Report composite primary key 7,12, using mapper 'report_mapper_b' - report = session.get(Report, (7,12), entity_name='report_mapper_b') - - -#### load() {@name=load} - -load() is similar to get() except it will raise an exception if the instance does not exist in the database.
It will also load the object's data from the database in all cases, and will **overwrite** all changes on an object already present in the session with the latest data from the database. - - {python} - # load Employer primary key 5 - employer = session.load(Employer, 5) - - # load Report composite primary key 7,12, using mapper 'report_mapper_b' - report = session.load(Report, (7,12), entity_name='report_mapper_b') - -#### save() {@name=save} - -save() is called with a single transient (unsaved, unattached) instance as an argument, which is then added to the Session and becomes pending. When the session is next `flush`ed, the instance will be saved to the database, upon which it becomes persistent (saved, attached). If the given instance is not transient, meaning it is either attached to an existing Session or it has a database identity, an exception is raised. - - {python} - user1 = User(name='user1') - user2 = User(name='user2') - session.save(user1) - session.save(user2) - - session.flush() # write changes to the database - -save() is called automatically for new instances by the class's associated mapper, if a default Session context is in effect (such as a thread-local session), which means that newly created instances automatically become pending. If there is no default session available, then the instance remains transient (unattached) until it is explicitly added to a Session via the save() method. - -A transient instance also can be automatically `save`ed if it is associated with a parent object which specifies `save-update` within its `cascade` rules, and that parent is already attached or becomes attached to a Session. For more information on `cascade`, see the next section. - -The `save_or_update()` method, covered later, is a convenience method which will call the `save()` or `update()` methods appropriately depending on whether or not the instance has a database identity (but the instance still must be unattached).
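As a sketch of the `save()` behavior described above, including the `save-update` cascade from a parent to a transient child (the `User`/`Address` classes, the `addresses` relation, and the attribute names here are illustrative, assuming a mapping in the style of the earlier examples):

    {python}
    user1 = User(name='user1')
    session.save(user1)        # user1 is now pending

    # addr1 is transient; because the 'addresses' relation carries
    # the default "save-update" cascade, attaching it to user1
    # makes it pending as well, with no explicit save() call
    addr1 = Address(email='user1@example.com')
    user1.addresses.append(addr1)

    session.flush()            # both user1 and addr1 are persisted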
- -#### flush() {@name=flush} - -This is the main gateway to what the Unit of Work does best, which is save everything! It should be clear by now what a flush looks like: - - {python} - session.flush() - -It also can be called with a list of objects; in this form, the flush operation will be limited only to the objects specified in the list, as well as any child objects within `private` relationships for a delete operation: - - {python} - # saves only user1 and address2. all other modified - # objects remain present in the session. - session.flush([user1, address2]) - -This second form of flush should be used carefully as it will not necessarily locate other dependent objects within the session, whose database representation may have foreign constraint relationships with the objects being operated upon. - -There's also a way to have `flush()` called automatically before each query; this is called "autoflush" and is described below. - -##### Notes on Flush {@name=whatis} - -A common misconception about the `flush()` operation is that once performed, the newly persisted instances will automatically have related objects attached to them, based on the values of primary key identities that have been assigned to the instances before they were persisted. For example: you create a new `Address` object, set `address.user_id` to 5, and then `flush()` the session. The erroneous assumption would be that there is now a `User` object of identity "5" attached to the `Address` object, but in fact this is not the case. If you were to `refresh()` the `Address`, invalidating its current state and re-loading, *then* it would have the appropriate `User` object present.
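To make the scenario above concrete, here is a sketch (assuming `Address` and `User` mappers with a `user` relation between them; the names are illustrative):

    {python}
    address = Address()
    address.user_id = 5
    session.save(address)
    session.flush()

    # the user_id column value has been persisted, but flush() did
    # not populate address.user with the User of identity "5"

    # invalidating and re-loading the instance does bring the
    # relationship in line with the database row:
    session.refresh(address)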
- -This misunderstanding is related to the observed behavior of backreferences ([datamapping_relations_backreferences](rel:datamapping_relations_backreferences)), which automatically associates an instance "A" with another instance "B", in response to the manual association of instance "B" to instance "A" by the user. The backreference operation occurs completely externally to the `flush()` operation, and is pretty much the only example of a SQLAlchemy feature that manipulates the relationships of persistent objects. - -The primary guideline for dealing with `flush()` is: the developer is responsible for maintaining in-memory objects and their relationships to each other, while the unit of work is responsible for maintaining the database representation of the in-memory objects. The typical pattern is that the manipulation of objects *is* the way that changes get communicated to the unit of work, so that when the flush occurs, the objects are already in their correct in-memory representation and problems don't arise. The manipulation of identifier attributes like integer key values, as well as deletes in particular, are a frequent source of confusion. - -#### close() {@name=close} - -This method first calls `clear()`, removing all objects from this `Session`, and then ensures that any transactional resources are closed. - -#### delete() {@name=delete} - -The `delete` method places an instance into the Unit of Work's list of objects to be marked as deleted: - - {python} - # mark two objects to be deleted - session.delete(obj1) - session.delete(obj2) - - # flush - session.flush() - -The delete operation will have an effect on instances that are attached to the deleted instance according to the `cascade` style of the relationship; cascade rules are described further in the following section. By default, associated instances may need to be updated in the database to reflect that they no longer are associated with the parent object, before the parent is deleted.
If the relationship specifies `cascade="delete"`, then the associated instance will also be deleted upon flush, assuming it is still attached to the parent. If the relationship additionally includes the `delete-orphan` cascade style, the associated instance will be deleted if it is still attached to the parent, or is unattached to any other parent. - -The `delete()` operation has no relationship to the in-memory status of the instance, including usage of the `del` Python statement. An instance marked as deleted and flushed will still exist within memory until references to it are freed; similarly, removing an instance from memory via the `del` statement will have no effect, since the persistent instance will still be referenced by its Session. Obviously, if the instance is removed from the Session and then totally dereferenced, it will no longer exist in memory, but also won't exist in any Session and is therefore not deleted from the database. - -Note that the "in-memory status" of an instance also refers to its presence in any other collection. SQLAlchemy does not track the collections to which an instance is a member, and will not remove an instance from its parent collections that were not directly involved in a deletion operation. The operational and memory overhead implied by this would be too great (such as, if an object belonged to hundreds of collections). This means if an object `A` is attached to both an object `B` and an object `C`, if you `delete()` `A` and flush, `A` still remains attached to both `B` and `C` in a deleted state and must be removed by the application. Similarly, if a delete on `B` cascades to `A`, this **does not** affect `A` still being present on `C` - again it must be manually removed. - - -#### clear() {@name=clear} - -This method detaches all instances from the Session, sending them to the detached or transient state as applicable, and replaces the underlying UnitOfWork with a new one. 
- - {python} - session.clear() - -The `clear()` method is particularly useful with a "default context" session such as a thread-local session, which can stay attached to the current thread to handle a new set of objects without having to re-attach a new Session. - -#### refresh() / expire() {@name=refreshexpire} - -To assist with the Unit of Work's "sticky" behavior, individual objects can have all of their attributes immediately re-loaded from the database, or marked as "expired", which will cause a re-load to occur upon the next access of any of the object's mapped attributes. This includes all relationships, so lazy-loaders will be re-initialized and eager relationships will be repopulated. Any changes marked on the object are discarded: - - {python} - # immediately re-load attributes on obj1, obj2 - session.refresh(obj1) - session.refresh(obj2) - - # expire objects obj1, obj2, attributes will be reloaded - # on the next access: - session.expire(obj1) - session.expire(obj2) - -#### expunge() {@name=expunge} - -Expunge removes an object from the Session, sending persistent instances to the detached state, and pending instances to the transient state: - - {python} - session.expunge(obj1) - -Use `expunge` when you'd like to remove an object altogether from memory, such as before calling `del` on it, which will prevent any "ghost" operations occurring when the session is flushed. - -#### bind\_mapper() / bind\_table() {@name=bind} - -Both of these methods receive two arguments; in the case of `bind_mapper()`, it is a `Mapper` and an `Engine` or `Connection` instance; in the case of `bind_table()`, it is a `Table` instance or other `Selectable` (such as an `Alias`, `Select`, etc.), and an `Engine` or `Connection` instance.
- - {python} - engine1 = create_engine('sqlite:///file1.db') - engine2 = create_engine('mysql://localhost') - - sqlite_connection = engine1.connect() - - sess = create_session() - - sess.bind_mapper(mymapper, sqlite_connection) # bind mymapper operations to a single SQLite connection - sess.bind_table(email_addresses_table, engine2) # bind operations with the email_addresses_table to mysql - -Normally, when a `Session` is created via `create_session()` with no arguments, the Session has no awareness of individual `Engines`, and when mappers use the `Session` to retrieve connections, the underlying `MetaData` each `Table` is associated with is expected to be "bound" to an `Engine`, else no engine can be located and an exception is raised. A second form of `create_session()` takes the argument `bind=engine_or_connection`, where all SQL operations performed by this `Session` use the single `Engine` or `Connection` (collectively known as a `Connectable`) passed to the constructor. With `bind_mapper()` and `bind_table()`, the operations of individual mappers and/or tables are bound to distinct engines or connections, thereby overriding not only the engine which may be "bound" to the underlying `MetaData`, but also the `Engine` or `Connection` which may have been passed to the `create_session()` function. Configurations which interact with multiple explicit database connections at one time must use either or both of these methods in order to associate `Session` operations with the appropriate connection resource. - -Binding a `Mapper` to a resource takes precedence over a `Table` bind, meaning if mapper A is associated with table B, and the Session binds mapper A to connection X and table B to connection Y, an operation with mapper A will use connection X, not connection Y. - -#### update() {@name=update} - -The update() method is used *only* with detached instances.
A detached instance only exists if its `Session` was cleared or closed, or the instance was `expunge()`d from its session. `update()` will re-attach the detached instance with this Session, bringing it back to the persistent state, and allowing any changes on the instance to be saved when the `Session` is next `flush`ed. If the instance is already attached to an existing `Session`, an exception is raised. - -A detached instance also can be automatically `update`ed if it is associated with a parent object which specifies `save-update` within its `cascade` rules, and that parent is already attached or becomes attached to a Session. For more information on `cascade`, see the next section. - -The `save_or_update()` method is a convenience method which will call the `save()` or `update()` methods appropriately depending on whether or not the instance has a database identity (but the instance still must be unattached). - -#### save\_or\_update() {@name=saveorupdate} - -This method is a combination of the `save()` and `update()` methods, which will examine the given instance for a database identity (i.e. whether it is transient or detached), and will call the implementation of `save()` or `update()` as appropriate. Use `save_or_update()` to add unattached instances to a session when you're not sure if they were newly created or not. Like `save()` and `update()`, `save_or_update()` cascades along the `save-update` cascade indicator, described in the `cascade` section below. - -#### merge() {@name=merge} - -`merge()` is used to return the persistent version of an instance that is not attached to this Session. When passed an instance, if an instance with its database identity already exists within this Session, it is returned. If the instance does not exist in this Session, it is loaded from the database and then returned. - -A future version of `merge()` will also update the Session's instance with the state of the given instance (hence the name "merge").
- -This method is useful for bringing in objects which may have been restored from a serialization, such as those stored in an HTTP session: - - {python} - # deserialize an object - myobj = pickle.loads(mystring) - - # "merge" it. if the session already had this object in the - # identity map, then you get back the one from the current session. - myobj = session.merge(myobj) - -Note that `merge()` *does not* associate the given instance with the Session; it remains detached (or attached to whatever Session it was already attached to). - -### Cascade rules {@name=cascade} - -Mappers support the concept of configurable *cascade* behavior on `relation()`s. This behavior controls how the Session should treat the instances that have a parent-child relationship with another instance that is operated upon by the Session. Cascade is indicated as a comma-separated list of string keywords, with the possible values `all`, `delete`, `save-update`, `refresh-expire`, `merge`, `expunge`, and `delete-orphan`. - -Cascading is configured by setting the `cascade` keyword argument on a `relation()`: - - {python} - mapper(Order, order_table, properties={ - 'items' : relation(Item, items_table, cascade="all, delete-orphan"), - 'customer' : relation(User, users_table, user_orders_table, cascade="save-update"), - }) - -The above mapper specifies two relations, `items` and `customer`. The `items` relationship specifies "all, delete-orphan" as its `cascade` value, indicating that all `save`, `update`, `merge`, `expunge`, `refresh`, `delete` and `expire` operations performed on a parent `Order` instance should also be performed on the child `Item` instances attached to it (`save` and `update` are cascaded using the `save_or_update()` method, so that the database identity of the instance doesn't matter). The `delete-orphan` cascade value additionally indicates that if an `Item` instance is no longer associated with an `Order`, it should also be deleted.
The "all, delete-orphan" cascade argument allows a so-called *lifecycle* relationship between an `Order` and an `Item` object. - -The `customer` relationship specifies only the "save-update" cascade value, indicating most operations will not be cascaded from a parent `Order` instance to a child `User` instance, except when the `Order` is attached to a particular session via the `save()`, `update()`, or `save_or_update()` method. - -Additionally, when a child item is attached to a parent item that specifies the "save-update" cascade value on the relationship, the child is automatically passed to `save_or_update()` (and the operation is further cascaded to the child item). - -Note that cascading doesn't do anything that isn't possible by manually calling Session methods on individual instances within a hierarchy; it merely automates common operations on a group of associated instances. - -The default value for `cascade` on `relation()`s is `save-update`, and the `private=True` keyword argument is a synonym for `cascade="all, delete-orphan"`. - -### Using Session Transactions {@name=transaction} - -The Session can manage transactions automatically, including across multiple engines. When the Session is in a transaction, as it receives requests to execute SQL statements, it adds each individual Connection/Engine encountered to its transactional state. At commit time, all unflushed data is flushed, and each individual transaction is committed. If the underlying databases support two-phase semantics, this may be used by the Session as well if two-phase transactions are enabled. - -The easiest way to use a Session with transactions is just to declare it as transactional.
The session will remain in a transaction at all times: - - {python} - sess = create_session(transactional=True) - item1 = sess.query(Item).get(1) - item2 = sess.query(Item).get(2) - item1.foo = 'bar' - item2.bar = 'foo' - - # commit- will immediately go into a new transaction afterwards - sess.commit() - -Alternatively, a transaction can be begun explicitly using `begin()`: - - {python} - sess = create_session() - sess.begin() - try: - item1 = sess.query(Item).get(1) - item2 = sess.query(Item).get(2) - item1.foo = 'bar' - item2.bar = 'foo' - except: - sess.rollback() - raise - sess.commit() - -Session also supports Python 2.5's with statement so that the example above can be written as: - - {python} - sess = create_session() - with sess.begin(): - item1 = sess.query(Item).get(1) - item2 = sess.query(Item).get(2) - item1.foo = 'bar' - item2.bar = 'foo' - -For MySQL and Postgres (and soon Oracle), "nested" transactions can be accomplished which use SAVEPOINT behavior, via the `begin_nested()` method: - - {python} - sess = create_session() - sess.begin() - sess.save(u1) - sess.save(u2) - sess.flush() - - sess.begin_nested() # establish a savepoint - sess.save(u3) - sess.rollback() # rolls back u3, keeps u1 and u2 - - sess.commit() # commits u1 and u2 - -Finally, for MySQL, Postgres, and soon Oracle as well, the session can be instructed to use two-phase commit semantics using the flag `twophase=True`, which coordinates transactions across multiple databases: - - {python} - engine1 = create_engine('postgres://db1') - engine2 = create_engine('postgres://db2') - - sess = create_session(twophase=True, transactional=True) - - # bind User operations to engine 1 - sess.bind_mapper(User, engine1) - - # bind Account operations to engine 2 - sess.bind_mapper(Account, engine2) - - # .... work with accounts and users - - # commit. 
session will issue a flush to all DBs, and a prepare step to all DBs, - # before committing both transactions - sess.commit() - -#### AutoFlush {@name=autoflush} - -A transactional session can also conveniently issue `flush()` calls before each query. This allows you to immediately have DB access to whatever has been saved to the session. Creating the session with `autoflush=True` implies `transactional=True`: - - {python} - sess = create_session(autoflush=True) - u1 = User(name='jack') - sess.save(u1) - - # reload user1 - u2 = sess.query(User).filter_by(name='jack').one() - assert u2 is u1 - - # commit session, flushes whatever is remaining - sess.commit() - -#### Using SQL with Sessions and Transactions {@name=sql} - -SQL constructs and string statements can be executed via the `Session`. You'd want to do this normally when your `Session` is transactional and you'd like your free-standing SQL statements to participate in the same transaction. - -The two ways to do this are to use the connection/execution services of the Session, or to have your Session participate in a regular SQL transaction. - -First, a Session that's associated with an Engine or Connection can execute statements immediately (whether or not it's transactional): - - {python} - sess = create_session(bind=engine, transactional=True) - result = sess.execute("select * from table where id=:id", {'id':7}) - result2 = sess.execute(select([mytable], mytable.c.id==7)) - -To get at the current connection used by the session, which will be part of the current transaction if one is in progress, use `connection()`: - - {python} - connection = sess.connection() - -A second scenario is that of a Session which is not directly bound to a connectable. This session executes statements relative to a particular `Mapper`, since the mappers are bound to tables which are in turn bound to connectables via their `MetaData` (either the session or the mapped tables need to be bound).
In this case, the Session can conceivably be associated with multiple databases through different mappers; so it wants you to send along a `mapper` argument, which can be any mapped class or mapper instance: - - {python} - sess = create_session(transactional=True) - result = sess.execute("select * from table where id=:id", {'id':7}, mapper=MyMappedClass) - result2 = sess.execute(select([mytable], mytable.c.id==7), mapper=MyMappedClass) - - connection = sess.connection(MyMappedClass) - -The third scenario is when you are using `Connection` and `Transaction` yourself, and want the `Session` to participate. This is easy, as you just bind the `Session` to the connection: - - {python} - conn = engine.connect() - trans = conn.begin() - sess = create_session(bind=conn) - # ... etc - trans.commit() - -It's safe to use a `Session` which is transactional or autoflushing, as well as to call `begin()`/`commit()` on the session too; the outermost Transaction object, the one we declared explicitly, controls the scope of the transaction. - -When using the `threadlocal` engine context, things are that much easier; the `Session` uses the same connection/transaction as everyone else in the current thread, whether or not you explicitly bind it: - - {python} - engine = create_engine('postgres://mydb', strategy="threadlocal") - engine.begin() - - sess = create_session() # session takes place in the transaction like everyone else - - # ... go nuts - - engine.commit() # commit - diff --git a/doc/build/genhtml.py b/doc/build/genhtml.py index cb64971e8..731ea07ac 100644 --- a/doc/build/genhtml.py +++ b/doc/build/genhtml.py @@ -18,7 +18,7 @@ files = [ 'ormtutorial', 'sqlexpression', 'mappers', - 'unitofwork', + 'session', 'dbengine', 'metadata', 'types', diff --git a/lib/sqlalchemy/orm/session.py b/lib/sqlalchemy/orm/session.py index 1b0c1a9a2..e61edb8a0 100644 --- a/lib/sqlalchemy/orm/session.py +++ b/lib/sqlalchemy/orm/session.py @@ -172,11 +172,11 @@ class Session(object): changes. 
* The ability to organize individual SQL queries and population of newly generated - primary and foreign key-holding attributes during a persist operation - such that referential integrity is maintained at all times. + primary and foreign key-holding attributes during a persist operation + such that referential integrity is maintained at all times. * The ability to maintain insert ordering against the order in which - new instances were added to the session. + new instances were added to the session. * an Identity Map, which is a dictionary keying instances to their unique primary key identity. This ensures that only one copy of a particular entity is ever present @@ -188,38 +188,23 @@ class Session(object): particular Session, else it is *unattached* . An instance also may or may not correspond to an actual row in the database. These conditions break up into four distinct states: - * *Transient* - a transient instance exists within memory only and is not associated with - any Session. It also has no database identity and does not have a corresponding record - in the database. When a new instance of a class is constructed, and no default session - context exists with which to automatically attach the new instance, it is a transient - instance. The instance can then be saved to a particular session in which case it - becomes a *pending* instance. If a default session context exists, new instances are - added to that Session by default and therefore become *pending* instances immediately. - - * *Pending* - a pending instance is a Session-attached object that has not yet been - assigned a database identity. When the Session is flushed (i.e. changes are persisted to - the database), a pending instance becomes persistent. - - * *Persistent* - a persistent instance has a database identity and a corresponding record - in the database, and is also associated with a particular Session. 
By "database - identity" we mean the object is associated with a table or relational concept in the - database combined with a particular primary key in that table. Objects that are loaded - by SQLAlchemy in the context of a particular session are automatically considered - persistent, as are formerly pending instances which have been subject to a session - `flush()`. - - * *Detached* - a detached instance is an instance which has a database identity and - corresponding row in the database, but is not attached to any Session. This occurs when - an instance has been removed from a Session, either because the session itself was - cleared or closed, or the instance was explicitly removed from the Session. The object - can be re-attached to a session in which case it becomes Persistent again; any - un-persisted changes that exist on the instance, whether they occurred during its - previous persistent state or during its detached state will be detected and maintained - by the new session. Detached instances are useful when an application needs to represent - a long-running operation across multiple Sessions, needs to store an object in a - serialized state and then restore it later (such as within an HTTP "session" object), or - in some cases where code needs to load instances locally which will later be associated - with some other Session. + * *Transient* - an instance that's not in a session, and is not saved to the database; + i.e. it has no database identity. The only relationship such an object has to the ORM + is that its class has a `mapper()` associated with it. + + * *Pending* - when you `save()` a transient instance, it becomes pending. It still + wasn't actually flushed to the database yet, but it will be when the next flush + occurs. + + * *Persistent* - An instance which is present in the session and has a record in the + database. 
You get persistent instances by either flushing so that the pending + instances become persistent, or by querying the database for existing instances (or + moving persistent instances from other sessions into your local session). + + * *Detached* - an instance which has a record in the database, but is not in any + session. There's nothing wrong with this, and you can use objects normally when + they're detached, **except** they will not be able to issue any SQL in order to load + collections or attributes which are not yet loaded, or were marked as "expired". The session methods which control instance state include ``save()``, ``update()``, ``save_or_update()``, ``delete()``, ``merge()``, and ``expunge()``. @@ -234,7 +219,7 @@ class Session(object): a thread-managed Session adapter, provided by the [sqlalchemy.orm#scoped_session()] function. """ - def __init__(self, bind=None, autoflush=True, transactional=False, twophase=False, echo_uow=False, weak_identity_map=False): + def __init__(self, bind=None, autoflush=True, transactional=False, twophase=False, echo_uow=False, weak_identity_map=False, binds=None): self.uow = unitofwork.UnitOfWork(weak_identity_map=weak_identity_map) self.bind = bind @@ -248,6 +233,13 @@ class Session(object): self.twophase = twophase self._query_cls = query.Query self._mapper_flush_opts = {} + + if binds is not None: + for mapperortable, value in binds: + if isinstance(mapperortable, type): + mapperortable = _class_mapper(mapperortable) + self.__binds[mapperortable] = value + if self.transactional: self.begin() _sessions[self.hash_key] = self |