summaryrefslogtreecommitdiff
path: root/lib/sqlalchemy/sql/traversals.py
Commit message (Collapse)AuthorAgeFilesLines
* ensure anon_map is passed for most annotated traversalsMike Bayer2022-11-111-6/+10
| | | | | | | | | | | | | | | | | | | | | | | | | We can cache the annotated cache key for Table, but for selectables it's not safe, as it fails to pass the anon_map along and creates many redudant structures in observed test scenario. It is likely safe for a Column that's mapped to a Table also, however this is not implemented here. Will have to see if that part needs adjusting. Fixed critical memory issue identified in cache key generation, where for very large and complex ORM statements that make use of lots of ORM aliases with subqueries, cache key generation could produce excessively large keys that were orders of magnitude bigger than the statement itself. Much thanks to Rollo Konig Brock for their very patient, long term help in finally identifying this issue. Also within TypeEngine objects, when we generate elements for instance variables, skip the None elements at least. this also saves on tuple complexity. Fixes: #8790 Change-Id: I448ddbfb45ae0a648815be8dad4faad7d1977427 (cherry picked from commit 88c240d907a9ae3b5caf766009edd196a30cece3)
* accommodate for "clone" of ColumnClauseMike Bayer2021-12-211-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | for use with the ClauseElement.params() method, altered ColumnClause._clone() so that while the element stays immutable, if the column is associated with a subquery, it returns a new version of itself as corresponding to a clone of the subquery. this allows processing functions to access the parameters in the subquery and produce a copy of it. The use case here is the expanded use of .params() within loader strategies that use HasCacheKey._apply_params_to_element(). Fixed issue in new "loader criteria" method :meth:`_orm.PropComparator.and_` where usage with a loader strategy like :func:`_orm.selectinload` against a column that was a member of the ``.c.`` collection of a subquery object, where the subquery would be dynamically added to the FROM clause of the statement, would be subject to stale parameter values within the subquery in the SQL statement cache, as the process used by the loader strategy to replace the parameters at execution time would fail to accommodate the subquery when received in this form. Fixes: #7489 Change-Id: Ibb3b6af140b8a62a2c8d05b2ac92e86ca3013c46 (cherry picked from commit 267e9cbf6e3c165a4e953b49d979d7f4ddc533f9)
* Warn when caching is disabled / documentMike Bayer2021-12-061-3/+98
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds new warnings for all elements that don't indicate their caching behavior, including user-defined ClauseElement subclasses and third party dialects. it additionally adds new documentation to discuss an apparent performance degradation in 1.4 when caching is disabled as a result in the significant expense incurred by ORM lazy loaders, which in 1.3 used BakedQuery so were actually cached. As a result of adding the warnings, a fair degree of lesser used SQL expression objects identified that they did not define caching behavior so would have been producing ``[no key]``, including PostgreSQL constructs ``hstore`` and ``array``. These have been amended to use inherit cache where appropriate. "on conflict" constructs in PostgreSQL, MySQL, SQLite still explicitly don't generate a cache key at this time. The change also adds a test for all constructs via assert_compile() to assert they will not generate cache warnings. Fixes: #7394 Change-Id: I85958affbb99bfad0f5efa21bc8f2a95e7e46981 (cherry picked from commit 22deafe15289d2be55682e1632016004b02b62c0)
* accommodate for untracked boundparam lambda in offline_stringMike Bayer2021-08-051-8/+8
| | | | | | | | | | Fixed an issue in the ``CacheKey.to_offline_string()`` method used by the dogpile.caching example where attempting to create a proper cache key from the special "lambda" query generated by the lazy loader would fail to include the parameter values, leading to an incorrect cache key. Fixes: #6858 Change-Id: Ice27087583c6f3ff79cf7d5b879e5dd0a4e58158
* memoize current options and joins w with_entities/with_only_colsMike Bayer2021-06-171-0/+18
| | | | | | | | | | | | | | | | Fixed further regressions in the same area as that of :ticket:`6052` where loader options as well as invocations of methods like :meth:`_orm.Query.join` would fail if the left side of the statement for which the option/join depends upon were replaced by using the :meth:`_orm.Query.with_entities` method, or when using 2.0 style queries when using the :meth:`_sql.Select.with_only_columns` method. A new set of state has been added to the objects which tracks the "left" entities that the options / join were made against which is memoized when the lead entities are changed. Fixes: #6503 Fixes: #6253 Change-Id: I211b2af98b0b20d1263fb15dc513884dcc5de6a4
* Fix adaption in AnnotatedLabel; repair needless expense in coercionMike Bayer2021-05-281-1/+0
| | | | | | | | | | | | | | | Fixed regression involving clause adaption of labeled ORM compound elements, such as single-table inheritance discriminator expressions with conditionals or CASE expressions, which could cause aliased expressions such as those used in ORM join / joinedload operations to not be adapted correctly, such as referring to the wrong table in the ON clause in a join. This change also improves a performance bump that was located within the process of invoking :meth:`_sql.Select.join` given an ORM attribute as a target. Fixes: #6550 Change-Id: I98906476f0cce6f41ea00b77c789baa818e9d167
* Correct cache key for proxy_owner, with_context_optionsMike Bayer2021-05-101-0/+18
| | | | | | | | | | | | | | | | | | Fixed issue in subquery loader strategy which prevented caching from working correctly. This would have been seen in the logs as a "generated" message instead of "cached" for all subqueryload SQL emitted, which by saturating the cache with new keys would degrade overall performance; it also would produce "LRU size alert" warnings. In this issue we also observe that the local LRU cache for lazyloader and selectinloader will get used for all subsequent loads as well, which makes it more likely to hit the limit of 30. However rather than trying to work this out, it would be better if we removed the loader-local LRU caches altogether once we are confident these are working well. Fixes: #6459 Change-Id: Id953e8f75536bb87f7e3315929cebcd8f84a5a50
* don't cache TypeDecorator by defaultMike Bayer2021-05-061-1/+5
| | | | | | | | | | The :class:`.TypeDecorator` class will now emit a warning when used in SQL compilation with caching unless the ``.cache_ok`` flag is set to ``True`` or ``False``. ``.cache_ok`` indicates that all the parameters passed to the object are safe to be used as a cache key, ``False`` means they are not. Fixes: #6436 Change-Id: Ib1bb7dc4b124e38521d615c2e2e691e4915594fb
* Adapt loader_criteria params for current queryMike Bayer2021-03-261-0/+8
| | | | | | | | | | | | | | | | | | Fixed critical issue in the new :meth:`_orm.PropComparator.and_` feature where loader strategies that emit secondary SELECT statements such as :func:`_orm.selectinload` and :func:`_orm.lazyload` would fail to accommodate for bound parameters in the user-defined criteria in terms of the current statement being executed, as opposed to the cached statement, causing stale bound values to be used. This also adds a warning for the case where an object that uses :func:`_orm.lazyload` in conjunction with :meth:`_orm.PropComparator.and_` is attempted to be serialized; the loader criteria cannot reliably be serialized and deserialized and eager loading should be used for this case. Fixes: #6139 Change-Id: I5a638bbecb7b583db2d3c0b76469f5a25c13dd3b
* Use explicit names for mapper _get_clause parametersMike Bayer2021-03-171-2/+9
| | | | | | | | | | | | | | | | | | | | | | | | Fixed a critical regression in the relationship lazy loader where the SQL criteria used to fetch a related many-to-one object could go stale in relation to other memoized structures within the loader if the mapper had configuration changes, such as can occur when mappers are late configured or configured on demand, producing a comparison to None and returning no object. Huge thanks to Alan Hamlett for their help tracking this down late into the night. The primary change is that mapper._get_clause() uses a fixed name for its bound parameters, which is memoized under a lambda statement in the case of many-to-one lazy loading. This has implications for some other logic namely the .compare() used by loader strategies to determine use_get needed to be adjusted. This change also repairs the lambda module's behavior of removing the "required" flag from bound parameters, which caused this issue to also fail silently rather than issuing an error for a required bind parameter. Fixes: #6055 Change-Id: I19e1aba9207a049873e0f13c19bad7541e223cfd
* Implement support for functions as FROM with columns clause supportMike Bayer2021-02-031-1/+4
| | | | | | | | | | | | | | | | Implemented support for "table valued functions" along with additional syntaxes supported by PostgreSQL, one of the most commonly requested features. Table valued functions are SQL functions that return lists of values or rows, and are prevalent in PostgreSQL in the area of JSON functions, where the "table value" is commonly referred towards as the "record" datatype. Table valued functions are also supported by Oracle and SQL Server. Moved from I5b093b72533ef695293e737eb75850b9713e5e03 due to accidental push Fixes: #3566 Change-Id: Iea36d04c80a5ed3509dcdd9ebf0701687143fef5
* Ensure the "orm" plugin is used unconditionally for bundlesMike Bayer2020-11-161-1/+3
| | | | | | | | | | | | | | This also introduces that the "orm" plugin may be used when the plugin_subject is None. Fixed regression where the :paramref:`.Bundle.single_entity` flag would take effect for a :class:`.Bundle` even though it were not set. Additionally, this flag is legacy as it only makes sense for the :class:`_orm.Query` object and not 2.0 style execution. a deprecation warning is emitted when used with new-style execution. Fixes: #5702 Change-Id: If9f43365b9d966cb398890aeba2efa555658a7e5
* Reduce import time overheadMike Bayer2020-11-031-6/+1
| | | | | | | | | | | | | | | | | | * Fix subclass traversals to not run classes multiple times * switch compiler visitor to use an attrgetter, to avoid an eval() at startup time * don't pre-generate traversal functions, there's lots of these which are expensive to generate at once and most applications won't use them all; have it generate them on first use instead * Some ideas about removing asyncio imports, they don't seem to be too signficant, apply some more simplicity to the overall "greenlet fallback" situation Fixes: #5681 Change-Id: Ib564ddaddb374787ce3e11ff48026e99ed570933
* Robustness for lambdas, lambda statementsMike Bayer2020-08-051-3/+65
| | | | | | | | | | | | | | | | | | | | in order to accommodate relationship loaders with lambda caching, a lot more is needed. This is a full refactor of the lambda system such that it now has two levels of caching; the first level caches what can be known from the __code__ element, then the next level of caching is against the lambda itself and the contents of __closure__. This allows for the elements inside the lambdas, like columns and entities, to change and then be part of the cache key. Lazy/selectinloads' use of baked queries had to add distinct cache key elements, which was attempted here but overall things needed to be more robust than that. This commit is broken out from the very long and sprawling commit at Id6b5c03b1ce9ddb7b280f66792212a0ef0a1c541 . Change-Id: I29a513c98917b1d503abfdd61e6b6e8800851aa8
* introduce deferred lambdasMike Bayer2020-07-031-48/+43
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The coercions system allows us to add in lambdas as arguments to Core and ORM elements without changing them at all. By allowing the lambda to produce a deterministic cache key where we can also cheat and yank out literal parameters means we can move towards having 90% of "baked" functionality in a clearer way right in Core / ORM. As a second step, we can have whole statements inside the lambda, and can then add generation with __add__(), so then we have 100% of "baked" functionality with full support of ad-hoc literal values. Adds some more short_selects tests for the moment for comparison. Other tweaks inside cache key generation as we're trying to approach a certain level of performance such that we can remove the use of "baked" from the loader strategies. As we have not yet closed #4639, however the caching feature has been fully integrated as of b0cfa7379cf8513a821a3dbe3028c4965d9f85bd, we will also add complete caching documentation here and close that issue as well. Closes: #4639 Fixes: #5380 Change-Id: If91f61527236fd4d7ae3cad1f24c38be921c90ba
* Fix a wide variety of typos and broken linksaplatkouski2020-06-251-3/+3
| | | | | | | | | | | | Note the PR has a few remaining doc linking issues listed in the comment that must be addressed separately. Signed-off-by: aplatkouski <5857672+aplatkouski@users.noreply.github.com> Closes: #5371 Pull-request: https://github.com/sqlalchemy/sqlalchemy/pull/5371 Pull-request-sha: 7e7d233cf3a0c66980c27db0fcdb3c7d93bc2510 Change-Id: I9c36e8d8804483950db4b42c38ee456e384c59e3
* Turn on caching everywhere, add loggingMike Bayer2020-06-101-17/+112
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A variety of caching issues found by running all tests with statement caching turned on. The cache system now has a more conservative approach where any subclass of a SQL element will by default invalidate the cache key unless it adds the flag inherit_cache=True at the class level, or if it implements its own caching. Add working caching to a few elements that were omitted previously; fix some caching implementations to suit lesser used edge cases such as json casts and array slices. Refine the way BaseCursorResult and CursorMetaData interact with caching; to suit cases like Alembic modifying table structures, don't cache the cursor metadata if it were created against a cursor.description using non-positional matching, e.g. "select *". if a table re-ordered its columns or added/removed, now that data is obsolete. Additionally we have to adapt the cursor metadata _keymap regardless of if we just processed cursor.description, because if we ran against a cached SQLCompiler we won't have the right columns in _keymap. Other refinements to how and when we do this adaption as some weird cases were exposed in the Postgresql dialect, a text() construct that names just one column that is not actually in the statement. Fixed that also as it looks like a cut-and-paste artifact that doesn't actually affect anything. Various issues with re-use of compiled result maps and cursor metadata in conjunction with tables being changed, such as change in order of columns. mappers can be cleared but the class remains, meaning a mapper has to use itself as the cache key not the class. lots of bound parameter / literal issues, due to Alembic creating a straight subclass of bindparam that renders inline directly. While we can update Alembic to not do this, we have to assume other people might be doing this, so bindparam() implements the inherit_cache=True logic as well that was a bit involved. turn on cache stats in logging. Includes a fix to subqueryloader which moves all setup to the create_row_processor() phase and elminates any storage within the compiled context. This includes some changes to create_row_processor() signature and a revising of the technique used to determine if the loader can participate in polymorphic queries, which is also applied to selectinloading. DML update.values() and ordered_values() now coerces the keys as we have tests that pass an arbitrary class here which only includes __clause_element__(), so the key can't be cached unless it is coerced. this in turn changed how composite attributes support bulk update to use the standard approach of ClauseElement with annotations that are parsed in the ORM context. memory profiling successfully caught that the Session from Query was getting passed into _statement_20() so that was a big win for that test suite. Apparently Compiler had .execute() and .scalar() methods stuck on it, these date back to version 0.4 and there was a single test in the PostgreSQL dialect tests that exercised it for no apparent reason. Removed these methods as well as the concept of a Compiler holding onto a "bind". Fixes: #5386 Change-Id: I990b43aab96b42665af1b2187ad6020bee778784
* Convert bulk update/delete to new execution modelMike Bayer2020-06-061-15/+43
| | | | | | | | | | | | | | | This reorganizes the BulkUD model in sqlalchemy.orm.persistence to be based on the CompileState concept and to allow plain update() / delete() to be passed to session.execute() where the ORM synchronize session logic will take place. Also gets "synchronize_session='fetch'" working with horizontal sharding. Adding a few more result.scalar_one() types of methods as scalar_one() seems like what is normally desired. Fixes: #5160 Change-Id: I8001ebdad089da34119eb459709731ba6c0ba975
* Improve rendering of core statements w/ ORM elementsMike Bayer2020-05-311-9/+9
| | | | | | | | | | | | | | | | | | | | This patch contains a variety of ORM and expression layer tweaks to support ORM constructs in select() statements, without the 1.3.x requiremnt in Query that a full _compile_context() + new select() is needed in order to get a working statement object. Includes such tweaks as the ability to implement aliased class of an aliased class, as we are looking to fully support ACs against subqueries, as well as the ability to access anonymously-labeled ColumnProperty expressions within subqueries by naming the ".key" of the label after the property key. Some tuning to query.join() as well as ORMJoin internals to allow things to work more smoothly. Change-Id: Id810f485c5f7ed971529489b84694e02a3356d6d
* callcount reductions and refinement for cached queriesMike Bayer2020-05-281-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit includes that we've removed the "_orm_query" attribute from compile state as well as query context. The attribute created reference cycles and also added method call overhead. As part of this change, the interface for ORMExecuteState changes a bit, as well as the interface for the horizontal sharding extension which now deprecates the "query_chooser" callable in favor of "execute_chooser", which receives the contextual object. This will also work more nicely when we implement the new execution path for bulk updates and deletes. Pre-merge execution options for statement, connection, arguments all up front in Connection. that way they can be passed to the before_execute / after_execute events, and the ExecutionContext doesn't have to merge as second time. Core execute is pretty close to 1.3 now. baked wasn't using the new one()/first()/one_or_none() methods, fixed that. Convert non-buffered cursor strategy to be a stateless singleton. inline all the paths by which the strategy gets chosen, oracle and SQL Server dialects make use of the already-invoked post_exec() hook to establish the alternate strategies, and this is actually much nicer than it was before. Add caching to mapper instance processor for getters. Identified a reference cycle per query that was showing up as a lot of gc cleanup, fixed that. After all that, performance not budging much. Even test_baked_query now runs with significantly fewer function calls than 1.3, still 40% slower. Basically something about the new patterns just makes this slower and while I've walked a whole bunch of them back, it hardly makes a dent. that said, the performance issues are relatively small, in the 20-40% time increase range, and the new caching feature does provide for regular ORM and Core queries that are cached, and they are faster than non-cached. Change-Id: I7b0b0d8ca550c05f79e82f75cd8eff0bbfade053
* Convert execution to move through SessionMike Bayer2020-05-251-6/+57
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch replaces the ORM execution flow with a single pathway through Session.execute() for all queries, including Core and ORM. Currently included is full support for ORM Query, Query.from_statement(), select(), as well as the baked query and horizontal shard systems. Initial changes have also been made to the dogpile caching example, which like baked query makes use of a new ORM-specific execution hook that replaces the use of both QueryEvents.before_compile() as well as Query._execute_and_instances() as the central ORM interception hooks. select() and Query() constructs alike can be passed to Session.execute() where they will return ORM results in a Results object. This API is currently used internally by Query. Full support for Session.execute()->results to behave in a fully 2.0 fashion will be in later changesets. bulk update/delete with ORM support will also be delivered via the update() and delete() constructs, however these have not yet been adapted to the new system and may follow in a subsequent update. Performance is also beginning to lag as of this commit and some previous ones. It is hoped that a few central functions such as the coercions functions can be rewritten in C to re-gain performance. Additionally, query caching is now available and some subsequent patches will attempt to cache more of the per-execution work from the ORM layer, e.g. column getters and adapters. This patch also contains initial "turn on" of the caching system enginewide via the query_cache_size parameter to create_engine(). Still defaulting at zero for "no caching". The caching system still needs adjustments in order to gain adequate performance. Change-Id: I047a7ebb26aa85dc01f6789fac2bff561dcd555d
* Unify Query and select() , move all processing to compile phaseMike Bayer2020-05-241-43/+173
| | | | | | | | | | | | | | | | | | | | | | | | | | Convert Query to do virtually all compile state computation in the _compile_context() phase, and organize it all such that a plain select() construct may also be used as the source of information in order to generate ORM query state. This makes it such that Query is not needed except for its additional methods like from_self() which are all to be deprecated. The construction of ORM state will occur beyond the caching boundary when the new execution model is integrated. future select() gains a working join() and filter_by() method. as we continue to rebase and merge each commit in the steps, callcounts continue to bump around. will have to look at the final result when it's all in. References: #5159 References: #4705 References: #4639 References: #4871 References: #5010 Change-Id: I19e05b3424b07114cce6c439b05198ac47f7ac10
* Streamline visitors.iterateMike Bayer2020-05-181-12/+7
| | | | | | | | | | | | | | | | | | | | This method might be used more significantly in the ORM refactor, so further refine it. * all get_children() methods now work entirely based on iterators. Basically only select() was sensitive to this anymore and it now chains the iterators together * remove all kinds of flags like column_collections, schema_visitor that apparently aren't used anymore. * remove the "depthfirst" visitors as these don't seem to be used either. * make sure select() yields its columns first as these will be used to determine the current mapper. Change-Id: I05273a2d5306a57c2d1b0979050748cf3ac964bf
* Run search and replace of symbolic module namesMike Bayer2020-04-141-1/+1
| | | | | | | | Replaces a wide array of Sphinx-relative doc references with an abbreviated absolute form now supported by zzzeeksphinx. Change-Id: I94bffcc3f37885ffdde6238767224296339698a2
* Repair caching / traversals for valuesMike Bayer2020-04-011-38/+39
| | | | | | | | | | | | | | The test suite wasn't running the copy_internals most fixtures, enable that and try to get all cases working. Set up selectable.values to do tuple conversion within compilation step. at the same time, disable caching for selectable.values for the moment and make it equivalent to dml_multi_values. fix cache / compare / copy cases for dml_values and dml_multi_values which weren't fully tested or covered. Change-Id: I484ca6e9cb2b66c2e6a321698f2abc0838db1460
* Try to measure new style caching in the ORM, take twoMike Bayer2020-04-011-3/+57
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Supercedes: If78fbb557c6f2cae637799c3fec2cbc5ac248aaf Trying to see if by making the cache key memoized, we still can have the older "identity" form of caching which is the cheapest of all, at the same time as the newer "cache key each time" version that is not nearly as cheap; but still much cheaper than no caching at all. Also needed is a per-execution update of _keymap when we invoke from a cached select, so that Column objects that are anonymous or otherwise adapted will match up. this is analogous to the adaption of bound parameters from the cache key. Adds test coverage for the keymap / construct_params() changes related to caching. Also hones performance to a large extent for statement construction and cache key generation. Also includes a new memoized attribute approach that vastly simplifies the previous approach of "group_expirable_memoized_property" and finally integrates cleanly with _clone(), _generate(), etc. no more hardcoding of attributes is needed, as well as that most _reset_memoization() calls are no longer needed as the reset is inherent in a _generate() call; this also has dramatic performance improvements. Change-Id: I95c560ffcbfa30b26644999412fb6a385125f663
* Merge "Rework select(), CompoundSelect() in terms of CompileState"mike bayer2020-03-111-0/+5
|\
| * Rework select(), CompoundSelect() in terms of CompileStateMike Bayer2020-03-101-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Continuation of I408e0b8be91fddd77cf279da97f55020871f75a9 - add an options() method to the base Generative construct. this will be where ORM options can go - Change Null, False_, True_ to be singletons, so that we aren't instantiating them and having to use isinstance. The previous issue with this was that they would produce dupe labels in SELECT statements. Apply the duplicate column logic, newly added in 1.4, to these objects as well as to non-apply-labels SELECT statements in general as a means of improving this. - create a revised system for generating ClauseList compilation constructs that simplfies up front creation to not actually use ClauseList; a simple tuple is rendered by the compiler using the same constrcution rules as what are used for ClauseList but without creating the actual object. Apply to Select, CompoundSelect, revise Update, Delete - Select, CompoundSelect get an initial CompileState implementation. All methods used only within compilation are moved here - refine update/insert/delete compile state to not require an outside boolean - refine and simplify Select._copy_internals - rework bind(), which is going away, to not use some of the internal traversal stuff - remove "autocommit", "for_update" parameters from Select, references #4643 - remove "autocommit" parameter from TextClause , references #4643 - add deprecation warnings for statement.execute(), engine.execute(), statement.scalar(), engine.scalar(). Fixes: #5193 Change-Id: I04ca0152b046fd42c5054ba10f37e43fc6e5a57b
* | Simplified module pre-loading strategy and made it linter friendlyFederico Caselli2020-03-071-3/+3
|/ | | | | | | | | | | | | | | | | Introduced a modules registry to register modules that should be lazily loaded in the package init. This ensures that they are in the system module cache, avoiding potential thread safety issues as when importing them directly in the function that uses them. The module registry is used to obtain these modules directly, ensuring that the all the lazily loaded modules are resolved at the proper time This replaces dependency_for decorator and the dependencies decorator logic, removing the need to pass the resolved modules as arguments of the decodated functions and removes possible errors caused by linters. Fixes: #4689 Fixes: #4656 Change-Id: I2e291eba4297867fc0ddb5d875b9f7af34751d01
* Decouple compiler state from DML objects; make cacheableMike Bayer2020-03-061-5/+213
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Targeting select / insert / update / delete, the goal is to minimize overhead of construction and generative methods so that only the raw arguments passed are handled. An interim stage that converts the raw state into more compiler-ready state is added, which is analogous to the ORM QueryContext which will also be rolled in to be a similar concept, as is currently being prototyped in I19e05b3424b07114cce6c439b05198ac47f7ac10. the ORM update/delete BulkUD concept is also going to be rolled onto this idea. So while the compiler-ready state object, here called DMLState, looks a little thin, it's the base of a bigger pattern that will allow for ORM functionality to embed itself directly into the compiler, execution context, and result set objects. This change targets the DML objects, primarily focused on the values() method which is the most complex process. The work done by values() is minimized as much as possible while still being able to create a cache key. Additional computation is then offloaded to a new object ValuesState that is handled by the compiler. Architecturally, a big change here is that insert.values() and update.values() will generate BindParameter objects for the values now, which are then carefully received by crud.py so that they generate the expected names. This is so that the values() portion of these constructs is cacheable. for the "multi-values" version of Insert, this is all skipped and the plan right now is that a multi-values insert is not worth caching (can always be revisited). Using the coercions system in values() also gets us nicer validation for free, we can remove the NotAClauseElement thing from schema, and we also now require scalar_subquery() is called for an insert/update that uses a SELECT as a column value, 1.x deprecation path is added. The traversal system is then applied to the DML objects including tests so that they have traversal, cloning, and cache key support. cloning is not a use case for DML however having it present allows better validation of the structure within the tests. Special per-dialect DML is explicitly not cacheable at the moment, more as a proof of concept that third party DML constructs can exist as gracefully not-cacheable rather than producing an incomplete cache key. A few selected performance improvements have been added as well, simplifying the immutabledict.union() method and adding a new SQLCompiler function that can generate delimeter-separated clauses like WHERE and ORDER BY without having to build a ClauseList object at all. The use of ClauseList will be removed from Select in an upcoming commit. Overall, ClaustList is unnecessary for internal use and only adds overhead to statement construction and will likely be removed as much as possible except for explcit use of conjunctions like and_() and or_(). Change-Id: I408e0b8be91fddd77cf279da97f55020871f75a9
* Reorganize core event modules to avoid import cyclesMike Bayer2020-01-211-1/+1
| | | | | | | | | sqlalchemy.sql.naming was causing a full import of engine due to the DDLEvents dependency. Break out pool, DDL and engine events into new modules specific to those packages; resolve some other import cycles in Core also. Change-Id: Ife8d217e58a26ab3605dd80ee70837968f957eaf
* Test for short term reference cycles and resolve as many as possibleMike Bayer2019-12-301-0/+2
| | | | | | | | Added test support and repaired a wide variety of unnecessary reference cycles created for short-lived objects, mostly in the area of ORM queries. Fixes: #5056 Change-Id: Ifd93856eba550483f95f9ae63d49f36ab068b85a
* Ensure comparison includes "don't compare values" featureMike Bayer2019-12-201-0/+8
| | | | | | | | | | upcoming changes for "expanding IN in all cases" and "lambda elements" both rely upon comparisons that work across changing bound values, so commit the testing fixture ahead of time. Additionally, repair the feature itself within traversals. Change-Id: Ie65a512dc64745614180da77435f9f745ce78c71
* Merge remote-tracking branch 'origin/pr/5031'Mike Bayer2019-12-181-2/+2
|\ | | | | | | Change-Id: I74d94087495de2e0b98b180ef1b5269a72d0c3bf
| * fix typo strucures -> structuresFederico Caselli2019-12-111-2/+2
| |
* | Traversal and clause generation performance improvementsMike Bayer2019-12-141-110/+119
|/ | | | | | | Added one traversal test, callcounts have been brought from 29754 to 5173 so far. Change-Id: I164e9831600709ee214c1379bb215fdad73b39aa
* Add anonymizing context to cache keys, comparison; convert traversalMike Bayer2019-11-041-0/+768
Created new visitor system called "internal traversal" that applies a data driven approach to the concept of a class that defines its own traversal steps, in contrast to the existing style of traversal now known as "external traversal" where the visitor class defines the traversal, i.e. the SQLCompiler. The internal traversal system now implements get_children(), _copy_internals(), compare() and _cache_key() for most Core elements. Core elements with special needs like Select still implement some of these methods directly however most of these methods are no longer explicitly implemented. The data-driven system is also applied to ORM elements that take part in SQL expressions so that these objects, like mappers, aliasedclass, query options, etc. can all participate in the cache key process. Still not considered is that this approach to defining traversibility will be used to create some kind of generic introspection system that works across Core / ORM. It's also not clear if real statement caching using the _cache_key() method is feasible, if it is shown that running _cache_key() is nearly as expensive as compiling in any case. Because it is data driven, it is more straightforward to optimize using inlined code, as is the case now, as well as potentially using C code to speed it up. In addition, the caching sytem now accommodates for anonymous name labels, which is essential so that constructs which have anonymous labels can be cacheable, that is, their position within a statement in relation to other anonymous names causes them to generate an integer counter relative to that construct which will be the same every time. Gathering of bound parameters from any cache key generation is also now required as there is no use case for a cache key that does not extract bound parameter values. Applies-to: #4639 Change-Id: I0660584def8627cad566719ee98d3be045db4b8d