Turn on caching everywhere, add logging

A variety of caching issues found by running all tests with statement caching turned on.

* The cache system now takes a more conservative approach: any subclass of a SQL element will by default invalidate the cache key unless it sets the flag inherit_cache=True at the class level, or implements its own caching.

* Add working caching to a few elements that were omitted previously; fix some caching implementations to suit lesser-used edge cases such as JSON casts and array slices.

* Refine the way BaseCursorResult and CursorMetaData interact with caching. To suit cases like Alembic modifying table structures, don't cache the cursor metadata if it was created against a cursor.description using non-positional matching, e.g. "select *". If a table re-ordered its columns or added/removed columns, that data is now obsolete.

* Additionally, we have to adapt the cursor metadata _keymap regardless of whether we just processed cursor.description, because if we ran against a cached SQLCompiler we won't have the right columns in _keymap.

* Other refinements to how and when we do this adaptation, as some odd cases were exposed in the PostgreSQL dialect: a text() construct that names just one column that is not actually in the statement. Fixed that as well; it looks like a cut-and-paste artifact that doesn't actually affect anything.

* Various issues with re-use of compiled result maps and cursor metadata in conjunction with tables being changed, such as a change in the order of columns.

* Mappers can be cleared while the class remains, meaning a mapper has to use itself as the cache key, not the class.

* Lots of bound parameter / literal issues, due to Alembic creating a straight subclass of bindparam that renders inline directly. While we can update Alembic to not do this, we have to assume other people might be doing this, so bindparam() implements the inherit_cache=True logic as well, which was a bit involved.

* Turn on cache stats in logging.

* Includes a fix to subqueryloader which moves all setup to the create_row_processor() phase and eliminates any storage within the compiled context. This includes some changes to the create_row_processor() signature and a revision of the technique used to determine whether the loader can participate in polymorphic queries, which is also applied to selectinloading.

* DML update.values() and ordered_values() now coerce the keys, as we have tests that pass an arbitrary class here which only includes __clause_element__(), so the key can't be cached unless it is coerced. This in turn changed how composite attributes support bulk update, which now uses the standard approach of ClauseElement with annotations that are parsed in the ORM context.

* Memory profiling successfully caught that the Session from Query was getting passed into _statement_20(), so that was a big win for that test suite.

* Apparently Compiler had .execute() and .scalar() methods stuck on it. These date back to version 0.4, and a single test in the PostgreSQL dialect tests exercised them for no apparent reason. Removed these methods as well as the concept of a Compiler holding onto a "bind".

Fixes: #5386
Change-Id: I990b43aab96b42665af1b2187ad6020bee778784
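
The opt-in rule described in the message (a subclass is uncacheable unless it sets inherit_cache=True or declares its own traversal) can be illustrated with a standalone sketch. This is not SQLAlchemy code: the Element class, the cache_traversal() helper and the subclass names are hypothetical, and the logic is a simplified reading of _generate_cache_attrs() in the diff, which additionally consults _cache_key_traversal and builds a dispatcher.

```python
# Simplified, hypothetical model of the inherit_cache rule; not SQLAlchemy's API.
NO_CACHE = object()


class Element:
    # traversal spec declared directly on the base class
    _traverse_internals = [("name", "string")]

    @classmethod
    def cache_traversal(cls):
        """Return the traversal spec used for cache keys, or NO_CACHE."""
        # the flag is read from cls.__dict__, so it is never inherited
        if cls.__dict__.get("inherit_cache", False):
            # opted in: resolve the spec through normal inheritance
            return getattr(cls, "_traverse_internals", NO_CACHE)
        # not opted in: only a spec declared on this exact class counts
        return cls.__dict__.get("_traverse_internals", NO_CACHE)


class PlainSubclass(Element):
    pass  # no flag, no spec of its own: treated as uncacheable


class OptInSubclass(Element):
    inherit_cache = True  # asserts that the parent's spec is still accurate


class OwnSpecSubclass(Element):
    _traverse_internals = [("name", "string"), ("extra", "string")]


class Grandchild(OptInSubclass):
    pass  # the parent's flag does not carry over
```

Reading the flag from the class's own __dict__ rather than with getattr() is what makes the default conservative: each new subclass has to state for itself that the inherited traversal still describes all of its state.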
author: Mike Bayer <mike_mp@zzzcomputing.com> 2020-06-06 20:40:43 -0400
committer: Mike Bayer <mike_mp@zzzcomputing.com> 2020-06-10 15:29:01 -0400
commit: b0cfa7379cf8513a821a3dbe3028c4965d9f85bd (patch)
tree: 19a79632b4f159092d955765ff9f7e842808bce7 /lib/sqlalchemy/sql/traversals.py
parent: 3ab2364e78641c4f0e4b6456afc2cbed39b0d0e6 (diff)
download: sqlalchemy-b0cfa7379cf8513a821a3dbe3028c4965d9f85bd.tar.gz
1 files changed, 112 insertions, 17 deletions
diff --git a/lib/sqlalchemy/sql/traversals.py b/lib/sqlalchemy/sql/traversals.py
index 68281f33d..ed0bfa27a 100644
--- a/lib/sqlalchemy/sql/traversals.py
+++ b/lib/sqlalchemy/sql/traversals.py
@@ -19,6 +19,7 @@ NO_CACHE = util.symbol("no_cache")
 CACHE_IN_PLACE = util.symbol("cache_in_place")
 CALL_GEN_CACHE_KEY = util.symbol("call_gen_cache_key")
 STATIC_CACHE_KEY = util.symbol("static_cache_key")
+PROPAGATE_ATTRS = util.symbol("propagate_attrs")
 ANON_NAME = util.symbol("anon_name")
 
 
@@ -31,10 +32,74 @@ def compare(obj1, obj2, **kw):
     return strategy.compare(obj1, obj2, **kw)
 
 
+def _preconfigure_traversals(target_hierarchy):
+
+    stack = [target_hierarchy]
+    while stack:
+        cls = stack.pop()
+        stack.extend(cls.__subclasses__())
+
+        if hasattr(cls, "_traverse_internals"):
+            cls._generate_cache_attrs()
+            _copy_internals.generate_dispatch(
+                cls,
+                cls._traverse_internals,
+                "_generated_copy_internals_traversal",
+            )
+            _get_children.generate_dispatch(
+                cls,
+                cls._traverse_internals,
+                "_generated_get_children_traversal",
+            )
+
+
 class HasCacheKey(object):
     _cache_key_traversal = NO_CACHE
     __slots__ = ()
 
+    @classmethod
+    def _generate_cache_attrs(cls):
+        """generate cache key dispatcher for a new class.
+
+        This sets the _generated_cache_key_traversal attribute once called
+        so should only be called once per class.
+
+        """
+        inherit = cls.__dict__.get("inherit_cache", False)
+
+        if inherit:
+            _cache_key_traversal = getattr(cls, "_cache_key_traversal", None)
+            if _cache_key_traversal is None:
+                try:
+                    _cache_key_traversal = cls._traverse_internals
+                except AttributeError:
+                    cls._generated_cache_key_traversal = NO_CACHE
+                    return NO_CACHE
+
+            # TODO: wouldn't we instead get this from our superclass?
+            # also, our superclass may not have this yet, but in any case,
+            # we'd generate for the superclass that has it.   this is a little
+            # more complicated, so for the moment this is a little less
+            # efficient on startup but simpler.
+            return _cache_key_traversal_visitor.generate_dispatch(
+                cls, _cache_key_traversal, "_generated_cache_key_traversal"
+            )
+        else:
+            _cache_key_traversal = cls.__dict__.get(
+                "_cache_key_traversal", None
+            )
+            if _cache_key_traversal is None:
+                _cache_key_traversal = cls.__dict__.get(
+                    "_traverse_internals", None
+                )
+                if _cache_key_traversal is None:
+                    cls._generated_cache_key_traversal = NO_CACHE
+                    return NO_CACHE
+
+            return _cache_key_traversal_visitor.generate_dispatch(
+                cls, _cache_key_traversal, "_generated_cache_key_traversal"
+            )
+
     @util.preload_module("sqlalchemy.sql.elements")
     def _gen_cache_key(self, anon_map, bindparams):
         """return an optional cache key.
@@ -72,14 +137,18 @@ class HasCacheKey(object):
         else:
             id_ = None
 
-        _cache_key_traversal = self._cache_key_traversal
-        if _cache_key_traversal is None:
-            try:
-                _cache_key_traversal = self._traverse_internals
-            except AttributeError:
-                _cache_key_traversal = NO_CACHE
+        try:
+            dispatcher = self.__class__.__dict__[
+                "_generated_cache_key_traversal"
+            ]
+        except KeyError:
+            # most of the dispatchers are generated up front
+            # in sqlalchemy/sql/__init__.py ->
+            # traversals.py-> _preconfigure_traversals().
+            # this block will generate any remaining dispatchers.
+            dispatcher = self.__class__._generate_cache_attrs()
 
-        if _cache_key_traversal is NO_CACHE:
+        if dispatcher is NO_CACHE:
             if anon_map is not None:
                 anon_map[NO_CACHE] = True
             return None
@@ -87,19 +156,13 @@ class HasCacheKey(object):
         result = (id_, self.__class__)
 
         # inline of _cache_key_traversal_visitor.run_generated_dispatch()
-        try:
-            dispatcher = self.__class__.__dict__[
-                "_generated_cache_key_traversal"
-            ]
-        except KeyError:
-            dispatcher = _cache_key_traversal_visitor.generate_dispatch(
-                self, _cache_key_traversal, "_generated_cache_key_traversal"
-            )
 
         for attrname, obj, meth in dispatcher(
             self, _cache_key_traversal_visitor
         ):
             if obj is not None:
+                # TODO: see if C code can help here as Python lacks an
+                # efficient switch construct
                 if meth is CACHE_IN_PLACE:
                     # cache in place is always going to be a Python
                     # tuple, dict, list, etc. so we can do a boolean check
@@ -116,6 +179,15 @@ class HasCacheKey(object):
                         attrname,
                         obj._gen_cache_key(anon_map, bindparams),
                     )
+                elif meth is PROPAGATE_ATTRS:
+                    if obj:
+                        result += (
+                            attrname,
+                            obj["compile_state_plugin"],
+                            obj["plugin_subject"]._gen_cache_key(
+                                anon_map, bindparams
+                            ),
+                        )
                 elif meth is InternalTraversal.dp_annotations_key:
                     # obj is here is the _annotations dict.   however,
                     # we want to use the memoized cache key version of it.
@@ -332,6 +404,8 @@ class _CacheKey(ExtendedInternalTraversal):
     visit_type = STATIC_CACHE_KEY
     visit_anon_name = ANON_NAME
 
+    visit_propagate_attrs = PROPAGATE_ATTRS
+
     def visit_inspectable(self, attrname, obj, parent, anon_map, bindparams):
         return (attrname, inspect(obj)._gen_cache_key(anon_map, bindparams))
 
@@ -445,10 +519,16 @@ class _CacheKey(ExtendedInternalTraversal):
     def visit_setup_join_tuple(
         self, attrname, obj, parent, anon_map, bindparams
     ):
+        is_legacy = "legacy" in attrname
+
         return tuple(
             (
-                target._gen_cache_key(anon_map, bindparams),
-                onclause._gen_cache_key(anon_map, bindparams)
+                target
+                if is_legacy and isinstance(target, str)
+                else target._gen_cache_key(anon_map, bindparams),
+                onclause
+                if is_legacy and isinstance(onclause, str)
+                else onclause._gen_cache_key(anon_map, bindparams)
                 if onclause is not None
                 else None,
                 from_._gen_cache_key(anon_map, bindparams)
@@ -711,6 +791,11 @@ class _CopyInternals(InternalTraversal):
             for sequence in element
         ]
 
+    def visit_propagate_attrs(
+        self, attrname, parent, element, clone=_clone, **kw
+    ):
+        return element
+
 
 _copy_internals = _CopyInternals()
 
@@ -782,6 +867,9 @@ class _GetChildren(InternalTraversal):
     def visit_dml_multi_values(self, element, **kw):
         return ()
 
+    def visit_propagate_attrs(self, element, **kw):
+        return ()
+
 
 _get_children = _GetChildren()
 
@@ -916,6 +1004,13 @@ class TraversalComparatorStrategy(InternalTraversal, util.MemoizedSlots):
         ):
             return COMPARE_FAILED
 
+    def visit_propagate_attrs(
+        self, attrname, left_parent, left, right_parent, right, **kw
+    ):
+        return self.compare_inner(
+            left.get("plugin_subject", None), right.get("plugin_subject", None)
+        )
+
     def visit_has_cache_key_list(
         self, attrname, left_parent, left, right_parent, right, **kw
     ):
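
The dispatcher lookup that this patch moves to the top of _gen_cache_key() (try the class's own __dict__, generate on KeyError) can be sketched in isolation as follows. The Node and Sized classes, the fields attribute and the _generate_dispatch() helper are hypothetical stand-ins for illustration, not part of SQLAlchemy.

```python
# Hypothetical sketch of per-class lazy dispatcher generation; not SQLAlchemy code.
class Node:
    fields = ("name",)

    def __init__(self, name):
        self.name = name

    @classmethod
    def _generate_dispatch(cls):
        """Build a dispatcher for cls and store it on cls itself."""
        names = cls.fields

        def dispatcher(obj):
            return tuple((n, getattr(obj, n)) for n in names)

        cls._generated_dispatch = dispatcher
        return dispatcher

    def cache_key(self):
        try:
            # look only at this exact class, never at a superclass, so a
            # dispatcher generated for a parent is not reused by mistake
            dispatcher = self.__class__.__dict__["_generated_dispatch"]
        except KeyError:
            dispatcher = self.__class__._generate_dispatch()
        return (self.__class__,) + dispatcher(self)


class Sized(Node):
    fields = ("name", "size")

    def __init__(self, name, size):
        super().__init__(name)
        self.size = size


parent_key = Node("a").cache_key()   # generates Node's dispatcher
child_key = Sized("a", 3).cache_key()  # generates a separate one for Sized
```

Using getattr() here would find the parent's dispatcher on the subclass and silently omit the size field from the key, which is the kind of stale inheritance the __dict__ lookup avoids.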