Extend CTE search optimisation to projects (extend-cte-optimisations-to-projects)

When we use the `search` param on an `IssuableFinder`, we can run into issues. We have trigram indexes to support these searches. On GitLab.com, we often see Postgres's optimiser prioritise the (global) trigram indexes over the index on `project_id`. For group and project searches, we know that it will be quicker to filter by `project_id` first, as it returns fewer rows in most cases.

For group issues search, we ran into this issue previously, and went through the following iterations:

1. Use a CTE on the project IDs as an optimisation fence. This prevents the planner from disregarding the index on `project_id`. Unfortunately it breaks some types of sorting, like priority and popularity, as they sort on a joined table.
2. Use a subquery for listing issues, and a CTE for counts. The subquery - in the case of group lists - didn't help as much as the CTE, but was faster than not including it. We can safely use a CTE for counts as they don't have sorting.

Now, however, we're seeing the same issue in a project context. The subquery doesn't help at all there (it would only return one row, after all). In an attempt to keep total code complexity under control, this commit removes the subquery optimisation and applies the CTE optimisation only for sorts that we know are safe.

This means that for more complicated sorts (like priority and popularity), the search will continue to be very slow. If this is a high-priority issue, we can consider introducing further optimisations, but this finder is already very complicated and additional complexity has a cost.

The group CTE optimisation is controlled by the same feature flag as before, `attempt_group_search_optimizations`, which is enabled by default. The new project CTE optimisation is controlled by a new feature flag, `attempt_project_search_optimizations`, which is disabled by default.
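
As a rough illustration of the rule described in this message (use the CTE only when there is a search term, the database is Postgres, the sort is absent or known to be safe, and the relevant flag is on), here is a framework-free sketch. It is not the finder's actual code: the `SIMPLE_SORTS` constant and the keyword arguments are hypothetical stand-ins for `klass.simple_sorts.keys`, `params`, and the feature-flag checks.

```ruby
# Hypothetical list of sorts assumed safe to apply on top of a CTE.
# It mirrors the idea of Issuable.simple_sorts (name sorts excluded).
SIMPLE_SORTS = %w[
  created_asc created_date created_desc
  id_asc id_desc
  updated_asc updated_desc
].freeze

# Returns true when wrapping the scoped query in a CTE is considered safe.
# All keyword arguments are illustrative; the real finder reads them from
# params, the database adapter, and feature flags.
def use_cte_for_search?(search:, sort:, postgresql:, flag_enabled:)
  return false if search.nil? || search.empty?
  return false unless postgresql
  # No sort at all is fine; a sort that is present must be a simple one.
  return false if sort && !sort.empty? && !SIMPLE_SORTS.include?(sort)

  flag_enabled
end
```

With these assumptions, a search sorted by `created_desc` would use the CTE, while the same search sorted by `priority` would fall back to the plain query.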
author: Sean McGivern <sean@gitlab.com> 2019-04-03 10:46:13 +0100
committer: Sean McGivern <sean@gitlab.com> 2019-04-04 12:36:22 +0100
commit: 10ceb33ba271f603fa09d4a4b5fdca03fd7ea333 (patch)
tree: f96b0e786a4e2ef96287ac2099cb7fd89d8e3ebb /app
parent: f87b7fe3b386962c45e83486634352da544857fb (diff)
download: gitlab-ce-10ceb33ba271f603fa09d4a4b5fdca03fd7ea333.tar.gz
4 files changed, 40 insertions, 39 deletions
diff --git a/app/controllers/concerns/issuable_collections.rb b/app/controllers/concerns/issuable_collections.rb
index c529aabf797..6d6e0cc6c7f 100644
--- a/app/controllers/concerns/issuable_collections.rb
+++ b/app/controllers/concerns/issuable_collections.rb
@@ -100,6 +100,7 @@ module IssuableCollections
 
     if @project
       options[:project_id] = @project.id
+      options[:attempt_project_search_optimizations] = true
     elsif @group
       options[:group_id] = @group.id
       options[:include_subgroups] = true
diff --git a/app/finders/issuable_finder.rb b/app/finders/issuable_finder.rb
index 6eab8c5ee51..fa9dda2ab31 100644
--- a/app/finders/issuable_finder.rb
+++ b/app/finders/issuable_finder.rb
@@ -84,7 +84,7 @@ class IssuableFinder
     # https://www.postgresql.org/docs/current/static/queries-with.html
     items = by_search(items)
 
-    items = sort(items) unless use_cte_for_count?
+    items = sort(items)
 
     items
   end
@@ -92,7 +92,6 @@ class IssuableFinder
   def filter_items(items)
     items = by_project(items)
     items = by_group(items)
-    items = by_subquery(items)
     items = by_scope(items)
     items = by_created_at(items)
     items = by_updated_at(items)
@@ -132,10 +131,12 @@ class IssuableFinder
     # This does not apply when we are using a CTE for the search, as the labels
     # GROUP BY is inside the subquery in that case, so we set labels_count to 1.
     #
-    # We always use CTE when searching in Groups if the feature flag is enabled,
-    # but never when searching in Projects.
+    # Groups and projects have separate feature flags to suggest the use
+    # of a CTE. The CTE will not be used if the sort doesn't support it,
+    # but will always be used for the counts here as we ignore sorting
+    # anyway.
     labels_count = label_names.any? ? label_names.count : 1
-    labels_count = 1 if use_cte_for_count?
+    labels_count = 1 if use_cte_for_search?
 
     finder.execute.reorder(nil).group(:state).count.each do |key, value|
       counts[count_key(key)] += value / labels_count
@@ -309,15 +310,14 @@ class IssuableFinder
   end
   # rubocop: enable CodeReuse/ActiveRecord
 
-  def use_subquery_for_search?
-    strong_memoize(:use_subquery_for_search) do
-      !force_cte? && attempt_group_search_optimizations?
-    end
-  end
+  def use_cte_for_search?
+    strong_memoize(:use_cte_for_search) do
+      next false unless search
+      next false unless Gitlab::Database.postgresql?
+      # Only simple unsorted & simple sorts can use CTE
+      next false if params[:sort].present? && !params[:sort].in?(klass.simple_sorts.keys)
 
-  def use_cte_for_count?
-    strong_memoize(:use_cte_for_count) do
-      force_cte? && attempt_group_search_optimizations?
+      attempt_group_search_optimizations? || attempt_project_search_optimizations?
     end
   end
 
@@ -332,12 +332,15 @@ class IssuableFinder
   end
 
   def attempt_group_search_optimizations?
-    search &&
-      Gitlab::Database.postgresql? &&
-      params[:attempt_group_search_optimizations] &&
+    params[:attempt_group_search_optimizations] &&
       Feature.enabled?(:attempt_group_search_optimizations, default_enabled: true)
   end
 
+  def attempt_project_search_optimizations?
+    params[:attempt_project_search_optimizations] &&
+      Feature.enabled?(:attempt_project_search_optimizations)
+  end
+
   def count_key(value)
     Array(value).last.to_sym
   end
@@ -408,20 +411,11 @@ class IssuableFinder
   end
   # rubocop: enable CodeReuse/ActiveRecord
 
-  # Wrap projects and groups in a subquery if the conditions are met.
-  def by_subquery(items)
-    if use_subquery_for_search?
-      klass.where(id: items.select(:id)) # rubocop: disable CodeReuse/ActiveRecord
-    else
-      items
-    end
-  end
-
   # rubocop: disable CodeReuse/ActiveRecord
   def by_search(items)
     return items unless search
 
-    if use_cte_for_count?
+    if use_cte_for_search?
       cte = Gitlab::SQL::RecursiveCTE.new(klass.table_name)
       cte << items
 
diff --git a/app/models/concerns/issuable.rb b/app/models/concerns/issuable.rb
index 51a8395c013..17f94b4bd9b 100644
--- a/app/models/concerns/issuable.rb
+++ b/app/models/concerns/issuable.rb
@@ -172,6 +172,10 @@ module Issuable
       fuzzy_search(query, matched_columns)
     end
 
+    def simple_sorts
+      super.except('name_asc', 'name_desc')
+    end
+
     def sort_by_attribute(method, excluded_labels: [])
       sorted =
         case method.to_s
diff --git a/app/models/concerns/sortable.rb b/app/models/concerns/sortable.rb
index 29e48f0c5f7..df1a9e3fe6e 100644
--- a/app/models/concerns/sortable.rb
+++ b/app/models/concerns/sortable.rb
@@ -21,19 +21,21 @@ module Sortable
 
   class_methods do
     def order_by(method)
-      case method.to_s
-      when 'created_asc'  then order_created_asc
-      when 'created_date' then order_created_desc
-      when 'created_desc' then order_created_desc
-      when 'id_asc'       then order_id_asc
-      when 'id_desc'      then order_id_desc
-      when 'name_asc'     then order_name_asc
-      when 'name_desc'    then order_name_desc
-      when 'updated_asc'  then order_updated_asc
-      when 'updated_desc' then order_updated_desc
-      else
-        all
-      end
+      simple_sorts.fetch(method.to_s, -> { all }).call
+    end
+
+    def simple_sorts
+      {
+        'created_asc' => -> { order_created_asc },
+        'created_date' => -> { order_created_desc },
+        'created_desc' => -> { order_created_desc },
+        'id_asc' => -> { order_id_asc },
+        'id_desc' => -> { order_id_desc },
+        'name_asc' => -> { order_name_asc },
+        'name_desc' => -> { order_name_desc },
+        'updated_asc' => -> { order_updated_asc },
+        'updated_desc' => -> { order_updated_desc }
+      }
     end
 
     private
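
The `order_by` change above replaces a `case` statement with a hash of lambdas, so the set of supported sort keys can be inspected via `simple_sorts.keys`. A minimal standalone sketch of that dispatch pattern follows. `FakeRecords` is a hypothetical stand-in that sorts plain arrays of hashes rather than ActiveRecord relations, and only two sort keys are included for brevity.

```ruby
# Hypothetical stand-in for a model that includes Sortable.
class FakeRecords
  def initialize(rows)
    @rows = rows
  end

  # Plays the role of the unsorted relation.
  def all
    @rows
  end

  # Maps each sort key to a lambda that produces the sorted result.
  def simple_sorts
    {
      'id_asc'  => -> { @rows.sort_by { |row| row[:id] } },
      'id_desc' => -> { @rows.sort_by { |row| -row[:id] } }
    }
  end

  # Unknown keys fall back to the default lambda, which returns `all`.
  def order_by(method)
    simple_sorts.fetch(method.to_s, -> { all }).call
  end
end

records = FakeRecords.new([{ id: 2 }, { id: 1 }])
records.order_by(:id_asc)    # sorted by id, ascending
records.order_by('priority') # unknown key, rows returned unchanged
```

Because the keys live in a hash rather than in `case` branches, a caller can check whether a requested sort is one of the simple ones before deciding how to build the query.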