From ba9b4c4de86aa816e5ddc7a9cde9193c43835223 Mon Sep 17 00:00:00 2001
From: Sean McGivern <sean@gitlab.com>
Date: Fri, 27 Oct 2017 14:20:55 +0100
Subject: Avoid hitting statement timeout finding MR pipelines

For MRs with many thousands of commits, `SELECT DISTINCT(sha)` will be very
slow.

What we can't do to fix this:

1. Add an index. Postgres won't use it for DISTINCT without a lot of ceremony.
2. Do the `uniq` in Ruby. On its own, that can still be very slow with hundreds
   of thousands of commits.
3. Use a subquery. We haven't removed the `st_commits` column yet, but we will
   soon.

Until 3 is available to us, we can just do 2, but also add a limit clause to cap
the number of rows loaded. There is no ordering, so which rows come back may
vary between calls, but our goal with these MRs is just to get them to load, so
it's not a huge deal.
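
As a rough illustration of the "limit, then `uniq` in Ruby" idea, here is a
minimal sketch that uses plain arrays in place of the database rows. The helper
name `capped_unique_shas` and its parameters are hypothetical and are not part
of this change; `first(limit)` only stands in for the SQL limit clause.

```ruby
# Illustrative sketch only: plain arrays stand in for the database rows.
# `capped_unique_shas` and its parameters are hypothetical names.
def capped_unique_shas(column_shas, serialised_shas, limit: 10_000)
  # Cap the column SHAs first (the role played by the SQL LIMIT), then
  # de-duplicate in Ruby, so `uniq` sees at most `limit` column SHAs.
  (column_shas.first(limit) + serialised_shas).uniq
end

column_shas     = %w[aaa bbb aaa ccc]
serialised_shas = %w[ccc ddd]

capped_unique_shas(column_shas, serialised_shas, limit: 3)
# => ["aaa", "bbb", "ccc", "ddd"]
```

Because the cap is applied before de-duplication, the result can hold fewer
distinct SHAs than the limit when the capped rows contain duplicates.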
---
 app/models/merge_request.rb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/app/models/merge_request.rb b/app/models/merge_request.rb
index c3fae16d109..b0a0c753c09 100644
--- a/app/models/merge_request.rb
+++ b/app/models/merge_request.rb
@@ -874,7 +874,7 @@ class MergeRequest < ActiveRecord::Base
   #
   def all_commit_shas
     if persisted?
-      column_shas = MergeRequestDiffCommit.where(merge_request_diff: merge_request_diffs).pluck('DISTINCT(sha)')
+      column_shas = MergeRequestDiffCommit.where(merge_request_diff: merge_request_diffs).limit(10_000).pluck('sha')
       serialised_shas = merge_request_diffs.where.not(st_commits: nil).flat_map(&:commit_shas)
 
       (column_shas + serialised_shas).uniq
-- 
cgit v1.2.1