author Justin Seyster <justin.seyster@mongodb.com> 2019-03-27 19:01:18 -0400
committer Justin Seyster <justin.seyster@mongodb.com> 2019-03-28 17:17:27 -0400
commit e73da48e26048cb5ca2120acadac2d9c2c8ee403
tree 1a562334c2cf2bbfc7629f9723439d564c20e847 /src/mongo/db/query/get_executor.h
parent b885fa6feb7da00dc367e917c53ba16a41b75af4
SERVER-40089 $group optimized with DISTINCT_SCAN cannot use $$ROOT
The getExecutorDistinct() function is responsible for both creating an executor for the distinct command and creating an executor for a $group that has been optimized with a DISTINCT_SCAN (see commit da63195). These two scenarios have different requirements for their projection, and getExecutorDistinct() distinguished the two by assuming any caller with an empty ({}) projection wanted the distinct command projection.

However, a $first accumulator with $$ROOT requires the entire document, so the logic that builds an optimized $group executor generates an empty projection for this case as well. When that happens, getExecutorDistinct() mistakenly chooses the projection that the distinct command wants, and when the pipeline evaluates $$ROOT, it only gets to see a small subset of fields in the document.

This patch modifies getExecutorDistinct() so that the caller must explicitly state what projection it wants. That means that the distinct command no longer passes an empty projection to indicate that it wants to project on just the distinct field. Instead, the distinct command computes the projection for the distinct field on its own and includes that projection in the ParsedDistinct object that it passes to getExecutorDistinct().
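
The ambiguity described above can be illustrated with a small standalone sketch. It is not the server code: Document, Projection, keepFields(), projectOld() and projectNew() are hypothetical names invented for this illustration, and a document is modeled as a flat map of field names to integers. It only shows why treating an empty projection as "project on the distinct key" breaks a caller that needs the whole document.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical stand-ins for illustration only; these are not MongoDB's real types.
using Document = std::map<std::string, int>;
// In this sketch an empty Projection means "no projection": keep the whole document.
using Projection = std::set<std::string>;

// Keep only the fields of 'doc' that are named in 'fields'.
Document keepFields(const Document& doc, const Projection& fields) {
    Document out;
    for (const auto& kv : doc) {
        if (fields.count(kv.first) > 0) {
            out.insert(kv);
        }
    }
    return out;
}

// Simplified model of the old behavior: an empty projection is assumed to come
// from the distinct command, so it is replaced by a projection on the distinct key.
Document projectOld(const Document& doc,
                    const Projection& requested,
                    const std::string& distinctKey) {
    if (requested.empty()) {
        return keepFields(doc, Projection{distinctKey});
    }
    return keepFields(doc, requested);
}

// Simplified model of the new behavior: the projection is applied exactly as
// the caller supplied it, and an empty projection returns the whole document.
Document projectNew(const Document& doc, const Projection& requested) {
    if (requested.empty()) {
        return doc;
    }
    return keepFields(doc, requested);
}
```

In this model, a $group that needs $$ROOT passes an empty projection: projectOld() hands back only the distinct key, while projectNew() hands back the full document. The distinct command gets its narrow result from projectNew() by passing the distinct-key projection explicitly.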
Diffstat (limited to 'src/mongo/db/query/get_executor.h')
-rw-r--r-- src/mongo/db/query/get_executor.h | 12
1 files changed, 8 insertions, 4 deletions
diff --git a/src/mongo/db/query/get_executor.h b/src/mongo/db/query/get_executor.h
index 4eecc309d99..0e97efa0fe0 100644
--- a/src/mongo/db/query/get_executor.h
+++ b/src/mongo/db/query/get_executor.h
@@ -150,12 +150,12 @@ bool turnIxscanIntoDistinctIxscan(QuerySolution* soln,
* or an aggregation pipeline that uses a $group stage with distinct-like semantics.
*
* Distinct is unique in that it doesn't care about getting all the results; it just wants all
- * possible values of a certain field. As such, we can skip lots of data in certain cases (see
- * body of method for detail).
+ * possible values of a certain field. As such, we can skip lots of data in certain cases (see body
+ * of method for detail).
*
* A $group stage on a single field behaves similarly to a distinct command. If it has no
- * accumulators or only $first accumulators, the $group command only needs to visit one document
- * for each distinct value of the grouped-by (_id) field to compute its result. When there is a sort
+ * accumulators or only $first accumulators, the $group command only needs to visit one document for
+ * each distinct value of the grouped-by (_id) field to compute its result. When there is a sort
* order specified in parsedDistinct->getQuery()->getQueryRequest.getSort(), the DISTINCT_SCAN will
* follow that sort order, ensuring that it chooses the correct document from each group to compute
* any $first accumulators.
@@ -166,6 +166,10 @@ bool turnIxscanIntoDistinctIxscan(QuerySolution* soln,
* DISTINCT_SCAN to filter some but not all duplicates (so that de-duplication is still necessary
* after query execution), or it may fall back to a regular IXSCAN.
*
+ * Note that this function uses the projection in 'parsedDistinct' to produce a covered query when
+ * possible, but when a covered query is not possible, the resulting plan may elide the projection
+ * stage (instead returning entire fetched documents).
+ *
* For example, a distinct query on field 'b' could use a DISTINCT_SCAN over index {a: 1, b: 1}.
* This plan will reduce the output set by filtering out documents that are equal on both the 'a'
* and 'b' fields, but it could still output documents with equal 'b' values if their 'a' fields are