author: Tom Lane <tgl@sss.pgh.pa.us> 2022-10-27 14:42:18 -0400
committer: Tom Lane <tgl@sss.pgh.pa.us> 2022-10-27 14:42:18 -0400
commit: a5fc46414deb7cbcd4cec1275efac69b9ac10500
tree: d17ba92470e28a3646c7316f178f7dd54188891b /src/backend/optimizer/path
parent: 4ab8c81bd90ae442dbd092df04a12dbb7e68f562
Avoid making commutatively-duplicate clauses in EquivalenceClasses.
When we decide we need to make a derived clause equating a.x and b.y, we already will re-use a previously-made clause "a.x = b.y". But we might instead have "b.y = a.x", which is perfectly usable because equivclass.c has never promised anything about the operand order in clauses it builds. Saving construction of a new RestrictInfo doesn't matter all that much in itself --- but because we cache selectivity estimates and so on per-RestrictInfo, there's a possibility of saving a fair amount of duplicative effort downstream. Hence, check for commutative matches as well as direct ones when seeing if we have a pre-existing clause. This changes the visible clause order in several regression test cases, but they're all clearly-insignificant changes.

Checking for the reverse operand order is simple enough, but if we wanted to check for operator OID match we'd need to call get_commutator here, which is not so cheap. I concluded that we don't really need the operator check anyway, so I just removed it. It's unlikely that an opfamily contains more than one applicable operator for a given pair of operand datatypes; and if it does they had better give the same answers, so there seems little need to insist that we use exactly the one select_equality_operator chose.

Using the current core regression suite as a test case, I see this change reducing the number of new join clauses built by create_join_clause from 9673 to 5142 (out of 26652 calls). So not quite 50% savings, but pretty close to it.

Discussion: https://postgr.es/m/78062.1666735746@sss.pgh.pa.us
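The lookup described in the commit message can be illustrated in isolation. The following is only a minimal sketch, not the planner code: ToyMember, ToyClause, and toy_find_clause are hypothetical stand-ins invented for illustration, and the real EquivalenceMember and RestrictInfo structs carry far more state. It shows the idea of accepting an existing clause whose operands match in either order, without comparing operator OIDs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an EquivalenceMember; identity is by pointer. */
typedef struct ToyMember
{
	const char *name;
} ToyMember;

/* Hypothetical stand-in for a RestrictInfo built from two EC members. */
typedef struct ToyClause
{
	const ToyMember *left_em;
	const ToyMember *right_em;
	const void *parent_ec;
} ToyClause;

/*
 * Return an existing clause for the given pair of members, accepting either
 * operand order, or NULL if no such clause has been built yet.  No operator
 * comparison is made, mirroring the reasoning in the commit message.
 */
static const ToyClause *
toy_find_clause(const ToyClause *clauses, size_t nclauses,
				const ToyMember *leftem, const ToyMember *rightem,
				const void *parent_ec)
{
	for (size_t i = 0; i < nclauses; i++)
	{
		const ToyClause *c = &clauses[i];

		if (c->parent_ec != parent_ec)
			continue;
		/* direct match: "a.x = b.y" requested, "a.x = b.y" present */
		if (c->left_em == leftem && c->right_em == rightem)
			return c;
		/* commutative match: "a.x = b.y" requested, "b.y = a.x" present */
		if (c->left_em == rightem && c->right_em == leftem)
			return c;
	}
	return NULL;
}
```

In this sketch a caller would build a new clause only when the function returns NULL, so a clause stored as "b.y = a.x" is handed back for a request to equate a.x and b.y, along with whatever was cached on it.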
Diffstat (limited to 'src/backend/optimizer/path')
-rw-r--r-- src/backend/optimizer/path/equivclass.c | 28
1 files changed, 20 insertions, 8 deletions
diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index f962ff82ad..e65b967b1f 100644
--- a/src/backend/optimizer/path/equivclass.c
+++ b/src/backend/optimizer/path/equivclass.c
@@ -1382,7 +1382,9 @@ generate_base_implied_equalities_broken(PlannerInfo *root,
* whenever we select a particular pair of EquivalenceMembers to join,
* we check to see if the pair matches any original clause (in ec_sources)
* or previously-built clause (in ec_derives). This saves memory and allows
- * re-use of information cached in RestrictInfos.
+ * re-use of information cached in RestrictInfos. We also avoid generating
+ * commutative duplicates, i.e. if the algorithm selects "a.x = b.y" but
+ * we already have "b.y = a.x", we return the existing clause.
*
* join_relids should always equal bms_union(outer_relids, inner_rel->relids).
* We could simplify this function's API by computing it internally, but in
@@ -1790,7 +1792,8 @@ select_equality_operator(EquivalenceClass *ec, Oid lefttype, Oid righttype)
/*
* create_join_clause
* Find or make a RestrictInfo comparing the two given EC members
- * with the given operator.
+ * with the given operator (or, possibly, its commutator, because
+ * the ordering of the operands in the result is not guaranteed).
*
* parent_ec is either equal to ec (if the clause is a potentially-redundant
* join clause) or NULL (if not). We have to treat this as part of the
@@ -1811,16 +1814,22 @@ create_join_clause(PlannerInfo *root,
/*
* Search to see if we already built a RestrictInfo for this pair of
* EquivalenceMembers. We can use either original source clauses or
- * previously-derived clauses. The check on opno is probably redundant,
- * but be safe ...
+ * previously-derived clauses, and a commutator clause is acceptable.
+ *
+ * We used to verify that opno matches, but that seems redundant: even if
+ * it's not identical, it'd better have the same effects, or the operator
+ * families we're using are broken.
*/
foreach(lc, ec->ec_sources)
{
rinfo = (RestrictInfo *) lfirst(lc);
if (rinfo->left_em == leftem &&
rinfo->right_em == rightem &&
- rinfo->parent_ec == parent_ec &&
- opno == ((OpExpr *) rinfo->clause)->opno)
+ rinfo->parent_ec == parent_ec)
+ return rinfo;
+ if (rinfo->left_em == rightem &&
+ rinfo->right_em == leftem &&
+ rinfo->parent_ec == parent_ec)
return rinfo;
}
@@ -1829,8 +1838,11 @@ create_join_clause(PlannerInfo *root,
rinfo = (RestrictInfo *) lfirst(lc);
if (rinfo->left_em == leftem &&
rinfo->right_em == rightem &&
- rinfo->parent_ec == parent_ec &&
- opno == ((OpExpr *) rinfo->clause)->opno)
+ rinfo->parent_ec == parent_ec)
+ return rinfo;
+ if (rinfo->left_em == rightem &&
+ rinfo->right_em == leftem &&
+ rinfo->parent_ec == parent_ec)
return rinfo;
}