From ede2e0d2549c476195766158aaf91c3ab2fcd31c Mon Sep 17 00:00:00 2001 From: "igor@rurik.mysql.com" <> Date: Sun, 11 Dec 2005 22:21:19 -0800 Subject: item_sum.h, item.cc: Post review corrections in comments. --- sql/item_sum.h | 45 +++++++++++++++++++++++---------------------- 1 file changed, 23 insertions(+), 22 deletions(-) (limited to 'sql/item_sum.h') diff --git a/sql/item_sum.h b/sql/item_sum.h index eddc3ed8446..a8242d76287 100644 --- a/sql/item_sum.h +++ b/sql/item_sum.h @@ -30,7 +30,7 @@ GENERAL NOTES - A set function can be used not in any position where an expression is + A set function cannot be used in certain positions where expressions are accepted. There are some quite explicable restrictions for the usage of set functions. @@ -60,36 +60,37 @@ for each group defined in the main query, not for groups of the subquery. The problem of finding the query where to aggregate a particular - set function is not so simple is it seems to be. + set function is not so simple as it seems to be. In the query: SELECT t1.a FROM t1 GROUP BY t1.a HAVING t1.a > ALL(SELECT t2.c FROM t2 GROUP BY t2.c HAVING SUM(t1.a) < t2.c) the set function can be evaluated for both outer and inner selects. - If we evaluate SUM(t.a) for the outer query then we get the value of t.a + If we evaluate SUM(t1.a) for the outer query then we get the value of t1.a multiplied by the cardinality of a group in table t1. In this case in each correlated subquery SUM(t1.a) is used as a constant. But we also - can evaluate SUM(t.a) for the inner query. In this case t.a will be a + can evaluate SUM(t1.a) for the inner query. In this case t1.a will be a constant for each correlated subquery and summation is performed for each group of table t2. (Here it makes sense to remind that the query SELECT c FROM t GROUP BY a HAVING SUM(1) < a is quite legal in our SQL). - So depending on to what query we assign the set function we + So depending on what query we assign the set function to we can get different result sets. The general rule to detect the query where a set function is to be evaluated can be formulated as follows. Consider a set function S(E) where E is an expression with occurrences - of column references C1, ..., CN. Resolve these references against - subqueries whose subexpression the set function S(E) is. Let Q be the - most inner subquery of those subqueries. (It should be noted here that S(E) - in no way can be evaluated in the subquery embedding the subquery Q.) - If S(E) is used in a construct of Q where set function are allowed then + of column references C1, ..., CN. Resolve these column references against + subqueries that contain the set function S(E). Let Q be the innermost + subquery of those subqueries. (It should be noted here that S(E) + in no way can be evaluated in the subquery embedding the subquery Q, + otherwise S(E) would refer to at least one unbound column reference) + If S(E) is used in a construct of Q where set functions are allowed then we evaluate S(E) in Q. - Otherwise we look for a most inner subquery containing S(E) of those where + Otherwise we look for a innermost subquery containing S(E) of those where usage of S(E) is allowed. Let's demonstrate how this rule is applied to the following queries. @@ -99,7 +100,7 @@ HAVING t2.b > ALL(SELECT t3.c FROM t3 GROUP BY t3.c HAVING SUM(t1.a+t2.b) < t3.c)) For this query the set function SUM(t1.a+t2.b) depends on t1.a and t2.b - with t1.a defined in the most outer query, and t2.b defined for its + with t1.a defined in the outermost query, and t2.b defined for its subquery. The set function is in the HAVING clause of the subquery and can be evaluated in this subquery. @@ -110,8 +111,8 @@ Here the set function SUM(t1.a+t2.b)is in the WHERE clause of the second subquery - the most upper subquery where t1.a and t2.b are defined. If we evaluate the function in this subquery we violate the context rules. - So we evaluate the function in the third subquery where it is used under - the HAVING clause. + So we evaluate the function in the third subquery (over table t3) where it + is used under the HAVING clause. 3. SELECT t1.a FROM t1 GROUP BY t1.a HAVING t1.a > ALL(SELECT t2.b FROM t2 @@ -123,7 +124,7 @@ Mostly set functions cannot be nested. In the query SELECT t1.a from t1 GROUP BY t1.a HAVING AVG(SUM(t1.b)) > 20 the expression SUM(b) is not acceptable, though it is under a HAVING clause. - It is acceptable in the query: + Yet it is acceptable in the query: SELECT t.1 FROM t1 GROUP BY t1.a HAVING SUM(t1.b) > 20. An argument of a set function does not have to be a reference to a table @@ -138,7 +139,7 @@ SELECT t1.a FROM t1 GROUP BY t1.a HAVING t1.a IN (SELECT t2.c FROM t2 GROUP BY t2.c HAVING AVG(t2.c+SUM(t1.b)) > 20) - is still clear too. For a group of the rows with the same t1.a values we + is still clear. For a group of the rows with the same t1.a values we calculate the value of SUM(t1.b). This value 's' is substituted in the the subquery: SELECT t2.c FROM t2 GROUP BY t2.c HAVING AVG(t2.c+s) @@ -153,11 +154,11 @@ IMPLEMENTATION NOTES Three methods were added to the class to check the constraints specified - in the previous section. This methods utilize several new members. + in the previous section. These methods utilize several new members. The field 'nest_level' contains the number of the level for the subquery containing the set function. The main SELECT is of level 0, its subqueries - are of levels 1, the subqueries of the latters are of level 2 and so on. + are of levels 1, the subqueries of the latter are of level 2 and so on. The field 'aggr_level' is to contain the nest level of the subquery where the set function is aggregated. @@ -174,7 +175,7 @@ HAVING t2.b > ALL(SELECT t3.c FROM t3 GROUP BY t3.c HAVING SUM(t1.a+t2.b) < t3.c)) the value of max_arg_level is equal to 1 since t1.a is bound in the main - query, and t2.b is bound by the first subquery whose nest level 1. + query, and t2.b is bound by the first subquery whose nest level is 1. Obviously a set function cannot be aggregated in the subquery whose nest level is less than max_arg_level. (Yet it can be aggregated in the subqueries whose nest level is greater than max_arg_level.) @@ -192,11 +193,11 @@ for set function s0 if s1 is not calculated in any subquery within s0. - A set function that as a subexpression in an argument of another set - function refers to the latter via the field 'in_sum_func'. + A set function that is used as a subexpression in an argument of another + set function refers to the latter via the field 'in_sum_func'. The condition imposed on the usage of set functions are checked when - we traverse query subexpressions with the help of recursive method + we traverse query subexpressions with the help of the recursive method fix_fields. When we apply this method to an object of the class Item_sum, first, on the descent, we call the method init_sum_func_check that initialize members used at checking. Then, on the ascent, we -- cgit v1.2.1