path: root/src/backend/nodes
Commit message  [Author, Date; Files changed, Lines -/+]
...
* Support Parallel Append plan nodes.  [Robert Haas, 2017-12-05; 4 files, -0/+41]

  When we create an Append node, we can spread out the workers over the subplans instead of piling on to each subplan one at a time, which should typically be a bit more efficient, both because the startup cost of any plan executed entirely by one worker is paid only once and also because of reduced contention. We can also construct Append plans using a mix of partial and non-partial subplans, which may allow for parallelism in places that otherwise couldn't support it.

  Unfortunately, this patch doesn't handle the important case of parallelizing UNION ALL by running each branch in a separate worker; the executor infrastructure is added here, but more planner work is needed.

  Amit Khandekar, Robert Haas, Amul Sul, reviewed and tested by Ashutosh Bapat, Amit Langote, Rafia Sabih, Amit Kapila, and Rajkumar Raghuwanshi.

  Discussion: http://postgr.es/m/CAJ3gD9dy0K_E8r727heqXoBmWZ83HwLFwdcaSSmBQ1+S+vRuUQ@mail.gmail.com
* Remove extra word from comment.  [Robert Haas, 2017-11-30; 1 file, -2/+2]

  David Rowley, who also was the primary author of the patch that added this function; the attribution in my previous commit, 84940644de931f331433b35e3a391822671f8c9c, was incorrect due to sloppiness on my part.

  Discussion: http://postgr.es/m/CAKJS1f_0iSiLQsf_c06AzOWAc3eS6ePjjVQFpcFv3W-O5aktnQ@mail.gmail.com
* SQL procedures  [Peter Eisentraut, 2017-11-30; 2 files, -0/+28]

  This adds a new object type "procedure" that is similar to a function but does not have a return type and is invoked by the new CALL statement instead of SELECT or similar. This implementation is aligned with the SQL standard and compatible with or similar to other SQL implementations.

  This commit adds new commands CALL, CREATE/ALTER/DROP PROCEDURE, as well as ALTER/DROP ROUTINE that can refer to either a function or a procedure (or an aggregate function, as an extension to SQL). There is also support for procedures in various utility commands such as COMMENT and GRANT, as well as support in pg_dump and psql. Support for defining procedures is available in all the languages supplied by the core distribution.

  While this commit is mainly syntax sugar around existing functionality, future features will rely on having procedures as a separate object type.

  Reviewed-by: Andrew Dunstan <andrew.dunstan@2ndquadrant.com>
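  For illustration, a minimal sketch of the new syntax (the table and procedure names here are hypothetical, not taken from the commit):

      CREATE TABLE pairs (a int, b int);

      CREATE PROCEDURE insert_pair(x int, y int)
      LANGUAGE SQL
      AS $$
      INSERT INTO pairs VALUES (x, y);
      $$;

      CALL insert_pair(1, 2);   -- invoked with CALL, not SELECT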
* New C function: bms_add_range  [Robert Haas, 2017-11-29; 1 file, -0/+72]

  This will be used by pending patches to improve partition pruning.

  Amit Langote and Kyotaro Horiguchi, per a suggestion from David Rowley. Review and testing of the larger patch set of which this is a part by Ashutosh Bapat, David Rowley, Dilip Kumar, Jesper Pedersen, Rajkumar Raghuwanshi, Beena Emerson, Amul Sul, and Kyotaro Horiguchi.

  Discussion: http://postgr.es/m/098b9c71-1915-1a2a-8d52-1a7a50ce79e8@lab.ntt.co.jp
* Pass InitPlan values to workers via Gather (Merge).  [Robert Haas, 2017-11-16; 3 files, -0/+7]

  If a PARAM_EXEC parameter is used below a Gather (Merge) but the InitPlan that computes it is attached to or above the Gather (Merge), force the value to be computed before starting parallelism and pass it down to all workers. This allows us to use parallelism in cases where it previously would have had to be rejected as unsafe. We do - in this case - lose the optimization that the value is only computed if it's actually used. An alternative strategy would be to have the first worker that needs the value compute it, but one downside of that approach is that we'd then need to select a parallel-safe path to compute the parameter value; it couldn't for example contain a Gather (Merge) node. At some point in the future, we might want to consider both approaches.

  Independent of that consideration, there is a great deal more work that could be done to make more kinds of PARAM_EXEC parameters parallel-safe. This infrastructure could be used to allow a Gather (Merge) on the inner side of a nested loop (although that's not a very appealing plan) and cases where the InitPlan is attached below the Gather (Merge) could be addressed as well using various techniques. But this is a good start.

  Amit Kapila, reviewed and revised by me. Reviewing and testing from Kuntal Ghosh, Haribabu Kommi, and Tushar Ahuja.

  Discussion: http://postgr.es/m/CAA4eK1LV0Y1AUV4cUCdC+sYOx0Z0-8NAJ2Pd9=UKsbQ5Sr7+JQ@mail.gmail.com
* Track in the plan the types associated with PARAM_EXEC parameters.  [Robert Haas, 2017-11-13; 3 files, -4/+4]

  Up until now, we only tracked the number of parameters, which was sufficient to allocate an array of Datums of the appropriate size, but not sufficient to, for example, know how to serialize a Datum stored in one of those slots. An upcoming patch wants to do that, so add this tracking to make it possible.

  Patch by me, reviewed by Tom Lane and Amit Kapila.

  Discussion: http://postgr.es/m/CA+TgmoYqpxDKn8koHdW8BEKk8FMUL0=e8m2Qe=M+r0UBjr3tuQ@mail.gmail.com
* Account for the effect of lossy pages when costing bitmap scans.  [Robert Haas, 2017-11-10; 1 file, -12/+25]

  Dilip Kumar, reviewed by Alexander Kumenkov, Amul Sul, and me. Some final adjustments by me.

  Discussion: http://postgr.es/m/CAFiTN-sYtqUOXQ4SpuhTv0Z9gD0si3YxZGv_PQAAMX8qbOotcg@mail.gmail.com
* Add hash partitioning.  [Robert Haas, 2017-11-09; 4 files, -0/+8]

  Hash partitioning is useful when you want to partition a growing data set evenly. This can be useful to keep table sizes reasonable, which makes maintenance operations such as VACUUM faster, or to enable partition-wise join.

  At present, we still depend on constraint exclusion for partition pruning, and the shape of the partition constraints for hash partitioning is such that that doesn't work. Work is underway to fix that, which should both improve performance and make partition pruning work with hash partitioning.

  Amul Sul, reviewed and tested by Dilip Kumar, Ashutosh Bapat, Yugo Nagata, Rajkumar Raghuwanshi, Jesper Pedersen, and by me. A few final tweaks also by me.

  Discussion: http://postgr.es/m/CAAJ_b96fhpJAP=ALbETmeLk1Uni_GFZD938zgenhF49qgDTjaQ@mail.gmail.com
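  As a sketch of what this enables (table and partition names hypothetical):

      CREATE TABLE orders (order_id int, payload text) PARTITION BY HASH (order_id);

      CREATE TABLE orders_p0 PARTITION OF orders FOR VALUES WITH (MODULUS 4, REMAINDER 0);
      CREATE TABLE orders_p1 PARTITION OF orders FOR VALUES WITH (MODULUS 4, REMAINDER 1);
      CREATE TABLE orders_p2 PARTITION OF orders FOR VALUES WITH (MODULUS 4, REMAINDER 2);
      CREATE TABLE orders_p3 PARTITION OF orders FOR VALUES WITH (MODULUS 4, REMAINDER 3);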
* Change TRUE/FALSE to true/false  [Peter Eisentraut, 2017-11-08; 3 files, -7/+7]

  The lower case spellings are C and C++ standard and are used in most parts of the PostgreSQL sources. The upper case spellings are only used in some files/modules. So standardize on the standard spellings.

  The APIs for ICU, Perl, and Windows define their own TRUE and FALSE, so those are left as is when using those APIs.

  In code comments, we use the lower-case spelling for the C concepts and keep the upper-case spelling for the SQL concepts.

  Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
* Support domains over composite types.  [Tom Lane, 2017-10-26; 1 file, -2/+4]

  This is the last major omission in our domains feature: you can now make a domain over anything that's not a pseudotype.

  The major complication from an implementation standpoint is that places that might be creating tuples of a domain type now need to be prepared to apply domain_check(). It seems better that unprepared code fail with an error like "<type> is not composite" than that it silently fail to apply domain constraints. Therefore, relevant infrastructure like get_func_result_type() and lookup_rowtype_tupdesc() has been adjusted to treat domain-over-composite as a distinct case that unprepared code won't recognize, rather than just transparently treating it the same as plain composite. This isn't a 100% solution to the possibility of overlooked domain checks, but it catches most places.

  In passing, improve typcache.c's support for domains (it can now cache the identity of a domain's base type), and rewrite the argument handling logic in jsonfuncs.c's populate_record[set]_worker to reduce duplicative per-call lookups.

  I believe this is code-complete so far as the core and contrib code go. The PLs need varying amounts of work, which will be tackled in followup patches.

  Discussion: https://postgr.es/m/4206.1499798337@sss.pgh.pa.us
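  A brief sketch of what is now possible (type and domain names hypothetical):

      CREATE TYPE complex AS (r float8, i float8);

      CREATE DOMAIN quadrant1 AS complex
          CHECK ((VALUE).r >= 0 AND (VALUE).i >= 0);

      SELECT ROW(1.5, 2.5)::quadrant1;   -- ok
      SELECT ROW(-1.0, 2.5)::quadrant1;  -- fails the domain CHECK constraint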
* pg_stat_statements: Widen query IDs from 32 bits to 64 bits.  [Robert Haas, 2017-10-11; 2 files, -3/+15]

  This takes advantage of the infrastructure introduced by commit 81c5e46c490e2426db243eada186995da5bb0ba7 to greatly reduce the likelihood that two different queries will end up with the same query ID. It's still possible, of course, but whereas before the chances of a collision reached 25% around 50,000 queries, it will now take more than 3 billion queries.

  Backward incompatibility: Because the type exposed at the SQL level is int8, users may now see negative query IDs in the pg_stat_statements view (and also, query IDs more than 4 billion, which was the old limit).

  Patch by me, reviewed by Michael Paquier and Peter Geoghegan.

  Discussion: http://postgr.es/m/CA+TgmobG_Kp4cBKFmsznUAaM1GWW6hhRNiZC0KjRMOOeYnz5Yw@mail.gmail.com
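  For example, querying the view (assuming the pg_stat_statements extension is installed) now returns int8 query IDs, which may be negative:

      SELECT queryid, calls, query
      FROM pg_stat_statements
      ORDER BY calls DESC
      LIMIT 5;
      -- queryid is int8 and can be negative or exceed 4 billion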
* Allow multiple tables to be specified in one VACUUM or ANALYZE command.  [Tom Lane, 2017-10-03; 3 files, -0/+41]

  Not much to say about this; does what it says on the tin. However, formerly, if there was a column list then the ANALYZE action was implied; now it must be specified, or you get an error. This is because it would otherwise be a bit unclear what the user meant if some tables have column lists and some don't.

  Nathan Bossart, reviewed by Michael Paquier and Masahiko Sawada, with some editorialization by me

  Discussion: https://postgr.es/m/E061A8E3-5E3D-494D-94F0-E8A9B312BBFC@amazon.com
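  A short sketch (table names hypothetical):

      VACUUM ANALYZE orders, customers (name, city);

      VACUUM orders, customers (name);  -- now an error: a column list requires ANALYZE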
* Support arrays over domains.  [Tom Lane, 2017-09-30; 5 files, -18/+14]

  Allowing arrays with a domain type as their element type was left un-done in the original domain patch, but not for any very good reason. This omission leads to such surprising results as array_agg() not working on a domain column, because the parser can't identify a suitable output type for the polymorphic aggregate.

  In order to fix this, first clean up the APIs of coerce_to_domain() and some internal functions in parse_coerce.c so that we consistently pass around a CoercionContext along with CoercionForm. Previously, we sometimes passed an "isExplicit" boolean flag instead, which is strictly less information; and coerce_to_domain() didn't even get that, but instead had to reverse-engineer isExplicit from CoercionForm. That's contrary to the documentation in primnodes.h that says that CoercionForm only affects display and not semantics. I don't think this change fixes any live bugs, but it makes things more consistent. The main reason for doing it though is that now build_coercion_expression() receives ccontext, which it needs in order to be able to recursively invoke coerce_to_target_type().

  Next, reimplement ArrayCoerceExpr so that the node does not directly know any details of what has to be done to the individual array elements while performing the array coercion. Instead, the per-element processing is represented by a sub-expression whose input is a source array element and whose output is a target array element. This simplifies life in parse_coerce.c, because it can build that sub-expression by a recursive invocation of coerce_to_target_type(). The executor now handles the per-element processing as a compiled expression instead of hard-wired code. The main advantage of this is that we can use a single ArrayCoerceExpr to handle as many as three successive steps per element: base type conversion, typmod coercion, and domain constraint checking. The old code used two stacked ArrayCoerceExprs to handle type + typmod coercion, which was pretty inefficient, and adding yet another array deconstruction to do domain constraint checking seemed very unappetizing.

  In the case where we just need a single, very simple coercion function, doing this straightforwardly leads to a noticeable increase in the per-array-element runtime cost. Hence, add an additional shortcut evalfunc in execExprInterp.c that skips unnecessary overhead for that specific form of expression. The runtime speed of simple cases is within 1% or so of where it was before, while cases that previously required two levels of array processing are significantly faster.

  Finally, create an implicit array type for every domain type, as we do for base types, enums, etc. Everything except the array-coercion case seems to just work without further effort.

  Tom Lane, reviewed by Andrew Dunstan

  Discussion: https://postgr.es/m/9852.1499791473@sss.pgh.pa.us
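  As a small sketch of the user-visible change (domain and table names hypothetical):

      CREATE DOMAIN posint AS int CHECK (VALUE > 0);
      CREATE TABLE readings (v posint);
      INSERT INTO readings VALUES (1), (2), (3);

      SELECT array_agg(v) FROM readings;  -- previously failed for a domain column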
* Add inline murmurhash32(uint32) function.  [Andres Freund, 2017-09-22; 1 file, -18/+2]

  The function already existed in tidbitmap.c but more users requiring fast hashing of 32bit ints are coming up.

  Author: Andres Freund

  Discussion: https://postgr.es/m/20170914061207.zxotvyopetm7lrrp@alap3.anarazel.de
* Allow a partitioned table to have a default partition.  [Robert Haas, 2017-09-08; 4 files, -0/+4]

  Any tuples that don't route to any other partition will route to the default partition.

  Jeevan Ladhe, Beena Emerson, Ashutosh Bapat, Rahila Syed, and Robert Haas, with review and testing at various stages by (at least) Rushabh Lathia, Keith Fiske, Amit Langote, Amul Sul, Rajkumar Raghuanshi, Sven Kunze, Kyotaro Horiguchi, Thom Brown, Rafia Sabih, and Dilip Kumar.

  Discussion: http://postgr.es/m/CAH2L28tbN4SYyhS7YV1YBWcitkqbhSWfQCy0G=apRcC_PEO-bg@mail.gmail.com
  Discussion: http://postgr.es/m/CAOG9ApEYj34fWMcvBMBQ-YtqR9fTdXhdN82QEKG0SVZ6zeL1xg@mail.gmail.com
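  A sketch of the syntax (names hypothetical):

      CREATE TABLE events (id int, region text) PARTITION BY LIST (region);

      CREATE TABLE events_eu    PARTITION OF events FOR VALUES IN ('eu');
      CREATE TABLE events_other PARTITION OF events DEFAULT;  -- catches everything else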
* Reduce excessive dereferencing of function pointers  [Peter Eisentraut, 2017-09-07; 1 file, -3/+3]

  It is equivalent in ANSI C to write (*funcptr) () and funcptr(). These two styles have been applied inconsistently. After discussion, we'll use the more verbose style for plain function pointer variables, to make it clear that it's a variable, and the shorter style when the function pointer is in a struct (s.func() or s->func()), because then it's clear that it's not a plain function name, and otherwise the excessive punctuation makes some of those invocations hard to read.

  Discussion: https://www.postgresql.org/message-id/f52c16db-14ed-757d-4b48-7ef360b1631d@2ndquadrant.com
* Allow SET STATISTICS on expression indexes  [Simon Riggs, 2017-09-06; 2 files, -0/+2]

  Index columns are referenced by ordinal number rather than name, e.g.

      CREATE INDEX coord_idx ON measured (x, y, (z + t));
      ALTER INDEX coord_idx ALTER COLUMN 3 SET STATISTICS 1000;

  Incompatibility note for release notes: \d+ for indexes now also displays Stats Target

  Authors: Alexander Korotkov, with contribution by Adrien NAYRAT
  Review: Adrien NAYRAT, Simon Riggs
  Wordsmith: Simon Riggs
* Force rescanning of parallel-aware scan nodes below a Gather[Merge].  [Tom Lane, 2017-08-30; 3 files, -0/+6]

  The ExecReScan machinery contains various optimizations for postponing or skipping rescans of plan subtrees; for example a HashAgg node may conclude that it can re-use the table it built before, instead of re-reading its input subtree. But that is wrong if the input contains a parallel-aware table scan node, since the portion of the table scanned by the leader process is likely to vary from one rescan to the next. This explains the timing-dependent buildfarm failures we saw after commit a2b70c89c.

  The established mechanism for showing that a plan node's output is potentially variable is to mark it as depending on some runtime Param. Hence, to fix this, invent a dummy Param (one that has a PARAM_EXEC parameter number, but carries no actual value) associated with each Gather or GatherMerge node, mark parallel-aware nodes below that node as dependent on that Param, and arrange for ExecReScanGather[Merge] to flag that Param as changed whenever the Gather[Merge] node is rescanned.

  This solution breaks an undocumented assumption made by the parallel executor logic, namely that all rescans of nodes below a Gather[Merge] will happen synchronously during the ReScan of the top node itself. But that's fundamentally contrary to the design of the ExecReScan code, and so was doomed to fail someday anyway (even if you want to argue that the bug being fixed here wasn't a failure of that assumption). A follow-on patch will address that issue. In the meantime, the worst that's expected to happen is that given very bad timing luck, the leader might have to do all the work during a rescan, because workers think they have nothing to do, if they are able to start up before the eventual ReScan of the leader's parallel-aware table scan node has reset the shared scan state.

  Although this problem exists in 9.6, there does not seem to be any way for it to manifest there. Without GatherMerge, it seems that a plan tree that has a rescan-short-circuiting node below Gather will always also have one above it that will short-circuit in the same cases, preventing the Gather from being rescanned. Hence we won't take the risk of back-patching this change into 9.6. But v10 needs it.

  Discussion: https://postgr.es/m/CAA4eK1JkByysFJNh9M349u_nNjqETuEnY_y1VUc_kJiU0bxtaQ@mail.gmail.com
* Fix up some misusage of appendStringInfo() and friends  [Peter Eisentraut, 2017-08-15; 1 file, -3/+3]

  Change to appendStringInfoChar() or appendStringInfoString() where those can be used.

  Author: David Rowley <david.rowley@2ndquadrant.com>
  Reviewed-by: Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>
* Avoid out-of-memory in a hash join with many duplicate inner keys.  [Tom Lane, 2017-08-15; 1 file, -0/+2]

  The executor is capable of splitting buckets during a hash join if too much memory is being used by a small number of buckets. However, this only helps if a bucket's population is actually divisible; if all the hash keys are alike, the tuples still end up in the same new bucket. This can result in an OOM failure if there are enough inner keys with identical hash values. The planner's cost estimates will bias it against choosing a hash join in such situations, but not by so much that it will never do so. To mitigate the OOM hazard, explicitly estimate the hash bucket space needed by just the inner side's most common value, and if that would exceed work_mem then add disable_cost to the hash cost estimate.

  This approach doesn't account for the possibility that two or more common values would share the same hash value. On the other hand, work_mem is normally a fairly conservative bound, so that eating two or more times that much space is probably not going to kill us.

  If we have no stats about the inner side, ignore this consideration. There was some discussion of making a conservative assumption, but that would effectively result in disabling hash join whenever we lack stats, which seems like an overreaction given how seldom the problem manifests in the field.

  Per a complaint from David Hinkle. Although this could be viewed as a bug fix, the lack of similar complaints weighs against back-patching; indeed we waited for v11 because it seemed already rather late in the v10 cycle to be making plan choice changes like this one.

  Discussion: https://postgr.es/m/32013.1487271761@sss.pgh.pa.us
* Use MINVALUE/MAXVALUE instead of UNBOUNDED for range partition bounds.  [Dean Rasheed, 2017-07-21; 4 files, -4/+4]

  Previously, UNBOUNDED meant no lower bound when used in the FROM list, and no upper bound when used in the TO list, which was OK for single-column range partitioning, but problematic with multiple columns. For example, an upper bound of (10.0, UNBOUNDED) would not be collocated with a lower bound of (10.0, UNBOUNDED), thus making it difficult or impossible to define contiguous multi-column range partitions in some cases.

  Fix this by using MINVALUE and MAXVALUE instead of UNBOUNDED to represent a partition column that is unbounded below or above respectively. This syntax removes any ambiguity, and ensures that if one partition's lower bound equals another partition's upper bound, then the partitions are contiguous.

  Also drop the constraint prohibiting finite values after an unbounded column, and just document the fact that any values after MINVALUE or MAXVALUE are ignored. Previously it was necessary to repeat UNBOUNDED multiple times, which was needlessly verbose.

  Note: Forces a post-PG 10 beta2 initdb.

  Report by Amul Sul, original patch by Amit Langote with some additional hacking by me.

  Discussion: https://postgr.es/m/CAAJ_b947mowpLdxL3jo3YLKngRjrq9+Ej4ymduQTfYR+8=YAYQ@mail.gmail.com
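  A sketch of the new bound syntax for a two-column range partition (names hypothetical):

      CREATE TABLE measurements (x numeric, y numeric) PARTITION BY RANGE (x, y);

      CREATE TABLE m_low  PARTITION OF measurements
          FOR VALUES FROM (MINVALUE, MINVALUE) TO (10.0, MAXVALUE);
      CREATE TABLE m_high PARTITION OF measurements
          FOR VALUES FROM (10.0, MAXVALUE) TO (MAXVALUE, MAXVALUE);
      -- equal lower/upper bounds make the two partitions contiguous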
* Code review for NextValueExpr expression node type.  [Tom Lane, 2017-07-14; 3 files, -8/+36]

  Add missing infrastructure for this node type, notably in ruleutils.c where its lack could demonstrably cause EXPLAIN to fail. Add outfuncs/readfuncs support. (outfuncs support is useful today for debugging purposes. The readfuncs support may never be needed, since at present it would only matter for parallel query and NextValueExpr should never appear in a parallelizable query; but it seems like a bad idea to have a primnode type that isn't fully supported here.)

  Teach planner infrastructure that NextValueExpr is a volatile, parallel-unsafe, non-leaky expression node with cost cpu_operator_cost. Given its limited scope of usage, there *might* be no live bug today from the lack of that knowledge, but it's certainly going to bite us on the rear someday.

  Teach pg_stat_statements about the new node type, too.

  While at it, also teach cost_qual_eval() that MinMaxExpr, SQLValueFunction, XmlExpr, and CoerceToDomain should be charged as cpu_operator_cost. Failing to do this for SQLValueFunction was an oversight in my commit 0bb51aa96. The others are longer-standing oversights, but no time like the present to fix them. (In principle, CoerceToDomain could have cost much higher than this, but it doesn't presently seem worth trying to examine the domain's constraints here.)

  Modify execExprInterp.c to execute NextValueExpr as an out-of-line function; it seems quite unlikely to me that it's worth insisting that it be inlined in all expression eval methods. Besides, providing the out-of-line function doesn't stop anyone from inlining if they want to.

  Adjust some places where NextValueExpr support had been inserted with the aid of a dartboard rather than keeping it in the same order as elsewhere.

  Discussion: https://postgr.es/m/23862.1499981661@sss.pgh.pa.us
* Fix output of char node fields  [Peter Eisentraut, 2017-07-05; 2 files, -2/+21]

  WRITE_CHAR_FIELD() didn't do any escaping, so that for example a zero byte would cause the whole output string to be truncated. To fix, pass the char through outToken(), so it is escaped like a string. Adjust the reading side to handle this.
* Phase 3 of pgindent updates.  [Tom Lane, 2017-06-21; 3 files, -11/+11]

  Don't move parenthesized lines to the left, even if that means they flow past the right margin.

  By default, BSD indent lines up statement continuation lines that are within parentheses so that they start just to the right of the preceding left parenthesis. However, traditionally, if that resulted in the continuation line extending to the right of the desired right margin, then indent would push it left just far enough to not overrun the margin, if it could do so without making the continuation line start to the left of the current statement indent. That makes for a weird mix of indentations unless one has been completely rigid about never violating the 80-column limit. This behavior has been pretty universally panned by Postgres developers. Hence, disable it with indent's new -lpl switch, so that parenthesized lines are always lined up with the preceding left paren.

  This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects.

  Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
  Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
* Phase 2 of pgindent updates.  [Tom Lane, 2017-06-21; 7 files, -24/+23]

  Change pg_bsd_indent to follow upstream rules for placement of comments to the right of code, and remove pgindent hack that caused comments following #endif to not obey the general rule.

  Commit e3860ffa4dd0dad0dd9eea4be9cc1412373a8c89 wasn't actually using the published version of pg_bsd_indent, but a hacked-up version that tried to minimize the amount of movement of comments to the right of code. The situation of interest is where such a comment has to be moved to the right of its default placement at column 33 because there's code there. BSD indent has always moved right in units of tab stops in such cases --- but in the previous incarnation, indent was working in 8-space tab stops, while now it knows we use 4-space tabs. So the net result is that in about half the cases, such comments are placed one tab stop left of before. This is better all around: it leaves more room on the line for comment text, and it means that in such cases the comment uniformly starts at the next 4-space tab stop after the code, rather than sometimes one and sometimes two tabs after.

  Also, ensure that comments following #endif are indented the same as comments following other preprocessor commands such as #else. That inconsistency turns out to have been self-inflicted damage from a poorly-thought-through post-indent "fixup" in pgindent.

  This patch is much less interesting than the first round of indent changes, but also bulkier, so I thought it best to separate the effects.

  Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
  Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
* Initial pgindent run with pg_bsd_indent version 2.0.  [Tom Lane, 2017-06-21; 1 file, -2/+2]

  The new indent version includes numerous fixes thanks to Piotr Stefaniak. The main changes visible in this commit are:

  * Nicer formatting of function-pointer declarations.
  * No longer unexpectedly removes spaces in expressions using casts, sizeof, or offsetof.
  * No longer wants to add a space in "struct structname *varname", as well as some similar cases for const- or volatile-qualified pointers.
  * Declarations using PG_USED_FOR_ASSERTS_ONLY are formatted more nicely.
  * Fixes bug where comments following declarations were sometimes placed with no space separating them from the code.
  * Fixes some odd decisions for comments following case labels.
  * Fixes some cases where comments following code were indented to less than the expected column 33.

  On the less good side, it now tends to put more whitespace around typedef names that are not listed in typedefs.list. This might encourage us to put more effort into typedef name collection; it's not really a bug in indent itself.

  There are more changes coming after this round, having to do with comment indentation and alignment of lines appearing within parentheses. I wanted to limit the size of the diffs to something that could be reviewed without one's eyes completely glazing over, so it seemed better to split up the changes as much as practical.

  Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
  Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
* Reconcile nodes/*funcs.c with PostgreSQL 10 work.  [Noah Misch, 2017-06-16; 4 files, -13/+11]

  The _equalTableFunc() omission of coltypmods has semantic significance, but I did not track down resulting user-visible bugs, if any. The other changes are cosmetic only, affecting order.

  catversion bump due to readfuncs.c field order change.
* Fix problems related to RangeTblEntry members enrname and enrtuples.  [Robert Haas, 2017-06-14; 4 files, -1/+6]

  Commit 18ce3a4ab22d2984f8540ab480979c851dae5338 failed to update the comments in parsenodes.h for the new members, and made only incomplete updates to src/backend/nodes.

  Thomas Munro, per a report from Noah Misch.

  Discussion: http://postgr.es/m/20170611062525.GA1628882@rfd.leadboat.com
* Fix no-longer-valid shortcuts in expression_returns_set().  [Tom Lane, 2017-06-14; 1 file, -29/+1]

  expression_returns_set() used to short-circuit its recursion upon seeing certain node types, such as DistinctExpr, that it knew the executor did not support set-valued arguments for. That was never inherent, though, just a reflection of laziness in execQual.c. With the new implementation of SRFs there is no reason to think that any scalar-valued expression node could not have a set-valued subexpression, except for AggRefs and WindowFuncs where we know there is a parser check rejecting it. And indeed, the shortcut causes unexpected failures for cases such as a SRF underneath DistinctExpr, because the planner stops looking for SRFs too soon.

  Discussion: https://postgr.es/m/5259.1497044025@sss.pgh.pa.us
* Fix omission of locations in outfuncs/readfuncs partitioning node support.  [Tom Lane, 2017-05-30; 2 files, -11/+9]

  We could have limped along without this for v10, which was my intention when I annotated the bug in commit 76a3df6e5. But consensus is that it's better to fix it now and take the cost of a post-beta1 initdb (which is needed because these node types are stored in pg_class.relpartbound).

  Since we're forcing initdb anyway, take the opportunity to make the node type identification strings match the node struct names, instead of being randomly different from them.

  Discussion: https://postgr.es/m/E1dFBEX-0004wt-8t@gemulon.postgresql.org
* Code review focused on new node types added by partitioning support.  [Tom Lane, 2017-05-28; 5 files, -39/+51]

  Fix failure to check that we got a plain Const from const-simplification of a coercion request. This is the cause of bug #14666 from Tian Bing: there is an int4 to money cast, but it's only stable not immutable (because of dependence on lc_monetary), resulting in a FuncExpr that the code was miserably unequipped to deal with, or indeed even to notice that it was failing to deal with. Add test cases around this coercion behavior.

  In view of the above, sprinkle the code liberally with castNode() macros, in hope of catching the next such bug a bit sooner. Also, change some functions that were randomly declared to take Node* to take more specific pointer types. And change some struct fields that were declared Node* but could be given more specific types, allowing removal of assorted explicit casts.

  Place PARTITION_MAX_KEYS check a bit closer to the code it's protecting. Likewise check only-one-key-for-list-partitioning restriction in a less random place.

  Avoid not-per-project-style usages like !strcmp(...).

  Fix assorted failures to avoid scribbling on the input of parse transformation. I'm not sure how necessary this is, but it's entirely silly for these functions to be expending cycles to avoid that and not getting it right.

  Add guards against partitioning on system columns.

  Put backend/nodes/ support code into an order that matches handling of these node types elsewhere.

  Annotate the fact that somebody added location fields to PartitionBoundSpec and PartitionRangeDatum but forgot to handle them in outfuncs.c/readfuncs.c. This is fairly harmless for production purposes (since readfuncs.c would just substitute -1 anyway) but it's still bogus. It's not worth forcing a post-beta1 initdb just to fix this, but if we have another reason to force initdb before 10.0, we should go back and clean this up.

  Contrariwise, somebody added location fields to PartitionElem and PartitionSpec but forgot to teach exprLocation() about them.

  Consolidate duplicative code in transformPartitionBound().

  Improve a couple of error messages.

  Improve assorted commentary.

  Re-pgindent the files touched by this patch; this affects a few comment blocks that must have been added quite recently.

  Report: https://postgr.es/m/20170524024550.29935.14396@wrigleys.postgresql.org
* Post-PG 10 beta1 pgindent run  [Bruce Momjian, 2017-05-17; 4 files, -8/+9]

  perltidy run not included.
* Remove no-longer-needed fields of Hash plan nodes.  [Tom Lane, 2017-05-14; 3 files, -6/+0]

  skewColType/skewColTypmod are no longer used in the wake of commit 9aab83fc5, and seem unlikely to be wanted in future, so let's drop 'em.

  Discussion: https://postgr.es/m/16364.1494520862@sss.pgh.pa.us
* Change CREATE STATISTICS syntax  [Alvaro Herrera, 2017-05-12; 3 files, -9/+9]

  Previously, we had the WITH clause in the middle of the command, where you'd specify both generic options as well as statistic types. Few people liked this, so this commit changes it to remove the WITH keyword from that clause and makes it accept statistic types only. (We currently don't have any generic options, but if we invent any in the future, we will gain a new WITH clause, probably at the end of the command). Also, the column list is now specified without parens, which makes the whole command look more similar to a SELECT command. This change will let us expand the command to support expressions (not just column names) as well as multiple tables and their join conditions.

  Tom added lots of code comments and fixed some parts of the CREATE STATISTICS reference page, too; more changes in this area are forthcoming. He also fixed a potential problem in the alter_generic regression test, reducing verbosity on a cascaded drop to avoid dependency on message ordering, as we do in other tests.

  Tom also closed a security bug: we documented that table ownership was required in order to create a statistics object on it, but didn't actually implement it.

  Implement tab-completion for statistics objects. This can stand some more improvement.

  Authors: Alvaro Herrera, with lots of cleanup by Tom Lane

  Discussion: https://postgr.es/m/20170420212426.ltvgyhnefvhixm6i@alvherre.pgsql
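  A sketch of the revised syntax (names hypothetical; the commented-out line shows roughly what the pre-change form with WITH and parenthesized columns looked like):

      CREATE TABLE zipcodes (city text, zip text);

      -- previously, roughly: CREATE STATISTICS s2 WITH (ndistinct) ON (city, zip) FROM zipcodes;
      CREATE STATISTICS s2 (ndistinct) ON city, zip FROM zipcodes;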
* Remove the NODROP SLOT option from DROP SUBSCRIPTION  [Peter Eisentraut, 2017-05-09; 2 files, -2/+2]

  It turned out this approach had problems, because a DROP command should not have any options other than CASCADE and RESTRICT. Instead, always attempt to drop the slot if there is one configured, but also add an ALTER SUBSCRIPTION action to set the slot to NONE.

  Author: Petr Jelinek <petr.jelinek@2ndquadrant.com>
  Reported-by: Tom Lane <tgl@sss.pgh.pa.us>

  Discussion: https://www.postgresql.org/message-id/29431.1493730652@sss.pgh.pa.us
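  For instance, to drop a subscription without touching its remote slot (subscription name hypothetical; in the released PostgreSQL 10 syntax the slot is detached via slot_name = NONE, which requires the subscription to be disabled first):

      ALTER SUBSCRIPTION mysub DISABLE;
      ALTER SUBSCRIPTION mysub SET (slot_name = NONE);
      DROP SUBSCRIPTION mysub;  -- no slot configured, so nothing is dropped remotely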
* Fire per-statement triggers on partitioned tables.  [Robert Haas, 2017-05-01; 3 files, -0/+7]

  Even though no actual tuples are ever inserted into a partitioned table (the actual tuples are in the partitions, not the partitioned table itself), we still need to have a ResultRelInfo for the partitioned table, or per-statement triggers won't get fired.

  Amit Langote, per a report from Rajkumar Raghuwanshi. Reviewed by me.

  Discussion: http://postgr.es/m/CAKcux6%3DwYospCRY2J4XEFuVy0L41S%3Dfic7rmkbsU-GXhhSbmBg%40mail.gmail.com
* Fix crash when partitioned column specified twice.  [Robert Haas, 2017-04-28; 4 files, -0/+4]

  Amit Langote, reviewed by Beena Emerson

  Discussion: http://postgr.es/m/6ed23d3d-c09d-4cbc-3628-0a8a32f750f4@lab.ntt.co.jp
* Mark finished Plan nodes with parallel_safe flags.  [Tom Lane, 2017-04-12; 3 files, -0/+3]

  We'd managed to avoid doing this so far, but it seems pretty obvious that it would be forced on us some day, and this is much the cleanest way of approaching the open problem that parallel-unsafe subplans are being transmitted to parallel workers. Anyway there's no space cost due to alignment considerations, and the time cost is pretty minimal since we're just copying the flag from the corresponding Path node. (At least in most cases ... some of the klugier spots in createplan.c have to work a bit harder.)

  In principle we could perhaps get rid of SubPlan.parallel_safe, but I thought it better to keep that in case there are reasons to consider a SubPlan unsafe even when its child plan is parallel-safe.

  This patch doesn't actually do anything with the new flags, but I thought I'd commit it separately anyway.

  Note: although this touches outfuncs/readfuncs, there's no need for a catversion bump because Plan trees aren't stored on disk.

  Discussion: https://postgr.es/m/87tw5x4vcu.fsf@credativ.de
* Fix failure when a shared tidbitmap has only one page.  [Robert Haas, 2017-04-11; 1 file, -2/+4]

  Commit 98e6e89040a0534ca26914c66cae9dd49ef62ad9 made inadequate provision for the case of a single-page shared tidbitmap. It allocated space for a shared PagetableEntry, but failed to initialize it.

  Report by Thomas Munro. Patch by Dilip Kumar, with some comment changes by me.

  Discussion: http://postgr.es/m/CAEepm=19Cmnfbi-j2Bw-a6yGPeHE1OVhKvvKz9bRBTJGKfGHMA@mail.gmail.com
* Improve castNode notation by introducing list-extraction-specific variants.  [Tom Lane, 2017-04-10; 1 file, -8/+8]

  This extends the castNode() notation introduced by commit 5bcab1114 to provide, in one step, extraction of a list cell's pointer and coercion to a concrete node type. For example, "lfirst_node(Foo, lc)" is the same as "castNode(Foo, lfirst(lc))". Almost half of the uses of castNode that have appeared so far include a list extraction call, so this is pretty widely useful, and it saves a few more keystrokes compared to the old way.

  As with the previous patch, back-patch the addition of these macros to pg_list.h, so that the notation will be available when back-patching.

  Patch by me, after an idea of Andrew Gierth's.

  Discussion: https://postgr.es/m/14197.1491841216@sss.pgh.pa.us
* Optimize joins when the inner relation can be proven unique.  [Tom Lane, 2017-04-07; 3 files, -0/+9]

  If there can certainly be no more than one matching inner row for a given outer row, then the executor can move on to the next outer row as soon as it's found one match; there's no need to continue scanning the inner relation for this outer row. This saves useless scanning in nestloop and hash joins. In merge joins, it offers the opportunity to skip mark/restore processing, because we know we have not advanced past the first possible match for the next outer row.

  Of course, the devil is in the details: the proof of uniqueness must depend only on joinquals (not otherquals), and if we want to skip mergejoin mark/restore then it must depend only on merge clauses. To avoid adding more planning overhead than absolutely necessary, the present patch errs in the conservative direction: there are cases where inner_unique or skip_mark_restore processing could be used, but it will not do so because it's not sure that the uniqueness proof depended only on "safe" clauses. This could be improved later.

  David Rowley, reviewed and rather heavily editorialized on by me

  Discussion: https://postgr.es/m/CAApHDvqF6Sw-TK98bW48TdtFJ+3a7D2mFyZ7++=D-RyPsL76gw@mail.gmail.com
* Identity columns  [Peter Eisentraut, 2017-04-06; 5 files, -0/+62]

  This is the SQL standard-conforming variant of PostgreSQL's serial columns. It fixes a few usability issues that serial columns have:

  - CREATE TABLE / LIKE copies default but refers to same sequence
  - cannot add/drop serialness with ALTER TABLE
  - dropping default does not drop sequence
  - need to grant separate privileges to sequence
  - other slight weirdnesses because serial is some kind of special macro

  Reviewed-by: Vitaly Burovoy <vitaly.burovoy@gmail.com>
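  A sketch of the new column syntax (table name hypothetical):

      CREATE TABLE items (
          id   int GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
          name text
      );

      INSERT INTO items (name) VALUES ('widget');  -- id is assigned automatically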
* Print new RelOptInfo field top_parent_relids in outfuncs.c  [Robert Haas, 2017-04-03; 1 file, -0/+1]

  I intended to include this adjustment in the previous commit (7a39b5e4d11229ece930a51fd7cb29e535db44940) but messed up.
* Add infrastructure to support EphemeralNamedRelation references.  [Kevin Grittner, 2017-03-31; 5 files, -0/+58]

  A QueryEnvironment concept is added, which allows new types of objects to be passed into queries from parsing on through execution. At this point, the only thing implemented is a collection of EphemeralNamedRelation objects -- relations which can be referenced by name in queries, but do not exist in the catalogs. The only type of ENR implemented is NamedTuplestore, but provision is made to add more types fairly easily.

  An ENR can carry its own TupleDesc or reference a relation in the catalogs by relid.

  Although these features can be used without SPI, convenience functions are added to SPI so that ENRs can easily be used by code run through SPI.

  The initial use of all this is going to be transition tables in AFTER triggers, but that will be added to each PL as a separate commit.

  An incidental effect of this patch is to produce a more informative error message if an attempt is made to modify the contents of a CTE from a referencing DML statement. No tests previously covered that possibility, so one is added.

  Kevin Grittner and Thomas Munro

  Reviewed by Heikki Linnakangas, David Fetter, and Thomas Munro with valuable comments and suggestions from many others
* Fix parallel query so it doesn't spoil row estimates above Gather.  [Robert Haas, 2017-03-31; 1 file, -0/+1]

  Commit 45be99f8cd5d606086e0a458c9c72910ba8a613d removed GatherPath's num_workers field, but this is entirely bogus. Normally, a path's parallel_workers flag is supposed to indicate the number of workers that it wants, and should be 0 for a non-partial path. In that commit, I mistakenly thought that GatherPath could also use that field to indicate the number of workers that it would try to start, but that's disastrous, because then it can propagate up to higher nodes in the plan tree, which will then get incorrect rowcounts because the parallel_workers flag is involved in computing those values. Repair by putting the separate field back.

  Report by Tomas Vondra. Patch by me, reviewed by Amit Kapila.

  Discussion: http://postgr.es/m/f91b4a44-f739-04bd-c4b6-f135bd643669@2ndquadrant.com
* Cast result of copyObject() to correct type  [Peter Eisentraut, 2017-03-28; 1 file, -4/+4]

  copyObject() is declared to return void *, which allows easily assigning the result independent of the input, but it loses all type checking.

  If the compiler supports typeof or something similar, cast the result to the input type. This creates a greater amount of type safety. In some cases, where the result is assigned to a generic type such as Node * or Expr *, new casts are now necessary, but in general casts are now unnecessary in the normal case and indicate that something unusual is happening.

  Reviewed-by: Mark Dilger <hornschnorter@gmail.com>
* Pass DSA_ALLOC_HUGE when allocating a shared TIDBitmap.  [Robert Haas, 2017-03-27; 1 file, -2/+4]

  When creating an unshared TIDBitmap, we pass MCXT_ALLOC_HUGE to allow allocations >1GB, so by analogy we pass DSA_ALLOC_HUGE for a shared TIDBitmap. Bug introduced by commit 98e6e89040a0534ca26914c66cae9dd49ef62ad9.

  Report by Rafia Sabih, fix by Dilip Kumar, adjusted by me.

  Discussion: http://postgr.es/m/CAOGQiiPpSnkuKq+oUK_bvQFg2EPGFPN8RwgxTgBa6HU_kQa3EA@mail.gmail.com
* Support hashed aggregation with grouping sets.  [Andrew Gierth, 2017-03-27; 2 files, -4/+60]

  This extends the Aggregate node with two new features: HashAggregate can now run multiple hashtables concurrently, and a new strategy MixedAggregate populates hashtables while doing sorted grouping.

  The planner will now attempt to save as many sorts as possible when planning grouping sets queries, while not exceeding work_mem for the estimated combined sizes of all hashtables used. No SQL-level changes are required. There should be no user-visible impact other than the new EXPLAIN output and possible changes to result ordering when ORDER BY was not used (which affected a few regression tests). The enable_hashagg option is respected.

  Author: Andrew Gierth
  Reviewers: Mark Dilger, Andres Freund

  Discussion: https://postgr.es/m/87vatszyhj.fsf@news-spur.riddles.org.uk
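  For example, a grouping sets query that previously forced multiple sorts may now show a MixedAggregate or HashAggregate node in its plan (table and column names hypothetical):

      CREATE TABLE sales (region text, product text, amount numeric);

      EXPLAIN
      SELECT region, product, sum(amount)
      FROM sales
      GROUP BY GROUPING SETS ((region), (product), ());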
* Implement multivariate n-distinct coefficients  [Alvaro Herrera, 2017-03-24; 3 files, -0/+63]

  Add support for explicitly declared statistic objects (CREATE STATISTICS), allowing collection of statistics on more complex combinations than individual table columns. Companion commands DROP STATISTICS and ALTER STATISTICS ... OWNER TO / SET SCHEMA / RENAME are added too. All this DDL has been designed so that more statistic types can be added later on, such as multivariate most-common-values and multivariate histograms between columns of a single table, leaving room for permitting columns on multiple tables, too, as well as expressions.

  This commit only adds support for collection of n-distinct coefficient on user-specified sets of columns in a single table. This is useful to estimate number of distinct groups in GROUP BY and DISTINCT clauses; estimation errors there can cause over-allocation of memory in hashed aggregates, for instance, so it's a worthwhile problem to solve.

  A new special pseudo-type pg_ndistinct is used.

  (num-distinct estimation was deemed sufficiently useful by itself that this is worthwhile even if no further statistic types are added immediately; so much so that another version of essentially the same functionality was submitted by Kyotaro Horiguchi: https://postgr.es/m/20150828.173334.114731693.horiguchi.kyotaro@lab.ntt.co.jp though this commit does not use that code.)

  Author: Tomas Vondra. Some code rework by Álvaro.
  Reviewed-by: Dean Rasheed, David Rowley, Kyotaro Horiguchi, Jeff Janes, Ideriha Takeshi

  Discussion: https://postgr.es/m/543AFA15.4080608@fuzzy.cz
  Discussion: https://postgr.es/m/20170320190220.ixlaueanxegqd5gr@alvherre.pgsql
* ICU support  [Peter Eisentraut, 2017-03-23; 2 files, -0/+24]

  Add a column collprovider to pg_collation that determines which library provides the collation data. The existing choices are default and libc, and this adds an icu choice, which uses the ICU4C library.

  The pg_locale_t type is changed to a union that contains the provider-specific locale handles. Users of locale information are changed to look into that struct for the appropriate handle to use.

  Also add a collversion column that records the version of the collation when it is created, and check at run time whether it is still the same. This detects potentially incompatible library upgrades that can corrupt indexes and other structures. This is currently only supported by ICU-provided collations.

  initdb initializes the default collation set as before from the `locale -a` output but also adds all available ICU locales with a "-x-icu" appended.

  Currently, ICU-provided collations can only be explicitly named collations. The global database locales are still always libc-provided.

  ICU support is enabled by configure --with-icu.

  Reviewed-by: Thomas Munro <thomas.munro@enterprisedb.com>
  Reviewed-by: Andreas Karlsson <andreas@proxel.se>
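  With an ICU-enabled build, the new collations can be listed and used explicitly (table name hypothetical):

      SELECT collname FROM pg_collation WHERE collprovider = 'i';  -- ICU-provided collations

      CREATE TABLE people (name text);
      SELECT name FROM people ORDER BY name COLLATE "de-x-icu";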