diff options
Diffstat (limited to 'doc/development/database/setting_multiple_values.md')
-rw-r--r-- | doc/development/database/setting_multiple_values.md | 103 |
1 files changed, 103 insertions, 0 deletions
diff --git a/doc/development/database/setting_multiple_values.md b/doc/development/database/setting_multiple_values.md new file mode 100644 index 00000000000..5569a0e10b7 --- /dev/null +++ b/doc/development/database/setting_multiple_values.md @@ -0,0 +1,103 @@ +# Setting Multiple Values + +> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/32921) in GitLab 13.5. + +Frequently, we will want to update multiple objects with new values for one +or more columns. The obvious way to do this is using `Relation#update_all`: + +```ruby +user.issues.open.update_all(due_date: 7.days.from_now) # (1) +user.issues.update_all('relative_position = relative_position + 1') # (2) +``` + +But what do you do if you cannot express the update as either a static value (1) +or as a calculation (2)? + +Thankfully we can use `UPDATE FROM` to express the need to update multiple rows +with distinct values in a single query. One can either use a temporary table, or +a Common Table Expression (CTE), and then use that as the source of the updates: + +```sql +with updates(obj_id, new_title, new_weight) as ( + values (1 :: integer, 'Very difficult issue' :: text, 8 :: integer), + (2, 'Very easy issue', 1) +) +update issues + set title = new_title, weight = new_weight + from updates + where id = obj_id +``` + +The bad news: There is no way to express this in ActiveRecord or even dropping +down to ARel - the `UpdateManager` just does not support `update from`, so this +is not expressible. + +The good news: We supply an abstraction to help you generate these kinds of +updates, called `Gitlab::Database::BulkUpdate`. This constructs queries such as the +above, and uses binding parameters to avoid SQL injection. + +## Usage + +To use this, we need: + +- the list of columns to update +- a mapping from object/ID to the new values to set for that object +- a way to determine the table for each object + +So for example, we can express the query above as: + +```ruby +issue_a = Issue.find(..) +issue_b = Issue.find(..) + +# Issues a single query: +::Gitlab::Database::BulkUpdate.execute(%i[title weight], { + issue_a => { title: 'Very difficult issue', weight: 8 }, + issue_b => { title: 'Very easy issue', weight: 1 } +}) +``` + +Here the table can be determined automatically, from calling +`object.class.table_name`, so we don't need to provide anything. + +We can even pass heterogeneous sets of objects, if the updates all make sense +for them: + +```ruby +issue_a = Issue.find(..) +issue_b = Issue.find(..) +merge_request = MergeRequest.find(..) + +# Issues two queries +::Gitlab::Database::BulkUpdate.execute(%i[title], { + issue_a => { title: 'A' }, + issue_b => { title: 'B' }, + merge_request => { title: 'B' } +}) +``` + +If your objects do not return the correct model class (perhaps because they are +part of a union), then we need to specify this explicitly in a block: + +```ruby +bazzes = params +objects = Foo.from_union([ + Foo.select("id, 'foo' as object_type").where(quux: true), + Bar.select("id, 'bar' as object_type").where(wibble: true) + ]) +# At this point, all the objects are instances of Foo, even the ones from the +# Bar table +mapping = objects.to_h { |obj| [obj, bazzes[obj.id] } + +# Issues at most 2 queries +::Gitlab::Database::BulkUpdate.execute(%i[baz], mapping) do |obj| + obj.object_type.constantize +end +``` + +## Caveats + +Note that this is a **very low level** tool, and operates on the raw column +values. Enumerations and state fields must be translated into their underlying +representations, for example, and nested associations are not supported. No +validations or hooks will be called. |