diff options
author | Yorick Peterse <yorickpeterse@gmail.com> | 2017-07-07 14:49:05 +0200 |
---|---|---|
committer | Yorick Peterse <yorickpeterse@gmail.com> | 2017-07-07 14:49:05 +0200 |
commit | 5f9c84584efd5a7cbe19ada49fdccefbd5f54aea (patch) | |
tree | 0b5d591e04ea498e1c0dbaa9c18d9200709e930c /doc | |
parent | d40445e4c9964ae0ab793bfdd7ba530de4259716 (diff) | |
download | gitlab-ce-5f9c84584efd5a7cbe19ada49fdccefbd5f54aea.tar.gz |
Added EachBatch for iterating tables in batches
This module provides a class method called `each_batch` that can be used
to iterate tables in batches in a more efficient way compared to Rails'
`in_batches` method. This commit also includes a RuboCop cop to
blacklist the use of `in_batches` in favour of this new method.
Diffstat (limited to 'doc')
-rw-r--r-- | doc/development/README.md | 1 | ||||
-rw-r--r-- | doc/development/iterating_tables_in_batches.md | 37 |
2 files changed, 38 insertions, 0 deletions
diff --git a/doc/development/README.md b/doc/development/README.md index a2a07c37ced..58993c52dcd 100644 --- a/doc/development/README.md +++ b/doc/development/README.md @@ -55,6 +55,7 @@ - [Single Table Inheritance](single_table_inheritance.md) - [Background Migrations](background_migrations.md) - [Storing SHA1 Hashes As Binary](sha1_as_binary.md) +- [Iterating Tables In Batches](iterating_tables_in_batches.md) ## i18n diff --git a/doc/development/iterating_tables_in_batches.md b/doc/development/iterating_tables_in_batches.md new file mode 100644 index 00000000000..590c8cbba2d --- /dev/null +++ b/doc/development/iterating_tables_in_batches.md @@ -0,0 +1,37 @@ +# Iterating Tables In Batches + +Rails provides a method called `in_batches` that can be used to iterate over +rows in batches. For example: + +```ruby +User.in_batches(of: 10) do |relation| + relation.update_all(updated_at: Time.now) +end +``` + +Unfortunately this method is implemented in a way that is not very efficient, +both query and memory usage wise. + +To work around this you can include the `EachBatch` module into your models, +then use the `each_batch` class method. For example: + +```ruby +class User < ActiveRecord::Base + include EachBatch +end + +User.each_batch(of: 10) do |relation| + relation.update_all(updated_at: Time.now) +end +``` + +This will end up producing queries such as: + +``` +User Load (0.7ms) SELECT "users"."id" FROM "users" WHERE ("users"."id" >= 41654) ORDER BY "users"."id" ASC LIMIT 1 OFFSET 1000 + (0.7ms) SELECT COUNT(*) FROM "users" WHERE ("users"."id" >= 41654) AND ("users"."id" < 42687) +``` + +The API of this method is similar to `in_batches`, though it doesn't support +all of the arguments that `in_batches` supports. You should always use +`each_batch` _unless_ you have a specific need for `in_batches`. |