diff options
Diffstat (limited to 'doc/development/iterating_tables_in_batches.md')
-rw-r--r-- | doc/development/iterating_tables_in_batches.md | 37 |
1 files changed, 37 insertions, 0 deletions
diff --git a/doc/development/iterating_tables_in_batches.md b/doc/development/iterating_tables_in_batches.md new file mode 100644 index 00000000000..590c8cbba2d --- /dev/null +++ b/doc/development/iterating_tables_in_batches.md @@ -0,0 +1,37 @@ +# Iterating Tables In Batches + +Rails provides a method called `in_batches` that can be used to iterate over +rows in batches. For example: + +```ruby +User.in_batches(of: 10) do |relation| + relation.update_all(updated_at: Time.now) +end +``` + +Unfortunately this method is implemented in a way that is not very efficient, +both query and memory usage wise. + +To work around this you can include the `EachBatch` module into your models, +then use the `each_batch` class method. For example: + +```ruby +class User < ActiveRecord::Base + include EachBatch +end + +User.each_batch(of: 10) do |relation| + relation.update_all(updated_at: Time.now) +end +``` + +This will end up producing queries such as: + +``` +User Load (0.7ms) SELECT "users"."id" FROM "users" WHERE ("users"."id" >= 41654) ORDER BY "users"."id" ASC LIMIT 1 OFFSET 1000 + (0.7ms) SELECT COUNT(*) FROM "users" WHERE ("users"."id" >= 41654) AND ("users"."id" < 42687) +``` + +The API of this method is similar to `in_batches`, though it doesn't support +all of the arguments that `in_batches` supports. You should always use +`each_batch` _unless_ you have a specific need for `in_batches`. |