summaryrefslogtreecommitdiff
path: root/doc/development/iterating_tables_in_batches.md
diff options
context:
space:
mode:
Diffstat (limited to 'doc/development/iterating_tables_in_batches.md')
-rw-r--r--doc/development/iterating_tables_in_batches.md37
1 files changed, 37 insertions, 0 deletions
diff --git a/doc/development/iterating_tables_in_batches.md b/doc/development/iterating_tables_in_batches.md
new file mode 100644
index 00000000000..590c8cbba2d
--- /dev/null
+++ b/doc/development/iterating_tables_in_batches.md
@@ -0,0 +1,37 @@
+# Iterating Tables In Batches
+
+Rails provides a method called `in_batches` that can be used to iterate over
+rows in batches. For example:
+
+```ruby
+User.in_batches(of: 10) do |relation|
+ relation.update_all(updated_at: Time.now)
+end
+```
+
+Unfortunately this method is implemented in a way that is not very efficient,
+both query and memory usage wise.
+
+To work around this you can include the `EachBatch` module into your models,
+then use the `each_batch` class method. For example:
+
+```ruby
+class User < ActiveRecord::Base
+ include EachBatch
+end
+
+User.each_batch(of: 10) do |relation|
+ relation.update_all(updated_at: Time.now)
+end
+```
+
+This will end up producing queries such as:
+
+```
+User Load (0.7ms) SELECT "users"."id" FROM "users" WHERE ("users"."id" >= 41654) ORDER BY "users"."id" ASC LIMIT 1 OFFSET 1000
+ (0.7ms) SELECT COUNT(*) FROM "users" WHERE ("users"."id" >= 41654) AND ("users"."id" < 42687)
+```
+
+The API of this method is similar to `in_batches`, though it doesn't support
+all of the arguments that `in_batches` supports. You should always use
+`each_batch` _unless_ you have a specific need for `in_batches`.