gitlab-org--gitlab-foss/doc/development/iterating_tables_in_batches.md
Yorick Peterse ff78af152c
Added EachBatch for iterating tables in batches
This module provides a class method called `each_batch` that can be used
to iterate tables in batches in a more efficient way compared to Rails'
`in_batches` method. This commit also includes a RuboCop cop to
blacklist the use of `in_batches` in favour of this new method.
2017-07-07 16:39:36 +02:00

37 lines
1.1 KiB
Markdown

# Iterating Tables In Batches
Rails provides a method called `in_batches` that can be used to iterate over
rows in batches. For example:
```ruby
User.in_batches(of: 10) do |relation|
relation.update_all(updated_at: Time.now)
end
```
Unfortunately this method is implemented in a way that is not very efficient,
both query and memory usage wise.
To work around this you can include the `EachBatch` module into your models,
then use the `each_batch` class method. For example:
```ruby
class User < ActiveRecord::Base
include EachBatch
end
User.each_batch(of: 10) do |relation|
relation.update_all(updated_at: Time.now)
end
```
This will end up producing queries such as:
```
User Load (0.7ms) SELECT "users"."id" FROM "users" WHERE ("users"."id" >= 41654) ORDER BY "users"."id" ASC LIMIT 1 OFFSET 1000
(0.7ms) SELECT COUNT(*) FROM "users" WHERE ("users"."id" >= 41654) AND ("users"."id" < 42687)
```
The API of this method is similar to `in_batches`, though it doesn't support
all of the arguments that `in_batches` supports. You should always use
`each_batch` _unless_ you have a specific need for `in_batches`.