gitlab-org--gitlab-foss/doc/development/iterating_tables_in_batches.md

1.3 KiB

stage group info
none unassigned To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#designated-technical-writers

Iterating Tables In Batches

Rails provides a method called in_batches that can be used to iterate over rows in batches. For example:

User.in_batches(of: 10) do |relation|
  relation.update_all(updated_at: Time.now)
end

Unfortunately this method is implemented in a way that is not very efficient, both query and memory usage wise.

To work around this you can include the EachBatch module into your models, then use the each_batch class method. For example:

class User < ActiveRecord::Base
  include EachBatch
end

User.each_batch(of: 10) do |relation|
  relation.update_all(updated_at: Time.now)
end

This will end up producing queries such as:

User Load (0.7ms)  SELECT  "users"."id" FROM "users" WHERE ("users"."id" >= 41654)  ORDER BY "users"."id" ASC LIMIT 1 OFFSET 1000
  (0.7ms)  SELECT COUNT(*) FROM "users" WHERE ("users"."id" >= 41654) AND ("users"."id" < 42687)

The API of this method is similar to in_batches, though it doesn't support all of the arguments that in_batches supports. You should always use each_batch unless you have a specific need for in_batches.