gitlab-org--gitlab-foss/db/migrate/20171115164540_populate_mer...

31 lines
845 B
Ruby
Raw Normal View History

Use latest_merge_request_diff association Compared to the merge_request_diff association: 1. It's simpler to query. The query uses a foreign key to the merge_request_diffs table, so no ordering is necessary. 2. It's faster for preloading. The merge_request_diff association has to load every diff for the MRs in the set, then discard all but the most recent for each. This association means that Rails can just query for N diffs from N MRs. 3. It's more complicated to update. This is a bidirectional foreign key, so we need to update two tables when adding a diff record. This also means we need to handle this as a special case when importing a GitLab project. There is some juggling with this association in the merge request model: * `MergeRequest#latest_merge_request_diff` is _always_ the latest diff. * `MergeRequest#merge_request_diff` reuses `MergeRequest#latest_merge_request_diff` unless: * Arguments are passed. These are typically to force-reload the association. * It doesn't exist. That means we might be trying to implicitly create a diff. This only seems to happen in specs. * The association is already loaded. This is important for the reasons explained in the comment, which I'll reiterate here: if we a) load a non-latest diff, then b) get its `merge_request`, then c) get that MR's `merge_request_diff`, we should get the diff we loaded in c), even though that's not the latest diff. Basically, `MergeRequest#merge_request_diff` is the latest diff in most cases, but not quite all.
2017-11-15 17:22:18 +00:00
# This is identical to the stolen background migration, which already has specs.
class PopulateMergeRequestsLatestMergeRequestDiffIdTakeTwo < ActiveRecord::Migration[4.2]
Use latest_merge_request_diff association Compared to the merge_request_diff association: 1. It's simpler to query. The query uses a foreign key to the merge_request_diffs table, so no ordering is necessary. 2. It's faster for preloading. The merge_request_diff association has to load every diff for the MRs in the set, then discard all but the most recent for each. This association means that Rails can just query for N diffs from N MRs. 3. It's more complicated to update. This is a bidirectional foreign key, so we need to update two tables when adding a diff record. This also means we need to handle this as a special case when importing a GitLab project. There is some juggling with this association in the merge request model: * `MergeRequest#latest_merge_request_diff` is _always_ the latest diff. * `MergeRequest#merge_request_diff` reuses `MergeRequest#latest_merge_request_diff` unless: * Arguments are passed. These are typically to force-reload the association. * It doesn't exist. That means we might be trying to implicitly create a diff. This only seems to happen in specs. * The association is already loaded. This is important for the reasons explained in the comment, which I'll reiterate here: if we a) load a non-latest diff, then b) get its `merge_request`, then c) get that MR's `merge_request_diff`, we should get the diff we loaded in c), even though that's not the latest diff. Basically, `MergeRequest#merge_request_diff` is the latest diff in most cases, but not quite all.
2017-11-15 17:22:18 +00:00
include Gitlab::Database::MigrationHelpers
DOWNTIME = false
BATCH_SIZE = 1_000
class MergeRequest < ActiveRecord::Base
self.table_name = 'merge_requests'
include ::EachBatch
end
disable_ddl_transaction!
def up
Gitlab::BackgroundMigration.steal('PopulateMergeRequestsLatestMergeRequestDiffId')
update = '
latest_merge_request_diff_id = (
SELECT MAX(id)
FROM merge_request_diffs
WHERE merge_requests.id = merge_request_diffs.merge_request_id
)'.squish
MergeRequest.where(latest_merge_request_diff_id: nil).each_batch(of: BATCH_SIZE) do |relation|
relation.update_all(update)
end
end
end