gitlab-org--gitlab-foss/lib/gitlab/import_export/attributes_finder.rb
Kamil Trzciński 0e56c1e7cb Improve performance and memory usage of project export
ActiveModel::Serialization is simple in that it recursively calls
`as_json` on each object to serialize everything. However, for a model
like a Project, this can generate a query for every single association,
which can add up to tens of thousands of queries and lead to memory
bloat.

To improve this, we can do several things:

1. We use `tree:` and `preload:` to automatically generate
   a list of all preloads that could be used to serialize
   objects in bulk.

2. We observe that a single project has many issues, merge requests,
   etc. Instead of serializing everything at once, which could lead to
   database timeouts and high memory usage, we take each top-level
   association and serialize the data in batches.

For example, we serialize the first 100 issues and preload all of
their associated events, notes, etc. before moving onto the next
batch. When we're done, we serialize merge requests in the same way.
We repeat this pattern for the remaining associations specified in
`import_export.yml`.
2019-09-09 15:40:49 +00:00
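
The batching idea described in the commit message can be sketched in plain Ruby. This is only an illustration under stated assumptions, not the exporter's actual code: `Issue`, `NoteStore`, and `serialize_in_batches` are hypothetical names, and the in-memory `NoteStore` stands in for a bulk preload query that a real export would run against the database.

```ruby
# Hypothetical stand-in for a database row; a real export would use
# ActiveRecord models instead.
Issue = Struct.new(:id, :title)

# Hypothetical note store that counts how many bulk lookups it serves.
# Each lookup stands in for one preload query.
class NoteStore
  attr_reader :bulk_lookups

  def initialize(notes_by_issue_id)
    @notes_by_issue_id = notes_by_issue_id
    @bulk_lookups = 0
  end

  # Fetch the notes for a whole batch of issue ids in one call.
  def notes_for(issue_ids)
    @bulk_lookups += 1
    @notes_by_issue_id.slice(*issue_ids)
  end
end

# Serialize issues batch by batch: load the associated notes once per
# batch, then build plain hashes for every issue in that batch.
def serialize_in_batches(issues, store, batch_size: 100)
  issues.each_slice(batch_size).flat_map do |batch|
    notes = store.notes_for(batch.map(&:id))

    batch.map do |issue|
      {
        'id' => issue.id,
        'title' => issue.title,
        'notes' => notes.fetch(issue.id, [])
      }
    end
  end
end

issues = (1..250).map { |id| Issue.new(id, "Issue #{id}") }
store = NoteStore.new({ 1 => ['first note'], 101 => ['another note'] })
exported = serialize_in_batches(issues, store)

puts exported.size
puts store.bulk_lookups
```

With 250 issues and a batch size of 100, the sketch performs one notes lookup per batch rather than one per issue, which is the query reduction the commit message aims for.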


# frozen_string_literal: true

module Gitlab
  module ImportExport
    class AttributesFinder
      def initialize(config:)
        @tree = config[:tree] || {}
        @included_attributes = config[:included_attributes] || {}
        @excluded_attributes = config[:excluded_attributes] || {}
        @methods = config[:methods] || {}
        @preloads = config[:preloads] || {}
      end

      def find_root(model_key)
        find(model_key, @tree[model_key])
      end

      def find_relations_tree(model_key)
        @tree[model_key]
      end

      def find_excluded_keys(klass_name)
        @excluded_attributes[klass_name.to_sym]&.map(&:to_s) || []
      end

      private

      def find(model_key, model_tree)
        {
          only: @included_attributes[model_key],
          except: @excluded_attributes[model_key],
          methods: @methods[model_key],
          include: resolve_model_tree(model_tree),
          preload: resolve_preloads(model_key, model_tree)
        }.compact
      end

      def resolve_preloads(model_key, model_tree)
        model_tree
          .map { |submodel_key, submodel_tree| resolve_preload(model_key, submodel_key, submodel_tree) }
          .compact
          .to_h
          .deep_merge(@preloads[model_key].to_h)
          .presence
      end

      def resolve_preload(parent_model_key, model_key, model_tree)
        return if @methods[parent_model_key]&.include?(model_key)

        [model_key, resolve_preloads(model_key, model_tree)]
      end

      def resolve_model_tree(model_tree)
        return unless model_tree

        model_tree
          .map(&method(:resolve_model))
      end

      def resolve_model(model_key, model_tree)
        { model_key => find(model_key, model_tree) }
      end
    end
  end
end
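
To make the shape of the resolved options easier to picture, here is a standalone sketch that mirrors the logic of `find` and `resolve_preloads` in plain Ruby, without ActiveSupport. It is a simplified approximation, not the class itself: `CONFIG` is a made-up, trimmed-down configuration, and `find_options` and the hand-rolled `deep_merge` are hypothetical helpers that replace the private methods and the ActiveSupport `deep_merge`/`presence` calls.

```ruby
# Hypothetical, trimmed-down configuration. The real values come from
# import_export.yml and are far larger.
CONFIG = {
  tree: {
    project: {
      issues: { notes: { author: {} } },
      labels: {}
    }
  },
  included_attributes: { author: [:id, :name] },
  excluded_attributes: { issues: [:secret_token] },
  methods: { notes: [:type] },
  preloads: { issues: { milestone: nil } }
}.freeze

# Plain-Ruby replacement for ActiveSupport's Hash#deep_merge.
def deep_merge(left, right)
  left.merge(right) do |_key, a, b|
    a.is_a?(Hash) && b.is_a?(Hash) ? deep_merge(a, b) : b
  end
end

# Build the nested preload hash for a model: one entry per sub-relation
# (unless it is listed as a method on the parent), merged with any
# explicitly configured preloads. Returns nil when nothing is left.
def resolve_preloads(config, model_key, model_tree)
  preloads = model_tree.each_with_object({}) do |(sub_key, sub_tree), acc|
    next if config[:methods][model_key]&.include?(sub_key)

    acc[sub_key] = resolve_preloads(config, sub_key, sub_tree)
  end

  merged = deep_merge(preloads, config[:preloads][model_key].to_h)
  merged.empty? ? nil : merged
end

# Build the serialization options for a model and, recursively, for
# every relation beneath it. Keys whose value is nil are dropped.
def find_options(config, model_key, model_tree)
  {
    only: config[:included_attributes][model_key],
    except: config[:excluded_attributes][model_key],
    methods: config[:methods][model_key],
    include: model_tree.map { |key, tree| { key => find_options(config, key, tree) } },
    preload: resolve_preloads(config, model_key, model_tree)
  }.compact
end

root_options = find_options(CONFIG, :project, CONFIG[:tree][:project])
p root_options[:preload]
```

For this sample configuration the root `preload` entry combines the relations found in the tree (`issues`, `notes`, `author`, `labels`) with the explicitly configured `milestone` preload under `issues`, while `only`, `except`, and `methods` are omitted at the root because nothing is configured for `:project`.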