# frozen_string_literal: true
module Gitlab
  module ImportExport
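    # Resolves the serialization options (attributes, methods, includes,
    # preloads, ordering) for each relation in the import/export
    # configuration. Illustrative usage; the config hash shown here is
    # hypothetical, in practice it is built from import_export.yml:
    #
    #   finder = Gitlab::ImportExport::AttributesFinder.new(
    #     config: { tree: { project: { issues: {} } } }
    #   )
    #   finder.find_root(:project)
    #   # => serialization options hash with :include and :preload entries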
    class AttributesFinder
      attr_reader :tree, :included_attributes, :excluded_attributes, :methods, :preloads, :export_reorders
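      # `config` is the parsed import/export configuration hash; every
      # section is optional and defaults to an empty hash.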
      def initialize(config:)
        @tree = config[:tree] || {}
        @included_attributes = config[:included_attributes] || {}
        @excluded_attributes = config[:excluded_attributes] || {}
        @methods = config[:methods] || {}
        @preloads = config[:preloads] || {}
        @export_reorders = config[:export_reorders] || {}
      end
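      # Builds the complete serialization options for a top-level model key,
      # e.g. find_root(:project) when the configured tree has a :project root.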
      def find_root(model_key)
        find(model_key, @tree[model_key])
      end
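      # Returns the raw relations subtree for the given model key, or nil
      # when the key is not part of the configured tree.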
      def find_relations_tree(model_key)
        @tree[model_key]
      end
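      # Returns the excluded attribute names for a class as strings,
      # e.g. (with a hypothetical config entry) find_excluded_keys('project')
      # # => ["import_url"]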
      def find_excluded_keys(klass_name)
        @excluded_attributes[klass_name.to_sym]&.map(&:to_s) || []
      end

      private
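      # Assembles the options hash consumed by the serializer: attribute
      # allow/deny lists, extra methods, nested includes, preloads and export
      # ordering. `compact` drops any option that is not configured.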
      def find(model_key, model_tree)
        {
          only: @included_attributes[model_key],
          except: @excluded_attributes[model_key],
          methods: @methods[model_key],
          include: resolve_model_tree(model_tree),
          preload: resolve_preloads(model_key, model_tree),
          export_reorder: @export_reorders[model_key]
        }.compact
      end
      # ActiveModel::Serialization is simple in that it recursively calls
      # `as_json` on each object to serialize everything. However, for a
      # model like a Project, this can generate a query for every single
      # association, which can add up to tens of thousands of queries and
      # lead to memory bloat.
      #
      # To improve this, we can do several things:
      #
      # 1. We use `tree:` and `preload:` to automatically generate a list of
      #    all preloads that could be used to serialize objects in bulk.
      # 2. We observe that a single project has many issues, merge requests,
      #    etc. Instead of serializing everything at once, which could lead
      #    to database timeouts and high memory usage, we take each top-level
      #    association and serialize the data in batches. For example, we
      #    serialize the first 100 issues and preload all of their associated
      #    events, notes, etc. before moving on to the next batch. When we're
      #    done, we serialize merge requests in the same way. We repeat this
      #    pattern for the remaining associations specified in
      #    import_export.yml.
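      # Illustrative example with hypothetical keys: given a model_tree of
      # { issues: { notes: {} } } and @preloads[:issues] = { project: nil },
      # this resolves to { issues: { notes: nil, project: nil } }, a
      # structure suitable for ActiveRecord's `preload`.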
      def resolve_preloads(model_key, model_tree)
        model_tree
          .map { |submodel_key, submodel_tree| resolve_preload(model_key, submodel_key, submodel_tree) }
          .tap { |entries| entries.compact! }
          .to_h
          .deep_merge(@preloads[model_key].to_h)
          .presence
      end
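      # Skips submodels that are exported via a method on the parent (they
      # have no association to preload); otherwise recurses into the subtree.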
      def resolve_preload(parent_model_key, model_key, model_tree)
        return if @methods[parent_model_key]&.include?(model_key)

        [model_key, resolve_preloads(model_key, model_tree)]
      end
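      # Maps each subtree entry to its resolved options; returns nil for a
      # missing subtree so `compact` drops the :include key entirely.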
      def resolve_model_tree(model_tree)
        return unless model_tree

        model_tree
          .map(&method(:resolve_model))
      end
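      # Wraps a single relation as { model_key => options } for use in the
      # serializer's include list.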
      def resolve_model(model_key, model_tree)
        { model_key => find(model_key, model_tree) }
      end
    end
  end
end