author | Kamil Trzciński <ayufan@ayufan.eu> | 2019-09-09 15:40:49 +0000 |
---|---|---|
committer | Stan Hu <stanhu@gmail.com> | 2019-09-09 15:40:49 +0000 |
commit | 0e56c1e7cb3e1bbf3e81ab9907a26d385e28022c (patch) | |
tree | 4022cd2fe891d64eb34ceb5537467737a4054538 /lib/gitlab/import_export/import_export.yml | |
parent | 383f363589ac405cce07d3b54e796f9c949d2ffb (diff) | |
download | gitlab-ce-0e56c1e7cb3e1bbf3e81ab9907a26d385e28022c.tar.gz |
Improve performance and memory usage of project export
ActiveModel::Serialization is simple in that it recursively calls
`as_json` on each object to serialize everything. However, for a model
like a Project, this can generate a query for every single association,
which can add up to tens of thousands of queries and lead to memory
bloat.
To improve this, we can do several things:
1. We use `tree:` and `preload:` to automatically generate
a list of all preloads that could be used to serialize
objects in bulk.
2. We observe that a single project has many issues, merge requests,
etc. Instead of serializing everything at once, which could lead to
database timeouts and high memory usage, we take each top-level
association and serialize the data in batches.
For example, we serialize the first 100 issues and preload all of
their associated events, notes, etc. before moving onto the next
batch. When we're done, we serialize merge requests in the same way.
We repeat this pattern for the remaining associations specified in
import_export.yml.
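
The effect described in the commit message can be sketched in plain Ruby without Rails. This is an illustration only, not the code from this commit: `FakeDb`, `serialize_naive`, and `serialize_in_batches` are hypothetical names, and the in-memory hash stands in for a real database so that "queries" can be counted.

```ruby
# Hypothetical in-memory stand-in for a database; it counts "queries" issued.
class FakeDb
  attr_reader :query_count

  def initialize(notes_by_issue_id)
    @notes_by_issue_id = notes_by_issue_id
    @query_count = 0
  end

  # One query per issue: the pattern recursive `as_json` tends to produce.
  def notes_for(issue_id)
    @query_count += 1
    @notes_by_issue_id.fetch(issue_id, [])
  end

  # One query for a whole batch of issues: the preload pattern.
  def notes_for_all(issue_ids)
    @query_count += 1
    issue_ids.to_h { |id| [id, @notes_by_issue_id.fetch(id, [])] }
  end
end

# Serializes each issue on its own, loading its notes on demand.
def serialize_naive(db, issue_ids)
  issue_ids.map { |id| { 'id' => id, 'notes' => db.notes_for(id) } }
end

# Serializes issues in batches, preloading the notes once per batch.
def serialize_in_batches(db, issue_ids, batch_size: 100)
  issue_ids.each_slice(batch_size).flat_map do |batch|
    notes = db.notes_for_all(batch)
    batch.map { |id| { 'id' => id, 'notes' => notes[id] } }
  end
end

notes = { 1 => ['first note'], 250 => ['late note'] }
issue_ids = (1..250).to_a

naive_db = FakeDb.new(notes)
serialize_naive(naive_db, issue_ids)
puts "naive queries:   #{naive_db.query_count}"   # 250, one per issue

batched_db = FakeDb.new(notes)
serialize_in_batches(batched_db, issue_ids, batch_size: 100)
puts "batched queries: #{batched_db.query_count}" # 3, one per batch
```

Both functions return the same serialized data. Only the number of round trips differs, and because each batch is built and released in turn, the preloaded rows held at any moment are limited to one batch.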
Diffstat (limited to 'lib/gitlab/import_export/import_export.yml')
-rw-r--r-- | lib/gitlab/import_export/import_export.yml | 10 |
1 files changed, 10 insertions, 0 deletions
diff --git a/lib/gitlab/import_export/import_export.yml b/lib/gitlab/import_export/import_export.yml
index 06c94beead8..511b702553e 100644
--- a/lib/gitlab/import_export/import_export.yml
+++ b/lib/gitlab/import_export/import_export.yml
@@ -231,6 +231,16 @@ methods:
   ci_pipelines:
     - :notes
 
+preloads:
+  statuses:
+    # TODO: We cannot preload tags, as they are not part of `GenericCommitStatus`
+    # tags: # needed by tag_list
+    project: # deprecated: needed by coverage_regex of Ci::Build
+  merge_requests:
+    source_project: # needed by source_branch_sha and diff_head_sha
+    target_project: # needed by target_branch_sha
+    assignees: # needed by assigne_id that is implemented by DeprecatedAssignee
+
 # EE specific relationships and settings to include. All of this will be merged
 # into the previous structures if EE is used.
 ee:
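
As a rough illustration of how a `preloads:` section like the one added here could be turned into a nested hash of association names (the general shape that ActiveRecord's `preload` takes), consider the sketch below. It uses only Ruby's standard `yaml` library. `PRELOADS_YAML` is a trimmed copy of the added section, and `to_preload_tree` is a hypothetical helper. How GitLab's exporter actually consumes this section is not shown in this diff.

```ruby
require 'yaml'

# Trimmed copy of the `preloads:` section from the diff (comments removed).
PRELOADS_YAML = <<~YAML
  preloads:
    statuses:
      project:
    merge_requests:
      source_project:
      target_project:
      assignees:
YAML

# Hypothetical helper: converts the parsed YAML tree into a nested hash
# with symbol keys. Keys without a value parse as nil and become `{}`.
def to_preload_tree(node)
  return {} if node.nil?

  node.to_h { |name, children| [name.to_sym, to_preload_tree(children)] }
end

config = YAML.safe_load(PRELOADS_YAML)
tree = to_preload_tree(config['preloads'])
puts tree.inspect
```

For `merge_requests`, the resulting subtree lists `source_project`, `target_project`, and `assignees`, each mapped to an empty hash, so a batch of merge requests could in principle be loaded together with those associations in a fixed number of queries.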