Diffstat (limited to 'doc')
-rw-r--r-- | doc/api/releases/index.md | 42 |
-rw-r--r-- | doc/development/README.md | 1 |
-rw-r--r-- | doc/development/insert_into_tables_in_batches.md | 156 |
-rw-r--r-- | doc/gitlab-basics/create-your-ssh-keys.md | 7 |
-rw-r--r-- | doc/user/project/settings/import_export.md | 7 |
5 files changed, 199 insertions, 14 deletions
diff --git a/doc/api/releases/index.md b/doc/api/releases/index.md
index 88fdedfa1b8..ffd4efdca5d 100644
--- a/doc/api/releases/index.md
+++ b/doc/api/releases/index.md
@@ -70,7 +70,11 @@ Example response:
     "updated_at":"2019-07-12T19:45:44.256Z",
     "due_date":"2019-08-16T11:00:00.256Z",
     "start_date":"2019-07-30T12:00:00.256Z",
-    "web_url":"https://gitlab.example.com/root/awesome-app/-/milestones/1"
+    "web_url":"https://gitlab.example.com/root/awesome-app/-/milestones/1",
+    "issue_stats": {
+      "total": 98,
+      "closed": 76
+    }
   },
   {
     "id":52,
@@ -83,7 +87,11 @@ Example response:
     "updated_at":"2019-07-16T14:00:12.256Z",
     "due_date":"2019-08-16T11:00:00.256Z",
     "start_date":"2019-07-30T12:00:00.256Z",
-    "web_url":"https://gitlab.example.com/root/awesome-app/-/milestones/2"
+    "web_url":"https://gitlab.example.com/root/awesome-app/-/milestones/2",
+    "issue_stats": {
+      "total": 24,
+      "closed": 21
+    }
   }
 ],
 "commit_path":"/root/awesome-app/commit/588440f66559714280628a4f9799f0c4eb880a4a",
@@ -252,7 +260,11 @@ Example response:
     "updated_at":"2019-07-12T19:45:44.256Z",
     "due_date":"2019-08-16T11:00:00.256Z",
     "start_date":"2019-07-30T12:00:00.256Z",
-    "web_url":"https://gitlab.example.com/root/awesome-app/-/milestones/1"
+    "web_url":"https://gitlab.example.com/root/awesome-app/-/milestones/1",
+    "issue_stats": {
+      "total": 98,
+      "closed": 76
+    }
   },
   {
     "id":52,
@@ -265,7 +277,11 @@ Example response:
     "updated_at":"2019-07-16T14:00:12.256Z",
     "due_date":"2019-08-16T11:00:00.256Z",
     "start_date":"2019-07-30T12:00:00.256Z",
-    "web_url":"https://gitlab.example.com/root/awesome-app/-/milestones/2"
+    "web_url":"https://gitlab.example.com/root/awesome-app/-/milestones/2",
+    "issue_stats": {
+      "total": 24,
+      "closed": 21
+    }
   }
 ],
 "commit_path":"/root/awesome-app/commit/588440f66559714280628a4f9799f0c4eb880a4a",
@@ -374,7 +390,11 @@ Example response:
     "updated_at":"2019-07-12T19:45:44.256Z",
     "due_date":"2019-08-16T11:00:00.256Z",
     "start_date":"2019-07-30T12:00:00.256Z",
-    "web_url":"https://gitlab.example.com/root/awesome-app/-/milestones/1"
+    "web_url":"https://gitlab.example.com/root/awesome-app/-/milestones/1",
+    "issue_stats": {
+      "total": 99,
+      "closed": 76
+    }
   },
   {
     "id":52,
@@ -387,7 +407,11 @@ Example response:
     "updated_at":"2019-07-16T14:00:12.256Z",
     "due_date":"2019-08-16T11:00:00.256Z",
     "start_date":"2019-07-30T12:00:00.256Z",
-    "web_url":"https://gitlab.example.com/root/awesome-app/-/milestones/2"
+    "web_url":"https://gitlab.example.com/root/awesome-app/-/milestones/2",
+    "issue_stats": {
+      "total": 24,
+      "closed": 21
+    }
   }
 ],
 "commit_path":"/root/awesome-app/commit/588440f66559714280628a4f9799f0c4eb880a4a",
@@ -495,7 +519,11 @@ Example response:
     "updated_at":"2019-09-01T13:00:00.256Z",
     "due_date":"2019-09-20T13:00:00.256Z",
     "start_date":"2019-09-05T12:00:00.256Z",
-    "web_url":"https://gitlab.example.com/root/awesome-app/-/milestones/3"
+    "web_url":"https://gitlab.example.com/root/awesome-app/-/milestones/3",
+    "issue_stats": {
+      "opened": 11,
+      "closed": 78
+    }
   }
 ],
 "commit_path":"/root/awesome-app/commit/588440f66559714280628a4f9799f0c4eb880a4a",
diff --git a/doc/development/README.md b/doc/development/README.md
index d73b83e53fc..b207f208e3d 100644
--- a/doc/development/README.md
+++ b/doc/development/README.md
@@ -145,6 +145,7 @@ Complementary reads:
 - [Hash indexes](hash_indexes.md)
 - [Storing SHA1 hashes as binary](sha1_as_binary.md)
 - [Iterating tables in batches](iterating_tables_in_batches.md)
+- [Insert into tables in batches](insert_into_tables_in_batches.md)
 - [Ordering table columns](ordering_table_columns.md)
 - [Verifying database capabilities](verifying_database_capabilities.md)
 - [Database Debugging and Troubleshooting](database_debugging.md)
diff --git a/doc/development/insert_into_tables_in_batches.md b/doc/development/insert_into_tables_in_batches.md
new file mode 100644
index 00000000000..763185013c9
--- /dev/null
+++ b/doc/development/insert_into_tables_in_batches.md
@@ -0,0 +1,156 @@
+---
+description: "Sometimes it is necessary to store large amounts of records at once, which can be inefficient
+when iterating collections and performing individual `save`s. With the arrival of `insert_all`
+in Rails 6, which operates at the row level (that is, using `Hash`es), GitLab has added a set
+of APIs that make it safe and simple to insert ActiveRecord objects in bulk."
+---
+
+# Insert into tables in batches
+
+Sometimes it is necessary to store large amounts of records at once, which can be inefficient
+when iterating collections and saving each record individually. With the arrival of
+[`insert_all`](https://apidock.com/rails/ActiveRecord/Persistence/ClassMethods/insert_all)
+in Rails 6, which operates at the row level (that is, using `Hash` objects), GitLab has added a set
+of APIs that make it safe and simple to insert `ActiveRecord` objects in bulk.
+
+## Prepare `ApplicationRecord`s for bulk insertion
+
+In order for a model class to take advantage of the bulk insertion API, it has to include the
+`BulkInsertSafe` concern first:
+
+```ruby
+class MyModel < ApplicationRecord
+  # other includes here
+  # ...
+  include BulkInsertSafe # include this last
+
+  # ...
+end
+```
+
+The `BulkInsertSafe` concern has two functions:
+
+- It performs checks against your model class to ensure that it does not use ActiveRecord
+  APIs that are not safe to use with respect to bulk insertions (more on that below).
+- It adds a new class method `bulk_insert!`, which you can use to insert many records at once.
+
+## Insert records via `bulk_insert!`
+
+If the target class passes the checks performed by `BulkInsertSafe`, you can proceed to use
+the `bulk_insert!` class method as follows:
+
+```ruby
+records = [MyModel.new, ...]
+
+MyModel.bulk_insert!(records)
+```
+
+### Record validation
+
+The `bulk_insert!` method guarantees that `records` will be inserted transactionally, and
+will run validations on each record prior to insertion. If any record fails to validate,
+an error is raised and the transaction is rolled back. You can turn off validations via
+the `:validate` option:
+
+```ruby
+MyModel.bulk_insert!(records, validate: false)
+```
+
+### Batch size configuration
+
+In those cases where the number of `records` is above a given threshold, insertions will
+occur in multiple batches. The default batch size is defined in
+[`BulkInsertSafe::DEFAULT_BATCH_SIZE`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/models/concerns/bulk_insert_safe.rb).
+Assuming a default threshold of 500, inserting 950 records
+would result in two batches being written sequentially (of size 500 and 450 respectively.)
+You can override the default batch size via the `:batch_size` option:
+
+```ruby
+MyModel.bulk_insert!(records, batch_size: 100)
+```
+
+Assuming the same number of 950 records, this would result in 10 batches being written instead.
+Since this will also affect the number of `INSERT`s that occur, make sure you measure the
+performance impact this might have on your code. There is a trade-off between the number of
+`INSERT` statements the database has to process and the size and cost of each `INSERT`.
+
+### Requirements for safe bulk insertions
+
+Large parts of ActiveRecord's persistence API are built around the notion of callbacks. Many
+of these callbacks fire in response to model life cycle events such as `save` or `create`.
+These callbacks cannot be used with bulk insertions, since they are meant to be called for
+every instance that is saved or created. Since these events do not fire when
+records are inserted in bulk, we currently disallow their use.
+
+The specifics around which callbacks are disallowed are defined in
+[`BulkInsertSafe`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/models/concerns/bulk_insert_safe.rb).
+Consult the module source code for details. If your class uses any of the blacklisted
+functionality, and you `include BulkInsertSafe`, the application will fail with an error.
+
+### `BulkInsertSafe` versus `InsertAll`
+
+Internally, `BulkInsertSafe` is based on `InsertAll`, and you may wonder when to choose
+the former over the latter. To help you make the decision,
+the key differences between these classes are listed in the table below.
+
+| | Input type | Validates input | Specify batch size | Can bypass callbacks | Transactional |
+|--------------- | -------------------- | --------------- | ------------------ | --------------------------------- | ------------- |
+| `bulk_insert!` | ActiveRecord objects | Yes (optional) | Yes (optional) | No (prevents unsafe callback use) | Yes |
+| `insert_all!` | Attribute hashes | No | No | Yes | Yes |
+
+To summarize, `BulkInsertSafe` moves bulk inserts closer to how ActiveRecord objects
+and inserts would normally behave. However, if all you need is to insert raw data in bulk, then
+`insert_all` is more efficient.
+
+## Insert `has_many` associations in bulk
+
+A common use case is to save collections of associated relations through the owner side of the relation,
+where the owned relation is associated to the owner through the `has_many` class method:
+
+```ruby
+owner = OwnerModel.new(owned_relations: array_of_owned_relations)
+# saves all `owned_relations` one-by-one
+owner.save!
+```
+
+This will issue a single `INSERT`, and transaction, for every record in `owned_relations`, which is inefficient if
+`array_of_owned_relations` is large. To remedy this, the `BulkInsertableAssociations` concern can be
+used to declare that the owner defines associations that are safe for bulk insertion:
+
+```ruby
+class OwnerModel < ApplicationRecord
+  # other includes here
+  # ...
+  include BulkInsertableAssociations # include this last
+
+  has_many :my_models
+end
+```
+
+Here `my_models` must be declared `BulkInsertSafe` (as described previously) for bulk insertions
+to happen. You can now insert any yet unsaved records as follows:
+
+```ruby
+BulkInsertableAssociations.with_bulk_insert do
+  owner = OwnerModel.new(my_models: array_of_my_model_instances)
+  # saves `my_models` using a single bulk insert (possibly via multiple batches)
+  owner.save!
+end
+```
+
+Note that you can still save relations that are not `BulkInsertSafe` in this block; they will
+simply be treated as if you had invoked `save` from outside the block.
+
+## Known limitations
+
+There are a few restrictions to how these APIs can be used:
+
+- Bulk inserts only work for new records; `UPDATE`s or "upserts" are not supported yet.
+- `ON CONFLICT` behavior cannot currently be configured; an error will be raised on primary key conflicts.
+- `BulkInsertableAssociations` furthermore has the following restrictions:
+  - only compatible with `has_many` relations.
+  - does not support `has_many through: ...` relations.
+
+Moreover, input data should either be limited to around 1000 records at most,
+or already batched prior to calling bulk insert. The `INSERT` statement will run in a single
+transaction, so for large amounts of records it may negatively affect database stability.
diff --git a/doc/gitlab-basics/create-your-ssh-keys.md b/doc/gitlab-basics/create-your-ssh-keys.md
index 98f2679c9d6..9b3431a5a42 100644
--- a/doc/gitlab-basics/create-your-ssh-keys.md
+++ b/doc/gitlab-basics/create-your-ssh-keys.md
@@ -1,14 +1,13 @@
 ---
 type: howto
 ---
-
-# Create and add your SSH public key
+# Create and add your SSH key pair
 
 It is best practice to use [Git over SSH instead of Git over HTTP](https://git-scm.com/book/en/v2/Git-on-the-Server-The-Protocols).
 In order to use SSH, you will need to:
 
-1. [Create an SSH key pair](#creating-your-ssh-key-pair) on your local computer.
-1. [Add the key to GitLab](#adding-your-ssh-public-key-to-gitlab).
+1. Create an SSH key pair
+1. Add your SSH public key to GitLab
 
 ## Creating your SSH key pair
 
diff --git a/doc/user/project/settings/import_export.md b/doc/user/project/settings/import_export.md
index c69a4740ab3..9ff9f76dadb 100644
--- a/doc/user/project/settings/import_export.md
+++ b/doc/user/project/settings/import_export.md
@@ -29,10 +29,11 @@ Note the following:
 - Exports are stored in a temporary [shared directory](../../../development/shared_files.md)
   and are deleted every 24 hours by a specific worker.
 - Group members are exported as project members, as long as the user has
-  maintainer or admin access to the group where the exported project lives. Import admins should map users by email address.
+  maintainer or admin access to the group where the exported project lives.
+- Project members with owner access will be imported as maintainers.
+- Using an admin account to import will map users by email address (self-managed only).
   Otherwise, a supplementary comment is left to mention that the original author and
   the MRs, notes, or issues will be owned by the importer.
-- Project members with owner access will be imported as maintainers.
 - If an imported project contains merge requests originating from forks,
   then new branches associated with such merge requests will be created
   within a project during the import/export. Thus, the number of branches
@@ -142,4 +143,4 @@ To help avoid abuse, users are rate limited to:
 | ---------------- | --------------------------- |
 | Export | 1 project per 5 minutes |
 | Download export | 10 projects per 10 minutes |
-| Import | 30 projects per 10 minutes |
+| Import | 30 projects per 5 minutes |
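
The batch arithmetic in the "Batch size configuration" section of `doc/development/insert_into_tables_in_batches.md` (950 records giving two batches at a size of 500, and 10 batches at a size of 100) can be checked with a minimal plain-Ruby sketch. The `batch_sizes` helper is hypothetical and exists only for this illustration. It is not part of `BulkInsertSafe`, and it makes no claim about how `bulk_insert!` slices its input internally.

```ruby
# Hypothetical helper (not a GitLab API): returns the size of each batch
# when `record_count` records are written in slices of `batch_size`.
def batch_sizes(record_count, batch_size)
  # Enumerable#each_slice yields consecutive groups of at most `batch_size` items.
  (1..record_count).each_slice(batch_size).map(&:size)
end

p batch_sizes(950, 500)      # => [500, 450]
p batch_sizes(950, 100).size # => 10
```

Both results agree with the figures in the added documentation: the last batch holds the remainder, so a smaller batch size means more, smaller `INSERT` statements.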