author: Shinya Maeda <shinya@gitlab.com> 2018-05-07 16:21:09 +0900
committer: Shinya Maeda <shinya@gitlab.com> 2018-05-07 16:21:09 +0900
commit: b71320a29658ecb423b1f557656adba9f9b1c562
tree: 80d754cf64d1a444b61f7d36b2ecc8f9c76993f2 /doc/administration
parent: 1f39fcd1123c1a65798a0a0b3e5f3b2fa43651ac
Add doc about this architecture, impact, roadmap, etc
Diffstat (limited to 'doc/administration')
-rw-r--r-- doc/administration/job_traces.md 96
1 files changed, 96 insertions, 0 deletions
diff --git a/doc/administration/job_traces.md b/doc/administration/job_traces.md
index 84a1ffeec98..3470274e5ea 100644
--- a/doc/administration/job_traces.md
+++ b/doc/administration/job_traces.md
@@ -40,3 +40,99 @@ To change the location where the job logs will be stored, follow the steps below
[reconfigure gitlab]: restart_gitlab.md#omnibus-gitlab-reconfigure "How to reconfigure Omnibus GitLab"
[restart gitlab]: restart_gitlab.md#installations-from-source "How to restart GitLab"
+
+## New live trace architecture
+
+> [Introduced][ce-18169] in GitLab 10.4.
+
+> **Notes**:
+- This feature is still in Beta. It could impact GitLab.com and on-premises instances, and in the worst-case scenario, traces will be lost.
+- This feature is still being discussed in [an issue](https://gitlab.com/gitlab-org/gitlab-ce/issues/46097) for performance improvements.
+- This feature is off by default. See below for how to enable or disable it.
+
+**What is "live trace"?**
+
+A live trace is a job trace that exists while the job is being processed by GitLab-Runner. You can follow its progress on the job page (GUI).
+In contrast, once the job has finished, the trace is archived; this is called an "archived trace".
+
+**What is the new architecture?**
+
+Until now, when GitLab-Runner sent a job trace to GitLab-Rails, the trace was saved to File Storage as a text file.
+This was a problem for the [Cloud Native-compatible GitLab application](https://gitlab.com/gitlab-com/migration/issues/23), because
+GitLab-Rails had to rely on File Storage.
+
+This new live trace architecture stores traces in Redis and the database instead of File Storage.
+Redis is used as first-class trace storage and holds up to 128KB of each trace. Once that buffer is full, the data is flushed to the database. Later, the data in Redis and the database is archived to Object Storage.
+
+Here is the detailed data flow.
+
+1. GitLab-Runner picks a job from GitLab-Rails.
+1. GitLab-Runner sends a piece of trace to GitLab-Rails.
+1. GitLab-Rails appends the data to Redis.
+1. Once the data in Redis reaches 128KB, it is flushed to the database.
+1. Steps 2 to 4 repeat until the job is finished.
+1. Once the job is finished, GitLab-Rails schedules a Sidekiq worker to archive the trace.
+1. The Sidekiq worker archives the trace to Object Storage, and cleans up the trace in Redis and the database.
+
+**How to check if it's on or off**
+
+```ruby
+Feature.enabled?('ci_enable_live_trace')
+```
+
+**How to enable**
+
+```ruby
+Feature.enable('ci_enable_live_trace')
+```
+
+>**Note:**
+The transition period will be handled gracefully. Upcoming traces will be generated with the new architecture, and on-going live traces will stay with the legacy architecture (i.e. on-going live traces won't be re-generated forcibly with the new architecture).
+
+**How to disable**
+
+```ruby
+Feature.disable('ci_enable_live_trace')
+```
+
+>**Note:**
+The transition period will be handled gracefully. Upcoming traces will be generated with the legacy architecture, and on-going live traces will stay with the new architecture (i.e. on-going live traces won't be re-generated forcibly with the legacy architecture).
+
+**Redis namespace**
+
+`Gitlab::Redis::SharedState`
+
+**Potential impact**
+
+- This feature could cause data loss:
+  - Case 1: When all data in Redis is accidentally flushed.
+    - On-going live traces can be recovered by re-sending traces (this is supported by all versions of GitLab-Runner).
+    - Finished jobs whose live traces have not been archived yet will lose the last part (~128KB) of the trace data.
+  - Case 2: When Sidekiq workers fail to archive (e.g. a bug prevents the archiving process, Sidekiq inconsistency, etc.).
+    - Currently all trace data in Redis is deleted after one week. If the Sidekiq workers can't finish by the expiry date, that part of the trace data will be lost.
+- This feature could consume all memory on the Redis instance. With 1000 jobs, 128KB * 1000 = 128MB is consumed.
+- This feature could put pressure on the database instance. An `INSERT` is issued for every 128KB of each job's trace. An `UPDATE` is issued under the same condition, but only if the total size of the trace exceeds 128KB.
+- and so on
+
+**How to test**
+
+We're currently evaluating this feature on dev.gitlab.org or staging.gitlab.com to verify it. Here is the list of tests and measurements.
+
+- Features
+  - Live traces should be visible on job pages
+  - Archived traces should be visible on job pages
+  - Live traces should be archived to Object Storage
+  - Live traces should be cleaned up after being archived
+  - etc
+- Performance
+  - Schedule 1000~10000 jobs and let GitLab-Runners process them concurrently. Measure memory pressure, IO load, etc.
+  - etc
+- Failover
+  - Simulate Redis outage
+  - etc
+
+**How to verify the correctness**
+
+ - TBD
+
+[ce-18169]: https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/18169
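
The data flow described in the patch above (append to Redis, flush to the database at 128KB, then archive and clean up) can be illustrated with a small in-memory Ruby model. This is only a sketch under stated assumptions, not GitLab's implementation: the class name `LiveTraceSketch`, its methods, and the use of plain Ruby strings and arrays in place of Redis, the database, and Object Storage are all hypothetical.

```ruby
# Hypothetical in-memory model of the live trace data flow; not GitLab code.
# A String stands in for the Redis entry of one job, an Array for database
# rows, and a single String for the archived trace in Object Storage.
class LiveTraceSketch
  CHUNK_SIZE = 128 * 1024 # 128KB, the buffer size stated in the doc

  attr_reader :redis_buffer, :database_chunks, :object_storage

  def initialize
    @redis_buffer = ''.dup
    @database_chunks = []
    @object_storage = nil
  end

  # Steps 2 to 4: append a piece of trace and flush every full 128KB chunk
  # from the Redis stand-in to the database stand-in.
  def append(data)
    @redis_buffer << data
    while @redis_buffer.bytesize >= CHUNK_SIZE
      @database_chunks << @redis_buffer.byteslice(0, CHUNK_SIZE)
      @redis_buffer = @redis_buffer.byteslice(CHUNK_SIZE..-1) || ''.dup
    end
  end

  # Steps 6 and 7: archive the whole trace, then clean up both stores.
  def archive
    @object_storage = @database_chunks.join + @redis_buffer
    @database_chunks.clear
    @redis_buffer = ''.dup
    @object_storage
  end
end
```

In this model, appending 200KB to a fresh `LiveTraceSketch` leaves one 128KB chunk in `database_chunks` and the remaining 72KB in `redis_buffer`. That remainder corresponds to the "last part (~128KB)" the doc says can be lost if Redis is flushed before archiving.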