Diffstat (limited to 'doc/administration/monitoring/prometheus/gitlab_metrics.md')
-rw-r--r-- | doc/administration/monitoring/prometheus/gitlab_metrics.md | 54 |
1 files changed, 50 insertions, 4 deletions
diff --git a/doc/administration/monitoring/prometheus/gitlab_metrics.md b/doc/administration/monitoring/prometheus/gitlab_metrics.md
index 3bfcc9a289e..48bd709a2b7 100644
--- a/doc/administration/monitoring/prometheus/gitlab_metrics.md
+++ b/doc/administration/monitoring/prometheus/gitlab_metrics.md
@@ -43,10 +43,52 @@ The following metrics are available:
 | redis_ping_latency_seconds | Gauge | 9.4 | Round trip time of the redis ping |
 | user_session_logins_total | Counter | 9.4 | Counter of how many users have logged in |
 | upload_file_does_not_exist | Counter | 10.7 in EE, 11.5 in CE | Number of times an upload record could not find its file |
-| failed_login_captcha_total | Gauge | 11.0 | Counter of failed CAPTCHA attempts during login |
-| successful_login_captcha_total | Gauge | 11.0 | Counter of successful CAPTCHA attempts during login |
-| unicorn_active_connections | Gauge | 11.0 | The number of active Unicorn connections (workers) |
-| unicorn_queued_connections | Gauge | 11.0 | The number of queued Unicorn connections |
+| failed_login_captcha_total | Gauge | 11.0 | Counter of failed CAPTCHA attempts during login |
+| successful_login_captcha_total | Gauge | 11.0 | Counter of successful CAPTCHA attempts during login |
+| unicorn_active_connections | Gauge | 11.0 | The number of active Unicorn connections (workers) |
+| unicorn_queued_connections | Gauge | 11.0 | The number of queued Unicorn connections |
+| unicorn_workers | Gauge | 11.11 | The number of Unicorn workers |
+
+## Sidekiq Metrics available for Geo **[PREMIUM]**
+
+Sidekiq jobs may also gather metrics, and these metrics can be accessed if the Sidekiq exporter is enabled (e.g. via
+the `monitoring.sidekiq_exporter` configuration option in `gitlab.yml`.
+
+| Metric | Type | Since | Description | Labels |
+|:-------------------------------------------- |:------- |:----- |:----------- |:------ |
+| geo_db_replication_lag_seconds | Gauge | 10.2 | Database replication lag (seconds) | url
+| geo_repositories | Gauge | 10.2 | Total number of repositories available on primary | url
+| geo_repositories_synced | Gauge | 10.2 | Number of repositories synced on secondary | url
+| geo_repositories_failed | Gauge | 10.2 | Number of repositories failed to sync on secondary | url
+| geo_lfs_objects | Gauge | 10.2 | Total number of LFS objects available on primary | url
+| geo_lfs_objects_synced | Gauge | 10.2 | Number of LFS objects synced on secondary | url
+| geo_lfs_objects_failed | Gauge | 10.2 | Number of LFS objects failed to sync on secondary | url
+| geo_attachments | Gauge | 10.2 | Total number of file attachments available on primary | url
+| geo_attachments_synced | Gauge | 10.2 | Number of attachments synced on secondary | url
+| geo_attachments_failed | Gauge | 10.2 | Number of attachments failed to sync on secondary | url
+| geo_last_event_id | Gauge | 10.2 | Database ID of the latest event log entry on the primary | url
+| geo_last_event_timestamp | Gauge | 10.2 | UNIX timestamp of the latest event log entry on the primary | url
+| geo_cursor_last_event_id | Gauge | 10.2 | Last database ID of the event log processed by the secondary | url
+| geo_cursor_last_event_timestamp | Gauge | 10.2 | Last UNIX timestamp of the event log processed by the secondary | url
+| geo_status_failed_total | Counter | 10.2 | Number of times retrieving the status from the Geo Node failed | url
+| geo_last_successful_status_check_timestamp | Gauge | 10.2 | Last timestamp when the status was successfully updated | url
+| geo_lfs_objects_synced_missing_on_primary | Gauge | 10.7 | Number of LFS objects marked as synced due to the file missing on the primary | url
+| geo_job_artifacts_synced_missing_on_primary | Gauge | 10.7 | Number of job artifacts marked as synced due to the file missing on the primary | url
+| geo_attachments_synced_missing_on_primary | Gauge | 10.7 | Number of attachments marked as synced due to the file missing on the primary | url
+| geo_repositories_checksummed_count | Gauge | 10.7 | Number of repositories checksummed on primary | url
+| geo_repositories_checksum_failed_count | Gauge | 10.7 | Number of repositories failed to calculate the checksum on primary | url
+| geo_wikis_checksummed_count | Gauge | 10.7 | Number of wikis checksummed on primary | url
+| geo_wikis_checksum_failed_count | Gauge | 10.7 | Number of wikis failed to calculate the checksum on primary | url
+| geo_repositories_verified_count | Gauge | 10.7 | Number of repositories verified on secondary | url
+| geo_repositories_verification_failed_count | Gauge | 10.7 | Number of repositories failed to verify on secondary | url
+| geo_repositories_checksum_mismatch_count | Gauge | 10.7 | Number of repositories that checksum mismatch on secondary | url
+| geo_wikis_verified_count | Gauge | 10.7 | Number of wikis verified on secondary | url
+| geo_wikis_verification_failed_count | Gauge | 10.7 | Number of wikis failed to verify on secondary | url
+| geo_wikis_checksum_mismatch_count | Gauge | 10.7 | Number of wikis that checksum mismatch on secondary | url
+| geo_repositories_checked_count | Gauge | 11.1 | Number of repositories that have been checked via `git fsck` | url
+| geo_repositories_checked_failed_count | Gauge | 11.1 | Number of repositories that have a failure from `git fsck` | url
+| geo_repositories_retrying_verification_count | Gauge | 11.2 | Number of repositories verification failures that Geo is actively trying to correct on secondary | url
+| geo_wikis_retrying_verification_count | Gauge | 11.2 | Number of wikis verification failures that Geo is actively trying to correct on secondary | url
 
 ### Ruby metrics
 
@@ -59,6 +101,10 @@ Some basic Ruby runtime metrics are available:
 | ruby_file_descriptors | Gauge | 11.1 | File descriptors per process |
 | ruby_memory_bytes | Gauge | 11.1 | Memory usage by process |
 | ruby_sampler_duration_seconds_total | Counter | 11.1 | Time spent collecting stats |
+| ruby_process_cpu_seconds_total | Gauge | 11.11 | Total amount of CPU time per process |
+| ruby_process_max_fds | Gauge | 11.11 | Maximum number of open file descriptors per process |
+| ruby_process_resident_memory_bytes | Gauge | 11.11 | Memory usage by process, measured in bytes |
+| ruby_process_start_time_seconds | Gauge | 11.11 | The elapsed time between system boot and the process started, measured in seconds |
 
 [GC.stat]: https://ruby-doc.org/core-2.3.0/GC.html#method-c-stat
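The Geo gauges documented in this change carry a `url` label, so a per-node sync ratio can be derived from one scrape of the Sidekiq exporter. What follows is a minimal sketch, not GitLab code: the sample payload, its values, and the `secondary.example.com` URL are invented for illustration, and `parse_metrics` and `sync_ratio` are hypothetical helpers that handle only the simplest form of the Prometheus text format (no timestamps, no escaped label values).

```python
import re

# Hypothetical scrape output. Metric names come from the table in the diff;
# the values and the URL are invented for illustration only.
SAMPLE = """\
# HELP geo_repositories Total number of repositories available on primary
# TYPE geo_repositories gauge
geo_repositories{url="https://secondary.example.com"} 200
geo_repositories_synced{url="https://secondary.example.com"} 150
geo_repositories_failed{url="https://secondary.example.com"} 5
"""

# Simplified sample line: name, optional {labels}, then a numeric value.
LINE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_metrics(text):
    """Map (metric name, sorted label pairs) to a float value."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match is None:
            continue  # ignore anything this simplified parser cannot read
        labels = tuple(sorted(LABEL_RE.findall(match.group("labels") or "")))
        samples[(match.group("name"), labels)] = float(match.group("value"))
    return samples


def sync_ratio(samples, url):
    """Return synced/total repositories for one node, or None if unavailable."""
    labels = (("url", url),)
    total = samples.get(("geo_repositories", labels))
    synced = samples.get(("geo_repositories_synced", labels))
    if not total or synced is None:
        return None
    return synced / total


if __name__ == "__main__":
    print(sync_ratio(parse_metrics(SAMPLE), "https://secondary.example.com"))
```

In practice a PromQL expression over the same two gauges would likely be the more natural choice; the script only shows how the `url` label ties the gauges together.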