author | GitLab Bot <gitlab-bot@gitlab.com> | 2020-08-20 18:42:06 +0000 |
---|---|---|
committer | GitLab Bot <gitlab-bot@gitlab.com> | 2020-08-20 18:42:06 +0000 |
commit | 6e4e1050d9dba2b7b2523fdd1768823ab85feef4 (patch) | |
tree | 78be5963ec075d80116a932011d695dd33910b4e /doc/operations/incident_management | |
parent | 1ce776de4ae122aba3f349c02c17cebeaa8ecf07 (diff) | |
download | gitlab-ce-6e4e1050d9dba2b7b2523fdd1768823ab85feef4.tar.gz |
Add latest changes from gitlab-org/gitlab@13-3-stable-ee
Diffstat (limited to 'doc/operations/incident_management')
26 files changed, 662 insertions, 0 deletions
diff --git a/doc/operations/incident_management/alertdetails.md b/doc/operations/incident_management/alertdetails.md new file mode 100644 index 00000000000..774eaee286f --- /dev/null +++ b/doc/operations/incident_management/alertdetails.md @@ -0,0 +1,194 @@ +--- +stage: Monitor +group: Health +info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#designated-technical-writers +--- + +# Alert details page + +Navigate to the Alert details view by visiting the +[Alert list](alerts.md) and selecting an alert from the +list. You need at least Developer [permissions](../../user/permissions.md) to access +alerts. + +Alerts provide **Overview** and **Alert details** tabs to give you the right +amount of information you need. + +## Alert overview tab + +The **Overview** tab provides basic information about the alert: + +![Alert Detail Overview](img/alert_detail_overview_v13_1.png) + +## Alert details tab + +![Alert Full Details](img/alert_detail_full_v13_1.png) + +### Update an Alert's status + +The Alert detail view enables you to update the Alert Status. +See [Create and manage alerts in GitLab](alerts.md) for more details. + +### Create an Issue from an Alert + +> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/217745) in GitLab 13.1. + +The Alert detail view enables you to create an issue with a +description automatically populated from an alert. To create the issue, +click the **Create Issue** button. You can then view the issue from the +alert by clicking the **View Issue** button. + +Closing a GitLab issue associated with an alert changes the alert's status to Resolved. +See [Create and manage alerts in GitLab](alerts.md) for more details about alert statuses. + +### Update an Alert's assignee + +> [Introduced](https://gitlab.com/groups/gitlab-org/-/epics/3066) in GitLab 13.1. + +The Alert detail view allows users to update the Alert assignee. 
+ +In large teams, where there is shared ownership of an alert, it can be difficult +to track who is investigating and working on it. The Alert detail view +enables you to update the Alert assignee: + +NOTE: **Note:** +GitLab currently only supports a single assignee per alert. + +1. To display the list of current alerts, click + **{cloud-gear}** **Operations > Alerts**: + + ![Alert List View Assignee(s)](img/alert_list_assignees_v13_1.png) + +1. Select your desired alert to display its **Alert Details View**: + + ![Alert Details View Assignee(s)](img/alert_details_assignees_v13_1.png) + +1. If the right sidebar is not expanded, click + **{angle-double-right}** **Expand sidebar** to expand it. +1. In the right sidebar, locate the **Assignee** and click **Edit**. From the + dropdown menu, select each user you want to assign to the alert. GitLab creates + a [To-Do list item](../../user/todos.md) for each user. + + ![Alert Details View Assignee(s)](img/alert_todo_assignees_v13_1.png) + +To remove an assignee, click **Edit** next to the **Assignee** dropdown menu and +deselect the user from the list of assignees, or click **Unassigned**. + +### Alert system notes + +> [Introduced](https://gitlab.com/groups/gitlab-org/-/epics/3066) in GitLab 13.1. + +When you take action on an alert, this is logged as a system note, +which is visible in the Alert Details view. This gives you a linear +timeline of the alert's investigation and assignment history. + +The following actions will result in a system note: + +- [Updating the status of an alert](#update-an-alerts-status) +- [Creating an issue based on an alert](#create-an-issue-from-an-alert) +- [Assignment of an alert to a user](#update-an-alerts-assignee) + +![Alert Details View System Notes](img/alert_detail_system_notes_v13_1.png) + +### Create a To-Do from an Alert + +> [Introduced](https://gitlab.com/groups/gitlab-org/-/epics/3066) in GitLab 13.1. 
+ +You can manually create [To-Do list items](../../user/todos.md) for yourself from the +Alert details screen, and view them later on your **To-Do List**. To add a To-Do: + +1. To display the list of current alerts, click + **{cloud-gear}** **Operations > Alerts**. +1. Select your desired alert to display its **Alert Management Details View**. +1. Click the **Add a To-Do** button in the right sidebar: + + ![Alert Details Add A To Do](img/alert_detail_add_todo_v13_1.png) + +Click the **To-Do** **{todo-done}** in the navigation bar to view your current To-Do list. + +![Alert Details Added to Do](img/alert_detail_added_todo_v13_1.png) + +### View an Alert's metrics data + +> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/217768) in GitLab 13.2. + +To view the metrics for an alert: + + 1. Sign in as a user with Developer or higher [permissions](../../user/permissions.md). + 1. Navigate to **{cloud-gear}** **Operations > Alerts**. + 1. Click the alert you want to view. + 1. Below the title of the alert, click the **Metrics** tab. + +![Alert Metrics View](img/alert_detail_metrics_v13_2.png) + +For GitLab-managed Prometheus instances, metrics data is automatically available +for the alert, making it easy to see surrounding behavior. See +[Managed Prometheus instances](../metrics/alerts.md#managed-prometheus-instances) +for information on setting up alerts. + +For externally-managed Prometheus instances, you can configure your alerting rules to +display a chart in the alert. See +[Embedding metrics based on alerts in incident issues](../metrics/embed.md#embedding-metrics-based-on-alerts-in-incident-issues) +for information on how to appropriately configure your alerting rules. See +[External Prometheus instances](../metrics/alerts.md#external-prometheus-instances) +for information on setting up alerts for your self-managed Prometheus instance. 
+ +## Use cases for assigning alerts + +Consider a team formed by different sections of monitoring, collaborating on a +single application. After an alert surfaces, it's extremely important to +route the alert to the team members who can address and resolve the alert. + +Assigning Alerts eases collaboration and delegation. All +assignees are shown in your team's work-flows, and all assignees receive +notifications, simplifying communication and ownership of the alert. + +After completing their portion of investigating or fixing the alert, users can +unassign their account from the alert when their role is complete. +The alert status can be updated on the [Alert list](alerts.md) to +reflect if the alert has been resolved. + +## View an Alert's logs + +> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/217768) in GitLab 13.3. + +To view the logs for an alert: + + 1. Sign in as a user with Developer or higher [permissions](../../user/permissions.md). + 1. Navigate to **{cloud-gear}** **Operations > Alerts**. + 1. Click the alert you want to view. + 1. Below the title of the alert, click the **Metrics** tab. + 1. Click the [menu](../metrics/dashboards/index.md#chart-context-menu) of the metric chart to view options. + 1. Click **View logs**. + +Read [View logs from metrics panel](#view-logs-from-metrics-panel) for additional information. + +## Embed metrics in incidents and issues + +You can embed metrics anywhere [GitLab Markdown](../../user/markdown.md) is used, such as descriptions, +comments on issues, and merge requests. Embedding metrics helps you share them +when discussing incidents or performance issues. You can output the dashboard directly +into any issue, merge request, epic, or any other Markdown text field in GitLab +by [copying and pasting the link to the metrics dashboard](../metrics/embed.md#embedding-gitlab-managed-kubernetes-metrics). 
+ +You can embed both +[GitLab-hosted metrics](../metrics/embed.md) and +[Grafana metrics](../metrics/embed_grafana.md) +in incidents and issue templates. + +### Context menu + +You can view more details about an embedded metrics panel from the context menu. +To access the context menu, click the **{ellipsis_v}** **More actions** dropdown box +above the upper right corner of the panel. For a list of options, see +[Chart context menu](../metrics/dashboards/index.md#chart-context-menu). + +#### View logs from metrics panel + +> - [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/201846) in GitLab Ultimate 12.8. +> - [Moved](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/25455) to [GitLab Core](https://about.gitlab.com/pricing/) 12.9. + +Viewing logs from a metrics panel can be useful if you're triaging an application +incident and need to [explore logs](../metrics/dashboards/index.md#chart-context-menu) +from across your application. These logs help you understand what is affecting +your application's performance and resolve any problems. diff --git a/doc/operations/incident_management/alerts.md b/doc/operations/incident_management/alerts.md new file mode 100644 index 00000000000..5a5fc59d5e3 --- /dev/null +++ b/doc/operations/incident_management/alerts.md @@ -0,0 +1,118 @@ +--- +stage: Monitor +group: Health +info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#designated-technical-writers +--- + +# Create and manage alerts in GitLab + +Users with at least Developer [permissions](../../user/permissions.md) can access +the Alert Management list at **{cloud-gear}** **Operations > Alerts** in your +project's sidebar. The Alert Management list displays alerts sorted by start time, +but you can change the sort order by clicking the headers in the Alert Management list. 
+([Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/217745) in GitLab 13.1.) + +The alert list displays the following information: + +![Alert List](../../user/project/operations/img/alert_list_v13_1.png) + +- **Search** - The alert list supports a simple free text search on the title, + description, monitoring tool, and service fields. + ([Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/213884) in GitLab 13.1.) +- **Severity** - The current importance of an alert and how much attention it should + receive. For a listing of all severities, read [Alert Management severity](#alert-severity). +- **Start time** - How long ago the alert fired. This field uses the standard + GitLab pattern of `X time ago`, but is supported by a granular date/time tooltip + depending on the user's locale. +- **Alert description** - The description of the alert, which attempts to capture the most meaningful data. +- **Event count** - The number of times that an alert has fired. +- **Issue** - A link to the incident issue that has been created for the alert. +- **Status** - The current status of the alert: + - **Triggered**: No one has begun investigation. + - **Acknowledged**: Someone is actively investigating the problem. + - **Resolved**: No further work is required. + +## Enable Alerts + +NOTE: **Note:** +You need at least Maintainer [permissions](../../user/permissions.md) to enable +the Alerts feature. + +There are several ways to accept alerts into your GitLab project. +Enabling any of these methods enables the Alert list. After configuring +alerts, visit **{cloud-gear}** **Operations > Alerts** in your project's sidebar +to view the list of alerts. + +### Enable GitLab-managed Prometheus alerts + +You can install the GitLab-managed Prometheus application on your Kubernetes +cluster. For more information, read +[Managed Prometheus on Kubernetes](../../user/project/integrations/prometheus.md#managed-prometheus-on-kubernetes). 
+When GitLab-managed Prometheus is installed, the [Alerts list](alerts.md) +is also enabled. + +To populate the alerts with data, read +[GitLab-Managed Prometheus instances](../metrics/alerts.md#managed-prometheus-instances). + +### Enable external Prometheus alerts + +You can configure an externally-managed Prometheus instance to send alerts +to GitLab. To set up this configuration, read the [configuring Prometheus](../metrics/alerts.md#external-prometheus-instances) documentation. Activating the external Prometheus +configuration also enables the [Alerts list](alerts.md). + +To populate the alerts with data, read +[External Prometheus instances](../metrics/alerts.md#external-prometheus-instances). + +### Enable a Generic Alerts endpoint + +GitLab provides the Generic Alerts endpoint so you can accept alerts from a third-party +alerts service. Read the +[instructions for toggling generic alerts](../../user/project/integrations/generic_alerts.md#setting-up-generic-alerts) +to add this option. After configuring the endpoint, the +[Alerts list](alerts.md) is enabled. + +To populate the alerts with data, read [Customizing the payload](../../user/project/integrations/generic_alerts.md#customizing-the-payload) for requests to the alerts endpoint. + +### Opsgenie integration **(PREMIUM)** + +> [Introduced](https://gitlab.com/groups/gitlab-org/-/epics/3066) in [GitLab Premium](https://about.gitlab.com/pricing/) 13.2. + +A new way of monitoring Alerts via a GitLab integration is with +[Opsgenie](https://www.atlassian.com/software/opsgenie). + +NOTE: **Note:** +If you enable the Opsgenie integration, you can't have other GitLab alert services, +such as [Generic Alerts](../../user/project/integrations/generic_alerts.md) or +Prometheus alerts, active at the same time. + +To enable Opsgenie integration: + +1. Sign in as a user with Maintainer or Owner [permissions](../../user/permissions.md). +1. Navigate to **{cloud-gear}** **Operations > Alerts**. +1. 
In the **Integrations** select box, select Opsgenie. +1. Click the **Active** toggle. +1. In the **API URL**, enter the base URL for your Opsgenie integration, such + as `https://app.opsgenie.com/alert/list`. +1. Click **Save changes**. + +After enabling the integration, navigate to the Alerts list page at +**{cloud-gear}** **Operations > Alerts**, and click **View alerts in Opsgenie**. + +## Alert severity + +Each level of alert contains a uniquely shaped and color-coded icon to help +you identify the severity of a particular alert. These severity icons help you +immediately identify which alerts you should prioritize investigating: + +![Alert Management Severity System](img/alert_management_severity_v13_0.png) + +Alerts contain one of the following icons: + +| Severity | Icon | Color (hexadecimal) | +|---|---|---| +| Critical | **{severity-critical}** | `#8b2615` | +| High | **{severity-high}** | `#c0341d` | +| Medium | **{severity-medium}** | `#fca429` | +| Low | **{severity-low}** | `#fdbc60` | +| Info | **{severity-info}** | `#418cd8` | +| Unknown | **{severity-unknown}** | `#bababa` | diff --git a/doc/operations/incident_management/img/alert_detail_add_todo_v13_1.png b/doc/operations/incident_management/img/alert_detail_add_todo_v13_1.png Binary files differnew file mode 100644 index 00000000000..39aa9e33728 --- /dev/null +++ b/doc/operations/incident_management/img/alert_detail_add_todo_v13_1.png diff --git a/doc/operations/incident_management/img/alert_detail_added_todo_v13_1.png b/doc/operations/incident_management/img/alert_detail_added_todo_v13_1.png Binary files differnew file mode 100644 index 00000000000..ae874706895 --- /dev/null +++ b/doc/operations/incident_management/img/alert_detail_added_todo_v13_1.png diff --git a/doc/operations/incident_management/img/alert_detail_full_v13_1.png b/doc/operations/incident_management/img/alert_detail_full_v13_1.png Binary files differnew file mode 100644 index 00000000000..18a6f4fb67b --- /dev/null +++ 
b/doc/operations/incident_management/img/alert_detail_full_v13_1.png diff --git a/doc/operations/incident_management/img/alert_detail_metrics_v13_2.png b/doc/operations/incident_management/img/alert_detail_metrics_v13_2.png Binary files differnew file mode 100644 index 00000000000..84d83365ea8 --- /dev/null +++ b/doc/operations/incident_management/img/alert_detail_metrics_v13_2.png diff --git a/doc/operations/incident_management/img/alert_detail_overview_v13_1.png b/doc/operations/incident_management/img/alert_detail_overview_v13_1.png Binary files differnew file mode 100644 index 00000000000..10c945d3810 --- /dev/null +++ b/doc/operations/incident_management/img/alert_detail_overview_v13_1.png diff --git a/doc/operations/incident_management/img/alert_detail_system_notes_v13_1.png b/doc/operations/incident_management/img/alert_detail_system_notes_v13_1.png Binary files differnew file mode 100644 index 00000000000..2a6d0320a54 --- /dev/null +++ b/doc/operations/incident_management/img/alert_detail_system_notes_v13_1.png diff --git a/doc/operations/incident_management/img/alert_details_assignees_v13_1.png b/doc/operations/incident_management/img/alert_details_assignees_v13_1.png Binary files differnew file mode 100644 index 00000000000..dab4eac384a --- /dev/null +++ b/doc/operations/incident_management/img/alert_details_assignees_v13_1.png diff --git a/doc/operations/incident_management/img/alert_list_assignees_v13_1.png b/doc/operations/incident_management/img/alert_list_assignees_v13_1.png Binary files differnew file mode 100644 index 00000000000..db1e0d8dcb7 --- /dev/null +++ b/doc/operations/incident_management/img/alert_list_assignees_v13_1.png diff --git a/doc/operations/incident_management/img/alert_list_search_v13_1.png b/doc/operations/incident_management/img/alert_list_search_v13_1.png Binary files differnew file mode 100644 index 00000000000..ba993fe530b --- /dev/null +++ b/doc/operations/incident_management/img/alert_list_search_v13_1.png diff --git 
a/doc/operations/incident_management/img/alert_list_sort_v13_1.png b/doc/operations/incident_management/img/alert_list_sort_v13_1.png Binary files differnew file mode 100644 index 00000000000..8e06c3478f7 --- /dev/null +++ b/doc/operations/incident_management/img/alert_list_sort_v13_1.png diff --git a/doc/operations/incident_management/img/alert_management_severity_v13_0.png b/doc/operations/incident_management/img/alert_management_severity_v13_0.png Binary files differnew file mode 100644 index 00000000000..f996d6e88f4 --- /dev/null +++ b/doc/operations/incident_management/img/alert_management_severity_v13_0.png diff --git a/doc/operations/incident_management/img/alert_todo_assignees_v13_1.png b/doc/operations/incident_management/img/alert_todo_assignees_v13_1.png Binary files differnew file mode 100644 index 00000000000..637f8be5d25 --- /dev/null +++ b/doc/operations/incident_management/img/alert_todo_assignees_v13_1.png diff --git a/doc/operations/incident_management/img/incident_list.png b/doc/operations/incident_management/img/incident_list.png Binary files differnew file mode 100644 index 00000000000..0498fec6c9c --- /dev/null +++ b/doc/operations/incident_management/img/incident_list.png diff --git a/doc/operations/incident_management/img/incident_list_create_v13_3.png b/doc/operations/incident_management/img/incident_list_create_v13_3.png Binary files differnew file mode 100644 index 00000000000..a000c849099 --- /dev/null +++ b/doc/operations/incident_management/img/incident_list_create_v13_3.png diff --git a/doc/operations/incident_management/img/incident_list_search_v13_3.png b/doc/operations/incident_management/img/incident_list_search_v13_3.png Binary files differnew file mode 100644 index 00000000000..293268986cc --- /dev/null +++ b/doc/operations/incident_management/img/incident_list_search_v13_3.png diff --git a/doc/operations/incident_management/img/incident_list_sort_v13_3.png b/doc/operations/incident_management/img/incident_list_sort_v13_3.png 
Binary files differnew file mode 100644 index 00000000000..4a263aa188e --- /dev/null +++ b/doc/operations/incident_management/img/incident_list_sort_v13_3.png diff --git a/doc/operations/incident_management/img/incident_management_settings_v13_3.png b/doc/operations/incident_management/img/incident_management_settings_v13_3.png Binary files differnew file mode 100644 index 00000000000..c9520860414 --- /dev/null +++ b/doc/operations/incident_management/img/incident_management_settings_v13_3.png diff --git a/doc/operations/incident_management/img/pagerduty_incidents_integration_v13_3.png b/doc/operations/incident_management/img/pagerduty_incidents_integration_v13_3.png Binary files differnew file mode 100644 index 00000000000..0991e963e02 --- /dev/null +++ b/doc/operations/incident_management/img/pagerduty_incidents_integration_v13_3.png diff --git a/doc/operations/incident_management/img/status_page_detail_link_v13_1.png b/doc/operations/incident_management/img/status_page_detail_link_v13_1.png Binary files differnew file mode 100644 index 00000000000..f3d1005447c --- /dev/null +++ b/doc/operations/incident_management/img/status_page_detail_link_v13_1.png diff --git a/doc/operations/incident_management/img/status_page_detail_v12_10.png b/doc/operations/incident_management/img/status_page_detail_v12_10.png Binary files differnew file mode 100644 index 00000000000..d8dbbb539e6 --- /dev/null +++ b/doc/operations/incident_management/img/status_page_detail_v12_10.png diff --git a/doc/operations/incident_management/img/status_page_incidents_v12_10.png b/doc/operations/incident_management/img/status_page_incidents_v12_10.png Binary files differnew file mode 100644 index 00000000000..3540fbffcf8 --- /dev/null +++ b/doc/operations/incident_management/img/status_page_incidents_v12_10.png diff --git a/doc/operations/incident_management/incidents.md b/doc/operations/incident_management/incidents.md new file mode 100644 index 00000000000..0668dc72c22 --- /dev/null +++ 
b/doc/operations/incident_management/incidents.md @@ -0,0 +1,103 @@ +--- +stage: Monitor +group: Health +info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#designated-technical-writers +--- + +# Create and manage incidents in GitLab + +While no configuration is required to use the [manual features](#create-an-incident-manually) +of incident management, some simple [configuration](#configure-incidents) is needed to automate incident creation. + +For users with at least Developer [permissions](../../user/permissions.md), the +Incident Management list is available at **Operations > Incidents** +in your project's sidebar. The list contains the following metrics: + +![Incident List](img/incident_list_sort_v13_3.png) + +- **Status** - To filter incidents by their status, click **Open**, **Closed**, + or **All** above the incident list. +- **Search** - The Incident list supports a simple free text search, which filters + on the **Title** and **Incident** fields. +- **Incident** - The description of the incident, which attempts to capture the + most meaningful data. +- **Date created** - How long ago the incident was created. This field uses the + standard GitLab pattern of `X time ago`, but is supported by a granular date/time + tooltip depending on the user's locale. +- **Assignees** - The user assigned to the incident. +- **Published** - Displays a green check mark (**{check-circle}**) if the incident is published + to a [Status Page](status_page.md). **(ULTIMATE)** + +The Incident list displays incidents sorted by incident created date. +([Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/229534) to GitLab core in 13.3.) +To see if a column is sortable, point your mouse at the header. Sortable columns +display an arrow next to the column name. + +NOTE: **Note:** +Incidents share the [Issues API](../../user/project/issues/index.md). 
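Because incidents share the Issues API, a project's incidents can in principle be listed with an ordinary issues request. The following is a minimal sketch, assuming incidents carry the `incident` label (as described for manually created incidents) and that the REST endpoint `GET /projects/:id/issues` with its `labels` and `state` parameters is used. The helper name `incident_issues_url` and the host `gitlab.example.com` are hypothetical, and the sketch only builds the URL without sending a request:

```python
from urllib.parse import quote, urlencode


def incident_issues_url(base_url, project, state="opened", label="incident"):
    """Build an Issues API URL that lists a project's incident issues.

    Assumption: incidents are issues carrying the `incident` label.
    `project` may be a numeric ID or a URL-encodable path such as "group/app".
    """
    project_id = quote(str(project), safe="")  # "group/app" -> "group%2Fapp"
    query = urlencode({"labels": label, "state": state})
    return f"{base_url.rstrip('/')}/api/v4/projects/{project_id}/issues?{query}"


if __name__ == "__main__":
    # Hypothetical host and project path, for illustration only.
    print(incident_issues_url("https://gitlab.example.com", "group/app"))
```

To actually fetch the list, the URL would be requested with an access token (for example in a `PRIVATE-TOKEN` header). Passing `state="closed"` or `state="all"` mirrors the **Closed** and **All** filters above the incident list.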
+ +## Configure incidents + +> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/4925) in GitLab Ultimate 11.11. + +With Maintainer or higher [permissions](../../user/permissions.md), you can enable +or disable Incident Management features in the GitLab user interface +to create issues when alerts are triggered: + +1. Navigate to **Settings > Operations > Incidents** and expand + **Incidents**: + + ![Incident Management Settings](img/incident_management_settings_v13_3.png) + +1. For GitLab versions 11.11 and greater, you can select the **Create an issue** + checkbox to create an issue based on your own + [issue templates](../../user/project/description_templates.md#creating-issue-templates). + For more information, see + [Trigger actions from alerts](../metrics/alerts.md#trigger-actions-from-alerts-ultimate) **(ULTIMATE)**. +1. To create issues from alerts, select the template in the **Issue Template** + select box. +1. To send [separate email notifications](index.md#notify-developers-of-alerts) to users + with [Developer permissions](../../user/permissions.md), select + **Send a separate email notification to Developers**. +1. Click **Save changes**. + +Appropriately configured alerts include an +[embedded chart](../metrics/embed.md#embedding-metrics-based-on-alerts-in-incident-issues) +for the query corresponding to the alert. You can also configure GitLab to +[close issues](../metrics/alerts.md#trigger-actions-from-alerts-ultimate) +when you receive notification that the alert is resolved. + +## Create an incident manually + +> [Moved](https://gitlab.com/gitlab-org/monitor/health/-/issues/24) to GitLab core in 13.3. + +For users with at least Developer [permissions](../../user/permissions.md), to create an incident, you can take any of the following actions: + +- Navigate to **Operations > Incidents** and click **Create Incident**. +- Create a new issue using the `incident` template available when creating it. 
+- Create a new issue and assign the `incident` label to it. + +![Incident List Create](img/incident_list_create_v13_3.png) + +## Configure PagerDuty integration + +> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/119018) in GitLab 13.3. + +You can set up a webhook with PagerDuty to automatically create a GitLab issue +for each PagerDuty incident. This configuration requires you to make changes +in both PagerDuty and GitLab: + +1. Sign in as a user with Maintainer [permissions](../../user/permissions.md). +1. Navigate to **Settings > Operations > Incidents** and expand **Incidents**. +1. Select the **PagerDuty integration** tab: + + ![PagerDuty incidents integration](img/pagerduty_incidents_integration_v13_3.png) + +1. Activate the integration, and save the changes in GitLab. +1. Copy the value of **Webhook URL** for use in a later step. +1. Follow the steps described in the + [PagerDuty documentation](https://support.pagerduty.com/docs/webhooks) + to add the webhook URL to a PagerDuty webhook integration. + +To confirm the integration is successful, trigger a test incident from PagerDuty to +confirm that a GitLab issue is created from the incident. diff --git a/doc/operations/incident_management/index.md b/doc/operations/incident_management/index.md new file mode 100644 index 00000000000..53a6b47ec4b --- /dev/null +++ b/doc/operations/incident_management/index.md @@ -0,0 +1,68 @@ +--- +stage: Monitor +group: Health +info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#designated-technical-writers +--- + +# Incident management + +> [Introduced](https://gitlab.com/groups/gitlab-org/-/epics/2877) in GitLab 13.0. + +Alert Management enables developers to easily discover and view the alerts +generated by their application. By surfacing alert information where the code is +being developed, efficiency and awareness can be increased. 
+ +GitLab offers solutions for handling incidents in your applications and services, +such as [setting up Prometheus alerts](#configure-prometheus-alerts), +[displaying metrics](alertdetails.md#embed-metrics-in-incidents-and-issues), and sending notifications. + +## Alert notifications + +### Slack Notifications + +> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/216326) in GitLab 13.1. + +You can be alerted via a Slack message when a new alert has been received. + +See the [Slack Notifications Service docs](../../user/project/integrations/slack.md) for information on how to set this up. + +### Notify developers of alerts + +GitLab can react to the alerts triggered from your applications and services +by creating issues and alerting developers through email. By default, GitLab +sends these emails to [owners and maintainers](../../user/permissions.md) of the project. +These emails contain details of the alert, and a link for more information. + +To send separate email notifications to users with +[Developer permissions](../../user/permissions.md), see +[Configure incidents](incidents.md#configure-incidents). + +## Configure Prometheus alerts + +You can set up Prometheus alerts in: + +- [GitLab-managed Prometheus](../metrics/alerts.md) installations. +- [Self-managed Prometheus](../metrics/alerts.md#external-prometheus-instances) installations. + +Prometheus alerts are created by the special Alert Bot user. You can't remove this +user, but it does not count toward your license limit. + +## Configure external generic alerts + +GitLab can accept alerts from any source through a generic webhook receiver. When +[configuring the generic alerts integration](../../user/project/integrations/generic_alerts.md), +GitLab creates a unique endpoint which receives a JSON-formatted, customizable payload. + +## Integrate incidents with Slack + +Slack slash commands allow you to control GitLab and view GitLab content without leaving Slack. 
+ +Learn how to [set up Slack slash commands](../../user/project/integrations/slack_slash_commands.md) +and how to [use the available slash commands](../../user/project/slash_commands.md). + +## Integrate issues with Zoom + +GitLab enables you to [associate a Zoom meeting with an issue](../../user/project/issues/associate_zoom_meeting.md) +for synchronous communication during incident management. After starting a Zoom +call for an incident, you can associate the conference call with an issue. Your +team members can join the Zoom call without requesting a link. diff --git a/doc/operations/incident_management/status_page.md b/doc/operations/incident_management/status_page.md new file mode 100644 index 00000000000..e376607d86f --- /dev/null +++ b/doc/operations/incident_management/status_page.md @@ -0,0 +1,179 @@ +--- +stage: Monitor +group: Health +info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#designated-technical-writers +--- + +# GitLab Status Page **(ULTIMATE)** + +> [Introduced](https://gitlab.com/groups/gitlab-org/-/epics/2479) in [GitLab Ultimate](https://about.gitlab.com/pricing/) 12.10. + +With a GitLab Status Page, you can create and deploy a static website to communicate +efficiently to users during an incident. The Status Page landing page displays an +overview of recent incidents: + +![Status Page landing page](img/status_page_incidents_v12_10.png) + +Clicking an incident displays a detail page with more information about a particular incident: + +![Status Page detail](img/status_page_detail_v12_10.png) + +- Status on the incident, including when the incident was last updated. +- The incident title, including any emojis. +- The description of the incident, including emojis. +- Any file attachments provided in the incident description, or comments with a + valid image extension. 
[Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/205166) in GitLab 13.1. +- A chronologically ordered list of updates to the incident. + +## Set up a GitLab Status Page + +To configure a GitLab Status Page you must: + +1. [Configure GitLab](#configure-gitlab-with-cloud-provider-information) with your + cloud provider information. +1. [Configure your AWS account](#configure-your-aws-account). +1. [Create a Status Page project](#create-a-status-page-project) on GitLab. +1. [Sync incidents to the Status Page](#sync-incidents-to-the-status-page). + +### Configure GitLab with cloud provider information + +To provide GitLab with the AWS account information needed to push content to your Status Page: + +NOTE: **Note:** +Only AWS S3 is supported as a deploy target. + +1. Sign in to GitLab as a user with Maintainer or greater [permissions](../../user/permissions.md). +1. Navigate to **{settings}** **Settings > Operations**. Next to **Status Page**, + click **Expand**. +1. Click **Active** to enable the Status Page feature. +1. In **Status Page URL**, provide the URL to your external status page. +1. Provide the **S3 Bucket name**. For more information, see + [Bucket configuration documentation](https://docs.aws.amazon.com/AmazonS3/latest/dev/HostingWebsiteOnS3Setup.html). +1. Provide the **AWS region** for your bucket. For more information, see the + [AWS documentation](https://github.com/aws/aws-sdk-ruby#configuration). +1. Provide your **AWS access key ID** and **AWS Secret access key**. +1. Click **Save changes**. + +### Configure your AWS account + +1. Within your AWS account, create two new IAM policies, using the following files + as examples: + - [Create bucket](https://gitlab.com/gitlab-org/status-page/-/blob/master/deploy/etc/s3_create_policy.json). + - [Update bucket contents](https://gitlab.com/gitlab-org/status-page/-/blob/master/deploy/etc/s3_update_bucket_policy.json) (Remember to replace `S3_BUCKET_NAME` with your bucket name). +1. 
Create a new AWS access key with the permissions policies created in the first step.
+
+### Create a Status Page project
+
+After configuring your AWS account, you must add the Status Page project and configure
+the necessary CI/CD variables to deploy the Status Page to AWS S3:
+
+1. Fork the [Status Page](https://gitlab.com/gitlab-org/status-page) project.
+   You can do this through [Repository Mirroring](https://gitlab.com/gitlab-org/status-page#repository-mirroring),
+   which ensures you get the up-to-date Status Page features.
+1. Navigate to **{settings}** **Settings > CI/CD**.
+1. Scroll to **Variables**, and click **Expand**.
+1. Add the following variables from your Amazon Console:
+   - `S3_BUCKET_NAME` - The name of the Amazon S3 bucket.
+
+     NOTE: **Note:**
+     If no bucket with the provided name exists, the first pipeline run creates
+     one and configures it for
+     [static website hosting](https://docs.aws.amazon.com/AmazonS3/latest/dev/HostingWebsiteOnS3Setup.html).
+
+   - `AWS_DEFAULT_REGION` - The AWS region.
+   - `AWS_ACCESS_KEY_ID` - The AWS access key ID.
+   - `AWS_SECRET_ACCESS_KEY` - The AWS secret access key.
+1. Navigate to **CI / CD > Pipelines > Run Pipeline**, and run the pipeline to
+   deploy the Status Page to S3.
+
+CAUTION: **Caution:**
+Consider limiting who can access issues in this project, as any user who can view
+the issue can potentially [publish comments to your GitLab Status Page](#publish-comments-on-incidents).
+
+### Sync incidents to the Status Page
+
+After creating the CI/CD variables, configure the project you want to use for
+incident issues:
+
+1. To view the [Operations Settings](../../user/project/settings/#operations-settings)
+   page, navigate to **{settings}** **Settings > Operations > Status Page**.
+1. Fill in your cloud provider's credentials and make sure the **Active** checkbox is checked.
+1. Click **Save changes**.
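As a rough illustration of the four CI/CD variables listed under "Create a Status Page project", the following Python sketch checks that each one is set before a deployment is attempted. It is not part of GitLab or the Status Page project; the helper name `missing_variables` is hypothetical, and only the variable names come from the steps above.

```python
# Variable names taken from the "Create a Status Page project" steps.
REQUIRED_VARIABLES = (
    "S3_BUCKET_NAME",
    "AWS_DEFAULT_REGION",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
)


def missing_variables(env):
    """Return the required variable names that are unset or empty in env.

    `env` is any mapping of names to values, for example `os.environ`.
    This helper is hypothetical and only illustrates a pre-flight check.
    """
    return [name for name in REQUIRED_VARIABLES if not env.get(name)]
```

A pipeline job could call `missing_variables(os.environ)` and stop early with a clear message if the returned list is not empty, rather than failing later during the upload to S3.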
+
+## How to use your GitLab Status Page
+
+After configuring your GitLab instance, relevant updates trigger a background job
+that pushes JSON-formatted data about the incident to your external cloud provider.
+Your status page website periodically fetches this JSON-formatted data. It formats
+and displays it to users, providing information about ongoing incidents without
+extra effort from your team:
+
+```mermaid
+graph TB
+    subgraph GitLab Instance
+    issues(issue updates) -- trigger --> middleware(Background job: JSON generation)
+    end
+    subgraph Cloud Provider
+    middleware --saves data --> c1(Cloud Bucket stores JSON file)
+    end
+    subgraph Status Page
+    d(Static Site on CDN) -- fetches data --> c1
+    end
+```
+
+### Publish an incident
+
+To publish an incident:
+
+1. Create an issue in the project you enabled the GitLab Status Page settings in.
+1. A [project or group owner](../../user/permissions.md) must use the
+   `/publish` [quick action](../../user/project/quick_actions.md) to publish the
+   issue to the GitLab Status Page.
+
+   NOTE: **Note:**
+   Confidential issues can't be published.
+
+A background worker publishes the issue onto the Status Page using the credentials
+you provided during setup. As part of publication, GitLab will:
+
+- Anonymize user and group mentions with `Incident Responder`.
+- Remove titles of non-public [GitLab references](../../user/markdown.md#special-gitlab-references).
+- Publish any files attached to incident issue descriptions, up to 5000 per issue.
+  ([Introduced in GitLab 13.1](https://gitlab.com/gitlab-org/gitlab/-/issues/205166).)
+
+After publication, you can access the incident's details page by clicking the
+**Published on status page** button displayed under the Incident's title.
+
+![Status Page detail link](img/status_page_detail_link_v13_1.png)
+
+### Update an incident
+
+To publish an update to the Incident, update the incident issue's description.
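The mention anonymization listed under "Publish an incident" can be pictured with a small Python sketch. This is only an illustration and not GitLab's implementation: the regular expression is an assumed approximation of `@user` and `@group/subgroup` mentions, and the function name `anonymize_mentions` is hypothetical. Only the replacement text `Incident Responder` comes from the documentation.

```python
import re

# Assumed mention syntax: "@" followed by a user or group path, such as
# "@alice" or "@ops/oncall". The lookbehind skips email-like text where "@"
# directly follows a word character. This pattern is illustrative only.
MENTION = re.compile(r"(?<![\w.])@[A-Za-z0-9_](?:[A-Za-z0-9_./-]*[A-Za-z0-9_])?")


def anonymize_mentions(text, placeholder="Incident Responder"):
    """Replace every assumed user or group mention with the placeholder."""
    return MENTION.sub(placeholder, text)
```

For example, under these assumptions `anonymize_mentions("Thanks @alice and @ops/oncall!")` yields `"Thanks Incident Responder and Incident Responder!"`, while an address such as `ops@example.com` is left untouched.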
+
+CAUTION: **Caution:**
+When referenced issues are changed (such as title or confidentiality) the incident
+they were referenced in is not updated.
+
+### Publish comments on incidents
+
+To publish comments to the Status Page Incident:
+
+- Create a comment on the incident issue.
+- When you're ready to publish the comment, mark the comment for publication by
+  adding a microphone [award emoji](../../user/award_emojis.md)
+  reaction (`:microphone:` 🎤) to the comment.
+- Any files attached to the comment (up to 5000 per issue) are also published.
+  ([Introduced in GitLab 13.1](https://gitlab.com/gitlab-org/gitlab/-/issues/205166).)
+
+CAUTION: **Caution:**
+Anyone with access to view the issue can add an award emoji to a comment, so
+consider limiting access to issues to team members only.
+
+### Update the incident status
+
+To change the incident status from `open` to `closed`, close the incident issue
+within GitLab. Closing the issue triggers a background worker to update the
+GitLab Status Page website.
+
+If you make a published issue confidential, GitLab unpublishes it from your
+GitLab Status Page website.