---
stage: Create
group: Incubation
info: Machine Learning Experiment Tracking is a GitLab Incubation Engineering program. No technical writer assigned to this group.
---

# Machine Learning Experiment Tracking **(FREE)**

DISCLAIMER:
Machine Learning Experiment Tracking is an experimental feature being developed by the Incubation Engineering Department,
and will receive significant changes over time. This feature is being released with the aim of getting user feedback, but
is not stable and can lead to performance degradation. See below for how to disable this feature.

When creating machine learning models, data scientists often experiment with different parameters, configurations, feature
engineering, and so on, to improve the performance of the model. Keeping track of all this metadata and the associated
artifacts so that the data scientist can later replicate the experiment is not trivial. Machine learning experiment
tracking enables them to log parameters, metrics, and artifacts directly into GitLab, giving easy access later on.

![List of Experiments](img/experiments.png)

![Experiment Candidates](img/candidates.png)

## What is an experiment?

An experiment is a collection of comparable model candidates. Experiments can be long lived (for example, when they represent
a use case), or short lived (for example, results from hyperparameter tuning triggered by a merge request), but usually hold
model candidates that have a similar set of parameters and metrics.

## Model candidate

A model candidate is a variation of the training of a machine learning model that can eventually be promoted to a version
of the model. The goal of a data scientist is to find the model candidate whose parameter values lead to the best model
performance, as indicated by the given metrics.

Example parameters:

- Algorithm (linear regression, decision tree, and so on).
- Hyperparameters for the algorithm (learning rate, tree depth, number of epochs).
- Features included.
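
As an illustration of comparing candidates by a metric, the following is a minimal sketch in plain Python. The `Candidate` class, the `best_candidate` helper, and the sample values are hypothetical and are not part of GitLab or of any client library. They only show how parameters and metrics relate when picking the best candidate in an experiment.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    """Hypothetical in-memory stand-in for a model candidate."""

    name: str
    params: dict = field(default_factory=dict)
    metrics: dict = field(default_factory=dict)


def best_candidate(candidates, metric, higher_is_better=True):
    """Return the candidate with the best value for `metric`.

    Candidates that did not log the metric are ignored.
    """
    scored = [c for c in candidates if metric in c.metrics]
    if not scored:
        raise ValueError(f"no candidate logged the metric {metric!r}")
    pick = max if higher_is_better else min
    return pick(scored, key=lambda c: c.metrics[metric])


# Made-up candidates that share a comparable set of parameters and metrics.
candidates = [
    Candidate("tree-depth-3", {"algorithm": "decision tree", "tree_depth": 3}, {"accuracy": 0.81}),
    Candidate("tree-depth-5", {"algorithm": "decision tree", "tree_depth": 5}, {"accuracy": 0.86}),
    Candidate("linear", {"algorithm": "linear regression"}, {"accuracy": 0.78}),
]

best = best_candidate(candidates, "accuracy")
print(best.name, best.params)
```

For metrics where a lower value is better, such as an error rate, pass `higher_is_better=False`.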

## Usage

### User access management

An experiment is always associated with a project. Only users with access to that project can view the experiment's data.

### Tracking new experiments and trials

Experiments and trials can only be tracked through the [MLflow](https://www.mlflow.org/docs/latest/tracking.html) client
integration. For more information on how to use GitLab as a backend for the MLflow client, see [the documentation page](../../integrations/mlflow_client.md).
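
A minimal sketch of what logging a trial through the MLflow client could look like is shown here. `MLFLOW_TRACKING_URI` and `MLFLOW_TRACKING_TOKEN` are standard MLflow environment variables, and `set_experiment`, `start_run`, `log_params`, and `log_metrics` are standard MLflow tracking calls. The endpoint layout, host, project ID, and token placeholder are assumptions made for illustration, so confirm the actual values on the MLflow client integration page.

```python
import os


def configure_gitlab_tracking(gitlab_host, project_id, access_token):
    """Point the MLflow client at a GitLab project through environment variables.

    The endpoint layout is an assumption; confirm it in the MLflow client
    integration documentation for your GitLab version.
    """
    tracking_uri = f"{gitlab_host.rstrip('/')}/api/v4/projects/{project_id}/ml/mlflow"
    os.environ["MLFLOW_TRACKING_URI"] = tracking_uri
    os.environ["MLFLOW_TRACKING_TOKEN"] = access_token
    return tracking_uri


def log_candidate(experiment_name, params, metrics):
    """Log one trial with the standard MLflow tracking API."""
    # Imported lazily so the configuration helper works without MLflow installed.
    import mlflow

    mlflow.set_experiment(experiment_name)
    with mlflow.start_run():
        mlflow.log_params(params)
        mlflow.log_metrics(metrics)


# Hypothetical host, project ID, and token placeholder.
uri = configure_gitlab_tracking("https://gitlab.example.com/", 1234, "<your access token>")
print(uri)

# With MLflow installed and a valid token, a trial could then be logged with:
# log_candidate("hyperparameter-tuning", {"tree_depth": 5}, {"accuracy": 0.86})
```

Setting the environment variables before the first MLflow call keeps the training code itself free of GitLab-specific configuration.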

### Exploring model candidates

To list the current active experiments, navigate to `https/-/ml/experiments`. To display all trials
that have been logged, along with their metrics and parameters, select an experiment.

### Logging artifacts

Trial artifacts are saved as [generic packages](../../../packages/generic_packages/index.md), and follow all their
conventions. After an artifact is logged for a candidate, all artifacts logged for the candidate are listed in the
package registry. The package name for a candidate is `ml_candidate_<candidate_id>`, with version `-`.

### Limitations and future

- Searching experiments, searching trials, visual comparison of trials, and creating, deleting, and updating experiments and trials through the GitLab UI are under development.
- There is no support for experiment and trial metadata that is neither a parameter nor a metric.

## Disabling or enabling the feature

On self-managed GitLab, ML Experiment Tracking is disabled by default. To enable the feature, ask an administrator to [enable the feature flag](../../../../administration/feature_flags.md) named `ml_experiment_tracking`.
On GitLab.com, this feature is currently in private testing.

## Feedback, roadmap and reports

For updates on the development, feedback and bug reports, refer to the [development epic](https://gitlab.com/groups/gitlab-org/-/epics/8560).