doc/development/reusing_abstractions.md


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342

---
stage: none
group: unassigned
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/product/ux/technical-writing/#assignments
---

# Guidelines for reusing abstractions

As GitLab has grown, different patterns emerged across the codebase. Service
classes, serializers, and presenters are just a few. These patterns made it easy
to reuse code, but at the same time make it easy to accidentally reuse the wrong
abstraction in a particular place.

## Why these guidelines are necessary

Code reuse is good, but sometimes this can lead to shoehorning the wrong
abstraction into a particular use case. This in turn can have a negative impact
on maintainability, the ability to easily debug problems, or even performance.

An example would be to use `ProjectsFinder` in `IssuesFinder` to limit issues to
those belonging to a set of projects. While initially this may seem like a good
idea, both classes provide a very high level interface with very little control.
This means that `IssuesFinder` may not be able to produce a better optimized
database query, as a large portion of the query is controlled by the internals
of `ProjectsFinder`.

To work around this problem, you would use the same code used by
`ProjectsFinder`, instead of using `ProjectsFinder` itself directly. This allows
you to compose your behavior better, giving you more control over the behavior
of the code.

To illustrate, consider the following code from `IssuableFinder#projects`:

```ruby
return @projects = project if project?

projects =
  if current_user && params[:authorized_only].presence && !current_user_related?
    current_user.authorized_projects
  elsif group
    finder_options = { include_subgroups: params[:include_subgroups], only_owned: true }
    GroupProjectsFinder.new(group: group, current_user: current_user, options: finder_options).execute
  else
    ProjectsFinder.new(current_user: current_user).execute
  end

@projects = projects.with_feature_available_for_user(klass, current_user).reorder(nil)
```

Here we determine what projects to scope our data to, using three different
approaches. When a group is specified, we use `GroupProjectsFinder` to retrieve
all the projects of that group. On the surface this seems harmless: it is easy
to use, and we only need two lines of code.

In reality, things can get hairy very quickly. For example, the query produced
by `GroupProjectsFinder` may start out simple. Over time more and more
functionality is added to this (high level) interface. Instead of _only_
affecting the cases where this is necessary, it may also start affecting
`IssuableFinder` in a negative way. For example, the query produced by
`GroupProjectsFinder` may include unnecessary conditions. Since we're using a
finder here, we can't easily opt-out of that behavior. We could add options to
do so, but then we'd need as many options as we have features. Every option adds
two code paths, which means that for four features we have to cover 8 different
code paths.

A much more reliable (and pleasant) way of dealing with this, is to use
the underlying bits that make up `GroupProjectsFinder` directly. This means we
may need a little bit more code in `IssuableFinder`, but it also gives us much
more control and certainty. This means we might end up with something like this:

```ruby
return @projects = project if project?

projects =
  if current_user && params[:authorized_only].presence && !current_user_related?
    current_user.authorized_projects
  elsif group
    current_user
      .owned_groups(subgroups: params[:include_subgroups])
      .projects
      .any_additional_method_calls
      .that_might_be_necessary
  else
    current_user
      .projects_visible_to_user
      .any_additional_method_calls
      .that_might_be_necessary
  end

@projects = projects.with_feature_available_for_user(klass, current_user).reorder(nil)
```

This is just a sketch, but it shows the general idea: we would use whatever the
`GroupProjectsFinder` and `ProjectsFinder` finders use under the hoods.

## End goal

The guidelines in this document are meant to foster _better_ code reuse, by
clearly defining what can be reused where, and what to do when you cannot reuse
something. Clearly separating abstractions makes it harder to use the wrong one,
makes it easier to debug the code, and (hopefully) results in fewer performance
problems.

## Abstractions

Now let's take a look at the various abstraction levels available, and what they
can (or cannot) reuse. For this we can use the following table, which defines
the various abstractions and what they can (not) reuse:

| Abstraction            | Service classes  | Finders  | Presenters  | Serializers   | Model instance method   | Model class methods   | Active Record   | Worker
|:-----------------------|:-----------------|:---------|:------------|:--------------|:------------------------|:----------------------|:----------------|:--------
| Controller/API endpoint| Yes              | Yes      | Yes         | Yes           | Yes                     | No                    | No              | No
| Service class          | Yes              | Yes      | No          | No            | Yes                     | No                    | No              | Yes
| Finder                 | No               | No       | No          | No            | Yes                     | Yes                   | No              | No
| Presenter              | No               | Yes      | No          | No            | Yes                     | Yes                   | No              | No
| Serializer             | No               | Yes      | No          | No            | Yes                     | Yes                   | No              | No
| Model class method     | No               | No       | No          | No            | Yes                     | Yes                   | Yes             | No
| Model instance method  | No               | Yes      | No          | No            | Yes                     | Yes                   | Yes             | Yes
| Worker                 | Yes              | Yes      | No          | No            | Yes                     | No                    | No              | Yes

### Controllers

Everything in `app/controllers`.

Controllers should not do much work on their own, instead they pass input
to other classes and present the results.

### API endpoints

Everything in `lib/api` (the REST API) and `app/graphql` (the GraphQL API).

API endpoints have the same abstraction level as controllers.

### Service classes

Everything that resides in `app/services`.

Service classes represent operations that coordinates changes between models
(such as entities and value objects). Changes impact the state of the application.

1. When an object makes no changes to the state of the application, then it's not a service.
   It may be a [finder](#finders) or a value object.
1. When there is no operation, there is no need to execute a service. The class would
   probably be better designed as an entity, a value object, or a policy.

When implementing a service class, consider:

1. A service class initializer should contain in its arguments:
   1. A [model](#models) instance that is being acted upon. Should be first positional
      argument of the initializer. The argument name of the argument is left to the
      developer's discretion, such as: `issue`, `project`, `merge_request`.
   1. When service represents an action initiated by a user or executed in the
      context of a user, the initializer must have the `current_user:` keyword argument.
      Services with the `current_user:` argument run high-level business logic
      and must validate user authorization to perform their operations.
   1. When service does not have a user context and it's not directly initiated
      by a user (like background service or side-effects), the `current_user:`
      argument is not needed. This describes low-level domain logic or instance-wide logic.
   1. For all additional data required by a service, the explicit keyword arguments are recommended.
      When a service requires too long of a list of arguments, consider splitting them into:
      - `params`: A hash with model properties that will be assigned directly.
      - `options`: A hash with extra parameters (which need to be processed,
        and are not model properties). The `options` hash should be stored in an instance variable.

      ```ruby
      # merge_request: A model instance that is being acted upon.
      # assignee: new MR assignee that will be assigned to the MR
      #   after the service is executed.
      def initialize(merge_request, assignee:)
        @merge_request = merge_request
        @assignee = assignee
      end
      ```

      ```ruby
      # issue: A model instance that is being acted upon.
      # current_user: Current user.
      # params: Model properties.
      # options: Configuration for this service. Can be any of the following:
      #   - notify: Whether to send a notification to the current user.
      #   - cc: Email address to copy when sending a notification.
      def initialize(issue:, current_user:, params: {}, options: {})
        @issue = issue
        @current_user = current_user
        @params = params
        @options = options
      end
      ```

1. It implements a single public instance method `#execute`, which invokes service class behavior:
   - The `#execute` method takes no arguments. All required data is passed into initializer.
   - Optional. If needed, the `#execute` method returns its result via [`ServiceResponse`](#serviceresponse).

Several base classes implement the service classes convention. You may consider inheriting from:

- `BaseContainerService` for services scoped by container (project or group).
- `BaseProjectService` for services scoped to projects.
- `BaseGroupService` for services scoped to groups.

Classes that are not service objects should be
[created elsewhere](software_design.md#use-namespaces-to-define-bounded-contexts),
such as in `lib`.

#### ServiceResponse

Service classes usually have an `execute` method, which can return a
`ServiceResponse`. You can use `ServiceResponse.success` and
`ServiceResponse.error` to return a response in `execute` method.

In a successful case:

```ruby
response = ServiceResponse.success(message: 'Branch was deleted')

response.success? # => true
response.error? # => false
response.status # => :success
response.message # => 'Branch was deleted'
```

In a failed case:

```ruby
response = ServiceResponse.error(message: 'Unsupported operation')

response.success? # => false
response.error? # => true
response.status # => :error
response.message # => 'Unsupported operation'
```

An additional payload can also be attached:

```ruby
response = ServiceResponse.success(payload: { issue: issue })

response.payload[:issue] # => issue
```

Error responses can also specify the failure `reason` which can be used by the caller
to understand the nature of the failure.
The caller, if an HTTP endpoint, could translate the reason symbol into an HTTP status code:

```ruby
response = ServiceResponse.error(
  message: 'Job is in a state that cannot be retried',
  reason: :job_not_retrieable)

if response.success?
  head :ok
elsif response.reason == :job_not_retriable
  head :unprocessable_entity
else
  head :bad_request
end
```

For common failures such as resource `:not_found` or operation `:forbidden`, we could
leverage the Rails [HTTP status symbols](http://www.railsstatuscodes.com/) as long as
they are sufficiently specific for the domain logic involved.
For other failures use domain-specific reasons whenever possible.

For example: `:job_not_retriable`, `:duplicate_package`, `:merge_request_not_mergeable`.

### Finders

Everything in `app/finders`, typically used for retrieving data from a database.

Finders cannot reuse other finders in an attempt to better control the SQL
queries they produce.

Finders' `execute` method should return `ActiveRecord::Relation`. Exceptions
can be added to `spec/support/finder_collection_allowlist.yml`.
See [`#298771`](https://gitlab.com/gitlab-org/gitlab/-/issues/298771) for more details.

### Presenters

Everything in `app/presenters`, used for exposing complex data to a Rails view,
without having to create many instance variables.

See [the documentation](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/presenters/README.md) for more information.

### Serializers

Everything in `app/serializers`, used for presenting the response to a request,
typically in JSON.

### Models

Classes and modules in `app/models` represent domain concepts that encapsulate both
[data and behavior](https://en.wikipedia.org/wiki/Domain_model).

These classes can interact directly with a data store (like ActiveRecord models) or
can be a thin wrapper (Plain Old Ruby Objects) on top of ActiveRecord models to express a
richer domain concept.

[Entities and Value Objects](https://martinfowler.com/bliki/EvansClassification.html)
that represent domain concepts are considered domain models.

Some examples:

- [`DesignManagement::DesignAtVersion`](https://gitlab.com/gitlab-org/gitlab/-/blob/b62ce98cff8e0530210670f9cb0314221181b77f/app/models/design_management/design_at_version.rb)
  is a model that leverages validations to combine designs and versions.
- [`Ci::Minutes::Usage`](https://gitlab.com/gitlab-org/gitlab/-/blob/ec52f19f7325410177c00fef06379f55ab7cab67/ee/app/models/ci/minutes/usage.rb)
  is a Value Object that provides [CI/CD minutes usage](../ci/pipelines/cicd_minutes.md)
  for a given namespace.

#### Model class methods

These are class methods defined by _GitLab itself_, including the following
methods provided by Active Record:

- `find`
- `find_by_id`
- `delete_all`
- `destroy`
- `destroy_all`

Any other methods such as `find_by(some_column: X)` are not included, and
instead fall under the "Active Record" abstraction.

#### Model instance methods

Instance methods defined on Active Record models by _GitLab itself_. Methods
provided by Active Record are not included, except for the following methods:

- `save`
- `update`
- `destroy`
- `delete`

#### Active Record

The API provided by Active Record itself, such as the `where` method, `save`,
`delete_all`, and so on.

### Worker

Everything in `app/workers`.

Use `SomeWorker.perform_async` or `SomeWorker.perform_in` to schedule Sidekiq
jobs. Never directly invoke a worker using `SomeWorker.new.perform`.