summaryrefslogtreecommitdiff
path: root/docs/build_system_reference.md
blob: 45ae052ad287ebdb08645d13d3a95ef27037d290 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
# MongoDB Build System Reference

## MongoDB Build System Requirements
### Recommended minimum requirements
### Python modules
### External libraries
### Enterprise module requirements
### Testing requirements

## MongoDB customizations
### SCons modules
### Development tools
#### Compilation database generator
### Build tools
#### IDL Compiler
### Auxiliary tools
#### Ninja generator
#### Icecream tool
#### ccache tool
### LIBDEPS
Libdeps is a subsystem within the build, which is centered around the LIBrary DEPendency graph. It tracks and maintains the dependency graph as well as lints, analyzes and provides useful metrics about the graph.
#### Design
The libdeps subsystem is divided into several stages, described in order of use as follows.

##### SConscript `LIBDEPS` definitions and built time linting

During the build, the SConscripts are read and all the library relationships are setup via the `LIBDEPS` variables. Some issues can be identified early during processing of the SConscripts via the build-time linter. Most of these will be style and usage issues which can be realized without needing the full graph. This component lives within the build and is executed through the SCons emitters added via the libdeps subsystem.

##### Libdeps graph generation for post-build analysis

For more advanced analysis and linting, a full graph is necessary. The build target `generate-libdeps-graph` builds all libdeps and things which use libdeps, and generates the graph to a file in graphml format.

##### The libdeps analyzer python module

The libdeps analyzer module is a python library which provides and Application Programming Interface (API) to analyze and lint the graph. The library internally leverages the networkx python module for the generic graph interfaces.

##### The CLI and Visualizer tools

The libdeps analyzer module is used in the libdeps Graph Analysis Command Line Interface (gacli) tool and the libdeps Graph Visualizer web service. Both tools read in the graph file generated from the build and provide the Human Machine Interface (HMI) for analysis and linting.
#### The `LIBDEPS` variables
The variables include several types of lists to be added to libraries per a SCons builder instance:

| Variable        | Use           |
| ------------- |-------------|
| `LIBDEPS`      | transitive dependencies  |
| `LIBDEPS_PRIVATE`    |   local dependencies    |
| `LIBDEPS_INTERFACE` | transitive dependencies excluding self      |
| `LIBDEPS_DEPENDENTS` | reverse dependencies |
| `PROGDEPS_DEPENDENTS` | reverse dependencies for Programs |


_`LIBDEPS`_ is the 'public' type, such that libraries that are added to this list become a dependency of the current library, and also become dependencies of libraries which may depend on the current library. This propagation also includes not just the libraries in the `LIBDEPS` list, but all `LIBDEPS` of those `LIBDEPS` recursively, meaning that all dependencies of the `LIBDEPS` libraries, also become dependencies of the current library and libraries which depend on it.

_`LIBDEPS_PRIVATE`_ should be a list of libraries which creates dependencies only between the current library and the libraries in the list. However, in static linking builds, this will behave the same as `LIBDEPS` due to the nature of static linking.

_`LIBDEPS_INTERFACE`_ is very similar to `LIBDEPS`, however it does not create propagating dependencies for the libraries themselves in the `LIBDEPS_INTERFACE` list. Only the dependencies of those `LIBDEPS_INTERFACE` libraries are propagated forward.

_`LIBDEPS_DEPENDENTS`_ are added to libraries which will force themselves as dependencies of the libraries in the supplied list. This is conceptually a reverse dependency, where the library which is a dependency is the one declaring itself as the dependency of some other library. By default this creates a `LIBDEPS_PRIVATE` like relationship, but a tuple can be used to force it to a `LIBDEPS` like or other relationship.

_`PROGDEPS_DEPENDENTS`_ are the same as `LIBDEPS_DEPENDENTS`, but intended for use only with Program builders.

#### `LIBDEPS_TAGS`
The `LIBDEPS_TAGS` variable is used to mark certain libdeps for various reasons. Some `LIBDEPS_TAGS` are used to mark certain libraries for `LIBDEPS_TAG_EXPANSIONS` variable which is used to create a function which can expand to a string on the command line. Below is a table of available `LIBDEPS` tags:

| Tag | Description |
|---|---|
| `illegal_cyclic_or_unresolved_dependencies_allowlisted` | SCons subst expansion tag to handle dependency cycles |
| `init-no-global-side-effects` | SCons subst expansion tag for causing linkers to avoid pulling in all symbols |
| `lint-public-dep-allowed` | Linting exemption tag exempting the `lint-no-public-deps` tag |
| `lint-no-public-deps` | Linting inclusion tag ensuring a libdep has no `LIBDEPS` declared |
| `lint-allow-non-alphabetic` | Linting exemption tag allowing `LIBDEPS` variable lists to be non-alphabetic |
| `lint-leaf-node-allowed-dep` | Linting exemption tag exempting the `lint-leaf-node-no-deps` tag |
| `lint-leaf-node-no-deps` | Linting inclusion tag ensuring a libdep has no libdeps and is a leaf node |
| `lint-allow-nonlist-libdeps` | Linting exemption tag allowing a `LIBDEPS` variable to not be a list | `lint-allow-bidirectional-edges` | Linting exemption tag allowing reverse dependencies to also be a forward dependencies |
| `lint-allow-nonprivate-on-deps-dependents` | Linting exemption tag allowing reverse dependencies to be transitive |
| `lint-allow-dup-libdeps` | Linting exemption tag allowing `LIBDEPS` variables to contain duplicate libdeps on a given library |
| `lint-allow-program-links-private` | Linting exemption tag allowing `Program`s to have `PRIVATE_LIBDEPS` |

##### The `illegal_cyclic_or_unresolved_dependencies_allowlisted` tag
This tag should not be used anymore because the library dependency graph has been successfully converted to a Directed Acyclic Graph (DAG). Prior to this accomplishment, it was necessary to handle
cycles specifically with platform specific options on the command line.

##### The `init-no-global-side-effects` tag
Adding this flag to a library turns on platform specific compiler flags which will cause the linker to pull in just the symbols it needs. Note that by default, the build is configured to pull in all symbols from libraries because of the use of static initializers, however if a library is known to not have any of these initializers, then this flag can be added for some performance improvement.

#### Linting and linter tags

The libdeps linter features automatically detect certain classes of LIBDEPS usage errors. The libdeps linters are implemented as build-time linting and post-build linting procedures to maintain order in usage of the libdeps tool and the build’s library dependency graph. You will need to comply with the rules enforced by the libdeps linter, and fix issues that it raises when modifying the build scripts. There are exemption tags to prevent the linter from blocking things, however these exemption tags should only be used in extraordinary cases, and with good reason. A goal of the libdeps linter is to drive and maintain the number of exemption tags in use to zero.
##### Exemption Tags
There are a number of existing issues that need to be addressed, but they will be addressed in future tickets. In the meantime, the use of specific strings in the LIBDEPS_TAGS variable can allow the libdeps linter to skip certain issues on given libraries. For example, to have the linter skip enforcement of the lint rule against bidirectional edges for "some_library":
```
env.Library(
    target=’some_library’
    ...
    LIBDEPS_TAGS=[‘lint-allow-bidirectional-edges’]
)
```

#### build-time Libdeps Linter
If there is a build-time issue, the build will fail until it is addressed. This linting feature will be on by default and takes about half a second to complete in a full enterprise build (at the time of writing this), but can be turned off by using the --libdeps-linting=off option on your SCons invocation.

The current rules and there exemptions are listed below:

1. **A 'Program' can not link a non-public dependency, it can only have LIBDEPS links.**
    ###### Example

    ```
    env.Program(
        target=’some_program’,
        ...
        LIBDEPS=[‘lib1’], # OK
        LIBDEPS_PRIVATE=[‘lib2’], # This is a Program, BAD
    )
    ```
    ###### Rationale

    A Program can not be linked into anything else, and there for the transitiveness does not apply. A default value of LIBDEPS was selected for consistency since most Program's were already doing this at the time the rule was created.

    ###### Exemption
    'lint-allow-program-links-private' on the target node
    ######

2. **A 'Node' can only directly link a given library once.**

    ###### Example

    ```
    env.Library(
        target=’some_library’,
        ...
        LIBDEPS=[‘lib1’], # Linked once, OK
        LIBDEPS_PRIVATE=[‘lib1’], # Also linked in LIBDEPS, BAD
        LIBDEPS_INTERFACE=[‘lib2’, 'lib2'], # Linked twice, BAD
    )
    ```
    ###### Rationale

    Libdeps will ignore duplicate links, so this rule is mostly for consistency and neatness in the build scripts.

    ###### Exemption
    'lint-allow-dup-libdeps' on the target node

    ######
3. **A 'Node' which uses LIBDEPS_DEPENDENTS or PROGDEPS_DEPENDENTS can only have LIBDEPS_PRIVATE links.**
    ###### Example

    ```
    env.Library(
        target=’some_library’,
        ...
        LIBDEPS_DEPENDENTS=['lib3'],
        LIBDEPS=[‘lib1’], # LIBDEPS_DEPENDENTS is in use, BAD
        LIBDEPS_PRIVATE=[‘lib2’], # OK
    )
    ```
    ###### Rationale

    The node that the library is using LIBDEPS_DEPENDENTS or PROGDEPS_DEPENDENT to inject its dependency onto should be conditional, therefore there should not be transitiveness for that dependency since it cannot be the source of any resolved symbols.
    ###### Exemption
    'lint-allow-nonprivate-on-deps-dependents' on the target node
    ######
4. **A 'Node' can not link directly to a library that uses LIBDEPS_DEPENDENTS or PROGDEPS_DEPENDENTS.**
    ###### Example
    ```
    env.Library(
        target='other_library',
        ...
        LIBDEPS=['lib1'], # BAD, 'lib1' has LIBDEPS_DEPENDENTS

    env.Library(
        target=’lib1’,
        ...
        LIBDEPS_DEPENDENTS=['lib3'],
    )
    ```
    ###### Rationale

    A library that is using LIBDEPS_DEPENDENTS or PROGDEPS_DEPENDENT should only be used for reverse dependency edges. If a node does need to link directly to a library that does have reverse dependency edges, that indicates the library should be split into two separate libraries, containing its direct dependency content and its conditional reverse dependency content.

    ###### Exemption
    'lint-allow-bidirectional-edges' on the target node
    ######

5. **All libdeps environment vars must be assigned as lists.**
    ###### Example
    ```
    env.Library(
        target='some_library',
        ...
        LIBDEPS='lib1', # not a list, BAD
        LIBDEPS_PRIVATE=['lib2'], # OK
    )
    ```
    ###### Rationale

    Libdeps will handle non-list environment variables, so this is more for consistency and neatness in the build scripts.
    ###### Exemption
    'lint-allow-nonlist-libdeps' on the target node
    ######
6. **Libdeps with the tag 'lint-leaf-node-no-deps' shall not link any libdeps.**
    ###### Example

    ```
    env.Library(
        target='lib2',
        ...
        LIBDEPS_TAGS=[
            'lint-leaf-node-allowed-dep'
        ]
    )

    env.Library(
        target='some_library',
        ...
        LIBDEPS=['lib1'], # BAD, should have no LIBDEPS
        LIBDEPS_PRIVATE=['lib2'], # OK, has exemption tag
        LIBDEPS_TAGS=[
            'lint-leaf-node-no-deps'
        ]
    )
    ```

    ###### Rationale

    The special tag allows certain nodes to be marked and programmatically checked that they remain lead nodes. An example use-case is when we want to make sure certain nodes never link mongodb code.

    ###### Exemption
    'lint-leaf-node-allowed-dep' on the exempted libdep

    ###### Inclusion
    'lint-leaf-node-no-deps' on the target node
    ######
7. **Libdeps with the tag 'lint-no-public-deps' shall not link any libdeps.**
    ###### Example
    ```
    env.Library(
        target='lib2',
        ...
        LIBDEPS_TAGS=[
            'lint-public-dep-allowed'
        ]
    )

    env.Library(
        target='some_library',
        ...
        LIBDEPS=[
            'lib1' # BAD
            'lib2' # OK, has exemption tag
        ],
        LIBDEPS_TAGS=[
            'lint-no-public-deps'
        ]
    )
    ```
    ###### Rationale

    The special tag allows certain nodes to be marked and programmatically checked that they do not link publicly. Some nodes such as mongod_main have special requirements that this programmatically checks.

    ###### Exemption
    'lint-public-dep-allowed' on the exempted libdep
    ###### Inclusion
    'lint-no-public-deps' on the target node
    ######
8. **Libdeps shall be sorted alphabetically in LIBDEPS lists in the SCons files.**
    ###### Example
    ```
    env.Library(
        target='lib2',
        ...
        LIBDEPS=[
            '$BUILD/mongo/db/d', # OK, $ comes before c
            'c', # OK, c comes before s
            'src/a', # BAD, s should be after b
            'b', # BAD, b should be before c
        ]
    )
    ```
    ###### Rationale

    Keeping the SCons files neat and ordered allows for easier Code Review diffs and generally better maintainability.
    ###### Exemption
    'lint-allow-non-alphabetic' on the exempted libdep
    ######




##### The build-time print Option
The libdeps linter also has the `--libdeps-linting=print` option which will perform linting, and instead of failing the build on an issue, just print and continue on. It will also ignore exemption tags, and still print the issue because it will not fail the build. This is a good way to see the entirety of existing issues that are exempted by tags, as well as printing other metrics such as time spent linting.

#### post-build linting and analysis

The dependency graph can be analyzed post-build by leveraging the completeness of the graph to perform more extensive analysis. You will need to install the libdeps requirements file to python when attempting to use the post-build analysis tools:

```
python3 -m pip install -r etc/pip/libdeps-requirements.txt
```

The command line interface tool (gacli) has a comprehensive help text which will describe the available analysis options and interface. The visualizer tool includes a GUI which displays the available analysis options graphically. These tools will be briefly covered in the following sections.

##### Generating the graph file

To generate the full graph, build the target `generate-libdeps-graph`. This will build all things involving libdeps and construct a graphml file representing the library dependency graph. The graph can be used in the command line interface tool or the visualizer web service tool. The minimal set of required SCons arguments to build the graph file is shown below:

```
python3 buildscripts/scons.py --link-model=dynamic --build-tools=next generate-libdeps-graph
```

The graph file by default will be generate to `build/opt/libdeps/libdeps.graphml` (where `build/opt` is the `$BUILD_DIR`).

##### General libdeps analyzer API usage

Below is a basic example of usage of the libdeps analyzer API:

```
import libdeps

libdeps_graph = libdeps.graph.load_libdeps_graph('path/to/libdeps.graphml')

list_of_analysis_to_run = [
    libdeps.analyzer.NodeCounter(libdeps_graph),
    libdeps.analyzer.DirectDependencies(libdeps_graph, node='path/to/library'),
]

analysis_results = libdeps.graph.LibdepsGraphAnalysis(list_of_analysis_to_run)
libdeps.analyzer.GaPrettyPrinter(analysis_after_run).print()
```

Walking through this example, first the graph is loaded from file. Then a list of desired Analyzer instances is created. Some example analyzer classes are instantiated in the example above, but there are many others to choose from. Specific Analyzers have different interfaces and should be supplied an argument list corresponding to that analyzer.

_Note:_ The graph file will contain the build dir that the graph data was created with and it expects all node arguments to be relative to the build dir. If you are using the libdeps module generically in some app, you can extract the build dir from the libdeps graph and append it to any generic library path.

Once the list of analyzers is created, they can be used to create a LibdepsGraphAnalysis instance, which will upon instantiation, run the analysis list provided. Once the instance is created, it contains the results, and optionally can be fed into different printer classes. In this case, a human readable format printer called GaPrettyPrinter is used to print to the console.

##### Using the gacli tool

The command line interface tool can be used from the command line to run analysis on a given graph. The only required argument is the graph file. The default with no args will run all the counters and linters on the graph. Here is an example output:

```
(venv) Apr.20 02:46 ubuntu[mongo]: python buildscripts/libdeps/gacli.py --graph-file build/cached/libdeps/libdeps.graphml
Loading graph data...Loaded!


Graph built from git hash:
1358cdc6ff0e53e4f4c01ea0e6fcf544fa7e1672

Graph Schema version:
2

Build invocation:
"/home/ubuntu/venv/bin/python" "buildscripts/scons.py" "--variables-files=etc/scons/mongodbtoolchain_v3_gcc.vars" "--cache=all" "--cache-dir=/home/ubuntu/scons-cache" "--link-model=dynamic" "--build-tools=next" "ICECC=icecc" "CCACHE=ccache" "-j200" "--cache-signature-mode=validate" "--cache-debug=-" "generate-libdeps-graph"

Nodes in Graph: 867
Edges in Graph: 90706
Direct Edges in Graph: 5948
Transitive Edges in Graph: 84758
Direct Public Edges in Graph: 3483
Public Edges in Graph: 88241
Private Edges in Graph: 2440
Interface Edges in Graph: 25
Shim Nodes in Graph: 20
Program Nodes in Graph: 136
Library Nodes in Graph: 731

LibdepsLinter: PUBLIC libdeps that could be PRIVATE: 0
```

Use the `--help` option to see detailing information about all the available options.

##### Using the graph visualizer Tool

The graph visualizer tools starts up a web service to provide a frontend GUI for navigating and examining the graph files. The Visualizer uses a Python Flask backend and React/Redux Javascript frontend.

For installing the dependencies for the frontend, you will need node >= 12.0.0 and npm installed and in the PATH. To install the dependencies navigate to directory where package.json lives, and run:

```
cd buildscripts/libdeps/graph_visualizer_web_stack && npm install
```

Alternatively if you are on linux, you can use the setup_node_env.sh script to automatically download node 12 and npm, setup the local environment and install the dependencies. Run the command:

```
source buildscripts/libdeps/graph_visualizer_web_stack/setup_node_end.sh install
```

Assuming you are on a remote workstation and using defaults, you will need to make ssh tunnels to the web service to access the service in your local browser. The frontend and backend both use a port (this case 3000 is the frontend and 5000 is the backend), and the default host is localhost, so you will need to open two tunnels so the frontend running in your local web browser can communicate with the backend. If you are using the default host and port the tunnel command will look like this:

```
ssh -L 3000:localhost:3000 -L 5000:localhost:5000 ubuntu@workstation.hostname
```

Next we need to start the web service. It will require you to pass a directory where it will search for `.graphml` files which contain the graph data for various commits:

```
python3 buildscripts/libdeps/graph_visualizer.py --graphml-dir build/opt/libdeps
```

The script will launch the backend and then build the optimized production frontend and launch it. You can supply the `--debug` argument to work in development load which starts up much faster and allows real time updates as files are modified, with a small cost to performance on the frontend. Other options allow more configuration and can be viewed in the `--help` text.

After the server has started up, it should notify you via the terminal that you can access it at http://localhost:3000 locally in your browser.



## Build system configuration
### SCons configuration
#### Frequently used flags and variables
### MongoDB build configuration
#### Frequently used flags and variables

##### `MONGO_GIT_HASH`

The `MONGO_GIT_HASH` SCons variable controls the value of the git hash
which will be interpolated into the build to identify the commit
currently being built. If not overridden, this defaults to the git
hash of the current commit.

##### `MONGO_VERSION`

The `MONGO_VERSION` SCons variable controls the value which will be
interpolated into the build to identify the version of the software
currently being built. If not overridden, this defaults to the result
of `git describe`, which will use the local tags to derive a version.

### Targets and Aliases

## Build artifacts and installation
### Hygienic builds
### AutoInstall
### AutoArchive

## MongoDB SCons style guide
### Sconscript Formatting Guidelines
#### Vertical list style
#### Alphabetize everything
### `Environment` Isolation
### Declaring Targets (`Program`, `Library`, and `CppUnitTest`)
### Invoking external tools correctly with `Command`s
### Customizing an `Environment` for a target
### Invoking subordinate `SConscript`s
#### `Import`s and `Export`s
### A Model `SConscript` with Comments