summaryrefslogtreecommitdiff
path: root/typing/TODO.md
blob: c1151161176966c57245d24a0413ec9ee243aeaf (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
TODO for the OCaml typechecker implementation
=============================================

There is a consensus that the current implementation of the OCaml
typechecker is overly complex and fragile.  A big rewriting "from
scratch" might be possible or desirable at some point, or not, but
incremental cleanup steps are certainly accessible and could bring the
current implementation in a better shape at a relatively small cost
and in a reasonably distant future.

Goals of the cleanup:

 - Make the implementation more maintainable and less fragile.

 - Allow new contributors, or people involved in bigger rewriting
   projects, to get familiar with the code base more easily.

 - Pave the way for future extensions or bigger structural changes to
   the implementation.

This file collects specific cleanup ideas which have been discussed
amongst maintainers.  Having the list committed in the repo allows for
everyone to get an idea of planned tasks, refine them through Pull
Requests, suggest more cleanups, or even start working on specific
tasks (ideally after discussing it first with maintainers).

# Code smells

- global mutable state
- poor data representation
- avoid constructing a parsetree locally
  (methods build a piece of AST with a self argument
   with a *-using name to avoid conflicts; #row, etc.)
- avoid magic string literals

# TODO List

Not all ideas have been thoroughly discussed, and there might not be a
consensus for all of them.

- Make the level generator be part of `Env.t` instead of being global.

- Introduce an abstraction boundary between "the type algebra" and
  "the type checker" (at first between Ctype and Typecore) so that the
  type checker is forced to go through a proper API to access/mutate
  type nodes.  This would make it impossible to "forget" a call
  to `repr` and will allow further changes on the internal representation.

- Tidy up Typeclass (use records instead of 14-tuples, avoid
  "#"-encoding, etc).

- Collect all global state of the type checker in a single place,
  possibly a single reference to a persistent data structure
  (e.g. maps instead of hashtables).

- Get rid of Tsubst.  With the unique ids on each type node, copying
  can be implemented rather efficiently with a map.

- Document row_desc, get rid of row_bound.

- Implement union-find with a more abstract/persistent datastructure
  (be careful about memory leaks with the naive approach of representing
  links with a persistent heap).

  Modest version of the proposal: have an explicit indirection layer
    (type_expr Unode.t)
  for nodes in the union-find structure. Efficiency cost?

- Make the logic for record/constructor disambiguation more readable.

  (Jacques should write a specification, and then we could try
  to make the implementation easier for others to understand.)

- Tidy up destructive substitution.

- Get rid of syntactic encodings (generating Parsetree fragments
  during type-checking, cf optional arguments or classes).

- Track "string literals" in the type-checker, which often act as
  magic "internal" names which should be avoided.

- Consider storing warning settings (+other context) as part of `Env.t`?

- Parse attributes understood (e.g. the deprecated attribute) by the
  compiler into a structured representation during type-checking.

- Introduce a notion of syntactic "path-like location" to point to
  allow pointing to AST fragments, and use that to implement "unused"
  warnings in a less invasive and less imperative way.
  (See Thomas' PR)

- Deprecate -nolabels, or even get rid of it?
  (We could even stop supporting unlabeled full applications.
   First turn on the warning by default.)

- Using e.g. bisect_ppx, monitor coverage of the typechecker
  implementation while running the testsuite, and expand the testsuite
  and/or kill dead code in the typechecker to increase coverage ratio.
  (Partially done by Oxana's Outreachy internship.
   See PR#8874.
   Ask Florian Angeletti and Sebastien Hinderer about the current state.)