# MLIR C API

**Current status: Under development, API unstable, built by default.**

[TOC]

## Design

Many languages can interoperate with C but have a harder time with C++ due to
name mangling and memory model differences. Although the C API for MLIR can be
used directly from C, it is primarily intended to be wrapped in higher-level
language- or library-specific constructs. Therefore the API tends towards
simplicity and feature minimalism.

**Note:** while the C API is expected to be more stable than the C++ API, it
currently offers no stability guarantees.

### Scope

The API is provided for core IR components (attributes, blocks, operations,
regions, types, values), passes, and some fundamental type and attribute kinds.
The core IR API is intentionally low-level, e.g. it exposes a plain list of an
operation's operands and attributes without attempting to assign "semantic"
names to them. Users of specific dialects are expected to wrap the core API in a
dialect-specific way, for example, by implementing an ODS backend.

### Object Model

Core IR components are exposed as opaque _handles_ to an IR object existing in
C++. They are not intended to be inspected by the API users (and, in many cases,
cannot be meaningfully inspected). Instead the users are expected to pass
handles to the appropriate manipulation functions.

The handle _may or may not_ own the underlying object.

### Naming Convention and Ownership Model

All objects are prefixed with `Mlir`. They are typedefs and should be used
without `struct`.

All functions are prefixed with `mlir`.

Functions primarily operating on an instance of `MlirX` are prefixed with
`mlirX`. They take the instance being acted upon as their first argument (except
for creation functions). For example, `mlirOperationGetNumOperands` inspects an
`MlirOperation`, which it takes as its first argument.

The *ownership* model is encoded in the naming convention as follows.

-   By default, the ownership is not transferred.
-   Functions that transfer the ownership of the result to the caller can be in
    one of two forms:
    *   functions that create a new object have the name `mlirXCreate<...>`, for
        example, `mlirOperationCreate`;
    *   functions that detach an object from a parent object have the name
        `mlirYTake<...>`, for example `mlirOperationStateTakeRegion`.
-   Functions that take ownership of some of their arguments have the form
    `mlirY<...>OwnedX<...>` where `X` can refer to the type or any other
    sufficiently unique description of the argument, the ownership of which will
    be taken by the callee, for example `mlirRegionAppendOwnedBlock`.
-   Functions that create an object without transferring its ownership to the
    caller (i.e. one of the other objects passed in as an argument retains the
    ownership) have the form `mlirX<...>Get`. For example,
    `mlirTypeParseGet`.
-   Functions that destroy an object owned by the caller are of the form
    `mlirXDestroy`.

If the code owns an object, it is responsible for destroying the object when it
is no longer necessary. If an object that owns other objects is destroyed, any
handles to those objects become invalid. Note that types and attributes are
owned by the `MlirContext` in which they were created.
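
The naming rules above can be illustrated with a minimal, self-contained sketch. Everything prefixed with `demo`/`Demo` is hypothetical and only mimics the `Create`, `Owned` and `Destroy` naming forms; it is not part of the MLIR C API, and it uses plain pointers where the real API uses opaque handles. The `demoLiveObjects` counter exists only to make the effect of each call observable.

```c
#include <assert.h>
#include <stdlib.h>

/* Number of live demo objects; used only to observe the ownership rules. */
int demoLiveObjects = 0;

/* Hypothetical stand-ins for a block and the region that contains it. */
typedef struct DemoBlock {
  struct DemoBlock *next;
} DemoBlock;

typedef struct {
  DemoBlock *first;
} DemoRegion;

/* "Create" form: ownership of the result passes to the caller. */
DemoBlock *demoBlockCreate(void) {
  DemoBlock *block = (DemoBlock *)calloc(1, sizeof(DemoBlock));
  assert(block && "allocation failed");
  ++demoLiveObjects;
  return block;
}

/* "Destroy" form: only valid for a block the caller still owns. */
void demoBlockDestroy(DemoBlock *block) {
  free(block);
  --demoLiveObjects;
}

/* "Create" form for the parent object. */
DemoRegion *demoRegionCreate(void) {
  DemoRegion *region = (DemoRegion *)calloc(1, sizeof(DemoRegion));
  assert(region && "allocation failed");
  ++demoLiveObjects;
  return region;
}

/* "Owned" form: the region takes ownership of `block`, so the caller must
   not destroy the block afterwards. */
void demoRegionAppendOwnedBlock(DemoRegion *region, DemoBlock *block) {
  DemoBlock **slot = &region->first;
  while (*slot)
    slot = &(*slot)->next;
  *slot = block;
}

/* Destroying the owner also destroys everything it owns. */
void demoRegionDestroy(DemoRegion *region) {
  DemoBlock *block = region->first;
  while (block) {
    DemoBlock *next = block->next;
    demoBlockDestroy(block);
    block = next;
  }
  free(region);
  --demoLiveObjects;
}
```

In this sketch, a caller that creates a block and appends it to a region only needs to call `demoRegionDestroy`; any pointer to the block it still holds becomes invalid at that point.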

### Nullity

A handle may refer to a _null_ object. It is the responsibility of the caller to
check if an object is null by using `mlirXIsNull(MlirX)`. API functions do _not_
expect null objects as arguments unless explicitly stated otherwise. API
functions _may_ return null objects.
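
As a rough illustration of the caller-side check, the sketch below uses a hypothetical `DemoHandle` type and `demo`-prefixed functions that only follow the shape of `mlirXIsNull`; none of these names exist in the MLIR C API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opaque handle; stands in for an `MlirX` value. */
typedef struct {
  void *ptr;
} DemoHandle;

/* Follows the `mlirXIsNull` convention: non-zero if no object is referenced. */
int demoHandleIsNull(DemoHandle handle) { return handle.ptr == NULL; }

/* Hypothetical API function that may return a null handle on failure. */
DemoHandle demoLookup(int *table, size_t size, size_t index) {
  DemoHandle handle = {index < size ? (void *)&table[index] : NULL};
  return handle;
}

/* Hypothetical API function that does not accept null handles. */
int demoHandleRead(DemoHandle handle) {
  assert(!demoHandleIsNull(handle) && "null handle passed to the API");
  return *(int *)handle.ptr;
}

/* Caller-side pattern: check the returned handle before passing it on. */
int demoReadOr(int *table, size_t size, size_t index, int fallback) {
  DemoHandle handle = demoLookup(table, size, index);
  if (demoHandleIsNull(handle))
    return fallback;
  return demoHandleRead(handle);
}
```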

### Type Hierarchies

MLIR objects can form type hierarchies in C++. For example, all IR classes
representing types are derived from `mlir::Type`, and some of them may also be
derived from common base classes such as `mlir::ShapedType` or dialect-specific
base classes. Type hierarchies are exposed to the C API through naming
conventions as follows.

-   Only the top-level class of each hierarchy is exposed, e.g. `MlirType` is
    defined as a type but `MlirShapedType` is not. This avoids the need for
    explicit upcasting when passing an object of a derived type to a function
    that expects a base type (this happens more often in core/standard APIs,
    while downcasting usually involves further checks anyway).
-   A type `Y` that derives from `X` provides a function `int mlirXIsAY(MlirX)`
    that returns a non-zero value if the given dynamic instance of `X` is also
    an instance of `Y`. For example, `int mlirTypeIsAInteger(MlirType)`.
-   A function that expects a derived type as its first argument takes the base
    type instead and documents the expectation by using `Y` in its name
    `mlirY<...>(MlirX, ...)`. This function asserts that the dynamic instance of
    its first argument is `Y`, and it is the responsibility of the caller to
    ensure it is indeed the case.
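
The convention can be sketched in plain C as follows. `DemoType`, `DemoKind` and the `demo`-prefixed functions are hypothetical stand-ins that only follow the `mlirXIsAY` and `mlirY<...>(MlirX, ...)` naming shapes; they are not MLIR declarations.

```c
#include <assert.h>

/* Hypothetical kinds; stand-ins for the derived classes of one hierarchy. */
typedef enum { DEMO_KIND_INTEGER, DEMO_KIND_FLOAT } DemoKind;

/* Only the top-level type of the hierarchy is exposed (like `MlirType`). */
typedef struct {
  DemoKind kind;
  unsigned width;
} DemoType;

/* Follows the `mlirXIsAY` shape: non-zero if `type` is an integer type. */
int demoTypeIsAInteger(DemoType type) {
  return type.kind == DEMO_KIND_INTEGER;
}

/* Follows the `mlirY<...>(MlirX, ...)` shape: takes the base type and
   asserts that its dynamic kind is the derived one. */
unsigned demoIntegerTypeGetWidth(DemoType type) {
  assert(demoTypeIsAInteger(type) && "expected an integer type");
  return type.width;
}

/* Caller-side pattern: check the kind before calling a derived-type API. */
unsigned demoIntegerWidthOrZero(DemoType type) {
  return demoTypeIsAInteger(type) ? demoIntegerTypeGetWidth(type) : 0;
}
```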

### Auxiliary Types

#### `StringRef`

Numerous MLIR functions return instances of `StringRef` to refer to a non-owning
segment of a string. In the C API, these are represented as instances of the
`MlirStringRef` structure, which contains a pointer to the first character of
the string fragment (`str`) and the fragment length (`length`). Note that the
fragment is _not necessarily_ null-terminated; the `length` field must be used
to identify the last character. `MlirStringRef` is a non-owning pointer: the
caller is in charge of performing the copy or ensuring that the pointee outlives
all uses of `MlirStringRef`.
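
A caller that needs to keep the text beyond the lifetime of the pointee has to copy it. The sketch below shows one way to do that, assuming a local `DemoStringRef` structure that mirrors the layout described above; it is a hypothetical stand-in, not the actual `MlirStringRef` definition, and `demoStringRefToOwned` is likewise an illustrative helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical mirror of the layout described above. */
typedef struct {
  const char *str; /* First character of the fragment. */
  size_t length;   /* Number of characters in the fragment. */
} DemoStringRef;

/* Copies the fragment into a newly allocated, null-terminated buffer.
   The caller owns the result and must release it with free().
   Returns NULL if the allocation fails. */
char *demoStringRefToOwned(DemoStringRef ref) {
  assert((ref.length == 0 || ref.str != NULL) && "invalid fragment");
  char *copy = (char *)malloc(ref.length + 1);
  if (!copy)
    return NULL;
  if (ref.length > 0)
    memcpy(copy, ref.str, ref.length);
  /* The source is not necessarily terminated, so terminate the copy. */
  copy[ref.length] = '\0';
  return copy;
}
```

Only `length` characters are read from `str`, so the helper also works when the fragment is a slice in the middle of a larger string.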

### Printing

IR objects can be printed using `mlirXPrint(MlirX, MlirStringCallback, void *)`
functions. These functions take as arguments a callback with signature `void
(*)(const char *, intptr_t, void *)` and a pointer to user-defined data. They
call the callback and supply it with chunks of the string representation,
provided as a pointer to the first character and a length, and forward the
user-defined data unmodified. It is up to the caller to allocate memory and
perform the copy if the string representation must be stored. There is no
guarantee that the pointer supplied to the callback points to a null-terminated
string; the size argument should be used to find the end of the string. The
callback may be called multiple times with consecutive chunks of the string
representation (the printing itself is buffered).

*Rationale*: this approach allows the caller to have full control of the
allocation and avoid unnecessary allocation and copying inside the printer.
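
As an illustration, a callback with the signature given above can accumulate the chunks into a caller-owned buffer. In this sketch, `DemoBuffer` and `demoAppendChunk` are hypothetical names, and `demoPrint` is a stand-in that plays the role of an `mlirXPrint` function by emitting two chunks; none of them belong to the MLIR C API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable buffer passed to the printer as user data. */
typedef struct {
  char *data;  /* Null-terminated once non-NULL; owned by the caller. */
  size_t size; /* Characters stored, excluding the terminator. */
  int failed;  /* Set if an allocation failed. */
} DemoBuffer;

/* Callback with the signature described above: appends one chunk. */
void demoAppendChunk(const char *chunk, intptr_t length, void *userData) {
  DemoBuffer *buffer = (DemoBuffer *)userData;
  assert(length >= 0 && "chunk length must be non-negative");
  if (buffer->failed)
    return;
  char *grown =
      (char *)realloc(buffer->data, buffer->size + (size_t)length + 1);
  if (!grown) {
    buffer->failed = 1;
    return;
  }
  /* Use the length, not a terminator, to find the end of the chunk. */
  if (length > 0)
    memcpy(grown + buffer->size, chunk, (size_t)length);
  buffer->size += (size_t)length;
  grown[buffer->size] = '\0';
  buffer->data = grown;
}

/* Stand-in for a printing function: emits two consecutive chunks, the
   first of which is deliberately not terminated at its stated length. */
void demoPrint(void (*callback)(const char *, intptr_t, void *),
               void *userData) {
  callback("module {<not part of the chunk>", 8, userData);
  callback("}\n", 2, userData);
}
```

The caller initializes the buffer to `{NULL, 0, 0}`, passes its address as the user data, and releases `data` with `free()` once the text is no longer needed.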

For convenience, `mlirXDump(MlirX)` functions are provided to print the given
object to the standard error stream.

## Common Patterns

The API adopts the following patterns for recurrent functionality in MLIR.

### Indexed Components

An object has an _indexed component_ if it has fields accessible using a
zero-based contiguous integer index, typically arrays. For example, an
`MlirBlock` has its arguments as an indexed component. An object may have
several such components. For example, an `MlirOperation` has attributes,
operands, regions, results and successors.

For indexed components, the following pair of functions is provided.

-   `intptr_t mlirXGetNum<Y>s(MlirX)` returns the number of subobjects, i.e. the
    exclusive upper bound on the index.
-   `MlirY mlirXGet<Y>(MlirX, intptr_t pos)` returns the `pos`-th subobject.

The sizes are accepted and returned as signed pointer-sized integers, i.e.
`intptr_t`. This typedef is available in C99 through `<stdint.h>`.

Note that the name of the subobject in the function does not necessarily match
the type of the subobject. For example, `mlirOperationGetOperand` returns an
`MlirValue`.
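
The pair of functions leads to a conventional counted loop. The sketch below uses a hypothetical `DemoOp` container with `int` operands and `demo`-prefixed accessors that only follow the `mlirXGetNum<Y>s` / `mlirXGet<Y>` shapes; they are not MLIR declarations.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical object with one indexed component (its operands). */
typedef struct {
  const int *operands;
  intptr_t numOperands;
} DemoOp;

/* Follows the `mlirXGetNum<Y>s` shape: number of subobjects. */
intptr_t demoOpGetNumOperands(DemoOp op) { return op.numOperands; }

/* Follows the `mlirXGet<Y>` shape: the `pos`-th subobject. */
int demoOpGetOperand(DemoOp op, intptr_t pos) {
  assert(pos >= 0 && pos < op.numOperands && "operand index out of bounds");
  return op.operands[pos];
}

/* Caller-side pattern: visit every subobject with a zero-based index. */
int demoOpSumOperands(DemoOp op) {
  int sum = 0;
  intptr_t numOperands = demoOpGetNumOperands(op);
  for (intptr_t i = 0; i < numOperands; ++i)
    sum += demoOpGetOperand(op, i);
  return sum;
}
```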

### Iterable Components

An object has an _iterable component_ if it has iterators accessing its fields
in some order other than integer indexing, typically linked lists. For example,
an `MlirBlock` has an iterable list of operations it contains. An object may
have several iterable components.

For iterable components, the following triple of functions is provided.

-   `MlirY mlirXGetFirst<Y>(MlirX)` returns the first subobject in the list.
-   `MlirY mlirYGetNextIn<X>(MlirY)` returns the next subobject in the list that
    contains the given object, or a null object if the given object is the last
    in this list.
-   `int mlirYIsNull(MlirY)` returns 1 if the given object is null.

Note that the name of the subobject in the function may or may not match its
type.

This approach enables one to iterate as follows.

```c++
MlirY iter;
for (iter = mlirXGetFirst<Y>(x); !mlirYIsNull(iter);
     iter = mlirYGetNextIn<X>(iter)) {
  /* Use 'iter'. */
}
```

## Extending the API

### Extensions for Dialect Attributes and Types

Dialect attributes and types can follow the example of builtin attributes and
types, provided that implementations live in separate directories, i.e.
`include/mlir-c/<...>Dialect/` and `lib/CAPI/<...>Dialect/`. The core APIs
provide implementation-private headers in `include/mlir/CAPI/IR` that allow one
to convert between opaque C structures for core IR components and their C++
counterparts. `wrap` converts a C++ class into a C structure and `unwrap` does
the inverse conversion. Once the C++ object is available, the API implementation
should rely on `isa` to implement `mlirXIsAY` and is expected to use `cast`
inside other API calls.

### Extensions for Interfaces

Interfaces can follow the example of IR interfaces and should be placed in the
appropriate library (e.g., common interfaces in `mlir-c/Interfaces` and
dialect-specific interfaces in their dialect library). Similarly to other type
hierarchies, interfaces are not expected to have objects of their own type and
instead operate on top-level objects: `MlirAttribute`, `MlirOperation` and
`MlirType`. Static interface methods are expected to take as leading argument a
canonical identifier of the class, `MlirStringRef` with the name for operations
and `MlirTypeID` for attributes and types, followed by `MlirContext` in which
the interfaces are registered.

Individual interfaces are expected to provide a `mlir<InterfaceName>TypeID()`
function that can be used to check whether an object or a class implements this
interface using `mlir<Attribute/Operation/Type>ImplementsInterface` or
`mlir<Attribute/Operation/Type>ImplementsInterfaceStatic` functions,
respectively. Rationale: C++ `isa` only works when an object exists, and static
methods are usually dispatched to using templates, whereas lookup by `TypeID` in
`MLIRContext` works even without an object.