@c Copyright (c) 2004, 2005, 2007, 2008 Free Software Foundation, Inc.
@c Free Software Foundation, Inc.
@c This is part of the GCC manual.
@c For copying conditions, see the file gcc.texi.

@c ---------------------------------------------------------------------
@c GENERIC
@c ---------------------------------------------------------------------

@node GENERIC
@chapter GENERIC
@cindex GENERIC

The purpose of GENERIC is simply to provide a
language-independent way of representing an entire function in
trees.  To this end, it was necessary to add a few new tree codes
to the back end, but almost everything was already there.  If you
can express it with the codes in @code{gcc/tree.def}, it's
GENERIC@.

Early on, there was a great deal of debate about how to think
about statements in a tree IL@.  In GENERIC, a statement is
defined as any expression whose value, if any, is ignored.  A
statement will always have @code{TREE_SIDE_EFFECTS} set (or it
will be discarded), but a non-statement expression, such as a
@code{CALL_EXPR}, may also have side effects.

It would be possible for some local optimizations to work on the
GENERIC form of a function; indeed, the adapted tree inliner
works fine on GENERIC, but the current compiler performs inlining
after lowering to GIMPLE (a restricted form described in the next
section).  Currently, the front ends perform this lowering
before handing off to @code{tree_rest_of_compilation}, but this
seems inelegant.

If necessary, a front end can use some language-dependent tree
codes in its GENERIC representation, so long as it provides a
hook for converting them to GIMPLE and doesn't expect them to
work with any (hypothetical) optimizers that run before the
conversion to GIMPLE@.  The intermediate representation used while
parsing C and C++ looks very little like GENERIC, but the C and
C++ gimplifier hooks are perfectly happy to take it as input and
spit out GIMPLE@.

@menu
* Statements::
@end menu

@node Statements
@section Statements
@cindex Statements

Most statements in GIMPLE are assignment statements, represented by
@code{GIMPLE_ASSIGN}.  No other C expressions can appear at statement level;
a reference to a volatile object is converted into a
@code{GIMPLE_ASSIGN}.

There are also several varieties of complex statements.

@menu
* Blocks::
* Statement Sequences::
* Empty Statements::
* Jumps::
* Cleanups::
@end menu

@node Blocks
@subsection Blocks
@cindex Blocks

Block scopes and the variables they declare in GENERIC are
expressed using the @code{BIND_EXPR} code, which in previous
versions of GCC was primarily used for the C statement-expression
extension.

Variables in a block are collected into @code{BIND_EXPR_VARS} in
declaration order.  Any runtime initialization is moved out of
@code{DECL_INITIAL} and into a statement in the controlled block.
When gimplifying from C or C++, this initialization replaces the
@code{DECL_STMT}.

Variable-length arrays (VLAs) complicate this process, as their
size often refers to variables initialized earlier in the block.
To handle this, we currently split the block at that point, and
move the VLA into a new, inner @code{BIND_EXPR}.  This strategy
may change in the future.
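
As a source-level illustration only (a minimal sketch, not taken from
the GCC sources), the following C function declares a VLA whose size
depends on a variable initialized earlier in the same block.  The
function name @code{sum_of_squares} and its arithmetic are
hypothetical.  Under the strategy described above, one would expect
@code{n} to belong to the outer @code{BIND_EXPR} and @code{squares} to
an inner one that begins at its declaration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example function; the name and the arithmetic are
   illustrative only.  */
size_t
sum_of_squares (size_t count)
{
  assert (count > 0);         /* A zero-length VLA is not valid ISO C.  */

  size_t n = count;           /* Initialized before the VLA is declared.  */
  size_t squares[n];          /* VLA: its size refers to n.  */
  size_t total = 0;

  for (size_t i = 0; i < n; i++)
    squares[i] = i * i;
  for (size_t i = 0; i < n; i++)
    total += squares[i];
  return total;
}
```

Calling @code{sum_of_squares (4)} adds 0, 1, 4 and 9.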

A C++ program will usually contain more @code{BIND_EXPR}s than
there are syntactic blocks in the source code, since several C++
constructs have implicit scopes associated with them.  On the
other hand, although the C++ front end uses pseudo-scopes to
handle cleanups for objects with destructors, these don't
translate into the GIMPLE form; multiple declarations at the same
level use the same @code{BIND_EXPR}.

@node Statement Sequences
@subsection Statement Sequences
@cindex Statement Sequences

Multiple statements at the same nesting level are collected into
a @code{STATEMENT_LIST}.  Statement lists are modified and
traversed using the interface in @samp{tree-iterator.h}.

@node Empty Statements
@subsection Empty Statements
@cindex Empty Statements

Whenever possible, statements with no effect are discarded.  But
if they are nested within another construct which cannot be
discarded for some reason, they are instead replaced with an
empty statement, generated by @code{build_empty_stmt}.
Initially, all empty statements were shared, after the pattern of
the Java front end, but this caused a lot of trouble in practice.

An empty statement is represented as @code{(void)0}.

@node Jumps
@subsection Jumps
@cindex Jumps

Other jumps are expressed by either @code{GOTO_EXPR} or
@code{RETURN_EXPR}.

The operand of a @code{GOTO_EXPR} must be either a label or a
variable containing the address to jump to.
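
The two operand forms correspond to two kinds of source-level jump.
The sketch below assumes the GNU C labels-as-values extension
(@code{&&label} and @code{goto *expr}); the function name
@code{classify} is hypothetical.  The first @code{goto} names a label
directly, while the second jumps through a variable that holds a label
address.

```c
#include <assert.h>

/* Hypothetical example; relies on the GNU C labels-as-values
   extension (&&label and goto *expr).  */
int
classify (int value)
{
  void *target;               /* Will hold the address to jump to.  */

  if (value < 0)
    goto negative;            /* Jump whose operand is a label.  */

  target = (value == 0) ? &&zero : &&positive;
  assert (target);            /* A label address is never null.  */
  goto *target;               /* Jump through a variable.  */

 negative:
  return -1;
 zero:
  return 0;
 positive:
  return 1;
}
```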

The operand of a @code{RETURN_EXPR} is either @code{NULL_TREE},
@code{RESULT_DECL}, or a @code{MODIFY_EXPR} which sets the return
value.  It would be nice to move the @code{MODIFY_EXPR} into a
separate statement, but the special return semantics in
@code{expand_return} make that difficult.  It may still happen in
the future, perhaps by moving most of that logic into
@code{expand_assignment}.

@node Cleanups
@subsection Cleanups
@cindex Cleanups

Destructors for local C++ objects and similar dynamic cleanups are
represented in GIMPLE by a @code{TRY_FINALLY_EXPR}.
@code{TRY_FINALLY_EXPR} has two operands, each of which is a sequence
of statements to execute.  The first sequence is executed; when it
completes, the second sequence is executed.

The first sequence may complete in the following ways:

@enumerate

@item Execute the last statement in the sequence and fall off the
end.

@item Execute a goto statement (@code{GOTO_EXPR}) to an ordinary
label outside the sequence.

@item Execute a return statement (@code{RETURN_EXPR}).

@item Throw an exception.  This is currently not explicitly represented in
GIMPLE.

@end enumerate

The second sequence is not executed if the first sequence completes by
calling @code{longjmp} or @code{exit} or any other function that does
not return.  The second sequence is also not executed if the first
sequence completes via a non-local goto or a computed goto (in general
the compiler does not know whether such a goto statement exits the
first sequence or not, so we assume that it doesn't).

After the second sequence is executed, if it completes normally by
falling off the end, execution continues wherever the first sequence
would have continued: past its end, at the target of its goto, with
its return, or with the propagation of its exception.
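
The first three ways of completing listed above can be observed from C
with the GNU @code{cleanup} variable attribute, used here as a
source-level analogue of a guarded sequence followed by its cleanup.
This is a minimal sketch: the names @code{note_cleanup},
@code{cleanups_run} and @code{leave_block} are hypothetical, and the
example does not claim to show the trees that GCC builds for it.

```c
#include <assert.h>

/* Hypothetical counter: how many times the cleanup has run.  */
static int cleanups_run;

/* Cleanup function; GCC passes a pointer to the guarded variable.  */
static void
note_cleanup (int *guard)
{
  assert (guard != 0);
  cleanups_run++;
}

/* Leave the guarded block in one of three ways, selected by HOW:
   0 falls off the end, 1 jumps to a label outside, 2 returns.  */
static int
leave_block (int how)
{
  {
    int guard __attribute__ ((cleanup (note_cleanup))) = 0;

    if (how == 1)
      goto out;               /* Cleanup runs, then control reaches out.  */
    if (how == 2)
      return 2;               /* Cleanup runs, then the function returns.  */
  }                           /* Cleanup runs when falling off the end.  */
  return 0;

 out:
  return 1;
}
```

In each case the cleanup runs exactly once, and execution then
continues where the guarded block would have continued on its own.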

@code{TRY_FINALLY_EXPR} complicates the flow graph, since the cleanup
needs to appear on every edge out of the controlled block; this
reduces the freedom to move code across these edges.  Therefore, the
EH lowering pass, which runs before most of the optimization passes,
eliminates these expressions by explicitly adding the cleanup to each
edge.  Rethrowing the exception is represented using @code{RESX_EXPR}.