Render tree
===========

For historical reasons, librsvg's code flow during rendering is as
follows.  The rendering code traverses the SVG tree of elements, and
for each one, its ``::draw()`` method is called; its signature looks
like this (some arguments omitted):

.. code-block:: rust

    pub fn draw(
        &self,
        ...
        draw_ctx: &mut DrawingCtx,
    ) -> Result<BoundingBox, RenderingError> { ... }

The ``draw()`` methods perform the actual rendering as side effects on the
``draw_ctx``, and return a ``BoundingBox``.  That is, the bounding box of
an element is computed at the same time that it is rendered.  This is
suboptimal for several reasons:

- Many things that happen during rendering depend on knowing the
  bounding box.  For example, gradients, patterns, and filters with
  units set to ``objectBoundingBox`` need to know the bounds.  The
  rendering code in ``drawing_ctx.rs`` is cluttered because it must
  resolve bounding boxes very late.

- This is especially problematic for filters, since a Cairo surface
  needs to be created *before* rendering, and that surface should have
  a size relative to the bounding box of the element being filtered!
  `Bug #1 <https://gitlab.gnome.org/GNOME/librsvg/-/issues/1>`_ is
  precisely about this: librsvg instead creates a temporary surface as
  big as the document's toplevel viewport and filters it, but this
  doesn't work well for filters like Gaussian blur that should
  actually reference pixels outside of the document's area (think of a
  shape that extends past the document's area, which then gets
  blurred).

- The way for an element to signal that it is not drawable
  (e.g. ``<defs>``) is by returning an empty bounding box and not
  rendering anything.  This is awkward.

- When rendering to a temporary surface for filtering or masking,
  there is a set of affine transformations that needs to be maintained
  carefully: an affine for the clipping path outside the temporary
  surface, an affine for drawing inside the surface, an affine to
  composite the surface into the final result.  This is hard to
  understand and hard to test.

These problems can be solved by having a **render tree**.

What is a render tree?
----------------------

As of 2022/Oct/06, librsvg does not compute a render tree data
structure prior to rendering.  Instead, in a very 2000s fashion, it
walks the tree of elements and calls a ``.draw()`` method for each
one.  Each element then calls whatever methods it needs from
``DrawingCtx`` to draw itself.  Elements which don't produce graphical
output (e.g. ``<defs>`` or ``<marker>``) simply have an empty
``draw()`` method.

Over time we have been refactoring that in the direction of actually
being able to produce a render tree.  What would that look like?
Consider an SVG document like this:

.. code-block:: xml
   
   <svg xmlns="http://www.w3.org/2000/svg" width="100" height="100">
     <defs>
       <rect id="TheRect" x="10" y="10" width="20" height="20" fill="blue"/>
     </defs>
   
     <g>
       <use href="#TheRect" stroke="red" stroke-width="2"/>
   
       <circle cx="50" cy="50" r="20" fill="yellow"/>
     </g>
   </svg>

A render tree would be a list of nested instructions like this:

::

   group {                            # refers to the toplevel SVG
     width: 100
     height: 100
     establishes_viewport: true       # because it is an <svg> element

     children {
       group {                        # refers to the <g>
         establishes_viewport: false  # because it is a simple <g>

         children {
           shape {
             path="the <rect> above but resolved to path commands"
    
             # note how the following is the cascaded style and the <use> semantics
             fill: blue
             stroke: red
             stroke-width: 2
           }
    
           shape {
             path="the <circle> above but resolved to path commands"
    
             fill: yellow
           }
         }
       }
     }
   }

That is, we take the high-level SVG instructions and "lower" them to a
few possible drawing primitives like path-based shapes that can be
grouped.  All the primitives have everything that is needed to draw
them, like their set of computed values for styles, and their
coordinates resolved to their user-space coordinate system.
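
As a rough illustration, the tree above could be modeled with a couple
of Rust types.  This is only a sketch: ``RenderNode``, ``Paint``,
``count_shapes``, and ``example_tree`` are hypothetical names that do
not correspond to librsvg's actual types, and the toplevel
``width``/``height`` are omitted for brevity.

.. code-block:: rust

    /// Hypothetical paint value; paint servers (gradients, patterns) are omitted.
    #[derive(Debug, Clone, PartialEq)]
    pub enum Paint {
        None,
        Color(String),
    }

    /// Hypothetical render tree node: a group of children or a path-based shape.
    #[derive(Debug, Clone, PartialEq)]
    pub enum RenderNode {
        Group {
            establishes_viewport: bool,
            children: Vec<RenderNode>,
        },
        Shape {
            path: String, // stand-in for already-resolved path commands
            fill: Paint,
            stroke: Paint,
            stroke_width: f64,
        },
    }

    /// Counts the shapes that would be drawn for a subtree.
    pub fn count_shapes(node: &RenderNode) -> usize {
        match node {
            RenderNode::Shape { .. } => 1,
            RenderNode::Group { children, .. } => children.iter().map(count_shapes).sum(),
        }
    }

    /// Builds the render tree for the example document above by hand.
    pub fn example_tree() -> RenderNode {
        let rect = RenderNode::Shape {
            path: "<rect> resolved to path commands".to_string(),
            fill: Paint::Color("blue".to_string()),
            stroke: Paint::Color("red".to_string()),
            stroke_width: 2.0,
        };
        let circle = RenderNode::Shape {
            path: "<circle> resolved to path commands".to_string(),
            fill: Paint::Color("yellow".to_string()),
            stroke: Paint::None,
            stroke_width: 1.0, // the initial value of stroke-width
        };
        RenderNode::Group {
            establishes_viewport: true, // the toplevel <svg>
            children: vec![RenderNode::Group {
                establishes_viewport: false, // the plain <g>
                children: vec![rect, circle],
            }],
        }
    }

Note that ``<defs>`` and ``<use>`` do not appear in the result at all:
the referenced ``<rect>`` has already been expanded in place with its
cascaded style.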

Browser engines produce render trees more or less similar to the above
(they don't always call them that), and get various benefits:

- The various recursively-nested subtrees can be rendered concurrently.

- Having low-level primitives makes it easier to switch to another
  rendering engine in the future.

- The tree can be re-rendered without recomputation, or subtrees can
  be recomputed efficiently if e.g. an animated element changes a few
  of its properties.

Why did librsvg not do that since the beginning?
------------------------------------------------

Librsvg was originally written in the early 2000s, when several things
were happening at the same time:

- libxml2 (one of the early widely-available parsers for XML) had
  recently gotten a SAX API for parsing XML.  This lets an application
  stream in the parsed XML elements and process them one by one,
  without having to build a tree of elements+attributes first.  In
  those days, memory was at a premium and "not producing a tree" was
  seen as beneficial.

- The SVG spec itself was being written, and it did not have all of
  the features we know now.  In particular, maybe at some point it
  didn't have elements that worked by referencing others, like
  ``<use>`` or ``<filter>``.  The CSS cascade could be done on the fly
  for the XML elements being streamed in, and one could emit rendering
  commands for each element to produce the final result.

That is, at that time, it was indeed feasible to do this: stream in
parsed XML elements one by one as produced by libxml2, and for each
element, compute its CSS cascade and render it.

This scheme probably stopped working at some point when SVG got
features that allowed referencing elements that have not been declared
yet (think of ``<use href="#foo"/>`` with the
``<defs> <path id="foo" .../> </defs>`` not declared until later in the
document), or elements that referenced others, like
``<rect filter="url(#blah)">``.
In both cases, one needs to actually build an in-memory tree of parsed
elements, and *then* resolve the references between them.

That is where much of the complexity of librsvg's code flow comes from:

- ``AcquiredNodes`` is the thing that resolves references when needed.
  It also detects reference cycles, which are an error.

- ``ComputedValues`` often do not get resolved until pretty late, by
  passing the ``CascadedValues`` state down to children as they are drawn.

- ``DrawingCtx`` was originally a giant ball of mutable state, but we
  have been whittling it down and moving part of that state elsewhere.
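
To make the idea of resolving references over an in-memory tree more
concrete, here is a minimal sketch of following a chain of references
while detecting cycles.  It is not how ``AcquiredNodes`` is
implemented; ``Document``, ``AcquireError``, and ``resolve`` are
hypothetical names, and the sketch assumes each element references at
most one other element.

.. code-block:: rust

    use std::collections::HashMap;

    #[derive(Debug, PartialEq)]
    pub enum AcquireError {
        NotFound(String),
        Cycle(String),
    }

    /// Hypothetical stand-in for a parsed document: each element id maps to
    /// the id it references (e.g. through `href`), if any.
    pub struct Document {
        references: HashMap<String, Option<String>>,
    }

    impl Document {
        pub fn new(pairs: &[(&str, Option<&str>)]) -> Self {
            let references: HashMap<String, Option<String>> = pairs
                .iter()
                .map(|&(id, target)| (id.to_string(), target.map(str::to_string)))
                .collect();
            Document { references }
        }

        /// Follows the chain of references starting at `id` and returns the id
        /// of the final element, which references nothing else.
        pub fn resolve(&self, id: &str) -> Result<String, AcquireError> {
            let mut visited: Vec<&str> = Vec::new();
            let mut current = id;
            loop {
                // Seeing the same id twice means the references form a cycle.
                if visited.contains(&current) {
                    return Err(AcquireError::Cycle(current.to_string()));
                }
                visited.push(current);
                match self.references.get(current) {
                    None => return Err(AcquireError::NotFound(current.to_string())),
                    Some(None) => return Ok(current.to_string()),
                    Some(Some(next)) => current = next.as_str(),
                }
            }
        }
    }

Because the whole document is in memory before ``resolve`` runs, it
does not matter whether the referenced element was declared before or
after the element that references it.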


Summary of the SVG rendering model
----------------------------------

In the SVG2 spec, the rendering model has been offloaded to the
"`Order of graphical operations
<https://www.w3.org/TR/compositing/#compositingandblendingorder>`_"
section of the Compositing and Blending Level 1 spec.  Once the render
tree is resolved, each node is painted like this, conceptually to a
transparent, temporary surface:

- Paint the shape/text/etc.
- Apply filters.
- Apply clip paths.
- Apply masks.
- Blend/composite the temporary surface onto the result.

The most critical function in librsvg is probably
`DrawingCtx::with_discrete_layer
<https://gnome.pages.gitlab.gnome.org/librsvg/internals/librsvg/drawing_ctx/struct.DrawingCtx.html#method.with_discrete_layer>`_;
it implements this drawing model.
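
The ordering above can be sketched as a function that lists the steps
needed for one layer.  This only illustrates the order of operations;
it is not what ``with_discrete_layer`` does internally, and ``Step``,
``LayerProps``, and ``steps_for_layer`` are hypothetical names.

.. code-block:: rust

    #[derive(Debug, Clone, Copy, PartialEq)]
    pub enum Step {
        Paint,
        Filter,
        Clip,
        Mask,
        Composite,
    }

    /// Hypothetical summary of which effects a layer has.
    #[derive(Debug, Default)]
    pub struct LayerProps {
        pub has_filter: bool,
        pub has_clip_path: bool,
        pub has_mask: bool,
    }

    /// Returns the steps for one layer, in the order of graphical operations.
    pub fn steps_for_layer(props: &LayerProps) -> Vec<Step> {
        // Painting always comes first, onto a conceptually transparent surface.
        let mut steps = vec![Step::Paint];
        if props.has_filter {
            steps.push(Step::Filter);
        }
        if props.has_clip_path {
            steps.push(Step::Clip);
        }
        if props.has_mask {
            steps.push(Step::Mask);
        }
        // The result is always blended/composited onto the parent last.
        steps.push(Step::Composite);
        steps
    }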

Current state
-------------

``layout.rs`` has the beginnings of the render tree (the file is
probably misnamed).  It contains this:

- A primitive for path-based shapes.

- A primitive for text.

- A `stacking context
  <https://www.w3.org/TR/SVG2/render.html#EstablishingStackingContex>`_,
  which indicates each layer's opacity/clip/mask/filters.

- Various ancillary structures that try to have only user-space
  coordinates (e.g. a number of CSS pixels instead of ``5cm``) and no
  references to other things.

The last point is not yet fully realized.  For example,
``StackingContext.clip_in_user_space`` has a reference to an element,
which will be used as the clip path; that one needs to be normalized
to user-space coordinates in the end.  Also,
``StackingContext.filter`` is a filter list as parsed from the SVG,
not a ``FilterSpec`` that has been resolved to user space.
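
As an example of what "only user-space coordinates" means for a simple
value, the sketch below converts a length with an absolute unit into a
plain number.  ``Unit``, ``Length``, and ``to_user_units`` are
hypothetical names, not librsvg's own length types; the sketch assumes
the resolution is passed in as a ``dpi`` argument, and it leaves out
font-relative units and percentages, which need more context to
resolve.

.. code-block:: rust

    #[derive(Debug, Clone, Copy, PartialEq)]
    pub enum Unit {
        Px,
        In,
        Cm,
        Mm,
        Pt,
        Pc,
    }

    /// Hypothetical length as parsed from an attribute, e.g. `5cm`.
    #[derive(Debug, Clone, Copy, PartialEq)]
    pub struct Length {
        pub value: f64,
        pub unit: Unit,
    }

    impl Length {
        /// Converts the length to user-space units for a given resolution.
        pub fn to_user_units(&self, dpi: f64) -> f64 {
            match self.unit {
                Unit::Px => self.value,
                Unit::In => self.value * dpi,
                Unit::Cm => self.value * dpi / 2.54, // 2.54 cm per inch
                Unit::Mm => self.value * dpi / 25.4, // 25.4 mm per inch
                Unit::Pt => self.value * dpi / 72.0, // 72 points per inch
                Unit::Pc => self.value * dpi / 6.0,  // 6 picas per inch
            }
        }
    }

At 96 DPI, ``5cm`` comes out as roughly 188.98 user-space units.  Once
every length in a primitive has been normalized like this, the code
that draws the primitive no longer needs to know about units at all.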

It would be good to resolve everything as early as possible to allow
lowering concepts to their final renderable form.  Whenever we have
done this via refactoring, it has simplified the code closer to the
actual rendering via Cairo.

Major subprojects
-----------------

Path-based shapes (``layout::Shape``) and text primitives
(``layout::Text``) are almost done.  The only missing thing for shapes
would be to "explode" their markers into the actual primitives that
would be rendered for them.  However...

There is no primitive for groups yet.  Every SVG element that allows
renderable children must produce a group primitive of some sort:
``svg``, ``g``, ``use``, ``marker``, etc.  Among those, ``use`` and
``marker`` are especially interesting since they must explode their
referenced subtree into a shadow DOM, which librsvg doesn't support
yet for CSS cascading purposes (the referenced subtree gets rendered
properly, but the full semantics of shadow DOM are not implemented
yet).

Elements that establish a viewport (``svg``, ``symbol``, ``image``,
``marker``, ``pattern``) need to carry information about this
viewport, which is a ``viewBox`` plus ``preserveAspectRatio``.  See
`#298 <https://gitlab.gnome.org/GNOME/librsvg/-/issues/298>`_ for a
somewhat obsolete description of the refactoring work needed to unify
this logic.
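
For reference, this is roughly the computation that a viewport's
``viewBox`` plus ``preserveAspectRatio`` implies.  The sketch handles
only the default ``xMidYMid meet`` mode and uses hypothetical names
(``Rect``, ``ViewportTransform``, ``fit_view_box``); it is not
librsvg's actual code.

.. code-block:: rust

    #[derive(Debug, Clone, Copy, PartialEq)]
    pub struct Rect {
        pub x: f64,
        pub y: f64,
        pub width: f64,
        pub height: f64,
    }

    /// Uniform scale followed by a translation: p' = p * scale + translate.
    #[derive(Debug, Clone, Copy, PartialEq)]
    pub struct ViewportTransform {
        pub scale: f64,
        pub translate_x: f64,
        pub translate_y: f64,
    }

    /// Fits `view_box` inside `viewport` as `xMidYMid meet` would: scale
    /// uniformly so the whole viewBox is visible, then center it.
    pub fn fit_view_box(viewport: Rect, view_box: Rect) -> Option<ViewportTransform> {
        if view_box.width <= 0.0 || view_box.height <= 0.0 {
            return None; // nothing sensible to fit
        }
        let scale_x = viewport.width / view_box.width;
        let scale_y = viewport.height / view_box.height;
        let scale = scale_x.min(scale_y); // "meet": use the smaller scale
        let translate_x =
            viewport.x + (viewport.width - view_box.width * scale) / 2.0 - view_box.x * scale;
        let translate_y =
            viewport.y + (viewport.height - view_box.height * scale) / 2.0 - view_box.y * scale;
        Some(ViewportTransform { scale, translate_x, translate_y })
    }

A group primitive that establishes a viewport could carry the result
of such a computation instead of the unresolved attributes.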

The ``layout::StackingContext`` struct should contain another field,
probably called ``layer``, with something like this:

.. code-block:: rust

   struct StackingContext {
       // ... all its current fields

       layer: Layer,
   }

   enum Layer {
       Shape(Box<Shape>),
       Text(Box<Text>),
       StackingContext(Box<StackingContext>),
   }

That is, every stacking context should contain the thing that it will
draw, and that thing may be a shape/text or another stacking context!

Bounding boxes
--------------

SVG depends on the ``objectBoundingBox`` of an element in many places:
to resolve a gradient's or pattern's units, to determine the size of
masks and clips, to determine the size of the filter region.

The current big bug to solve is `#778
<https://gitlab.gnome.org/GNOME/librsvg/-/issues/778>`_, which requires
knowing the ``objectBoundingBox`` of an element **before** rendering
it, so that a temporary surface of the appropriate size can be created
for rendering the element if it has isolated opacity or masks/filters.
Currently librsvg creates a temporary surface with the size and
position of the toplevel viewport, and this is wrong for shapes that
fall outside the viewport.

The problem is that librsvg computes bounding boxes at the time of
rendering, not before that.  However, ``layout::Shape`` and
``layout::Text`` now know their bounding box beforehand.  Work
needs to be done to do the same for a ``layout::Group``, or whatever
that primitive ends up being called, by taking the union of its
children's bounding boxes.  That way, for example, a group with a
filter can create a temporary surface big enough to render all of its
children and then filter the surface.
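
Computing a group's bounding box as the union of its children's boxes
could look like the following sketch.  ``Rect``, ``Node``, and
``bounding_box`` are hypothetical names, transforms and stroke widths
are ignored, and an empty group is assumed to have no bounding box at
all.

.. code-block:: rust

    /// Axis-aligned rectangle given by its two corners.
    #[derive(Debug, Clone, Copy, PartialEq)]
    pub struct Rect {
        pub x0: f64,
        pub y0: f64,
        pub x1: f64,
        pub y1: f64,
    }

    impl Rect {
        /// Smallest rectangle that contains both `self` and `other`.
        pub fn union(&self, other: &Rect) -> Rect {
            Rect {
                x0: self.x0.min(other.x0),
                y0: self.y0.min(other.y0),
                x1: self.x1.max(other.x1),
                y1: self.y1.max(other.y1),
            }
        }
    }

    /// Hypothetical render tree node; shapes already know their bounds.
    pub enum Node {
        Shape(Rect),
        Group(Vec<Node>),
    }

    /// Computes a node's bounding box without rendering anything.
    /// Returns `None` for groups that contain no shapes.
    pub fn bounding_box(node: &Node) -> Option<Rect> {
        match node {
            Node::Shape(rect) => Some(*rect),
            Node::Group(children) => children
                .iter()
                .filter_map(bounding_box)
                .reduce(|a, b| a.union(&b)),
        }
    }

For the example document earlier, the ``<rect>`` spans (10, 10) to
(30, 30) and the ``<circle>`` spans (30, 30) to (70, 70), so the
enclosing ``<g>`` would get a box from (10, 10) to (70, 70) if strokes
are ignored.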

Being able to compute the ``objectBoundingBox`` of an element before
rendering it would open the door to fixing bug `#1
<https://gitlab.gnome.org/GNOME/librsvg/-/issues/1>`_ (yeah, really):
currently, the temporary surface used for filtering has the size of
the toplevel viewport, but this doesn't work well when one tries to
Gaussian-blur an element that lies partially outside that viewport.
The filter should apply to the element's extents plus the filter
region, which takes into account the extra space needed for a Gaussian
blur to work around a shape.  Since librsvg cannot render the full
shape if it lies partially outside of the toplevel viewport, the
blurred result shows up with a halo near the image's edge, since
transparent pixels get "blurred in" with the shape's pixels.
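
To show why the viewport-sized surface is not enough, the sketch below
computes the filter region for an element whose ``<filter>`` leaves
``x``, ``y``, ``width``, and ``height`` at their defaults (-10%, -10%,
120%, 120%, in ``objectBoundingBox`` units), and checks whether that
region fits inside the viewport.  ``Rect``, ``default_filter_region``,
and ``contains`` are hypothetical names.

.. code-block:: rust

    #[derive(Debug, Clone, Copy, PartialEq)]
    pub struct Rect {
        pub x: f64,
        pub y: f64,
        pub width: f64,
        pub height: f64,
    }

    /// Filter region for a filter with default x/y/width/height and
    /// filterUnits="objectBoundingBox": the bounding box grown by 10% of
    /// its size on every side.
    pub fn default_filter_region(bbox: Rect) -> Rect {
        Rect {
            x: bbox.x - 0.1 * bbox.width,
            y: bbox.y - 0.1 * bbox.height,
            width: 1.2 * bbox.width,
            height: 1.2 * bbox.height,
        }
    }

    /// Returns true if `inner` lies entirely within `outer`.
    pub fn contains(outer: Rect, inner: Rect) -> bool {
        inner.x >= outer.x
            && inner.y >= outer.y
            && inner.x + inner.width <= outer.x + outer.width
            && inner.y + inner.height <= outer.y + outer.height
    }

For a 100×100 viewport and a shape whose bounding box starts at
x = -20, the filter region starts at a negative x as well, so
``contains(viewport, region)`` is false: a surface the size of the
viewport cannot hold all the pixels the blur needs.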