..
      Licensed under the Apache License, Version 2.0 (the "License"); you may
      not use this file except in compliance with the License. You may obtain
      a copy of the License at

          http://www.apache.org/licenses/LICENSE-2.0

      Unless required by applicable law or agreed to in writing, software
      distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
      WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
      License for the specific language governing permissions and limitations
      under the License.

      Convention for heading levels in Open vSwitch documentation:

      =======  Heading 0 (reserved for the title in a document)
      -------  Heading 1
      ~~~~~~~  Heading 2
      +++++++  Heading 3
      '''''''  Heading 4

      Avoid deeper levels because they do not render well.

================================================
Porting Open vSwitch to New Software or Hardware
================================================

Open vSwitch (OVS) is intended to be easily ported to new software and hardware
platforms.  This document describes the types of changes that are most likely
to be necessary in porting OVS to Unix-like platforms.  (Porting OVS to other
kinds of platforms is likely to be more difficult.)

Vocabulary
----------

For historical reasons, different words are used for essentially the same
concept in different areas of the Open vSwitch source tree.  Here is a
concordance, indexed by the area of the source tree:

::

    datapath/       vport           ---
    vswitchd/       iface           port
    ofproto/        port            bundle
    ofproto/bond.c  slave           bond
    lib/lacp.c      slave           lacp
    lib/netdev.c    netdev          ---
    database        Interface       Port

Open vSwitch Architectural Overview
-----------------------------------

The following diagram shows the very high-level architecture of Open vSwitch
from a porter's perspective.

::

    +-------------------+
    |    ovs-vswitchd   |<-->ovsdb-server
    +-------------------+
    |      ofproto      |<-->OpenFlow controllers
    +--------+-+--------+
    | netdev | | ofproto|
    +--------+ |provider|
    | netdev | +--------+
    |provider|
    +--------+

Some of the components are generic.  Modulo bugs or inadequacies, these
components should not need to be modified as part of a port:

ovs-vswitchd
  The main Open vSwitch userspace program, in vswitchd/.  It reads the desired
  Open vSwitch configuration from the ovsdb-server program over an IPC channel
  and passes this configuration down to the "ofproto" library.  It also passes
  certain status and statistical information from ofproto back into the
  database.

ofproto
  The Open vSwitch library, in ofproto/, that implements an OpenFlow switch.
  It talks to OpenFlow controllers over the network and to switch hardware or
  software through an "ofproto provider", explained further below.

netdev
  The Open vSwitch library, in lib/netdev.c, that abstracts interacting with
  network devices, that is, Ethernet interfaces.  The netdev library is a thin
  layer over "netdev provider" code, explained further below.

The other components may need attention during a port.  You will almost
certainly have to implement a "netdev provider".  Depending on the type of port
you are doing and the desired performance, you may also have to implement an
"ofproto provider" or a lower-level component called a "dpif" provider.

The following sections talk about these components in more detail.

Writing a netdev Provider
-------------------------

A "netdev provider" implements an operating-system-specific and
hardware-specific interface to "network devices", e.g. eth0 on Linux.  Open
vSwitch must be able to open each port on a switch as a netdev, so you will
need to implement a "netdev provider" that works with your switch hardware and
software.

``struct netdev_class``, in ``lib/netdev-provider.h``, defines the interfaces
required to implement a netdev.  That structure contains many function
pointers, each of which has a comment that is meant to describe its behavior in
detail.  If the requirements are unclear, report this as a bug.

The netdev interface can be divided into a few rough categories:

- Functions required to properly implement OpenFlow features.  For example,
  OpenFlow requires the ability to report the Ethernet hardware address of a
  port.  These functions must be implemented for minimally correct operation.

- Functions required to implement optional Open vSwitch features.  For example,
  the Open vSwitch support for in-band control requires netdev support for
  inspecting the TCP/IP stack's ARP table.  These functions must be implemented
  if the corresponding OVS features are to work, but may be omitted initially.

- Functions needed in some implementations but not in others.  For example,
  most kinds of ports (see below) do not need functionality to receive packets
  from a network device.
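
The categories above can be pictured with a much-reduced provider table.  The
sketch below is not the real ``struct netdev_class``: ``toy_netdev_class``, its
two members, and every ``toy_`` function are hypothetical names chosen for
illustration.  It assumes a platform whose ports cannot receive packets, so the
required function is filled in and the optional one is left NULL.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, much-reduced stand-in for a provider table.  The real
 * interface is struct netdev_class in lib/netdev-provider.h. */
struct toy_netdev_class {
    const char *type;
    /* Required: OpenFlow must be able to report a port's MAC address. */
    int (*get_etheraddr)(const char *name, uint8_t mac[6]);
    /* Needed only by some implementations, so it may be NULL. */
    int (*rx_recv)(const char *name, void *buf, size_t size);
};

/* Provider function for an imaginary platform: every device reports the same
 * locally administered address. */
static int
toy_get_etheraddr(const char *name, uint8_t mac[6])
{
    static const uint8_t fixed[6] = { 0x02, 0x00, 0x00, 0x00, 0x00, 0x01 };

    (void) name;
    memcpy(mac, fixed, sizeof fixed);
    return 0;
}

static const struct toy_netdev_class toy_class = {
    .type = "toy",
    .get_etheraddr = toy_get_etheraddr,
    .rx_recv = NULL,            /* This kind of port cannot receive. */
};

/* Generic layer: thin wrappers that dispatch to the provider and return
 * EOPNOTSUPP when an optional function is absent. */
static int
toy_netdev_get_etheraddr(const struct toy_netdev_class *cls,
                         const char *name, uint8_t mac[6])
{
    return cls->get_etheraddr(name, mac);
}

static int
toy_netdev_rx_recv(const struct toy_netdev_class *cls, const char *name,
                   void *buf, size_t size)
{
    return cls->rx_recv ? cls->rx_recv(name, buf, size) : EOPNOTSUPP;
}
```

Callers go only through the ``toy_netdev_*`` wrappers, which is why the generic
layer can stay unchanged while a port supplies a new table.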

The existing netdev implementations may serve as useful examples during a port:

- lib/netdev-linux.c implements netdev functionality for Linux network devices,
  using Linux kernel calls.  It may be a good place to start for full-featured
  netdev implementations.

- lib/netdev-vport.c provides support for "virtual ports" implemented by the
  Open vSwitch datapath module for the Linux kernel.  This may serve as a model
  for minimal netdev implementations.

- lib/netdev-dummy.c is a fake netdev implementation useful only for testing.

.. _porting strategies:

Porting Strategies
------------------

After a netdev provider has been implemented for a system's network devices,
you may choose among three basic porting strategies.

.. TODO(stephenfin): Update the link to the installation guide when this is
   moved

The lowest-effort strategy is to use the "userspace switch" implementation
built into Open vSwitch.  This ought to work, without writing any more code, as
long as the netdev provider that you implemented supports receiving packets.
It yields poor performance, however, because every packet passes through the
ovs-vswitchd process.  See the `userspace installation guide` for instructions
on how to configure a userspace switch.

If the userspace switch is not the right choice for your port, then you will
have to write more code.  You may implement either an "ofproto provider" or a
"dpif provider".  Which you should choose depends on a few different factors:

* Only an ofproto provider can take full advantage of hardware with built-in
  support for wildcards (e.g. an ACL table or a TCAM).

* A dpif provider can take advantage of the Open vSwitch built-in
  implementations of bonding, LACP, 802.1ag, 802.1Q VLANs, and other features.
  An ofproto provider has to provide its own implementations, if the hardware
  can support them at all.

* A dpif provider is usually easier to implement, but most appropriate for
  software switching.  It "explodes" wildcard rules into exact-match entries
  (with an optional wildcard mask).  This allows fast hash lookups in software,
  but makes inefficient use of TCAMs in hardware that support wildcarding.

The following sections describe how to implement each kind of port.

ofproto Providers
-----------------

An "ofproto provider" is what ofproto uses to directly monitor and control an
OpenFlow-capable switch.  ``struct ofproto_class``, in
``ofproto/ofproto-provider.h``, defines the interfaces to implement an ofproto
provider for new hardware or software.  That structure contains many function
pointers, each of which has a comment that is meant to describe its behavior in
detail.  If the requirements are unclear, report this as a bug.

The ofproto provider interface is preliminary.  Let us know if it seems
unsuitable for your purpose.  We will try to improve it.

Writing a dpif Provider
-----------------------

Open vSwitch has a built-in ofproto provider named "ofproto-dpif", which is
built on top of a library for manipulating datapaths, called "dpif".  A
"datapath" is a simple flow table, one that is only required to support
exact-match flows, that is, flows without wildcards.  When a packet arrives on
a network device, the datapath looks for it in this table.  If there is a
match, then it performs the associated actions.  If there is no match, the
datapath passes the packet up to ofproto-dpif, which maintains the full
OpenFlow flow table.  If the packet matches in this flow table, then
ofproto-dpif executes its actions and inserts a new entry into the dpif flow
table.  (Otherwise, ofproto-dpif passes the packet up to ofproto to send the
packet to the OpenFlow controller, if one is configured.)
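
The lookup sequence just described can be sketched in a few lines.  This is an
illustration only, not code from ``lib/dpif.c`` or ``ofproto/``: the two-field
``toy_key``, the fixed-size table, and the single hard-coded rule in
``toy_slow_path`` are all assumptions made to keep the example short.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet summary; real flow keys carry many more fields. */
struct toy_key {
    uint32_t in_port;
    uint32_t ip_dst;
};

struct toy_entry {
    struct toy_key key;
    int out_port;
};

/* The "datapath": a small exact-match table (linear scan for brevity). */
#define TOY_MAX_FLOWS 16
static struct toy_entry toy_flows[TOY_MAX_FLOWS];
static size_t toy_n_flows;
static unsigned int toy_n_misses;

static bool
toy_key_equal(const struct toy_key *a, const struct toy_key *b)
{
    return a->in_port == b->in_port && a->ip_dst == b->ip_dst;
}

/* The "slow path", standing in for ofproto-dpif and its OpenFlow table: one
 * wildcarded rule sends everything arriving on port 1 to port 2.  Returns
 * the output port, or -1 if no rule matches. */
static int
toy_slow_path(const struct toy_key *key)
{
    return key->in_port == 1 ? 2 : -1;
}

/* Handles one packet.  Returns the output port, or -1 if neither table
 * matched (the case in which a controller would be consulted). */
static int
toy_receive(const struct toy_key *key)
{
    for (size_t i = 0; i < toy_n_flows; i++) {
        if (toy_key_equal(&toy_flows[i].key, key)) {
            return toy_flows[i].out_port;       /* Datapath hit. */
        }
    }

    /* Datapath miss: ask the slow path and cache an exact-match entry. */
    toy_n_misses++;
    int out_port = toy_slow_path(key);
    if (out_port >= 0 && toy_n_flows < TOY_MAX_FLOWS) {
        toy_flows[toy_n_flows].key = *key;
        toy_flows[toy_n_flows].out_port = out_port;
        toy_n_flows++;
    }
    return out_port;
}
```

In this sketch two packets that arrive on port 1 with different ``ip_dst``
values each cause a miss, even though one rule covers both, because the cached
entries match on every field.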

When calculating the dpif flow, ofproto-dpif generates an exact-match flow that
describes the missed packet.  It makes an effort to figure out what fields can
be wildcarded based on the switch's configuration and OpenFlow flow table.  The
dpif is free to ignore the suggested wildcards and only support the exact-match
entry.  However, if the dpif supports wildcarding, then it can use the masks to
match multiple flows with fewer entries and potentially significantly reduce
the number of flow misses handled by ofproto-dpif.
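
A masked entry can be sketched as a key plus a per-field bit mask.  The
structures and function names below (``mf_key``, ``mf_entry``, and so on) are
hypothetical and far simpler than the real flow and mask representations.  The
sketch assumes that only ``in_port`` influenced the forwarding decision, so
that is the only field the suggested mask keeps.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical two-field flow key. */
struct mf_key {
    uint32_t in_port;
    uint32_t ip_dst;
};

/* A datapath entry with a mask: a 1-bit in the mask means the corresponding
 * key bit must match, and a 0-bit means that bit is wildcarded. */
struct mf_entry {
    struct mf_key key;
    struct mf_key mask;
    int out_port;
};

static bool
mf_matches(const struct mf_entry *e, const struct mf_key *pkt)
{
    return !((pkt->in_port ^ e->key.in_port) & e->mask.in_port)
        && !((pkt->ip_dst ^ e->key.ip_dst) & e->mask.ip_dst);
}

/* Entry that uses the suggested wildcards after a miss on 'pkt', assuming
 * the slow path examined only in_port while making its decision. */
static struct mf_entry
mf_masked_entry(const struct mf_key *pkt, int out_port)
{
    struct mf_entry e;

    e.key = *pkt;
    e.mask.in_port = UINT32_MAX;    /* Examined: must match exactly. */
    e.mask.ip_dst = 0;              /* Never examined: wildcarded. */
    e.out_port = out_port;
    return e;
}

/* Entry installed by a dpif that ignores the suggested wildcards. */
static struct mf_entry
mf_exact_entry(const struct mf_key *pkt, int out_port)
{
    struct mf_entry e;

    e.key = *pkt;
    e.mask.in_port = UINT32_MAX;
    e.mask.ip_dst = UINT32_MAX;
    e.out_port = out_port;
    return e;
}
```

With the masked entry, a later packet on the same port but with a different
``ip_dst`` hits in the datapath.  With the exact entry it misses and goes back
to ofproto-dpif.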

The "dpif" library in turn delegates much of its functionality to a "dpif
provider".  The following diagram shows how dpif providers fit into the Open
vSwitch architecture:

::


    Architecture

               _
              |   +-------------------+
              |   |    ovs-vswitchd   |<-->ovsdb-server
              |   +-------------------+
              |   |      ofproto      |<-->OpenFlow controllers
              |   +--------+-+--------+  _
              |   | netdev | |ofproto-|   |
    userspace |   +--------+ |  dpif  |   |
              |   | netdev | +--------+   |
              |   |provider| |  dpif  |   |
              |   +---||---+ +--------+   |
              |       ||     |  dpif  |   | implementation of
              |       ||     |provider|   | ofproto provider
              |_      ||     +---||---+   |
                      ||         ||       |
               _  +---||-----+---||---+   |
              |   |          |datapath|   |
       kernel |   |          +--------+  _|
              |   |                   |
              |_  +--------||---------+
                           ||
                        physical
                           NIC

``struct dpif_class``, in ``lib/dpif-provider.h``, defines the interfaces
required to implement a dpif provider for new hardware or software.  That
structure contains many function pointers, each of which has a comment that is
meant to describe its behavior in detail.  If the requirements are unclear,
report this as a bug.

There are two existing dpif implementations that may serve as useful examples
during a port:

* lib/dpif-netlink.c is a Linux-specific dpif implementation that talks to an
  Open vSwitch-specific kernel module (whose sources are in the "datapath"
  directory).  The kernel module performs all of the switching work, passing
  packets that do not match any flow table entry up to userspace.  This dpif
  implementation is essentially a wrapper around calls into the kernel module.

* lib/dpif-netdev.c is a generic dpif implementation that performs all
  switching internally.  This is how the Open vSwitch userspace switch is
  implemented.

Miscellaneous Notes
-------------------

Open vSwitch source code uses ``uint16_t``, ``uint32_t``, and ``uint64_t`` as
fixed-width types in host byte order, and ``ovs_be16``, ``ovs_be32``, and
``ovs_be64`` as fixed-width types in network byte order.  Each of the latter is
equivalent to the type of the same width among the former, but the difference
in name makes the intended use obvious.
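
The naming convention can be illustrated without the OVS headers.  In the
sketch below, ``example_be16`` is a hypothetical stand-in for ``ovs_be16``, and
the structure and accessor names are likewise invented for the example.  The
only library calls are the standard ``htons`` and ``ntohs``.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Hypothetical stand-in for ovs_be16: the same representation as uint16_t,
 * but the name records that the value is in network byte order. */
typedef uint16_t example_be16;

/* Wire-format fragment: the member types say "network byte order". */
struct example_udp_ports {
    example_be16 src;
    example_be16 dst;
};

/* Accessors convert at the boundary, so callers only ever see host byte
 * order values in plain uint16_t. */
static uint16_t
example_udp_dst_port(const struct example_udp_ports *ports)
{
    return ntohs(ports->dst);
}

static void
example_udp_set_dst_port(struct example_udp_ports *ports, uint16_t port)
{
    ports->dst = htons(port);
}
```

Because the two types are interchangeable to the compiler, the distinct name
documents intent for the reader and for static checkers rather than changing
the generated code.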

The default "fail-mode" for Open vSwitch bridges is "standalone", meaning that,
when the OpenFlow controllers cannot be contacted, Open vSwitch acts as a
regular MAC-learning switch.  This works well in virtualization environments
where there is normally just one uplink (either a single physical interface or
a bond).  In a more general environment, it can create loops.  So, if you are
porting to a general-purpose switch platform, you should consider changing the
default "fail-mode" to "secure", which does not behave this way.  See
documentation for the "fail-mode" column in the Bridge table in
ovs-vswitchd.conf.db(5) for more information.

``lib/entropy.c`` assumes that it can obtain high-quality random number seeds
at startup by reading from /dev/urandom.  You will need to modify it if this is
not true on your platform.
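
As a rough picture of what such a modification involves, the sketch below
reads seed bytes from a caller-supplied path.  It is not the code in
``lib/entropy.c``; ``example_read_seed`` is a hypothetical helper, and a port
to a platform without /dev/urandom would substitute whatever entropy source
that platform provides.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not taken from lib/entropy.c.  Fills 'buf' with 'n'
 * bytes read from 'path'.  Returns 0 on success, otherwise a positive errno
 * value. */
static int
example_read_seed(const char *path, void *buf, size_t n)
{
    FILE *stream = fopen(path, "rb");
    if (!stream) {
        return errno ? errno : EIO;
    }

    /* Unbuffered, so that no more than 'n' bytes are drawn from the source. */
    setvbuf(stream, NULL, _IONBF, 0);

    size_t n_read = fread(buf, 1, n, stream);
    fclose(stream);
    return n_read == n ? 0 : EIO;
}
```

On a Linux-like system the caller would pass ``"/dev/urandom"`` as the path.
Returning an error instead of silently falling back leaves the choice of
fallback to the platform-specific code.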

``vswitchd/system-stats.c`` only knows how to obtain some statistics on Linux.
Optionally you may implement them for your platform as well.

Why OVS Does Not Support Hybrid Providers
-----------------------------------------

The `porting strategies`_ section above describes the "ofproto provider" and
"dpif provider" porting strategies.  Only an ofproto provider can take
advantage of hardware TCAM support, and only a dpif provider can take advantage
of the OVS built-in implementations of various features.  It is therefore
tempting to suggest a hybrid approach that shares the advantages of both
strategies.

However, Open vSwitch does not support a hybrid approach.  Doing so may be
possible, with a significant amount of extra development work, but it does not
yet seem worthwhile, for the reasons explained below.

First, user surprise is likely when a switch supports a feature only with a
high performance penalty.  For example, one user questioned why adding a
particular OpenFlow action to a flow caused a 1,058x slowdown on a hardware
OpenFlow implementation [1]_.  The action required the flow to be implemented in
software.

Given that implementing a flow in software on the slow management CPU of a
hardware switch causes a major slowdown, software-implemented flows would only
make sense for very low-volume traffic.  But many of the features built into
the OVS software switch implementation would need to apply to every flow to be
useful.  There is no value, for example, in applying bonding or 802.1Q VLAN
support only to low-volume traffic.

Besides supporting features of OpenFlow actions, a hybrid approach could also
support forms of matching not supported by particular switching hardware, by
sending all packets that might match a rule to software.  But again this can
cause an unacceptable slowdown by forcing bulk traffic through software in the
hardware switch's slow management CPU.  Consider, for example, a hardware
switch that can match on the IPv6 Ethernet type but not on fields in IPv6
headers.  An OpenFlow table that matched on the IPv6 Ethernet type would
perform well, but adding a rule that matched only UDPv6 would force every IPv6
packet to software, slowing down not just UDPv6 but all IPv6 processing.

.. [1] Aaron Rosen, "Modify packet fields extremely slow",
    openflow-discuss mailing list, June 26, 2011, archived at
    https://mailman.stanford.edu/pipermail/openflow-discuss/2011-June/002386.html.

Questions
---------

Direct porting questions to dev@openvswitch.org.  We will try to use questions
to improve this porting guide.