-rw-r--r-- datapath/Modules.mk 2
-rw-r--r-- datapath/README.md 265
-rw-r--r-- datapath/README.rst 265
-rw-r--r-- lib/dpif.h 2
-rw-r--r-- vtep/README.ovs-vtep.rst 2
5 files changed, 268 insertions, 268 deletions
diff --git a/datapath/Modules.mk b/datapath/Modules.mk
index 8e9a1697a..2ffab2b2f 100644
--- a/datapath/Modules.mk
+++ b/datapath/Modules.mk
@@ -46,7 +46,7 @@ openvswitch_headers = \
vport-netdev.h
openvswitch_extras = \
- README.md
+ README.rst
dist_sources = $(foreach module,$(dist_modules),$($(module)_sources))
dist_headers = $(foreach module,$(dist_modules),$($(module)_headers))
diff --git a/datapath/README.md b/datapath/README.md
deleted file mode 100644
index 8faecc063..000000000
--- a/datapath/README.md
+++ /dev/null
@@ -1,265 +0,0 @@
-Open vSwitch datapath developer documentation
-=============================================
-
-The Open vSwitch kernel module allows flexible userspace control over
-flow-level packet processing on selected network devices. It can be
-used to implement a plain Ethernet switch, network device bonding,
-VLAN processing, network access control, flow-based network control,
-and so on.
-
-The kernel module implements multiple "datapaths" (analogous to
-bridges), each of which can have multiple "vports" (analogous to ports
-within a bridge). Each datapath also has associated with it a "flow
-table" that userspace populates with "flows" that map from keys based
-on packet headers and metadata to sets of actions. The most common
-action forwards the packet to another vport; other actions are also
-implemented.
-
-When a packet arrives on a vport, the kernel module processes it by
-extracting its flow key and looking it up in the flow table. If there
-is a matching flow, it executes the associated actions. If there is
-no match, it queues the packet to userspace for processing (as part of
-its processing, userspace will likely set up a flow to handle further
-packets of the same type entirely in-kernel).
-
-
-Flow key compatibility
-----------------------
-
-Network protocols evolve over time. New protocols become important
-and existing protocols lose their prominence. For the Open vSwitch
-kernel module to remain relevant, it must be possible for newer
-versions to parse additional protocols as part of the flow key. It
-might even be desirable, someday, to drop support for parsing
-protocols that have become obsolete. Therefore, the Netlink interface
-to Open vSwitch is designed to allow carefully written userspace
-applications to work with any version of the flow key, past or future.
-
-To support this forward and backward compatibility, whenever the
-kernel module passes a packet to userspace, it also passes along the
-flow key that it parsed from the packet. Userspace then extracts its
-own notion of a flow key from the packet and compares it against the
-kernel-provided version:
-
- - If userspace's notion of the flow key for the packet matches the
- kernel's, then nothing special is necessary.
-
- - If the kernel's flow key includes more fields than the userspace
- version of the flow key, for example if the kernel decoded IPv6
- headers but userspace stopped at the Ethernet type (because it
- does not understand IPv6), then again nothing special is
- necessary. Userspace can still set up a flow in the usual way,
- as long as it uses the kernel-provided flow key to do it.
-
- - If the userspace flow key includes more fields than the
- kernel's, for example if userspace decoded an IPv6 header but
- the kernel stopped at the Ethernet type, then userspace can
- forward the packet manually, without setting up a flow in the
- kernel. This case is bad for performance because every packet
- that the kernel considers part of the flow must go to userspace,
- but the forwarding behavior is correct. (If userspace can
- determine that the values of the extra fields would not affect
- forwarding behavior, then it could set up a flow anyway.)
-
-How flow keys evolve over time is important to making this work, so
-the following sections go into detail.
-
-
-Flow key format
----------------
-
-A flow key is passed over a Netlink socket as a sequence of Netlink
-attributes. Some attributes represent packet metadata, defined as any
-information about a packet that cannot be extracted from the packet
-itself, e.g. the vport on which the packet was received. Most
-attributes, however, are extracted from headers within the packet,
-e.g. source and destination addresses from Ethernet, IP, or TCP
-headers.
-
-The <linux/openvswitch.h> header file defines the exact format of the
-flow key attributes. For informal explanatory purposes here, we write
-them as comma-separated strings, with parentheses indicating arguments
-and nesting. For example, the following could represent a flow key
-corresponding to a TCP packet that arrived on vport 1:
-
- in_port(1), eth(src=e0:91:f5:21:d0:b2, dst=00:02:e3:0f:80:a4),
- eth_type(0x0800), ipv4(src=172.16.0.20, dst=172.18.0.52, proto=17, tos=0,
- frag=no), tcp(src=49163, dst=80)
-
-Often we ellipsize arguments not important to the discussion, e.g.:
-
- in_port(1), eth(...), eth_type(0x0800), ipv4(...), tcp(...)
-
-
-Wildcarded flow key format
---------------------------
-
-A wildcarded flow is described with two sequences of Netlink attributes
-passed over the Netlink socket. A flow key, exactly as described above, and an
-optional corresponding flow mask.
-
-A wildcarded flow can represent a group of exact match flows. Each '1' bit
-in the mask specifies an exact match with the corresponding bit in the flow key.
-A '0' bit specifies a don't care bit, which will match either a '1' or '0' bit
-of an incoming packet. Using a wildcarded flow can improve the flow set up rate
-by reducing the number of new flows that need to be processed by the user space
-program.
-
-Support for the mask Netlink attribute is optional for both the kernel and user
-space program. The kernel can ignore the mask attribute, installing an exact
-match flow, or reduce the number of don't care bits in the kernel to less than
-what was specified by the user space program. In this case, variations in bits
-that the kernel does not implement will simply result in additional flow setups.
-The kernel module will also work with user space programs that neither support
-nor supply flow mask attributes.
-
-Since the kernel may ignore or modify wildcard bits, it can be difficult for
-the userspace program to know exactly what matches are installed. There are
-two possible approaches: reactively install flows as they miss the kernel
-flow table (and therefore not attempt to determine wildcard changes at all)
-or use the kernel's response messages to determine the installed wildcards.
-
-When interacting with userspace, the kernel should maintain the match portion
-of the key exactly as originally installed. This will provides a handle to
-identify the flow for all future operations. However, when reporting the
-mask of an installed flow, the mask should include any restrictions imposed
-by the kernel.
-
-The behavior when using overlapping wildcarded flows is undefined. It is the
-responsibility of the user space program to ensure that any incoming packet
-can match at most one flow, wildcarded or not. The current implementation
-performs best-effort detection of overlapping wildcarded flows and may reject
-some but not all of them. However, this behavior may change in future versions.
-
-
-Unique flow identifiers
------------------------
-
-An alternative to using the original match portion of a key as the handle for
-flow identification is a unique flow identifier, or "UFID". UFIDs are optional
-for both the kernel and user space program.
-
-User space programs that support UFID are expected to provide it during flow
-setup in addition to the flow, then refer to the flow using the UFID for all
-future operations. The kernel is not required to index flows by the original
-flow key if a UFID is specified.
-
-
-Basic rule for evolving flow keys
----------------------------------
-
-Some care is needed to really maintain forward and backward
-compatibility for applications that follow the rules listed under
-"Flow key compatibility" above.
-
-The basic rule is obvious:
-
- ------------------------------------------------------------------
- New network protocol support must only supplement existing flow
- key attributes. It must not change the meaning of already defined
- flow key attributes.
- ------------------------------------------------------------------
-
-This rule does have less-obvious consequences so it is worth working
-through a few examples. Suppose, for example, that the kernel module
-did not already implement VLAN parsing. Instead, it just interpreted
-the 802.1Q TPID (0x8100) as the Ethertype then stopped parsing the
-packet. The flow key for any packet with an 802.1Q header would look
-essentially like this, ignoring metadata:
-
- eth(...), eth_type(0x8100)
-
-Naively, to add VLAN support, it makes sense to add a new "vlan" flow
-key attribute to contain the VLAN tag, then continue to decode the
-encapsulated headers beyond the VLAN tag using the existing field
-definitions. With this change, a TCP packet in VLAN 10 would have a
-flow key much like this:
-
- eth(...), vlan(vid=10, pcp=0), eth_type(0x0800), ip(proto=6, ...), tcp(...)
-
-But this change would negatively affect a userspace application that
-has not been updated to understand the new "vlan" flow key attribute.
-The application could, following the flow compatibility rules above,
-ignore the "vlan" attribute that it does not understand and therefore
-assume that the flow contained IP packets. This is a bad assumption
-(the flow only contains IP packets if one parses and skips over the
-802.1Q header) and it could cause the application's behavior to change
-across kernel versions even though it follows the compatibility rules.
-
-The solution is to use a set of nested attributes. This is, for
-example, why 802.1Q support uses nested attributes. A TCP packet in
-VLAN 10 is actually expressed as:
-
- eth(...), eth_type(0x8100), vlan(vid=10, pcp=0), encap(eth_type(0x0800),
- ip(proto=6, ...), tcp(...)))
-
-Notice how the "eth_type", "ip", and "tcp" flow key attributes are
-nested inside the "encap" attribute. Thus, an application that does
-not understand the "vlan" key will not see either of those attributes
-and therefore will not misinterpret them. (Also, the outer eth_type
-is still 0x8100, not changed to 0x0800.)
-
-Handling malformed packets
---------------------------
-
-Don't drop packets in the kernel for malformed protocol headers, bad
-checksums, etc. This would prevent userspace from implementing a
-simple Ethernet switch that forwards every packet.
-
-Instead, in such a case, include an attribute with "empty" content.
-It doesn't matter if the empty content could be valid protocol values,
-as long as those values are rarely seen in practice, because userspace
-can always forward all packets with those values to userspace and
-handle them individually.
-
-For example, consider a packet that contains an IP header that
-indicates protocol 6 for TCP, but which is truncated just after the IP
-header, so that the TCP header is missing. The flow key for this
-packet would include a tcp attribute with all-zero src and dst, like
-this:
-
- eth(...), eth_type(0x0800), ip(proto=6, ...), tcp(src=0, dst=0)
-
-As another example, consider a packet with an Ethernet type of 0x8100,
-indicating that a VLAN TCI should follow, but which is truncated just
-after the Ethernet type. The flow key for this packet would include
-an all-zero-bits vlan and an empty encap attribute, like this:
-
- eth(...), eth_type(0x8100), vlan(0), encap()
-
-Unlike a TCP packet with source and destination ports 0, an
-all-zero-bits VLAN TCI is not that rare, so the CFI bit (aka
-VLAN_TAG_PRESENT inside the kernel) is ordinarily set in a vlan
-attribute expressly to allow this situation to be distinguished.
-Thus, the flow key in this second example unambiguously indicates a
-missing or malformed VLAN TCI.
-
-Other rules
------------
-
-The other rules for flow keys are much less subtle:
-
- - Duplicate attributes are not allowed at a given nesting level.
-
- - Ordering of attributes is not significant.
-
- - When the kernel sends a given flow key to userspace, it always
- composes it the same way. This allows userspace to hash and
- compare entire flow keys that it may not be able to fully
- interpret.
-
-
-Coding rules
-============
-
-Compatibility
--------------
-
-Please implement the headers and codes for compatibility with older kernel
-in linux/compat/ directory. All public functions should be exported using
-EXPORT_SYMBOL macro. Public function replacing the same-named kernel
-function should be prefixed with 'rpl_'. Otherwise, the function should be
-prefixed with 'ovs_'. For special case when it is not possible to follow
-this rule (e.g., the pskb_expand_head() function), the function name must
-be added to linux/compat/build-aux/export-check-whitelist, otherwise, the
-compilation check 'check-export-symbol' will fail.
diff --git a/datapath/README.rst b/datapath/README.rst
new file mode 100644
index 000000000..47e0e23e9
--- /dev/null
+++ b/datapath/README.rst
@@ -0,0 +1,265 @@
+..
+ Licensed under the Apache License, Version 2.0 (the "License"); you may
+ not use this file except in compliance with the License. You may obtain
+ a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ License for the specific language governing permissions and limitations
+ under the License.
+
+ Convention for heading levels in Open vSwitch documentation:
+
+ ======= Heading 0 (reserved for the title in a document)
+ ------- Heading 1
+ ~~~~~~~ Heading 2
+ +++++++ Heading 3
+ ''''''' Heading 4
+
+ Avoid deeper levels because they do not render well.
+
+=======================================
+Open vSwitch Datapath Development Guide
+=======================================
+
+The Open vSwitch kernel module allows flexible userspace control over
+flow-level packet processing on selected network devices. It can be used to
+implement a plain Ethernet switch, network device bonding, VLAN processing,
+network access control, flow-based network control, and so on.
+
+The kernel module implements multiple "datapaths" (analogous to bridges), each
+of which can have multiple "vports" (analogous to ports within a bridge). Each
+datapath also has associated with it a "flow table" that userspace populates
+with "flows" that map from keys based on packet headers and metadata to sets of
+actions. The most common action forwards the packet to another vport; other
+actions are also implemented.
+
+When a packet arrives on a vport, the kernel module processes it by extracting
+its flow key and looking it up in the flow table. If there is a matching flow,
+it executes the associated actions. If there is no match, it queues the packet
+to userspace for processing (as part of its processing, userspace will likely
+set up a flow to handle further packets of the same type entirely in-kernel).
+
+Flow Key Compatibility
+----------------------
+
+Network protocols evolve over time. New protocols become important and
+existing protocols lose their prominence. For the Open vSwitch kernel module
+to remain relevant, it must be possible for newer versions to parse additional
+protocols as part of the flow key. It might even be desirable, someday, to
+drop support for parsing protocols that have become obsolete. Therefore, the
+Netlink interface to Open vSwitch is designed to allow carefully written
+userspace applications to work with any version of the flow key, past or
+future.
+
+To support this forward and backward compatibility, whenever the kernel module
+passes a packet to userspace, it also passes along the flow key that it parsed
+from the packet. Userspace then extracts its own notion of a flow key from the
+packet and compares it against the kernel-provided version:
+
+- If userspace's notion of the flow key for the packet matches the kernel's,
+ then nothing special is necessary.
+
+- If the kernel's flow key includes more fields than the userspace version of
+ the flow key, for example if the kernel decoded IPv6 headers but userspace
+ stopped at the Ethernet type (because it does not understand IPv6), then
+ again nothing special is necessary. Userspace can still set up a flow in the
+ usual way, as long as it uses the kernel-provided flow key to do it.
+
+- If the userspace flow key includes more fields than the kernel's, for example
+ if userspace decoded an IPv6 header but the kernel stopped at the Ethernet
+ type, then userspace can forward the packet manually, without setting up a
+ flow in the kernel. This case is bad for performance because every packet
+ that the kernel considers part of the flow must go to userspace, but the
+ forwarding behavior is correct. (If userspace can determine that the values
+ of the extra fields would not affect forwarding behavior, then it could set
+ up a flow anyway.)
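The three cases above can be sketched as a small decision function. This is only an illustration, not Open vSwitch code: it assumes a flow key is reduced to the set of attribute names each side parsed, and the names ``choose_action``, ``install_flow_with_kernel_key`` and ``forward_in_userspace`` are hypothetical.

```python
def choose_action(kernel_fields, user_fields):
    """Pick a strategy from the attribute names parsed by each side.

    Both arguments are iterables of attribute names (an assumed, simplified
    stand-in for real Netlink flow keys).
    """
    kernel_fields = set(kernel_fields)
    user_fields = set(user_fields)
    if user_fields <= kernel_fields:
        # Same fields, or the kernel parsed more: set up a flow as usual,
        # using the kernel-provided flow key.
        return "install_flow_with_kernel_key"
    # Userspace parsed fields the kernel did not: forward this packet
    # manually without installing a kernel flow.
    return "forward_in_userspace"


print(choose_action(["eth", "eth_type", "ipv6"], ["eth", "eth_type"]))
print(choose_action(["eth", "eth_type"], ["eth", "eth_type", "ipv6"]))
```

The sketch leaves out the parenthetical exception: if userspace can show that the extra fields do not affect forwarding, it could still install a flow.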
+
+How flow keys evolve over time is important to making this work, so
+the following sections go into detail.
+
+Flow Key Format
+---------------
+
+A flow key is passed over a Netlink socket as a sequence of Netlink attributes.
+Some attributes represent packet metadata, defined as any information about a
+packet that cannot be extracted from the packet itself, e.g. the vport on which
+the packet was received. Most attributes, however, are extracted from headers
+within the packet, e.g. source and destination addresses from Ethernet, IP, or
+TCP headers.
+
+The ``<linux/openvswitch.h>`` header file defines the exact format of the flow
+key attributes. For informal explanatory purposes here, we write them as
+comma-separated strings, with parentheses indicating arguments and nesting.
+For example, the following could represent a flow key corresponding to a TCP
+packet that arrived on vport 1::
+
+ in_port(1), eth(src=e0:91:f5:21:d0:b2, dst=00:02:e3:0f:80:a4),
+ eth_type(0x0800), ipv4(src=172.16.0.20, dst=172.18.0.52, proto=6, tos=0,
+ frag=no), tcp(src=49163, dst=80)
+
+Often we ellipsize arguments not important to the discussion, e.g.::
+
+ in_port(1), eth(...), eth_type(0x0800), ipv4(...), tcp(...)
+
+Wildcarded Flow Key Format
+--------------------------
+
+A wildcarded flow is described with two sequences of Netlink attributes passed
+over the Netlink socket: a flow key, exactly as described above, and an
+optional corresponding flow mask.
+
+A wildcarded flow can represent a group of exact match flows. Each ``1`` bit
+in the mask specifies an exact match with the corresponding bit in the flow key.
+A ``0`` bit specifies a don't care bit, which will match either a ``1`` or
+``0`` bit of an incoming packet. Using a wildcarded flow can improve the flow
+set up rate by reducing the number of new flows that need to be processed by
+the user space program.
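The mask semantics described above amount to a bitwise comparison. The following is a minimal sketch that treats one field (an IPv4 address) as an integer; the function name ``matches`` and the sample addresses are illustrative assumptions, not part of the Open vSwitch API.

```python
import ipaddress


def matches(packet_bits, key_bits, mask_bits):
    """Return True if the packet agrees with the key on every '1' mask bit."""
    # XOR leaves a 1 wherever packet and key differ; AND with the mask
    # discards differences in "don't care" positions.
    return ((packet_bits ^ key_bits) & mask_bits) == 0


def ip(text):
    return int(ipaddress.IPv4Address(text))


key = ip("172.16.0.20")
mask = ip("255.255.255.0")  # exact match on the first 24 bits only

print(matches(ip("172.16.0.99"), key, mask))  # True: differs only in wildcarded bits
print(matches(ip("172.16.1.20"), key, mask))  # False: differs in an exact-match bit
```

With this mask, a single wildcarded flow covers all 256 addresses that would otherwise each need an exact-match flow.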
+
+Support for the mask Netlink attribute is optional for both the kernel and user
+space program. The kernel can ignore the mask attribute, installing an exact
+match flow, or reduce the number of don't care bits in the kernel to less than
+what was specified by the user space program. In this case, variations in bits
+that the kernel does not implement will simply result in additional flow
+setups. The kernel module will also work with user space programs that neither
+support nor supply flow mask attributes.
+
+Since the kernel may ignore or modify wildcard bits, it can be difficult for
+the userspace program to know exactly what matches are installed. There are two
+possible approaches: reactively install flows as they miss the kernel flow
+table (and therefore not attempt to determine wildcard changes at all) or use
+the kernel's response messages to determine the installed wildcards.
+
+When interacting with userspace, the kernel should maintain the match portion
+of the key exactly as originally installed. This will provide a handle to
+identify the flow for all future operations. However, when reporting the mask
+of an installed flow, the mask should include any restrictions imposed by the
+kernel.
+
+The behavior when using overlapping wildcarded flows is undefined. It is the
+responsibility of the user space program to ensure that any incoming packet can
+match at most one flow, wildcarded or not. The current implementation performs
+best-effort detection of overlapping wildcarded flows and may reject some but
+not all of them. However, this behavior may change in future versions.
+
+Unique Flow Identifiers
+-----------------------
+
+An alternative to using the original match portion of a key as the handle for
+flow identification is a unique flow identifier, or "UFID". UFIDs are optional
+for both the kernel and user space program.
+
+User space programs that support UFID are expected to provide it during flow
+setup in addition to the flow, then refer to the flow using the UFID for all
+future operations. The kernel is not required to index flows by the original
+flow key if a UFID is specified.
+
+Basic Rule for Evolving Flow Keys
+---------------------------------
+
+Some care is needed to really maintain forward and backward compatibility for
+applications that follow the rules listed under "Flow Key Compatibility" above.
+
+The basic rule is obvious:
+
+ New network protocol support must only supplement existing flow key
+ attributes. It must not change the meaning of already defined flow key
+ attributes.
+
+This rule does have less-obvious consequences so it is worth working through a
+few examples. Suppose, for example, that the kernel module did not already
+implement VLAN parsing. Instead, it just interpreted the 802.1Q TPID
+(``0x8100``) as the Ethertype then stopped parsing the packet. The flow key
+for any packet with an 802.1Q header would look essentially like this, ignoring
+metadata::
+
+ eth(...), eth_type(0x8100)
+
+Naively, to add VLAN support, it makes sense to add a new "vlan" flow key
+attribute to contain the VLAN tag, then continue to decode the encapsulated
+headers beyond the VLAN tag using the existing field definitions. With this
+change, a TCP packet in VLAN 10 would have a flow key much like this::
+
+ eth(...), vlan(vid=10, pcp=0), eth_type(0x0800), ip(proto=6, ...), tcp(...)
+
+But this change would negatively affect a userspace application that has not
+been updated to understand the new "vlan" flow key attribute. The application
+could, following the flow compatibility rules above, ignore the "vlan"
+attribute that it does not understand and therefore assume that the flow
+contained IP packets. This is a bad assumption (the flow only contains IP
+packets if one parses and skips over the 802.1Q header) and it could cause the
+application's behavior to change across kernel versions even though it follows
+the compatibility rules.
+
+The solution is to use a set of nested attributes. This is, for example, why
+802.1Q support uses nested attributes. A TCP packet in VLAN 10 is actually
+expressed as::
+
+ eth(...), eth_type(0x8100), vlan(vid=10, pcp=0), encap(eth_type(0x0800),
+ ip(proto=6, ...), tcp(...)))
+
+Notice how the ``eth_type``, ``ip``, and ``tcp`` flow key attributes are nested
+inside the ``encap`` attribute. Thus, an application that does not understand
+the ``vlan`` key will not see any of those attributes and therefore will not
+misinterpret them. (Also, the outer ``eth_type`` is still ``0x8100``, not
+changed to ``0x0800``.)
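The difference between the naive and nested layouts can be illustrated with a toy model. It assumes a flow key is a list of ``(name, value)`` pairs and that an older application simply skips top-level attributes it does not recognize; ``KNOWN`` and ``visible_attrs`` are hypothetical names, not real parsing code.

```python
# Attributes understood by an application written before VLAN support existed.
KNOWN = {"eth", "eth_type", "ip", "tcp"}


def visible_attrs(flow_key, known=KNOWN):
    """Return the top-level attributes an old application would act on."""
    return [(name, value) for name, value in flow_key if name in known]


# Naive layout: inner headers sit at the top level next to "vlan".
naive = [("eth", "..."), ("vlan", {"vid": 10, "pcp": 0}),
         ("eth_type", 0x0800), ("ip", {"proto": 6}), ("tcp", "...")]

# Nested layout: inner headers are hidden inside "encap".
nested = [("eth", "..."), ("eth_type", 0x8100), ("vlan", {"vid": 10, "pcp": 0}),
          ("encap", [("eth_type", 0x0800), ("ip", {"proto": 6}), ("tcp", "...")])]

# With the naive layout the old application sees a plain IP/TCP flow.
print([name for name, _ in visible_attrs(naive)])
# With the nested layout it sees only the Ethernet header and type 0x8100.
print([name for name, _ in visible_attrs(nested)])
```

In the nested case the old application behaves exactly as it did before the kernel learned to parse VLANs, which is the property the basic rule is meant to protect.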
+
+Handling Malformed Packets
+--------------------------
+
+Don't drop packets in the kernel for malformed protocol headers, bad checksums,
+etc. This would prevent userspace from implementing a simple Ethernet switch
+that forwards every packet.
+
+Instead, in such a case, include an attribute with "empty" content. It doesn't
+matter if the empty content could be valid protocol values, as long as those
+values are rarely seen in practice, because userspace can always have all
+packets with those values sent to userspace and handle them individually.
+
+For example, consider a packet that contains an IP header that indicates
+protocol 6 for TCP, but which is truncated just after the IP header, so that
+the TCP header is missing. The flow key for this packet would include a tcp
+attribute with all-zero ``src`` and ``dst``, like this::
+
+ eth(...), eth_type(0x0800), ip(proto=6, ...), tcp(src=0, dst=0)
+
+As another example, consider a packet with an Ethernet type of 0x8100,
+indicating that a VLAN TCI should follow, but which is truncated just after the
+Ethernet type. The flow key for this packet would include an all-zero-bits
+vlan and an empty encap attribute, like this::
+
+ eth(...), eth_type(0x8100), vlan(0), encap()
+
+Unlike a TCP packet with source and destination ports 0, an all-zero-bits VLAN
+TCI is not that rare, so the CFI bit (aka VLAN_TAG_PRESENT inside the kernel)
+is ordinarily set in a vlan attribute expressly to allow this situation to be
+distinguished. Thus, the flow key in this second example unambiguously
+indicates a missing or malformed VLAN TCI.
+
+Other Rules
+-----------
+
+The other rules for flow keys are much less subtle:
+
+- Duplicate attributes are not allowed at a given nesting level.
+
+- Ordering of attributes is not significant.
+
+- When the kernel sends a given flow key to userspace, it always composes it
+ the same way. This allows userspace to hash and compare entire flow keys
+ that it may not be able to fully interpret.
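Because the kernel always composes a given flow key the same way, userspace can treat the serialized key as an opaque byte string. A minimal sketch of that idea follows; the byte values are arbitrary placeholders rather than real Netlink encodings, and ``remember`` and ``lookup`` are hypothetical helpers.

```python
# Table indexed by the raw, uninterpreted flow key bytes.
flow_table = {}


def remember(raw_key, info):
    """Store per-flow state under the kernel-provided key bytes."""
    flow_table[bytes(raw_key)] = info


def lookup(raw_key):
    """Find state for a flow by byte-for-byte comparison of its key."""
    return flow_table.get(bytes(raw_key))


key_a = b"\x01\x02\x03\x04"  # placeholder for a serialized flow key
key_b = b"\x01\x02\x03\x05"  # a different flow

remember(key_a, {"packets": 1})
print(lookup(key_a))
print(lookup(key_b))
```

This works even when the key contains attributes the application cannot interpret, since hashing and equality never look inside the bytes.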
+
+Coding Rules
+------------
+
+Implement the headers and code for compatibility with older kernels in the
+``linux/compat/`` directory. All public functions should be exported using the
+``EXPORT_SYMBOL`` macro. A public function replacing the same-named kernel
+function should be prefixed with ``rpl_``. Otherwise, the function should be
+prefixed with ``ovs_``. For special cases where it is not possible to follow
+this rule (e.g., the ``pskb_expand_head()`` function), the function name must
+be added to ``linux/compat/build-aux/export-check-whitelist``; otherwise, the
+compilation check ``check-export-symbol`` will fail.
diff --git a/lib/dpif.h b/lib/dpif.h
index cade0464d..e69087dee 100644
--- a/lib/dpif.h
+++ b/lib/dpif.h
@@ -113,7 +113,7 @@
*
* In Open vSwitch userspace, "struct flow" is the typical way to describe
* a flow, but the datapath interface uses a different data format to
- * allow ABI forward- and backward-compatibility. datapath/README.md
+ * allow ABI forward- and backward-compatibility. datapath/README.rst
* describes the rationale and design. Refer to OVS_KEY_ATTR_* and
* "struct ovs_key_*" in include/odp-netlink.h for details.
* lib/odp-util.h defines several functions for working with these flows.
diff --git a/vtep/README.ovs-vtep.rst b/vtep/README.ovs-vtep.rst
index 9e9883b65..75f03d0b7 100644
--- a/vtep/README.ovs-vtep.rst
+++ b/vtep/README.ovs-vtep.rst
@@ -154,7 +154,7 @@ using the debian packages as mentioned in step 2 of the "Requirements" section.
6. Start the VTEP emulator. If you installed the components following the
`installation guide <../INSTALL.rst>`__ file, run the following from the
- same directory as this README.md:
+ same directory as this README:
::