summaryrefslogtreecommitdiff
path: root/datapath
diff options
context:
space:
mode:
authorMark Gray <mark.d.gray@redhat.com>2021-07-16 06:17:36 -0400
committerIlya Maximets <i.maximets@ovn.org>2021-07-16 20:05:03 +0200
commitb1e517bd2f818fc7c0cd43ee0b67db4274e6b972 (patch)
tree97b08b9fd7cfce9b81d69c2eb9965fcb6e8c6979 /datapath
parent485e3a13a6ead01df5ae68931b36d9845b8ea61c (diff)
downloadopenvswitch-b1e517bd2f818fc7c0cd43ee0b67db4274e6b972.tar.gz
dpif-netlink: Introduce per-cpu upcall dispatch.
The Open vSwitch kernel module uses the upcall mechanism to send packets from kernel space to user space when it misses in the kernel space flow table. The upcall sends packets via a Netlink socket. Currently, a Netlink socket is created for every vport. In this way, there is a 1:1 mapping between a vport and a Netlink socket. When a packet is received by a vport, if it needs to be sent to user space, it is sent via the corresponding Netlink socket. This mechanism, with various iterations of the corresponding user space code, has seen some limitations and issues: * On systems with a large number of vports, there is correspondingly a large number of Netlink sockets which can limit scaling. (https://bugzilla.redhat.com/show_bug.cgi?id=1526306) * Packet reordering on upcalls. (https://bugzilla.redhat.com/show_bug.cgi?id=1844576) * A thundering herd issue. (https://bugzilla.redhat.com/show_bug.cgi?id=1834444) This patch introduces an alternative, feature-negotiated, upcall mode using a per-cpu dispatch rather than a per-vport dispatch. In this mode, the Netlink socket to be used for the upcall is selected based on the CPU of the thread that is executing the upcall. In this way, it resolves the issues above as: a) The number of Netlink sockets scales with the number of CPUs rather than the number of vports. b) Ordering per-flow is maintained as packets are distributed to CPUs based on mechanisms such as RSS and flows are distributed to a single user space thread. c) Packets from a flow can only wake up one user space thread. Reported-at: https://bugzilla.redhat.com/1844576 Signed-off-by: Mark Gray <mark.d.gray@redhat.com> Acked-by: Flavio Leitner <fbl@sysclose.org> Acked-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Diffstat (limited to 'datapath')
-rw-r--r--datapath/linux/compat/include/linux/openvswitch.h7
1 files changed, 7 insertions, 0 deletions
diff --git a/datapath/linux/compat/include/linux/openvswitch.h b/datapath/linux/compat/include/linux/openvswitch.h
index 875de2025..f0595eeba 100644
--- a/datapath/linux/compat/include/linux/openvswitch.h
+++ b/datapath/linux/compat/include/linux/openvswitch.h
@@ -89,6 +89,8 @@ enum ovs_datapath_cmd {
* set on the datapath port (for OVS_ACTION_ATTR_MISS). Only valid on
* %OVS_DP_CMD_NEW requests. A value of zero indicates that upcalls should
* not be sent.
+ * @OVS_DP_ATTR_PER_CPU_PIDS: Per-cpu array of PIDs for upcalls when
+ * OVS_DP_F_DISPATCH_UPCALL_PER_CPU feature is set.
* @OVS_DP_ATTR_STATS: Statistics about packets that have passed through the
* datapath. Always present in notifications.
* @OVS_DP_ATTR_MEGAFLOW_STATS: Statistics about mega flow masks usage for the
@@ -105,6 +107,8 @@ enum ovs_datapath_attr {
OVS_DP_ATTR_MEGAFLOW_STATS, /* struct ovs_dp_megaflow_stats */
OVS_DP_ATTR_USER_FEATURES, /* OVS_DP_F_* */
OVS_DP_ATTR_PAD,
+ OVS_DP_ATTR_PAD2,
+ OVS_DP_ATTR_PER_CPU_PIDS, /* Netlink PIDS to receive upcalls */
__OVS_DP_ATTR_MAX
};
@@ -146,6 +150,9 @@ struct ovs_vport_stats {
/* Allow tc offload recirc sharing */
#define OVS_DP_F_TC_RECIRC_SHARING (1 << 2)
+/* Allow per-cpu dispatch of upcalls */
+#define OVS_DP_F_DISPATCH_UPCALL_PER_CPU (1 << 3)
+
/* Fixed logical ports. */
#define OVSP_LOCAL ((__u32)0)