| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 2f748bf8016c ("datapath: Derive IP protocol...")
This commit is causing some ipv6 fragmentation errors in some older
kernels. Revert for now and then we can determine how to implement
this patch with appropriate compatability layer changes to prevent
errors on older kernels.
CC: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Upstream commit:
commit fa642f08839bf2ff35b2f6c6a6c062aee8121ba8
Author: Yi-Hung Wei <yihung.wei@gmail.com>
Date: Tue Sep 4 15:33:41 2018 -0700
openvswitch: Derive IP protocol number for IPv6 later frags
Currently, OVS only parses the IP protocol number for the first
IPv6 fragment, but sets the IP protocol number for the later fragments
to be NEXTHDF_FRAGMENT. This patch tries to derive the IP protocol
number for the IPV6 later frags so that we can match that.
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
CC: Yi-Hung Wei <yihung.wei@gmail.com>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Upstream commit:
commit c48e74736fccf25fb32bb015426359e1c2016e3b
Author: Eric Garver <e@erig.me>
Date: Wed Dec 20 15:09:22 2017 -0500
openvswitch: Fix pop_vlan action for double tagged frames
skb_vlan_pop() expects skb->protocol to be a valid TPID for double
tagged frames. So set skb->protocol to the TPID and let skb_vlan_pop()
shift the true ethertype into position for us.
Fixes: 5108bbaddc37 ("openvswitch: add processing of L3 packets")
Signed-off-by: Eric Garver <e@erig.me>
Reviewed-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Cc: Eric Garver <e@erig.me>
Fixes: a27c454ee0 ("datapath: add processing of L3 packets")
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Upstream commit:
commit 311af51dcb5629f04976a8e451673f77e3301041
Author: Arnd Bergmann <arnd@arndb.de>
Date: Mon Nov 27 12:41:38 2017 +0100
openvswitch: use ktime_get_ts64() instead of ktime_get_ts()
timespec is deprecated because of the y2038 overflow, so let's convert
this one to ktime_get_ts64(). The code is already safe even on 32-bit
architectures, since it uses monotonic times. On 64-bit architectures,
nothing changes, while on 32-bit architectures this avoids one
type conversion.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Additional compatability check for ktime_get_ts64() exists or not.
If not, then just continue using ktime_get_ts(). I added a new
compatability header file "timekeeping.h".
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Upstream commit:
commit b2d0f5d5dc53532e6f07bc546a476a55ebdfe0f3
Author: Yi Yang <yi.y.yang@intel.com>
Date: Tue Nov 7 21:07:02 2017 +0800
openvswitch: enable NSH support
OVS master and 2.8 branch has merged NSH userspace
patch series, this patch is to enable NSH support
in kernel data path in order that OVS can support
NSH in compat mode by porting this.
Signed-off-by: Yi Yang <yi.y.yang@intel.com>
Acked-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Eric Garver <e@erig.me>
Acked-by: Pravin Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Yi Yang <yi.y.yang@intel.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Upstream commit:
commit c4b2bf6b4a35348fe6d1eb06928eb68d7b9d99a9
Author: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Date: Mon Jul 17 23:28:06 2017 -0700
openvswitch: Optimize operations for OvS flow_stats.
When calling the flow_free() to free the flow, we call many times
(cpu_possible_mask, eg. 128 as default) cpumask_next(). That will
take up our CPU usage if we call the flow_free() frequently.
When we put all packets to userspace via upcall, and OvS will send
them back via netlink to ovs_packet_cmd_execute(will call flow_free).
The test topo is shown as below. VM01 sends TCP packets to VM02,
and OvS forward packtets. When testing, we use perf to report the
system performance.
VM01 --- OvS-VM --- VM02
Without this patch, perf-top show as below: The flow_free() is
3.02% CPU usage.
4.23% [kernel] [k] _raw_spin_unlock_irqrestore
3.62% [kernel] [k] __do_softirq
3.16% [kernel] [k] __memcpy
3.02% [kernel] [k] flow_free
2.42% libc-2.17.so [.] __memcpy_ssse3_back
2.18% [kernel] [k] copy_user_generic_unrolled
2.17% [kernel] [k] find_next_bit
When applied this patch, perf-top show as below: Not shown on
the list anymore.
4.11% [kernel] [k] _raw_spin_unlock_irqrestore
3.79% [kernel] [k] __do_softirq
3.46% [kernel] [k] __memcpy
2.73% libc-2.17.so [.] __memcpy_ssse3_back
2.25% [kernel] [k] copy_user_generic_unrolled
1.89% libc-2.17.so [.] _int_malloc
1.53% ovs-vswitchd [.] xlate_actions
With this patch, the TCP throughput(we dont use Megaflow Cache
+ Microflow Cache) between VMs is 1.18Gbs/sec up to 1.30Gbs/sec
(maybe ~10% performance improve).
This patch adds cpumask struct, the cpu_used_mask stores the cpu_id
that the flow used. And we only check the flow_stats on the cpu we
used, and it is unncessary to check all possible cpu when getting,
cleaning, and updating the flow_stats. Adding the cpu_used_mask to
sw_flow struct doesât increase the cacheline number.
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Signed-off-by: Andy Zhou <azhou@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Upstream commit:
commit c57c054eb5b1ccf230c49f736f7a018fcbc3e952
Author: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Date: Mon Jul 17 23:28:05 2017 -0700
openvswitch: Optimize updating for OvS flow_stats.
In the ovs_flow_stats_update(), we only use the node
var to alloc flow_stats struct. But this is not a
common case, it is unnecessary to call the numa_node_id()
everytime. This patch is not a bugfix, but there maybe
a small increase.
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Signed-off-by: Andy Zhou <azhou@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Upstream commit:
commit 880388aa3c07fdea4f9b85e35641753017b1852f
Author: David S. Miller <davem@davemloft.net>
Date: Mon Jul 3 07:29:12 2017 -0700
net: Remove all references to SKB_GSO_UDP.
Such packets are no longer possible.
Signed-off-by: David S. Miller <davem@davemloft.net>
SKB_GSO_UDP is removed in the upstream kernel. Use HAVE_SKB_GSO_UDP
define from acinclude to detect if SKB_GSO_UDP exists and if so apply
openvswitch section of this upstream patch.
Signed-off-by: Greg Rose <gvrose8192@gmail.com>
Signed-off-by: Andy Zhou <azhou@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Upstream commit:
commit 6f56f6186c18e3fd54122b73da68e870687b8c59
Author: Yi-Hung Wei <yihung.wei@gmail.com>
Date: Thu Mar 30 12:36:03 2017 -0700
ovs_flow_key_update() is called when the flow key is invalid, and it is
used to update and revalidate the flow key. Commit 329f45bc4f19
("openvswitch: add mac_proto field to the flow key") introduces mac_proto
field to flow key and use it to determine whether the flow key is valid.
However, the commit does not update the code path in ovs_flow_key_update()
to revalidate the flow key which may cause BUG_ON() on execute_recirc().
This patch addresses the aforementioned issue.
Fixes: 329f45bc4f19 ("openvswitch: add mac_proto field to the flow key")
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Acked-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Acked-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Upstream commit:
commit f7d49bce8e741e1e6aa14ce4db1b6cea7e4be4e8
Author: Jiri Benc <jbenc@redhat.com>
Date: Fri Sep 30 19:08:05 2016 +0200
openvswitch: mpls: set network header correctly on key extract
After the 48d2ab609b6b ("net: mpls: Fixups for GSO"), MPLS handling in
openvswitch was changed to have network header pointing to the start of the
MPLS headers and inner_network_header pointing after the MPLS headers.
However, key_extract was missed by the mentioned commit, causing incorrect
headers to be set when a MPLS packet just enters the bridge or after it is
recirculated.
Fixes: 48d2ab609b6b ("net: mpls: Fixups for GSO")
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Upstream commit:
commit 9dd7f8907c3705dc7a7a375d1c6e30b06e6daffc
Author: Jarno Rajahalme <jarno@ovn.org>
Date: Thu Feb 9 11:21:59 2017 -0800
openvswitch: Add original direction conntrack tuple to sw_flow_key.
Add the fields of the conntrack original direction 5-tuple to struct
sw_flow_key. The new fields are initially marked as non-existent, and
are populated whenever a conntrack action is executed and either finds
or generates a conntrack entry. This means that these fields exist
for all packets that were not rejected by conntrack as untrackable.
The original tuple fields in the sw_flow_key are filled from the
original direction tuple of the conntrack entry relating to the
current packet, or from the original direction tuple of the master
conntrack entry, if the current conntrack entry has a master.
Generally, expected connections of connections having an assigned
helper (e.g., FTP), have a master conntrack entry.
The main purpose of the new conntrack original tuple fields is to
allow matching on them for policy decision purposes, with the premise
that the admissibility of tracked connections reply packets (as well
as original direction packets), and both direction packets of any
related connections may be based on ACL rules applying to the master
connection's original direction 5-tuple. This also makes it easier to
make policy decisions when the actual packet headers might have been
transformed by NAT, as the original direction 5-tuple represents the
packet headers before any such transformation.
When using the original direction 5-tuple the admissibility of return
and/or related packets need not be based on the mere existence of a
conntrack entry, allowing separation of admission policy from the
established conntrack state. While existence of a conntrack entry is
required for admission of the return or related packets, policy
changes can render connections that were initially admitted to be
rejected or dropped afterwards. If the admission of the return and
related packets was based on mere conntrack state (e.g., connection
being in an established state), a policy change that would make the
connection rejected or dropped would need to find and delete all
conntrack entries affected by such a change. When using the original
direction 5-tuple matching the affected conntrack entries can be
allowed to time out instead, as the established state of the
connection would not need to be the basis for packet admission any
more.
It should be noted that the directionality of related connections may
be the same or different than that of the master connection, and
neither the original direction 5-tuple nor the conntrack state bits
carry this information. If needed, the directionality of the master
connection can be stored in master's conntrack mark or labels, which
are automatically inherited by the expected related connections.
The fact that neither ARP nor ND packets are trackable by conntrack
allows mutual exclusion between ARP/ND and the new conntrack original
tuple fields. Hence, the IP addresses are overlaid in union with ARP
and ND fields. This allows the sw_flow_key to not grow much due to
this patch, but it also means that we must be careful to never use the
new key fields with ARP or ND packets. ARP is easy to distinguish and
keep mutually exclusive based on the ethernet type, but ND being an
ICMPv6 protocol requires a bit more attention.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Joe Stringer <joe@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch squashes in minimal amount of OVS userspace code to not
break the build. Later patches contain the full userspace support.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Joe Stringer <joe@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Upstream commit:
commit 0a6410fbde597ebcf82dda4a0b0e889e82242678
Author: Jiri Benc <jbenc@redhat.com>
Date: Thu Nov 10 16:28:22 2016 +0100
openvswitch: netlink: support L3 packets
Extend the ovs flow netlink protocol to support L3 packets. Packets without
OVS_KEY_ATTR_ETHERNET attribute specify L3 packets; for those, the
OVS_KEY_ATTR_ETHERTYPE attribute is mandatory.
Push/pop vlan actions are only supported for Ethernet packets.
Based on previous versions by Lorand Jakab and Simon Horman.
Signed-off-by: Lorand Jakab <lojakab@cisco.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream commit:
commit 87e159c59d9f325d571689d4027115617adb32e6
Author: Jarno Rajahalme <jarno@ovn.org>
Date: Mon Dec 19 17:06:33 2016 -0800
openvswitch: Add a missing break statement.
Add a break statement to prevent fall-through from
OVS_KEY_ATTR_ETHERNET to OVS_KEY_ATTR_TUNNEL. Without the break
actions setting ethernet addresses fail to validate with log messages
complaining about invalid tunnel attributes.
Fixes: 0a6410fbde ("openvswitch: netlink: support L3 packets")
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jiri Benc <jbenc@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream commit:
commit df30f7408b187929dbde72661c7f7c615268f1d0
Author: pravin shelar <pshelar@ovn.org>
Date: Mon Dec 26 08:31:27 2016 -0800
openvswitch: upcall: Fix vlan handling.
Networking stack accelerate vlan tag handling by
keeping topmost vlan header in skb. This works as
long as packet remains in OVS datapath. But during
OVS upcall vlan header is pushed on to the packet.
When such packet is sent back to OVS datapath, core
networking stack might not handle it correctly. Following
patch avoids this issue by accelerating the vlan tag
during flow key extract. This simplifies datapath by
bringing uniform packet processing for packets from
all code paths.
Fixes: 5108bbaddc ("openvswitch: add processing of L3 packets").
CC: Jarno Rajahalme <jarno@ovn.org>
CC: Jiri Benc <jbenc@redhat.com>
Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
[Committer Notes]
Squashed in the following upstream commits to retain bisectability:
87e159c59d9f ("openvswitch: Add a missing break statement.")
df30f7408b18 ("openvswitch: upcall: Fix vlan handling.")
Signed-off-by: Yi Yang <yi.y.yang@intel.com>
Signed-off-by: Joe Stringer <joe@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Upstream commit:
commit 5108bbaddc37c1c8583f0cf2562d7d3463cd12cb
Author: Jiri Benc <jbenc@redhat.com>
Date: Thu Nov 10 16:28:21 2016 +0100
openvswitch: add processing of L3 packets
Support receiving, extracting flow key and sending of L3 packets (packets
without an Ethernet header).
Note that even after this patch, non-Ethernet interfaces are still not
allowed to be added to bridges. Similarly, netlink interface for sending and
receiving L3 packets to/from user space is not in place yet.
Based on previous versions by Lorand Jakab and Simon Horman.
Signed-off-by: Lorand Jakab <lojakab@cisco.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Yi Yang <yi.y.yang@intel.com>
Signed-off-by: Joe Stringer <joe@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Upstream commit:
commit 329f45bc4f191c663dc156c510816411a4310578
Author: Jiri Benc <jbenc@redhat.com>
Date: Thu Nov 10 16:28:18 2016 +0100
openvswitch: add mac_proto field to the flow key
Use a hole in the structure. We support only Ethernet so far and will add
a support for L2-less packets shortly. We could use a bool to indicate
whether the Ethernet header is present or not but the approach with the
mac_proto field is more generic and occupies the same number of bytes in the
struct, while allowing later extensibility. It also makes the code in the
next patches more self explaining.
It would be nice to use ARPHRD_ constants but those are u16 which would be
waste. Thus define our own constants.
Another upside of this is that we can overload this new field to also denote
whether the flow key is valid. This has the advantage that on
refragmentation, we don't have to reparse the packet but can rely on the
stored eth.type. This is especially important for the next patches in this
series - instead of adding another branch for L2-less packets before calling
ovs_fragment, we can just remove all those branches completely.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Yi Yang <yi.y.yang@intel.com>
Signed-off-by: Joe Stringer <joe@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Upstream commit:
commit 2279994d07ab67ff7a1d09bfbd65588332dfb6d8
Author: pravin shelar <pshelar@ovn.org>
Date: Mon Sep 19 13:51:00 2016 -0700
openvswitch: avoid resetting flow key while installing new flow.
since commit commit db74a3335e0f6 ("openvswitch: use percpu
flow stats") flow alloc resets flow-key. So there is no need
to reset the flow-key again if OVS is using newly allocated
flow-key.
Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Signed-off-by: Joe Stringer <joe@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Upstream commit:
commit db74a3335e0f645e3139c80bcfc90feb01d8e304
Author: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Date: Thu Sep 15 19:11:53 2016 -0300
openvswitch: use percpu flow stats
Instead of using flow stats per NUMA node, use it per CPU. When using
megaflows, the stats lock can be a bottleneck in scalability.
On a E5-2690 12-core system, usual throughput went from ~4Mpps to
~15Mpps when forwarding between two 40GbE ports with a single flow
configured on the datapath.
This has been tested on a system with possible CPUs 0-7,16-23. After
module removal, there were no corruption on the slab cache.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Cc: pravin shelar <pshelar@ovn.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Signed-off-by: Joe Stringer <joe@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Upstream commit:
commit 40773966ccf1985a1b2bb570a03cbeaf1cbd4e00
Author: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Date: Thu Sep 15 19:11:52 2016 -0300
openvswitch: fix flow stats accounting when node 0 is not possible
On a system with only node 1 as possible, all statistics is going to be
accounted on node 0 as it will have a single writer.
However, when getting and clearing the statistics, node 0 is not going
to be considered, as it's not a possible node.
Tested that statistics are not zero on a system with only node 1
possible. Also compile-tested with CONFIG_NUMA off.
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch contained a memory leak that is fixed in this backport.
The next patch silently fixed that in upstream, too.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Signed-off-by: Joe Stringer <joe@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Upstream commit:
commit 018c1dda5ff1e7bd1fe2d9fd1d0f5b82dc6fc0cd
Author: Eric Garver <e@erig.me>
Date: Wed Sep 7 12:56:59 2016 -0400
openvswitch: 802.1AD Flow handling, actions, vlan parsing, netlink attributes
Add support for 802.1ad including the ability to push and pop double
tagged vlans. Add support for 802.1ad to netlink parsing and flow
conversion. Uses double nested encap attributes to represent double
tagged vlan. Inner TPID encoded along with ctci in nested attributes.
This is based on Thomas F Herbert's original v20 patch. I made some
small clean ups and bug fixes.
Signed-off-by: Thomas F Herbert <thomasfherbert@gmail.com>
Signed-off-by: Eric Garver <e@erig.me>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream commit:
commit 20ecf1e4e30005ad50f561a92c888b6477f99341
Author: Jiri Benc <jbenc@redhat.com>
Date: Mon Oct 10 17:02:42 2016 +0200
openvswitch: vlan: remove wrong likely statement
This code is called whenever flow key is being extracted from the packet.
The packet may be as likely vlan tagged as not.
Fixes: 018c1dda5ff1 ("openvswitch: 802.1AD Flow handling, actions, vlan parsing, netlink attributes")
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Eric Garver <e@erig.me>
Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream commit:
commit 72ec108d701506fa6cd2f66ec5b15ea71df3c464
Author: Jiri Benc <jbenc@redhat.com>
Date: Mon Oct 10 17:02:43 2016 +0200
openvswitch: fix vlan subtraction from packet length
When the packet has its vlan tag in skb->vlan_tci, the length of the VLAN
header is not counted in skb->len. It doesn't make sense to subtract it.
Fixes: 018c1dda5ff1 ("openvswitch: 802.1AD Flow handling, actions, vlan parsing, netlink attributes")
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Eric Garver <e@erig.me>
Signed-off-by: David S. Miller <davem@davemloft.net>
[Committer notes]
The following commits upstream fix bugs in this patch, so to retain
bisectability of the OVS tree they were rolled into this commit:
20ecf1e4e300 openvswitch: vlan: remove wrong likely statement
72ec108d7015 openvswitch: fix vlan subtraction from packet length
Signed-off-by: Yi Yang <yi.y.yang@intel.com>
Acked-by: Eric Garver <e@erig.me>
Signed-off-by: Joe Stringer <joe@ovn.org>
|
|
|
|
|
|
|
| |
Synchronize code with upstream ovs_nla_get_flow_metadata().
Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
skipping extension headers
Upstream commit:
commit c30da497893718abc6cec4f1d34d35875200edee
Author: Simon Horman <simon.horman@netronome.com>
openvswitch: retain parsed IPv6 header fields in flow on error skipping extension headers
When an error occurs skipping IPv6 extension headers retain the already
parsed IP protocol and IPv6 addresses in the flow. Also assume that the
packet is not a fragment in the absence of information to the contrary;
that is always use the frag_off value set by ipv6_skip_exthdr().
This allows matching on the IP protocol and IPv6 addresses of packets
with malformed extension headers.
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Mostly backports upstream commit along with other pieces
to make IPv6 tunneling work.
commit 6b26ba3a7d952e611dcde1f3f77ce63bcc70540a
Author: Jiri Benc <jbenc@redhat.com>
openvswitch: netlink attributes for IPv6 tunneling
Add netlink attributes for IPv6 tunnel addresses. This enables IPv6 support
for tunnels.
Signed-off-by: Jiri Benc <jbenc@redhat.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently OVS out of tree datapath supports a large number of kernel
versions. From 2.6.32 to 4.3 and various distribution-specific
kernels. But at this point major features are only available on more
recent kernels. For example, stateful services are only available
starting in kernel 3.10 and STT is available on starting with 3.5.
Since these features are becoming essential to many OVS deployments,
and the effort of maintaining the backports is high. We have decided
to drop support for older kernel. Following patch drops supports
for kernel older than 3.10.
Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Allow matching and setting the ct_label field. As with ct_mark, this is
populated by executing the CT action. The label field may be modified by
specifying a label and mask nested under the CT action. It is stored as
metadata attached to the connection. Label modification occurs after
lookup, and will only persist when the conntrack entry is committed by
providing the COMMIT flag to the CT action. Labels are currently fixed
to 128 bits in size.
Upstream: c2ac667 "openvswitch: Allow matching on conntrack label"
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Expose the kernel connection tracker via OVS. Userspace components can
make use of the CT action to populate the connection state (ct_state)
field for a flow. This state can be subsequently matched.
Exposed connection states are OVS_CS_F_*:
- NEW (0x01) - Beginning of a new connection.
- ESTABLISHED (0x02) - Part of an existing connection.
- RELATED (0x04) - Related to an established connection.
- INVALID (0x20) - Could not track the connection for this packet.
- REPLY_DIR (0x40) - This packet is in the reply direction for the flow.
- TRACKED (0x80) - This packet has been sent through conntrack.
When the CT action is executed by itself, it will send the packet
through the connection tracker and populate the ct_state field with one
or more of the connection state flags above. The CT action will always
set the TRACKED bit.
When the COMMIT flag is passed to the conntrack action, this specifies
that information about the connection should be stored. This allows
subsequent packets for the same (or related) connections to be
correlated with this connection. Sending subsequent packets for the
connection through conntrack allows the connection tracker to consider
the packets as ESTABLISHED, RELATED, and/or REPLY_DIR.
The CT action may optionally take a zone to track the flow within. This
allows connections with the same 5-tuple to be kept logically separate
from connections in other zones. If the zone is specified, then the
"ct_zone" match field will be subsequently populated with the zone id.
IP fragments are handled by transparently assembling them as part of the
CT action. The maximum received unit (MRU) size is tracked so that
refragmentation can occur during output.
IP frag handling contributed by Andy Zhou.
Based on original design by Justin Pettit.
Upstream: 7f8a436 "openvswitch: Add conntrack action"
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Justin Pettit <jpettit@nicira.com>
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Following patch adds support for lwtunnel to OVS datapath.
With this change OVS datapath detect lwtunnel support and
make use of new APIs if available. On older kernel where the
support is not there the backported tunnel modules are used.
These backported tunnel devices acts as lwtunnel devices.
I tried to keep backported module same as upstream for easier
bug-fix backport. Since STT and LISP are not upstream OVS
always needs to use respective modules from tunnel compat layer.
To make it work on kernel 4.3 I have converted STT and LISP
modules to lwtunnel API model.
lwtunnel make use of skb-dst to pass tunnel information to the
tunnel module. On older kernel this is not possible. So the in
case of old kernel metadata ref is stored in OVS_CB and direct
call to tunnel transmit function is made by respective tunnel
vport modules. Similarly on receive side tunnel recv directly
call netdev-vport-receive to pass the skb to OVS.
Major backported components include:
Geneve, GRE, VXLAN, ip_tunnel, udp-tunnels GRO.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Joe Stringer <joe@ovn.org>
Acked-by: Jesse Gross <jesse@kernel.org>
|
|
|
|
|
|
| |
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Replace "ntohs(proto) >= ETH_P_802_3_MIN" w/ eth_proto_is_802_3(proto).
Backport of upstream commit 6713fc9b8fa33444aa000f0f31076f6a859ccb34:
"openvswitch: Use eth_proto_is_802_3"
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Joe Stringer <joestringer@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Backport of upstream commit:
openvswitch: Rename GENEVE_TUN_OPTS() to TUN_METADATA_OPTS()
Also factors out Geneve validation code into a new separate function
validate_and_copy_geneve_opts().
A subsequent patch will introduce VXLAN options. Rename the existing
GENEVE_TUN_OPTS() to reflect its extended purpose of carrying generic
tunnel metadata options.
Signed-off-by: Thomas Graf <tgraf@suug.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream: d91641d ("openvswitch: Rename GENEVE_TUN_OPTS() to TUN_METADATA_OPTS()")
Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Upstream commit:
net: rename vlan_tx_* helpers since "tx" is misleading there
The same macros are used for rx as well. So rename it.
Signed-off-by: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: David S. Miller <davem@davemloft.net>
Upstream: df8a39d ("net: rename vlan_tx_* helpers since "tx" is misleading there")
Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Until now, when VLAN acceleration was in use, the bytes of the VLAN header
were not included in port or flow byte counters. They were however
included when VLAN acceleration was not used. This commit corrects the
inconsistency, by always including the VLAN header in byte counters.
Previous discussion at
http://openvswitch.org/pipermail/dev/2014-December/049521.html
Already committed to upstream Linux netdev tree as
24cc59d1ebaac54d933dc0b30abcd8bd86193eef.
Reported-by: Motonori Shindo <mshindo@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Reviewed-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
|
|
|
|
|
|
|
|
|
| |
Kernel datapath code has diverged from upstream code. This
makes porting patches between these two code bases harder
than it needs to be. Following patch fixes this by fixing
coding style issues on this branch.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
|
|
|
|
|
|
|
| |
Found during MPLS upstreaming. Also sync-up MPLS header files
with upstream code.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
|
|
|
|
|
|
|
| |
Use netdev comment style.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
pskb_may_pull() called by arphdr_ok can change skb->data, so put the arp
setting after arphdr_ok to avoid the use the freed memory
Fixes: 0714812134d7d ("openvswitch: Eliminate memset() from flow_extract.")
Cc: Jesse Gross <jesse@nicira.com>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Li RongQing <roy.qing.li@gmail.com>
Acked-by: Jesse Gross <jesse@nicira.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
|
|
|
|
|
|
|
|
|
| |
This new flag is useful for suppressing error logging while probing
for datapath features using flow commands. For backwards
compatibility reasons the commands are executed normally, but error
logging is suppressed.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
|
|
|
|
|
|
|
| |
Help produce better optimized code.
Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
|
|
|
|
|
|
|
|
|
| |
OVS keeps pointer to packet key in skb->cb, but the packet key is
store on stack. This could make code bit tricky. So it is better to
get rid of the pointer.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The "rcu_dereference()" call is used directly in a condition.
Since its return value is never dereferenced it is recommended to use
"rcu_access_pointer()" instead of "rcu_dereference()".
Therefore, this patch makes the replacement.
The following Coccinelle semantic patch was used:
@@
@@
(
if(
(<+...
- rcu_dereference
+ rcu_access_pointer
(...)
...+>)) {...}
|
while(
(<+...
- rcu_dereference
+ rcu_access_pointer
(...)
...+>)) {...}
)
Signed-off-by: Andreea-Cristina Bernat <bernat.ada@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When flow key becomes invalid due to push or pop actions, current
implementation leaves it as invalid, only rebuild the flow key used
for recirculation.
This works, but is less efficient in case of multiple recirc
actions. Each recirc action will have to re-extract
its own flow keys.
This patch update the original flow key as soon as the first recirc
action is encountered, avoiding expensive flow extract call for any
future recirc actions as long as the flow key remains valid.
Signed-off-by: Andy Zhou <azhou@nicira.com>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
|
|
|
|
|
|
|
|
|
| |
Currently tun_info is used for passing tunnel information
on ingress and egress path, this cause confusion. Following
patch removes its use on ingress path make it egress only parameter.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Recirc action needs to extract flow key from packet, it uses tun_info
from OVS_CB for setting tunnel meta data in flow key. But tun_info
can be overwritten by tunnel send action. This would result in wrong
flow key for the recirculation.
Following patch copies flow-key meta data from OVS_CB packet key
itself thus avoids this bug.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
|
|
|
|
|
|
|
|
|
| |
OVS flow extract is called on packet receive or packet
execute code path. Following patch defines separate API
for extracting flow-key in packet execute code path.
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Allow datapath to recognize and extract MPLS labels into flow keys
and execute actions which push, pop, and set labels on packets.
Based heavily on work by Leo Alterman, Ravi K, Isaku Yamahata and Joe Stringer.
Cc: Ravi K <rkerur@gmail.com>
Cc: Leo Alterman <lalterman@nicira.com>
Cc: Isaku Yamahata <yamahata@valinux.co.jp>
Cc: Joe Stringer <joe@wand.net.nz>
Signed-off-by: Simon Horman <horms@verge.net.au>
Signed-off-by: Jesse Gross <jesse@nicira.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds support for Geneve - Generic Network Virtualization
Encapsulation. The protocol is documented at
http://tools.ietf.org/html/draft-gross-geneve-00
The kernel implementation is completely agnostic to the options
that are in use and can handle newly defined options without
further work. It does this by simply matching on a byte array
of options and allowing userspace to setup flows on this array.
Userspace currently implements only support for basic version of
Geneve. It can work with the base header (including the VNI) and
is capable of parsing options but does not currently support any
particular option definitions. Over time, the intention is to
allow options to be matched through OpenFlow without requiring
explicit support in OVS userspace.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, the flow information that is matched for tunnels and
the tunnel data passed around with packets is the same. However,
as additional information is added this is not necessarily desirable,
as in the case of pointers.
This adds a new structure for tunnel metadata which currently contains
only the existing struct. This change is purely internal to the kernel
since the current OVS_KEY_ATTR_IPV4_TUNNEL is simply a compressed version
of OVS_KEY_ATTR_TUNNEL that is translated at flow setup.
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As new protocols are added, the size of the flow key tends to
increase although few protocols care about all of the fields. In
order to optimize this for hashing and matching, OVS uses a variable
length portion of the key. However, when fields are extracted from
the packet we must still zero out the entire key.
This is no longer necessary now that OVS implements masking. Any
fields (or holes in the structure) which are not part of a given
protocol will be by definition not part of the mask and zeroed out
during lookup. Furthermore, since masking already uses variable
length keys this zeroing operation automatically benefits as well.
In principle, the only thing that needs to be done at this point
is remove the memset() at the beginning of flow. However, some
fields assume that they are initialized to zero, which now must be
done explicitly. In addition, in the event of an error we must also
zero out corresponding fields to signal that there is no valid data
present. These increase the total amount of code but very little of
it is executed in non-error situations.
Removing the memset() reduces the profile of ovs_flow_extract()
from 0.64% to 0.56% when tested with large packets on a 10G link.
Suggested-by: Pravin Shelar <pshelar@nicira.com>
Signed-off-by: Jesse Gross <jesse@nicira.com>
Acked-by: Thomas Graf <tgraf@suug.ch>
Acked-by: Pravin B Shelar <pshelar@nicira.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Flow statistics need to take into account the TCP flags from the packet
currently being processed (in 'key'), not the TCP flags matched by the
flow found in the kernel flow table (in 'flow').
This bug made the Open vSwitch userspace fin_timeout action have no effect
in many cases.
Bug #1219516.
Reported-by: Len Gao <leng@vmware.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Jesse Gross <jesse@nicira.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For ovs_flow_stats_get() using ovsl_dereference() was wrong, since
flow dumps call this with RCU read lock.
ovs_flow_stats_clear() is always called with ovs_mutex, so can use
ovsl_dereference().
Also, make the ovs_flow_stats_get() 'flow' argument const to make
later patches cleaner.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
|
|
|
|
|
|
|
| |
Remove unnecessary locking from functions that are always called with
appropriate locking.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Thomas Graf <tgraf@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Minimize padding in sw_flow_key and move 'tp' top the main struct.
These changes simplify code when accessing the transport port numbers
and the tcp flags, and makes the sw_flow_key 8 bytes smaller on 64-bit
systems (128->120 bytes). These changes also make the keys for IPv4
packets to fit in one cache line.
There is a valid concern for safety of packing the struct
ovs_key_ipv4_tunnel, as it would be possible to take the address of
the tun_id member as a __be64 * which could result in unaligned access
in some systems. However:
- sw_flow_key itself is 64-bit aligned, so the tun_id within is always
64-bit aligned.
- We never make arrays of ovs_key_ipv4_tunnel (which would force every
second tun_key to be misaligned).
- We never take the address of the tun_id in to a __be64 *.
- Whereever we use struct ovs_key_ipv4_tunnel outside the sw_flow_key,
it is in stack (on tunnel input functions), where compiler has full
control of the alignment.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
|