delta/openvswitch.git - github.com: openvswitch/ovs.git

	Commit message (Collapse)	Author	Age	Files	Lines
*	compat: nf_ct_delete compat.	Jarno Rajahalme	2017-03-08	1	-0/+6
	Upstream commit: commit f330a7fdbe1611104622faff7e614a246a7d20f0 Author: Florian Westphal <fw@strlen.de> Date: Thu Aug 25 15:33:31 2016 +0200 netfilter: conntrack: get rid of conntrack timer With stats enabled this eats 80 bytes on x86_64 per nf_conn entry, as Eric Dumazet pointed out during netfilter workshop 2016. Eric also says: "Another reason was the fact that Thomas was about to change max timer range [..]" (500462a9de657f8, 'timers: Switch to a non-cascading wheel'). Remove the timer and use a 32bit jiffies value containing timestamp until entry is valid. During conntrack lookup, even before doing tuple comparison, check the timeout value and evict the entry in case it is too old. The dying bit is used as a synchronization point to avoid races where multiple cpus try to evict the same entry. Because lookup is always lockless, we need to bump the refcnt once when we evict, else we could try to evict already-dead entry that is being recycled. This is the standard/expected way when conntrack entries are destroyed. Followup patches will introduce garbage collection via work queue and further places where we can reap obsoleted entries (e.g. during netlink dumps), this is needed to avoid expired conntracks from hanging around for too long when lookup rate is low after a busy period. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
	Upstream commit f330a7fdbe16 ("netfilter: conntrack: get rid of conntrack timer") changes the way nf_ct_delete() is called. Prior to that commit the call pattern was like this: if (del_timer(&ct->timeout)) nf_ct_delete(ct, ...); After this change nf_ct_delete() is called directly: nf_ct_delete(ct, ...); This patch provides a replacement implementation for nf_ct_delete() that first calls del_timer(). This replacement is only used if struct nf_conn has a member 'timeout' of type 'struct timer_list'. The following patch introduces the first caller of nf_ct_delete() in the OVS kernel module. Linux <3.12 does not have nf_ct_delete() at all, so we inline it if it does not exist. The inlined code is from 3.11 death_by_timeout(), which in later versions simply calls nf_ct_delete(). Upstream commit 02982c27ba1e1bd9f9d4747214e19ca83aa88d0e introduced nf_ct_delete() in Linux 3.12. This commit has the original code that is being inlined here. Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Joe Stringer <joe@ovn.org>
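The call-pattern change described above can be sketched in plain userspace C. This is only an illustration under stated assumptions: `struct timer_list`, `struct nf_conn`, `mock_del_timer()` and `mock_nf_ct_delete()` are hypothetical stand-ins for the kernel definitions, and `compat_nf_ct_delete()` is an invented name, not the code from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types (not the real layouts). */
struct timer_list { bool pending; };
struct nf_conn { struct timer_list timeout; bool deleted; };

/* Mock of del_timer(): reports whether the timer was still pending. */
static bool mock_del_timer(struct timer_list *t)
{
    bool was_pending = t->pending;

    t->pending = false;
    return was_pending;
}

/* Mock of the older nf_ct_delete(), which expects the caller to have
 * stopped the timer already. */
static bool mock_nf_ct_delete(struct nf_conn *ct)
{
    ct->deleted = true;
    return true;
}

/* Compat wrapper: callers use the new direct-call pattern, while the
 * wrapper restores the old "del_timer() first" behaviour internally. */
static bool compat_nf_ct_delete(struct nf_conn *ct)
{
    assert(ct != NULL);
    if (!mock_del_timer(&ct->timeout))
        return false; /* someone else already owns the deletion */
    return mock_nf_ct_delete(ct);
}
```

With a wrapper of this shape the calling code can be kept identical to upstream, and only the compat layer needs to know whether the running kernel still has the timer member.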
*	datapath: add and use nf_ct_set helper	Florian Westphal	2017-03-08	1	-0/+2
	Upstream commit: commit c74454fadd5ea6fc866ffe2c417a0dba56b2bf1c Author: Florian Westphal <fw@strlen.de> Date: Mon Jan 23 18:21:57 2017 +0100 netfilter: add and use nf_ct_set helper Add a helper to assign a nf_conn entry and the ctinfo bits to an sk_buff. This avoids changing code in followup patch that merges skb->nfct and skb->nfctinfo into skb->_nfct. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Joe Stringer <joe@ovn.org>
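As a rough illustration of what such a helper does, the sketch below assigns both values in one place. The struct layouts are simplified assumptions (the real sk_buff stores a pointer to a different type), and `sketch_nf_ct_set()` is a hypothetical name, not the upstream helper itself.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical stand-ins for the kernel types. */
struct nf_conn { int id; };
enum ip_conntrack_info { IP_CT_ESTABLISHED, IP_CT_RELATED, IP_CT_NEW };

/* Pre-merge layout: separate nfct and nfctinfo fields. */
struct sk_buff {
    struct nf_conn *nfct;
    enum ip_conntrack_info nfctinfo;
};

/* One helper hides how the conntrack entry and its info bits are
 * stored, so a later field merge only has to touch this function. */
static void sketch_nf_ct_set(struct sk_buff *skb, struct nf_conn *ct,
                             enum ip_conntrack_info info)
{
    assert(skb != NULL);
    skb->nfct = ct;
    skb->nfctinfo = info;
}
```

Callers that go through the helper never name the fields directly, which is the property the commit message relies on for the follow-up rename.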
*	datapath: add and use skb_nfct helper	Florian Westphal	2017-03-08	1	-0/+1
	Upstream commit: commit cb9c68363efb6d1f950ec55fb06e031ee70db5fc Author: Florian Westphal <fw@strlen.de> Date: Mon Jan 23 18:21:56 2017 +0100 skbuff: add and use skb_nfct helper Followup patch renames skb->nfct and changes its type so add a helper to avoid intrusive rename change later. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Joe Stringer <joe@ovn.org>
*	datapath: Allow compiling against Linux 4.10	Jarno Rajahalme	2017-03-08	1	-2/+2
	OVS in-tree datapath compiles against Linux 4.10 kernel, so allow it. Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Joe Stringer <joe@ovn.org>
*	Makefile: Drop vestiges of support for non-GNU Make.	Ben Pfaff	2017-03-08	1	-52/+2
	Open vSwitch has documented a requirement for GNU Make for a long time, yet it had vestiges catering to other make implementations. This removes those. Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Russell Bryant <russell@ovn.org>
*	datapath: add processing of L3 packets	Yang, Yi Y	2017-03-02	1	-0/+1
	Upstream commit: commit 5108bbaddc37c1c8583f0cf2562d7d3463cd12cb Author: Jiri Benc <jbenc@redhat.com> Date: Thu Nov 10 16:28:21 2016 +0100 openvswitch: add processing of L3 packets Support receiving, extracting flow key and sending of L3 packets (packets without an Ethernet header). Note that even after this patch, non-Ethernet interfaces are still not allowed to be added to bridges. Similarly, netlink interface for sending and receiving L3 packets to/from user space is not in place yet. Based on previous versions by Lorand Jakab and Simon Horman. Signed-off-by: Lorand Jakab <lojakab@cisco.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Jiri Benc <jbenc@redhat.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Yi Yang <yi.y.yang@intel.com> Signed-off-by: Joe Stringer <joe@ovn.org>
*	datapath: use core MTU range checking in core net infra	Jarod Wilson	2017-03-02	1	-0/+2
	Upstream commit: commit 61e84623ace35ce48975e8f90bbbac7557c43d61 Author: Jarod Wilson <jarod@redhat.com> Date: Fri Oct 7 22:04:33 2016 -0400 net: centralize net_device min/max MTU checking While looking into an MTU issue with sfc, I started noticing that almost every NIC driver with an ndo_change_mtu function implemented almost exactly the same range checks, and in many cases, that was the only practical thing their ndo_change_mtu function was doing. Quite a few drivers have either 68, 64, 60 or 46 as their minimum MTU value checked, and then various sizes from 1500 to 65535 for their maximum MTU value. We can remove a whole lot of redundant code here if we simply store min_mtu and max_mtu in net_device, and check against those in net/core/dev.c's dev_set_mtu(). In theory, there should be zero functional change with this patch, it just puts the infrastructure in place. Subsequent patches will attempt to start using said infrastructure, with theoretically zero change in functionality. CC: netdev@vger.kernel.org Signed-off-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
	Upstream commit: commit 91572088e3fdbf4fe31cf397926d8b890fdb3237 Author: Jarod Wilson <jarod@redhat.com> Date: Thu Oct 20 13:55:20 2016 -0400 net: use core MTU range checking in core net infra ... openvswitch: - set min/max_mtu, remove internal_dev_change_mtu - note: max_mtu wasn't checked previously, it's been set to 65535, which is the largest possible size supported ... Signed-off-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
	Upstream commit: commit 425df17ce3a26d98f76e2b6b0af2acf4aeb0b026 Author: Jarno Rajahalme <jarno@ovn.org> Date: Tue Feb 14 21:16:28 2017 -0800 openvswitch: Set internal device max mtu to ETH_MAX_MTU. Commit 91572088e3fd ("net: use core MTU range checking in core net infra") changed the openvswitch internal device to use the core net infra for controlling the MTU range, but failed to actually set the max_mtu as described in the commit message, which now defaults to ETH_DATA_LEN. This patch fixes this by setting max_mtu to ETH_MAX_MTU after ether_setup() call. Fixes: 91572088e3fd ("net: use core MTU range checking in core net infra") Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>
	This backport detects the new max_mtu field in struct net_device and uses the upstream code if it exists, and local backport code if not. The latter case is amended with bounds checks using the new upstream macros ETH_MIN_MTU and ETH_MAX_MTU and the corresponding error messages from the upstream commit. Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Signed-off-by: Joe Stringer <joe@ovn.org>
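The centralized range check described above can be sketched as follows. This is a userspace approximation, not the kernel function: `struct sketch_net_device` and `sketch_dev_set_mtu()` are hypothetical names, and only the values 68 and 65535 for ETH_MIN_MTU and ETH_MAX_MTU are taken from upstream.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define ETH_MIN_MTU 68      /* smallest MTU accepted for Ethernet devices */
#define ETH_MAX_MTU 0xFFFFU /* 65535, largest supported size */

/* Hypothetical cut-down device: only the fields the check needs. */
struct sketch_net_device {
    unsigned int mtu;
    unsigned int min_mtu;
    unsigned int max_mtu;
};

/* Range check done once in the core instead of in every driver. */
static int sketch_dev_set_mtu(struct sketch_net_device *dev, int new_mtu)
{
    assert(dev != NULL);
    if (new_mtu < 0 || (unsigned int)new_mtu < dev->min_mtu)
        return -EINVAL;
    /* max_mtu == 0 is treated as "no upper bound configured". */
    if (dev->max_mtu > 0 && (unsigned int)new_mtu > dev->max_mtu)
        return -EINVAL;
    dev->mtu = (unsigned int)new_mtu;
    return 0;
}
```

The sketch also shows why the follow-up fix mattered: if max_mtu is left at 1500 (ETH_DATA_LEN), a request for a jumbo MTU such as 9000 is rejected even though the internal device could handle it.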
*	datapath: backport: vlan: Check for vlan ethernet types for 8021.q or 802.1ad	Yang, Yi Y	2017-03-01	1	-0/+1
	Upstream commit: commit fe19c4f971a55cea3be442d8032a5f6021702791 Author: Eric Garver <e@erig.me> Date: Wed Sep 7 12:56:58 2016 -0400 This is to simplify using double tagged vlans. This function allows all valid vlan ethertypes to be checked in a single function call. Also replace some instances that check for both ETH_P_8021Q and ETH_P_8021AD. Patch based on one originally by Thomas F Herbert. Signed-off-by: Thomas F Herbert <thomasfherbert@gmail.com> Signed-off-by: Eric Garver <e@erig.me> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Yi Yang <yi.y.yang@intel.com> Acked-by: Eric Garver <e@erig.me> Signed-off-by: Joe Stringer <joe@ovn.org>
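A single-call check for both VLAN ethertypes might look like the sketch below. It is a userspace approximation: `sketch_eth_type_vlan()` is a hypothetical name, and the function is assumed to take the ethertype in network byte order, as it appears in a frame header.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ETH_P_8021Q  0x8100 /* 802.1Q VLAN tag */
#define ETH_P_8021AD 0x88A8 /* 802.1ad service VLAN tag */

static_assert(sizeof(uint16_t) == 2, "ethertype is a 16-bit field");

/* True when the ethertype (network byte order) is either VLAN type,
 * so callers no longer have to test both constants themselves. */
static bool sketch_eth_type_vlan(uint16_t ethertype_be)
{
    return ethertype_be == htons(ETH_P_8021Q) ||
           ethertype_be == htons(ETH_P_8021AD);
}
```

For example, a parser walking a double-tagged frame can loop while `sketch_eth_type_vlan()` returns true for the next ethertype field.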
*	datapath: backport: vlan: Introduce helper functions to check if skb is tagged	Yang, Yi Y	2017-03-01	1	-0/+1
	Upstream commit: commit f5a7fb88e1f82542ca14ba93a1d4fa35471c60ca Author: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Date: Fri Mar 27 14:31:11 2015 +0900 vlan: Introduce helper functions to check if skb is tagged Separate the two checks for single vlan and multiple vlans in netif_skb_features(). This allows us to move the check for multiple vlans to another function later. Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Yi Yang <yi.y.yang@intel.com> Acked-by: Eric Garver <e@erig.me> Signed-off-by: Joe Stringer <joe@ovn.org>
*	datapath: compat: Fix build on RHEL 7.3	Yi-Hung Wei	2016-12-14	1	-0/+7
	RHEL 7.3 provides the upstream tunnel implementation, but it does not support the name_assign_type attribute in net_device. This patch fixes the build problem by backporting functions with name_assign_type, and using proper flags in acinclude.m4 to invoke the backport functions. Tested on RHEL 7.3 with kernel 3.10.0-514.el7.x86_64. Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> Signed-off-by: Joe Stringer <joe@ovn.org>
*	configure: Use -Wformat-security with -Wformat.	Ben Pfaff	2016-12-12	1	-1/+1
	GCC 6.1 warns that -Wformat-security has no effect without -Wformat, so this commit fixes the problem. The change to _OVS_CHECK_CC_OPTION is needed so that the cache variable name doesn't end up with a space in it, which obviously doesn't work. Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Andy Zhou <azhou@ovn.org>
*	acinclude: Fix -Wstrict-prototypes and -Wold-style-definition detection.	Ben Pfaff	2016-12-12	1	-2/+10
	AC_LANG_PROGRAM(,) uses a program like this: int main() { return 0; } but that triggers warnings for -Wstrict-prototypes and for -Wold-style-definition, since this definition of main() lacks a prototype and is therefore old-style. This meant that -Wstrict-prototypes and -Wold-style-definition weren't being turned on for new-enough GCC. This commit fixes the problem by changing the program that is test-compiled to: int x; which doesn't make any compilers mad, as far as I know. I recently upgraded to GCC 6.1 and just now noticed the issue, so I think that GCC somewhere between version 4.9 and version 6.1 must have started warning about main() when it's declared this way. Also, fix a few functions that lacked prototypes. Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Andy Zhou <azhou@ovn.org>
*	datapath: Allow compile against current net-next.	Jarno Rajahalme	2016-12-09	1	-2/+2
	This patch allows openvswitch kernel module in the OVS tree to be compiled against the current net-next Linux kernel. The changes are due to these upstream commits: 56989f6d856 ("genetlink: mark families as __ro_after_init") 489111e5c25 ("genetlink: statically initialize families") a07ea4d9941 ("genetlink: no longer support using static family IDs") struct genl_family initialization is changed to be completely static and to include the new (in Linux 4.6) __ro_after_init attribute. Compat code defines it as an empty macro if not defined already. GENL_ID_GENERATE is no longer defined, but since it was defined as 0, it is safe to drop it from all initializers also on older Linux versions. A compiletime_assert is added to make sure this is true whenever GENL_ID_GENERATE is defined. Tested with current Linux net-next (4.9) and 3.16. It should be noted that there are still a number of fixes and new features in upstream net-next that are yet to be backported. Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Andy Zhou <azhou@ovn.org>
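The two compat techniques mentioned above (an empty fallback macro and a compile-time check on GENL_ID_GENERATE) can be sketched in standalone C. Everything here is illustrative: `struct sketch_genl_family` is a hypothetical cut-down struct, the local `#define GENL_ID_GENERATE 0` only simulates an older kernel header, and C11 `static_assert` stands in for the kernel's compiletime_assert.

```c
#include <assert.h>

/* Compat: attribute becomes a no-op where the kernel lacks it. */
#ifndef __ro_after_init
#define __ro_after_init
#endif

/* Simulates an older kernel header for this sketch only. */
#define GENL_ID_GENERATE 0

/* Dropping ".id = GENL_ID_GENERATE" is only safe if it was 0. */
#ifdef GENL_ID_GENERATE
static_assert(GENL_ID_GENERATE == 0,
              "GENL_ID_GENERATE must be 0 to omit it from initializers");
#endif

/* Hypothetical cut-down family descriptor. */
struct sketch_genl_family {
    int id;
    const char *name;
    unsigned int version;
};

/* Fully static initializer; .id is left out and therefore zero. */
static struct sketch_genl_family sketch_family __ro_after_init = {
    .name = "sketch_family",
    .version = 1,
};

static int sketch_family_id(void)
{
    return sketch_family.id;
}
```

Because omitted members of a static initializer are zero-initialized, the same source works whether or not the kernel still defines GENL_ID_GENERATE.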
*	trivial: Resolve whitespace issues with acinclude	Stephen Finucane	2016-10-29	1	-7/+7
	Completely unrelated, but annoying. Let's fix it up. Signed-off-by: Stephen Finucane <stephen@that.guru> Signed-off-by: Russell Bryant <russell@ovn.org>
*	configure: Support compiling with Linux 4.8.	Jarno Rajahalme	2016-10-20	1	-2/+2
	Datapath should now compile and work with Linux 4.8. Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Pravin B Shelar <pshelar@ovn.org>
*	datapath: Support a fixed size of 128 distinct labels.	Jarno Rajahalme	2016-10-20	1	-0/+2
	Port upstream change in conntrack labels extension. Add a new configure macro HAVE_NF_CONN_LABELS_WITH_WORDS to detect the old definition. Unfortunately there is no conntrack API to hide the difference, so this makes conntrack.c deviate from upstream source a bit. Upstream commit: commit 23014011ba4209a086931ff402eac1c41abbe456 Author: Florian Westphal <fw@strlen.de> Date: Thu Jul 21 12:51:16 2016 +0200 netfilter: conntrack: support a fixed size of 128 distinct labels The conntrack label extension is currently variable-sized, e.g. if only 2 labels are used by iptables rules then the labels->bits[] array will only contain one element. We track size of each label storage area in the 'words' member. But in nftables and openvswitch we always have to ask for worst-case since we don't know what bit will be used at configuration time. As most arches are 64bit we need to allocate 24 bytes in this case: struct nf_conn_labels { u8 words; /* 0 1 */ /* XXX 7 bytes hole, try to pack */ long unsigned bits[2]; /* 8 24 */ Make bits a fixed size and drop the words member, it simplifies the code and only increases memory requirements on x86 when less than 64bit labels are required. We still only allocate the extension if its needed. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Pravin B Shelar <pshelar@ovn.org>
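The layout change can be illustrated with two hypothetical structs, one carrying the old 'words' member and one with a fixed 128-bit array. The names below are invented for this sketch and the old struct is shown at its worst-case size rather than with a variable-length array.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define SKETCH_LABEL_BITS  128
#define SKETCH_LABEL_BYTES (SKETCH_LABEL_BITS / CHAR_BIT) /* 16 */
#define SKETCH_LONG_BITS   (sizeof(unsigned long) * CHAR_BIT)

/* Old style: a size byte plus padding in front of the bit array. */
struct sketch_labels_with_words {
    unsigned char words;
    unsigned long bits[SKETCH_LABEL_BYTES / sizeof(unsigned long)];
};

/* New style: fixed 128 bits, no size member, no padding hole. */
struct sketch_labels_fixed {
    unsigned long bits[SKETCH_LABEL_BYTES / sizeof(unsigned long)];
};

/* Set one label bit; valid bit numbers are 0..127. */
static void sketch_set_label(struct sketch_labels_fixed *l, unsigned int bit)
{
    assert(l != NULL && bit < SKETCH_LABEL_BITS);
    l->bits[bit / SKETCH_LONG_BITS] |= 1UL << (bit % SKETCH_LONG_BITS);
}
```

On a typical 64-bit build the old layout occupies 24 bytes against 16 for the fixed one, which matches the numbers quoted in the upstream message.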
*	datapath: Add support for kernel 4.7	Pravin B Shelar	2016-08-22	1	-2/+2
	Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
*	netdev-dpdk: Fix occurance of error log	Ciara Loftus	2016-08-18	1	-1/+2
	If NUMA information can't be derived from a vHost User device, only print an error if the VHOST_NUMA option is enabled in DPDK. Otherwise 'fail' silently. Fixes: 0a0f39df1d5a ("netdev-dpdk: Add support for DPDK 16.07") Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Reported-by: Ian Stokes <ian.stokes@intel.com> Tested-by: Ian Stokes <ian.stokes@intel.com> Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
*	datapath: compat: backport LCO optimization.	Pravin B Shelar	2016-08-18	1	-0/+1
	This basically backports commit: commit 179bc67f69b6cb53ad68cfdec5a917c2a2248355 Author: Edward Cree <ecree@solarflare.com> Date: Thu Feb 11 20:48:04 2016 +0000 net: local checksum offload for encapsulation The arithmetic properties of the ones-complement checksum mean that a correctly checksummed inner packet, including its checksum, has a ones complement sum depending only on whatever value was used to initialise the checksum field before checksumming (in the case of TCP and UDP, this is the ones complement sum of the pseudo header, complemented). Consequently, if we are going to offload the inner checksum with CHECKSUM_PARTIAL, we can compute the outer checksum based only on the packet data not covered by the inner checksum, and the initial value of the inner checksum field. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
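The arithmetic property this relies on can be checked with a small userspace sketch. The helper names are hypothetical, the payload is modelled as an array of 16-bit words, and the checksum field is kept outside the array for simplicity; this is a demonstration of the ones-complement identity, not the kernel's LCO code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator into a 16-bit ones-complement sum. */
static uint16_t sketch_csum_fold(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xFFFFu) + (sum >> 16);
    return (uint16_t)sum;
}

/* Ones-complement sum of n words plus the checksum field value.
 * Intended for small n; the accumulator is not folded mid-loop. */
static uint16_t sketch_csum(const uint16_t *words, size_t n, uint16_t field)
{
    uint32_t sum = field;

    assert(words != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        sum += words[i];
    return sketch_csum_fold(sum);
}

/* Fill in the checksum the way an offloading device would, then
 * return the sum over the region including the final checksum. */
static uint16_t sketch_sum_after_fill(const uint16_t *words, size_t n,
                                      uint16_t field_init)
{
    uint16_t check = (uint16_t)~sketch_csum(words, n, field_init);

    return sketch_csum(words, n, check);
}
```

For any payload, the value returned by `sketch_sum_after_fill()` is the complement of the initial field value, so an outer checksum can be derived without reading the inner payload. One caveat of the sketch: 0x0000 and 0xFFFF both represent zero in ones-complement arithmetic, so an initial field value of 0xFFFF yields 0xFFFF rather than 0x0000.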
*	netdev-dpdk: Remove dpdkvhostcuse ports	Ciara Loftus	2016-08-15	1	-12/+0
	This commit removes the 'dpdkvhostcuse' port type from the userspace datapath. vhost-cuse ports are quickly becoming obsolete as the vhost-user port type begins to support a greater feature-set thanks to the addition of things like vhost-user multiqueue and potential upcoming features like vhost-user client-mode and vhost-user reconnect. The feature is also expected to be removed from DPDK soon. One potential drawback of the removal of this support is that a userspace vHost port type is not available in OVS for use with older versions of QEMU (pre v2.2). Considering v2.2 is nearly two years old this should however be a low impact change. Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Acked-by: Flavio Leitner <fbl@sysclose.org> Acked-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ilya Maximets <i.maximets@samsung.com>
*	netdev-dpdk: add DPDK pdump capability	Ciara Loftus	2016-08-12	1	-0/+23
	This commit provides the ability to 'listen' on DPDK ports and save packets to a pcap file with a DPDK app that uses the librte_pdump library. One such app is the 'pdump' app that can be found in the DPDK 'app' directory. Instructions on how to use this can be found in INSTALL.DPDK-ADVANCED.md. Pdump capability in OVS with DPDK will only be initialised if the CONFIG_RTE_LIBRTE_PMD_PCAP=y and CONFIG_RTE_LIBRTE_PDUMP=y options are set in DPDK. libpcap is required if the above configuration is used. Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
*	datapath: compat: keep skb mark across tunnel devices.	Pravin B Shelar	2016-08-12	1	-2/+0
	On older kernels skb_scrub_packet() has a bug which resets the skb mark for all packets. This was fixed during the 3.18 release, where the mark is reset only for packets crossing a namespace. So OVS is forced to use the compat skb_scrub_packet() on older kernels. This is related to upstream bug fix commit ca7c7b9059e3 ("skbuff: Do not scrub skb mark within the same name space"). VMware-BZ: #1710701 Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Joe Stringer <joe@ovn.org>
*	datapath: compat: geneve: fix geneve_notify_add_rx_port()	Pravin B Shelar	2016-08-11	1	-0/+4
	Remove mutual exclusion between udp-gro registration and geneve receive port registration. Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
*	netdev-dpdk: Make libnuma dependencies optional	Ciara Loftus	2016-08-04	1	-2/+12
	Prior to this patch, OVS with DPDK required the libnuma packages to build. This patch removes this dependency, making it only a requirement when the CONFIG_RTE_LIBRTE_VHOST_NUMA option is detected as enabled in the DPDK build. Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
*	datapath: compat: Detect GSO support at ovs configure	Pravin B Shelar	2016-08-03	1	-0/+2
	OVS turns on tunnel GSO statically for kernels older than 3.18. Some distribution kernels could backport tunnel GSO. To make use of device offload on such kernels, detect the support at the configure stage. Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
*	compat: Properly handle fragment lru.	Joe Stringer	2016-08-01	1	-0/+1
	In kernels <=3.16 there is an LRU for managing fragment queues for IPv4 and IPv6. Because the backport code comes from more recent upstream versions of Linux, this LRU management was missing from ip_frag_queue() and nf_ct_frag6_queue(). Fixes: 595e069a0634 ("compat: Backport IPv4 reassembly.") Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Pravin B Shelar <pshelar@ovn.org>
*	compat: Simplify inet_fragment backports.	Joe Stringer	2016-08-01	1	-0/+1
	The core fragmentation handling logic is exported on all supported kernels, so it's not necessary to backport the latest version of this. This greatly simplifies the code due to inconsistencies between the old per-lookup garbage collection and the newer workqueue based garbage collection. As a result of simplifying and removing unnecessary backport code, a few bugs are fixed for corner cases such as when some fragments remain in the fragment cache when openvswitch is unloaded.
	Some backported ip functions need a little extra logic than what is seen on the latest code due to this, for instance on kernels <3.17:
	  * Call inet_frag_evictor() before defrag
	  * Limit hashsize in ip{,6}_fragment logic
	The pernet init/exit logic also differs a little from upstream. Upstream ipv[46]_defrag logic initializes the various pernet fragment parameters and its own global fragments cache. In the OVS backport, the pernet parameters are shared while the fragments cache is separate. The backport relies upon upstream pernet initialization to perform the shared setup, and performs no pernet initialization of its own. When it comes to pernet exit however, the backport must ensure that all OVS-specific fragment state is cleared, while the shared state remains untouched so that the regular ipv[46] logic may do its own cleanup. In practice this means that OVS must have its own divergent implementation of inet_frags_exit_net().
	Fixes the following crash: Call Trace: <IRQ> [<ffffffff810744f6>] ? call_timer_fn+0x36/0x100 [<ffffffff8107548f>] run_timer_softirq+0x1ef/0x2f0 [<ffffffff8106cccc>] __do_softirq+0xec/0x2c0 [<ffffffff8106d215>] irq_exit+0x105/0x110 [<ffffffff81737095>] smp_apic_timer_interrupt+0x45/0x60 [<ffffffff81735a1d>] apic_timer_interrupt+0x6d/0x80 <EOI> [<ffffffff8104f596>] ? native_safe_halt+0x6/0x10 [<ffffffff8101cb2f>] default_idle+0x1f/0xc0 [<ffffffff8101d406>] arch_cpu_idle+0x26/0x30 [<ffffffff810bf3a5>] cpu_startup_entry+0xc5/0x290 [<ffffffff810415ed>] start_secondary+0x21d/0x2d0 Code: Bad RIP value. RIP [<ffffffffa0177480>] 0xffffffffa0177480 RSP <ffff88003f703e78> CR2: ffffffffa0177480 ---[ end trace eb98ca80ba07bd9c ]--- Kernel panic - not syncing: Fatal exception in interrupt
	Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Pravin B Shelar <pshelar@ovn.org>
*	datapath: Add support for kernel 4.6	Pravin B Shelar	2016-07-26	1	-3/+4
	Most of the patch irons out the USE_UPSTREAM_TUNNEL case, where the datapath directly uses the upstream tunneling modules. Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org> Acked-by: Amitabha Biswas <abiswas@us.ibm.com>
*	datapath: compat: fix udp checksum calculation	Pravin B Shelar	2016-07-26	1	-1/+0
	In the upstream Linux kernel networking stack udp_set_csum() is called with only the UDP header applied, but in the compat layer it can be called with the IP header as well. So the following patch takes the offset into account. Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
*	datapath: Add support for kernel 4.5	Pravin B Shelar	2016-07-19	1	-2/+5
	Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
*	datapath: Add support for kernel 4.4	Pravin B Shelar	2016-07-18	1	-2/+15
	Most of the changes are related to ip-fragment API and genetlink API changes. Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
*	datapath: genlmsg_new_unicast to genlmsg_new	Pravin B Shelar	2016-07-18	1	-1/+0
	API changes are related to commit: openvswitch: Revert: "Enable memory mapped Netlink i/o" revert commit 795449d8b846 ("openvswitch: Enable memory mapped Netlink i/o"). Following the mmaped netlink removal this code can be removed. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
*	datapath: use PTR_ERR_OR_ZERO	Pravin B Shelar	2016-07-17	1	-0/+1
	Upstream commit: commit f35423c137b0e64155f52c166db1d13834a551f2 Author: Fabian Frederick <fabf@skynet.be> openvswitch: use PTR_ERR_OR_ZERO Signed-off-by: Fabian Frederick <fabf@skynet.be> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
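For readers unfamiliar with the idiom, the sketch below mimics the error-pointer convention in userspace. All `sketch_*` names are hypothetical re-implementations written for this illustration; only the convention itself (errno values encoded in the top 4095 pointer values) is taken from the kernel.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value as a pointer. */
static void *sketch_err_ptr(long err)
{
    assert(err < 0 && err >= -SKETCH_MAX_ERRNO);
    return (void *)(intptr_t)err;
}

/* Error pointers occupy the last SKETCH_MAX_ERRNO addresses. */
static bool sketch_is_err(const void *p)
{
    return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Decode the errno value back out of an error pointer. */
static long sketch_ptr_err(const void *p)
{
    return (long)(intptr_t)p;
}

/* Replaces the open-coded "if (IS_ERR(p)) return PTR_ERR(p); return 0;". */
static int sketch_ptr_err_or_zero(const void *p)
{
    return sketch_is_err(p) ? (int)sketch_ptr_err(p) : 0;
}
```

A function that only needs to report success or failure of an allocation-style call can then end with a single `return sketch_ptr_err_or_zero(p);`.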
*	datapath: backport: ovs: propagate per dp max headroom to all vports	Pravin B Shelar	2016-07-17	1	-0/+1
	Upstream commit: commit 3a927bc7cf9d0fbe8f4a8189dd5f8440228f64e7 Author: Paolo Abeni <pabeni@redhat.com> ovs: propagate per dp max headroom to all vports This patch implements bookkeeping support to compute the maximum headroom for all the devices in each datapath. When said value changes, the underlying devs are notified via the ndo_set_rx_headroom method. This also increases the internal vports xmit performance. Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
*	datapath: backport: ovs: align nlattr properly when needed	Pravin B Shelar	2016-07-17	1	-0/+1
	Upstream commit: commit 66c7a5ee1a6b7c69d41dfd68d207fdd54efba56a Author: Nicolas Dichtel <nicolas.dichtel@6wind.com> ovs: align nlattr properly when needed I also fix commit 8b32ab9e6ef1: use nla_total_size_64bit() for OVS_FLOW_ATTR_USED in ovs_flow_cmd_msg_size(). Fixes: 8b32ab9e6ef1 ("ovs: use nla_put_u64_64bit()") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
*	datapath: Use skb_postpush_rcsum()	Pravin B Shelar	2016-07-17	1	-0/+1
	Use kernel function to update checksum. Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
*	datapath: compat: get rid of OVS_CB inner header offsets.	Pravin B Shelar	2016-07-08	1	-2/+1
	OVS has GSO compat functionality which needs the inner offsets of the packet in order to segment it. Older kernels did not include these offsets in the skb, therefore they were stored in OVS_GSO_CB. OVS has now dropped support for these old kernels, so none of the supported kernels needs this compat code. The following patch removes it. Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
*	datapath: compat: Update Geneve and VxLAN modules.	Pravin B Shelar	2016-07-08	1	-2/+12
	This patch brings in various updates to the upstream Geneve and VxLAN modules. For Geneve this patch adds IPv6 support; for VxLAN the major feature it adds is VXLAN GPE. This should bring the OVS compat tunnel implementation in sync with the current net branch. Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
*	datapath: backport: udp: Add socket based GRO and config	Pravin B Shelar	2016-07-08	1	-0/+2
	Upstream commit: commit 38fd2af24fcfda93f9fea3e53f26e48775ae9e09 Author: Tom Herbert <tom@herbertland.com> udp: Add socket based GRO and config Add gro_receive and gro_complete to struct udp_tunnel_sock_cfg. Signed-off-by: Tom Herbert <tom@herbertland.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
*	datapath: compat: Update udp_sock_create	Pravin B Shelar	2016-07-08	1	-0/+2
	Update udp-socket-create to create the IPv6 socket correctly. Partially backports commit fd384412e199b ("udp_tunnel: Seperate ipv6 functions into its own file.") Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
*	datapath: compat: rename HAVE_METADATA_DST to USE_UPSTREAM_TUNNEL	Pravin B Shelar	2016-07-08	1	-1/+1
	To better represent the meaning of the symbol. Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
*	datapath: backport: ip_tunnel: add support for setting flow label via collect metadata	Pravin B Shelar	2016-07-08	1	-4/+5
	Update udp_tunnel6_xmit_skb(). Specifically, the changes are related to setting the IPv6 flow label. Upstream commit: commit 134611446dc657e1bbc73ca0e4e6b599df687db0 Author: Daniel Borkmann <daniel@iogearbox.net> ip_tunnel: add support for setting flow label via collect metadata This patch extends udp_tunnel6_xmit_skb() to pass in the IPv6 flow label from call sites. Currently, there's no such option and it's always set to zero when writing ip6_flow_hdr(). Add a label member to ip_tunnel_key, so that flow-based tunnels via collect metadata frontends can make use of it. vxlan and geneve will be converted to add flow label support separately. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
*	datapath: backport: tunnels: Remove encapsulation offloads on decap.	Pravin B Shelar	2016-07-08	1	-1/+2
	Following patch backports updated iptunnel pull function. Also brings in following upstream fix: commit a09a4c8dd1ec7f830e1fb9e59eb72bddc965d168 Author: Jesse Gross <jesse@kernel.org> tunnels: Remove encapsulation offloads on decap. If a packet is either locally encapsulated or processed through GRO it is marked with the offloads that it requires. However, when it is decapsulated these tunnel offload indications are not removed. This means that if we receive an encapsulated TCP packet, aggregate it with GRO, decapsulate, and retransmit the resulting frame on a NIC that does not support encapsulation, we won't be able to take advantage of hardware offloads even though it is just a simple TCP packet at this point. This fixes the problem by stripping off encapsulation offload indications when packets are decapsulated. The performance impacts of this bug are significant. In a test where a Geneve encapsulated TCP stream is sent to a hypervisor, GRO'ed, decapsulated, and bridged to a VM performance is improved by 60% (5Gbps->8Gbps) as a result of avoiding unnecessary segmentation at the VM tap interface. Reported-by: Ramu Ramamurthy <sramamur@linux.vnet.ibm.com> Fixes: 68c33163 ("v4 GRE: Add TCP segmentation offload for GRE") Signed-off-by: Jesse Gross <jesse@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
*	datapath: backport: iptunnel: scrub packet in iptunnel_pull_header	Pravin B Shelar	2016-07-08	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Upstream Commit: commit 7f290c94352e59b1d720055fce760a69a63bd0a1 Author: Jiri Benc <jbenc@redhat.com> iptunnel: scrub packet in iptunnel_pull_header Part of skb_scrub_packet was open coded in iptunnel_pull_header. Let it call skb_scrub_packet directly instead. Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
*	datapath: compat: Refactor egress tunnel info	Pravin B Shelar	2016-07-08	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	Upstream, tunnel egress info is retrieved using ndo_fill_metadata_dst. Since older kernels do not have it, we need to keep the vport operation to do the same on these kernels. The following patch tries to merge these two operations into one to avoid code duplication. This commit backports fc4099f1 ("openvswitch: Fix egress tunnel info.") Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
*	compat: ipv4: Pass struct net through ip_fragment.	Eric W. Biederman	2016-06-27	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \|	Upstream commit: ipv4: Pass struct net through ip_fragment Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Upstream: 694869b3c544 ("ipv4: Pass struct net through ip_fragment") Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
*	acinclude: check for numa library	Bhanuprakash Bodireddy	2016-06-24	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The numa library is needed for NUMA-aware vHost User functionality. In case of a missing numa package, the OVS DPDK configuration fails with "error: Could not find DPDK libraries in <DPDK_LOC>/TARGET/lib" even though the DPDK library is installed. This patch fixes this inappropriate error by checking for the presence of the numa library and outputting an appropriate error message, "error: unable to find libnuma, install the dependency package", in case of a missing package. Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com> Acked-by: Ciara Loftus <ciara.loftus@intel.com> Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
*	datapath: change nf_connlabels_get bit arg to 'highest used'	Jarno Rajahalme	2016-06-20	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Upstream commit: commit adff6c65600000ec2bb71840c943ee12668080f5 Author: Florian Westphal <fw@strlen.de> Date: Tue Apr 12 18:14:25 2016 +0200 netfilter: connlabels: change nf_connlabels_get bit arg to 'highest used' nf_connlabel_set() takes the bit number that we would like to set. nf_connlabels_get() however took the number of bits that we want to support. So e.g. nf_connlabels_get(32) supports bits 0 to 31, but not 32. This changes nf_connlabels_get() to take the highest bit that we want to set. Callers then don't have to cope with a potential integer wrap when using nf_connlabels_get(bit + 1) anymore. Current callers are fine, this change is only to make followup nft ct label set support simpler. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Jarno Rajahalme <jarno@ovn.org> OVS compat code defined nf_connlabels_get() if it was missing. Now we redefine it if it is missing, or if it has the old signature. Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
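The difference between the two argument conventions, and the integer wrap mentioned above, can be shown with a small user-space sketch. The three functions below are hypothetical stand-ins written for this example and are not the kernel's nf_connlabels_get(); they model only which bits each convention admits.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical old-style convention: the caller passes the NUMBER of bits
 * to support, so bits 0 .. nbits-1 are usable. */
static bool old_style_supports(unsigned int nbits, unsigned int bit)
{
    return bit < nbits;
}

/* Hypothetical new-style convention: the caller passes the HIGHEST bit it
 * intends to set, so bits 0 .. highest are usable. */
static bool new_style_supports(unsigned int highest, unsigned int bit)
{
    return bit <= highest;
}

/* What an old-style caller had to compute for a given bit. Unsigned
 * arithmetic wraps, so this yields 0 when bit == UINT_MAX. */
static unsigned int old_style_arg_for(unsigned int bit)
{
    return bit + 1u;
}
```

Under the old convention an argument of 32 admits bit 31 but not bit 32, and a caller deriving the argument as `bit + 1` ends up requesting zero bits when `bit` is `UINT_MAX`. Passing the highest bit directly removes that edge case.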
*	datapath: compat for NAT.	Jarno Rajahalme	2016-06-20	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \|	Compat code required to make the NAT code in the following patch compile with Linux 3.10 - 4.6. Some compat code applies to conntrack.c itself; it is added after the main NAT backport for conntrack.c later in the series. Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
*	acinclude: Add OVS_FIND_PARAM_IFELSE.	Jarno Rajahalme	2016-06-20	1	-4/+37
\| \| \| \| \| \| \| \| \| \| \| \|	OVS_FIND_PARAM_IFELSE is a more robust macro for checking function parameters, as it does not require the parameter to be on the same line as the function name, as OVS_GREP_IFELSE does. Use this to fix the check for the struct conntrack_zone parameter, which is on a different line on Linux 4.3 and higher. Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Jesse Gross <jesse@kernel.org>
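The following C sketch illustrates why a line-based check can miss a parameter that a declaration-spanning check finds. It is not the implementation of either macro (both are configure-time m4 macros); `same_line_has_param` and `decl_has_param` are hypothetical helpers, and they use simple substring matching on an assumed header text.

```c
#include <stdbool.h>
#include <string.h>

/* Line-based check: true only if func and param occur on the SAME line.
 * Lines longer than the buffer are truncated for simplicity. */
static bool same_line_has_param(const char *text, const char *func,
                                const char *param)
{
    char buf[256];

    while (*text) {
        size_t len = strcspn(text, "\n");            /* length of this line */
        size_t n = len < sizeof buf - 1 ? len : sizeof buf - 1;

        memcpy(buf, text, n);
        buf[n] = '\0';
        if (strstr(buf, func) && strstr(buf, param))
            return true;
        text += len;
        if (*text == '\n')
            text++;                                  /* skip the newline */
    }
    return false;
}

/* Declaration-based check: true if param occurs between the first
 * occurrence of func and the next ')', even across line breaks. */
static bool decl_has_param(const char *text, const char *func,
                           const char *param)
{
    const char *start = strstr(text, func);
    const char *close, *hit;

    if (!start)
        return false;
    close = strchr(start, ')');
    hit = strstr(start, param);
    return hit && close && hit < close;
}
```

For a made-up declaration such as `int frob(struct net *net,` followed on the next line by `const struct zone *zone);`, the line-based check reports that `struct zone` is absent while the declaration-based check finds it.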