path: root/acinclude.m4
Commit message | Author | Age | Files | Lines
* compat: Add act_pedit compatibility for old kernels | Paul Blakey | 2017-11-16 | 1 | -0/+7
  Added compatibility for action pedit.
  Signed-off-by: Paul Blakey <paulb@mellanox.com>
  Reviewed-by: Roi Dayan <roid@mellanox.com>
  Signed-off-by: Simon Horman <simon.horman@netronome.com>
* acinclude: Fix SKB_GSO_UDP check. | William Tu | 2017-11-01 | 1 | -1/+1
  The HAVE_SKB_GSO_UDP checks whether skbuff.h defines SKB_GSO_UDP. However, it falsely returns yes because grep matches SKB_GSO_UDP_TUNNEL. Thus, add space character '[:space:]' before and after it.
  Fixes: ad283644f0e4 ("acinclude: Check for SKB_GSO_UDP")
  Signed-off-by: William Tu <u9012063@gmail.com>
  Cc: Greg Rose <gvrose8192@gmail.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
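The substring problem described in this commit can be reproduced with a short shell sketch. The header line and the two helper functions below are hypothetical stand-ins, not the actual pattern used in acinclude.m4; they only show why delimiting the symbol with the `[[:space:]]` character class avoids the false positive.

```shell
# Hypothetical stand-in for skbuff.h on a kernel where only the
# SKB_GSO_UDP_TUNNEL enumerator is left.
hdr=$(mktemp)
printf '\tSKB_GSO_UDP_TUNNEL = 1 << 9,\n' > "$hdr"

# Naive check: the symbol matches as a substring of a longer name.
has_symbol_naive() { grep -q "$1" "$2"; }

# Delimited check: the symbol must have whitespace on both sides.
has_symbol_strict() { grep -q "[[:space:]]$1[[:space:]]" "$2"; }

if has_symbol_naive SKB_GSO_UDP "$hdr"; then naive=yes; else naive=no; fi
if has_symbol_strict SKB_GSO_UDP "$hdr"; then strict=yes; else strict=no; fi

echo "naive=$naive strict=$strict"
rm -f "$hdr"
```

On this input the naive check reports a match while the delimited check does not, which is the behaviour the fix is after.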
* acinclude: Add support for Linux 4.13 | Greg Rose | 2017-09-22 | 1 | -2/+2
  Add configuration support for the just released 4.13 Linux kernel.
  Signed-off-by: Greg Rose <gvrose8192@gmail.com>
  Signed-off-by: Andy Zhou <azhou@ovn.org>
* acinclude: Check for existence of nf_hook_ops member "list". | Greg Rose | 2017-09-22 | 1 | -0/+3
  The "list" member of the nf_hook_ops structure is removed in Linux kernel release 4.13.
  Signed-off-by: Greg Rose <gvrose8192@gmail.com>
  Signed-off-by: Andy Zhou <azhou@ovn.org>
* acinclude: Check for extended netlink ack presence | Greg Rose | 2017-09-22 | 1 | -0/+3
  RTNL ops validate and newlink now include the extended netlink ack feature. Check for it and set HAVE_EXT_ACK_IN_RTNL_LINKOPS if found.
  Signed-off-by: Greg Rose <gvrose8192@gmail.com>
  Signed-off-by: Andy Zhou <azhou@ovn.org>
* acinclude: Add compat define for DST_NOCACHE | Greg Rose | 2017-09-22 | 1 | -0/+2
  DST_NOCACHE is removed in the 4.13 Linux kernel - add check for it and if found set HAVE_DST_NOCACHE.
  Signed-off-by: Greg Rose <gvrose8192@gmail.com>
  Signed-off-by: Andy Zhou <azhou@ovn.org>
* acinclude: Check for SKB_GSO_UDP | Greg Rose | 2017-09-22 | 1 | -0/+2
  Removed in kernel 4.13
  Signed-off-by: Greg Rose <gvrose8192@gmail.com>
  Signed-off-by: Andy Zhou <azhou@ovn.org>
* acinclude: Add missing define | Greg Rose | 2017-09-21 | 1 | -1/+2
  The final line of a conditional search for the nf_conntrack_helper_put function does not actually define HAVE_NF_CONNTRACK_HELPER_PUT used in datapath/linux/compat/include/net/netfilter/nf_conntrack_helper.h.
  Fixes: ac8e3c6d14d2 ("datapath: introduce nf_conntrack_helper_put function")
  Signed-off-by: Greg Rose <gvrose8192@gmail.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
* datapath: compat: Fix build on RHEL 7.4 | Yi-Hung Wei | 2017-08-23 | 1 | -0/+4
  RHEL 7.4 introduces netdev_master_upper_dev_link_rh() that breaks the backport of OVS kernel module on RHEL 7.4. This patch fixes that issue.
  Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
  Signed-off-by: Joe Stringer <joe@ovn.org>
* compat: Update tc compatibility header | Paul Blakey | 2017-08-11 | 1 | -3/+3
  Update to include up to flower ttl matching.
  Signed-off-by: Paul Blakey <paulb@mellanox.com>
  Reviewed-by: Roi Dayan <roid@mellanox.com>
  Acked-by: Simon Horman <simon.horman@netronome.com>
  Signed-off-by: Joe Stringer <joe@ovn.org>
* acinclude: Also support pkg-config for configuring dpdk. | Christian Ehrhardt | 2017-08-07 | 1 | -7/+10
  If available use dpdk pkg-config info of libdpdk to set the right include paths. That for example, allows packagers to provide non default include paths in a common way (pkg-config).
  Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
  Suggested-by: Luca Boccassi <luca.boccassi@gmail.com>
  Acked-by: Luca Boccassi <luca.boccassi@gmail.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
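A rough sketch of the idea, assuming the pkg-config module is named libdpdk as the message says. The fallback branch, the `DPDK_DIR` variable and the `/usr/local` default are hypothetical illustrations and are not taken from acinclude.m4.

```shell
# Prefer pkg-config metadata when it is available for libdpdk.
if command -v pkg-config >/dev/null 2>&1 && pkg-config --exists libdpdk; then
    dpdk_cflags=$(pkg-config --cflags libdpdk)
    origin=pkg-config
else
    # Hypothetical fallback: derive the include path from a directory
    # chosen by the user (DPDK_DIR is an assumed name for this sketch).
    dpdk_cflags="-I${DPDK_DIR:-/usr/local}/include"
    origin=fallback
fi

printf '%s\n' "dpdk include flags from $origin: $dpdk_cflags"
```

The point of the pkg-config branch is that a packager who installs the headers in a non-default location only has to ship a correct `.pc` file.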
* acinclude.m4: Support Linux kernel 4.12 | Greg Rose | 2017-07-24 | 1 | -1/+1
  Allow datapath kernel modules to be configured and built for kernels up to 4.12.
  Signed-off-by: Greg Rose <gvrose8192@gmail.com>
  Signed-off-by: Joe Stringer <joe@ovn.org>
* compat: net: store port/representator id in metadata_dst. | Joe Stringer | 2017-07-24 | 1 | -0/+3
  Upstream commit:
    commit 3fcece12bc1b6dcdf0986f2cd9e8f63b1f9b6aa0
    Author: Jakub Kicinski <jakub.kicinski@netronome.com>
    Date: Fri Jun 23 22:11:58 2017 +0200

    net: store port/representator id in metadata_dst

    Switches and modern SR-IOV enabled NICs may multiplex traffic from Port representators and control messages over single set of hardware queues. Control messages and muxed traffic may need ordered delivery.

    Those requirements make it hard to comfortably use TC infrastructure today unless we have a way of attaching metadata to skbs at the upper device. Because single set of queues is used for many netdevs stopping TC/sched queues of all of them reliably is impossible and lower device has to retreat to returning NETDEV_TX_BUSY and usually has to take extra locks on the fastpath.

    This patch attempts to enable port/representative devs to attach metadata to skbs which carry port id. This way representatives can be queueless and all queuing can be performed at the lower netdev in the usual way.

    Traffic arriving on the port/representative interfaces will be have metadata attached and will subsequently be queued to the lower device for transmission. The lower device should recognize the metadata and translate it to HW specific format which is most likely either a special header inserted before the network headers or descriptor/metadata fields.

    Metadata is associated with the lower device by storing the netdev pointer along with port id so that if TC decides to redirect or mirror the new netdev will not try to interpret it.

    This is mostly for SR-IOV devices since switches don't have lower netdevs today.

    Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
    Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
    Signed-off-by: Simon Horman <horms@verge.net.au>
    Signed-off-by: David S. Miller <davem@davemloft.net>
  Upstream: 3fcece12bc1b ("net: store port/representator id in metadata_dst")
  Signed-off-by: Joe Stringer <joe@ovn.org>
  Acked-by: Greg Rose <gvrose8192@gmail.com>
* datapath: get rid of redundant vxlan_dev.flags | Greg Rose | 2017-07-24 | 1 | -0/+2
  Upstream commit:
    commit dc5321d79697db1b610c25fa4fad1aec7533ea3e
    Author: Matthias Schiffer <mschiffer@universe-factory.net>
    Date: Mon Jun 19 10:03:56 2017 +0200

    vxlan: get rid of redundant vxlan_dev.flags

    There is no good reason to keep the flags twice in vxlan_dev and vxlan_config.

    Signed-off-by: Matthias Schiffer <mschiffer@universe-factory.net>
    Signed-off-by: David S. Miller <davem@davemloft.net>
  Applied using HAVE_VXLAN_DEV_CFG compatibility flag defined in acinclude.m4.
  Signed-off-by: Greg Rose <gvrose8192@gmail.com>
  Signed-off-by: Joe Stringer <joe@ovn.org>
* compat: convert many more places to skb_put_zero(). | Joe Stringer | 2017-07-24 | 1 | -0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Upstream commit: commit de77b966ce8adcb4c58d50e2f087320d5479812a Author: Johannes Berg <johannes.berg@intel.com> Date: Fri Jun 16 14:29:19 2017 +0200 networking: convert many more places to skb_put_zero() There were many places that my previous spatch didn't find, as pointed out by yuan linyu in various patches. The following spatch found many more and also removes the now unnecessary casts: @@ identifier p, p2; expression len; expression skb; type t, t2; @@ ( -p = skb_put(skb, len); +p = skb_put_zero(skb, len); | -p = (t)skb_put(skb, len); +p = skb_put_zero(skb, len); ) ... when != p ( p2 = (t2)p; -memset(p2, 0, len); | -memset(p, 0, len); ) @@ type t, t2; identifier p, p2; expression skb; @@ t *p; ... ( -p = skb_put(skb, sizeof(t)); +p = skb_put_zero(skb, sizeof(t)); | -p = (t *)skb_put(skb, sizeof(t)); +p = skb_put_zero(skb, sizeof(t)); ) ... when != p ( p2 = (t2)p; -memset(p2, 0, sizeof(*p)); | -memset(p, 0, sizeof(*p)); ) @@ expression skb, len; @@ -memset(skb_put(skb, len), 0, len); +skb_put_zero(skb, len); Apply it to the tree (with one manual fixup to keep the comment in vxlan.c, which spatch removed.) Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net> Use e45a79da863c ("skbuff/mac80211: introduce and use skb_put_zero()") as the basis for the backported function. Upstream: de77b966ce8a ("networking: convert many more places to skb_put_zero()") Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Greg Rose <gvrose8192@gmail.com>
* datapath: Fix inconsistent teardown and release of private netdev state. | Greg Rose | 2017-07-24 | 1 | -0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Upstream commit: commit cf124db566e6b036b8bcbe8decbed740bdfac8c6 Author: David S. Miller <davem@davemloft.net> Date: Mon May 8 12:52:56 2017 -0400 net: Fix inconsistent teardown and release of private netdev state. Network devices can allocate reasources and private memory using netdev_ops->ndo_init(). However, the release of these resources can occur in one of two different places. Either netdev_ops->ndo_uninit() or netdev->destructor(). The decision of which operation frees the resources depends upon whether it is necessary for all netdev refs to be released before it is safe to perform the freeing. netdev_ops->ndo_uninit() presumably can occur right after the NETDEV_UNREGISTER notifier completes and the unicast and multicast address lists are flushed. netdev->destructor(), on the other hand, does not run until the netdev references all go away. Further complicating the situation is that netdev->destructor() almost universally does also a free_netdev(). This creates a problem for the logic in register_netdevice(). Because all callers of register_netdevice() manage the freeing of the netdev, and invoke free_netdev(dev) if register_netdevice() fails. If netdev_ops->ndo_init() succeeds, but something else fails inside of register_netdevice(), it does call ndo_ops->ndo_uninit(). But it is not able to invoke netdev->destructor(). This is because netdev->destructor() will do a free_netdev() and then the caller of register_netdevice() will do the same. However, this means that the resources that would normally be released by netdev->destructor() will not be. Over the years drivers have added local hacks to deal with this, by invoking their destructor parts by hand when register_netdevice() fails. Many drivers do not try to deal with this, and instead we have leaks. 
Let's close this hole by formalizing the distinction between what private things need to be freed up by netdev->destructor() and whether the driver needs unregister_netdevice() to perform the free_netdev(). netdev->priv_destructor() performs all actions to free up the private resources that used to be freed by netdev->destructor(), except for free_netdev(). netdev->needs_free_netdev is a boolean that indicates whether free_netdev() should be done at the end of unregister_netdevice(). Now, register_netdevice() can sanely release all resources after ndo_ops->ndo_init() succeeds, by invoking both ndo_ops->ndo_uninit() and netdev->priv_destructor(). And at the end of unregister_netdevice(), we invoke netdev->priv_destructor() and optionally call free_netdev(). Signed-off-by: David S. Miller <davem@davemloft.net> Applied the portion of the commit applicable to openvswitch. Signed-off-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: Joe Stringer <joe@ovn.org>
* datapath: more accurate checksumming in queue_userspace_packet() | Joe Stringer | 2017-07-24 | 1 | -0/+1
  Upstream commit:
    commit 7529390d08f07fbf9b0174c5a87600b5caa1a8e8
    Author: Davide Caratti <dcaratti@redhat.com>
    Date: Thu May 18 15:44:42 2017 +0200

    openvswitch: more accurate checksumming in queue_userspace_packet()

    if skb carries an SCTP packet and ip_summed is CHECKSUM_PARTIAL, it needs CRC32c in place of Internet Checksum: use skb_csum_hwoffload_help to avoid corrupting such packets while queueing them towards userspace.

    Signed-off-by: Davide Caratti <dcaratti@redhat.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
  Signed-off-by: Joe Stringer <joe@ovn.org>
* datapath: introduce nf_conntrack_helper_put function | Greg Rose | 2017-07-24 | 1 | -0/+2
  Upstream commit:
    commit d91fc59cd77c719f33eda65c194ad8f95a055190
    Author: Liping Zhang <zlpnobody@gmail.com>
    Date: Sun May 7 22:01:55 2017 +0800

    netfilter: introduce nf_conntrack_helper_put helper function

    And convert module_put invocation to nf_conntrack_helper_put, this is prepared for the followup patch, which will add a refcnt for cthelper, so we can reject the deleting request when cthelper is in use.

    Signed-off-by: Liping Zhang <zlpnobody@gmail.com>
    Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
  Applied with additional use of HAVE_NF_CONNTRACK_HELPER_PUT compatibility flag defined in acinclude.m4.
  Signed-off-by: Greg Rose <gvrose8192@gmail.com>
  Signed-off-by: Joe Stringer <joe@ovn.org>
* acinclude.m4: Avoid error from printf. | Ben Pfaff | 2017-07-17 | 1 | -1/+1
  GNU (at least) printf interprets -I as an option, but we want to print it literally, so use %s.
  CC: YAMAMOTO Takashi <yamamoto@ovn.org>
  Fixes: 27d41afaa446 ("acinclude.m4: Avoid echo -n")
  Signed-off-by: Ben Pfaff <blp@ovn.org>
  Acked-by: Justin Pettit <jpettit@ovn.org>
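The pattern behind this fix is to keep the format string fixed and pass the text as an argument. A minimal sketch follows; the include path is a made-up value for illustration and is not the string printed by acinclude.m4.

```shell
flag="-I/usr/include/example"   # hypothetical include flag

# Risky form, left commented out: the flag itself becomes the first
# argument, so a printf that parses options may reject "-I...".
# printf "$flag"

# Safe form: "%s" is the format and the flag is only data. Like the
# "echo -n" it replaces, this prints no trailing newline.
out=$(printf '%s' "$flag")

echo "[$out]"
```

The same form also protects against `%` or backslash sequences in the data being interpreted as formatting directives.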
* acinclude.m4: Avoid echo -n | YAMAMOTO Takashi | 2017-07-16 | 1 | -1/+1
  -n option for echo is not portable. Use printf instead. This fixes OSX build on travis-ci.
  Acked-by: Ben Pfaff <blp@ovn.org>
  Signed-off-by: YAMAMOTO Takashi <yamamoto@ovn.org>
* ctags: include symbols with locking annotations. | Flavio Leitner | 2017-07-13 | 1 | -0/+8
  OVS uses extensively clang annotations for thread safety checks. The ctags tool can't parse them, so they are not included in the tag file. This patch improves the configure script to generate a list of identifiers from the header compiler.h to be ignored by ctags.
  Signed-off-by: Flavio Leitner <fbl@redhat.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
  Reviewed-by: Aaron Conole <aconole@redhat.com>
* configure: Fix check for rte_config.h to handle cross-compilation. | Ben Pfaff | 2017-07-07 | 1 | -3/+4
  The check for rte_config.h in acinclude.m4 used AC_CHECK_FILE, but this macro is intended to check for a file on the host system, not the build system, which means that it fails unconditionally in a cross-compilation environment. However, the intended check here is for a header file, which is part of the build system. To check for part of the build system, we can just use "test", so this commit makes that change.
  Reported-by: Hemant Agrawal <hemant.agrawal@nxp.com>
  Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2017-March/329994.html
  Signed-off-by: Ben Pfaff <blp@ovn.org>
  Acked-by: Darrell Ball <dlu998@gmail.com>
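As a sketch of what a plain "test" check looks like, assuming a hypothetical include directory created only so the example runs (the real check in acinclude.m4 uses the DPDK paths chosen at configure time, which are not shown here):

```shell
# Hypothetical DPDK include directory standing in for the real one.
dpdk_inc=$(mktemp -d)
: > "$dpdk_inc/rte_config.h"

# Plain shell test: it looks at the filesystem of the machine running
# configure, so it behaves the same whether or not we cross-compile.
if test -f "$dpdk_inc/rte_config.h"; then
    found=yes
else
    found=no
fi

echo "rte_config.h found: $found"
rm -rf "$dpdk_inc"
```

Because the header is consumed by the compiler on the machine doing the build, checking that machine's filesystem directly is sufficient.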
* Makefiles: Fail build for flake8 only when configured with --enable-Werror. | Ben Pfaff | 2017-07-07 | 1 | -1/+10
  flake8 checking is useful. Until now, it always failed the build for any flake8 errors. This is too aggressive, for the same reason that always failing the build for any compiler warnings is too aggressive: compilers change over time and asynchronously from OVS itself. Thus, if we release some version of OVS today, even if it's flake8-clean today, it might not be flake8-clean tomorrow, even with the same settings. We don't want to have to track flake8 warnings on every release branch.
  Thus, this adopts the same policy for compiler warnings: always report them, but only fail the build if --enable-Werror was configured. Usually just developers use that configure option, and they're prepared to deal with the fallout.
  Signed-off-by: Ben Pfaff <blp@ovn.org>
  Acked-by: Russell Bryant <russell@ovn.org>
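The policy described above (always report, fail only under --enable-Werror) can be sketched as a small shell helper. The function `lint_gate` and its arguments are hypothetical and do not reproduce the actual Makefile rule.

```shell
# Hypothetical helper, not the OVS Makefile rule.
# $1: "yes" if configured with --enable-Werror, anything else otherwise.
# $2: exit status reported by the lint tool.
lint_gate() {
    if [ "$2" -ne 0 ]; then
        # Problems are always reported...
        echo "lint reported problems (status $2)" >&2
        # ...but they only fail the build in Werror mode.
        if [ "$1" = yes ]; then
            return 1
        fi
    fi
    return 0
}

lint_gate no 1 && echo "without Werror: build continues"
lint_gate yes 1 || echo "with Werror: build fails"
```

This mirrors the treatment of compiler warnings: release branches keep building when a newer lint tool adds checks, while developers who opt in see hard failures.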
* compat: Restrict __ro_after_init usage | Greg Rose | 2017-06-19 | 1 | -0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The attribute __ro_after_init was introduced in Linux kernel 4.5. If a data structure is given this attribute then after the driver module loads the memory page where the data resides will be marked read only. The compat code in cache.h always defines __ro_after_init if it is not already defined so that it can be used as an attribute for the datapath genl_family structure definitions. If __ro_after_init is defined then it is used "as-is" where it will apply the read only attribute after driver initialization. This is incorrect usage for the Generic Netlink genl_family structure definitions prior to Linux kernel 4.10. The genl_family structure in those kernels includes a list header member that will be written to when the generic netlink family is unregistered. This will cause a subsequent page fault and kernel panic because at this time the genl_family structure data has been marked read only in the page descriptor. A new compat macro is introduced in acinclude.m4 to detect when the genl_family structure has the family_list list header as a member. In this case HAVE_GENL_FAMILY_LIST is defined and if __ro_after_init is also defined then it is undefined and redefined as empty. This will prevent the genl_family data structure from being marked read only in kernels 4.5 through 4.9 and thus prevent the page fault when the generic netlink families in datapath.c are unregistered. [Committer notes] * Rolled a short explanation comment into the code. Fixes: ba63fe260bd5 ("datapath: Allow compile against current net-next.") CC: Jarno Rajahalme <jarno@ovn.org> Signed-off-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: Joe Stringer <joe@ovn.org>
* compat: Add tc compatibility headers for old kernels | Paul Blakey | 2017-05-30 | 1 | -0/+26
  Added compatibility headers for actions vlan and tunnel key.
  Do not use compat code when compiling kernel datapath there is no need for it as TC compatibility is not provided there. In other words, the compat code is only used when compiling user-space code against old kernel headers.
  Signed-off-by: Paul Blakey <paulb@mellanox.com>
  Reviewed-by: Roi Dayan <roid@mellanox.com>
  Acked-by: Flavio Leitner <fbl@sysclose.org>
  Signed-off-by: Simon Horman <simon.horman@netronome.com>
* datapath: Remove untracked CT on newer kernels. | Joe Stringer | 2017-05-03 | 1 | -0/+2
  Upstream commits cc41c84b7e7f ("netfilter: kill the fake untracked conntrack objects") and ab8bc7ed864b ("netfilter: remove nf_ct_is_untracked") removed the 'untracked' conntrack objects and functions. The latter commit removes the usage of nf_ct_is_untracked() from OVS. However, older kernels still have a representation of 'untracked' CT objects so the code needs to remain until the kernel support is bumped to Linux 4.12 or newer. Introduce a macro to detect this symbol and wrap these lines in the macro check.
  Signed-off-by: Joe Stringer <joe@ovn.org>
  Acked-by: Greg Rose <gvrose8192@gmail.com>
* datapath: Fixups for MPLS GSO | Yi-Hung Wei | 2017-05-03 | 1 | -0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch backports the following two upstream commits to fix MPLS GSO in ovs datapath. Starting from upstream commit 48d2ab609b6b ("net: mpls: Fixups for GSO"), the mpls_gso kernel module relies on the fact that skb_network_header() points to the mpls header and skb_inner_network_header() points to the L3 header so that it can derive the length of mpls header correctly, and the upstream commit updates how ovs datapath marks the skb header when push and pop mpls. However, the old mpls_gso kernel module assumes that the skb_network_header() points to the L3 header, and the old mpls_gso kernel module will misbehave if the ovs datapath marks the skb_network_header() in the new way since it will treat mpls header as the L3 header. Because of the functional signature of mpls_gso_segment() does not change, this backport patch uses the new mpls_hdr() to determine if the kernel that ovs datapath is compiled with has the new or legacy mpls_gso kernel module. It has been tested on kernel 4.4 and 4.9. Upstream commit: commit 48d2ab609b6bbecb7698487c8579bc40de9d6dfa Author: David Ahern <dsa@cumulusnetworks.com> Date: Wed Aug 24 20:10:44 2016 -0700 net: mpls: Fixups for GSO As reported by Lennert the MPLS GSO code is failing to properly segment large packets. There are a couple of problems: 1. the inner protocol is not set so the gso segment functions for inner protocol layers are not getting run, and 2 MPLS labels for packets that use the "native" (non-OVS) MPLS code are not properly accounted for in mpls_gso_segment. The MPLS GSO code was added for OVS. It is re-using skb_mac_gso_segment to call the gso segment functions for the higher layer protocols. That means skb_mac_gso_segment is called twice -- once with the network protocol set to MPLS and again with the network protocol set to the inner protocol. 
This patch sets the inner skb protocol addressing item 1 above and sets the network_header and inner_network_header to mark where the MPLS labels start and end. The MPLS code in OVS is also updated to set the two network markers. >From there the MPLS GSO code uses the difference between the network header and the inner network header to know the size of the MPLS header that was pushed. It then pulls the MPLS header, resets the mac_len and protocol for the inner protocol and then calls skb_mac_gso_segment to segment the skb. Afterward the inner protocol segmentation is done the skb protocol is set to mpls for each segment and the network and mac headers restored. Reported-by: Lennert Buytenhek <buytenh@wantstofly.org> Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net> Upstream commit: commit 85de4a2101acb85c3b1dde465e84596ccca99f2c Author: Jiri Benc <jbenc@redhat.com> Date: Fri Sep 30 19:08:07 2016 +0200 openvswitch: use mpls_hdr skb_mpls_header is equivalent to mpls_hdr now. Use the existing helper instead. Signed-off-by: Jiri Benc <jbenc@redhat.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>
* compat: Fix build error in kernels 4.10 | Greg Rose | 2017-04-28 | 1 | -1/+3
  Use the acinclude.m4 configuration file to check for the net parameter that was added to the ipv4 and ipv6 frags init functions in the 4.10 Linux kernel to check whether DEFRAG_ENABLE_TAKES_NET should be set and then check for that at compile time.
  This is an alternative solution patch for the issue reported by Raymond Burkholder and the patch submitted by Guoshuai Li.
  [Committer notes] Squash in "acinclude.m4: Add check for struct net parameter" which provides the HAVE_DEFRAG_ENABLE_TAKES_NET.
  Reported-by: Raymond Burkholder <ray@oneunified.net>
  CC: Guoshuai Li <ligs@dtdream.com>
  Signed-off-by: Greg Rose <gvrose8192@gmail.com>
  Signed-off-by: Joe Stringer <joe@ovn.org>
* datapath: pass extended ACK struct to parsing functions | Johannes Berg | 2017-04-20 | 1 | -0/+3
  Upstream commit:
    commit fceb6435e85298f747fee938415057af837f5a8a
    Author: Johannes Berg <johannes.berg@intel.com>
    Date: Wed Apr 12 14:34:07 2017 +0200

    netlink: pass extended ACK struct to parsing functions

    Pass the new extended ACK reporting struct to all of the generic netlink parsing functions. For now, pass NULL in almost all callers (except for some in the core.)

    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>
  Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
  Acked-by: Joe Stringer <joe@ovn.org>
* compat: ipv6: orphan skbs in reassembly unit. | Eric Dumazet | 2017-04-19 | 1 | -4/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Upstream commit: ipv6: orphan skbs in reassembly unit Andrey reported a use-after-free in IPv6 stack. Issue here is that we free the socket while it still has skb in TX path and in some queues. It happens here because IPv6 reassembly unit messes skb->truesize, breaking skb_set_owner_w() badly. We fixed a similar issue for IPV4 in commit 8282f27449bf ("inet: frag: Always orphan skbs inside ip_defrag()") Acked-by: Joe Stringer <joe@ovn.org> ================================================================== BUG: KASAN: use-after-free in sock_wfree+0x118/0x120 Read of size 8 at addr ffff880062da0060 by task a.out/4140 page:ffffea00018b6800 count:1 mapcount:0 mapping: (null) index:0x0 compound_mapcount: 0 flags: 0x100000000008100(slab|head) raw: 0100000000008100 0000000000000000 0000000000000000 0000000180130013 raw: dead000000000100 dead000000000200 ffff88006741f140 0000000000000000 page dumped because: kasan: bad access detected CPU: 0 PID: 4140 Comm: a.out Not tainted 4.10.0-rc3+ #59 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:15 dump_stack+0x292/0x398 lib/dump_stack.c:51 describe_address mm/kasan/report.c:262 kasan_report_error+0x121/0x560 mm/kasan/report.c:370 kasan_report mm/kasan/report.c:392 __asan_report_load8_noabort+0x3e/0x40 mm/kasan/report.c:413 sock_flag ./arch/x86/include/asm/bitops.h:324 sock_wfree+0x118/0x120 net/core/sock.c:1631 skb_release_head_state+0xfc/0x250 net/core/skbuff.c:655 skb_release_all+0x15/0x60 net/core/skbuff.c:668 __kfree_skb+0x15/0x20 net/core/skbuff.c:684 kfree_skb+0x16e/0x4e0 net/core/skbuff.c:705 inet_frag_destroy+0x121/0x290 net/ipv4/inet_fragment.c:304 
inet_frag_put ./include/net/inet_frag.h:133 nf_ct_frag6_gather+0x1125/0x38b0 net/ipv6/netfilter/nf_conntrack_reasm.c:617 ipv6_defrag+0x21b/0x350 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c:68 nf_hook_entry_hookfn ./include/linux/netfilter.h:102 nf_hook_slow+0xc3/0x290 net/netfilter/core.c:310 nf_hook ./include/linux/netfilter.h:212 __ip6_local_out+0x52c/0xaf0 net/ipv6/output_core.c:160 ip6_local_out+0x2d/0x170 net/ipv6/output_core.c:170 ip6_send_skb+0xa1/0x340 net/ipv6/ip6_output.c:1722 ip6_push_pending_frames+0xb3/0xe0 net/ipv6/ip6_output.c:1742 rawv6_push_pending_frames net/ipv6/raw.c:613 rawv6_sendmsg+0x2cff/0x4130 net/ipv6/raw.c:927 inet_sendmsg+0x164/0x5b0 net/ipv4/af_inet.c:744 sock_sendmsg_nosec net/socket.c:635 sock_sendmsg+0xca/0x110 net/socket.c:645 sock_write_iter+0x326/0x620 net/socket.c:848 new_sync_write fs/read_write.c:499 __vfs_write+0x483/0x760 fs/read_write.c:512 vfs_write+0x187/0x530 fs/read_write.c:560 SYSC_write fs/read_write.c:607 SyS_write+0xfb/0x230 fs/read_write.c:599 entry_SYSCALL_64_fastpath+0x1f/0xc2 arch/x86/entry/entry_64.S:203 RIP: 0033:0x7ff26e6f5b79 RSP: 002b:00007ff268e0ed98 EFLAGS: 00000206 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 00007ff268e0f9c0 RCX: 00007ff26e6f5b79 RDX: 0000000000000010 RSI: 0000000020f50fe1 RDI: 0000000000000003 RBP: 00007ff26ebc1220 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000206 R12: 0000000000000000 R13: 00007ff268e0f9c0 R14: 00007ff26efec040 R15: 0000000000000003 The buggy address belongs to the object at ffff880062da0000 which belongs to the cache RAWv6 of size 1504 The buggy address ffff880062da0060 is located 96 bytes inside of 1504-byte region [ffff880062da0000, ffff880062da05e0) Freed by task 4113: save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57 save_stack+0x43/0xd0 mm/kasan/kasan.c:502 set_track mm/kasan/kasan.c:514 kasan_slab_free+0x73/0xc0 mm/kasan/kasan.c:578 slab_free_hook mm/slub.c:1352 slab_free_freelist_hook mm/slub.c:1374 
slab_free mm/slub.c:2951 kmem_cache_free+0xb2/0x2c0 mm/slub.c:2973 sk_prot_free net/core/sock.c:1377 __sk_destruct+0x49c/0x6e0 net/core/sock.c:1452 sk_destruct+0x47/0x80 net/core/sock.c:1460 __sk_free+0x57/0x230 net/core/sock.c:1468 sk_free+0x23/0x30 net/core/sock.c:1479 sock_put ./include/net/sock.h:1638 sk_common_release+0x31e/0x4e0 net/core/sock.c:2782 rawv6_close+0x54/0x80 net/ipv6/raw.c:1214 inet_release+0xed/0x1c0 net/ipv4/af_inet.c:425 inet6_release+0x50/0x70 net/ipv6/af_inet6.c:431 sock_release+0x8d/0x1e0 net/socket.c:599 sock_close+0x16/0x20 net/socket.c:1063 __fput+0x332/0x7f0 fs/file_table.c:208 ____fput+0x15/0x20 fs/file_table.c:244 task_work_run+0x19b/0x270 kernel/task_work.c:116 exit_task_work ./include/linux/task_work.h:21 do_exit+0x186b/0x2800 kernel/exit.c:839 do_group_exit+0x149/0x420 kernel/exit.c:943 SYSC_exit_group kernel/exit.c:954 SyS_exit_group+0x1d/0x20 kernel/exit.c:952 entry_SYSCALL_64_fastpath+0x1f/0xc2 arch/x86/entry/entry_64.S:203 Allocated by task 4115: save_stack_trace+0x16/0x20 arch/x86/kernel/stacktrace.c:57 save_stack+0x43/0xd0 mm/kasan/kasan.c:502 set_track mm/kasan/kasan.c:514 kasan_kmalloc+0xad/0xe0 mm/kasan/kasan.c:605 kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:544 slab_post_alloc_hook mm/slab.h:432 slab_alloc_node mm/slub.c:2708 slab_alloc mm/slub.c:2716 kmem_cache_alloc+0x1af/0x250 mm/slub.c:2721 sk_prot_alloc+0x65/0x2a0 net/core/sock.c:1334 sk_alloc+0x105/0x1010 net/core/sock.c:1396 inet6_create+0x44d/0x1150 net/ipv6/af_inet6.c:183 __sock_create+0x4f6/0x880 net/socket.c:1199 sock_create net/socket.c:1239 SYSC_socket net/socket.c:1269 SyS_socket+0xf9/0x230 net/socket.c:1249 entry_SYSCALL_64_fastpath+0x1f/0xc2 arch/x86/entry/entry_64.S:203 Memory state around the buggy address: ffff880062d9ff00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff880062d9ff80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >ffff880062da0000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ^ ffff880062da0080: fb fb fb fb fb fb fb fb fb fb fb 
fb fb fb fb fb ffff880062da0100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> This patch is a bugfix, and will be progressively backported to earlier kernels. If it is backported to any kernel 4.5 through 4.10, then users use that updated kernel with the OVS kernel module prior to this patch, it could cause a crash. The compat code here resolves such issues. Upstream: 48cac18ecf1d ("ipv6: orphan skbs in reassembly unit") Signed-off-by: Joe Stringer <joe@ovn.org> Signed-off-by: Andy Zhou <azhou@ovn.org> Acked-by: Joe Stringer <joe@ovn.org>
* acinclude: Allow compile with Linux 4.11. | Jarno Rajahalme | 2017-04-17 | 1 | -2/+2
  Change the Linux kernel tests in OVS configuration. While the backports may still be a little behind, it is useful to be able to test the OVS tree kernel module with the upstream net-next kernel.
  Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
  Acked-by: Ben Pfaff <blp@ovn.org>
* compat: nf_ct_delete compat. | Jarno Rajahalme | 2017-03-08 | 1 | -0/+6
| Upstream commit:
|
|     commit f330a7fdbe1611104622faff7e614a246a7d20f0
|     Author: Florian Westphal <fw@strlen.de>
|     Date:   Thu Aug 25 15:33:31 2016 +0200
|
|     netfilter: conntrack: get rid of conntrack timer
|
|     With stats enabled this eats 80 bytes on x86_64 per nf_conn entry, as
|     Eric Dumazet pointed out during netfilter workshop 2016.
|
|     Eric also says: "Another reason was the fact that Thomas was about to
|     change max timer range [..]" (500462a9de657f8, 'timers: Switch to
|     a non-cascading wheel').
|
|     Remove the timer and use a 32bit jiffies value containing timestamp
|     until entry is valid.
|
|     During conntrack lookup, even before doing tuple comparison, check
|     the timeout value and evict the entry in case it is too old.
|
|     The dying bit is used as a synchronization point to avoid races where
|     multiple cpus try to evict the same entry.
|
|     Because lookup is always lockless, we need to bump the refcnt once
|     when we evict, else we could try to evict already-dead entry that
|     is being recycled.
|
|     This is the standard/expected way when conntrack entries are destroyed.
|
|     Followup patches will introduce garbage collection via work queue
|     and further places where we can reap obsoleted entries (e.g. during
|     netlink dumps), this is needed to avoid expired conntracks from hanging
|     around for too long when lookup rate is low after a busy period.
|
|     Signed-off-by: Florian Westphal <fw@strlen.de>
|     Acked-by: Eric Dumazet <edumazet@google.com>
|     Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
| Upstream commit f330a7fdbe16 ("netfilter: conntrack: get rid of
| conntrack timer") changes the way nf_ct_delete() is called. Prior to
| that commit the call pattern was:
|
|     if (del_timer(&ct->timeout))
|         nf_ct_delete(ct, ...);
|
| After this change nf_ct_delete() is called directly:
|
|     nf_ct_delete(ct, ...);
|
| This patch provides a replacement implementation for nf_ct_delete()
| that first calls del_timer(). This replacement is only used if
| struct nf_conn has a member 'timeout' of type 'struct timer_list'.
|
| The following patch introduces the first caller of nf_ct_delete() in
| the OVS kernel module.
|
| Linux <3.12 does not have nf_ct_delete() at all, so we inline it if it
| does not exist. The inlined code is from 3.11 death_by_timeout(),
| which in later versions simply calls nf_ct_delete().
|
| Upstream commit 02982c27ba1e1bd9f9d4747214e19ca83aa88d0e introduced
| nf_ct_delete() in Linux 3.12. This commit has the original code that
| is being inlined here.
|
| Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
| Acked-by: Joe Stringer <joe@ovn.org>
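The call pattern described in this commit message can be sketched in user space as follows. This is only an illustration under stated assumptions: the struct definitions and the `mock_` and `compat_` names are hypothetical stand-ins, not the kernel's or the OVS compat layer's actual code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct timer_list { bool pending; };
struct nf_conn { struct timer_list timeout; bool deleted; };

/* Mock of del_timer(): true only if it deactivated a pending timer. */
static bool mock_del_timer(struct timer_list *t)
{
    bool was_pending = t->pending;

    t->pending = false;
    return was_pending;
}

/* Mock of the kernel's nf_ct_delete(): just records the deletion. */
static bool mock_nf_ct_delete(struct nf_conn *ct)
{
    ct->deleted = true;
    return true;
}

/* Compat wrapper for kernels whose nf_conn still has a 'timeout' timer:
 * callers use the new direct-call style, and the wrapper restores the
 * old "delete only if we stopped the timer" behaviour. */
static bool compat_nf_ct_delete(struct nf_conn *ct)
{
    if (mock_del_timer(&ct->timeout))
        return mock_nf_ct_delete(ct);
    return false;
}
```

With a wrapper of this shape, calling code can be written once in the post-f330a7fdbe16 style and still behave correctly on older kernels.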
* datapath: add and use nf_ct_set helperFlorian Westphal2017-03-081-0/+2
| Upstream commit:
|
|     commit c74454fadd5ea6fc866ffe2c417a0dba56b2bf1c
|     Author: Florian Westphal <fw@strlen.de>
|     Date:   Mon Jan 23 18:21:57 2017 +0100
|
|     netfilter: add and use nf_ct_set helper
|
|     Add a helper to assign a nf_conn entry and the ctinfo bits to an
|     sk_buff.
|
|     This avoids changing code in followup patch that merges skb->nfct and
|     skb->nfctinfo into skb->_nfct.
|
|     Signed-off-by: Florian Westphal <fw@strlen.de>
|     Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
| Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
| Acked-by: Joe Stringer <joe@ovn.org>
* datapath: add and use skb_nfct helperFlorian Westphal2017-03-081-0/+1
| Upstream commit:
|
|     commit cb9c68363efb6d1f950ec55fb06e031ee70db5fc
|     Author: Florian Westphal <fw@strlen.de>
|     Date:   Mon Jan 23 18:21:56 2017 +0100
|
|     skbuff: add and use skb_nfct helper
|
|     Followup patch renames skb->nfct and changes its type so add a helper
|     to avoid intrusive rename change later.
|
|     Signed-off-by: Florian Westphal <fw@strlen.de>
|     Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
| Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
| Acked-by: Joe Stringer <joe@ovn.org>
* datapath: Allow compiling against Linux 4.10Jarno Rajahalme2017-03-081-2/+2
| OVS in-tree datapath compiles against Linux 4.10 kernel, so allow it.
|
| Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
| Acked-by: Joe Stringer <joe@ovn.org>
* Makefile: Drop vestiges of support for non-GNU Make.Ben Pfaff2017-03-081-52/+2
| Open vSwitch has documented a requirement for GNU Make for a long time,
| yet it had vestiges catering to other make implementations. This
| removes those.
|
| Signed-off-by: Ben Pfaff <blp@ovn.org>
| Acked-by: Russell Bryant <russell@ovn.org>
* datapath: add processing of L3 packetsYang, Yi Y2017-03-021-0/+1
| Upstream commit:
|
|     commit 5108bbaddc37c1c8583f0cf2562d7d3463cd12cb
|     Author: Jiri Benc <jbenc@redhat.com>
|     Date:   Thu Nov 10 16:28:21 2016 +0100
|
|     openvswitch: add processing of L3 packets
|
|     Support receiving, extracting flow key and sending of L3 packets
|     (packets without an Ethernet header).
|
|     Note that even after this patch, non-Ethernet interfaces are still not
|     allowed to be added to bridges. Similarly, netlink interface for
|     sending and receiving L3 packets to/from user space is not in place yet.
|
|     Based on previous versions by Lorand Jakab and Simon Horman.
|
|     Signed-off-by: Lorand Jakab <lojakab@cisco.com>
|     Signed-off-by: Simon Horman <simon.horman@netronome.com>
|     Signed-off-by: Jiri Benc <jbenc@redhat.com>
|     Acked-by: Pravin B Shelar <pshelar@ovn.org>
|     Signed-off-by: David S. Miller <davem@davemloft.net>
|
| Signed-off-by: Yi Yang <yi.y.yang@intel.com>
| Signed-off-by: Joe Stringer <joe@ovn.org>
* datapath: use core MTU range checking in core net infraJarod Wilson2017-03-021-0/+2
| Upstream commit:
|
|     commit 61e84623ace35ce48975e8f90bbbac7557c43d61
|     Author: Jarod Wilson <jarod@redhat.com>
|     Date:   Fri Oct 7 22:04:33 2016 -0400
|
|     net: centralize net_device min/max MTU checking
|
|     While looking into an MTU issue with sfc, I started noticing that almost
|     every NIC driver with an ndo_change_mtu function implemented almost
|     exactly the same range checks, and in many cases, that was the only
|     practical thing their ndo_change_mtu function was doing. Quite a few
|     drivers have either 68, 64, 60 or 46 as their minimum MTU value checked,
|     and then various sizes from 1500 to 65535 for their maximum MTU value. We
|     can remove a whole lot of redundant code here if we simply store min_mtu
|     and max_mtu in net_device, and check against those in net/core/dev.c's
|     dev_set_mtu().
|
|     In theory, there should be zero functional change with this patch, it
|     just puts the infrastructure in place. Subsequent patches will attempt
|     to start using said infrastructure, with theoretically zero change in
|     functionality.
|
|     CC: netdev@vger.kernel.org
|     Signed-off-by: Jarod Wilson <jarod@redhat.com>
|     Signed-off-by: David S. Miller <davem@davemloft.net>
|
| Upstream commit:
|
|     commit 91572088e3fdbf4fe31cf397926d8b890fdb3237
|     Author: Jarod Wilson <jarod@redhat.com>
|     Date:   Thu Oct 20 13:55:20 2016 -0400
|
|     net: use core MTU range checking in core net infra
|
|     ...
|     openvswitch:
|     - set min/max_mtu, remove internal_dev_change_mtu
|     - note: max_mtu wasn't checked previously, it's been set to 65535, which
|       is the largest possible size supported
|     ...
|
|     Signed-off-by: Jarod Wilson <jarod@redhat.com>
|     Signed-off-by: David S. Miller <davem@davemloft.net>
|
| Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
|
| Upstream commit:
|
|     commit 425df17ce3a26d98f76e2b6b0af2acf4aeb0b026
|     Author: Jarno Rajahalme <jarno@ovn.org>
|     Date:   Tue Feb 14 21:16:28 2017 -0800
|
|     openvswitch: Set internal device max mtu to ETH_MAX_MTU.
|
|     Commit 91572088e3fd ("net: use core MTU range checking in core net
|     infra") changed the openvswitch internal device to use the core net
|     infra for controlling the MTU range, but failed to actually set the
|     max_mtu as described in the commit message, which now defaults to
|     ETH_DATA_LEN.
|
|     This patch fixes this by setting max_mtu to ETH_MAX_MTU after
|     ether_setup() call.
|
|     Fixes: 91572088e3fd ("net: use core MTU range checking in core net infra")
|     Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
|     Signed-off-by: David S. Miller <davem@davemloft.net>
|
| This backport detects the new max_mtu field in struct net_device and
| uses the upstream code if it exists, and local backport code if not.
| The latter case is amended with bounds checks with new upstream macros
| ETH_MIN_MTU and ETH_MAX_MTU and the corresponding error messages from
| the upstream commit.
|
| Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
| Signed-off-by: Joe Stringer <joe@ovn.org>
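The bounds check mentioned for the local backport path might look roughly like the sketch below. The function name `check_mtu_sketch` is hypothetical, and the macro values (68 and 65535) are the ones quoted in the commit messages above; the real backport also emits the error messages, which this sketch omits.

```c
#include <assert.h>
#include <errno.h>

/* Values as described in the upstream commit messages; defined here only
 * if the build environment does not already provide them. */
#ifndef ETH_MIN_MTU
#define ETH_MIN_MTU 68
#endif
#ifndef ETH_MAX_MTU
#define ETH_MAX_MTU 0xFFFFU
#endif

/* Hypothetical helper: accept new_mtu only if it lies within
 * [ETH_MIN_MTU, ETH_MAX_MTU], mirroring what a core range check
 * against min_mtu/max_mtu would do on newer kernels. */
static int check_mtu_sketch(int new_mtu)
{
    if (new_mtu < ETH_MIN_MTU)
        return -EINVAL;
    if (new_mtu > (int)ETH_MAX_MTU)
        return -EINVAL;
    return 0;
}
```

On kernels that already have `max_mtu` in the device structure, the driver would instead set the limits once and let the core perform this check.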
* datapath: backport: vlan: Check for vlan ethernet types for 8021.q or 802.1adYang, Yi Y2017-03-011-0/+1
| Upstream commit:
|
|     commit fe19c4f971a55cea3be442d8032a5f6021702791
|     Author: Eric Garver <e@erig.me>
|     Date:   Wed Sep 7 12:56:58 2016 -0400
|
|     This is to simplify using double tagged vlans. This function allows all
|     valid vlan ethertypes to be checked in a single function call.
|     Also replace some instances that check for both ETH_P_8021Q and
|     ETH_P_8021AD.
|
|     Patch based on one originally by Thomas F Herbert.
|
|     Signed-off-by: Thomas F Herbert <thomasfherbert@gmail.com>
|     Signed-off-by: Eric Garver <e@erig.me>
|     Acked-by: Pravin B Shelar <pshelar@ovn.org>
|     Signed-off-by: David S. Miller <davem@davemloft.net>
|
| Signed-off-by: Yi Yang <yi.y.yang@intel.com>
| Acked-by: Eric Garver <e@erig.me>
| Signed-off-by: Joe Stringer <joe@ovn.org>
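A user-space sketch of the kind of helper this commit describes is shown below: one call that accepts either VLAN ethertype. The function and macro names here are hypothetical (prefixed or suffixed with "sketch") so they cannot be mistaken for the kernel's own definitions; the ethertype values 0x8100 (802.1Q) and 0x88A8 (802.1ad) are the standard ones.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Standard VLAN ethertypes, in host byte order. */
#define SKETCH_ETH_P_8021Q  0x8100
#define SKETCH_ETH_P_8021AD 0x88A8

/* Hypothetical helper: true if the network-byte-order ethertype is one of
 * the two VLAN ethertypes, replacing a pair of open-coded comparisons. */
static bool eth_type_vlan_sketch(uint16_t ethertype_be)
{
    return ethertype_be == htons(SKETCH_ETH_P_8021Q) ||
           ethertype_be == htons(SKETCH_ETH_P_8021AD);
}
```

Callers that previously compared against both constants can then use the single predicate, which is what makes double-tagged handling simpler to read.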
* datapath: backport: vlan: Introduce helper functions to check if skb is taggedYang, Yi Y2017-03-011-0/+1
| Upstream commit:
|
|     commit f5a7fb88e1f82542ca14ba93a1d4fa35471c60ca
|     Author: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
|     Date:   Fri Mar 27 14:31:11 2015 +0900
|
|     vlan: Introduce helper functions to check if skb is tagged
|
|     Separate the two checks for single vlan and multiple vlans in
|     netif_skb_features(). This allows us to move the check for multiple
|     vlans to another function later.
|
|     Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
|     Signed-off-by: David S. Miller <davem@davemloft.net>
|
| Signed-off-by: Yi Yang <yi.y.yang@intel.com>
| Acked-by: Eric Garver <e@erig.me>
| Signed-off-by: Joe Stringer <joe@ovn.org>
* datapath: compat: Fix build on RHEL 7.3Yi-Hung Wei2016-12-141-0/+7
| RHEL 7.3 provides upstream tunnel but it does not support
| name_assign_type attribute in net-device. This patch fixes the build
| problem by backporting functions with name_assign_type, and using
| proper flags in acinclude.m4 to invoke backport functions.
|
| Tested on RHEL 7.3 with kernel 3.10.0-514.el7.x86_64
|
| Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
| Signed-off-by: Joe Stringer <joe@ovn.org>
* configure: Use -Wformat-security with -Wformat.Ben Pfaff2016-12-121-1/+1
| GCC 6.1 warns that -Wformat-security has no effect without -Wformat, so
| this commit fixes the problem.
|
| The change to _OVS_CHECK_CC_OPTION is needed so that the cache variable
| name doesn't end up with a space in it, which obviously doesn't work.
|
| Signed-off-by: Ben Pfaff <blp@ovn.org>
| Acked-by: Andy Zhou <azhou@ovn.org>
* acinclude: Fix -Wstrict-prototypes and -Wold-style-definition detection.Ben Pfaff2016-12-121-2/+10
| AC_LANG_PROGRAM(,) uses a program like this:
|
|     int
|     main()
|     {
|         return 0;
|     }
|
| but that triggers warnings for -Wstrict-prototypes and for
| -Wold-style-definition, since this definition of main() lacks a
| prototype and is therefore old-style. This meant that
| -Wstrict-prototypes and -Wold-style-definition weren't being turned on
| for new-enough GCC.
|
| This commit fixes the problem by changing the program that is
| test-compiled to:
|
|     int x;
|
| which doesn't make any compilers mad, as far as I know.
|
| I recently upgraded to GCC 6.1 and just now noticed the issue, so I
| think that GCC somewhere between version 4.9 and version 6.1 must have
| started warning about main() when it's declared this way.
|
| Also, fix a few functions that lacked prototypes.
|
| Signed-off-by: Ben Pfaff <blp@ovn.org>
| Acked-by: Andy Zhou <azhou@ovn.org>
* datapath: Allow compile against current net-next.Jarno Rajahalme2016-12-091-2/+2
| This patch allows openvswitch kernel module in the OVS tree to be
| compiled against the current net-next Linux kernel. The changes are
| due to these upstream commits:
|
|     56989f6d856 ("genetlink: mark families as __ro_after_init")
|     489111e5c25 ("genetlink: statically initialize families")
|     a07ea4d9941 ("genetlink: no longer support using static family IDs")
|
| struct genl_family initialization is changed to be completely static
| and to include the new (in Linux 4.6) __ro_after_init attribute. Compat
| code defines it as an empty macro if not defined already.
|
| GENL_ID_GENERATE is no longer defined, but since it was defined as 0,
| it is safe to drop it from all initializers also on older Linux
| versions. A compiletime_assert is added to make sure this is true
| whenever GENL_ID_GENERATE is defined.
|
| Tested with current Linux net-next (4.9) and 3.16.
|
| It should be noted that there are still a number of fixes and new
| features in upstream net-next that are yet to be backported.
|
| Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
| Acked-by: Andy Zhou <azhou@ovn.org>
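The two compat techniques named in this message (an empty fallback macro and a compile-time assertion) can be sketched in portable C as below. This is an illustration only: `genl_family_sketch` and `dp_family_sketch` are hypothetical names, the `#define GENL_ID_GENERATE 0` line merely simulates an older kernel header, and C11 `_Static_assert` stands in for the kernel's `compiletime_assert`.

```c
#include <assert.h>

/* Fallback: on kernels without the attribute, make it expand to nothing. */
#ifndef __ro_after_init
#define __ro_after_init
#endif

/* Simulates an older kernel header that still defines the constant. */
#define GENL_ID_GENERATE 0

/* If the constant exists, verify at compile time that omitting it from
 * initializers is equivalent to writing it out. */
#ifdef GENL_ID_GENERATE
_Static_assert(GENL_ID_GENERATE == 0,
               "dropping GENL_ID_GENERATE from initializers requires it to be 0");
#endif

/* Hypothetical, much reduced stand-in for struct genl_family. */
struct genl_family_sketch {
    int id;
    const char *name;
};

/* Fully static initialization; 'id' is left out and so defaults to 0. */
static struct genl_family_sketch dp_family_sketch __ro_after_init = {
    .name = "ovs_datapath",
};
```

Because the omitted `id` member is zero-initialized, the same initializer is valid both where the constant has been removed and where it is still defined as 0.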
* trivial: Resolve whitespace issues with acincludeStephen Finucane2016-10-291-7/+7
| Completely unrelated, but annoying. Let's fix it up.
|
| Signed-off-by: Stephen Finucane <stephen@that.guru>
| Signed-off-by: Russell Bryant <russell@ovn.org>
* configure: Support compiling with Linux 4.8.Jarno Rajahalme2016-10-201-2/+2
| Datapath should now compile and work with Linux 4.8.
|
| Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
| Acked-by: Pravin B Shelar <pshelar@ovn.org>
* datapath: Support a fixed size of 128 distinct labels.Jarno Rajahalme2016-10-201-0/+2
| Port upstream change in conntrack labels extension. Add a new
| configure macro HAVE_NF_CONN_LABELS_WITH_WORDS to detect the old
| definition. Unfortunately there is no conntrack API to hide the
| difference, so this makes conntrack.c deviate from upstream source
| a bit.
|
| Upstream commit:
|
|     commit 23014011ba4209a086931ff402eac1c41abbe456
|     Author: Florian Westphal <fw@strlen.de>
|     Date:   Thu Jul 21 12:51:16 2016 +0200
|
|     netfilter: conntrack: support a fixed size of 128 distinct labels
|
|     The conntrack label extension is currently variable-sized, e.g. if
|     only 2 labels are used by iptables rules then the labels->bits[] array
|     will only contain one element.
|
|     We track size of each label storage area in the 'words' member.
|
|     But in nftables and openvswitch we always have to ask for worst-case
|     since we don't know what bit will be used at configuration time.
|
|     As most arches are 64bit we need to allocate 24 bytes in this case:
|
|     struct nf_conn_labels {
|         u8            words;    /*     0     1 */
|         /* XXX 7 bytes hole, try to pack */
|         long unsigned bits[2];  /*     8     24 */
|
|     Make bits a fixed size and drop the words member, it simplifies
|     the code and only increases memory requirements on x86 when
|     less than 64bit labels are required.
|
|     We still only allocate the extension if its needed.
|
|     Signed-off-by: Florian Westphal <fw@strlen.de>
|     Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
| Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
| Acked-by: Pravin B Shelar <pshelar@ovn.org>
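The size difference described in the quoted upstream message can be checked with a small sketch. Both struct names are hypothetical and the member layout is a simplified reading of the message, not a copy of the kernel headers.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define LABEL_BITS  128
#define LABEL_WORDS (LABEL_BITS / (sizeof(unsigned long) * CHAR_BIT))

/* Old layout (the case HAVE_NF_CONN_LABELS_WITH_WORDS would detect):
 * a size byte followed by padding, then worst-case storage. */
struct labels_with_words_sketch {
    uint8_t words;
    unsigned long bits[LABEL_WORDS];
};

/* New layout: fixed storage for 128 labels, no 'words' member. */
struct labels_fixed_sketch {
    unsigned long bits[LABEL_WORDS];
};
```

On a typical 64-bit target the old layout occupies 24 bytes, matching the figure in the message, while the fixed layout needs 16 bytes for the same 128 label bits.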
* datapath: Add support for kernel 4.7Pravin B Shelar2016-08-221-2/+2
| Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
| Acked-by: Jesse Gross <jesse@kernel.org>
* netdev-dpdk: Fix occurance of error logCiara Loftus2016-08-181-1/+2
| If NUMA information can't be derived from a vHost User device, only
| print an error if the VHOST_NUMA option is enabled in DPDK. Otherwise
| 'fail' silently.
|
| Fixes: 0a0f39df1d5a ("netdev-dpdk: Add support for DPDK 16.07")
| Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
| Reported-by: Ian Stokes <ian.stokes@intel.com>
| Tested-by: Ian Stokes <ian.stokes@intel.com>
| Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
* datapath: compat: backport LCO optimization.Pravin B Shelar2016-08-181-0/+1
| This basically backports commit:
|
|     commit 179bc67f69b6cb53ad68cfdec5a917c2a2248355
|     Author: Edward Cree <ecree@solarflare.com>
|     Date:   Thu Feb 11 20:48:04 2016 +0000
|
|     net: local checksum offload for encapsulation
|
|     The arithmetic properties of the ones-complement checksum mean that a
|     correctly checksummed inner packet, including its checksum, has a ones
|     complement sum depending only on whatever value was used to initialise
|     the checksum field before checksumming (in the case of TCP and UDP,
|     this is the ones complement sum of the pseudo header, complemented).
|     Consequently, if we are going to offload the inner checksum with
|     CHECKSUM_PARTIAL, we can compute the outer checksum based only on the
|     packed data not covered by the inner checksum, and the initial value of
|     the inner checksum field.
|
|     Signed-off-by: Edward Cree <ecree@solarflare.com>
|     Signed-off-by: David S. Miller <davem@davemloft.net>
|
| Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
| Acked-by: Jesse Gross <jesse@kernel.org>
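The arithmetic property this message relies on can be demonstrated with a self-contained sketch. Everything here is an assumption made for illustration: the packet bytes, the offsets, and the names `csum16` and `lco_matches_full_sum` are invented, and the sketch compares raw ones-complement sums rather than building a real UDP or TCP checksum.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 16-bit ones-complement sum of big-endian words, starting from 'seed'. */
static uint16_t csum16(const uint8_t *p, size_t len, uint32_t seed)
{
    uint32_t sum = seed;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)p[i] << 8) | p[i + 1];
    if (len & 1)
        sum += (uint32_t)p[len - 1] << 8;
    while (sum >> 16)                       /* end-around carry */
        sum = (sum & 0xFFFF) + (sum >> 16);
    return (uint16_t)sum;
}

enum { OUTER_LEN = 8, INNER_LEN = 12, CSUM_OFF = 2 };

/* Returns 1 if summing only the outer bytes, seeded with the complement of
 * the inner checksum field's initial value, equals the sum over the whole
 * packet after the inner checksum has been filled in. */
static int lco_matches_full_sum(uint16_t pseudo)
{
    uint8_t pkt[OUTER_LEN + INNER_LEN] = {
        /* made-up outer bytes (not covered by the inner checksum) */
        0x45, 0x00, 0x00, 0x14, 0xde, 0xad, 0x40, 0x11,
        /* made-up inner bytes; offsets 2..3 hold the checksum field */
        0x04, 0xd2, 0x00, 0x00, 0x00, 0x0c,
        0x68, 0x65, 0x6c, 0x6c, 0x6f, 0x21
    };
    uint8_t *inner = pkt + OUTER_LEN;
    uint16_t filled, full, lco;

    /* CHECKSUM_PARTIAL style: the field starts out holding 'pseudo'. */
    inner[CSUM_OFF] = (uint8_t)(pseudo >> 8);
    inner[CSUM_OFF + 1] = (uint8_t)(pseudo & 0xFF);

    /* What the offload engine does later: sum the inner data, then store
     * the complement in the field. */
    filled = (uint16_t)~csum16(inner, INNER_LEN, 0);
    inner[CSUM_OFF] = (uint8_t)(filled >> 8);
    inner[CSUM_OFF + 1] = (uint8_t)(filled & 0xFF);

    full = csum16(pkt, sizeof pkt, 0);                  /* every byte */
    lco = csum16(pkt, OUTER_LEN, (uint16_t)~pseudo);    /* outer bytes only */
    return full == lco;
}
```

The point of the shortcut is that the second sum never touches the inner payload, so the outer checksum can be computed in software before the device has filled in the inner one.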