path: root/NEWS
Commit message | Author | Age | Files | Lines
* netdev-afxdp: Add need_wakeup support. | William Tu | 2019-10-29 | 1 | -0/+3

  The patch adds support for using the need_wakeup flag in AF_XDP rings.
  A new option, use-need-wakeup, is added. When this option is used, OVS
  has to explicitly wake up the kernel RX using the poll() syscall and
  wake up TX using the sendto() syscall. This feature improves
  performance by avoiding unnecessary sendto() syscalls for TX. For RX,
  instead of the kernel always busy-spinning on the fill queue, OVS wakes
  up the kernel RX processing when the fill queue is replenished.

  The need_wakeup feature was merged into the Linux kernel bpf-next tree
  with commit 77cd0d7b3f25 ("xsk: add support for need_wakeup flag in
  AF_XDP rings"), and OVS enables it by default if libbpf supports it.
  If users enable it but run with an older version of libbpf, the
  need_wakeup feature has no effect and a warning message is logged.

  For virtual interfaces, it is better to set use-need-wakeup=false,
  since the virtual device's AF_XDP xmit is synchronous: the sendto()
  syscall enters the kernel and processes the TX packet on the tx queue
  directly.

  On an Intel Xeon E5-2620 v3 2.4GHz system, performance of physical port
  to physical port improves from 6.1Mpps to 7.3Mpps.

  Suggested-by: Ilya Maximets <i.maximets@ovn.org>
  Signed-off-by: William Tu <u9012063@gmail.com>
  Acked-by: Eelco Chaudron <echaudro@redhat.com>
  Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
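The TX side of the behaviour described in this message can be pictured with a small decision helper. This is only an illustrative model, not the OVS or libbpf code: the function name and both parameters are hypothetical, and the flag that the kernel sets on the real AF_XDP TX ring is represented here by a plain boolean.

```c
#include <stdbool.h>

/* Hypothetical model: decides whether a sendto() "kick" is needed after
 * TX descriptors have been queued.
 *
 * 'use_need_wakeup'      models the use-need-wakeup option.
 * 'ring_requests_wakeup' stands in for the need_wakeup flag on the TX ring. */
bool
tx_kick_required(bool use_need_wakeup, bool ring_requests_wakeup)
{
    if (!use_need_wakeup) {
        /* Without need_wakeup, every TX batch is followed by a kick. */
        return true;
    }
    /* With need_wakeup, the syscall is issued only when asked for. */
    return ring_requests_wakeup;
}
```

With the option enabled, the syscall is skipped whenever the ring does not request a wakeup, which is where the saving described in the message would come from.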
* ofproto-dpif-xlate: Translate timeout policy in ct action | Yi-Hung Wei | 2019-09-26 | 1 | -0/+1

  This patch derives the timeout policy based on ct zone from the
  internal data structure that we maintain on dpif layer.

  It also adds a system traffic test to verify the zone-based conntrack
  timeout feature. The test uses ovs-vsctl commands to configure the
  customized ICMP and UDP timeout on zone 5 to a shorter period. It then
  injects ICMP and UDP traffic to conntrack, and checks if the
  corresponding conntrack entry expires after the predefined timeout.

  Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>

  ofproto-dpif: Checks if datapath supports OVS_CT_ATTR_TIMEOUT

  This patch checks whether datapath supports OVS_CT_ATTR_TIMEOUT. With
  this check, ofproto-dpif-xlate can use this information to decide
  whether to translate the ct timeout policy.

  Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
  Signed-off-by: Justin Pettit <jpettit@ovn.org>
* conntrack: Add option to disable TCP sequence checking. | Darrell Ball | 2019-09-25 | 1 | -0/+3

  This may be needed in some special cases, such as to support some
  hardware offload implementations. Note that disabling TCP sequence
  number verification is not an optimization in itself, but supporting
  some hardware offload implementations may offer better performance.
  TCP sequence number verification is enabled by default. This option is
  only available for the userspace datapath. Access to this option is
  presently provided via 'dpctl' commands, as the need for this option is
  quite node specific: it depends on which NICs are in use on a given
  node. A test is added to verify this option.

  Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2019-May/359188.html
  Signed-off-by: Darrell Ball <dlu998@gmail.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
* ovsdb-tool: Convert clustered db to standalone db. | Aliasgar Ginwala | 2019-09-23 | 1 | -0/+3

  Add support in ovsdb-tool for migrating clustered dbs to standalone
  dbs. E.g. usage to migrate nb/sb db to standalone db from raft:

      ovsdb-tool cluster-to-standalone ovnnb_db.db ovnnb_db_cluster.db

  Acked-by: Han Zhou <hzhou8@ebay.com>
  Signed-off-by: Aliasgar Ginwala <aginwala@ebay.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
* Remove OVN. | Mark Michelson | 2019-09-06 | 1 | -1/+4

  OVN is separated into its own repo. This commit removes the OVN source,
  OVN tests, and OVN documentation. It also removes mentions of OVN from
  most documentation. The only place where OVN has been left is in
  changelogs/NEWS, since we shouldn't mess with the history of the
  project.

  There is an exception here. The ovsdb-cluster tests rely on ovn-nbctl
  and ovn-sbctl to run. Therefore those ovn utilities, as well as their
  dependencies, remain in the repo with this commit.

  Acked-by: Numan Siddique <nusiddiq@redhat.com>
  Signed-off-by: Mark Michelson <mmichels@redhat.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
* Set release date for 2.12.0. | Justin Pettit | 2019-09-04 | 1 | -2/+1

  Signed-off-by: Justin Pettit <jpettit@ovn.org>
  Acked-by: Flavio Leitner <fbl@sysclose.org>
* Prepare for post-2.12.0 (2.12.90). | Ben Pfaff | 2019-07-22 | 1 | -0/+4

  Acked-by: Ilya Maximets <i.maximets@samsung.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
* Prepare for 2.12.0. | Ben Pfaff | 2019-07-22 | 1 | -1/+1

  Acked-by: Ilya Maximets <i.maximets@samsung.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
* netdev-afxdp: add new netdev type for AF_XDP. | William Tu | 2019-07-19 | 1 | -0/+1

  The patch introduces experimental AF_XDP support for OVS netdev.
  AF_XDP, the Address Family of the eXpress Data Path, is a new Linux
  socket type built upon the eBPF and XDP technology. It aims to have
  comparable performance to DPDK but cooperate better with the existing
  kernel networking stack. An AF_XDP socket receives and sends packets
  from an eBPF/XDP program attached to the netdev, bypassing a couple of
  the Linux kernel's subsystems. As a result, an AF_XDP socket shows much
  better performance than AF_PACKET. For more details about AF_XDP,
  please see the Linux kernel's Documentation/networking/af_xdp.rst.
  Note that by default, this feature is not compiled in.

  Signed-off-by: William Tu <u9012063@gmail.com>
  Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
* dpif-netdev: Add specialized generic scalar functions | Harry van Haaren | 2019-07-19 | 1 | -0/+4

  This commit adds a number of specialized functions that handle common
  miniflow fingerprints. This enables compiler optimization, resulting
  in higher performance. Below is a quick description of how this
  optimization actually works.

  "Specialized functions" are "instances" of the generic implementation,
  but the compiler is given extra context when compiling. In the case of
  iterating miniflow datastructures, the most interesting value to
  enable compile time optimizations is the loop trip count per unit.

  In order to create a specialized function, there is a generic
  implementation, which uses a for() loop without the compiler knowing
  the loop trip count at compile time. The loop trip count is passed in
  as an argument to the function:

      uint32_t miniflow_impl_generic(struct miniflow *mf, uint32_t loop_count)
      {
          for(uint32_t i = 0; i < loop_count; i++)
              // do work
      }

  In order to "specialize" the function, we call the generic
  implementation with hard-coded numbers - these are compile time
  constants!

      uint32_t miniflow_impl_loop5(struct miniflow *mf, uint32_t loop_count)
      {
          // use hard coded constant for compile-time constant-propagation
          return miniflow_impl_generic(mf, 5);
      }

  Given the compiler is aware of the loop trip count at compile time, it
  can perform an optimization known as "constant propagation". Combined
  with inlining of the miniflow_impl_generic() function, the compiler is
  now enabled to *compile time* unroll the loop 5x, and produce "flat"
  code.

  The last step to using the specialized functions is to utilize a
  function-pointer to choose the specialized (or generic)
  implementation. The selection of the function pointer is performed at
  subtable creation time, when the miniflow fingerprint of the subtable
  is known. This technique is known as "multiple dispatch" in some
  literature, as it uses multiple items of information (miniflow bit
  counts) to select the dispatch function.

  By pointing the function pointer at the optimized implementation, OvS
  benefits from the compile time optimizations at runtime.

  Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
  Tested-by: Malvika Gupta <malvika.gupta@arm.com>
  Acked-by: Ilya Maximets <i.maximets@samsung.com>
  Signed-off-by: Ian Stokes <ian.stokes@intel.com>
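The specialization pattern described in this message can be tried outside OVS with a compilable toy. The sketch below is not the dpif-netdev code: it sums an array instead of walking a miniflow, and every name in it (sum_generic, sum_count5, pick_sum_func and so on) is hypothetical. It only shows a generic loop, wrappers that pass a compile-time constant, and a function pointer chosen once when the count is known.

```c
#include <stdint.h>

/* Generic implementation: 'count' is only known at run time here. */
static inline uint32_t
sum_generic(const uint32_t *vals, uint32_t count)
{
    uint32_t sum = 0;

    for (uint32_t i = 0; i < count; i++) {
        sum += vals[i];
    }
    return sum;
}

/* Specialized instances: the constant argument lets the compiler
 * propagate the trip count into the inlined loop and unroll it. */
uint32_t
sum_count1(const uint32_t *vals, uint32_t count)
{
    (void) count;               /* Replaced by the constant below. */
    return sum_generic(vals, 1);
}

uint32_t
sum_count5(const uint32_t *vals, uint32_t count)
{
    (void) count;
    return sum_generic(vals, 5);
}

/* Fallback for counts that have no specialized instance. */
uint32_t
sum_any(const uint32_t *vals, uint32_t count)
{
    return sum_generic(vals, count);
}

typedef uint32_t (*sum_func)(const uint32_t *vals, uint32_t count);

/* Chosen once, when the count becomes known (the analogue of subtable
 * creation time), and then called through the pointer afterwards. */
sum_func
pick_sum_func(uint32_t count)
{
    switch (count) {
    case 1:
        return sum_count1;
    case 5:
        return sum_count5;
    default:
        return sum_any;
    }
}
```

A caller would store the result of pick_sum_func() and invoke it for every lookup, so the selection cost is paid once rather than per packet.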
* doc: Remove experimental tag for SMC cache. | Yipeng Wang | 2019-07-18 | 1 | -0/+1

  SMC cache was introduced in 2.10 with an experimental tag. SMC cache is
  a layer of software cache located after the EMC cache. Its purpose is
  to improve the performance of use cases where many flows miss the EMC
  cache. One can enable SMC cache using the smc-enable=true option.

  Signed-off-by: Yipeng Wang <yipeng1.wang@intel.com>
  Acked-by: Ilya Maximets <i.maximets@samsung.com>
  Signed-off-by: Ian Stokes <ian.stokes@intel.com>
* NEWS: Mention IGMP support for OVN | Dumitru Ceara | 2019-07-17 | 1 | -0/+1

  NEWS update was missed while adding the IGMP code.

  Fixes: 605535f9adf2 ("OVN: Add ovn-northd IGMP support")
  Signed-off-by: Dumitru Ceara <dceara@redhat.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
* netdev-dpdk: Enable tx-retries-max config. | Kevin Traynor | 2019-07-08 | 1 | -0/+2

  vhost tx retries can provide some mitigation against dropped packets
  due to a temporarily slow guest/limited queue size for an interface,
  but on the other hand when a system is fully loaded those extra cycles
  retrying could mean packets are dropped elsewhere.

  Up to now max vhost tx retries have been hardcoded, which meant no
  tuning and no way to disable for debugging to see if extra cycles
  spent retrying resulted in rx drops on some other interface.

  Add an option to change the max retries, with a value of 0 effectively
  disabling vhost tx retries.

  Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
  Acked-by: Eelco Chaudron <echaudro@redhat.com>
  Acked-by: Flavio Leitner <fbl@sysclose.org>
  Acked-by: Ilya Maximets <i.maximets@samsung.com>
  Signed-off-by: Ian Stokes <ian.stokes@intel.com>
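The trade-off behind tx-retries-max can be seen in a small model of a bounded retry loop. This is a sketch under stated assumptions, not the netdev-dpdk code: the guest queue is reduced to a hypothetical "accepts at most N packets per attempt" number, the function name is invented, and stopping as soon as an attempt sends nothing is a modelling choice made here.

```c
#include <stdint.h>

/* Tries to send 'cnt' packets to a queue that accepts at most
 * 'accept_per_try' packets on each attempt (a hypothetical stand-in for
 * a slow guest).  After the first attempt, at most 'max_retries' further
 * attempts are made.  Returns the number of packets left unsent, which
 * would be counted as drops. */
uint32_t
tx_with_retries(uint32_t accept_per_try, uint32_t cnt, uint32_t max_retries)
{
    uint32_t retries = 0;

    do {
        uint32_t sent = cnt < accept_per_try ? cnt : accept_per_try;

        if (!sent) {
            break;              /* Nothing accepted: stop retrying. */
        }
        cnt -= sent;
    } while (cnt && retries++ < max_retries);

    return cnt;
}
```

With max_retries set to 0 the loop body runs exactly once, which matches the "value of 0 effectively disabling vhost tx retries" wording: a burst of 32 packets against a queue taking 8 per attempt leaves 24 unsent, while 3 retries would deliver all of them at the cost of extra cycles.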
* tunnel: Add layer 2 IPv6 GRE encapsulation support. | William Tu | 2019-07-03 | 1 | -0/+1

  The patch adds ip6gre support. Tunnel type 'ip6gre' with
  packet_type=legacy_l2 is a layer 2 GRE tunnel over IPv6, carrying
  inner ethernet packets encapsulated with a GRE header and an outer
  IPv6 header. Encapsulation of layer 3 packets over IPv6 GRE, ip6gre,
  is not supported yet. I tested it by running:

      # make check-kernel TESTSUITEFLAGS='-k ip6gre'

  under kernel 5.2 and for userspace:

      # make check TESTSUITEFLAGS='-k ip6gre'

  Tested-by: Greg Rose <gvrose8192@gmail.com>
  Tested-at: https://travis-ci.org/gvrose8192/ovs-experimental/builds/552977116
  Reviewed-by: Greg Rose <gvrose8192@gmail.com>
  Reviewed-by: Eli Britstein <elibr@mellanox.com>
  Signed-off-by: William Tu <u9012063@gmail.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
* vswitchd: Always cleanup userspace datapath. | Ilya Maximets | 2019-07-02 | 1 | -0/+3

  'netdev' datapath is implemented within the ovs-vswitchd process and
  cannot exist without it, so it should be gracefully terminated with a
  full cleanup of resources upon ovs-vswitchd exit.

  This change forces dpif cleanup for the 'netdev' datapath regardless
  of passing '--cleanup' to 'ovs-appctl exit'. This solution means the
  additional option does not have to be passed every time for userspace
  datapath installations, and it also avoids terminating the system
  datapath in setups where both datapaths run at the same time.

  The main part is that dpif_port_del() will lead to netdev_close() and
  subsequent netdev_class->destroy(dev), which will stop HW NICs and
  free their resources. For vhost-user interfaces it will invoke vhost
  driver unregistering with a properly closed vhost-user connection. For
  the upcoming AF_XDP netdev this will allow gracefully destroying xdp
  sockets and unloading xdp programs from linux interfaces. Another
  important thing is that port deletion will also trigger flushing of
  flows offloaded to HW NICs.

  An exception is made for 'internal' ports, which could have user
  ip/route configuration. These ports will not be removed without
  '--cleanup'.

  This change fixes OVS disappearing from the DPDK point of view
  (keeping HW NICs improperly configured, sudden closing of vhost-user
  connections) and will help with clearing linux devices with the
  upcoming AF_XDP netdev support.

  Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
  Tested-by: William Tu <u9012063@gmail.com>
  Acked-by: Flavio Leitner <fbl@sysclose.org>
  Acked-by: Ben Pfaff <blp@ovn.org>
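The exit-time rules in this message reduce to a small truth table, sketched below as a predicate. The function and parameter names are hypothetical and this is not the ovs-vswitchd code. It also assumes that '--cleanup' removes every port on both datapath types, which the message implies but does not spell out.

```c
#include <stdbool.h>

/* Returns true if a port would be deleted when ovs-vswitchd exits.
 *
 * 'netdev_datapath' is true for the userspace ('netdev') datapath and
 * false for the system datapath; 'internal_port' marks 'internal'
 * ports; 'cleanup_flag' models 'ovs-appctl exit --cleanup'. */
bool
port_removed_on_exit(bool netdev_datapath, bool internal_port,
                     bool cleanup_flag)
{
    if (cleanup_flag) {
        return true;            /* Assumed: explicit cleanup removes all. */
    }
    /* Without --cleanup only userspace datapath ports are removed, and
     * 'internal' ports are spared because they may carry user ip/route
     * configuration. */
    return netdev_datapath && !internal_port;
}
```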
* NEWS: Update regarding dumping HW offloaded flows. | Ilya Maximets | 2019-07-02 | 1 | -0/+2

  NEWS update was missed while updating docs for dynamic Flow API.
  Since this is a user visible change, it should be mentioned here.

  Fixes: d74ca2269e36 ("dpctl: Update docs about dump-flows and HW offloading.")
  Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
  Acked-by: Roi Dayan <roid@mellanox.com>
  Acked-by: Eli Britstein <elibr@mellanox.com>
* netdev-dpdk: Reset queue number for vhost devices on vm shutdown. | David Marchand | 2019-06-27 | 1 | -1/+3

  Rather than poll all disabled queues and waste some memory for vms
  that have been shutdown, we can reconfigure when receiving a destroy
  connection notification from the vhost library.

      $ while true; do
        ovs-appctl dpif-netdev/pmd-rxq-show |awk '
        /port: / {
          tot++;
          if ($5 == "(enabled)") {
            en++;
          }
        }
        END {
          print "total: " tot ", enabled: " en
        }'
        sleep 1
      done

      total: 66, enabled: 66
      total: 6, enabled: 2

  This change requires a fix in the DPDK vhost library, so bump the
  minimal required version to 18.11.2.

  Co-authored-by: Ilya Maximets <i.maximets@samsung.com>
  Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
  Signed-off-by: David Marchand <david.marchand@redhat.com>
  Signed-off-by: Ian Stokes <ian.stokes@intel.com>
* dpdk: Use DPDK 18.11.2 release. | Ian Stokes | 2019-06-27 | 1 | -1/+1

  Modify travis linux build script to use the latest DPDK stable release
  18.11.2. Update docs for latest DPDK stable releases.

  Signed-off-by: Ian Stokes <ian.stokes@intel.com>
  Reviewed-by: David Marchand <david.marchand@redhat.com>
  Acked-by: Kevin Traynor <ktraynor@redhat.com>
* OpenFlow: Enable OpenFlow 1.5 by default. | Ben Pfaff | 2019-06-20 | 1 | -0/+3

  Open vSwitch now supports all OpenFlow 1.5 required features, so
  enable it by default.

  Acked-by: Numan Siddique <nusiddiq@redhat.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
* ofp-actions: Support OF1.5 meter action. | Ben Pfaff | 2019-06-20 | 1 | -0/+1

  OpenFlow 1.5 changed "meter" from an instruction to an action. This
  commit supports it properly.

  Acked-by: Numan Siddique <nusiddiq@redhat.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
* datapath: Support kernel version 5.0.x | Yifeng Sun | 2019-06-13 | 1 | -0/+1

  This patch updated acinclude.m4 so that OVS can be compiled on 5.0.x
  kernels. This patch also updated travis files so that 5.0.x kernel
  versions are used during travis test builds. Besides, NEWS and
  releases.rst are also updated to reflect this new support.

  Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
  Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
* ofproto-dpif-xlate: Add "always" mode to priority tags | Eli Britstein | 2019-05-24 | 1 | -0/+2

  Configure "if-nonzero" priority tags to retain the 802.1Q header when
  the VLAN ID is zero, except when both the VLAN ID and priority are
  zero. Add an "always" configuration option to retain the 802.1Q header
  in such frames as well.

  Signed-off-by: Eli Britstein <elibr@mellanox.com>
  Reviewed-by: Roi Dayan <roid@mellanox.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
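The difference between the two modes named in this message can be written as a predicate over the outgoing VLAN ID and priority. The sketch below is a model, not the ofproto-dpif-xlate code: the enum, its constants and the function name are hypothetical, and the "never" mode is an assumption added for completeness since the message itself only describes "if-nonzero" and "always".

```c
#include <stdbool.h>
#include <stdint.h>

enum priority_tags_mode {
    PRIORITY_TAGS_NEVER,        /* Assumed mode; not described above. */
    PRIORITY_TAGS_IF_NONZERO,
    PRIORITY_TAGS_ALWAYS
};

/* Returns true if the 802.1Q header is retained on an outgoing frame
 * with VLAN ID 'vid' and priority 'pcp'. */
bool
retain_vlan_header(enum priority_tags_mode mode, uint16_t vid, uint8_t pcp)
{
    if (vid != 0) {
        return true;            /* A nonzero VLAN ID needs the header. */
    }

    switch (mode) {
    case PRIORITY_TAGS_ALWAYS:
        return true;            /* Kept even when vid and pcp are zero. */
    case PRIORITY_TAGS_IF_NONZERO:
        return pcp != 0;        /* Kept only to carry a priority. */
    case PRIORITY_TAGS_NEVER:
    default:
        return false;
    }
}
```

The only input on which "if-nonzero" and "always" disagree is a frame whose VLAN ID and priority are both zero.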
* netdev-dpdk: Post-copy Live Migration support for vhost-user-client. | Liliia Butorina | 2019-05-24 | 1 | -0/+1

  Post-copy Live Migration for vHost is supported since DPDK 18.11 and
  QEMU 2.12. A new global config option 'vhost-postcopy-support' is
  added to control this feature. Ex.:

      ovs-vsctl set Open_vSwitch . other_config:vhost-postcopy-support=true

  Changing this value requires restarting the daemon. It's safe to
  enable this knob even if QEMU doesn't support post-copy LM.

  The feature is marked as experimental and disabled by default because
  it may cause a PMD thread hang on the destination host on page fault,
  for the time of page downloading from the source.

  The feature is not compatible with 'mlockall' and 'dequeue zero-copy'.
  Support is added only for vhost-user-client.

  Signed-off-by: Liliia Butorina <l.butorina@partner.samsung.com>
  Co-authored-by: Ilya Maximets <i.maximets@samsung.com>
  Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
  Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
* datapath: Support kernel version 4.19.x and 4.20.x | Yifeng Sun | 2019-05-10 | 1 | -0/+2

  This patch updated acinclude.m4 so that OVS can be compiled on 4.19.x
  and 4.20.x kernels. This patch also updated travis files so that
  latest kernel versions are used during travis test builds.

  Tested-by: Greg Rose <gvrose8192@gmail.com>
  Reviewed-by: Greg Rose <gvrose8192@gmail.com>
  Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
  Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
* dpdk: Use DPDK 18.11.1 release. | Ian Stokes | 2019-05-09 | 1 | -0/+1

  Modify travis linux build script to use the latest DPDK stable release
  18.11.1. Update docs for latest DPDK stable releases.

  Signed-off-by: Ian Stokes <ian.stokes@intel.com>
  Acked-by: Kevin Traynor <ktraynor@redhat.com>
  Acked-by: Aaron Conole <aconole@redhat.com>
* OVN: Add support for Transport Zones | Lucas Alvares Gomes | 2019-04-22 | 1 | -0/+3

  This patch adds support for Transport Zones. Transport Zones (a.k.a.
  TZs) are a way to enable users of OVN to separate Chassis into
  different logical groups that will only form tunnels between members
  of the same groups. Each Chassis can belong to one or more Transport
  Zones. If not set, the Chassis will be considered part of a default
  group.

  Configuring Transport Zones is done by creating a key called
  "ovn-transport-zones" in the external_ids column of the Open_vSwitch
  table from the local OVS instance. The value is a string with the name
  of the Transport Zone that this instance is part of. Multiple TZs can
  be specified with a comma-separated list. For example:

      $ sudo ovs-vsctl set open . external-ids:ovn-transport-zones=tz1

  or

      $ sudo ovs-vsctl set open . external-ids:ovn-transport-zones=tz1,tz2,tz3

  This configuration is also exposed in the Chassis table of the OVN
  Southbound Database in a new column called "transport_zones".

  Uses for Transport Zones include, but are not limited to:

  * Edge computing: As a way to prevent edge sites from trying to create
    tunnels with every node on every other edge site while still
    allowing these sites to create tunnels with the central node.

  * Extra security layer: Where users want to create "trust zones" and
    prevent computes in a more secure zone from communicating with a
    less secure zone.

  This patch is also backward compatible, so the upgrade guide for OVN
  [0] is still valid and the ovn-controller service can be upgraded
  before the OVSDBs.

  [0] http://docs.openvswitch.org/en/latest/intro/install/ovn-upgrades/

  Reported-by: Daniel Alvarez Sanchez <dalvarez@redhat.com>
  Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2019-February/048255.html
  Signed-off-by: Lucas Alvares Gomes <lucasagomes@gmail.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
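The grouping rule in this message (tunnels only between chassis that share a zone, with unset meaning a default group) can be sketched as a comparison of two comma-separated ovn-transport-zones values. This is an illustration, not the ovn-controller code: both function names are hypothetical, whitespace around names is not handled, and treating "one side unset, the other set" as "no tunnel" is an assumption that follows from reading the default group as just another group.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if the 'len'-byte name at 'name' is one whole item of the
 * comma-separated 'list'. */
static bool
zone_list_contains(const char *list, const char *name, size_t len)
{
    while (*list) {
        size_t n = strcspn(list, ",");

        if (n == len && !strncmp(list, name, len)) {
            return true;
        }
        list += n;
        if (*list == ',') {
            list++;
        }
    }
    return false;
}

/* True if two chassis with these ovn-transport-zones values would form
 * a tunnel.  NULL or "" means the key is not set (default group). */
bool
share_transport_zone(const char *zones_a, const char *zones_b)
{
    bool a_unset = !zones_a || !*zones_a;
    bool b_unset = !zones_b || !*zones_b;

    if (a_unset || b_unset) {
        return a_unset && b_unset;      /* Both in the default group. */
    }

    while (*zones_a) {
        size_t n = strcspn(zones_a, ",");

        if (n > 0 && zone_list_contains(zones_b, zones_a, n)) {
            return true;
        }
        zones_a += n;
        if (*zones_a == ',') {
            zones_a++;
        }
    }
    return false;
}
```

Matching is done on whole names, so "tz1" and "tz10" are treated as different zones.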
* Add a new OVS action check_pkt_larger | Numan Siddique | 2019-04-22 | 1 | -0/+2

  This patch adds a new action 'check_pkt_larger' which checks if the
  packet is larger than the given size and stores the result in the
  destination register.

  Usage: check_pkt_larger(len)->REGISTER
  Eg. match=...,actions=check_pkt_larger(1442)->NXM_NX_REG0[0],next;

  This patch makes use of the new datapath action - 'check_pkt_len'
  which was recently added in the commit [1]. At the start of
  ovs-vswitchd, the datapath is probed for this action. If the datapath
  action is present, then 'check_pkt_larger' makes use of this datapath
  action.

  Datapath action 'check_pkt_len' takes these nlattrs:

  * OVS_CHECK_PKT_LEN_ATTR_PKT_LEN - 'pkt_len' to check for
  * OVS_CHECK_PKT_LEN_ATTR_ACTIONS_IF_GREATER (optional) - Nested actions
    to apply if the packet length is greater than the specified 'pkt_len'
  * OVS_CHECK_PKT_LEN_ATTR_ACTIONS_IF_LESS_EQUAL (optional) - Nested
    actions to apply if the packet length is less than or equal to the
    specified 'pkt_len'.

  Let's say we have these flows added to an OVS bridge br-int:

      table=0, priority=100 in_port=1,ip,actions=check_pkt_larger(100)->NXM_NX_REG0[0],resubmit(,1)
      table=1, priority=200,in_port=1,ip,reg0=0x1/0x1 actions=output:3
      table=1, priority=100,in_port=1,ip,actions=output:4

  Then the action 'check_pkt_larger' will be translated as

      check_pkt_len(size=100,gt(3),le(4))

  The datapath will check the packet length and, if the packet length is
  greater than 100, it will output to port 3, else it will output to
  port 4.

  In case the datapath doesn't support the 'check_pkt_len' action, the
  OVS action 'check_pkt_larger' sets SLOW_ACTION so that the datapath
  flow is not added.

  This OVS action is intended to be used by OVN to check the packet
  length and generate an ICMP packet with type 3, code 4 and next hop
  mtu in the logical router pipeline if the MTU of the physical
  interface is less than the packet length. More information can be
  found here [2].

  [1] - https://kernel.googlesource.com/pub/scm/linux/kernel/git/davem/net-next/+/4d5ec89fc8d14dcdab7214a0c13a1c7321dc6ea9
  [2] - https://mail.openvswitch.org/pipermail/ovs-discuss/2018-July/047039.html

  Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2018-July/047039.html
  Suggested-by: Ben Pfaff <blp@ovn.org>
  Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
  CC: Ben Pfaff <blp@ovn.org>
  CC: Gregory Rose <gvrose8192@gmail.com>
  Acked-by: Mark Michelson <mmichels@redhat.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
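The three example flows in this message can be walked through with a tiny model of the action's effect on a register. The sketch below is not OVS code: the function names are hypothetical, and registers are reduced to a plain 32-bit value with the destination given as a bit index, which must be below 32.

```c
#include <stdint.h>

/* Models check_pkt_larger(max_len)->REG[bit]: returns 'reg' with the
 * destination bit set if 'pkt_len' is greater than 'max_len', and
 * cleared otherwise.  'bit' must be less than 32. */
uint32_t
check_pkt_larger(uint32_t pkt_len, uint32_t max_len, uint32_t reg,
                 unsigned int bit)
{
    uint32_t mask = UINT32_C(1) << bit;

    return pkt_len > max_len ? (reg | mask) : (reg & ~mask);
}

/* Follows the example flows: table 0 stores the comparison with 100 in
 * reg0 bit 0, then table 1 outputs to port 3 when the bit is set
 * (priority 200 flow) and to port 4 otherwise (priority 100 flow). */
uint32_t
example_output_port(uint32_t pkt_len)
{
    uint32_t reg0 = check_pkt_larger(pkt_len, 100, 0, 0);

    return (reg0 & 0x1) ? 3 : 4;
}
```

The comparison is strict, so a packet of exactly 100 bytes takes the "less than or equal" branch and leaves on port 4, in line with the gt/le split of check_pkt_len.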
* OVN: Add NEWS for Policy-based routing | Mary Manohar | 2019-04-16 | 1 | -0/+3

  Signed-off-by: Mary Manohar <mary.manohar at nutanix.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
* stream-ssl: Add support for TLS SNI (Server Name Indication). | Ben Pfaff | 2019-04-16 | 1 | -0/+2

  This TLS extension, introduced in RFC 3546, allows the server to know
  what host the client believes it is contacting, the TLS equivalent of
  the Host: header in HTTP.

  Tested-by: Yifeng Sun <pkusunyifeng@gmail.com>
  Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com>
  Requested-by: Shivaram Mysore <smysore@servicefractal.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
* OVN: add the possibility to configure a static IPv4/IPv6 address and dynamic MAC | Lorenzo Bianconi | 2019-04-16 | 1 | -1/+5

  Add the possibility to configure a static IPv4 and/or IPv6 address and
  get the MAC address dynamically allocated. This can be done using the
  following commands:

      $ ovn-nbctl ls-add sw0
      $ ovn-nbctl set Logical-Switch sw0 other_config:subnet=192.168.0.0/24
      $ ovn-nbctl set Logical-switch sw0 other_config:ipv6_prefix=2001::0
      $ ovn-nbctl lsp-add sw0 lsp0 -- lsp-set-addresses lsp0 "dynamic 192.168.0.1 2001::1"

  Acked-by: Mark Michelson <mmichels@redhat.com>
  Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
* ovn: Support a new Logical_Switch_Port.type - 'external' | Numan Siddique | 2019-04-16 | 1 | -0/+1

  In the case of OpenStack + OVN, when the VMs are booted on hypervisors
  supporting SR-IOV NICs, there are no OVS ports for these VMs. When
  these VMs send DHCPv4, DHCPv6 or IPv6 Router Solicitation requests,
  the local ovn-controller cannot reply to these packets. The OpenStack
  Neutron dhcp agent service needs to be run to serve these requests.

  With the new logical port type - 'external', OVN itself can handle
  these requests, avoiding the need to deploy any external services like
  the neutron dhcp agent.

  To make use of this feature, the CMS has to
   - create a logical port for such VMs
   - set the type to 'external'
   - create an HA chassis group and associate the logical port to it, or
     associate an already existing HA chassis group.
   - create a localnet port for the logical switch
   - configure the ovn-bridge-mappings option in the OVS db.

  The HA chassis with the highest priority becomes the master of the HA
  chassis group, and the ovn-controller running in that 'chassis' claims
  the Port_Binding for that logical port and adds the necessary
  DHCPv4/v6 OF flows. Since the packet enters the logical switch
  pipeline via the localnet port, the inport register (reg14) is set to
  the tunnel key of the localnet port in the match conditions.

  In case the chassis goes down for some reason, the next higher
  priority HA chassis becomes the master and claims the port.

  When the VM with the external port sends an ARP request for the router
  ips, only the chassis which has claimed the port will reply to the ARP
  requests. The rest of the chassis, on receiving these packets, drop
  them in the ingress switch datapath stage - S_SWITCH_IN_EXTERNAL_PORT,
  which is just before S_SWITCH_IN_L2_LKUP. This guarantees that only
  the chassis which has claimed the external ports will run the router
  datapath pipeline.

  Acked-by: Mark Michelson <mmichels@redhat.com>
  Acked-by: Han Zhou <hzhou8@ebay.com>
  Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
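The failover rule in this message (the highest priority HA chassis is master, and the next one takes over when it goes down) amounts to picking the live member with the largest priority. The sketch below is a model, not the ovn-controller code: the struct, its fields and the function name are hypothetical, the 'alive' flag stands in for whatever liveness detection is actually used, and ties are resolved in favour of the earlier entry as an arbitrary choice.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical view of one member of an HA chassis group. */
struct ha_chassis {
    const char *name;
    int priority;
    bool alive;
};

/* Returns the index of the live chassis with the highest priority, or
 * -1 if no member of the group is alive. */
int
pick_master(const struct ha_chassis *group, size_t n)
{
    int best = -1;

    for (size_t i = 0; i < n; i++) {
        if (group[i].alive
            && (best < 0 || group[i].priority > group[best].priority)) {
            best = (int) i;
        }
    }
    return best;
}
```

Re-running the selection after a member's 'alive' flag changes gives the failover behaviour: the port would be claimed by whichever chassis the function returns.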
* ovn-northd: Delete the references to gateway_chasss in SB DB | Numan Siddique | 2019-04-16 | 1 | -0/+1

  Previous patch in the series added the support in ovn-controller to
  use ha_chassis_group table in SB DB to support HA chassis and
  establishing BFD tunnels instead of the gateway_chassis table.

  There is no need for ovn-northd to create any gateway_chassis rows in
  SB DB. This patch does that and deletes the code which is not required
  anymore.

  This patch also now supports 'ha_chassis_group' to be associated with
  a distributed logical router port and ignores 'gateway_chassis' and
  'redirect-chassis' if set along with 'ha_chassis_group'.

  Acked-by: Mark Michelson <mmichels@redhat.com>
  Acked-by: Han Zhou <hzhou8@ebay.com>
  Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
* netdev-linux: netem QoS support | Sharon K | 2019-03-14 | 1 | -0/+1

  Signed-off-by: Sharon Krendel <thekafkaf@gmail.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
* OVN: select a random mac_prefix if not provided | Lorenzo Bianconi | 2019-03-05 | 1 | -0/+2

  Select a random IPAM mac_prefix if it has not been provided by the
  user. With this patch the admin does not need to configure mac_prefix
  in order to avoid L2 address collisions if multiple OVN deployments
  share the same broadcast domain. Remove MAC_ADDR_PREFIX
  definitions/occurrences since now mac_prefix is always provided to
  ovn-northd.

  Acked-by: Numan Siddique <nusiddiq@redhat.com>
  Tested-by: Miguel Duarte de Mora Barroso <mdbarroso@redhat.com>
  Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
* ovsdb: Update NEWS for fast-resync feature. | Han Zhou | 2019-03-04 | 1 | -5/+5

  This patch updates text in NEWS committed by 5832e6a, so that it is
  easier to understand for end users.

  Signed-off-by: Han Zhou <hzhou8@ebay.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
* ovsdb: Add NEWS for fast-resync feature. | Han Zhou | 2019-03-01 | 1 | -0/+5

  Signed-off-by: Han Zhou <hzhou8@ebay.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
* NEWS: Clean up the 2.11.0 release notes a bit. | Justin Pettit | 2019-02-28 | 1 | -7/+5

  Signed-off-by: Justin Pettit <jpettit@ovn.org>
  Acked-by: Ben Pfaff <blp@ovn.org>
* Set release dates for 2.11.0. | Justin Pettit | 2019-02-20 | 1 | -1/+1

  Signed-off-by: Justin Pettit <jpettit@ovn.org>
  Acked-by: Flavio Leitner <fbl@sysclose.org>
* Userspace datapath: Add fragmentation handling. | Darrell Ball | 2019-02-14 | 1 | -1/+9

  Fragmentation handling is added for supporting conntrack. Both v4 and
  v6 are supported.

  After discussion with several people, I decided not to store
  configuration state in the database, in order to be more consistent
  with the kernel in future, to stay similar to other conntrack
  configuration which will not be in the database either, and for
  overall simplicity. Accordingly, fragmentation handling is enabled by
  default.

  This patch enables fragmentation tests for the userspace datapath.

  Signed-off-by: Darrell Ball <dlu998@gmail.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
* Remove support for OpenFlow 1.6 (draft). | Ben Pfaff | 2019-02-05 | 1 | -0/+2

  ONF abandoned the OpenFlow specification, so that OpenFlow 1.6 will
  never be completed. It did not contain much in the way of useful
  features, so remove what support Open vSwitch already had.

  Signed-off-by: Ben Pfaff <blp@ovn.org>
  Acked-by: Justin Pettit <jpettit@ovn.org>
* datapath: Add support for kernel 4.18.x | Yifeng Sun | 2019-02-04 | 1 | -1/+2

  No code change is necessary to support 4.18.x. Only one kernel test
  failed and it is in the process of being fixed.

  Updated .travis.yml to include 4.18.x and also use the latest 4.17
  version. Updated test files to test the 4.18 kernel.

  Tested-by: Greg Rose <gvrose8192@gmail.com>
  Reviewed-by: Greg Rose <gvrose8192@gmail.com>
  Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
* Support for match & set ICMPv6 reserved and options type fields | Vishal Deep Ajmera | 2019-02-04 | 1 | -0/+3

  Currently OVS supports all ARP protocol fields as OXM match fields to
  implement the relevant ARP procedures for IPv4. This includes support
  for matching, copying and setting ARP fields. In IPv6, ARP has been
  replaced by the ICMPv6 neighbor discovery (ND) procedures, neighbor
  advertisement and neighbor solicitation.

  The support for ICMPv6 fields in OVS is not complete for the use cases
  equivalent to ARP in IPv4. OVS lacks support for matching, copying and
  setting the “ND option type” and “ND reserved” fields. Without these,
  users cannot implement all ICMPv6 ND procedures for IPv6 support.

  This commit adds additional OXM fields to OVS for ICMPv6 “ND option
  type” and ICMPv6 “ND reserved” using the OXM extension mechanism. This
  allows support for parsing these fields from an ICMPv6 packet header
  and extending the OpenFlow protocol with specifications for these new
  OXM fields for matching, copying and setting.

  Signed-off-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com>
  Co-authored-by: Ashvin Lakshmikantha <ashvin.lakshmikantha@ericsson.com>
  Signed-off-by: Ashvin Lakshmikantha <ashvin.lakshmikantha@ericsson.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
* dpdk: Limit DPDK memory usage. | Ilya Maximets | 2019-02-01 | 1 | -0/+3

  Since the 18.05 release, DPDK moved to a dynamic memory model in which
  hugepages could be allocated on demand. At the same time the
  '--socket-mem' option was re-defined as a size of pre-allocated
  memory, i.e. memory that should be allocated at startup and could not
  be freed. So, DPDK with the new memory model could allocate more
  hugepage memory than specified in the '--socket-mem' or '-m' options.

  This change adds a new configurable 'other_config:dpdk-socket-limit'
  which could be used to limit the amount of memory DPDK could use. It
  uses the new DPDK option '--socket-limit'. Ex.:

      ovs-vsctl set Open_vSwitch . other_config:dpdk-socket-limit="1024,1024"

  Also, in order to preserve old behaviour, if '--socket-limit' is not
  specified, it will be defaulted to the amount of memory specified by
  the '--socket-mem' option, i.e. OVS will not be able to allocate more.
  This is needed, for example, to disallow OVS to allocate more memory
  than reserved for it by Nova in OpenStack installations.

  Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
  Signed-off-by: Ian Stokes <ian.stokes@intel.com>
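The defaulting rule in the last paragraph of this message is small enough to state as a function. This is a sketch of the rule as described, not the OVS implementation: the function name is hypothetical, values are kept as the raw per-socket strings, and NULL or "" is taken to mean "not specified".

```c
#include <stddef.h>

/* Returns the per-socket limit string that would be handed to DPDK as
 * '--socket-limit': the configured dpdk-socket-limit if present,
 * otherwise the '--socket-mem' value.  May return NULL when neither is
 * specified, a case the message above does not cover. */
const char *
effective_socket_limit(const char *socket_limit, const char *socket_mem)
{
    if (socket_limit && *socket_limit) {
        return socket_limit;
    }
    return socket_mem;
}
```

So a deployment that only sets socket-mem to "512,512" would end up capped at the same "512,512", while one that also sets dpdk-socket-limit="1024,1024" could grow up to that limit.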
* Prepare for post-2.11.0 (2.11.90).Justin Pettit2019-01-201-0/+4
| | | | | Signed-off-by: Justin Pettit <jpettit@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
* Prepare for 2.11.0.Justin Pettit2019-01-201-1/+1
| | | | | Signed-off-by: Justin Pettit <jpettit@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
* dpif-netdev: Per-port configurable EMC.Ilya Maximets2019-01-181-2/+5
| | | | | | | | | | | | | | | | | | | | | | | | Conditional EMC insert helps a lot in scenarios with high numbers of parallel flows, but in the current implementation this option affects all the threads and ports at once. There are scenarios where different ports see different numbers of flows. For example, if one of the VMs encapsulates traffic using additional headers, it will receive a large number of flows but only a few flows will come out of this VM. In this scenario it's much faster to use the EMC instead of the classifier for traffic from the VM, but it's better to disable the EMC for the traffic which flows to the VM. To handle the above issue, this patch introduces the 'emc-enable' configurable to enable/disable the EMC on a per-port basis. Ex.: ovs-vsctl set interface dpdk0 other_config:emc-enable=false The EMC probability is kept as is and applies to all the ports with 'emc-enable=true'. Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Acked-by: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>
* netdev-dpdk: support port representorsOphir Munk2019-01-171-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | DPDK port representors were introduced in DPDK versions 18.xx. Prior to port representors there was a one-to-one relationship between an rte device (e.g. PCI bus) and an eth device (referenced as dpdk port id in OVS). With port representors the relationship becomes one-to-many: one rte device to several eth devices. For example, in [3] there are two devices (representors) using the same PCI physical address 0000:08:00.0: "0000:08:00.0,representor=[3]" and "0000:08:00.0,representor=[5]". This commit handles the new one-to-many relationship. For example, when one of the device port representors in [3] is closed, the PCI bus cannot be detached until the other device port representor is closed as well. OVS remains backward compatible by supporting DPDK legacy PCI ports which do not include port representors. DPDK port representor related commits are listed in [1]. DPDK port representor documentation appears in [2]. A sample configuration which uses two representor ports (the output of the "ovs-vsctl show" command) is shown in [3]. [1] e0cb96204b71 ("net/i40e: add support for representor ports") cf80ba6e2038 ("net/ixgbe: add support for representor ports") 26c08b979d26 ("net/mlx5: add port representor awareness") [2] https://doc.dpdk.org/guides-18.11/prog_guide/switch_representation.html [3] Bridge "ovs_br0" Port "ovs_br0" Interface "ovs_br0" type: internal Port "port-rep3" Interface "port-rep3" type: dpdk options: {dpdk-devargs="0000:08:00.0,representor=[3]"} Port "port-rep5" Interface "port-rep5" type: dpdk options: {dpdk-devargs="0000:08:00.0,representor=[5]"} ovs_version: "2.10.90" Signed-off-by: Ophir Munk <ophirmu@mellanox.com> Co-authored-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>
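The detach rule in this commit message (a shared PCI device may only be detached once every eth device that uses it has been closed) can be modelled roughly as below. This is a conceptual sketch under stated assumptions, not the netdev-dpdk implementation: the class `RteDeviceRegistry` and its methods are hypothetical names, and the port names are borrowed from the sample configuration in [3].

```python
class RteDeviceRegistry:
    """Track which open ports share each rte device (hypothetical model)."""

    def __init__(self):
        # Maps a PCI address to the set of port names currently open on it.
        self._open_ports = {}

    def open_port(self, pci_addr, port_name):
        """Record that port_name is now open on the device at pci_addr."""
        self._open_ports.setdefault(pci_addr, set()).add(port_name)

    def close_port(self, pci_addr, port_name):
        """Close a port; return True only if the device may be detached.

        The device is detachable once no open port references it.
        """
        ports = self._open_ports.get(pci_addr, set())
        ports.discard(port_name)
        if not ports:
            self._open_ports.pop(pci_addr, None)
            return True
        return False


if __name__ == "__main__":
    registry = RteDeviceRegistry()
    registry.open_port("0000:08:00.0", "port-rep3")
    registry.open_port("0000:08:00.0", "port-rep5")
    print(registry.close_port("0000:08:00.0", "port-rep3"))
    print(registry.close_port("0000:08:00.0", "port-rep5"))
```

A legacy PCI port without representors is simply the case where the set holds a single port, so closing it makes the device detachable immediately.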
* rhel: Split OpenvSwitch and OVN packagesNuman Siddique2019-01-161-0/+4
| | | | | | | | | | | | | | | | | | | | | | Up until now, OVN rpms were generated as sub packages of OpenvSwitch. This patch splits them out and makes the OVN rpms independent. A new spec file, ovn-fedora.spec.in, is added for this. The openvswitch-fedora.spec.in has been modified to create only OpenvSwitch packages. Since we are not splitting the OVN code, the spec files run the same build procedure. Only the required binaries/files are copied into the rpms. The new package names will be ovn, ovn-common, ovn-central, ovn-host, ovn-vtep and ovn-docker. Acked-by: Flavio Leitner <fbl@sysclose.org> Signed-off-by: Numan Siddique <nusiddiq@redhat.com> Acked-by: Timothy Redaelli <tredaelli@redhat.com> Tested-By: Timothy Redaelli <tredaelli@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* Adding support for PMD auto load balancingNitin Katiyar2019-01-161-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Port rx queues that have not been statically assigned to PMDs are currently assigned based on periodically sampled load measurements. The assignment is performed at specific instances – port addition, port deletion, upon reassignment request via CLI etc. Over time, changes in the traffic pattern can cause uneven load among the PMDs, resulting in lower overall throughput. This patch enables the support of auto load balancing of PMDs based on the measured load of RX queues. Each PMD measures the processing load for each of its associated queues every 10 seconds. If the aggregated PMD load reaches 95% for 6 consecutive intervals then the PMD considers itself to be overloaded. If any PMD is overloaded, a dry-run of the PMD assignment algorithm is performed by the OVS main thread. The dry-run does NOT change the existing queue to PMD assignments. If the resultant mapping of the dry-run indicates an improved distribution of the load then the actual reassignment will be performed. The automatic rebalancing is disabled by default and has to be enabled via a configuration option. The interval (in minutes) between two consecutive rebalancing runs can also be configured via CLI; the default is 1 min. The following example commands can be used to set the auto-lb params: ovs-vsctl set open_vswitch . other_config:pmd-auto-lb="true" ovs-vsctl set open_vswitch . other_config:pmd-auto-lb-rebalance-intvl="5" Co-authored-by: Rohith Basavaraja <rohith.basavaraja@gmail.com> Co-authored-by: Venkatesan Pradeep <venkatesan.pradeep@ericsson.com> Signed-off-by: Rohith Basavaraja <rohith.basavaraja@gmail.com> Signed-off-by: Venkatesan Pradeep <venkatesan.pradeep@ericsson.com> Signed-off-by: Nitin Katiyar <nitin.katiyar@ericsson.com> Acked-by: Kevin Traynor <ktraynor@redhat.com> Tested-by: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>
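The overload check and dry-run decision described in this commit message can be sketched as below. This is a rough illustration only, not the dpif-netdev code. The 95% threshold and the 6 consecutive intervals come from the text; the names `PmdLoadTracker` and `improves_distribution` are hypothetical, and using the variance of per-PMD load as the measure of "improved distribution" is an assumption, since the message does not say how improvement is judged.

```python
from statistics import pvariance

LOAD_THRESHOLD_PCT = 95  # aggregated PMD load threshold, from the commit message
OVERLOAD_INTERVALS = 6   # consecutive intervals required, from the commit message


class PmdLoadTracker:
    """Hypothetical per-PMD tracker; not an OVS data structure."""

    def __init__(self):
        self.consecutive = 0

    def record_interval(self, queue_loads_pct):
        """Record one measurement interval of per-queue loads (percent).

        Returns True once the aggregated load has been at or above the
        threshold for OVERLOAD_INTERVALS intervals in a row.
        """
        if sum(queue_loads_pct) >= LOAD_THRESHOLD_PCT:
            self.consecutive += 1
        else:
            self.consecutive = 0  # any interval below the threshold resets the streak
        return self.consecutive >= OVERLOAD_INTERVALS


def improves_distribution(current_pmd_loads, dry_run_pmd_loads):
    """Assumed criterion: the dry-run mapping has lower load variance."""
    return pvariance(dry_run_pmd_loads) < pvariance(current_pmd_loads)


if __name__ == "__main__":
    tracker = PmdLoadTracker()
    for _ in range(OVERLOAD_INTERVALS):
        overloaded = tracker.record_interval([50, 46])
    if overloaded and improves_distribution([96, 10], [53, 53]):
        print("apply the dry-run assignment")
```

The dry-run result is only compared here, never applied, which mirrors the statement that the dry-run does not change the existing queue to PMD assignments.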
* ovs-vswitchd: Implement OFPT_TABLE_FEATURES table modification request.Ben Pfaff2019-01-151-0/+4
| | | | | | | | | This allows a controller to change the name of OpenFlow flow tables in the OVS software switch. CC: Brad Cowie <brad@cowie.nz> Acked-by: Justin Pettit <jpettit@ovn.org> Signed-off-by: Ben Pfaff <blp@ovn.org>