delta/openvswitch.git - github.com: openvswitch/ovs.git

	Commit message (Collapse)	Author	Age	Files	Lines
*	github: Add GitHub Actions workflow.	Ilya Maximets	2020-11-26	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is an initial version of GitHub Actions support. It mostly mimics our current Travis CI build matrix with slight differences. The main issue is that we don't have ARM support here. Minor difference that we can not install 32-bit versions of libunwind and libunbound since those are not avaialble in repository. Higher concurrency level allows to finish all tests less than in 20 minutes. Which is 3 times faster than in Travis. .travis folder renamed to .ci to highlight that it used not only for Travis CI. Travis CI support will be reduced to only test ARM builds soon and will be completely removed when travis-ci.org will be turned into read-only mode. What happened to Travis CI: https://mail.openvswitch.org/pipermail/ovs-dev/2020-November/377773.html Acked-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
*	netdev-dpdk: Add option to configure VF MAC address.	Gaetan Rivet	2020-11-16	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In some cloud topologies, using DPDK VF representors in guest requires configuring a VF before it is assigned to the guest. A first basic option for such configuration is setting the VF MAC address. Add a key 'dpdk-vf-mac' to the 'options' column of the Interface table. This option can be used as such: $ ovs-vsctl add-port br0 dpdk-rep0 -- set Interface dpdk-rep0 type=dpdk \ options:dpdk-vf-mac=00:11:22:33:44:55 Suggested-by: Ilya Maximets <i.maximets@ovn.org> Acked-by: Eli Britstein <elibr@nvidia.com> Acked-by: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Gaetan Rivet <grive@u256.net> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
*	dpctl: Add the option 'pmd' for dump-flows.	Tonghao Zhang	2020-11-10	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \|	"ovs-appctl dpctl/dump-flows" added the option "pmd" which allow user to dump pmd specified. That option is useful to dump rules of pmd when we have a large number of rules in dp. Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Acked-by: Gaetan Rivet <grive@u256.net> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
*	raft: Make backlog thresholds configurable.	Ilya Maximets	2020-11-10	1	-0/+1
\| \| \| \| \| \| \| \| \| \|	New appctl 'cluster/set-backlog-threshold' to configure thresholds on backlog of raft jsonrpc connections. Could be used, for example, in some extreme conditions where size of a database expected to be very large, i.e. comparable with default 4GB threshold. Acked-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
*	raft: Set threshold on backlog for raft connections.	Ilya Maximets	2020-11-10	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	RAFT messages could be fairly big. If something abnormal happens to one of the servers in a cluster it may not be able to process all the incoming messages in a timely manner. This results in jsonrpc backlog growth on the sender's side. For example if follower gets many new clients at once that it needs to serve, or it decides to take a snapshot in a period of high number of database changes. If backlog grows large enough it becomes harder and harder for follower to process incoming raft messages, it sends outdated replies and starts receiving snapshots and the whole raft log from the leader. Sometimes backlog grows too high (60GB in this example): jsonrpc\|INFO\|excessive sending backlog, jsonrpc: ssl:<ip>, num of msgs: 15370, backlog: 61731060773. In this case OS might actually decide to kill the sender to free some memory. Anyway, It could take a lot of time for such a server to catch up with the rest of the cluster if it has so much data to receive and process. Introducing backlog thresholds for jsonrpc connections. If sending backlog will exceed particular values (500 messages or 4GB in size), connection will be dropped and re-created. This will allow to drop all the current backlog and start over increasing chances of cluster recovery. Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1888829 Acked-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
*	ovsdb-server: Reclaim heap memory after compaction.	Ilya Maximets	2020-11-03	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Compaction happens at most once in 10 minutes. That is a big time interval for a heavy loaded ovsdb-server in cluster mode. In 10 minutes raft logs could grow up to tens of thousands of entries with tens of gigabytes in total size. While compaction cleans up raft log entries, the memory in many cases is not returned to the system, but kept in the heap of running ovsdb-server process, and it could stay in this condition for a really long time. In the end one performance spike could lead to a fast growth of the raft log and this memory will never (for a really long time) be released to the system even if the database if empty. Simple example how to reproduce with OVN sandbox: 1. make sandbox SANDBOXFLAGS='--nbdb-model=clustered --sbdb-model=clustered' 2. Run following script that creates 1 port group, adds 4000 acls and removes all of that in the end: # cat ../memory-test.sh pg_name=my_port_group export OVN_NB_DAEMON=$(ovn-nbctl --pidfile --detach --log-file -vsocket_util:off) ovn-nbctl pg-add $pg_name for i in $(seq 1 4000); do echo "Iteration: $i" ovn-nbctl --log acl-add $pg_name from-lport $i udp drop done ovn-nbctl acl-del $pg_name ovn-nbctl pg-del $pg_name ovs-appctl -t $(pwd)/sandbox/nb1 memory/show ovn-appctl -t ovn-nbctl exit --- 3. Stopping one of Northbound DB servers: ovs-appctl -t $(pwd)/sandbox/nb1 exit Make sure that ovsdb-server didn't compact the database before it was stopped. Now we have a db file on disk that contains 4000 fairly big transactions inside. 4. Trying to start same ovsdb-server with this file. # cd sandbox && ovsdb-server <...> nb1.db At this point ovsdb-server reads all the transactions from db file and performs all of them as fast as it can one by one. When it finishes this, raft log contains 4000 entries and ovsdb-server consumes (on my system) ~13GB of memory while database is empty. And libc will likely never return this memory back to system, or, at least, will hold it for a really long time. This patch adds a new command 'ovsdb-server/memory-trim-on-compaction'. It's disabled by default, but once enabled, ovsdb-server will call 'malloc_trim(0)' after every successful compaction to try to return unused heap memory back to system. This is glibc-specific, so we need to detect function availability in a build time. Disabled by default since it adds from 1% to 30% (depending on the current state) to the snapshot creation time and, also, next memory allocations will likely require requests to kernel and that might be slower. Could be enabled by default later if considered broadly beneficial. Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1888829 Acked-by: Dumitru Ceara <dceara@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
*	NEWS: Move GTP-U entry to correct release.	Ilya Maximets	2020-10-25	1	-3/+3
\| \| \| \| \| \| \| \|	GTP-U support was released in 2.14, not 2.13. Fixes: 3c6d05a02e0f ("userspace: Add GTP-U support.") Acked-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
*	Eliminate use of term "slave" in bond, LACP, and bundle contexts.	Ben Pfaff	2020-10-21	1	-1/+3
\| \| \| \| \| \| \| \| \| \| \| \| \|	The new term is "member". Most of these changes should not change user-visible behavior. One place where they do is in "ovs-ofctl dump-flows", which will now output "members:..." inside "bundle" actions instead of "slaves:...". I don't expect this to cause real problems in most systems. The old syntax is still supported on input for backward compatibility. Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
*	NEWS: Move terminology update to correct place.	Ilya Maximets	2020-10-21	1	-3/+3
\| \| \| \| \| \| \| \|	It's Post-v2.14.0, not v2.14.0. Fixes: 807152a4ddfb ("Use primary/secondary, not master/slave, as names for OpenFlow roles.") Acked-by: Ben Pfaff <blp@ovn.org> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
*	Documentation: Update faq and NEWS for kernel 5.8	Greg Rose	2020-10-17	1	-0/+2
\| \| \| \| \| \| \| \| \|	Update the NEWS and faq now that we will support up to Linux kernel 5.8. Acked-by: Yi-Hung Wei <yihung.wei@gmail.com> Signed-off-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
*	Use primary/secondary, not master/slave, as names for OpenFlow roles.	Ben Pfaff	2020-10-16	1	-0/+3
\| \| \| \| \|	Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Alin Gabriel Serdean <aserdean@ovn.org>
*	dns-resolve: Allow unbound's config file to be set through an env var.	Ted Elhourani	2020-10-08	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \|	When an unbound context is created, check whether OVS_UNBOUND_CONF has been set. If a valid config file is supplied then use it to configure the context. The procedure returns if the config file is invalid. If no config file is found then the default unbound config is used. Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com> Signed-off-by: Ted Elhourani <ted.elhourani@nutanix.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
*	DPDK: Remove support for vhost-user zero-copy.	Ian Stokes	2020-10-05	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Support for vhost-user dequeue zero-copy was deprecated in OVS 2.14 with the aim of removing it for OVS 2.15. OVS only supports zero copy for vhost client mode, as such it will cease to function due to DPDK commit [1] Also DPDK is set to remove zero-copy functionality in DPDK 20.11 as referenced by commit [2] As such remove support from OVS. [1] 715070ea10e6 ("vhost: prevent zero-copy with incompatible client mode") [2] d21003c9dafa ("doc: announce removal of vhost zero-copy dequeue") Signed-off-by: Ian Stokes <ian.stokes@intel.com> Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Ilya Maximets <i.maximets@ovn.org> Acked-by: Kevin Traynor <ktraynor@redhat.com>
*	ovsdb: Add unixctl command to show storage status.	Dumitru Ceara	2020-09-16	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If a database enters an error state, e.g., in case of RAFT when reading the DB file contents if applying the RAFT records triggers constraint violations, there's no way to determine this unless a client generates a write transaction. Such write transactions would fail with "ovsdb-error: inconsistent data". This commit adds a new command to show the status of the storage that's backing a database. Example, on an inconsistent database: $ ovs-appctl -t /tmp/test.ctl ovsdb-server/get-db-storage-status DB status: ovsdb error: inconsistent data Example, on a consistent database: $ ovs-appctl -t /tmp/test.ctl ovsdb-server/get-db-storage-status DB status: ok Signed-off-by: Dumitru Ceara <dceara@redhat.com> Acked-by: Han Zhou <hzhou@ovn.org> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
*	Set release date for 2.14.0.	Ilya Maximets	2020-08-17	1	-1/+1
\| \| \| \| \| \|	Acked-by: Ian Stokes <ian.stokes@intel.com> Acked-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
*	dpdk: Deprecate vhost-user dequeue zero-copy.	Ian Stokes	2020-08-12	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Dequeue zero-copy is no longer supported for vhost-user client mode in DPDK due to commit [1]. In addition to this, zero-copy mode has been proposed to be marked deprecated in [2] with removal in the next DPDK LTS release. This commit deprecates support for vhost-user dequeue zero-copy in OVS with its removal expected in the next OVS release. [1] 715070ea10e6 ("vhost: prevent zero-copy with incompatible client mode") [2] http://mails.dpdk.org/archives/dev/2020-August/177236.html Signed-off-by: Ian Stokes <ian.stokes@intel.com> Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com> Acked-by: Ilya Maximets <i.maximets@ovn.org>
*	Prepare for post-2.14.0 (2.14.90).	Ilya Maximets	2020-07-17	1	-0/+4
\| \| \| \| \|	Acked-by: Ian Stokes <ian.stokes@intel.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
*	Prepare for 2.14.0.	Ilya Maximets	2020-07-17	1	-1/+1
\| \| \| \| \|	Acked-by: Ian Stokes <ian.stokes@intel.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
*	dpdk: Add commands to configure log levels.	David Marchand	2020-07-17	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Enabling debug logs in dpdk can be a challenge to be sure of what is actually enabled, add commands to list and change those log levels. However, these commands do not help when tracking issues in dpdk init itself: dump log levels right after init. Example: $ ovs-appctl dpdk/log-list global log level is debug id 0: lib.eal, level is info id 1: lib.malloc, level is info id 2: lib.ring, level is info id 3: lib.mempool, level is info id 4: lib.timer, level is info id 5: pmd, level is info [...] id 37: pmd.net.bnxt.driver, level is notice id 38: pmd.net.e1000.init, level is notice id 39: pmd.net.e1000.driver, level is notice id 40: pmd.net.enic, level is info [...] $ ovs-appctl dpdk/log-set debug pmd.*:notice $ ovs-appctl dpdk/log-list global log level is debug id 0: lib.eal, level is debug id 1: lib.malloc, level is debug id 2: lib.ring, level is debug id 3: lib.mempool, level is debug id 4: lib.timer, level is debug id 5: pmd, level is debug [...] id 37: pmd.net.bnxt.driver, level is notice id 38: pmd.net.e1000.init, level is notice id 39: pmd.net.e1000.driver, level is notice id 40: pmd.net.enic, level is notice [...] Signed-off-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
*	bond: Add 'primary' interface concept for active-backup mode.	Jeff Squyres	2020-07-17	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In AB bonding, if the current active slave becomes disabled, a replacement slave is arbitrarily picked from the remaining set of enabled slaves. This commit adds the concept of a "primary" slave: an interface that will always be (or become) the current active slave if it is enabled. The rationale for this functionality is to allow the designation of a preferred interface for a given bond. For example: 1. Bond is created with interfaces p1 (primary) and p2, both enabled. 2. p1 becomes the current active slave (because it was designated as the primary). 3. Later, p1 fails/becomes disabled. 4. p2 is chosen to become the current active slave. 5. Later, p1 becomes re-enabled. 6. p1 is chosen to become the current active slave (because it was designated as the primary) Note that p1 becomes the active slave once it becomes re-enabled, even if nothing has happened to p2. This "primary" concept exists in Linux kernel network interface bonding, but did not previously exist in OVS bonding. Only one primary slave interface is supported per bond, and is only supported for active/backup bonding. The primary slave interface is designated via "other_config:bond-primary" when creating a bond. Also, while adding tests for the "primary" concept, make a few small improvements to the non-primary AB bonding test. Signed-off-by: Jeff Squyres <jsquyres@cisco.com> Reviewed-by: Aaron Conole <aconole@redhat.com> Tested-by: Greg Rose <gvrose8192@gmail.com> Acked-by: Greg Rose <gvrose8192@gmail.com> Acked-by: Flavio Leitner <fbl@sysclose.org> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
*	dpdk: Use DPDK 19.11.2 release.	Ian Stokes	2020-07-13	1	-0/+3
\| \| \| \| \| \| \| \| \|	Modify travis linux build script to use DPDK 19.11.2 stable release and update docs to reference 19.11.2 stable release. Update release faq to reflect latest validated DPDK versions for all branches. Signed-off-by: Ian Stokes <ian.stokes@intel.com> Acked-by: Kevin Traynor <ktraynor@redhat.com>
*	docs/dpdk/bridge: add datapath performance section.	Harry van Haaren	2020-07-13	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This commit adds a section to the dpdk/bridge.rst netdev documentation, detailing the added DPCLS functionality. The newly added commands are documented, and sample output is provided. Running the DPCLS autovalidator with unit tests by default is possible through re-compiling the autovalidator to have the highest priority at startup time. This avoids making changes to all tests, and enables debug and CI builds to validate every lookup implementation with all unit tests. Add NEWS updates for CPU ISA, dynamic subtables, and AVX512 lookup. Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com> Acked-by: William Tu <u9012063@gmail.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>
*	netdev-offload-dpdk: Support offload of clone tnl_push/output actions.	Eli Britstein	2020-07-08	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	Tunnel encapsulation is done by tnl_push and output actions nested in a clone action. Support offloading of such flows with RTE_FLOW_ACTION_TYPE_RAW_ENCAP attribute. Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Acked-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
*	netdev-offload-dpdk: Support offload of set IPv6 actions.	Eli Britstein	2020-07-08	1	-0/+2
\| \| \| \| \| \| \| \| \|	Add support for set IPv6 actions. Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Roni Bar Yanai <roniba@mellanox.com> Acked-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
*	netdev-offload-dpdk: Add IPv6 pattern matching.	Eli Britstein	2020-07-08	1	-0/+1
\| \| \| \| \| \| \| \| \|	Add support for IPv6 pattern matching for offloading flows. Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Roni Bar Yanai <roniba@mellanox.com> Acked-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
*	dpif-netdev: Delete the artificial flow limit.	Tonghao Zhang	2020-07-07	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The MAX_FLOWS constant was there from the introduction of dpif-netdev, however, later new flow-limit mechanism was implemented that controls number of datapath flows in a dynamic way on ofproto level. So, we can just remove the limit and fully rely on ofproto to decide what flow limit we need. There are no limitations for flow table size in dpif-netdev beside the artificial one. 'other_config:flow-limit' seems suitable to control this. Suggested-by: Ilya Maximets <i.maximets@ovn.org> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
*	netdev-offload-dpdk: Support offload of VLAN PUSH/POP actions.	Sriharsha Basavapatna	2020-06-22	1	-0/+1
\| \| \| \| \| \| \| \|	Parse VLAN PUSH/POP OVS datapath actions and add respective RTE actions. Signed-off-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com> Acked-by: Eli Britstein <elibr@mellanox.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
*	userspace: Avoid dp_hash recirculation for balance-tcp bond mode.	Vishal Deep Ajmera	2020-06-22	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Problem: In OVS, flows with output over a bond interface of type “balance-tcp” gets translated by the ofproto layer into "HASH" and "RECIRC" datapath actions. After recirculation, the packet is forwarded to the bond member port based on 8-bits of the datapath hash value computed through dp_hash. This causes performance degradation in the following ways: 1. The recirculation of the packet implies another lookup of the packet’s flow key in the exact match cache (EMC) and potentially Megaflow classifier (DPCLS). This is the biggest cost factor. 2. The recirculated packets have a new “RSS” hash and compete with the original packets for the scarce number of EMC slots. This implies more EMC misses and potentially EMC thrashing causing costly DPCLS lookups. 3. The 256 extra megaflow entries per bond for dp_hash bond selection put additional load on the revalidation threads. Owing to this performance degradation, deployments stick to “balance-slb” bond mode even though it does not do active-active load balancing for VXLAN- and GRE-tunnelled traffic because all tunnel packet have the same source MAC address. Proposed optimization: This proposal introduces a new load-balancing output action instead of recirculation. Maintain one table per-bond (could just be an array of uint16's) and program it the same way internal flows are created today for each possible hash value (256 entries) from ofproto layer. Use this table to load-balance flows as part of output action processing. Currently xlate_normal() -> output_normal() -> bond_update_post_recirc_rules() -> bond_may_recirc() and compose_output_action__() generate 'dp_hash(hash_l4(0))' and 'recirc(<RecircID>)' actions. In this case the RecircID identifies the bond. For the recirculated packets the ofproto layer installs megaflow entries that match on RecircID and masked dp_hash and send them to the corresponding output port. Instead, we will now generate action as 'lb_output(<bond id>)' This combines hash computation (only if needed, else re-use RSS hash) and inline load-balancing over the bond. This action is used only for balance-tcp bonds in userspace datapath (the OVS kernel datapath remains unchanged). Example: Current scheme: With 8 UDP flows (with random UDP src port): flow-dump from pmd on cpu core: 2 recirc_id(0),in_port(7),<...> actions:hash(hash_l4(0)),recirc(0x1) recirc_id(0x1),dp_hash(0xf8e02b7e/0xff),<...> actions:2 recirc_id(0x1),dp_hash(0xb236c260/0xff),<...> actions:1 recirc_id(0x1),dp_hash(0x7d89eb18/0xff),<...> actions:1 recirc_id(0x1),dp_hash(0xa78d75df/0xff),<...> actions:2 recirc_id(0x1),dp_hash(0xb58d846f/0xff),<...> actions:2 recirc_id(0x1),dp_hash(0x24534406/0xff),<...> actions:1 recirc_id(0x1),dp_hash(0x3cf32550/0xff),<...> actions:1 New scheme: We can do with a single flow entry (for any number of new flows): in_port(7),<...> actions:lb_output(1) A new CLI has been added to dump datapath bond cache as given below. # ovs-appctl dpif-netdev/bond-show [dp] Bond cache: bond-id 1 : bucket 0 - slave 2 bucket 1 - slave 1 bucket 2 - slave 2 bucket 3 - slave 1 Co-authored-by: Manohar Krishnappa Chidambaraswamy <manukc@gmail.com> Signed-off-by: Manohar Krishnappa Chidambaraswamy <manukc@gmail.com> Signed-off-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com> Tested-by: Matteo Croce <mcroce@redhat.com> Tested-by: Adrian Moreno <amorenoz@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
*	netdev-offload-tc: Revert tunnel src/dst port masks handling	Roi Dayan	2020-06-19	1	-2/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The cited commit intended to add tc support for masking tunnel src/dst ips and ports. It's not possible to do tunnel ports masking with openflow rules and the default mask for tunnel ports set to 0 in tnl_wc_init(), unlike tunnel ports default mask which is full mask. So instead of never passing tunnel ports to tc, revert the changes to tunnel ports to always pass the tunnel port. In sw classification is done by the kernel, but for hw we must match the tunnel dst port. Fixes: 5f568d049130 ("netdev-offload-tc: Allow to match the IP and port mask of tunnel") Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Eli Britstein <elibr@mellanox.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>
*	netdev-offload-tc: Allow to match the IP and port mask of tunnel	Tonghao Zhang	2020-06-03	1	-0/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch allows users to offload the TC flower rules with tunnel mask. This patch allows masked match of the following, where previously supported an exact match was supported: * Remote (dst) tunnel endpoint address * Local (src) tunnel endpoint address * Remote (dst) tunnel endpoint UDP port And also allows masked match of the following, where previously no match was supported: * Local (src) tunnel endpoint UDP port In some case, mask is useful as wildcards. For example, DDOS, in that case, we don’t want to allow specified hosts IPs or only source Ports to access the targeted host. For example: $ ovs-appctl dpctl/add-flow "tunnel(dst=2.2.2.100,src=2.2.2.0/255.255.255.0,tp_dst=4789),\ recirc_id(0),in_port(3),eth(),eth_type(0x0800),ipv4()" "" $ tc filter show dev vxlan_sys_4789 ingress ... eth_type ipv4 enc_dst_ip 2.2.2.100 enc_src_ip 2.2.2.0/24 enc_dst_port 4789 enc_ttl 64 in_hw in_hw_count 2 action order 1: gact action drop ... Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Acked-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>
*	userspace: Add conntrack timeout policy support.	William Tu	2020-05-01	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \|	Commit 1f1613183733 ("ct-dpif, dpif-netlink: Add conntrack timeout policy support") adds conntrack timeout policy for kernel datapath. This patch enables support for the userspace datapath. I tested using the 'make check-system-userspace' which checks the timeout policies for ICMP and UDP cases. Signed-off-by: William Tu <u9012063@gmail.com> Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
*	ofp-actions: Add delete field action	Yi-Hung Wei	2020-04-29	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds a new OpenFlow action, delete field, to delete a field in packets. Currently, only the tun_metadata fields are supported. One use case to add this action is to support multiple versions of geneve tunnel metadatas to be exchanged among different versions of networks. For example, we may introduce tun_metadata2 to replace old tun_metadata1, but still want to provide backward compatibility to the older release. In this case, in the new OpenFlow pipeline, we would like to support the case to receive a packet with tun_metadata1, do some processing. And if the packet is going to a switch in the newer release, we would like to delete the value in tun_metadata1 and set a value into tun_metadata2. Currently, ovs does not provide an action to remove a value in tun_metadata if the value is present. This patch fulfills the gap by adding the delete_field action. For example, the OpenFlow syntax to delete tun_metadata1 is: actions=delete_field:tun_metadata1 Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> Signed-off-by: William Tu <u9012063@gmail.com> Acked-by: William Tu <u9012063@gmail.com>
*	netdev-afxdp: Add interrupt mode netdev class.	William Tu	2020-04-28	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \|	The patch adds a new netdev class 'afxdp-nonpmd' to enable afxdp interrupt mode. This is similar to 'type=afxdp', except that the is_pmd field is set to false. As a result, the packet processing is handled by main thread, not pmd thread. This avoids burning the CPU to always 100% when there is no traffic. Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
*	userspace: Add GTP-U support.	William Tu	2020-03-25	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	GTP, GPRS Tunneling Protocol, is a group of IP-based communications protocols used to carry general packet radio service (GPRS) within GSM, UMTS and LTE networks. GTP protocol has two parts: Signalling (GTP-Control, GTP-C) and User data (GTP-User, GTP-U). GTP-C is used for setting up GTP-U protocol, which is an IP-in-UDP tunneling protocol. Usually GTP is used in connecting between base station for radio, Serving Gateway (S-GW), and PDN Gateway (P-GW). This patch implements GTP-U protocol for userspace datapath, supporting only required header fields and G-PDU message type. See spec in: https://tools.ietf.org/html/draft-hmm-dmm-5g-uplane-analysis-00 Tested-at: https://travis-ci.org/github/williamtu/ovs-travis/builds/666518784 Signed-off-by: Feng Yang <yangfengee04@gmail.com> Co-authored-by: Feng Yang <yangfengee04@gmail.com> Signed-off-by: Yi Yang <yangyi01@inspur.com> Co-authored-by: Yi Yang <yangyi01@inspur.com> Signed-off-by: William Tu <u9012063@gmail.com> Acked-by: Ben Pfaff <blp@ovn.org>
*	datapath: Update kernel test list, news and FAQ	Greg Rose	2020-03-06	1	-0/+2
\| \| \| \| \| \| \| \|	We are adding support for Linux kernels up to 5.5 so update the Travis test list, NEWS and FAQ. Signed-off-by: Greg Rose <gvrose8192@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
*	ofproto: Add support to watch controller port liveness in fast-failover group	Vishal Deep Ajmera	2020-03-06	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently fast-failover group does not support checking liveness of controller port (OFPP_CONTROLLER). However this feature can be useful for selecting alternate pipeline when controller connection itself is down for e.g. by using local DHCP server to reply for any DHCP request originating from VMs. This patch adds the support for watching controller port liveness in fast- failover group. Controller port is considered live when atleast one of-connection is alive. Example usage: ovs-ofctl add-group br-int 'group_id=1234,type=ff, bucket=watch_port:CONTROLLER,actions:<A>, bucket=watch_port:1,actions:<B> Signed-off-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
*	netdev-dpdk: Remove deprecated ring port type.	Ilya Maximets	2020-03-06	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	'dpdkr' ring ports was deprecated in 2.13 release and was not actually used for a long time. Remove support now. More details in commit b4c5f00c339b ("netdev-dpdk: Deprecate ring ports.") Acked-by: Aaron Conole <aconole@redhat.com> Acked-by: David Marchand <david.marchand@redhat.com> Acked-by: Ian Stokes <ian.stokes@intel.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
*	dpdk: Remove deprecated pdump support.	Ilya Maximets	2020-03-06	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \|	DPDK pdump was deprecated in 2.13 release and didn't actually work since 2.11. Removing it. More details in commit 4ae8c4617fd3 ("dpdk: Deprecate pdump support.") Acked-by: Aaron Conole <aconole@redhat.com> Acked-by: David Marchand <david.marchand@redhat.com> Acked-by: Ian Stokes <ian.stokes@intel.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
*	Set release date for 2.13.0.	Ben Pfaff	2020-02-14	1	-1/+1
\| \| \| \| \| \| \|	The "Valentine's Day" release. Acked-by: Flavio Leitner <fbl@sysclose.org> Signed-off-by: Ben Pfaff <blp@ovn.org>
*	vswitchd: Add serial number configuration.	Kirill A. Kornilov	2020-01-31	1	-0/+3
\| \| \| \| \|	Signed-off-by: Kirill A. Kornilov <kornilov@zelax.ru> Signed-off-by: Ben Pfaff <blp@ovn.org>
*	ofproto: Do not delete datapath flows on exit by default.	Ben Pfaff	2020-01-24	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	Commit e96a5c24e853 ("upcall: Remove datapath flows when setting n-threads.") caused OVS to delete datapath flows when it exits through any graceful means. This is not necessarily desirable, especially when OVS is being stopped as part of an upgrade. This commit changes OVS so that it only removes datapath flows when requested, via "ovs-appctl exit --cleanup". Acked-by: Numan Siddique <numans@ovn.org> Tested-by: Numan Siddique <numans@ovn.org> Signed-off-by: Ben Pfaff <blp@ovn.org>
*	Prepare for post-2.13.0 (2.13.90).	Ben Pfaff	2020-01-21	1	-0/+4
\| \| \| \| \|	Acked-by: Gurucharan Shetty <guru@ovn.org> Signed-off-by: Ben Pfaff <blp@ovn.org>
*	Prepare for 2.13.0.	Ben Pfaff	2020-01-21	1	-1/+1
\| \| \| \| \|	Acked-by: Gurucharan Shetty <guru@ovn.org> Signed-off-by: Ben Pfaff <blp@ovn.org>
*	userspace: Add TCP Segmentation Offload support	Flavio Leitner	2020-01-17	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Abbreviated as TSO, TCP Segmentation Offload is a feature which enables the network stack to delegate the TCP segmentation to the NIC reducing the per packet CPU overhead. A guest using vhostuser interface with TSO enabled can send TCP packets much bigger than the MTU, which saves CPU cycles normally used to break the packets down to MTU size and to calculate checksums. It also saves CPU cycles used to parse multiple packets/headers during the packet processing inside virtual switch. If the destination of the packet is another guest in the same host, then the same big packet can be sent through a vhostuser interface skipping the segmentation completely. However, if the destination is not local, the NIC hardware is instructed to do the TCP segmentation and checksum calculation. It is recommended to check if NIC hardware supports TSO before enabling the feature, which is off by default. For additional information please check the tso.rst document. Signed-off-by: Flavio Leitner <fbl@sysclose.org> Tested-by: Ciara Loftus <ciara.loftus.intel.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>
*	Typo fix: vswtch -> vswitch.	Ben Pfaff	2020-01-17	1	-1/+1
\| \| \| \| \|	Acked-by: Numan Siddique <numans@ovn.org> Signed-off-by: Ben Pfaff <blp@ovn.org>
*	dpif-netdev: Modified ovs-appctl dpctl/dump-flows command	Emma Finn	2020-01-17	1	-0/+2
\| \| \| \| \| \| \| \| \| \|	Modified ovs-appctl dpctl/dump-flows command to output the miniflow bits for a given flow when -m option is passed. $ ovs-appctl dpctl/dump-flows -m Signed-off-by: Emma Finn <emma.finn@intel.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>
*	NEWS: Fix location of output actions to be under DPDK.	Eli Britstein	2020-01-17	1	-2/+2
\| \| \| \| \| \| \| \| \|	In the cited commit, the output actions NEWS was mistakenly under OVSDB instead of DPDK. Fix it. Fixes: 3c7330ebf036 ("netdev-offload-dpdk: Support offload of output action.") Signed-off-by: Eli Britstein <elibr@mellanox.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
*	ovsdb-server: Allow OVSDB clients to specify the UUID for inserted rows.	Ben Pfaff	2020-01-16	1	-0/+1
\| \| \| \| \| \|	Acked-by: Han Zhou <hzhou@ovn.org> Requested-by: Leonid Ryzhyk <lryzhyk@vmware.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
*	netdev-offload-dpdk: Support offload of set TCP/UDP ports actions.	Eli Britstein	2020-01-16	1	-2/+2
\| \| \| \| \| \|	Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
*	netdev-offload-dpdk: Support offload of set IPv4 actions.	Eli Britstein	2020-01-16	1	-2/+2
\| \| \| \| \| \|	Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>