Commit message | Author | Age | Files | Lines
* python: Allow building json C extension with static OVS library. | Frode Nordahl | 2022-07-15 | 1 file | -1/+11
  Allow the caller of setup.py to pass in libopenvswitch.a as an object for linking through the LDFLAGS environment variable when not building a shared openvswitch library. To accomplish this, set the `enable_shared` environment variable to 'no'.

  Example:

      LDFLAGS=lib/libopenvswitch.a enable_shared=no setup.py install

  Signed-off-by: Frode Nordahl <frode.nordahl@canonical.com>
  Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* ci: Separate job for debs, ensure built pkg is tested. | Frode Nordahl | 2022-07-15 | 2 files | -18/+53
  Use a separate GitHub Actions job for the deb test so that we can control the base image for the package test.

  The CI deb package test code currently attempts to use `apt` to install local packages. That may not produce the expected result. Explicitly install the local packages with `dpkg` after installing dependencies from `apt` instead.

  Also enable test installation of the ipsec deb package.

  Signed-off-by: Frode Nordahl <frode.nordahl@canonical.com>
  Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* debian: Update packaging source from Debian/Ubuntu. | Frode Nordahl | 2022-07-15 | 70 files | -2/+3485
  * Update upstream OVS debian packaging to be on par with package source in Debian/Ubuntu:
    - Provide an openvswitch-switch-dpdk package that integrates with the dpdk package in the distributions so that end users can opt into a DPDK-enabled Open vSwitch binary.
    - Provide systemd service files.
    - Provide an openvswitch-source package for reproducible integrated builds of, for example, OVN.
    - Stop building the shared library and subsequently remove the libopenvswitch and libopenvswitch-dev binary packages.

  Co-authored-by: Luca Boccassi <bluca@debian.org>
  Signed-off-by: Luca Boccassi <bluca@debian.org>
  Co-authored-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
  Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
  Co-authored-by: James Page <james.page@ubuntu.com>
  Signed-off-by: James Page <james.page@ubuntu.com>
  Co-authored-by: Corey Bryant <corey.bryant@canonical.com>
  Signed-off-by: Corey Bryant <corey.bryant@canonical.com>
  Signed-off-by: Frode Nordahl <frode.nordahl@canonical.com>
  Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* checkpatch: Ignore line length and leading whitespace for debian/*. | Frode Nordahl | 2022-07-15 | 1 file | -2/+2
  Signed-off-by: Frode Nordahl <frode.nordahl@canonical.com>
  Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* debian: Archive debian packaging source. | Frode Nordahl | 2022-07-15 | 60 files | -2932/+1
  The packaging source in the OVS repository has drifted away from what is currently in Debian and Ubuntu.

  This state is problematic because from time to time someone tries to build packages from the upstream OVS debian package source and then expects that package to work with upgrades and downgrades from/to distro versions.

  To support the on-going work to remove the out of tree OVS kernel driver from the repository [0], an update to the debian packaging is also required.

  On the back of the discussion in [0] we agreed that replacing the current version with what Debian and Ubuntu are currently converging on would be preferable.

  This commit is the first in a series to update the upstream OVS debian packaging source to be up to date with what is currently in Debian and Ubuntu.

  0: https://mail.openvswitch.org/pipermail/ovs-dev/2022-June/394634.html

  Signed-off-by: Frode Nordahl <frode.nordahl@canonical.com>
  Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* odp-execute: Add ISA implementation of set_masked IPv4 action | Emma Finn | 2022-07-15 | 1 file | -0/+206
  This commit adds support for the AVX512 implementation of the ipv4_set_addrs action as well as an AVX512 implementation of updating the checksums.

  Signed-off-by: Emma Finn <emma.finn@intel.com>
  Acked-by: Sunil Pai G <sunil.pai.g@intel.com>
  Acked-by: Eelco Chaudron <echaudro@redhat.com>
  Signed-off-by: Ian Stokes <ian.stokes@intel.com>
* odp-execute: Add ISA implementation of set_masked ETH | Emma Finn | 2022-07-15 | 5 files | -22/+137
  This commit includes infrastructure changes for enabling set_masked_X actions and also adds support for the AVX512 implementation of the eth_set_addrs action.

  Signed-off-by: Emma Finn <emma.finn@intel.com>
  Acked-by: Sunil Pai G <sunil.pai.g@intel.com>
  Acked-by: Eelco Chaudron <echaudro@redhat.com>
  Signed-off-by: Ian Stokes <ian.stokes@intel.com>
* odp-execute: Add ISA implementation of push_vlan action. | Emma Finn | 2022-07-15 | 2 files | -9/+67
  This commit adds the AVX512 implementation of the push_vlan action.

  Signed-off-by: Emma Finn <emma.finn@intel.com>
  Acked-by: Sunil Pai G <sunil.pai.g@intel.com>
  Acked-by: Eelco Chaudron <echaudro@redhat.com>
  Signed-off-by: Ian Stokes <ian.stokes@intel.com>
* odp-execute: Add ISA implementation of pop_vlan action. | Emma Finn | 2022-07-15 | 4 files | -1/+226
  This commit adds the AVX512 implementation of the pop_vlan action.

  Signed-off-by: Emma Finn <emma.finn@intel.com>
  Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
  Co-authored-by: Harry van Haaren <harry.van.haaren@intel.com>
  Acked-by: Sunil Pai G <sunil.pai.g@intel.com>
  Acked-by: Eelco Chaudron <echaudro@redhat.com>
  Signed-off-by: Ian Stokes <ian.stokes@intel.com>
* odp-execute: Add ISA implementation of actions. | Emma Finn | 2022-07-15 | 9 files | -8/+99
  This commit adds the AVX512 implementation of the action functionality.

  Usage:

      $ ovs-appctl odp-execute/action-impl-set avx512

  Signed-off-by: Emma Finn <emma.finn@intel.com>
  Signed-off-by: Harry van Haaren <harry.van.haaren@intel.com>
  Co-authored-by: Harry van Haaren <harry.van.haaren@intel.com>
  Acked-by: Sunil Pai G <sunil.pai.g@intel.com>
  Acked-by: Eelco Chaudron <echaudro@redhat.com>
  Signed-off-by: Ian Stokes <ian.stokes@intel.com>
* acinclude: Add configure option to enable actions autovalidator at build time. | Kumar Amber | 2022-07-15 | 4 files | -0/+27
  This commit adds a new configure option that lets the user enable the actions autovalidator by default at build time, so that the unit tests run against it by default.

      $ ./configure --enable-actions-default-autovalidator

  Signed-off-by: Kumar Amber <kumar.amber@intel.com>
  Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
  Acked-by: Sunil Pai G <sunil.pai.g@intel.com>
  Acked-by: Eelco Chaudron <echaudro@redhat.com>
  Signed-off-by: Ian Stokes <ian.stokes@intel.com>
* odp-execute: Add command to switch action implementation. | Emma Finn | 2022-07-15 | 8 files | -0/+111
  This commit adds a new command to allow the user to switch the active action implementation at runtime.

  Usage:

      $ ovs-appctl odp-execute/action-impl-set scalar

  This commit also adds a new command to retrieve the list of available action implementations. This can be used to check which implementations of actions are available and which implementation is active at runtime.

  Usage:

      $ ovs-appctl odp-execute/action-impl-show

  Added a separate test case for the ovs-actions show/set commands:

      odp-execute - actions implementation

  Signed-off-by: Emma Finn <emma.finn@intel.com>
  Signed-off-by: Kumar Amber <kumar.amber@intel.com>
  Signed-off-by: Sunil Pai G <sunil.pai.g@intel.com>
  Co-authored-by: Kumar Amber <kumar.amber@intel.com>
  Co-authored-by: Sunil Pai G <sunil.pai.g@intel.com>
  Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
  Acked-by: Eelco Chaudron <echaudro@redhat.com>
  Signed-off-by: Ian Stokes <ian.stokes@intel.com>
* odp-execute: Add auto validation function for actions. | Emma Finn | 2022-07-15 | 5 files | -0/+134
  This commit introduces the auto-validation function, which allows users to compare the batch of packets obtained from different action implementations against the linear action implementation.

  The autovalidator function can be triggered at runtime using the following command:

      $ ovs-appctl odp-execute/action-impl-set autovalidator

  Signed-off-by: Emma Finn <emma.finn@intel.com>
  Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
  Acked-by: Sunil Pai G <sunil.pai.g@intel.com>
  Acked-by: Eelco Chaudron <echaudro@redhat.com>
  Signed-off-by: Ian Stokes <ian.stokes@intel.com>
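
The comparison described in this commit can be sketched roughly as follows. This is only an illustration in Python, not the OVS C code: the packet representation (a dict), the function names `pop_vlan_scalar`, `pop_vlan_buggy` and `autovalidate`, and the choice of pop_vlan as the action are all assumptions made for the example.

```python
import copy
from typing import Callable, Dict, List

Packet = Dict[str, int]                      # hypothetical packet model
ActionFn = Callable[[List[Packet]], None]    # an action mutates a batch in place


def pop_vlan_scalar(batch: List[Packet]) -> None:
    """Reference implementation: drop the VLAN tag from every packet."""
    for pkt in batch:
        pkt.pop("vlan", None)


def pop_vlan_buggy(batch: List[Packet]) -> None:
    """Deliberately wrong implementation: forgets the last packet."""
    for pkt in batch[:-1]:
        pkt.pop("vlan", None)


def autovalidate(reference: ActionFn, candidates: Dict[str, ActionFn],
                 batch: List[Packet]) -> List[str]:
    """Return the names of candidates whose output differs from the reference.

    Every implementation runs on its own deep copy, so the caller's batch
    is left untouched.
    """
    expected = copy.deepcopy(batch)
    reference(expected)
    failed = []
    for name, impl in candidates.items():
        actual = copy.deepcopy(batch)
        impl(actual)
        if actual != expected:
            failed.append(name)
    return failed


batch = [{"vlan": 10, "len": 64}, {"vlan": 20, "len": 128}]
failures = autovalidate(pop_vlan_scalar,
                        {"good": pop_vlan_scalar, "buggy": pop_vlan_buggy},
                        batch)
```

Running every candidate on a private copy keeps a faulty implementation from corrupting the input seen by the others.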
* odp-execute: Add function pointer for pop_vlan action. | Emma Finn | 2022-07-15 | 3 files | -7/+44
  This commit removes the pop_vlan action from the large switch and creates a separate function for batched processing. A function pointer is also added to call the new batched function for the pop_vlan action.

  Signed-off-by: Emma Finn <emma.finn@intel.com>
  Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
  Acked-by: Sunil Pai G <sunil.pai.g@intel.com>
  Acked-by: Eelco Chaudron <echaudro@redhat.com>
  Signed-off-by: Ian Stokes <ian.stokes@intel.com>
* odp-execute: Add function pointers to odp-execute for different action implementations. | Emma Finn | 2022-07-15 | 7 files | -1/+228
  This commit introduces the initial infrastructure required to allow different implementations for OvS actions. The patch introduces action function pointers which allow the user to switch between the different action implementations available. This will allow for more performance and flexibility, so the user can choose the action implementation that best suits their use case.

  Signed-off-by: Emma Finn <emma.finn@intel.com>
  Acked-by: Harry van Haaren <harry.van.haaren@intel.com>
  Acked-by: Sunil Pai G <sunil.pai.g@intel.com>
  Acked-by: Eelco Chaudron <echaudro@redhat.com>
  Signed-off-by: Ian Stokes <ian.stokes@intel.com>
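
As a rough illustration of the function-pointer idea, the sketch below keeps one table of action functions per implementation and dispatches through whichever table is active. It is written in Python for brevity; the class `ActionImpls`, its methods, and the implementation name "batched" are hypothetical and do not mirror the actual OVS structures.

```python
from typing import Callable, Dict, List, Tuple

Packet = Dict[str, int]                      # hypothetical packet model
ActionFn = Callable[[List[Packet]], None]


def pop_vlan_scalar(batch: List[Packet]) -> None:
    """Per-packet loop, standing in for the scalar implementation."""
    for pkt in batch:
        pkt.pop("vlan", None)


def pop_vlan_batched(batch: List[Packet]) -> None:
    """Whole-batch rewrite, standing in for an optimised implementation."""
    batch[:] = [{k: v for k, v in pkt.items() if k != "vlan"} for pkt in batch]


class ActionImpls:
    """Registry of named implementations, each a table of action functions."""

    def __init__(self) -> None:
        self._impls: Dict[str, Dict[str, ActionFn]] = {}
        self._active = ""

    def register(self, name: str, table: Dict[str, ActionFn]) -> None:
        self._impls[name] = table
        if not self._active:          # first registered becomes the default
            self._active = name

    def set_active(self, name: str) -> None:
        if name not in self._impls:
            raise KeyError(f"unknown action implementation: {name}")
        self._active = name

    def show(self) -> List[Tuple[str, bool]]:
        """List implementations with a flag marking the active one."""
        return [(name, name == self._active) for name in self._impls]

    def execute(self, action: str, batch: List[Packet]) -> None:
        self._impls[self._active][action](batch)


impls = ActionImpls()
impls.register("scalar", {"pop_vlan": pop_vlan_scalar})
impls.register("batched", {"pop_vlan": pop_vlan_batched})
impls.set_active("batched")

pkts = [{"vlan": 5, "len": 64}]
impls.execute("pop_vlan", pkts)
```

Callers only ever go through `execute`, so switching the active table changes behaviour without touching call sites.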
* AUTHORS: Add Jinjun Gao. | Ilya Maximets | 2022-07-14 | 1 file | -0/+1
  Jinjun Gao submitted several patches for the datapath-windows in 2019 and 2020, but wasn't added to the list of authors. Fixing that.

  Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* ovsdb/TODO: Update the list of tasks. | Ilya Maximets | 2022-07-14 | 1 file | -13/+33
  Some of the work is already done, e.g. 'diff' file format and DNS support. Added more items collected over time including relay and local_config items.

  Acked-by: Dumitru Ceara <dceara@redhat.com>
  Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* man: Fix various typos across manual pages. | Frode Nordahl | 2022-07-14 | 6 files | -11/+11
  As reported by Debian lintian.

  Signed-off-by: Frode Nordahl <frode.nordahl@canonical.com>
  Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* Fix spelling error exposed in binaries. | Frode Nordahl | 2022-07-14 | 4 files | -5/+5
  As reported by Debian lintian.

  Signed-off-by: Frode Nordahl <frode.nordahl@canonical.com>
  Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* system-dpdk: Add unit test for user configured mempools. | Kevin Traynor | 2022-07-14 | 1 file | -0/+34
  Test that user configured mempool params have been stored.

  Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
  Reviewed-by: David Marchand <david.marchand@redhat.com>
  Signed-off-by: Ian Stokes <ian.stokes@intel.com>
* system-dpdk: Split ovsdb creation and vswitchd start. | Kevin Traynor | 2022-07-14 | 1 file | -7/+20
  Splitting them allows them to be reused separately. This is useful for setting some things in ovsdb before vswitchd is started or DPDK is initialized.

  Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
  Reviewed-by: David Marchand <david.marchand@redhat.com>
  Acked-by: Sunil Pai G <sunil.pai.g@intel.com>
  Signed-off-by: Ian Stokes <ian.stokes@intel.com>
* netdev-dpdk: Add shared mempool config. | Kevin Traynor | 2022-07-14 | 6 files | -5/+195
  Mempools may currently be shared between DPDK ports based on port MTU and NUMA. With some hint from the user we can increase the sharing on MTU and hence reduce memory consumption in many cases.

  For example, a port with MTU 9000 uses a mempool with an mbuf size based on 9000 MTU. A port with MTU 1500 uses a different mempool with an mbuf size based on 1500 MTU.

  In this case, assuming the same NUMA, both these ports could share the 9000 MTU mempool.

  The user must give a hint, as the order of creation of ports and setting of MTUs may vary, and we need to ensure that upgrades from older OVS versions do not require more memory.

  This scheme can also prevent multiple mempools being created for cases where a port is added picking up a default MTU and an appropriate mempool, but later has its MTU changed to a different value requiring a different mempool.

  Example usage:

      $ ovs-vsctl --no-wait set Open_vSwitch . \
          other_config:shared-mempool-config=9000,1500:1,6000:1

  Port added on NUMA 0:
  * MTU 1500, use mempool based on 9000 MTU
  * MTU 5000, use mempool based on 9000 MTU
  * MTU 9000, use mempool based on 9000 MTU
  * MTU 9300, use mempool based on 9300 MTU (existing behaviour)

  Port added on NUMA 1:
  * MTU 1500, use mempool based on 1500 MTU
  * MTU 5000, use mempool based on 6000 MTU
  * MTU 9000, use mempool based on 9000 MTU
  * MTU 9300, use mempool based on 9300 MTU (existing behaviour)

  Default behaviour is unchanged and mempools are still only created when needed.

  Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
  Reviewed-by: David Marchand <david.marchand@redhat.com>
  Acked-by: Sunil Pai G <sunil.pai.g@intel.com>
  Signed-off-by: Ian Stokes <ian.stokes@intel.com>
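
The selection rule implied by the example above can be modelled in a few lines of Python. This is a sketch under assumptions, not the netdev-dpdk code: it assumes an entry without `:numa` applies to every NUMA node and that the smallest configured MTU that still fits the port is chosen. The function names are hypothetical. With those assumptions it reproduces the eight cases listed in the commit message.

```python
from typing import List, Optional, Tuple

# (mtu, numa) pairs; numa is None when the entry applies to all NUMA nodes.
Entry = Tuple[int, Optional[int]]


def parse_shared_mempool_config(value: str) -> List[Entry]:
    """Parse a string such as '9000,1500:1,6000:1' into (mtu, numa) pairs."""
    entries = []
    for item in value.split(","):
        mtu, _, numa = item.strip().partition(":")
        entries.append((int(mtu), int(numa) if numa else None))
    return entries


def mempool_mtu(port_mtu: int, port_numa: int, entries: List[Entry]) -> int:
    """Pick the MTU that the port's mempool would be based on.

    Assumed rule: the smallest configured MTU that is at least the port MTU
    and valid for the port's NUMA node; otherwise the port MTU itself
    (the existing behaviour).
    """
    candidates = [mtu for mtu, numa in entries
                  if (numa is None or numa == port_numa) and mtu >= port_mtu]
    return min(candidates) if candidates else port_mtu


config = parse_shared_mempool_config("9000,1500:1,6000:1")
numa0 = [mempool_mtu(mtu, 0, config) for mtu in (1500, 5000, 9000, 9300)]
numa1 = [mempool_mtu(mtu, 1, config) for mtu in (1500, 5000, 9000, 9300)]
```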
* tc: Fix misaligned access while creating pedit actions. | Ilya Maximets | 2022-07-14 | 1 file | -2/+2
  calc_offsets() function returns 'data' and 'mask' pointers, which are pointers somewhere inside struct tc_flower_key, and they are not aligned, causing misaligned memory access.

  For example: ipv6.rewrite_hlimit is at 148 byte offset inside the struct tc_flower_key. While the actual field is in the 7th byte of the IPv6 header in the actual packet. So, pedit will need to write the last byte of the [4-7] range to the actual packet. So, data pointer is positioned to 145th byte inside the tc_flower_key with the 000000FF mask. Obviously, 145th byte inside the structure is not 4-byte aligned.

    lib/tc.c:2879:34: runtime error: load of misaligned address 0x7f2802eaa321 for type 'ovs_be32' (aka 'unsigned int'), which requires 4 byte alignment
    0x7f2802eaa321: note: pointer points here
     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ...
     ^
    0 0xd7f2fb in nl_msg_put_flower_rewrite_pedits lib/tc.c:2879:34
    1 0xd7f2fb in nl_msg_put_flower_acts lib/tc.c:3141:25
    2 0xd6ae5a in nl_msg_put_flower_options lib/tc.c:3445:12
    3 0xd6a2be in tc_replace_flower lib/tc.c:3712:17
    4 0xd2bf25 in netdev_tc_flow_put lib/netdev-offload-tc.c:2224:11
    5 0x94f6b7 in netdev_flow_put lib/netdev-offload.c:316:14
    6 0xcbd19e in parse_flow_put lib/dpif-netlink.c:2289:11
    7 0xcbd19e in try_send_to_netdev lib/dpif-netlink.c:2376:15
    8 0xcbd19e in dpif_netlink_operate lib/dpif-netlink.c:2447:23
    9 0x86536e in dpif_operate lib/dpif.c:1372:13
    10 0x6bc289 in handle_upcalls ofproto/ofproto-dpif-upcall.c:1654:5
    11 0x6bc289 in recv_upcalls ofproto/ofproto-dpif-upcall.c:892:9
    12 0x6b766a in udpif_upcall_handler ofproto/ofproto-dpif-upcall.c:792:13
    13 0xb5015a in ovsthread_wrapper lib/ovs-thread.c:422:12
    14 0x7f280b2081ce in start_thread (/lib64/libpthread.so.0+0x81ce)
    15 0x7f2809e39dd2 in clone (/lib64/libc.so.6+0x39dd2)

  Fix misaligned read by using appropriate functions.

  Fixes: 8ada482bbe19 ("tc: Add header rewrite using tc pedit action")
  Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
  Acked-by: Eelco Chaudron <echaudro@redhat.com>
  Signed-off-by: Simon Horman <simon.horman@corigine.com>
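
The offset arithmetic in this message can be checked with a short calculation. The Python sketch below is only a model: `pedit_word` and `read_be32_unaligned` are hypothetical helpers, the numbers 148 and 7 come from the commit message, and the byte-wise read merely stands in for whatever unaligned-access helpers the actual fix uses.

```python
WORD = 4  # pedit rewrites the packet in 4-byte words


def pedit_word(key_offset: int, hdr_offset: int, field_len: int = 1):
    """Locate the 4-byte word that covers a header field.

    Returns (start of the word in the header, matching offset inside the
    key structure, byte mask selecting the field within the word).
    """
    word_start = hdr_offset - hdr_offset % WORD   # 7 -> 4, i.e. bytes [4-7]
    shift = hdr_offset - word_start               # field position in the word
    data_offset = key_offset - shift              # 148 - 3 = 145
    mask = bytes(0xFF if shift <= i < shift + field_len else 0x00
                 for i in range(WORD))
    return word_start, data_offset, mask


def read_be32_unaligned(buf: bytes, offset: int) -> int:
    """Assemble a big-endian 32-bit value byte by byte, so the offset
    does not need to be a multiple of four."""
    return int.from_bytes(buf[offset:offset + WORD], "big")


word_start, data_offset, mask = pedit_word(key_offset=148, hdr_offset=7)

# A stand-in for struct tc_flower_key with a hop limit of 64 at offset 148.
key = bytearray(160)
key[148] = 64
value = read_be32_unaligned(bytes(key), data_offset)
```

The data offset of 145 leaves a remainder of 1 when divided by 4, which is exactly the misalignment the sanitizer reports for a direct 32-bit load.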
* tc: Fix misaligned access to struct tcf_t for police action. | Ilya Maximets | 2022-07-14 | 1 file | -3/+3
    lib/tc.c:1334:44: runtime error: member access within misaligned address 0x6210001f5a2c for type 'const struct tcf_t', which requires 8 byte alignment
    0x6210001f5a2c: note: pointer points here
     24 00 06 00 c3 00 00 00 00 00 00 00 c3 00 00 00 ...
     ^
    0 0xd7c7ea in nl_parse_tcf lib/tc.c:1334:44
    1 0xd7bd3a in nl_parse_act_police lib/tc.c:1433:9
    2 0xd68b1a in nl_parse_single_action lib/tc.c:1922:9
    3 0xd62c7e in nl_parse_flower_actions lib/tc.c:1992:19
    4 0xd62c7e in nl_parse_flower_options lib/tc.c:2107:12
    5 0xd5fa32 in parse_netlink_to_tc_flower lib/tc.c:2155:12
    6 0xd21760 in netdev_tc_flow_dump_next lib/netdev-offload-tc.c:1158:13
    7 0x94f442 in netdev_flow_dump_next lib/netdev-offload.c:301:14
    8 0xcba2f6 in dpif_netlink_flow_dump_next lib/dpif-netlink.c:1901:20
    9 0x8665b6 in dpif_flow_dump_next lib/dpif.c:1135:9
    10 0xee5f0f in dpctl_dump_flows lib/dpctl.c:1106:12
    11 0xee27a3 in dpctl_unixctl_handler lib/dpctl.c:3035:17
    12 0xc7f78b in process_command lib/unixctl.c:310:13
    13 0xc7f78b in run_connection lib/unixctl.c:344:17
    14 0xc7f78b in unixctl_server_run lib/unixctl.c:395:21
    15 0x59acb4 in main vswitchd/ovs-vswitchd.c:130:9
    16 0x7f1be043acf2 in __libc_start_main (/lib64/libc.so.6+0x3acf2)
    17 0x47e91d in _start (vswitchd/ovs-vswitchd+0x47e91d)

  Fixes: a9b8cdde69de ("tc: Add support parsing tc police action")
  Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
  Acked-by: Eelco Chaudron <echaudro@redhat.com>
  Signed-off-by: Simon Horman <simon.horman@corigine.com>
* netdev-linux: Fix leak of a tc police get/del reply. | Ilya Maximets | 2022-07-14 | 1 file | -4/+8
    Direct leak of 64 byte(s) in 1 object(s) allocated from:
    0 0x51b1d8 in malloc (vswitchd/ovs-vswitchd+0x51b1d8)
    1 0xc81ded in xmalloc__ lib/util.c:137:15
    2 0xc81ded in xmalloc lib/util.c:172:12
    3 0xb32153 in ofpbuf_new lib/ofpbuf.c:168:24
    4 0xd563e4 in nl_sock_transact lib/netlink-socket.c:1113:34
    5 0xd56261 in nl_transact lib/netlink-socket.c:1812:13
    6 0xd5e096 in tc_transact lib/tc.c:238:17
    7 0xd01622 in tc_del_policer_action lib/netdev-linux.c:5807:13
    8 0xd2e681 in meter_tc_del_policer lib/netdev-offload-tc.c:2827:15
    9 0x94ec21 in meter_offload_del lib/netdev-offload.c:245:23
    10 0xcc86c4 in dpif_netlink_meter_del lib/dpif-netlink.c:4288:9
    11 0x86c595 in dpif_meter_del lib/dpif.c:2014:13
    12 0x663439 in free_meter_id ofproto/ofproto-dpif.c:6789:5
    13 0xb47518 in ovsrcu_call_postponed lib/ovs-rcu.c:346:13
    14 0xb48031 in ovsrcu_postpone_thread lib/ovs-rcu.c:362:14
    15 0xb5015a in ovsthread_wrapper lib/ovs-thread.c:422:12
    16 0x7f86af4081ce in start_thread (/lib64/libpthread.so.0+0x81ce)

  Fixes: 5c039ddc64ff ("netdev-linux: Add functions to manipulate tc police action")
  Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
  Acked-by: Eelco Chaudron <echaudro@redhat.com>
  Signed-off-by: Simon Horman <simon.horman@corigine.com>
* ovsdb: Add missing ovs-thread include. | Ilya Maximets | 2022-07-14 | 1 file | -0/+1
  MSVC doesn't have pthread_t defined by default as other compilers do, so the build fails without the header.

  Fixes: 3cd2cbd684e0 ("ovsdb: Prepare snapshot JSON in a separate thread.")
  Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
  Acked-by: Dumitru Ceara <dceara@redhat.com>
* ovsdb: Prepare snapshot JSON in a separate thread. | Ilya Maximets | 2022-07-13 | 10 files | -24/+206
  Conversion of the database data into a JSON object, serialization and destruction of that object are the heaviest operations during database compaction. If these operations are moved to a separate thread, the main thread can continue processing database requests in the meantime.

  With this change, the compaction is split into 3 phases:

  1. Initialization:
     - Create a copy of the database.
     - Remember the current database index.
     - Start a separate thread to convert the copy of the database into a serialized JSON object.

  2. Wait:
     - Continue normal operation until the compaction thread is done.
     - Meanwhile, the compaction thread:
       * Converts the database copy to JSON.
       * Serializes the resulting JSON.
       * Destroys the original JSON object.

  3. Finish:
     - Destroy the database copy.
     - Take the snapshot created by the thread.
     - Write it to disk.

  The key for this scheme to be fast is the ability to create a shallow copy of the database. This doesn't take too much time, allowing the thread to do most of the work.

  The database copy is created and destroyed only by the main thread, so there is no need for synchronization.

  This solution reduces the time the main thread is blocked by compaction by 80-90%. For example, in ovn-heater tests with the 120 node density-heavy scenario, where compaction normally takes 5-6 seconds at the end of a test, measured compaction times were all below 1 second with the change applied. Also, note that these measured times are the sum of phases 1 and 3, so actual poll intervals are about half a second in this case.

  Only implemented for raft storage for now. The implementation for standalone databases can be added later by using a file offset as a database index and copying newly added changes from the old file to a new one during ovsdb_log_replace().

  Reported-at: https://bugzilla.redhat.com/2069108
  Acked-by: Dumitru Ceara <dceara@redhat.com>
  Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
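
The three phases can be sketched as below. This is a simplified Python model, not the ovsdb-server implementation: the `Compactor` class and its method names are hypothetical, the database is a plain dict, and rows are assumed to be replaced rather than mutated in place, which is what makes the shallow copy safe to read from another thread here.

```python
import copy
import json
import threading
from typing import Any, Dict, Optional, Tuple


class Compactor:
    """Serialize a snapshot of `db` in a worker thread (illustrative only)."""

    def __init__(self, db: Dict[str, Any]) -> None:
        self.db = db
        self._copy: Optional[Dict[str, Any]] = None
        self._index = 0
        self._result: Optional[str] = None
        self._thread: Optional[threading.Thread] = None

    def start(self, index: int) -> None:
        """Phase 1: shallow-copy the database, remember the index, spawn."""
        self._copy = copy.copy(self.db)
        self._index = index
        self._thread = threading.Thread(target=self._serialize)
        self._thread.start()

    def _serialize(self) -> None:
        """Phase 2, worker side: the expensive conversion and serialization."""
        self._result = json.dumps(self._copy, sort_keys=True)

    def done(self) -> bool:
        """Phase 2, main side: poll while continuing normal operation."""
        return self._thread is not None and not self._thread.is_alive()

    def finish(self) -> Tuple[int, str]:
        """Phase 3: drop the copy and hand back (index, snapshot)."""
        assert self._thread is not None
        self._thread.join()
        self._copy = None
        assert self._result is not None
        return self._index, self._result


db = {"row1": {"name": "br0"}}
compactor = Compactor(db)
compactor.start(index=7)
db["row2"] = {"name": "br1"}      # main thread keeps accepting changes
index, snapshot = compactor.finish()
```

Only the main thread creates and discards the copy, so the sketch needs no lock beyond the final `join()`.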
* ovsdb: Add lazy-copy support for ovsdb_datum objects. | Ilya Maximets | 2022-07-13 | 13 files | -80/+137
  Currently ovsdb-server is using shallow copies of some JSON objects by keeping a reference counter. JSON string objects are also used directly as ovsdb atoms in database rows to avoid extra copies.

  Taking this approach one step further, ovsdb_datum objects can also be mostly deduplicated by postponing the copy until it is actually needed. The datum object itself contains a type and 2 pointers to data arrays. By adding one more pointer to a reference counter, we may create a shallow copy of the datum by simply copying the type and pointers and increasing the reference counter.

  Before modifying the datum, a special function needs to be called to perform an actual copy of the object, a.k.a. unshare it. Most of the datum modifications are performed inside the special functions in ovsdb-data.c, so that is not very hard to track. A few places like ovsdb-server.c and column mutations are accessing and changing the data directly, so a few extra unshare() calls have to be added there.

  This change doesn't affect the maximum memory consumption too much, because most of the copies are short-lived. However, not actually performing these copies saves up to 40% of CPU time on operations with large sets.

  Reported-at: https://bugzilla.redhat.com/2069089
  Acked-by: Dumitru Ceara <dceara@redhat.com>
  Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
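
A minimal copy-on-write model of the scheme described above might look like this. It is a Python sketch with hypothetical names (`Datum`, `clone`, `unshare`); the real ovsdb_datum holds a type and two data arrays, while this model keeps a single list and a shared counter cell.

```python
from typing import Iterable, List, Tuple


class Datum:
    """Copy-on-write container: clones share data until one side writes."""

    def __init__(self, keys: Iterable[int]) -> None:
        self._keys: List[int] = list(keys)
        self._refcnt = [1]            # one-element list acts as a shared cell

    def clone(self) -> "Datum":
        """Shallow copy: share the data and bump the reference counter."""
        other = Datum.__new__(Datum)
        other._keys = self._keys
        other._refcnt = self._refcnt
        self._refcnt[0] += 1
        return other

    def unshare(self) -> None:
        """Make a private copy if the data is still shared."""
        if self._refcnt[0] > 1:
            self._refcnt[0] -= 1
            self._keys = list(self._keys)
            self._refcnt = [1]

    def add(self, key: int) -> None:
        self.unshare()                # every mutation unshares first
        self._keys.append(key)

    def keys(self) -> Tuple[int, ...]:
        return tuple(self._keys)

    def shares_with(self, other: "Datum") -> bool:
        return self._keys is other._keys


a = Datum([1, 2])
b = a.clone()
shared_before = a.shares_with(b)
b.add(3)                              # triggers the real copy
shared_after = a.shares_with(b)
```

Reads never copy; only the first write after a clone pays for the duplication, which is where the savings on short-lived copies come from.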
* tests: Add check_pkt_len action test to system-offload-traffic. | Eelco Chaudron | 2022-07-13 | 2 files | -5/+416
  Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
  Acked-by: Mike Pattrick <mkp@redhat.com>
  Acked-by: Roi Dayan <roid@nvidia.com>
  Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* system-offloads-traffic: Properly initialize offload before testing. | Eelco Chaudron | 2022-07-13 | 5 files | -15/+20
  This patch will properly initialize offload, as it requires the setting to be enabled before starting ovs-vswitchd (or do a restart once configured).

  Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
  Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* netdev-offload-tc: Handle check_pkt_len datapath action. | Eelco Chaudron | 2022-07-13 | 3 files | -53/+586
  This change implements support for the check_pkt_len action using the TC police action, which supports MTU checking.

  Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
  Acked-by: Mike Pattrick <mkp@redhat.com>
  Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* netdev-offload-tc: Move flower_to_match action handling to isolated function. | Eelco Chaudron | 2022-07-13 | 1 file | -210/+230
  Move handling of the individual actions in the parse_tc_flower_to_match() function to a separate function that will make recursive action handling easier.

  Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
  Acked-by: Mike Pattrick <mkp@redhat.com>
  Acked-by: Roi Dayan <roid@nvidia.com>
  Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* netdev-offload-tc: Move flow_put action handling to isolated function. | Eelco Chaudron | 2022-07-13 | 1 file | -127/+148
  Move handling of the individual actions in the netdev_tc_flow_put() function to a separate function that will make recursive action handling easier.

  Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
  Acked-by: Mike Pattrick <mkp@redhat.com>
  Acked-by: Roi Dayan <roid@nvidia.com>
  Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* conntrack: Check for expiration before comparing the keys during the lookup. | Ilya Maximets | 2022-07-13 | 1 file | -2/+5
  This could avoid some costly key comparisons that would end in a miss anyway, especially when there are many expired connections waiting for the sweeper to evict them.

  Acked-by: Aaron Conole <aconole@redhat.com>
  Co-authored-by: Paolo Valerio <pvalerio@redhat.com>
  Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
  Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
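
The reordering can be pictured with a small lookup loop. This Python sketch is an assumption-laden model, not conntrack code: `Conn`, `conn_lookup` and the rule that a connection counts as expired once `expiration <= now` are all hypothetical choices for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Key = Tuple[str, int, str, int]   # hypothetical 4-tuple connection key


@dataclass
class Conn:
    key: Key
    expiration: int               # absolute time in ms (assumed unit)


def conn_lookup(bucket: List[Conn], key: Key, now: int) -> Optional[Conn]:
    """Return the live connection matching `key`, if any.

    The single integer comparison on the expiration time runs first, so
    the multi-field key comparison is skipped for expired entries.
    """
    for conn in bucket:
        if conn.expiration <= now:    # cheap test first
            continue
        if conn.key == key:           # costly test only for live entries
            return conn
    return None


key = ("10.0.0.1", 1234, "10.0.0.2", 80)
bucket = [Conn(key, expiration=100), Conn(key, expiration=500)]
hit = conn_lookup(bucket, key, now=200)
miss = conn_lookup(bucket, key, now=600)
```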
* conntrack: Use an atomic conn expiration value. | Gaetan Rivet | 2022-07-13 | 3 files | -14/+17
  A lock is taken during conn_lookup() to check whether a connection is expired before returning it. This lock can have some contention.

  Even though this lock ensures a consistent sequence of writes, it does not imply a specific order. A ct_clean thread taking the lock first could read a value that would be updated immediately after by a PMD waiting on the same lock, just as well as the inverse order.

  As such, the expiration time can be stale anytime it is read. In this context, using an atomic will ensure the same guarantees for either writes or reads, i.e. writes are consistent and reads are not undefined behaviour. Reading an atomic is however less costly than taking and releasing a lock.

  Signed-off-by: Gaetan Rivet <grive@u256.net>
  Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
  Acked-by: Aaron Conole <aconole@redhat.com>
  Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* conntrack: Replace timeout based expiration lists with rculists. | Gaetan Rivet | 2022-07-13 | 4 files | -183/+145
  This patch aims to replace the expiration lists as, due to the way they are used, besides being a source of contention, they have a known issue when used with non-default policies for different zones that could lead to retaining expired connections potentially for a long time.

  This patch replaces them with an array of rculists used to distribute all the newly created connections so that, during the sweeping phase, they can be scanned without locking, and the expired connections evicted while locking only during the actual removal. This reduces the contention introduced by the pushback performed at every packet update, and also solves the issue related to zones and timeout policies.

  Signed-off-by: Gaetan Rivet <grive@u256.net>
  Co-authored-by: Paolo Valerio <pvalerio@redhat.com>
  Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
  Acked-by: Aaron Conole <aconole@redhat.com>
  Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* conntrack-tp: Use a cmap to store timeout policies. | Gaetan Rivet | 2022-07-12 | 4 files | -29/+38
  Multiple lookups are done on stored timeout policies, each time blocking the global 'ct_lock'. This is usually not necessary and it should be acceptable to get policy updates slightly delayed (by one RCU sync at most). Using a CMAP reduces multiple lock taking and releasing in the connection insertion path.

  Signed-off-by: Gaetan Rivet <grive@u256.net>
  Reviewed-by: Eli Britstein <elibr@nvidia.com>
  Acked-by: William Tu <u9012063@gmail.com>
  Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
  Acked-by: Aaron Conole <aconole@redhat.com>
  Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* conntrack: Use a cmap to store zone limits. | Gaetan Rivet | 2022-07-12 | 4 files | -26/+53
  Change the data structure from hmap to cmap for zone limits. As they are shared amongst multiple conntrack users, multiple readers want to check the current zone limit state before progressing in their processing. Using a CMAP allows doing lookups without taking the global 'ct_lock', thus reducing contention.

  Signed-off-by: Gaetan Rivet <grive@u256.net>
  Reviewed-by: Eli Britstein <elibr@nvidia.com>
  Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
  Acked-by: Aaron Conole <aconole@redhat.com>
  Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* utilities/bashcomp: Fix incorrect file mode. | Frode Nordahl | 2022-07-12 | 3 files | -53/+53
  The bash completion scripts shipped with Open vSwitch currently have the executable bit set. This is problematic because the files do not start with a shebang and as such a user may end up executing them using the wrong shell.

  When installed in a system the bash shell will source these files and not execute them.

  This also triggers Debian lintian warnings [0] and defies Debian policy [1].

  0: https://lintian.debian.org/tags/executable-not-elf-or-script
  1: https://www.debian.org/doc/debian-policy/ch-files.html#scripts

  Fixes: 423ede182b65 ("utilities: Add bash command-line completion script.")
  Signed-off-by: Frode Nordahl <frode.nordahl@canonical.com>
  Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* acinclude: Fix double -Werror. | Ilya Maximets | 2022-07-12 | 2 files | -1/+5
  We're adding -Werror argument twice to every compiler invocation, if configured with --enable-Werror.

  The reason is the double expansion of the OVS_ENABLE_WERROR macro. It's called once from the top level in configure.ac and the second time from the AC_REQUIRE while checking CXX compatibility.

  AC_REQUIRE by itself protects from double expansion, but it can't protect from top level calls and it can not be used outside of AC_DEFUN.

  One way to fix that is to use AC_DEFUN_ONCE for OVS_ENABLE_WERROR, but it's not available in older autoconf < 2.64. So, creating a separate macro with AC_REQUIRE inside for the top level invocation to make it expanded only once.

  Acked-by: Ales Musil <amusil@redhat.com>
  Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* ovsdb: Enable memory trimming after compaction by default. | Ilya Maximets | 2022-07-12 | 2 files | -1/+4
  Memory trimming was introduced in OVS 2.15 and hasn't caused any issues in production environments since then, while allowing ovsdb-server to consume a lot less memory in high scale OVN deployments. Enabling by default to make it easier to use.

  Acked-by: Dumitru Ceara <dceara@redhat.com>
  Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* tests: Add test for later IPv6 fragments nw_proto=44. | Rosemarie O'Riorden | 2022-07-12 | 1 file | -0/+34
  This is a regression test to make sure that all later IPv6 fragments have proto=44 in the flow key, and that there are not any later IPv6 frag flows that do not have it.

  Previously, the way that later IPv6 fragments' nw_proto field is parsed in the kernel was changed to equal the next_header field of the last extension header. The same change was not made in OVS userspace. This was a problem because OVS creates actions based on what is parsed in userspace, but the kernel-provided flow key is used as the match criteria. This led to issues such as packets incorrectly matching on a flow and thus the wrong list of actions being applied to the packet. Therefore, OVS and the kernel must parse this field the same way to prevent this issue.

  OVS and the kernel both currently parse this field the same way for later IPv6 fragments, with nw_proto=44.

  Signed-off-by: Rosemarie O'Riorden <roriorden@redhat.com>
  Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* meta-flow: Document nw_proto limitation for IPv6 later frags. | Paolo Valerio | 2022-07-12 | 1 file | -0/+9
  Signed-off-by: Paolo Valerio <pvalerio@redhat.com>
  Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* dpif-avx512: Add support for simple match lookup. | Cian Ferriter | 2022-07-12 | 1 file | -3/+50
  Perform scalar simple match lookup in AVX512 DPIF by reusing the simple match lookup functions. The simple match lookup is placed in a separate per packet loop before the batch miniflow extract call since miniflow extract can be skipped when simple match is being used.

  Unsuccessful lookup during simple match lookup means an upcall is required because there is no suitable flow in the datapath. Fall back to the scalar DPIF to do this upcall just like we already do later in AVX512 DPIF when we have misses in the DPCLS.

  Signed-off-by: Cian Ferriter <cian.ferriter@intel.com>
  Tested-by: Harry van Haaren <harry.van.haaren@intel.com>
  Acked-by: Sunil Pai G <sunil.pai.g@intel.com>
  Signed-off-by: Ian Stokes <ian.stokes@intel.com>
* dpif-netdev: Refactor simple match lookup functions. | Cian Ferriter | 2022-07-12 | 2 files | -8/+15
  Make the simple match functions used during lookup non-static to allow reuse of these functions in the AVX512 DPIF.

  Signed-off-by: Cian Ferriter <cian.ferriter@intel.com>
  Tested-by: Harry van Haaren <harry.van.haaren@intel.com>
  Acked-by: Sunil Pai G <sunil.pai.g@intel.com>
  Signed-off-by: Ian Stokes <ian.stokes@intel.com>
* doc: Add meter offload topic document | Jianbo Liu | 2022-07-11 | 3 files | -0/+116
  For now, add an introduction and the limitations of meter offload.

  Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
  Acked-by: Eelco Chaudron <echaudro@redhat.com>
  Signed-off-by: Simon Horman <simon.horman@corigine.com>
* dpif-netlink: Offloading meter to tc police action | Jianbo Liu | 2022-07-11 | 3 files | -5/+126
  OVS meters are created in advance and OpenFlow rules refer to them by their unique ID. The new tc_police API is used to offload them. By calling the API, police actions are created and meters are mapped to them. These actions can then be used in tc filter rules by their index.

  Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
  Acked-by: Eelco Chaudron <echaudro@redhat.com>
  Signed-off-by: Simon Horman <simon.horman@corigine.com>
* netdev-offload-tc: Offloading rules with police actions | Jianbo Liu | 2022-07-11 | 2 files | -2/+65
  When offloading a rule, tc should be filled with the police index instead of the meter id. As a meter is mapped to a police action, and the mapping is inserted into the meter_id_to_police_idx hmap, this hmap is used to find the police index. In addition, the reverse mapping from police index to meter id is also added, so the meter id can be retrieved from this hashmap and passed to dpif while dumping rules.

  Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
  Acked-by: Eelco Chaudron <echaudro@redhat.com>
  Signed-off-by: Simon Horman <simon.horman@corigine.com>
* netdev-offload-tc: Cleanup police actions with reserved indexes on startup | Jianbo Liu | 2022-07-11 | 3 files | -0/+148
  As the police actions with indexes of 0x10000000-0x1fffffff are reserved for meter offload, to provide a clean environment for OVS, these reserved police actions should be deleted on startup. So dump all the police actions and delete those with indexes in this range.

  Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
  Acked-by: Eelco Chaudron <echaudro@redhat.com>
  Signed-off-by: Simon Horman <simon.horman@corigine.com>
* netdev-offload-tc: Implement meter offload API for tc | Jianbo Liu | 2022-07-11 | 1 file | -0/+202
  For dpif-netlink, meters are mapped by tc to police actions with a one-to-one relationship. Implement the meter offload API to set/get/del the police action, and use a hmap to save the mappings. An id-pool is used to manage all the available police indexes, which are 0x10000000-0x1fffffff, reserved only for OVS.

  Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
  Acked-by: Eelco Chaudron <echaudro@redhat.com>
  Signed-off-by: Simon Horman <simon.horman@corigine.com>
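
The one-to-one mapping and the reserved index range described in these meter offload commits could be modelled as follows. This is a hedged Python sketch rather than the netdev-offload-tc code: `MeterPoliceMap`, its methods, and the free-list allocator are hypothetical stand-ins for the hmap and id-pool mentioned in the messages; only the range 0x10000000-0x1fffffff comes from the source.

```python
from typing import Dict, List, Optional

POLICE_IDX_BASE = 0x10000000      # reserved range from the commit message
POLICE_IDX_MAX = 0x1FFFFFFF


def is_reserved(police_idx: int) -> bool:
    """True for indexes a startup cleanup pass would delete."""
    return POLICE_IDX_BASE <= police_idx <= POLICE_IDX_MAX


class MeterPoliceMap:
    """Forward and reverse meter id <-> police index maps with an id pool."""

    def __init__(self) -> None:
        self._next = POLICE_IDX_BASE
        self._free: List[int] = []
        self._meter_to_police: Dict[int, int] = {}
        self._police_to_meter: Dict[int, int] = {}

    def set(self, meter_id: int) -> int:
        """Map a meter to a police index, allocating one if needed."""
        if meter_id in self._meter_to_police:
            return self._meter_to_police[meter_id]
        if self._free:
            idx = self._free.pop()            # reuse a released index
        elif self._next <= POLICE_IDX_MAX:
            idx = self._next
            self._next += 1
        else:
            raise RuntimeError("no free police index")
        self._meter_to_police[meter_id] = idx
        self._police_to_meter[idx] = meter_id
        return idx

    def get(self, meter_id: int) -> Optional[int]:
        return self._meter_to_police.get(meter_id)

    def meter_for(self, police_idx: int) -> Optional[int]:
        """Reverse lookup, e.g. when dumping rules."""
        return self._police_to_meter.get(police_idx)

    def delete(self, meter_id: int) -> None:
        idx = self._meter_to_police.pop(meter_id, None)
        if idx is not None:
            del self._police_to_meter[idx]
            self._free.append(idx)


meters = MeterPoliceMap()
first = meters.set(1)
second = meters.set(2)
meters.delete(1)
reused = meters.set(3)
```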