summaryrefslogtreecommitdiff
Commit message (Collapse)AuthorAgeFilesLines
...
* ovs-router: Introduce src option in ovs/route/add command.Nobuhiro MIKI2023-03-074-13/+161
| | | | | | | | | | | | | | | | | | | When adding a route with ovs/route/add command, the source address in "ovs_router_entry" structure is always the FIRST address that the interface has. See "ovs_router_get_netdev_source_address" function for more information. If an interface has multiple ipv4 and/or ipv6 addresses, there are use cases where the user wants to control the source address. This patch therefore addresses this issue by adding a src parameter. Note that same constraints also exist when caching routes from Kernel FIB with Netlink, but are not dealt with in this patch. Acked-by: Eelco Chaudron <echaudro@redhat.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Nobuhiro MIKI <nmiki@yahoo-corp.jp> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* ofproto: Fix man page for tunnel related commands.Nobuhiro MIKI2023-03-073-8/+8
| | | | | | | | | | | Fixed the manual page to indicate that both IPv4/IPv6 are supported. Also added missing pkt_mark on one side and fixed the "gw" and "bridge" notation quirks. Acked-by: Eelco Chaudron <echaudro@redhat.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Nobuhiro MIKI <nmiki@yahoo-corp.jp> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* ovs-router: Cleanup parser for ovs/route/add command.Nobuhiro MIKI2023-03-072-28/+52
| | | | | | | | | | | | | | This patch cleans up the parser to accept pkt_mark and gw in any order. pkt_mark and gw are normally expected to be specified exactly once. However, as with other tools, if specified multiple times, the last specification is used. Also, pkt_mark and gw have separate prefix strings so they can be parsed in any order. Acked-by: Eelco Chaudron <echaudro@redhat.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Nobuhiro MIKI <nmiki@yahoo-corp.jp> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* netdev-dummy: Support multiple IP addresses.Nobuhiro MIKI2023-03-071-26/+44
| | | | | | | | | | This is useful in test cases where multiple IPv4/IPv6 addresses are assigned together. Acked-by: Eelco Chaudron <echaudro@redhat.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Nobuhiro MIKI <nmiki@yahoo-corp.jp> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* AUTHORS: Add Fangrui Song.Ilya Maximets2023-03-061-0/+1
| | | | Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* treewide: Remove uses of ATOMIC_VAR_INIT.Fangrui Song2023-03-0616-28/+19
| | | | | | | | | | ATOMIC_VAR_INIT has a trivial definition `#define ATOMIC_VAR_INIT(value) (value)`, is deprecated in C17/C++20, and will be removed in newer standards in newer GCC/Clang (e.g. https://reviews.llvm.org/D144196). Signed-off-by: Fangrui Song <maskray@google.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* netdev-windows: Add checking when creating netdev with system type on WindowsWilson Peng2023-03-061-0/+11
| | | | | | | | | | | | | | | | | | | | In the recent Antrea project testing, some port could not be created on Windows. When doing debug, our team found there is one case happening when multiple ports are waiting for be created with correct port number. Some system type port will be created netdev successfully and it will cause conflict as in the dpif side it will be internal type. So finally the port will be created failed and it could not be easily recovered. With the patch, on Windows the netdev creating will be blocked for system type when the ovs_tyep got on dpif is internal. More detailed case description is in the reported issue No.262 with link below. Reported-at:https://github.com/openvswitch/ovs-issues/issues/262 Signed-off-by: Wilson Peng <pweisong@vmware.com> Signed-off-by: Alin Gabriel Serdean <aserdean@ovn.org>
* ofproto-dpif-upcall: Include hardware offloaded flows in total flows.Eelco Chaudron2023-03-031-0/+11
| | | | | | | | | | | | | | | | The revalidator process uses the internal call udpif_get_n_flows() to get the total number of flows installed in the system. It uses this value for various decisions on flow installation and removal. With the tc offload this values is incorrect, as the hardware offloaded are not included. With rte_flow offload this is not a problem as dpif netdev keeps both in sync. This patch will include the hardware offloaded flows if the underlying dpif implementation is not syncing them. Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* ofproto-dpif-upcall: Reset ukey's last stats value if the datapath changed.Eelco Chaudron2023-03-037-2/+116
| | | | | | | | | | | | | When the ukey's action set changes, it could cause the flow to use a different datapath, for example, when it moves from tc to kernel. This will cause the the cached previous datapath statistics to be used. This change will reset the cached statistics when a change in datapath is discovered. Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* classifier: Fix missing masks on a final stage with ports trie.Ilya Maximets2023-02-282-5/+108
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Flow lookup doesn't include masks of the final stage in a resulting flow wildcards in case that stage had L4 ports match. Only the result of ports trie lookup is added to the mask. It might be sufficient in many cases, but it's not correct, because ports trie is not how we decided that the packet didn't match in this subtable. In fact, we used a full subtable mask in order to determine that, so all the subtable mask bits has to be added. Ports trie can still be used to adjust ports' mask, but it is not sufficient to determine that the packet didn't match. Assuming we have following 2 OpenFlow rules on the bridge: table=0, priority=10,tcp,tp_dst=80,tcp_flags=+psh actions=drop table=0, priority=0 actions=output(1) The first high priority rule supposed to drop all the TCP data traffic sent on port 80. The handshake, however, is allowed for forwarding. Both 'tcp_flags' and 'tp_dst' are on the final stage in the flow. Since the stage mask from that stage is not incorporated into the flow wildcards and only ports mask is getting updated, we have the following megaflow for the SYN packet that has no match on 'tcp_flags': $ ovs-appctl ofproto/trace br0 "in_port=br0,tcp,tp_dst=80,tcp_flags=syn" Megaflow: recirc_id=0,eth,tcp,in_port=LOCAL,nw_frag=no,tp_dst=80 Datapath actions: 1 If this flow is getting installed into datapath flow table, all the packets for port 80, regardless of TCP flags, will be forwarded. Incorporating all the looked at bits from the final stage into the stages map in order to get all the necessary wildcards. Ports mask has to be updated as a last step, because it doesn't cover the full 64-bit slot in the flowmap. With this change, in the example above, OVS is producing correct flow wildcards including match on TCP flags: Megaflow: recirc_id=0,eth,tcp,in_port=LOCAL,nw_frag=no,tp_dst=80,tcp_flags=-psh Datapath actions: 1 This way only -psh packets will be forwarded, as expected. This issue affects all other fields on stage 4, not only TCP flags. Tests included to cover tcp_flags, nd_target and ct_tp_src/dst. First two are frequently used, ct ones are sharing the same flowmap slot with L4 ports, so important to test. Before the pre-computation of stage masks, flow wildcards were updated during lookup, so there was no issue. The bits of the final stage was lost with introduction of 'stages_map'. Recent adjustment of segment boundaries exposed 'tcp_flags' to the issue. Reported-at: https://github.com/openvswitch/ovs-issues/issues/272 Fixes: ca44218515f0 ("classifier: Adjust segment boundary to execute prerequisite processing.") Fixes: fa2fdbf8d0c1 ("classifier: Pre-compute stage masks.") Acked-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* system-traffic.at: Add icmp error tests while dnatting address and port.Paolo Valerio2023-02-281-0/+74
| | | | | | | | | | | The two tests verify, for both icmp and icmpv6, that the correct port translation happen in the inner packet in the case an error is received in the reply direction. Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Paolo Valerio <pvalerio@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* test: Move check for tc ingress pps support to test script.Simon Horman2023-02-282-13/+12
| | | | | | | | | | | | | | | | | | | | | | | | | Move check for tc ingress pps support to from aclocal to test script This has several problems: 1. Stderror from failing commands is output when executing various make targets. 2. There are various failure conditions that lead to veth0 and veth1 being created by not cleaned up. 3. The check seems to execute for many make targets. And it attempts to temporarily modify system state. This seems inappropriate. 4. veth0 and veth1 seem far too generic and could easily conflict with other parts of the system. All these problems are addressed by this patch. Signed-off-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Louis Peens <louis.peens@corigine.com> Acked-by: Ilya Maximets <i.maximets@ovn.org>
* ipfix: Make template and stats interval configurable.Adrian Moreno2023-02-276-12/+88
| | | | | | | | | Add options to the IPFIX table configure the interval to send statistics and template information. Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Adrian Moreno <amorenoz@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* ofproto: Fix re-creation of tunnel backing interfaces on restart.Ilya Maximets2023-02-272-19/+68
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Tunnel OpenFlow ports do not exist in the datapath, instead there is a tunnel backing interface that serves all the tunnels of the same type. For example, if the geneve port 'my_tunnel' is added to OVS, it will create 'geneve_sys_6041' datapath port, if it doesn't already exist, and use this port as a tunnel output. However, while creating/opening a new datapath after re-start, ovs-vswitchd only has a list of names of OpenFlow interfaces. And it thinks that each datapath port, that is not on the list, is a stale port that needs to be removed. This is obviously not correct for tunnel backing interfaces that can serve multiple tunnel ports and do not match OpenFlow port names. This is causing removal and re-creation of all the tunnel backing interfaces in the datapath on OVS restart, causing disruption in existing connections. It's hard to tell by only having a name of the interface if this interface is a tunnel backing interface, or someone just named a normal interface this way. So, instead of trying to determine that, not removing any interfaces at all, while we don't know types of actual ports we need. Assuming that all the ports that are currently not in the list of OF ports are tunnel backing ports. Later, revalidation of tunnel backing ports in type_run() will determine which ports are still needed and which should be removed. It's OK to add even a non-tunnel stale ports into tnl_backers, they will be cleaned up the same way as stale tunnel backers. Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2023-February/052215.html Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* AUTHORS: Add Liang Mancang and Viacheslav Galaktionov.Ilya Maximets2023-02-211-0/+2
| | | | Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* ofproto: Include flow cookies in bridge/dump-flows output.Viacheslav Galaktionov2023-02-211-0/+4
| | | | | | | | Cookies are an important part of flow descriptions and must be available to the end user. Signed-off-by: Viacheslav Galaktionov <viacheslav.galaktionov@arknetworks.am> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* conntrack: Fix conntrack_clean may access the same exp_list each time.Liang Mancang2023-02-211-2/+2
| | | | | | | | | | | | | when a exp_list contains more than the clean_end's number of nodes, and these nodes will not expire immediately. Then, every times we call conntrack_clean, it use the same next_sweep to get exp_list. Actually, we should add i every times after we call ct_sweep. Fixes: 3d9c1b855a5f ("conntrack: Replace timeout based expiration lists with rculists.") Acked-by: Paolo Valerio <pvalerio@redhat.com> Signed-off-by: Liang Mancang <liangmc1@chinatelecom.cn> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* ovs-actions: Correct typo in ovs-actions man page.Mike Pattrick2023-02-211-1/+1
| | | | | | | There was a minor typo in the ovs-actions man page. Signed-off-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* ofproto-ipfix: Use per-domain template timeouts.Adrian Moreno2023-02-211-24/+105
| | | | | | | | | | | | | | | | | IPFIX templates have to be sent for each Observation Domain ID. Currently, a timer is kept at each dpif_ipfix_exporter to send them. This works fine for per-bridge sampling where there is only one Observation Domain ID per exporter. However, this is does not work for per-flow sampling where more than one Observation Domain IDs can be specified by the controller. In this case, ovs-vswitchd will only send template information for one (arbitrary) DomainID. Fix per-flow sampling by using an hmap to keep a timer for each Observation Domain ID. Signed-off-by: Adrian Moreno <amorenoz@redhat.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* MAINTAINERS: Move myself to emeritus status.Russell Bryant2023-02-211-2/+2
| | | | | | | | I have not been active in OVS development in long enough that I should move to emeritus status. Signed-off-by: Russell Bryant <russell@ovn.org> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* AUTHORS: Add Miika Petäjäniemi.Ilya Maximets2023-02-211-0/+1
| | | | Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* netdev-linux: Add jitter parameter to the netem qos options.Miika Petäjäniemi2023-02-213-8/+24
| | | | | | | | Adds jitter option to enable emulating latency fluctuation with netem. Submitted-at: https://github.com/openvswitch/ovs/pull/407 Signed-off-by: Miika Petäjäniemi <miika.petajaniemi@solita.fi> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* AUTHORS: Add Vladislav Odintsov.Ilya Maximets2023-02-201-0/+1
| | | | Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* utilities: Add support to set umask in ovs-ctl.Vladislav Odintsov2023-02-203-7/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds new ovs-ctl options to pass umask configuration to allow OVS daemons set requested socket permissions on group. Previous behaviour (if using with systemd service unit) created sockets with 0750 permissions mask (group has no write permission). Write permission for group is reasonable in usecase, where ovs-vswitchd or ovsdb-server runs as a non-privileged user:group (say, openvswitch:openvswitch) and it is needed to access unix socket from process running as another non-privileged user. In this case administrator has to add that user to openvswitch group and can connect to OVS sockets from a process running under that user. Two new ovs-ctl options --ovsdb-server-umask and --ovs-vswitchd-umask were added to manage umask values for appropriate daemons. This is useful for systemd users: both ovs-vswitchd and ovsdb-server systemd units read options from single /etc/sysconfig/openvswitch configuration file. So, with separate options it is possible to set umask only for specific daemon. OPTIONS="--ovsdb-server-umask=0002" in /etc/openvswitch/sysconfig file will set umask to 0002 value before starting only ovsdb-server, while OPTIONS="--ovs-vswitchd-umask=0002" will set umask to ovs-vswitchd daemon. Previous behaviour (not setting umask) is left as default. Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2023-January/401501.html Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Vladislav Odintsov <odivlad@gmail.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* Set release date for 3.1.0.Ilya Maximets2023-02-162-2/+2
| | | | | Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* test: Remove duplicate test from system-offloads-traffic.at.Eelco Chaudron2023-02-151-64/+0
| | | | | | | | | Remove the "offloads - simulated flow action update" test case, as it's covered by the "datapath - simulated flow action update" test. Fixes: b1f58f5072d6 ("netdev-offload-tc: Preserve tc statistics when flow gets modified.") Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* ofproto-dpif-upcall: Use last known stats ukey stats on revalidate missed dp ↵Eelco Chaudron2023-02-151-1/+1
| | | | | | | | | | | | | | | flows. Instead of using all zero stats when executing a revalidate for missed dp flows, use the last known stats to avoid odd statistics being used. As these zero stats are stored in the ukey, the next time revalidate_ukey() is called the delta between the new stats and the zero stats is used, which would cause an additional increase in total packets/bytes. Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Michael Santana <msantana@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* conntrack: Properly unNAT inner header of related traffic.Ales Musil2023-02-132-163/+196
| | | | | | | | | | | The inner header was not handled properly. Simplify the code which allows proper handling of the inner headers. Reported-at: https://bugzilla.redhat.com/2137754 Acked-by: Paolo Valerio <pvalerio@redhat.com> Signed-off-by: Ales Musil <amusil@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* sparse: Fix build with DPDK and GCC 12.David Marchand2023-02-101-4/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | rte_vect.h pulls some AVX512 instrinsics headers added in GCC 12 [1] trigger a lot of warnings: libtool: compile: env "REAL_CC=ccache gcc" "CHECK=sparse -Wsparse-error -I ../include/sparse -I ../include -m64 -I /usr/local/include " cgcc -target=x86_64 -target=host_os_specs -D__MMX__=1 -D__MMX_WITH_SSE__=1 -D__SSE2_MATH__=1 -D__SSE_MATH__=1 -D__SSE__=1 -D__SSE2__=1 -DHAVE_CONFIG_H -I. -I.. -I ../include -I ./include -I ../lib -I ./lib -Wstrict-prototypes -Wall -Wextra -Wno-sign-compare -Wpointer-arith -Wformat -Wformat-security -Wswitch-enum -Wunused-parameter -Wbad-function-cast -Wcast-align -Wstrict-prototypes -Wold-style-definition -Wmissing-prototypes -Wmissing-field-initializers -fno-strict-aliasing -Wswitch-bool -Wlogical-not-parentheses -Wsizeof-array-argument -Wbool-compare -Wshift-negative-value -Wduplicated-cond -Wshadow -Wmultistatement-macros -Wcast-align=strict -mssse3 -I/home/dmarchan/git/pub/dpdk.org/22.11/install/include -include rte_config.h -I/usr/local/include -Werror -D_FILE_OFFSET_BITS=64 -g -O2 -MT lib/bfd.lo -MD -MP -MF lib/.deps/bfd.Tpo -c ../lib/bfd.c -o lib/bfd.o ../lib/bfd.c: note: in included file (through /usr/lib/gcc/x86_64-redhat-linux/12//include/immintrin.h, /usr/lib/gcc/x86_64-redhat-linux/12//include/x86intrin.h, ...): /usr/lib/gcc/x86_64-redhat-linux/12//include/avx512fp16intrin.h:38:9: error: '_Float16' has implicit type /usr/lib/gcc/x86_64-redhat-linux/12//include/avx512fp16intrin.h:38:18: error: Expected ; at end of declaration /usr/lib/gcc/x86_64-redhat-linux/12//include/avx512fp16intrin.h:38:18: error: got __v8hf /usr/lib/gcc/x86_64-redhat-linux/12//include/avx512fp16intrin.h:62:41: error: Expected ; at end of statement /usr/lib/gcc/x86_64-redhat-linux/12//include/avx512fp16intrin.h:62:41: error: got { /usr/lib/gcc/x86_64-redhat-linux/12//include/avx512fp16intrin.h:420:32: error: Expected ) in expression /usr/lib/gcc/x86_64-redhat-linux/12//include/avx512fp16intrin.h:420:32: error: got __A /usr/lib/gcc/x86_64-redhat-linux/12//include/avx512fp16intrin.h:2271:61: error: Expected ) in function call /usr/lib/gcc/x86_64-redhat-linux/12//include/avx512fp16intrin.h:2271:61: error: got __A /usr/lib/gcc/x86_64-redhat-linux/12//include/avx512fp16intrin.h:2279:61: error: Expected ) in function call /usr/lib/gcc/x86_64-redhat-linux/12//include/avx512fp16intrin.h:2279:61: error: got __A /usr/lib/gcc/x86_64-redhat-linux/12//include/avx512fp16intrin.h:2328:50: error: Expected ) in function call [...] Besides, the list of headers by rte_memcpy.h is now out of sync with DPDK. OVS takes care to include the right headers in its sources. Simply make this header self-sufficient. 1: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=a68412117fa4 Signed-off-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* ovsdb-server: Fix handling of DNS name for listener configuration.Frode Nordahl2023-02-103-20/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 08e9e5337383 fixed proper initialization of the dns-resolve module, and made DNS resolution asynchronous. A side effect of that change revealed a long standing logic bug which broke ovsdb-server listener configuration using DNS names. Previously this worked because the DNS resolution would block, now that DNS resolution is asynchronous the code before this change would assume the error from jsonrpc_pstream_open meant the remote was a specification for an active outgoing connection, even when that was not the case. To fix this a couple of changes was made to socket-util: 1) Pass optional result of dns resolution from inet_parse_passive. When (re-)configuring listeners that use DNS names, we may need to know whether the provided connection string is invalid or if the provided DNS name has finished resolving. 2) Check dns resolution status in inet_open_passive. If the connection string is valid, and contains a DNS name, inet_open_passive will now return -EAGAIN if dns resolution failed. DNS resolution failure may either mean the asynchronous resolver has not completed yet, or that the name does not resolve. Reported-at: https://bugs.launchpad.net/bugs/1998781 Fixes: 08e9e5337383 ("ovsdb: raft: Fix inability to read the database with DNS host names.") Fixes: 771680d96fb6 ("DNS: Add basic support for asynchronous DNS resolving") Signed-off-by: Frode Nordahl <frode.nordahl@canonical.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* tests: Comment currently failing TC system-traffic tests.Eelco Chaudron2023-02-092-0/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | I commented the three remaining failures when running tc with the system-traffic tests. In addition I ran the following test to verify we did not see any failures with recheck enabled: for i in {1..50}; do make check-offloads || \ make check-offloads TESTSUITEFLAGS="--recheck" || break; \ echo "ALL_50_OK: $i"; done; Unfortunately, a bunch of test cases showed occasional failures. For now, they are excluded from the test cases and need further investigation. They are: datapath - truncate and output to gre tunnel datapath - truncate and output to gre tunnel by simulated packets These tests where executed on a Fedora37 machine with the kernel 6.1.5-200.fc37.x86_64 installed. Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* tests: Fix reading of OpenFlow byte counters in GRE test cases.Eelco Chaudron2023-02-091-11/+7
| | | | | | | | | | | With some datapaths, read TC, it takes a bit longer to update the OpenFlow statistics. Rather than adding an additional delay, try to read the counters multiple times until we get the desired value. Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* netdev-offload-tc: If the flow has not been used, report it as such.Eelco Chaudron2023-02-092-2/+13
| | | | | | | | | | | | | | | If a tc flow was installed but has not yet been used, report it as such. In addition, add a delay to the "IGMP - flood under normal action" test case to make it work with many repetitions. This delay is also present in other ICMP/IGMP tests. Fixes: f98e418fbdb6 ("tc: Add tc flower functions") Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* odp-util: Make odp_flow_key_from_flow__ nlattr order the same as the kernel.Eelco Chaudron2023-02-0912-106/+102
| | | | | | | | | | | | | | Make the order of the Netlink attributes for odp_flow_key_from_flow__() the same as the kernel will return them. This will make sure the attributes displayed in the dpctl/dump-flows output appear in the same order for all datapath. Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* test: Fix 'conntrack - Multiple ICMP traverse' for tc case.Eelco Chaudron2023-02-091-2/+1
| | | | | | | | | | | | | | tc does not include ethernet header length in packet byte count. This fix will allow the packets that go trough tc to be 14 bytes less. This difference in the TC implementation is already described in tc-offload.rst. Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* test: Tc does not support conntrack timeout, skip the related test.Eelco Chaudron2023-02-092-1/+6
| | | | | | | | | | | | | | The tc conntrack implementation does not support the timeout option. The current implementation is silently ignoring the timeout option by adding a general conntrack entry. This patch will skip the related test by overriding the support macro. Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* netdev-offload-tc: Conntrack ALGs are not supported with tc.Eelco Chaudron2023-02-094-15/+21
| | | | | | | | | | | | | | | tc does not support conntrack ALGs. Even worse, with tc enabled, they should not be used/configured at all. This is because even though TC will ignore the rules with ALG configured, i.e., they will flow through the kernel module, return traffic might flow through a tc conntrack rule, and it will not invoke the ALG helper. Fixes: 576126a931cd ("netdev-offload-tc: Add conntrack support") Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* test: Flush datapath when changing rules on the fly.Eelco Chaudron2023-02-091-1/+6
| | | | | | | | | | | Flush datapath flows as TC flows take some more time to be flushed out. The flush speeds this up. Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* netdev-offload-tc: Fix tc conntrack force commit support.Eelco Chaudron2023-02-091-2/+11
| | | | | | | | | | | | tc was not setting the OVS_CT_ATTR_FORCE_COMMIT flag when a forced commit was requested. This patch will fix this. Fixes: 576126a931cd ("netdev-offload-tc: Add conntrack support") Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* test: Do not use MPLS implicit null label in test cases.Eelco Chaudron2023-02-081-6/+4
| | | | | | | | | | | TC flower does not allow the push of the implicit null labels (RFC3032). Avoid the use of such labels in the MPLS test cases. Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* tests: Include working system-traffic tests into the system-offloads-testsuite.Eelco Chaudron2023-02-087-2/+79
| | | | | | | | | | | | | | | | Include and run the system-traffic.at tests as part of the system offload testsuite. Exclude all the tests that will not run without any special modifications. Lowered log level for "recirc_id sharing not supported" message, so tests will not fail with older kernels. This is not an error level message, but should be debug, like all other, EOPNOTSUPP, related log messages. Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Tested-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* ofproto-dpif-upcall: New ukey needs to take the old ukey's dump seq.Peng He2023-02-071-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The userspace datapath manages all the magaflows by a cmap. The cmap data structure will grow/shrink during the datapath processing and it will re-position megaflows. This might result in two revalidator threads might process a same megaflow during one dump stage. Consider a situation that, revalidator 1 processes a megaflow A, and decides to delete it from the datapath, at the mean time, this megaflow A is also queued in the process batch of revalidator 2. Normally it's ok for revalidators to process the same megaflow multiple times, as the dump_seq shows it's already dumped and the stats will not be contributed twice. Assume that right after A is deleted, a PMD thread generates again a new megaflow B which has the same match and action of A. The ukey of megaflow B will replace the one of megaflow A. Now the ukey B is new to the revalidator system and its dump seq is 0. Now since the dump seq of ukey B is 0, when processing megaflow A, the revalidator 2 will not identify this megaflow A has already been dumped by revalidator 1 and will contribute the old megaflow A's stats again, this results in an inconsistent stats between ukeys and megaflows. To fix this, the newly generated the ukey B should take the dump_seq of the replaced ukey A to avoid a same megaflow being revalidated twice in one dump stage. We observe in the production environment, the OpenFlow rules' stats sometimes are amplified compared to the actual value. Signed-off-by: Peng He <hepeng.0320@bytedance.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* hash: Avoid 64bit crc intrinsics on 32bit aligned data.Mike Pattrick2023-02-031-1/+64
| | | | | | | | | | | | | | | UB Sanitizer report: lib/hash.h:219:17: runtime error: load of misaligned address 0x7ffc164a88b4 for type 'const uint64_t', which requires 8 byte alignment #0 in hash_words_inline lib/hash.h:219 #1 in hash_words lib/hash.h:297 [...] Signed-off-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* dp-packet: Allocate on cacheline boundary with DPDK.Mike Pattrick2023-02-032-0/+8
| | | | | | | | | | | | | | | | | UB Sanitizer report: lib/dp-packet.h:587:22: runtime error: member access within misaligned address 0x000001ecde10 for type 'struct dp_packet', which requires 64 byte alignment #0 in dp_packet_set_base lib/dp-packet.h:587 #1 in dp_packet_use__ lib/dp-packet.c:46 #2 in dp_packet_use lib/dp-packet.c:60 #3 in dp_packet_init lib/dp-packet.c:126 #4 in dp_packet_new lib/dp-packet.c:150 [...] Signed-off-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* netdev-dummy: Allocate dummy_packet_stream on cacheline boundary.Mike Pattrick2023-02-031-5/+5
| | | | | | | | | | | | | | | | UB Sanitizer report: lib/netdev-dummy.c:197:15: runtime error: member access within misaligned address 0x00000217a7f0 for type 'struct dummy_packet_stream', which requires 64 byte alignment ^ #0 dummy_packet_stream_init lib/netdev-dummy.c:197 #1 dummy_packet_stream_create lib/netdev-dummy.c:208 #2 dummy_packet_conn_set_config lib/netdev-dummy.c:436 [...] Signed-off-by: Mike Pattrick <mkp@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* netdev-offload-tc: Preserve tc statistics when flow gets modified.Eelco Chaudron2023-02-034-23/+218
| | | | | | | | | | | | | | | When a flow gets modified, i.e. the actions are changes, the tc layer will remove, and re-add the flow. This is causing all the counters to be reset. This patch will remember the previous tc counters and adjust any requests for statistics. This is done in a similar way as the rte_flow implementation. It also updates the check_pkt_len tc test to purge the flows, so we do not use existing updated tc flow counters, but start with fresh installed set of datapath flows. Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* sparse: Fix numa.h for libnuma >= 2.0.13.Ilya Maximets2023-02-031-6/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Current numa.h header for sparse re-defines functions in a way that breaks the header from libnuma 2.0.13+, because the original issue was fixed in that version: https://github.com/numactl/numactl/commit/25dcde021dd4f1a1dcac2ba0094f1cb441a2e4a5 Sparse errors as a result: lib/netdev-afxdp.c: note: in included file (through include/sparse/numa.h): /usr/include/numa.h:346:26: error: macro "numa_get_interleave_mask_compat" passed 1 arguments, but takes just 0 /usr/include/numa.h:376:26: error: macro "numa_get_membind_compat" passed 1 arguments, but takes just 0 /usr/include/numa.h:406:26: error: macro "numa_get_run_node_mask_compat" passed 1 arguments, but takes just 0 /usr/include/numa.h:347:1: error: Expected ; at end of declaration /usr/include/numa.h:347:1: error: got { /usr/include/numa.h:351:9: error: 'tp' has implicit type It's hard to adjust defines to work with both versions of a header. Just defining all the functions we actually use in OVS instead and not including the original header. Fixes: e8568993e062 ("netdev-afxdp: NUMA-aware memory allocation for XSK related memory.") Reviewed-by: David Marchand <david.marchand@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* AUTHORS: Add wangchuanlei.Ilya Maximets2023-01-311-0/+1
| | | | Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* dpctl: Add support to count upcall packets.wangchuanlei2023-01-318-12/+61
| | | | | | | | | Add support to count upcall packets per port, both succeed and failed, which is a better way to see how many packets upcalled on each interface. Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: wangchuanlei <wangchuanlei@inspur.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
* tc: Add TCA_KIND flower to delete and get operation to avoid rtnl_lock().Eelco Chaudron2023-01-304-13/+22
| | | | | | | | | | | | | | | | | | | | A long long time ago, an effort was made to make tc flower rtnl_lock() free. However, on the OVS part we forgot to add the TCA_KIND "flower" attribute, which tell the kernel to skip the lock. This patch corrects this by adding the attribute for the delete and get operations. The kernel code calls tcf_proto_is_unlocked() to determine the rtnl_lock() is needed for the specific tc protocol. It does this in the tc_new_tfilter(), tc_del_tfilter(), and in tc_get_tfilter(). If the name is not set, tcf_proto_is_unlocked() will always return false. If set, the specific protocol is queried for unlocked support. Fixes: f98e418fbdb6 ("tc: Add tc flower functions") Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>