| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Tunnel OpenFlow ports do not exist in the datapath, instead there is a
tunnel backing interface that serves all the tunnels of the same type.
For example, if the geneve port 'my_tunnel' is added to OVS, it will
create 'geneve_sys_6041' datapath port, if it doesn't already exist,
and use this port as a tunnel output.
However, while creating/opening a new datapath after re-start,
ovs-vswitchd only has a list of names of OpenFlow interfaces. And it
thinks that each datapath port, that is not on the list, is a stale
port that needs to be removed. This is obviously not correct for
tunnel backing interfaces that can serve multiple tunnel ports and do
not match OpenFlow port names.
This is causing removal and re-creation of all the tunnel backing
interfaces in the datapath on OVS restart, causing disruption in
existing connections.
It's hard to tell by only having a name of the interface if this
interface is a tunnel backing interface, or someone just named a
normal interface this way. So, instead of trying to determine that,
not removing any interfaces at all, while we don't know types of
actual ports we need.
Assuming that all the ports that are currently not in the list of OF
ports are tunnel backing ports. Later, revalidation of tunnel backing
ports in type_run() will determine which ports are still needed and
which should be removed.
It's OK to add even a non-tunnel stale ports into tnl_backers, they
will be cleaned up the same way as stale tunnel backers.
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2023-February/052215.html
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
| |
Add extension that allows to flush connections from CT
by specifying fields that the connections should be
matched against. This allows to match only some fields
of the connection e.g. source address for orig direction.
Reported-at: https://bugzilla.redhat.com/2120546
Signed-off-by: Ales Musil <amusil@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If lldp didn't change, we are not supposed to trigger backer
revalidation.
Without this patch, bridge_reconfigure() always trigger udpif
revalidator because of lldp.
Signed-off-by: lic121 <lic121@chinatelecom.cn>
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Co-authored-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Paolo Valerio <pvalerio@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
| |
ipfix cfg creation/deleting triggers revalidation. But this does
not cover the case where ipfix options changes, which also suppose
to trigger revalidation.
Fixes: a9f5ee1199e1 ("ofproto-dpif: Trigger revalidation when ipfix config set.")
Signed-off-by: lic121 <lic121@chinatelecom.cn>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add a rcu_barrier before close_dpif_backer to ensure that
all meters have been freed before id_pool_destory meter's
id-pool.
Signed-off-by: Peng He <hepeng.0320@bytedance.com>
Tested-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: David Marchand <david.marchand@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
|
|
|
|
|
|
|
|
|
|
| |
Once ct_zone timeout policy changes, revalidator is supposed to be
triggered.
Fixes: 993cae678bca ("ofproto-dpif: Consume CT_Zone, and CT_Timeout_Policy tables")
Signed-off-by: lic121 <lic121@chinatelecom.cn>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
| |
When we create two or more tunnels with the same type, only the first
tunnel will be added by dpif since they share the same datapath port.
Set the dpif_type here will clear the ioctl error logs.
Fixes: 4f19a78 ("netdev-vport: Fix userspace tunnel ioctl(SIOCGIFINDEX) info logs.")
Signed-off-by: Wan Junjie <wanjunjie@bytedance.com>
Acked-by: Mike Pattrick <mkp@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Using SHORT version of the *_SAFE loops makes the code cleaner and less
error prone. So, use the SHORT version and remove the extra variable
when possible for hmap and all its derived types.
In order to be able to use both long and short versions without changing
the name of the macro for all the clients, overload the existing name
and select the appropriate version depending on the number of arguments.
Acked-by: Dumitru Ceara <dceara@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Using the SHORT version of the *_SAFE loops makes the code cleaner
and less error-prone. So, use the SHORT version and remove the extra
variable when possible.
In order to be able to use both long and short versions without changing
the name of the macro for all the clients, overload the existing name
and select the appropriate version depending on the number of arguments.
Acked-by: Dumitru Ceara <dceara@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The ofproto map 'all_ofproto_dpifs_by_uuid' does not support
concurrent accesses. It is however read by upcall handler threads
and written by the main thread at the same time.
Additionally, handler threads will change the ams_seq while
an ofproto is being destroyed, triggering crashes with the
following backtrace:
(gdb) bt
hmap_next (hmap.h:398)
seq_wake_waiters (seq.c:326)
seq_change_protected (seq.c:134)
seq_change (seq.c:144)
ofproto_dpif_send_async_msg (ofproto_dpif.c:263)
process_upcall (ofproto_dpif_upcall.c:1782)
recv_upcalls (ofproto_dpif_upcall.c:1026)
udpif_upcall_handler (ofproto/ofproto_dpif_upcall.c:945)
ovsthread_wrapper (ovs_thread.c:734)
To solve both issues, remove the 'all_ofproto_dpifs_by_uuid'.
Instead, another map already storing ofprotos in xlate can be used.
During an ofproto destruction, its reference is removed from the current
xlate xcfg. Such change is committed only after all threads have quiesced
at least once during xlate_txn_commit(). This wait ensures that the
removal is seen by all threads, rendering impossible for a thread to
still hold a reference while the destruction proceeds.
Furthermore, the xlate maps are copied during updates instead of
being written in place. It is thus correct to read xcfg->xbridges while
inserting or removing from new_xcfg->xbridges.
Finally, now that ofproto_dpifs lookups are done through xcfg->xbridges,
it is important to use a high level of entropy. As it used the ofproto pointer
hashed, fewer bits were random compared to the uuid key used in
'all_ofproto_dpifs_by_uuid'. To solve this, use the ofproto uuid as the key
in xbridges as well, improving entropy.
Fixes: fcb9579be3c7 ("ofproto: Add 'ofproto_uuid' and 'ofp_in_port' to user action cookie.")
Suggested-by: Adrian Moreno <amorenoz@redhat.com>
Acked-by: Adrian Moreno <amorenoz@redhat.com>
Acked-by: Alin-Gabriel Serdean <aserdean@ovn.org>
Tested-by: Alin-Gabriel Serdean <aserdean@ovn.org>
Signed-off-by: Gaetan Rivet <grive@u256.net>
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Co-authored-by: Yunjian Wang <wangyunjian@huawei.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
From hepeng:
https://patchwork.ozlabs.org/project/openvswitch/patch/20200717015041.82746-1-hepeng.0320@bytedance.com/#2487473
also from guohongzhi <guohongzhi1@huawei.com>:
http://patchwork.ozlabs.org/project/openvswitch/patch/20200306130555.19884-1-guohongzhi1@huawei.com/
also from a discussion about the mixing use of RCU and refcount in the mail
list with Ilya Maximets, William Tu, Ben Pfaf, and Gaëtan Rivet.
A summary, as quoted from Ilya:
"
RCU for ofproto was introduced for one
and only one reason - to avoid freeing ofproto while rules are still
alive. This was done in commit f416c8d61601 ("ofproto: RCU postpone
rule destruction."). The goal was to allow using rules without
refcounting them within a single grace period. And that forced us
to postpone destruction of the ofproto for a single grace period.
Later commit 39c9459355b6 ("Use classifier versioning.") made it
possible for rules to be alive for more than one grace period, so
the commit made ofproto wait for 2 grace periods by double postponing.
As we can see now, that wasn't enough and we have to wait for more
than 2 grace periods in certain cases.
"
In a short, the ofproto should have a longer life time than rule, if
the rule lasts for more than 2 grace periods, the ofproto should live
longer to ensure rule->ofproto is valid. It's hard to predict how long
a ofproto should live, thus we need to use refcount on ofproto to make
things easy. The controversial part is that we have already used RCU postpone
to delay ofproto destrution, if we have to add refcount, is it simpler to
use just refcount without RCU postpone?
IMO, I think going back to the pure refcount solution is more
complicated than mixing using both.
Gaëtan Rive asks some questions on guohongzhi's v2 patch:
during ofproto_rule_create, should we use ofproto_ref
or ofproto_try_ref? how can we make sure the ofproto is alive?
By using RCU, ofproto has three states:
state 1: alive, with refcount >= 1
state 2: dying, with refcount == 0, however pointer is valid
state 3: died, memory freed, pointer might be dangling.
Without using RCU, there is no state 2, thus, we have to be very careful
every time we see a ofproto pointer. In contrast, with RCU, we can be sure
that it's alive at least in this grace peroid, so we can just check if
it is dying by ofproto_try_ref.
This shows that by mixing use of RCU and refcount we can save a lot of work
worrying if ofproto is dangling.
In short, the RCU part makes sure the ofproto is alive when we use it,
and the refcount part makes sure it lives longer enough.
In this patch, I have merged guohongzhi's patch and mine, and fixes
accoring to the previous comments.
Acked-by: Adrian Moreno <amorenoz@redhat.com>
Acked-by: Gaetan Rivet <grive@u256.net>
Acked-by: Mike Pattrick <mkp@redhat.com>
Acked-by: Alin-Gabriel Serdean <aserdean@ovn.org>
Tested-by: Alin-Gabriel Serdean <aserdean@ovn.org>
Signed-off-by: Peng He <hepeng.0320@bytedance.com>
Co-authored-by: Hongzhi Guo <guohongzhi1@huawei.com>
Signed-off-by: Hongzhi Guo <guohongzhi1@huawei.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, ipfix conf creation/deletion don't trigger dpif backer
revalidation. This is not expected, as we need the revalidation
to commit ipfix into xlate. So that xlate can generate ipfix
actions.
This patch covers only new creation/deletion of ipfix config.
Will upload one more patch to cover ipfix option changes.
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: lic121 <lic121@chinatelecom.cn>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
|
|
|
|
|
|
| |
Fixes: a98b700db617 ("ofproto: Add appctl command to show Datapath features")
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The encap & decap actions are extended to support MPLS packet type.
Encap & decap actions adds and removes MPLS header at start of the
packet.
The existing PUSH MPLS & POP MPLS actions inserts & removes MPLS
header between ethernet header and the IP header. Though this behaviour
is fine for L3 VPN where an IP packet is encapsulated inside a MPLS
tunnel, it does not suffice the L2 VPN requirements. In L2 VPN the
ethernet packets must be encapsulated inside MPLS tunnel.
In this change the encap & decap actions are extended to support MPLS
packet type. The encap & decap adds and removes MPLS header at the
start of packet as depicted below.
Encapsulation:
Actions - encap(mpls),encap(ethernet)
Incoming packet -> | ETH | IP | Payload |
1 Actions - encap(mpls) [Datapath action - ADD_MPLS:0x8847]
Outgoing packet -> | MPLS | ETH | Payload|
2 Actions - encap(ethernet) [ Datapath action - push_eth ]
Outgoing packet -> | ETH | MPLS | ETH | Payload|
Decapsulation:
Incoming packet -> | ETH | MPLS | ETH | IP | Payload |
Actions - decap(),decap(packet_type(ns=0,type=0))
1 Actions - decap() [Datapath action - pop_eth)
Outgoing packet -> | MPLS | ETH | IP | Payload|
2 Actions - decap(packet_type(ns=0,type=0)) [Datapath action - POP_MPLS:0x6558]
Outgoing packet -> | ETH | IP | Payload|
Signed-off-by: Martin Varghese <martin.varghese@nokia.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently when a user creates an openflow group with with multiple
buckets without specifying a selection type, the efficient dp_hash is
only selected if the user is creating fewer than 64 buckets. But when
dp_hash is explicitly selected, up to 256 buckets are supported.
While up to 64 buckets seems like a lot, certain OVN/Open Stack
workloads could result in the user creating more than 64 buckets. For
example, when using OVN to load balance. This patch increases the
default maximum from 64 to 256.
This change to the default limit doesn't affect how many buckets are
actually created, that is specified by the user when the group is
created, just how traffic is distributed across buckets.
Signed-off-by: Mike Pattrick <mkp@redhat.com>
Acked-by: Gaetan Rivet <grive@u256.net>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently there is an option to add/flush/show ARP/ND neighbor. This
covers L3 side. For L2 side, there is only fdb show command. This
commit gives an option to add/del an fdb entry via ovs-appctl.
CLI command looks like:
To add:
ovs-appctl fdb/add <bridge> <port> <vlan> <Mac>
ovs-appctl fdb/add br0 p1 0 50:54:00:00:00:05
To del:
ovs-appctl fdb/del <bridge> <vlan> <Mac>
ovs-appctl fdb/del br0 0 50:54:00:00:00:05
Added two new APIs to provide convenient interface to add and delete
static-macs.
bool xlate_add_static_mac_entry(const struct ofproto_dpif *,
ofp_port_t in_port,
struct eth_addr dl_src, int vlan);
bool xlate_delete_static_mac_entry(const struct ofproto_dpif *,
struct eth_addr dl_src, int vlan);
1. Static entry should not age. To indicate that entry being
programmed is a static entry, 'expires' field in 'struct mac_entry'
will be set to a MAC_ENTRY_AGE_STATIC_ENTRY. A check for this value
is made while deleting mac entry as part of regular aging process.
2. Another change to the mac-update logic, when a packet with same
dl_src as that of a static-mac entry arrives on any port, the logic
will not modify the expires field.
3. While flushing fdb entries, made sure static ones are not evicted.
4. Updated "ovs-appctl fdb/stats-show br0" to display number of static
entries in switch
Added following tests:
ofproto-dpif - static-mac add/del/flush
ofproto-dpif - static-mac mac moves
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2019-June/048894.html
Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1597752
Signed-off-by: Vasu Dasari <vdasari@gmail.com>
Tested-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, conntrack in the kernel has an undocumented feature referred
to as all-zero IP address SNAT. Basically, when a source port
collision is detected during the commit, the source port will be
translated to an ephemeral port. If there is no collision, no SNAT is
performed.
This patchset documents this behavior and adds a self-test to verify
it's not changing. In addition, a datapath feature flag is added for
the all-zero IP SNAT case. This will help applications on top of OVS,
like OVN, to determine this feature can be used.
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Acked-by: Dumitru Ceara <dceara@redhat.com>
Acked-by: Alin-Gabriel Serdean <aserdean@ovn.org>
Acked-by: Paolo Valerio <pvalerio@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
'cdtp' is allocated on a stack and it has uninitialized 'present'
field that specifies which attributes are actually set. This
causes use of uninitialized attributes.
Conditional jump or move depends on uninitialised value(s)
at 0x539651: dpif_netlink_get_nl_tp_udp_attrs (dpif-netlink.c:3206)
by 0x539651: dpif_netlink_get_nl_tp_attrs (dpif-netlink.c:3234)
by 0x539651: dpif_netlink_ct_set_timeout_policy (dpif-netlink.c:3370)
by 0x42B615: ct_add_timeout_policy_to_dpif (ofproto-dpif.c:5421)
by 0x42B615: ct_set_zone_timeout_policy (ofproto-dpif.c:5525)
by 0x40BD98: ct_zones_reconfigure (bridge.c:756)
by 0x40BD98: datapath_reconfigure (bridge.c:804)
by 0x40D737: bridge_reconfigure (bridge.c:903)
by 0x411720: bridge_run (bridge.c:3331)
by 0x40751C: main (ovs-vswitchd.c:127)
Clearing the whole structure to avoid this kind of problems.
Fixes: 993cae678bca ("ofproto-dpif: Consume CT_Zone, and CT_Timeout_Policy tables")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Mark D. Gray <mark.d.gray@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The new term is "member".
Most of these changes should not change user-visible behavior. One
place where they do is in "ovs-ofctl dump-flows", which will now output
"members:..." inside "bundle" actions instead of "slaves:...". I don't
expect this to cause real problems in most systems. The old syntax
is still supported on input for backward compatibility.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem:
In OVS, flows with output over a bond interface of type “balance-tcp”
gets translated by the ofproto layer into "HASH" and "RECIRC" datapath
actions. After recirculation, the packet is forwarded to the bond
member port based on 8-bits of the datapath hash value computed through
dp_hash. This causes performance degradation in the following ways:
1. The recirculation of the packet implies another lookup of the
packet’s flow key in the exact match cache (EMC) and potentially
Megaflow classifier (DPCLS). This is the biggest cost factor.
2. The recirculated packets have a new “RSS” hash and compete with the
original packets for the scarce number of EMC slots. This implies more
EMC misses and potentially EMC thrashing causing costly DPCLS lookups.
3. The 256 extra megaflow entries per bond for dp_hash bond selection
put additional load on the revalidation threads.
Owing to this performance degradation, deployments stick to “balance-slb”
bond mode even though it does not do active-active load balancing for
VXLAN- and GRE-tunnelled traffic because all tunnel packet have the
same source MAC address.
Proposed optimization:
This proposal introduces a new load-balancing output action instead of
recirculation.
Maintain one table per-bond (could just be an array of uint16's) and
program it the same way internal flows are created today for each
possible hash value (256 entries) from ofproto layer. Use this table to
load-balance flows as part of output action processing.
Currently xlate_normal() -> output_normal() ->
bond_update_post_recirc_rules() -> bond_may_recirc() and
compose_output_action__() generate 'dp_hash(hash_l4(0))' and
'recirc(<RecircID>)' actions. In this case the RecircID identifies the
bond. For the recirculated packets the ofproto layer installs megaflow
entries that match on RecircID and masked dp_hash and send them to the
corresponding output port.
Instead, we will now generate action as
'lb_output(<bond id>)'
This combines hash computation (only if needed, else re-use RSS hash)
and inline load-balancing over the bond. This action is used *only* for
balance-tcp bonds in userspace datapath (the OVS kernel datapath
remains unchanged).
Example:
Current scheme:
With 8 UDP flows (with random UDP src port):
flow-dump from pmd on cpu core: 2
recirc_id(0),in_port(7),<...> actions:hash(hash_l4(0)),recirc(0x1)
recirc_id(0x1),dp_hash(0xf8e02b7e/0xff),<...> actions:2
recirc_id(0x1),dp_hash(0xb236c260/0xff),<...> actions:1
recirc_id(0x1),dp_hash(0x7d89eb18/0xff),<...> actions:1
recirc_id(0x1),dp_hash(0xa78d75df/0xff),<...> actions:2
recirc_id(0x1),dp_hash(0xb58d846f/0xff),<...> actions:2
recirc_id(0x1),dp_hash(0x24534406/0xff),<...> actions:1
recirc_id(0x1),dp_hash(0x3cf32550/0xff),<...> actions:1
New scheme:
We can do with a single flow entry (for any number of new flows):
in_port(7),<...> actions:lb_output(1)
A new CLI has been added to dump datapath bond cache as given below.
# ovs-appctl dpif-netdev/bond-show [dp]
Bond cache:
bond-id 1 :
bucket 0 - slave 2
bucket 1 - slave 1
bucket 2 - slave 2
bucket 3 - slave 1
Co-authored-by: Manohar Krishnappa Chidambaraswamy <manukc@gmail.com>
Signed-off-by: Manohar Krishnappa Chidambaraswamy <manukc@gmail.com>
Signed-off-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com>
Tested-by: Matteo Croce <mcroce@redhat.com>
Tested-by: Adrian Moreno <amorenoz@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 1f1613183733 ("ct-dpif, dpif-netlink: Add conntrack timeout
policy support") adds conntrack timeout policy for kernel datapath.
This patch enables support for the userspace datapath. I tested
using the 'make check-system-userspace' which checks the timeout
policies for ICMP and UDP cases.
Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently fast-failover group does not support checking liveness of controller
port (OFPP_CONTROLLER). However this feature can be useful for selecting
alternate pipeline when controller connection itself is down for e.g.
by using local DHCP server to reply for any DHCP request originating from VMs.
This patch adds the support for watching controller port liveness in fast-
failover group. Controller port is considered live when atleast one
of-connection is alive.
Example usage:
ovs-ofctl add-group br-int 'group_id=1234,type=ff,
bucket=watch_port:CONTROLLER,actions:<A>,
bucket=watch_port:1,actions:<B>
Signed-off-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The admin can choose whether or not to delete flows from datapaths when
they stop ovs-vswitchd. The goal of not deleting flows it to allow
existing traffic to continue being forwarded until ovs-vswitchd is
restarted. Until now, regardless of this choice, ovs-vswitchd has
always deleted tunnel ports from the datapath. When flows are not
deleted, this nevertheless prevents tunnel traffic from being forwarded.
With this patch, ovs-vswitchd no longer deletes tunnel ports in the
case where it does not delete flows, allowing tunnel traffic to continue
being forwarded.
Reported-by: Antonin Bas <abas@vmware.com>
Tested-by: Antonin Bas <abas@vmware.com>
Tested-by: txfh2007 <txfh2007@aliyun.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
dpctl module leaks the 'dp_extra_info' in case the dumped flow doesn't
fit the dump filter while executing dpctl/dump-flows and also while
executing dpctl/get-flow.
This is already a 3rd attempt to fix all the leaks and incorrect usage
of this string that definitely indicates poor initial design of the
feature.
Flow dump/get documentation clearly states that the caller does not own
the data provided in dpif_flow. Datapath still owns all the data and
promises to not free/modify it until the next quiescent period, however
we're requesting the caller to free 'dp_extra_info' and this obviously
breaks the rules.
This patch fixes the issue by by storing 'dp_extra_info' within
'struct dp_netdev_flow' making datapath to own it. 'dp_netdev_flow'
is RCU-protected, so it will be valid until the next quiescent period.
Fixes: 0e8f5c6a38d0 ("dpif-netdev: Modified ovs-appctl dpctl/dump-flows command")
Tested-by: Emma Finn <emma.finn@intel.com>
Acked-by: Emma Finn <emma.finn@intel.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
RCU provides the semantics we want from udpif_synchronize() and it
should be much more lightweight than killing and restarting all the
upcall threads. It looks like udpif_synchronize() was written before
the OVS tree had RCU support, which is probably why we didn't use it
here from the beginning. So we can just change udpif_synchronize()
to a single ovsrcu_synchronize() call.
However, udpif_synchronize() only has a single caller, which calls
ovsrcu_synchronize() anyway just beforehand, via xlate_txn_commit().
So we can get rid of udpif_synchronize() entirely, which this patch
does.
As a side effect, this eliminates one reason why terminating OVS cleanly
clears the datapath flow table. An upcoming patch will eliminate
other reasons.
Acked-by: Numan Siddique <numans@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
| |
Leak was detected by running test: "ofproto-dpif - balance-tcp bonding"
Fixes: 0e8f5c6a38d0 ("dpif-netdev: Modified ovs-appctl dpctl/dump-flows command")
Signed-off-by: Damijan Skvarc <damjan.skvarc@gmail.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Current implementation of dpif_flow_hash() doesn't depend on datapath
interface and only complicates the callers by forcing them to figure
out what is their current 'dpif'. If we'll need different hashing
for different 'dpif's we'll implement an API for dpif-providers
and each dpif implementation will be able to use their local function
directly without calling it via dpif API.
This change will allow us to not store 'dpif' pointer in the userspace
datapath implementation which is broken and will be removed in next
commits.
This patch moves dpif_flow_hash() to odp-util module and replaces
unused odp_flow_key_hash() by it, along with removing of unused 'dpif'
argument.
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently OVS maintains explicit packet drop/error counters only on port
level. Packets that are dropped as part of normal OpenFlow processing
are counted in flow stats of “drop” flows or as table misses in table
stats. These can only be interpreted by controllers that know the
semantics of the configured OpenFlow pipeline. Without that knowledge,
it is impossible for an OVS user to obtain e.g. the total number of
packets dropped due to OpenFlow rules.
Furthermore, there are numerous other reasons for which packets can be
dropped by OVS slow path that are not related to the OpenFlow pipeline.
The generated datapath flow entries include a drop action to avoid
further expensive upcalls to the slow path, but subsequent packets
dropped by the datapath are not accounted anywhere.
Finally, the datapath itself drops packets in certain error situations.
Also, these drops are today not accounted for.This makes it difficult
for OVS users to monitor packet drop in an OVS instance and to alert a
management system in case of a unexpected increase of such drops.
Also OVS trouble-shooters face difficulties in analysing packet drops.
With this patch we implement following changes to address the issues
mentioned above.
1. Identify and account all the silent packet drop scenarios
2. Display these drops in ovs-appctl coverage/show
Co-authored-by: Rohith Basavaraja <rohith.basavaraja@gmail.com>
Co-authored-by: Keshav Gupta <keshugupta1@gmail.com>
Signed-off-by: Anju Thomas <anju.thomas@ericsson.com>
Signed-off-by: Rohith Basavaraja <rohith.basavaraja@gmail.com>
Signed-off-by: Keshav Gupta <keshugupta1@gmail.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com
Acked-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Most of callers doesn't initialize dpif_execute.hash leaving random
value from the stack. And this random value used later while encoding
netlink message and might produce unwanted kernel behavior.
Fix that by fully initializing dpif_execute structure. Using
designated initializers to avoid such issues in the future.
Fixes: 0442bfb11d6c ("ofproto-dpif-upcall: Echo HASH attribute back to datapath.")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
Acked-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add argument '--offload-stats' for command ovs-appctl bridge/dump-flows
to display the offloaded packets statistics.
The commands display as below:
orignal command:
ovs-appctl bridge/dump-flows br0
duration=574s, n_packets=1152, n_bytes=110768, priority=0,actions=NORMAL
table_id=254, duration=574s, n_packets=0, n_bytes=0, priority=2,recirc_id=0,actions=drop
table_id=254, duration=574s, n_packets=0, n_bytes=0, priority=0,reg0=0x1,actions=controller(reason=)
table_id=254, duration=574s, n_packets=0, n_bytes=0, priority=0,reg0=0x2,actions=drop
table_id=254, duration=574s, n_packets=0, n_bytes=0, priority=0,reg0=0x3,actions=drop
new command with argument '--offload-stats'
Notice: 'n_offload_packets' are a subset of n_packets and 'n_offload_bytes' are
a subset of n_bytes.
ovs-appctl bridge/dump-flows --offload-stats br0
duration=582s, n_packets=1152, n_bytes=110768, n_offload_packets=1107, n_offload_bytes=107992, priority=0,actions=NORMAL
table_id=254, duration=582s, n_packets=0, n_bytes=0, n_offload_packets=0, n_offload_bytes=0, priority=2,recirc_id=0,actions=drop
table_id=254, duration=582s, n_packets=0, n_bytes=0, n_offload_packets=0, n_offload_bytes=0, priority=0,reg0=0x1,actions=controller(reason=)
table_id=254, duration=582s, n_packets=0, n_bytes=0, n_offload_packets=0, n_offload_bytes=0, priority=0,reg0=0x2,actions=drop
table_id=254, duration=582s, n_packets=0, n_bytes=0, n_offload_packets=0, n_offload_bytes=0, priority=0,reg0=0x3,actions=drop
Signed-off-by: zhaozhanxu <zhaozhanxu@163.com>
Signed-off-by: Simon Horman <simon.horman@netronome.com>
|
|
|
|
|
|
|
|
|
| |
Make the code simpler by removing the use of
xasprintf and free, and use smap_add_format.
Cc: Ben Pfaff <blp@ovn.org>
Signed-off-by: William Tu <u9012063@gmail.com>
Acked-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
| |
'get_datapath_cap' callback was mistakenly placed in
'Connection tracking' section of the 'struct dpif_class'
while belongs to the 'Datapath information'.
CC: William Tu <u9012063@gmail.com>
Fixes: 27501802d09f ("ofproto-dpif: Expose datapath capability to ovsdb.")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: William Tu <u9012063@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Handling of OpenFlow PACKET_OUT implies pushing the packet through
the datapath and packet processing inside the datapath could trigger
an upcall. In case of system datapath, 'dpif_execute()' sends packet
to the kernel module and returns. If any, upcall will be triggered
inside the kernel and handled by a separate thread in userspace.
But in case of userspace datapath full processing of the packet happens
inside the 'dpif_execute()' in the same thread that handled PACKET_OUT.
This causes an issue if upcall will lead to modification of flow rules.
For example, it could happen while processing of 'learn' actions.
Since whole handling of PACKET_OUT is protected by 'ofproto_mutex',
OVS will assert on attempt to take it recursively while processing
'learn' actions:
0 __GI_raise (sig=sig@entry=6)
1 __GI_abort ()
2 ovs_abort_valist ()
3 ovs_abort ()
4 ovs_mutex_lock_at (where=where@entry=0xad4199 "ofproto/ofproto.c:5391")
<Trying to acquire ofproto_mutex again>
5 ofproto_flow_mod_learn () at ofproto/ofproto.c:5391
<Trying to modify flows according to 'learn' action>
6 xlate_learn_action () at ofproto/ofproto-dpif-xlate.c:5378
<'learn' action found>
7 do_xlate_actions () at ofproto/ofproto-dpif-xlate.c:6893
8 xlate_recursively () at ofproto/ofproto-dpif-xlate.c:4233
9 xlate_table_action () at ofproto/ofproto-dpif-xlate.c:4361
10 in xlate_ofpact_resubmit () at ofproto/ofproto-dpif-xlate.c:4672
11 do_xlate_actions () at ofproto/ofproto-dpif-xlate.c:6773
12 xlate_actions () at ofproto/ofproto-dpif-xlate.c:7570
<Translating actions>
13 upcall_xlate () at ofproto/ofproto-dpif-upcall.c:1197
14 process_upcall () at ofproto/ofproto-dpif-upcall.c:1413
15 upcall_cb () at ofproto/ofproto-dpif-upcall.c:1315
16 dp_netdev_upcall (DPIF_UC_MISS) at lib/dpif-netdev.c:6236
<Flow cache miss. Making upcall>
17 handle_packet_upcall () at lib/dpif-netdev.c:6591
18 fast_path_processing () at lib/dpif-netdev.c:6709
19 dp_netdev_input__ () at lib/dpif-netdev.c:6797
20 dp_netdev_recirculate () at lib/dpif-netdev.c:6842
21 dp_execute_cb () at lib/dpif-netdev.c:7158
22 odp_execute_actions () at lib/odp-execute.c:794
23 dp_netdev_execute_actions () at lib/dpif-netdev.c:7332
24 dpif_netdev_execute () at lib/dpif-netdev.c:3725
25 dpif_netdev_operate () at lib/dpif-netdev.c:3756
<Packet pushed to userspace datapath for processing>
26 dpif_operate () at lib/dpif.c:1367
27 dpif_execute () at lib/dpif.c:1321
28 packet_execute () at ofproto/ofproto-dpif.c:4760
29 ofproto_packet_out_finish () at ofproto/ofproto.c:3594
<Taking ofproto_mutex>
30 handle_packet_out () at ofproto/ofproto.c:3635
31 handle_single_part_openflow (OFPTYPE_PACKET_OUT) at ofproto/ofproto.c:8400
32 handle_openflow () at ofproto/ofproto.c:8587
33 ofconn_run ()
34 connmgr_run ()
35 ofproto_run ()
36 bridge_run__ ()
37 bridge_run ()
38 main ()
Fix that by splitting the 'ofproto_packet_out_finish()' in two parts.
First one that translates side-effects and requires holding 'ofproto_mutex'
and the second that only pushes packet to datapath. The second part moved
out of 'ofproto_mutex' because 'ofproto_mutex' is not required and actually
should not be taken in order to avoid recursive locking.
Reported-by: Anil Kumar Koli <anilkumar.k@altencalsoftlabs.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2019-April/048494.html
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
| |
Document and expose datapath ND Extensions capability to ovsdb.
Fixes: d0d571493 ("ofproto-dpif: Allow IPv6 ND Extensions only if supported")
Signed-off-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The IPv6 ND Extensions is only implemented in userspace datapath,
but nothing prevents that to be used with other datapaths.
This patch probes the datapath and only allows if the support
is available.
Fixes: 9b2b84973 ("Support for match & set ICMPv6 reserved and options type fields")
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Acked-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Flavio Leitner <fbl@sysclose.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The patch adds support for fetching the datapath's capabilities
from the result of 'check_support()', and write the supported capability
to a new database column, called 'capabilities' under Datapath table.
To see how it works, run:
# ovs-vsctl -- add-br br0 -- set Bridge br0 datapath_type=netdev
# ovs-vsctl -- --id=@m create Datapath datapath_version=0 \
'ct_zones={}' 'capabilities={}' \
-- set Open_vSwitch . datapaths:"netdev"=@m
# ovs-vsctl list-dp-cap netdev
ufid=true sample_nesting=true clone=true tnl_push_pop=true \
ct_orig_tuple=true ct_eventmask=true ct_state=true \
ct_clear=true max_vlan_headers=1 recirc=true ct_label=true \
max_hash_alg=1 ct_state_nat=true ct_timeout=true \
ct_mark=true ct_orig_tuple6=true check_pkt_len=true \
masked_set_action=true max_mpls_depth=3 trunc=true ct_zone=true
Signed-off-by: William Tu <u9012063@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
---
v5:
Add improved documentation from Ben and
fix checkpatch error (tab and line 79 char)
v4:
rebase to master
v3:
fix 32-bit build, reported by Greg
travis: https://travis-ci.org/williamtu/ovs-travis/builds/599276267
v2:
rebase to master
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It might be difficult to trace whether LACP PDU tx (as in
response) was successful when the pdu was not transmitted by
egress slave for various reasons (including resource contention
within NIC) and only way to trace its fate is by looking at
ofproto->stats.tx_[packets/bytes] and slave port stats.
Adding a warning when there is tx failure could help user
debug at the root of this problem.
Signed-off-by: Gowrishankar Muthukrishnan <gmuthukr@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Issue:
When LLDP is enabled on a port, a structure to hold LLDP related state
is created and that structure has a reference to the port. The ofproto
monitor thread accesses the LLDP structure to periodically send packets
over the associated port. When the port is deleted, the LLDP structure
is not cleaned up and it continues to refer to the deleted port.
When the monitor thread attempts to access the deleted port OVS crashes.
Crash can happen with bridge delete and bond delete also.
Fix:
Remove all references to the LLDP structure and free it when
the port is deleted.
Signed-off-by: Surya Rudra <rudrasurya.r@altencalsoftlabs.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch derives the timeout policy based on ct zone from the
internal data structure that we maintain on dpif layer.
It also adds a system traffic test to verify the zone-based conntrack
timeout feature. The test uses ovs-vsctl commands to configure
the customized ICMP and UDP timeout on zone 5 to a shorter period.
It then injects ICMP and UDP traffic to conntrack, and checks if the
corresponding conntrack entry expires after the predefined timeout.
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
ofproto-dpif: Checks if datapath supports OVS_CT_ATTR_TIMEOUT
This patch checks whether datapath supports OVS_CT_ATTR_TIMEOUT. With this
check, ofproto-dpif-xlate can use this information to decide whether to
translate the ct timeout policy.
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Justin Pettit <jpettit@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch consumes the CT_Zone and CT_Timeout_Policy tables, maintains
the zone-based configuration in the vswitchd. Whenever there is a
database change, vswitchd will read the datapath, CT_Zone, and
CT_Timeout_Policy tables from ovsdb, builds an internal snapshot of the
database configuration in bridge.c, and pushes down the change into
ofproto and dpif layer.
If a new zone-based timeout policy is added, it updates the zone to
timeout policy mapping in the per datapath type datapath structure in
dpif-backer, and pushes down the timeout policy into the datapath via
dpif interface.
If a timeout policy is no longer used, for kernel datapath, vswitchd
may not be able to remove it from datapath immediately since
datapath flows can still reference the to-be-deleted timeout policies.
Thus, we keep an timeout policy kill list, that vswitchd will go
back to the list periodically and try to kill the unused timeout policies.
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Justin Pettit <jpettit@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Valgrind reported:
1122: ofproto-dpif - select group with explicit dp_hash selection method
==16884== 64 bytes in 1 blocks are definitely lost in loss record 320 of 346
==16884== at 0x4C2FB55: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==16884== by 0x532512: xcalloc (util.c:121)
==16884== by 0x4262B9: group_setup_dp_hash_table (ofproto-dpif.c:4846)
==16884== by 0x4267CB: group_set_selection_method (ofproto-dpif.c:4938)
==16884== by 0x4267CB: group_construct (ofproto-dpif.c:4984)
==16884== by 0x417250: init_group (ofproto.c:7286)
==16884== by 0x41B4FC: add_group_start (ofproto.c:7316)
==16884== by 0x42247A: ofproto_group_mod_start (ofproto.c:7589)
==16884== by 0x4250EC: handle_group_mod (ofproto.c:7744)
==16884== by 0x4250EC: handle_single_part_openflow (ofproto.c:8428)
==16884== by 0x4250EC: handle_openflow (ofproto.c:8606)
==16884== by 0x4579E2: ofconn_run (connmgr.c:1318)
==16884== by 0x4579E2: connmgr_run (connmgr.c:355)
==16884== by 0x41E0F5: ofproto_run (ofproto.c:1845)
==16884== by 0x40BA63: bridge_run__ (bridge.c:2971)
==16884== by 0x410CF3: bridge_run (bridge.c:3029)
==16884== by 0x407614: main (ovs-vswitchd.c:127)
This patch fixes it.
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Valgrind reported:
1210: ofproto-dpif - continuation after clone
==32205== 4,392 (1,440 direct, 2,952 indirect) bytes in 12 blocks are definitely lost in loss record 359 of 362
==32205== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==32205== by 0x532574: xmalloc (util.c:138)
==32205== by 0x4F98CA: ofpbuf_init (ofpbuf.c:123)
==32205== by 0x42C07B: nxt_resume (ofproto-dpif.c:5110)
==32205== by 0x41796F: handle_nxt_resume (ofproto.c:3677)
==32205== by 0x424583: handle_single_part_openflow (ofproto.c:8473)
==32205== by 0x424583: handle_openflow (ofproto.c:8606)
==32205== by 0x4579E2: ofconn_run (connmgr.c:1318)
==32205== by 0x4579E2: connmgr_run (connmgr.c:355)
==32205== by 0x41E0F5: ofproto_run (ofproto.c:1845)
==32205== by 0x40BA63: bridge_run__ (bridge.c:2971)
==32205== by 0x410CF3: bridge_run (bridge.c:3029)
==32205== by 0x407614: main (ovs-vswitchd.c:127)
This is because 'xcache' was not destroyed properly. This patch fixes it.
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
'netdev' datapath is implemented within ovs-vswitchd process and can
not exist without it, so it should be gracefully terminated with a
full cleanup of resources upon ovs-vswitchd exit.
This change forces dpif cleanup for 'netdev' datapath regardless of
passing '--cleanup' to 'ovs-appctl exit'. Such solution allowes to
not pass this additional option everytime for userspace datapath
installations and also allowes to not terminate system datapath in
setups where both datapaths runs at the same time.
The main part is that dpif_port_del() will lead to netdev_close()
and subsequent netdev_class->destroy(dev) which will stop HW NICs
and free their resources. For vhost-user interfaces it will invoke
vhost driver unregistering with a properly closed vhost-user
connection. For upcoming AF_XDP netdev this will allow to gracefully
destroy xdp sockets and unload xdp programs from linux interfaces.
Another important thing is that port deletion will also trigger
flushing of flows offloaded to HW NICs.
Exception made for 'internal' ports that could have user ip/route
configuration. These ports will not be removed without '--cleanup'.
This change fixes OVS disappearing from the DPDK point of view
(keeping HW NICs improperly configured, sudden closing of vhost-user
connections) and will help with linux devices clearing with upcoming
AF_XDP netdev support.
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Tested-by: William Tu <u9012063@gmail.com>
Acked-by: Flavio Leitner <fbl@sysclose.org>
Acked-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes the ofp_port to odp_port translation issue on patch
port with nxt_resume. When OVS resumes processing a packet from
nxt_resume, OVS does not translate the ofp in_port to odp in_port
correctly if the packet is originally received from a patch port.
Currently,OVS sets the odp in_port for this resume pakcet as ODPP_NONE
and push the resume packet back to the datapath. Later on, if the packet
goes through a recirc, OVS will generate the following message since it
can not translate odp in_port (ODPP_NONE) back to ofp in_port during upcall,
and push down a datapath rule to drop the packet.
ofproto_dpif_upcall(handler16)|INFO|received packet on unassociated
datapath port 4294967295
When OVS revalidates the drop datapath flow with ODPP_NONE in_port, we
will see the following warning.
ofproto_dpif_upcall(revalidator18)|WARN|Failed to acquire udpif_key
corresponding to unexpected flow (Invalid argument): ufid:....
This patch resolves this issue by storing the odp in_port in the
continuation messages, and restores the odp in_port before push the
packet back to the datapath.
VMWare-BZ: 2364696
Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fortville NICs (or their drivers) can get into an inconsistent state,
in which the NIC can actually transmit and receive packets even
though they report "PHY down". In such a state, OVS can exchange and
process LACP messages and enable a LACP slave. However, further packet
exchange over the slave fails because OVS sees that the PHY is down.
This commit fixes the problem by making OVS ignore received LACP PDUs
and suppress transmitting LACP PDUs when carrier is down. In addition,
when a LACP PDU is received with carrier down, this commit triggers
rechecking the carrier status (by incrementing the connectivity sequence
number) to ensure that it is updated as quickly as possible.
Signed-off-by: Manohar Krishnappa Chidambaraswamy <manukc@gmail.com>
Co-authored-by: Manohar Krishnappa Chidambaraswamy <manukc@gmail.com>
Signed-off-by: Nitin Katiyar <nitin.katiyar@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Priority tags is a port configuration to determine how the port treats
priority tags, e.g. zero VLAN ID. Change the type from boolean to enum
as a pre-step towards introducing additional modes. The new options are
"never", equivalent to previously "false", and "if-nonzero",
equivalent to previously "true". "true" is still supported for backwards
compatibility.
Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Roi Dayan <roid@mellanox.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, rule_insert() API does not have return value. There are some
possible scenarios where rule insertions can fail at run-time even though the
static checks during rule_construct() had passed previously. Some possible
scenarios for failure of rule insertions:
**) Rule insertions can fail dynamically in Hybrid mode (both Openflow and
Normal switch functioning coexist) where the CAM space could get suddenly
filled up by Normal switch functioning and Openflow gets devoid of available
space.
**) Some deployments could have separate independent layers for HW rule
insertions and application layer to interact with OVS. HW layer could face any
dynamic issue during rule handling which application could not have
predicted/captured in rule-construction phase.
Rule-insert errors for bundles are handled too.
Testing: Tested failures of rule insertions and also with bundles.
Signed-off-by: Aravind Prasad S <aravind.sridharan at dell.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds a new action 'check_pkt_larger' which checks if the
packet is larger than the given size and stores the result in the
destination register.
Usage: check_pkt_larger(len)->REGISTER
Eg. match=...,actions=check_pkt_larger(1442)->NXM_NX_REG0[0],next;
This patch makes use of the new datapath action - 'check_pkt_len'
which was recently added in the commit [1].
At the start of ovs-vswitchd, datapath is probed for this action.
If the datapath action is present, then 'check_pkt_larger'
makes use of this datapath action.
Datapath action 'check_pkt_len' takes these nlattrs
* OVS_CHECK_PKT_LEN_ATTR_PKT_LEN - 'pkt_len' to check for
* OVS_CHECK_PKT_LEN_ATTR_ACTIONS_IF_GREATER (optional) - Nested actions
to apply if the packet length is greater than the specified 'pkt_len'
* OVS_CHECK_PKT_LEN_ATTR_ACTIONS_IF_LESS_EQUAL (optional) - Nested
actions to apply if the packet length is lesser or equal to the
specified 'pkt_len'.
Let's say we have these flows added to an OVS bridge br-int
table=0, priority=100 in_port=1,ip,actions=check_pkt_larger:100->NXM_NX_REG0[0],resubmit(,1)
table=1, priority=200,in_port=1,ip,reg0=0x1/0x1 actions=output:3
table=1, priority=100,in_port=1,ip,actions=output:4
Then the action 'check_pkt_larger' will be translated as
- check_pkt_len(size=100,gt(3),le(4))
datapath will check the packet length and if the packet length is greater than 100,
it will output to port 3, else it will output to port 4.
In case, datapath doesn't support 'check_pkt_len' action, the OVS action
'check_pkt_larger' sets SLOW_ACTION so that datapath flow is not added.
This OVS action is intended to be used by OVN to check the packet length
and generate an ICMP packet with type 3, code 4 and next hop mtu
in the logical router pipeline if the MTU of the physical interface
is lesser than the packet length. More information can be found here [2]
[1] - https://kernel.googlesource.com/pub/scm/linux/kernel/git/davem/net-next/+/4d5ec89fc8d14dcdab7214a0c13a1c7321dc6ea9
[2] - https://mail.openvswitch.org/pipermail/ovs-discuss/2018-July/047039.html
Reported-at:
https://mail.openvswitch.org/pipermail/ovs-discuss/2018-July/047039.html
Suggested-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
CC: Ben Pfaff <blp@ovn.org>
CC: Gregory Rose <gvrose8192@gmail.com>
Acked-by: Mark Michelson <mmichels@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Virtual ports like 'patch' ports that almost fully implemented on
'ofproto' layer could have internal to 'ofproto' statuses that
could not be retrieved from 'netdev' or other layers. For example,
in current implementation there is no way to get the patch port
pairing status (i.e. if it has usable peer?).
New 'ofproto-provider' API function 'vport_get_status' introduced to
cover this gap. It allowes 'bridge' layer to retrive current status
of ofproto virtual ports and propagate it to DB.
For now we're only interested in pairing errors of 'patch' ports.
That are propagated to the 'error' column of the 'Interface' table.
Ex.:
$ ovs-vsctl show
...
Bridge "br1"
...
Port "patch1"
Interface "patch1"
type: patch
options: {peer="patch0"}
error: "No usable peer 'patch0' exists in 'system' datapath."
Bridge "br0"
datapath_type: netdev
...
Port "patch0"
Interface "patch0"
type: patch
options: {peer="patch1"}
error: "No usable peer 'patch1' exists in 'netdev' datapath."
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
| |
As a side effect, this also reduces a lot of log messages' severities from
ERR to WARN. They just didn't seem like messages that in general reported
anything that would prevent functioning.
Signed-off-by: Ben Pfaff <blp@ovn.org>
|