| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
| |
The may_steal flag is now used, Remove OVS_UNUSED.
Since dp_packet_delete() handles the NULL pointer properly, we can
drop a few tracking variables, and make the code easier to follow.
Signed-off-by: Andy Zhou <azhou@ovn.org>
Acked-by: Jarno Rajahalme <jarno@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
set_tunnel_config() always logs a warning, even on success. This
shouldn't happen.
Without this, some unit tests fail.
Fixes: 9fff138ec3a6("netdev: Add 'errp' to set_config().")
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Antonio Fischetti <antonio.fischetti@intel.com>
Acked-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
"ovs-appctl ofproto/trace" is invaluable for debugging, but as the users of
Open vSwitch have evolved it has failed to keep up with the times. It's
pretty easy to design OpenFlow tables and pipelines that resubmit dozens of
times. Each resubmit causes an additional tab of indentation, so the
output wraps around, sometimes again and again, and makes the output close
to unreadable.
ovn-trace pioneered better formatting for tracing in OVN logical datapaths,
mostly by not increasing indentation for tail recursion, which in practice
gets rid of almost all indentation.
This commit experiments with redoing ofproto/trace the same way. Try
looking at, for example, the testsuite output for test 2282 "ovn -- 3 HVs,
3 LRs connected via LS, source IP based routes". Without this commit, it
indents 61 levels (488 spaces!). With this commit, it indents 1 level
(4 spaces) and it's possible to actually understand what's going on almost
at a glance.
To see this for yourself, try the following command either with or without
this commit (but be sure to keep the change to ovn.at that adds an
ofproto/trace to the test):
make check TESTSUITEFLAGS='-d 2282' && less tests/testsuite.dir/2282/testsuite.log
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Lance Richardson <lrichard@redhat.com>
Acked-by: Justin Pettit <jpettit@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since 55e075e65ef9("netdev-dpdk: Arbitrary 'dpdk' port naming"),
set_config() is used to identify a DPDK device, so it's better to report
its detailed error message to the user. Tunnel devices and patch ports
rely a lot on set_config() as well.
This commit adds a param to set_config() that can be used to return
an error message and makes use of that in netdev-dpdk and netdev-vport.
Before this patch:
$ ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
ovs-vsctl: Error detected while setting up 'dpdk0': dpdk0: could not set
configuration (Invalid argument). See ovs-vswitchd log for details.
ovs-vsctl: The default log directory is "/var/log/openvswitch/".
$ ovs-vsctl add-port br0 p+ -- set Interface p+ type=patch
ovs-vsctl: Error detected while setting up 'p+': p+: could not set
configuration (Invalid argument). See ovs-vswitchd log for details.
ovs-vsctl: The default log directory is "/var/log/openvswitch/".
$ ovs-vsctl add-port br0 gnv0 -- set Interface gnv0 type=geneve
ovs-vsctl: Error detected while setting up 'gnv0': gnv0: could not set
configuration (Invalid argument). See ovs-vswitchd log for details.
ovs-vsctl: The default log directory is "/var/log/openvswitch/".
After this patch:
$ ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
ovs-vsctl: Error detected while setting up 'dpdk0': 'dpdk0' is missing
'options:dpdk-devargs'. The old 'dpdk<port_id>' names are not
supported. See ovs-vswitchd log for details.
ovs-vsctl: The default log directory is "/var/log/openvswitch/".
$ ovs-vsctl add-port br0 p+ -- set Interface p+ type=patch
ovs-vsctl: Error detected while setting up 'p+': p+: patch type requires
valid 'peer' argument. See ovs-vswitchd log for details.
ovs-vsctl: The default log directory is "/var/log/openvswitch/".
$ ovs-vsctl add-port br0 gnv0 -- set Interface gnv0 type=geneve
ovs-vsctl: Error detected while setting up 'gnv0': gnv0: geneve type
requires valid 'remote_ip' argument. See ovs-vswitchd log for
details.
ovs-vsctl: The default log directory is "/var/log/openvswitch/".
CC: Ciara Loftus <ciara.loftus@intel.com>
CC: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Tested-by: Ciara Loftus <ciara.loftus@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We can hotplug attach DPDK ports specified via the 'dpdk-devargs'
option now.
But the socket id of DPDK ports can't be assigned correctly,
it is always 0. The socket id of DPDK ports should be assigned
according to the numa id of the device.
Fixes: 55e075e65ef9e ("netdev-dpdk: Arbitrary 'dpdk' port naming")
Signed-off-by: Binbin Xu <xu.binbin1@zte.com.cn>
Acked-by: Ciara Loftus <ciara.loftus@intel.com>
Acked-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
|
|
|
|
|
|
|
|
|
|
| |
This patch avoids the ovs_rcu to report WARN, caused by blocked
for a long time, when ovs-vswitchd processes a port with many
rx/tx queues. The number of tx/rx queues per port may be appropriate,
because the dpdk uses it as an default max value.
Signed-off-by: nickcooper-zhangtonghao <nic@opencloud.tech>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With this commit, we allow the user to set other_config:dpdk-init=true
after the process is started. This makes it easier to start Open
vSwitch with DPDK using standard init scripts without restarting the
service.
This is still far from ideal, because initializing DPDK might still
abort the process (e.g. if there not enough memory), so the user must
check the status of the process after setting dpdk-init to true.
Nonetheless, I think this is an improvement, because it doesn't require
restarting the whole unit.
CC: Aaron Conole <aconole@redhat.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Aaron Conole <aconole@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch allows to skip the 8-byte chunk comprising of dp_hash and
in_port in the subtable mask when dp_hash is wildcarded. This will
slightly speed up the hash computation as one expensive function call
to hash_add64() can be skipped.
For each new netdev flow we wildcard in_port in the mask, so in the
typical case where dp_hash is also wildcarded, the resulting 8-byte
chunk will not be part of the subtable mask.
This manipulation of the mask is possible as the datapath classifier
is explicitly selected based on the in_port value, so that all the
datapath flows in the selected classifier have an exact match on that
in_port value. Given this, it is safe to ignore the in_port value
when doing a lookup in the chosen classifier.
Signed-off-by: Antonio Fischetti <antonio.fischetti@intel.com>
Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
Co-authored-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Co-authored-by: Jarno Rajahalme <jarno@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is being introduced specifically to allow a user of the "clone" action
to clear the connection tracking state, but it's implemented as a separate
action as a matter of clean design and in case another use case arises
later.
Reported-by: Mickey Spiegel <mickeys.dev@gmail.com>
Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2017-January/326981.html
Fixes: 7ae62a676d3a ("ofp-actions: Add clone action.")
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Mickey Spiegel <mickeys.dev@gmail.com>
Tested-by: Dong Jun <dongj@dtdream.com>
|
|
|
|
|
|
|
|
|
| |
To allow client to know when the conditional monitoring changes
has been accepted by the OVSDB server and the 'idl' contents has
been updated to match the new conditions.
Signed-off-by: Andy Zhou <azhou@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When generating conditional monitoring update request, current code
failed to update idl's 'request-id'. This bug causes the reply
message of the update request, regardless an ACK or a NACK, be
logged as an unexpected message at the debug level and ignored by
the core idl logic.
In addition, the idl should not generate another conditional
monitoring update request when there is an outstanding request.
So that the requests and their reply are properly serialized.
When the conditional monitoring is nacked by the server, drop idl
into a client visible error state.
Signed-off-by: Andy Zhou <azhou@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
| |
This patch uses OVS_CORE_UNSPEC for the queue unpinned instead
of "-1". More important, the "-1" casted to unsigned int is
equal to NON_PMD_CORE_ID. We make the distinction between them.
Signed-off-by: nickcooper-zhangtonghao <nic@opencloud.tech>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
|
|
|
|
|
|
|
|
|
| |
The NR_QUEUE is defined in "lib/dpif-netdev.h", netdev-dpdk
uses it instead of magic number. netdev-dummy should be
in the same case.
Signed-off-by: nickcooper-zhangtonghao <nic@opencloud.tech>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
|
|
|
|
|
| |
Signed-off-by: nickcooper-zhangtonghao <nic@opencloud.tech>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Always storing the maximum mf_value size wastes about 120 bytes for
each stack entry. This patch changes the stack from an mf_value array
to a string of value-length pairs.
The length is stored after the value so that the stack pop may first
read the length and then the appropriate number of bytes.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
| |
OVS should not crash if the controller sends a malformed OpenFlow
message. Return the error code instead.
Suggested-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
bridge_delete_or_reconfigure() deletes every interface that's not dumped
by OFPROTO_PORT_FOR_EACH(). ofproto_dpif.c:port_dump_next(), used by
OFPROTO_PORT_FOR_EACH, checks if the ofport is in the datapath by
calling port_query_by_name(). If port_query_by_name() returns an error,
the dump is interrupted. If port_query_by_name() returns ENODEV, the
device doesn't exist and the dump can continue.
port_query_by_name() for the userspace datapath returns ENOENT instead
of ENODEV. This is expected by dpif_port_query_by_name(), but it's not
handled correctly by port_dump_next().
dpif-netdev handles reconfiguration errors for an interface by deleting
it from the datapath, so it's possible that a device is missing. When this
happens we must make sure that port_dump_next() continues to dump other
devices, otherwise they will be deleted and the two layers will have an
inconsistent view.
This commit fixes the problem by returning ENODEV from the userspace
datapath if the port doesn't exist, and by documenting this clearly in
the dpif interfaces.
The problem was found while developing new code.
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
| |
In case connection is reset when there are buffered but unsent
conditions, these conditions will be sent as the new "monitor_cond"
message that will be sent after the idl reconnects.
Without this patch, those conditions will be unnecessarily sent again
with following monitoring condition update message.
Signed-off-by: Andy Zhou <azhou@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Prior to this commit, the 'dpdk' port type could only be used for
physical DPDK devices. Now, virtual devices (or 'vdevs') are supported.
'vdev' devices are those which use virtual DPDK Poll Mode Drivers eg.
null, pcap. To add a DPDK vdev, a valid 'dpdk-devargs' must be set for
the given dpdk port. The format expected is 'eth_<driver_name><x>' where
'x' is a number between 0 and RTE_MAX_ETHPORTS -1.
For example to add a port that uses the 'null' DPDK PMD driver:
ovs-vsctl set Interface null0 options:dpdk-devargs=eth_null0
Not all DPDK vdevs have been verified to work at this point in time.
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Acked-by: Stephen Finucane <stephen@that.guru> # docs only
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
'dpdk' ports no longer have naming restrictions. Now, instead of
specifying the dpdk port ID as part of the name, the PCI address of the
device must be specified via the 'dpdk-devargs' option. eg.
ovs-vsctl add-port br0 my-port
ovs-vsctl set Interface my-port type=dpdk
options:dpdk-devargs=0000:06:00.3
The user must no longer hotplug attach DPDK ports by issuing the
specific ovs-appctl netdev-dpdk/attach command. The hotplug is now
automatically invoked when a valid PCI address is set in the
dpdk-devargs. The format for ovs-appctl netdev-dpdk/detach command
has changed in that the user now must specify the relevant PCI address
as input instead of the port name.
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Co-authored-by: Kevin Traynor <ktraynor@redhat.com>
Acked-by: Stephen Finucane <stephen@that.guru> # docs only
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In order to use dpdk ports in ovs they have to be bound to a DPDK
compatible driver before ovs is started.
This patch adds the possibility to hotplug (or hot-unplug) a device
after ovs has been started. The implementation adds two appctl commands:
netdev-dpdk/attach and netdev-dpdk/detach
After the user attaches a new device, it has to be added to a bridge
using the add-port command, similarly, before detaching a device,
it has to be removed using the del-port command.
Signed-off-by: Mauricio Vasquez B <mauricio.vasquezbernal@studenti.polito.it>
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Co-authored-by: Ciara Loftus <ciara.loftus@intel.com>
Acked-by: Stephen Finucane <stephen@that.guru> # docs only
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
|
|
|
|
|
| |
Signed-off-by: Justin Pettit <jpettit@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adding / removing a range of integers to a column accepting a set of
integers requires enumarating all of the integers. This patch simplifies
it by introducing 'range' concept to the database commands. Two integers
separated by a hyphen represent an inclusive range.
The patch adds positive and negative tests for the new syntax.
The patch was tested by 'make check'. Covarage was tested by
'make check-lcov'.
Signed-off-by: Lukasz Rzasik <lukasz.rzasik@gmail.com>
Suggested-by: <my_ovs_discuss@yahoo.com>
Suggested-by: Ben Pfaff <blp@ovn.org>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
nx_put_match() needs a non-NULL tunnel metadata table, otherwise it will
crash if a flow matches on tunnel metadata.
This wasn't handled in ofputil_append_flow_update(), causing a crash
when the controller sent a flow monitor request.
To fix the problem, this commit changes ofputil_append_flow_update() to
behave like ofputil_append_flow_stats_reply().
Since ofputil_append_flow_update() now needs to temporarily modify the
match, this commits also embeds 'struct match' into 'struct
ofputil_flow_update', to be safer. This is more similar to
'struct ofputil_flow_stats'.
A regression test is added and a comment is updated in ovs-ofctl.c
#0 0x000055699bd82fa0 in memcpy_from_metadata (dst=0x7ffc770930d0, src=0x7ffc77093698, loc=0x18) at ../lib/tun-metadata.c:451
#1 0x000055699bd83c2e in metadata_loc_from_match_read (map=0x0, match=0x7ffc77093410, idx=0, mask=0x7ffc77093658, is_masked=0x7ffc77093287) at ../lib/tun-metadata.c:848
#2 0x000055699bd83d9b in tun_metadata_to_nx_match (b=0x55699d3f0300, oxm=0, match=0x7ffc77093410) at ../lib/tun-metadata.c:871
#3 0x000055699bce523d in nx_put_raw (b=0x55699d3f0300, oxm=0, match=0x7ffc77093410, cookie=0, cookie_mask=0) at ../lib/nx-match.c:1052
#4 0x000055699bce5580 in nx_put_match (b=0x55699d3f0300, match=0x7ffc77093410, cookie=0, cookie_mask=0) at ../lib/nx-match.c:1116
#5 0x000055699bd3926f in ofputil_append_flow_update (update=0x7ffc770940b0, replies=0x7ffc77094e00) at ../lib/ofp-util.c:6805
#6 0x000055699bc4b5a9 in ofproto_compose_flow_refresh_update (rule=0x55699d405b40, flags=(NXFMF_INITIAL | NXFMF_ACTIONS), msgs=0x7ffc77094e00) at ../ofproto/ofproto.c:5915
#7 0x000055699bc4b5f6 in ofmonitor_compose_refresh_updates (rules=0x7ffc77094e10, msgs=0x7ffc77094e00) at ../ofproto/ofproto.c:5929
#8 0x000055699bc4bafc in handle_flow_monitor_request (ofconn=0x55699d404090, oh=0x55699d404220) at ../ofproto/ofproto.c:6082
#9 0x000055699bc4f46d in handle_openflow__ (ofconn=0x55699d404090, msg=0x55699d404910) at ../ofproto/ofproto.c:7912
#10 0x000055699bc4f5df in handle_openflow (ofconn=0x55699d404090, ofp_msg=0x55699d404910) at ../ofproto/ofproto.c:8002
#11 0x000055699bc88154 in ofconn_run (ofconn=0x55699d404090, handle_openflow=0x55699bc4f5bc <handle_openflow>) at ../ofproto/connmgr.c:1427
#12 0x000055699bc85934 in connmgr_run (mgr=0x55699d3adb90, handle_openflow=0x55699bc4f5bc <handle_openflow>) at ../ofproto/connmgr.c:363
#13 0x000055699bc422c9 in ofproto_run (p=0x55699d3c85e0) at ../ofproto/ofproto.c:1798
#14 0x000055699bc31ec6 in bridge_run__ () at ../vswitchd/bridge.c:2881
#15 0x000055699bc320a6 in bridge_run () at ../vswitchd/bridge.c:2938
#16 0x000055699bc3784e in main (argc=10, argv=0x7ffc770952c8) at ../vswitchd/ovs-vswitchd.c:111
Fixes: 8d8ab6c2d574 ("tun-metadata: Manage tunnel TLV mapping table on a
per-bridge basis.")
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
| |
Rename some structures that call themselves ivshmem,
as they are just a collection of dpdk rings and other
information.
Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
|
|
|
|
|
|
|
|
|
| |
Code is simplified when the ODP keys use the same type as the struct
flow for the IPv6 addresses. As the change is facilitated by
extract-odp-netlink-h, this change only affects the userspace. We
already do the same for the ethernet addresses.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Allow using match field names in addition to the canonical register
names in actions (including 'load', 'move', 'push', 'pop', 'output',
'multipath', 'bundle_load', and 'learn'). Allow also leaving out the
trailing '[]' to indicate full field. These changes allow simpler
syntax similar to 'set_field' to be used also elsewhere.
Correspondingly, allow the '[start..end]' syntax to be used in matches
in addition to the more explicit 'value/mask' notation. For example,
to match on the value 2 of the bits 14..15 of NXM_NX_REG0, the match
could include:
... reg0[14..15]=2 ...
instead of
... reg0=0x8000/0xc000 ...
Note that only contiguous masks can be specified with the bracket
notation.
Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
Acked-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A recent commit fixed ovs_rwlock_init() to pass the pthread_rwlockattr_t
that it initialized to pthread_rwlock_init(). According to POSIX
documentation this is correct, but on Windows the current implementation of
pthreads does not support a pre-initialized attribute. Please see a fork
of the implementation
https://github.com/GerHobbelt/pthread-win32/blob/19fd5054b29af1b4e3b3278bfffbb6274c6c89f5/pthread_rwlock_init.c#L59-L63
This is the same implementation as the official version found under:
ftp://sourceware.org/pub/pthreads-win32/)
A short debug output from `vswitch` to confirm the above:
>k
Index Function
--------------------------------------------------------------------------------
*1 ovs-vswitchd.exe!ovs_rwlock_init(const ovs_rwlock * l_=0x000001721c7da250)
2 ovs-vswitchd.exe!open_dpif_backer(const char * type=0x000001721c7d8d60, dpif_backer * * backerp=0x000001721c7d89c0)
3 ovs-vswitchd.exe!construct(ofproto * ofproto_=0x000001721c7d87d0)
4 ovs-vswitchd.exe!ofproto_create(const char * datapath_name=0x000001721c7d86e0, const char * datapath_type=0x000001721c7d8750, ofproto * * ofprotop=0x000001721c7d80b8)
5 ovs-vswitchd.exe!bridge_reconfigure(const ovsrec_open_vswitch * ovs_cfg=0x000001721c7e05b0)
6 ovs-vswitchd.exe!bridge_run()
7 ovs-vswitchd.exe!main(int argc=6, char * * argv=0x000001721c729e10)
8 [External Code]
>? error
22
https://github.com/openvswitch/ovs/blob/master/lib/ovs-thread.c#L243
This patch is critical because the majority (over 800) of the unit tests
are failing.
Fixes: 1a15f390afd6 ("lib/ovs-thread: set prefer writer lock for ovs_rwlock_init()")
Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Acked-by: Shashank Ram <rams@vmware.com>
[blp@ovn.org changed the details of the approach]
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add Rx checksum offloading feature support on DPDK physical ports. By default,
the Rx checksum offloading is enabled if NIC supports. However,
the checksum offloading can be turned OFF either while adding a new DPDK
physical port to OVS or at runtime.
The rx checksum offloading can be turned off by setting the parameter to
'false'. For eg: To disable the rx checksum offloading when adding a port,
'ovs-vsctl add-port br0 dpdk0 -- \
set Interface dpdk0 type=dpdk options:rx-checksum-offload=false'
OR (to disable at run time after port is being added to OVS)
'ovs-vsctl set Interface dpdk0 options:rx-checksum-offload=false'
Similarly to turn ON rx checksum offloading at run time,
'ovs-vsctl set Interface dpdk0 options:rx-checksum-offload=true'
The Tx checksum offloading support is not implemented due to the following
reasons.
1) Checksum offloading and vectorization are mutually exclusive in DPDK poll
mode driver. Vector packet processing is turned OFF when checksum offloading
is enabled which causes significant performance drop at Tx side.
2) Normally, OVS generates checksum for tunnel packets in software at the
'tunnel push' operation, where the tunnel headers are created. However
enabling Tx checksum offloading involves,
*) Mark every packets for tx checksum offloading at 'tunnel_push' and
recirculate.
*) At the time of xmit, validate the same flag and instruct the NIC to do the
checksum calculation. In case NIC doesnt support Tx checksum offloading,
the checksum calculation has to be done in software before sending out the
packets.
No significant performance improvement noticed with Tx checksum offloading
due to the e overhead of additional validations + non vector packet processing.
In some test scenarios, it introduces performance drop too.
Rx checksum offloading still offers 8-9% of improvement on VxLAN tunneling
decapsulation even though the SSE vector Rx function is disabled in DPDK poll
mode driver.
Signed-off-by: Sugesh Chandran <sugesh.chandran@intel.com>
Acked-by: Jesse Gross <jesse@kernel.org>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
|
|
|
|
|
|
|
|
|
| |
The kernel datapath provides support for TFTP helpers, so add support
for this ALG to the commandline and OpenFlow encoding/decoding.
Signed-off-by: Joe Stringer <joe@ovn.org>
Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
| |
Sometimes seeing the OpenFlow flows that back a given logical flow can
provide additional insight. This commit adds a new --ovs option to
ovn-trace that makes it connect to Open vSwitch over OpenFlow and retrieve
and print the OpenFlow flows behind each logical flow encountered during
a trace.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
| |
This makes it easy to find the logical flow that generated a particular
OpenFlow flow, by running "ovn-sbctl dump-flows <cookie>".
Later, this can be refined (and automated for "ofproto/trace"), but this
is still a significant advance.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ICMP error packets (e.g. destination unreachable messages) are
considered 'related' to another connection and are treated as part of
that.
However:
* We shouldn't create new entries in the connection table if the
original connection is not found. This is consistent with what the
kernel does.
* We certainly shouldn't call valid_new() on the packet, because
valid_new() assumes the packet l4 type (might be TCP, UDP or ICMP)
to be consistent with the conn_key nw_proto type.
Found by inspection.
Fixes: a489b16854b5("conntrack: New userspace connection tracker.")
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Darrell Ball <dlu998@gmail.com>
|
|
|
|
|
|
|
|
| |
It's simpler to make two calls than eight.
Reported-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Andy Zhou <azhou@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If an rconn peer fails to send a hello message, the version number doesn't
get set. Later, if the peer delays long enough, the rconn attempts to send
an echo request but assert-fails instead because it doesn't know what
version to use. This fixes the problem.
To reproduce this problem:
make sandbox
ovs-vsctl add-br br0
ovs-vsctl set-controller br0 ptcp:12345
nc 127.0.0.1 12345
and wait 10 seconds for ovs-vswitchd to die. (Then exit the sandbox.)
Reported-by: 张东亚 <fortitude.zhang@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Justin Pettit <jpettit@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A reboot of one switch in an MC-LAG bond makes all bond links
to go down, causing a total connectivity loss for 3 seconds.
Packet capture shows that spurious LACP PDUs are sent to OVS with
a different MAC address (partner system id) during the final
stages of the MC-LAG switch reboot.
The current code selects a lead interface based on information
in the LACP PDU, regardless of its synchronization state. If a
non-synchronized interface is selected as the OVS lead interface
then all other interfaces are forced down as their stored partner
system id differs and the bond ends up with no working interface.
The bond recovers within three seconds after the last spurious
message.
To avoid the problem, this commit requires a lead interface
to be synchronized. In case no synchronized interface exists,
the selection of lead interface is done as in the current code.
Signed-off-by: Torgny Lindberg <torgny.lindberg@ericsson.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I measured the packet processing cost of OVS DPDK datapath for different
OpenFlow actions. I configured OVS to use a single pmd thread and
measured the packet throughput in a phy-to-phy setup. I used 10G
interfaces bounded to DPDK driver and overloaded the vSwitch with 64
byte packets through one of the 10G interfaces.
The processing cost of the dec_ttl action seemed to be gratuitously high
compared with other actions.
I looked into the code and saw that dec_ttl is encoded as a masked
nested attribute in OVS_ACTION_ATTR_SET_MASKED(OVS_KEY_ATTR_IPV4).
That way, OVS datapath can modify several IP header fields (TTL, TOS,
source and destination IP addresses) by a single invocation of
packet_set_ipv4() in the odp_set_ipv4() function in the
lib/odp-execute.c file. The packet_set_ipv4() function takes the new
TOS, TTL and IP addresses as arguments, compares them with the actual
ones and updates the fields if needed. This means, that even if only TTL
needs to be updated, each of the four IP header fields is passed to the
callee and is compared to the actual field for each packet.
The odp_set_ipv4() caller function possesses information about the
fields that need to be updated in the 'mask' structure. The idea is to
spare invocation of the packet_set_ipv4() function but use its code
parts directly. So the 'mask' can be used to decide which IP header
fields need to be updated. In addition, a faster packet processing can
be achieved if the values of local variables are
calculated right before their usage.
| T | T | I | I |
| T | O | P | P | Vanilla OVS || + new patch
| L | S | s | d | (nsec/packet) || (nsec/packet)
-------+---+---+---+---+---------------++---------------
output | | | | | 67.19 || 67.19
| X | | | | 74.48 || 68.78
| | X | | | 74.42 || 70.07
| | | X | | 84.62 || 78.03
| | | | X | 84.25 || 77.94
| | | X | X | 97.46 || 91.86
| X | | X | X | 100.42 || 96.00
| X | X | X | X | 102.80 || 100.73
The table shows the average processing cost of packets in nanoseconds
for the following actions:
output; output + dec_ttl; output + mod_nw_tos; output + mod_nw_src;
output + mod_nw_dst and some of their combinations.
I ran each test five times. The values are the mean of the readings
obtained.
I added OVS_LIKELY to the 'if' condition for the TTL field, since as far
as I know, this field will typically be decremented when any field of
the IP header is modified.
Signed-off-by: Zoltán Balogh <zoltan.balogh@ericsson.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
An alternative "writer nonrecursive" rwlock allows recursive
read-locks to succeed only if there are no threads waiting for the
write-lock. In the function ovs_rwlock_init(), there exist a problem,
the parameter of 'attr' is not used to set the attributes of ovs_rwlock 'l_',
just because use pthread_rwlock_init(&l->lock, NULL) to init l->lock.
The attr object needs to be passed to the pthread_rwlock_init()
call in order to make use of it.
Signed-off-by: zangchuanqiang <zangchuanqiang@huawei.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are currently five users of the table formatting library,
all of which default to "list" except for ovsdb-client which
defaults to "table". The library current default is "table",
and the table.man man page fragment only considers ovs-vsctl
to use something other than "table" as a default.As a result,
the man pages for ovn-sbctl and vtep-ctl are currently incorrect
(these options aren't documented in the ovn-nbctl man page, which
will need to be addressed in a future patch).
Fix by making the library default format "list" and handling
ovsdb-client as the exception.
Signed-off-by: Lance Richardson <lrichard@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
| |
When IGMP or MLD packets arrive their content is used without the checksum
being verified. With this change the checksum is verified, and the packet
is not used for multicast snooping on failure.
Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a route is withdrawn (blackholed) the netlink message doesn't include
an RTA_OIF element. This results in an "unexpected netlink message
contents" log message because this element is not optional.
Given that the netlink message will be ignored anyway, and subsequent
error checking will cope with missing RTA_OIF, the element should be
optional in order to suppress unnecessary log messages.
Signed-off-by: Tony van der Peet <tony.vanderpeet@alliedtelesis.co.nz>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
| |
DPDK documentation is recently updated to reflect that DPDK does not
hold any references to, nor take ownership of, the argv/argc elements.
With that understanding, let's just release the memory asap.
Signed-off-by: Aaron Conole <aconole@redhat.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
|
|
|
|
|
|
|
|
| |
s/deamon/daemon/
s/dependant/dependent/
Signed-off-by: Lance Richardson <lrichard@redhat.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
| |
The MurmurHash code repo has moved from code.google to github. Update
the link to reflect this.
Signed-off-by: Cian Ferriter <cian.ferriter@intel.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch updates the following commands in the vswitch:
ovs-vsctl add-br br-test
ovs-vsctl del-br br-test
ovs-vsctl add-br br-test:
This command will now create an internal port on the MSFT virtual switch
using the WMI interface from Msvm_VirtualEthernetSwitchManagementService
leveraging the method AddResourceSettings.
Before creating the actual port, the switch will be queried to see if there
is not a port already created (good for restarts when restarting the
vswitch daemon). If there is a port defined it will return success and log
a message.
After checking if the port already exists the command will also verify
if the forwarding extension (windows datapath) is enabled and on a single
switch. If it is not activated or if it is activated on multiple switches
it will return an error and a message will be logged.
After the port was created on the switch, we will disable the adapter on
the host and rename to the corresponding OVS bridge name for consistency.
The user will enable and set the values he wants after creation.
ovs-vsctl del-br br-test
This command will remove an internal port on the MSFT virtual switch
using the Msvm_VirtualEthernetSwitchManagementService class and executing
the method RemoveResourceSettings.
Both commands will be blocking until the WMI job is finished, this allows us
to guarantee that the ports are created and their name are set before issuing
a netlink message to the windows datapath.
This patch also includes helpers for normal WMI retrievals and initializations.
Appveyor and documentation has been modified to include the libraries needed
for COM objects.
This patch was tested individually using IMallocSpy and CRT heap checks
to ensure no new memory leaks are introduced.
Tested on the following OS's:
Windows 2012, Windows 2012r2, Windows 2016
Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Acked-by: Paul Boca <pboca@cloudbasesolutions.com>
Acked-by: Sairam Venugopal <vsairam@vmware.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Checking for ERROR_INSUFFICIENT_BUFFER is incorrect per
MSFT documentation:
https://msdn.microsoft.com/en-us/library/windows/desktop/aa365915(v=vs.85).aspx
Also, the initial call to GetAdaptersAddresses was wrong. In the case
of a successful return 'all_addr' was not allocated leading to a crash.
Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
Reported-by: Lior Baram <lior.baram@hpe.com>
Acked-by: Sairam Venugopal <vsairam@vmware.com>
Signed-off-by: Gurucharan Shetty <guru@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Most users of OVSDB react to whatever is currently in their view of the
database, as opposed to keeping track of changes and reacting to those
changes individually. The interface to conditional monitoring was
different, in that it expected the client to say what to add or remove from
monitoring instead of what to monitor. This seemed reasonable at the time,
but in practice it turns out that the usual approach actually works better,
because the condition is generally a function of the data visible in the
database. This commit changes the approach.
This commit also changes the meaning of an empty condition for a table.
Previously, an empty condition meant to replicate every row. Now, an empty
condition means to replicate no rows. This is more convenient for code
that gradually constructs conditions, because it does not need special
cases for replicating nothing.
This commit also changes the internal implementation of conditions from
linked lists to arrays. I just couldn't see an advantage to using linked
lists.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Liran Schour <lirans@il.ibm.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds OpenFlow clone action with syntax as below:
"clone([action][,action...])". The clone() action makes a copy of the
current packet and executes the list of actions against the packet,
without affecting the packet after the "clone(...)" action. In other
word, the packet before the clone() and after the clone() is the same,
no matter what actions executed inside the clone().
Use case 1:
Set different fields and output to different ports without unset
actions=
clone(mod_dl_src:<mac1>, output:1), clone(mod_dl_dst:<mac2>, output:2), output:3
Since each clone() has independent packet, output:1 has only dl_src modified,
output:2 has only dl_dst modified, output:3 has original packet.
Similar to case1
actions=
push_vlan(...), output:2, pop_vlan, push_vlan(...), output:3
can be changed to
actions=
clone(push_vlan(...), output:2),clone(push_vlan(...), output:3)
without having to add pop_vlan.
case 2: resubmit to another table without worrying packet being modified
actions=clone(resubmit(1,2)), ...
Signed-off-by: William Tu <u9012063@gmail.com>
[blp@ovn.org revised this to omit the "sample" action]
Signed-off-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
| |
A few Open vSwitch extension actions have no fixed arguments but do have
variable-length options that follow the header, and an upcoming commit will
add another such action. There is little value in having individual
structures for these actions, since they all have the same form, so this
commit makes all of them use the existing struct ext_action_header.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Mickey Spiegel <mickeys.dev@gmail.com>
|
|
|
|
|
|
|
|
| |
The 'tc' member of struct ovsdb_idl_condition was written but never read,
so remove it.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Mickey Spiegel <mickeys.dev@gmail.com>
|