path: root/lib
Commit message | Author | Age | Files | Lines
* dpif: Simplify dpif_execute_helper_cb() | Andy Zhou | 2017-01-12 | 1 | -19/+12
| | | | | | | | | | The may_steal flag is now used, so remove OVS_UNUSED. Since dp_packet_delete() handles the NULL pointer properly, we can drop a few tracking variables, and make the code easier to follow. Signed-off-by: Andy Zhou <azhou@ovn.org> Acked-by: Jarno Rajahalme <jarno@ovn.org>
* netdev-vport: Do not log empty warnings on success. | Daniele Di Proietto | 2017-01-12 | 1 | -4/+6
| | | | | | | | | | | | set_tunnel_config() always logs a warning, even on success. This shouldn't happen. Without this, some unit tests fail. Fixes: 9fff138ec3a6("netdev: Add 'errp' to set_config().") Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Antonio Fischetti <antonio.fischetti@intel.com> Acked-by: Ben Pfaff <blp@ovn.org>
* ofproto-dpif: Make ofproto/trace output easier to read. | Ben Pfaff | 2017-01-12 | 2 | -13/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | "ovs-appctl ofproto/trace" is invaluable for debugging, but as the users of Open vSwitch have evolved it has failed to keep up with the times. It's pretty easy to design OpenFlow tables and pipelines that resubmit dozens of times. Each resubmit causes an additional tab of indentation, so the output wraps around, sometimes again and again, and makes the output close to unreadable. ovn-trace pioneered better formatting for tracing in OVN logical datapaths, mostly by not increasing indentation for tail recursion, which in practice gets rid of almost all indentation. This commit experiments with redoing ofproto/trace the same way. Try looking at, for example, the testsuite output for test 2282 "ovn -- 3 HVs, 3 LRs connected via LS, source IP based routes". Without this commit, it indents 61 levels (488 spaces!). With this commit, it indents 1 level (4 spaces) and it's possible to actually understand what's going on almost at a glance. To see this for yourself, try the following command either with or without this commit (but be sure to keep the change to ovn.at that adds an ofproto/trace to the test): make check TESTSUITEFLAGS='-d 2282' && less tests/testsuite.dir/2282/testsuite.log Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Lance Richardson <lrichard@redhat.com> Acked-by: Justin Pettit <jpettit@ovn.org>
* netdev: Add 'errp' to set_config(). | Daniele Di Proietto | 2017-01-11 | 5 | -41/+84
| Since 55e075e65ef9("netdev-dpdk: Arbitrary 'dpdk' port naming"), set_config() is used to identify a DPDK device, so it's better to report its detailed error message to the user. Tunnel devices and patch ports rely a lot on set_config() as well.
|
| This commit adds a param to set_config() that can be used to return an error message and makes use of that in netdev-dpdk and netdev-vport.
|
| Before this patch:
|
| $ ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
| ovs-vsctl: Error detected while setting up 'dpdk0': dpdk0: could not set configuration (Invalid argument). See ovs-vswitchd log for details.
| ovs-vsctl: The default log directory is "/var/log/openvswitch/".
|
| $ ovs-vsctl add-port br0 p+ -- set Interface p+ type=patch
| ovs-vsctl: Error detected while setting up 'p+': p+: could not set configuration (Invalid argument). See ovs-vswitchd log for details.
| ovs-vsctl: The default log directory is "/var/log/openvswitch/".
|
| $ ovs-vsctl add-port br0 gnv0 -- set Interface gnv0 type=geneve
| ovs-vsctl: Error detected while setting up 'gnv0': gnv0: could not set configuration (Invalid argument). See ovs-vswitchd log for details.
| ovs-vsctl: The default log directory is "/var/log/openvswitch/".
|
| After this patch:
|
| $ ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
| ovs-vsctl: Error detected while setting up 'dpdk0': 'dpdk0' is missing 'options:dpdk-devargs'. The old 'dpdk<port_id>' names are not supported. See ovs-vswitchd log for details.
| ovs-vsctl: The default log directory is "/var/log/openvswitch/".
|
| $ ovs-vsctl add-port br0 p+ -- set Interface p+ type=patch
| ovs-vsctl: Error detected while setting up 'p+': p+: patch type requires valid 'peer' argument. See ovs-vswitchd log for details.
| ovs-vsctl: The default log directory is "/var/log/openvswitch/".
|
| $ ovs-vsctl add-port br0 gnv0 -- set Interface gnv0 type=geneve
| ovs-vsctl: Error detected while setting up 'gnv0': gnv0: geneve type requires valid 'remote_ip' argument. See ovs-vswitchd log for details.
| ovs-vsctl: The default log directory is "/var/log/openvswitch/".
|
| CC: Ciara Loftus <ciara.loftus@intel.com>
| CC: Kevin Traynor <ktraynor@redhat.com>
| Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
| Acked-by: Kevin Traynor <ktraynor@redhat.com>
| Tested-by: Ciara Loftus <ciara.loftus@intel.com>
* netdev-dpdk: Assign socket id according to device's numa id | xu.binbin1@zte.com.cn | 2017-01-11 | 1 | -2/+6
| | | | | | | | | | | | | | | We can now hotplug attach DPDK ports specified via the 'dpdk-devargs' option. But the socket id of DPDK ports can't be assigned correctly; it is always 0. The socket id of DPDK ports should be assigned according to the numa id of the device. Fixes: 55e075e65ef9e ("netdev-dpdk: Arbitrary 'dpdk' port naming") Signed-off-by: Binbin Xu <xu.binbin1@zte.com.cn> Acked-by: Ciara Loftus <ciara.loftus@intel.com> Acked-by: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
* netdev-dummy: Limits the number of tx/rx queues. | nickcooper-zhangtonghao | 2017-01-10 | 1 | -0/+17
| | | | | | | | | | This patch prevents ovs_rcu from reporting a WARN, caused by being blocked for a long time, when ovs-vswitchd processes a port with many rx/tx queues. The chosen limit on the number of tx/rx queues per port should be appropriate, because DPDK uses it as a default max value. Signed-off-by: nickcooper-zhangtonghao <nic@opencloud.tech> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
* dpdk: Late initialization. | Daniele Di Proietto | 2017-01-10 | 2 | -14/+25
| | | | | | | | | | | | | | | | | | With this commit, we allow the user to set other_config:dpdk-init=true after the process is started. This makes it easier to start Open vSwitch with DPDK using standard init scripts without restarting the service. This is still far from ideal, because initializing DPDK might still abort the process (e.g. if there is not enough memory), so the user must check the status of the process after setting dpdk-init to true. Nonetheless, I think this is an improvement, because it doesn't require restarting the whole unit. CC: Aaron Conole <aconole@redhat.com> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Aaron Conole <aconole@redhat.com>
* dpcls: Avoid one 8-byte chunk in subtable mask. | Jarno Rajahalme | 2017-01-10 | 1 | -2/+18
| | | | | | | | | | | | | | | | | | | | | | This patch allows skipping the 8-byte chunk comprising dp_hash and in_port in the subtable mask when dp_hash is wildcarded. This will slightly speed up the hash computation as one expensive function call to hash_add64() can be skipped. For each new netdev flow we wildcard in_port in the mask, so in the typical case where dp_hash is also wildcarded, the resulting 8-byte chunk will not be part of the subtable mask. This manipulation of the mask is possible as the datapath classifier is explicitly selected based on the in_port value, so that all the datapath flows in the selected classifier have an exact match on that in_port value. Given this, it is safe to ignore the in_port value when doing a lookup in the chosen classifier. Signed-off-by: Antonio Fischetti <antonio.fischetti@intel.com> Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com> Co-authored-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com> Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Co-authored-by: Jarno Rajahalme <jarno@ovn.org>
* New action "ct_clear". | Ben Pfaff | 2017-01-10 | 1 | -1/+42
| | | | | | | | | | | | | | This is being introduced specifically to allow a user of the "clone" action to clear the connection tracking state, but it's implemented as a separate action as a matter of clean design and in case another use case arises later. Reported-by: Mickey Spiegel <mickeys.dev@gmail.com> Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2017-January/326981.html Fixes: 7ae62a676d3a ("ofp-actions: Add clone action.") Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Mickey Spiegel <mickeys.dev@gmail.com> Tested-by: Dong Jun <dongj@dtdream.com>
* ovsdb-idl: Enhance conditional monitoring API | Andy Zhou | 2017-01-09 | 2 | -6/+45
| | | | | | | | | Allow the client to know when the conditional monitoring changes have been accepted by the OVSDB server and the 'idl' contents have been updated to match the new conditions. Signed-off-by: Andy Zhou <azhou@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
* ovsdb-idl: Properly handle conditional monitor update error | andy zhou | 2017-01-09 | 1 | -8/+26
| | | | | | | | | | | | | | | | | | When generating a conditional monitoring update request, the current code failed to update the idl's 'request-id'. This bug causes the reply message of the update request, regardless of whether it is an ACK or a NACK, to be logged as an unexpected message at the debug level and ignored by the core idl logic. In addition, the idl should not generate another conditional monitoring update request when there is an outstanding request, so that the requests and their replies are properly serialized. When the conditional monitoring is nacked by the server, drop idl into a client visible error state. Signed-off-by: Andy Zhou <azhou@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
* dpif-netdev: Uses the OVS_CORE_UNSPEC instead of magic numbers. | nickcooper-zhangtonghao | 2017-01-08 | 1 | -6/+6
| | | | | | | This patch uses OVS_CORE_UNSPEC instead of "-1" for an unpinned queue. More importantly, "-1" cast to unsigned int is equal to NON_PMD_CORE_ID, so this patch makes the distinction between them. Signed-off-by: nickcooper-zhangtonghao <nic@opencloud.tech> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
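The collision described above can be illustrated with a short sketch. It assumes that NON_PMD_CORE_ID is UINT32_MAX and that OVS_CORE_UNSPEC is INT_MAX; neither value is stated in the commit message, so both are assumptions, and the helper names are hypothetical.

```python
# Assumed values; they are not given in the commit message.
UINT32_MAX = 0xFFFFFFFF
INT_MAX = 0x7FFFFFFF
NON_PMD_CORE_ID = UINT32_MAX  # assumption
OVS_CORE_UNSPEC = INT_MAX     # assumption


def as_unsigned32(value):
    """Mimic a C cast of a signed int to a 32-bit unsigned int."""
    return value & UINT32_MAX


def collides_with_non_pmd(core_id):
    """Return True if core_id, cast to unsigned, equals NON_PMD_CORE_ID."""
    return as_unsigned32(core_id) == NON_PMD_CORE_ID
```

Under these assumptions, `collides_with_non_pmd(-1)` is true while `collides_with_non_pmd(OVS_CORE_UNSPEC)` is false, which is the distinction the patch is after.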
* netdev-dummy: Uses the NR_QUEUE instead of magic numbers. | nickcooper-zhangtonghao | 2017-01-08 | 1 | -2/+2
| | | | | | | NR_QUEUE is defined in "lib/dpif-netdev.h", and netdev-dpdk uses it instead of a magic number. netdev-dummy should do the same. Signed-off-by: nickcooper-zhangtonghao <nic@opencloud.tech> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
* netdev-dpdk: Fix formatting typo. | nickcooper-zhangtonghao | 2017-01-08 | 1 | -1/+1
| | | | | Signed-off-by: nickcooper-zhangtonghao <nic@opencloud.tech> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
* nx-match: Only store significant bytes to stack. | Jarno Rajahalme | 2017-01-06 | 4 | -52/+92
| | | | | | | | | | | Always storing the maximum mf_value size wastes about 120 bytes for each stack entry. This patch changes the stack from an mf_value array to a string of value-length pairs. The length is stored after the value so that the stack pop may first read the length and then the appropriate number of bytes. Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
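The layout described above (value bytes first, length last, so that a pop can read the length before the value) can be sketched as follows. This is only an illustration of the idea, not the nx-match code: the class name, the one-byte length field, and the `max_len` default are assumptions.

```python
class ByteStack:
    """Stack of variable-length values stored as <value bytes><length byte>."""

    def __init__(self):
        self._buf = bytearray()

    def push(self, value, max_len=128):
        # Keep only the significant bytes, but at least one so that
        # zero is still representable.
        n = max(1, (value.bit_length() + 7) // 8)
        if n > max_len:
            raise ValueError("value too wide for this sketch")
        self._buf += value.to_bytes(n, "big")
        self._buf.append(n)  # length goes last so pop() can read it first

    def pop(self):
        if not self._buf:
            raise IndexError("pop from empty stack")
        n = self._buf.pop()  # removes and returns the trailing length byte
        value = int.from_bytes(self._buf[-n:], "big")
        del self._buf[-n:]
        return value

    def size_in_bytes(self):
        return len(self._buf)
```

Pushing 0x1234, 0 and 0xabcdef occupies 9 bytes here, where a fixed-width array would spend the full maximum width on each entry.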
* ofp-util: Do not assert fail if decoding malformed property. | Jarno Rajahalme | 2017-01-06 | 1 | -2/+4
| | | | | | | | | OVS should not crash if the controller sends a malformed OpenFlow message. Return the error code instead. Suggested-by: Ben Pfaff <blp@ovn.org> Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
* dpif: Return ENODEV from dpif_port_query_by_*() if there's no port. | Daniele Di Proietto | 2017-01-06 | 3 | -6/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | bridge_delete_or_reconfigure() deletes every interface that's not dumped by OFPROTO_PORT_FOR_EACH(). ofproto_dpif.c:port_dump_next(), used by OFPROTO_PORT_FOR_EACH, checks if the ofport is in the datapath by calling port_query_by_name(). If port_query_by_name() returns an error, the dump is interrupted. If port_query_by_name() returns ENODEV, the device doesn't exist and the dump can continue. port_query_by_name() for the userspace datapath returns ENOENT instead of ENODEV. This is expected by dpif_port_query_by_name(), but it's not handled correctly by port_dump_next(). dpif-netdev handles reconfiguration errors for an interface by deleting it from the datapath, so it's possible that a device is missing. When this happens we must make sure that port_dump_next() continues to dump other devices, otherwise they will be deleted and the two layers will have an inconsistent view. This commit fixes the problem by returning ENODEV from the userspace datapath if the port doesn't exist, and by documenting this clearly in the dpif interfaces. The problem was found while developing new code. Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ben Pfaff <blp@ovn.org>
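The dump behaviour described above can be sketched as a loop that skips ports reported as ENODEV and stops on any other error. The function and the `query_by_name` callback are hypothetical stand-ins for illustration, not the ofproto-dpif interfaces.

```python
import errno


def dump_ports(names, query_by_name):
    """Collect the ports that exist in the datapath.

    query_by_name(name) returns 0 on success or a positive errno value.
    Returns (dumped_names, error); error is 0 if the dump ran to completion.
    """
    dumped = []
    for name in names:
        error = query_by_name(name)
        if error == 0:
            dumped.append(name)
        elif error == errno.ENODEV:
            continue  # the port is simply missing: keep dumping the rest
        else:
            return dumped, error  # any other error interrupts the dump
    return dumped, 0
```

With a datapath that reports a missing port as ENOENT, the loop stops early and every later port is left out of the dump, which is the situation the commit message warns about.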
* ovsdb-idl: Avoid sending redundant conditional monitoring updates | Andy Zhou | 2017-01-06 | 1 | -0/+2
| | | | | | | | | | | | If the connection is reset while there are buffered but unsent conditions, these conditions will be sent in the new "monitor_cond" message that is sent after the idl reconnects. Without this patch, those conditions would be unnecessarily sent again in the following monitoring condition update message. Signed-off-by: Andy Zhou <azhou@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
* netdev-dpdk: Add support for virtual DPDK PMDs (vdevs) | Ciara Loftus | 2017-01-05 | 1 | -24/+11
| Prior to this commit, the 'dpdk' port type could only be used for physical DPDK devices. Now, virtual devices (or 'vdevs') are supported. 'vdev' devices are those which use virtual DPDK Poll Mode Drivers, e.g. null, pcap.
|
| To add a DPDK vdev, a valid 'dpdk-devargs' must be set for the given dpdk port. The format expected is 'eth_<driver_name><x>' where 'x' is a number between 0 and RTE_MAX_ETHPORTS -1.
|
| For example, to add a port that uses the 'null' DPDK PMD driver:
|
| ovs-vsctl set Interface null0 options:dpdk-devargs=eth_null0
|
| Not all DPDK vdevs have been verified to work at this point in time.
|
| Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
| Acked-by: Stephen Finucane <stephen@that.guru> # docs only
| Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
* netdev-dpdk: Arbitrary 'dpdk' port naming | Ciara Loftus | 2017-01-05 | 1 | -56/+126
| 'dpdk' ports no longer have naming restrictions. Now, instead of specifying the dpdk port ID as part of the name, the PCI address of the device must be specified via the 'dpdk-devargs' option, e.g.
|
| ovs-vsctl add-port br0 my-port
| ovs-vsctl set Interface my-port type=dpdk options:dpdk-devargs=0000:06:00.3
|
| The user no longer needs to hotplug attach DPDK ports by issuing the specific ovs-appctl netdev-dpdk/attach command. The hotplug is now automatically invoked when a valid PCI address is set in the dpdk-devargs. The format for the ovs-appctl netdev-dpdk/detach command has changed in that the user now must specify the relevant PCI address as input instead of the port name.
|
| Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
| Signed-off-by: Kevin Traynor <ktraynor@redhat.com>
| Co-authored-by: Kevin Traynor <ktraynor@redhat.com>
| Acked-by: Stephen Finucane <stephen@that.guru> # docs only
| Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
* netdev-dpdk: add hotplug support | Mauricio Vásquez | 2017-01-05 | 1 | -6/+89
| | | | | | | | | | | | | | | | | | | In order to use dpdk ports in ovs they have to be bound to a DPDK compatible driver before ovs is started. This patch adds the possibility to hotplug (or hot-unplug) a device after ovs has been started. The implementation adds two appctl commands: netdev-dpdk/attach and netdev-dpdk/detach. After the user attaches a new device, it has to be added to a bridge using the add-port command. Similarly, before detaching a device, it has to be removed using the del-port command. Signed-off-by: Mauricio Vasquez B <mauricio.vasquezbernal@studenti.polito.it> Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Co-authored-by: Ciara Loftus <ciara.loftus@intel.com> Acked-by: Stephen Finucane <stephen@that.guru> # docs only Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
* flow: Fix small typo in comment describing flow_compose(). | Justin Pettit | 2017-01-05 | 1 | -1/+1
| | | | | Signed-off-by: Justin Pettit <jpettit@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
* ovsdb-data: Add support for integer ranges in database commands | Lukasz Rzasik | 2017-01-05 | 6 | -61/+199
| | | | | | | | | | | | | | | | Adding / removing a range of integers to a column accepting a set of integers requires enumerating all of the integers. This patch simplifies it by introducing a 'range' concept to the database commands. Two integers separated by a hyphen represent an inclusive range. The patch adds positive and negative tests for the new syntax. The patch was tested by 'make check'. Coverage was tested by 'make check-lcov'. Signed-off-by: Lukasz Rzasik <lukasz.rzasik@gmail.com> Suggested-by: <my_ovs_discuss@yahoo.com> Suggested-by: Ben Pfaff <blp@ovn.org> Signed-off-by: Ben Pfaff <blp@ovn.org>
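The "two integers separated by a hyphen" rule can be sketched as a small parser. This is an illustration under assumptions, not the ovsdb-data implementation: the function name is hypothetical, and rejecting a descending range is a choice made for this sketch rather than documented behaviour.

```python
import re

# One integer, optionally followed by a hyphen and a second integer.
_RANGE_RE = re.compile(r"^(-?\d+)(?:-(-?\d+))?$")


def expand_integer_token(token):
    """Expand '5' to [5] and '1-3' to [1, 2, 3] (inclusive range)."""
    match = _RANGE_RE.match(token.strip())
    if match is None:
        raise ValueError("not an integer or integer range: %r" % token)
    start = int(match.group(1))
    end = int(match.group(2)) if match.group(2) is not None else start
    if end < start:
        # Assumption made for this sketch only.
        raise ValueError("descending range: %r" % token)
    return list(range(start, end + 1))
```

For instance, `expand_integer_token("1-3")` yields the three integers that previously had to be listed one by one.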
* ofproto: Fix crash on flow monitor request with tun_metadata. | Daniele Di Proietto | 2017-01-04 | 2 | -6/+12
| nx_put_match() needs a non-NULL tunnel metadata table, otherwise it will crash if a flow matches on tunnel metadata. This wasn't handled in ofputil_append_flow_update(), causing a crash when the controller sent a flow monitor request.
|
| To fix the problem, this commit changes ofputil_append_flow_update() to behave like ofputil_append_flow_stats_reply(). Since ofputil_append_flow_update() now needs to temporarily modify the match, this commit also embeds 'struct match' into 'struct ofputil_flow_update', to be safer. This is more similar to 'struct ofputil_flow_stats'.
|
| A regression test is added and a comment is updated in ovs-ofctl.c
|
| #0 0x000055699bd82fa0 in memcpy_from_metadata (dst=0x7ffc770930d0, src=0x7ffc77093698, loc=0x18) at ../lib/tun-metadata.c:451
| #1 0x000055699bd83c2e in metadata_loc_from_match_read (map=0x0, match=0x7ffc77093410, idx=0, mask=0x7ffc77093658, is_masked=0x7ffc77093287) at ../lib/tun-metadata.c:848
| #2 0x000055699bd83d9b in tun_metadata_to_nx_match (b=0x55699d3f0300, oxm=0, match=0x7ffc77093410) at ../lib/tun-metadata.c:871
| #3 0x000055699bce523d in nx_put_raw (b=0x55699d3f0300, oxm=0, match=0x7ffc77093410, cookie=0, cookie_mask=0) at ../lib/nx-match.c:1052
| #4 0x000055699bce5580 in nx_put_match (b=0x55699d3f0300, match=0x7ffc77093410, cookie=0, cookie_mask=0) at ../lib/nx-match.c:1116
| #5 0x000055699bd3926f in ofputil_append_flow_update (update=0x7ffc770940b0, replies=0x7ffc77094e00) at ../lib/ofp-util.c:6805
| #6 0x000055699bc4b5a9 in ofproto_compose_flow_refresh_update (rule=0x55699d405b40, flags=(NXFMF_INITIAL | NXFMF_ACTIONS), msgs=0x7ffc77094e00) at ../ofproto/ofproto.c:5915
| #7 0x000055699bc4b5f6 in ofmonitor_compose_refresh_updates (rules=0x7ffc77094e10, msgs=0x7ffc77094e00) at ../ofproto/ofproto.c:5929
| #8 0x000055699bc4bafc in handle_flow_monitor_request (ofconn=0x55699d404090, oh=0x55699d404220) at ../ofproto/ofproto.c:6082
| #9 0x000055699bc4f46d in handle_openflow__ (ofconn=0x55699d404090, msg=0x55699d404910) at ../ofproto/ofproto.c:7912
| #10 0x000055699bc4f5df in handle_openflow (ofconn=0x55699d404090, ofp_msg=0x55699d404910) at ../ofproto/ofproto.c:8002
| #11 0x000055699bc88154 in ofconn_run (ofconn=0x55699d404090, handle_openflow=0x55699bc4f5bc <handle_openflow>) at ../ofproto/connmgr.c:1427
| #12 0x000055699bc85934 in connmgr_run (mgr=0x55699d3adb90, handle_openflow=0x55699bc4f5bc <handle_openflow>) at ../ofproto/connmgr.c:363
| #13 0x000055699bc422c9 in ofproto_run (p=0x55699d3c85e0) at ../ofproto/ofproto.c:1798
| #14 0x000055699bc31ec6 in bridge_run__ () at ../vswitchd/bridge.c:2881
| #15 0x000055699bc320a6 in bridge_run () at ../vswitchd/bridge.c:2938
| #16 0x000055699bc3784e in main (argc=10, argv=0x7ffc770952c8) at ../vswitchd/ovs-vswitchd.c:111
|
| Fixes: 8d8ab6c2d574 ("tun-metadata: Manage tunnel TLV mapping table on a per-bridge basis.")
| Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
| Acked-by: Ben Pfaff <blp@ovn.org>
* netdev-dpdk: Rename ivshmem structures. | Kevin Traynor | 2017-01-04 | 1 | -20/+20
| | | | | | | | | Rename some structures that call themselves ivshmem, as they are just a collection of dpdk rings and other information. Signed-off-by: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
* odp: Use struct in6_addr for IPv6 addresses. | Jarno Rajahalme | 2017-01-04 | 6 | -80/+70
| | | | | | | | | Code is simplified when the ODP keys use the same type as the struct flow for the IPv6 addresses. As the change is facilitated by extract-odp-netlink-h, this change only affects the userspace. We already do the same for the ethernet addresses. Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
* ofp-parse: Allow match field names in actions and brackets in matches. | Jarno Rajahalme | 2017-01-04 | 5 | -111/+188
| Allow using match field names in addition to the canonical register names in actions (including 'load', 'move', 'push', 'pop', 'output', 'multipath', 'bundle_load', and 'learn'). Allow also leaving out the trailing '[]' to indicate full field. These changes allow simpler syntax similar to 'set_field' to be used also elsewhere.
|
| Correspondingly, allow the '[start..end]' syntax to be used in matches in addition to the more explicit 'value/mask' notation. For example, to match on the value 2 of the bits 14..15 of NXM_NX_REG0, the match could include:
|
| ... reg0[14..15]=2 ...
|
| instead of
|
| ... reg0=0x8000/0xc000 ...
|
| Note that only contiguous masks can be specified with the bracket notation.
|
| Signed-off-by: Jarno Rajahalme <jarno@ovn.org>
| Acked-by: Ben Pfaff <blp@ovn.org>
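The equivalence between the bracket notation and 'value/mask' in the example above is plain bit arithmetic, sketched here. The function name is hypothetical; it only reproduces the conversion, not the ofp-parse code.

```python
def bits_to_value_mask(start, end, value):
    """Convert 'field[start..end]=value' into a (value, mask) pair."""
    if start < 0 or end < start:
        raise ValueError("invalid bit range")
    width = end - start + 1
    if value < 0 or value >> width:
        raise ValueError("value does not fit in %d bits" % width)
    mask = ((1 << width) - 1) << start  # contiguous run of 'width' one bits
    return value << start, mask
```

For bits 14..15 and value 2 this gives (0x8000, 0xc000), matching the 'reg0=0x8000/0xc000' form. Because the mask is always a single run of one bits, the sketch also shows why only contiguous masks can be written this way.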
* ovs-thread: Avoid pthread_rwlockattr_t on Windows. | Alin Serdean | 2017-01-04 | 1 | -4/+10
| A recent commit fixed ovs_rwlock_init() to pass the pthread_rwlockattr_t that it initialized to pthread_rwlock_init(). According to POSIX documentation this is correct, but on Windows the current implementation of pthreads does not support a pre-initialized attribute.
|
| Please see a fork of the implementation
| https://github.com/GerHobbelt/pthread-win32/blob/19fd5054b29af1b4e3b3278bfffbb6274c6c89f5/pthread_rwlock_init.c#L59-L63
| (This is the same implementation as the official version found under: ftp://sourceware.org/pub/pthreads-win32/)
|
| A short debug output from `vswitch` to confirm the above:
|
| >k
| Index Function
| --------------------------------------------------------------------------------
| *1 ovs-vswitchd.exe!ovs_rwlock_init(const ovs_rwlock * l_=0x000001721c7da250)
| 2 ovs-vswitchd.exe!open_dpif_backer(const char * type=0x000001721c7d8d60, dpif_backer * * backerp=0x000001721c7d89c0)
| 3 ovs-vswitchd.exe!construct(ofproto * ofproto_=0x000001721c7d87d0)
| 4 ovs-vswitchd.exe!ofproto_create(const char * datapath_name=0x000001721c7d86e0, const char * datapath_type=0x000001721c7d8750, ofproto * * ofprotop=0x000001721c7d80b8)
| 5 ovs-vswitchd.exe!bridge_reconfigure(const ovsrec_open_vswitch * ovs_cfg=0x000001721c7e05b0)
| 6 ovs-vswitchd.exe!bridge_run()
| 7 ovs-vswitchd.exe!main(int argc=6, char * * argv=0x000001721c729e10)
| 8 [External Code]
| >? error
| 22
|
| https://github.com/openvswitch/ovs/blob/master/lib/ovs-thread.c#L243
|
| This patch is critical because the majority (over 800) of the unit tests are failing.
|
| Fixes: 1a15f390afd6 ("lib/ovs-thread: set prefer writer lock for ovs_rwlock_init()")
| Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
| Acked-by: Shashank Ram <rams@vmware.com>
| [blp@ovn.org changed the details of the approach]
| Signed-off-by: Ben Pfaff <blp@ovn.org>
* netdev-dpdk: Enable Rx checksum offloading feature on DPDK physical ports. | Sugesh Chandran | 2017-01-04 | 4 | -15/+105
| Add Rx checksum offloading feature support on DPDK physical ports. By default, the Rx checksum offloading is enabled if the NIC supports it. However, the checksum offloading can be turned OFF either while adding a new DPDK physical port to OVS or at runtime.
|
| The rx checksum offloading can be turned off by setting the parameter to 'false'. For example, to disable the rx checksum offloading when adding a port,
|
| 'ovs-vsctl add-port br0 dpdk0 -- \
| set Interface dpdk0 type=dpdk options:rx-checksum-offload=false'
|
| OR (to disable at run time after the port has been added to OVS)
|
| 'ovs-vsctl set Interface dpdk0 options:rx-checksum-offload=false'
|
| Similarly, to turn ON rx checksum offloading at run time,
|
| 'ovs-vsctl set Interface dpdk0 options:rx-checksum-offload=true'
|
| The Tx checksum offloading support is not implemented due to the following reasons.
|
| 1) Checksum offloading and vectorization are mutually exclusive in the DPDK poll mode driver. Vector packet processing is turned OFF when checksum offloading is enabled, which causes a significant performance drop at the Tx side.
|
| 2) Normally, OVS generates the checksum for tunnel packets in software at the 'tunnel push' operation, where the tunnel headers are created. However, enabling Tx checksum offloading involves:
|
| *) Mark every packet for tx checksum offloading at 'tunnel_push' and recirculate.
| *) At the time of xmit, validate the same flag and instruct the NIC to do the checksum calculation. In case the NIC doesn't support Tx checksum offloading, the checksum calculation has to be done in software before sending out the packets.
|
| No significant performance improvement was noticed with Tx checksum offloading due to the overhead of additional validations + non vector packet processing. In some test scenarios, it introduces a performance drop too.
|
| Rx checksum offloading still offers 8-9% of improvement on VxLAN tunneling decapsulation even though the SSE vector Rx function is disabled in the DPDK poll mode driver.
|
| Signed-off-by: Sugesh Chandran <sugesh.chandran@intel.com>
| Acked-by: Jesse Gross <jesse@kernel.org>
| Acked-by: Pravin B Shelar <pshelar@ovn.org>
* lib: Add support for tftp ct helper. | Joe Stringer | 2017-01-03 | 2 | -3/+17
| | | | | | | | | The kernel datapath provides support for TFTP helpers, so add support for this ALG to the commandline and OpenFlow encoding/decoding. Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ben Pfaff <blp@ovn.org>
* ovn-trace: New --ovs option to also print OpenFlow flows. | Ben Pfaff | 2016-12-28 | 1 | -1/+119
| | | | | | | | | | | Sometimes seeing the OpenFlow flows that back a given logical flow can provide additional insight. This commit adds a new --ovs option to ovn-trace that makes it connect to Open vSwitch over OpenFlow and retrieve and print the OpenFlow flows behind each logical flow encountered during a trace. Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Justin Pettit <jpettit@ovn.org>
* ovn-controller: Tie OpenFlow and logical flows using OpenFlow cookie. | Ben Pfaff | 2016-12-28 | 2 | -2/+23
| | | | | | | | | | | This makes it easy to find the logical flow that generated a particular OpenFlow flow, by running "ovn-sbctl dump-flows <cookie>". Later, this can be refined (and automated for "ofproto/trace"), but this is still a significant advance. Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Justin Pettit <jpettit@ovn.org>
* conntrack: Do not create new connections from ICMP errors. | Daniele Di Proietto | 2016-12-23 | 1 | -1/+5
| ICMP error packets (e.g. destination unreachable messages) are considered 'related' to another connection and are treated as part of that. However:
|
| * We shouldn't create new entries in the connection table if the original connection is not found. This is consistent with what the kernel does.
| * We certainly shouldn't call valid_new() on the packet, because valid_new() assumes the packet l4 type (might be TCP, UDP or ICMP) to be consistent with the conn_key nw_proto type.
|
| Found by inspection.
|
| Fixes: a489b16854b5("conntrack: New userspace connection tracker.")
| Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
| Acked-by: Darrell Ball <dlu998@gmail.com>
* packets: Simplify packet_csum_pseudoheader6(). | Ben Pfaff | 2016-12-23 | 1 | -10/+2
| | | | | | | | It's simpler to make two calls than eight. Reported-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Andy Zhou <azhou@ovn.org>
* rconn: Avoid abort for ill-behaved remote. | Ben Pfaff | 2016-12-23 | 1 | -6/+10
| If an rconn peer fails to send a hello message, the version number doesn't get set. Later, if the peer delays long enough, the rconn attempts to send an echo request but assert-fails instead because it doesn't know what version to use. This fixes the problem.
|
| To reproduce this problem:
|
| make sandbox
| ovs-vsctl add-br br0
| ovs-vsctl set-controller br0 ptcp:12345
| nc 127.0.0.1 12345
|
| and wait 10 seconds for ovs-vswitchd to die. (Then exit the sandbox.)
|
| Reported-by: 张东亚 <fortitude.zhang@gmail.com>
| Signed-off-by: Ben Pfaff <blp@ovn.org>
| Acked-by: Justin Pettit <jpettit@ovn.org>
* lacp: Select a may-enable IF as the lead IF | Ben Pfaff | 2016-12-23 | 1 | -1/+8
| | | | | | | | | | | | | | | | | | | | | | | | A reboot of one switch in an MC-LAG bond makes all bond links go down, causing a total connectivity loss for 3 seconds. Packet capture shows that spurious LACP PDUs are sent to OVS with a different MAC address (partner system id) during the final stages of the MC-LAG switch reboot. The current code selects a lead interface based on information in the LACP PDU, regardless of its synchronization state. If a non-synchronized interface is selected as the OVS lead interface then all other interfaces are forced down as their stored partner system id differs and the bond ends up with no working interface. The bond recovers within three seconds after the last spurious message. To avoid the problem, this commit requires a lead interface to be synchronized. In case no synchronized interface exists, the selection of lead interface is done as in the current code. Signed-off-by: Torgny Lindberg <torgny.lindberg@ericsson.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
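The selection rule described above (prefer a synchronized interface, otherwise fall back to the old behaviour) can be sketched as follows. The `Member` type, the "lower priority value wins" ordering, and the function name are assumptions made for illustration; the commit message does not describe how candidates are ranked.

```python
from dataclasses import dataclass


@dataclass
class Member:
    name: str
    priority: int     # assumption: a lower value is preferred
    may_enable: bool  # True when the interface is synchronized


def select_lead(members):
    """Pick the lead among synchronized members, else among all members."""
    if not members:
        return None
    synchronized = [m for m in members if m.may_enable]
    candidates = synchronized or members  # fall back when none is synchronized
    return min(candidates, key=lambda m: m.priority)
```

An unsynchronized member with the best priority no longer becomes the lead as long as at least one synchronized member exists.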
* odp-execute: Optimize IP header modification in OVS datapath | Zoltán Balogh | 2016-12-22 | 3 | -7/+48
| I measured the packet processing cost of OVS DPDK datapath for different OpenFlow actions. I configured OVS to use a single pmd thread and measured the packet throughput in a phy-to-phy setup. I used 10G interfaces bound to the DPDK driver and overloaded the vSwitch with 64 byte packets through one of the 10G interfaces.
|
| The processing cost of the dec_ttl action seemed to be gratuitously high compared with other actions.
|
| I looked into the code and saw that dec_ttl is encoded as a masked nested attribute in OVS_ACTION_ATTR_SET_MASKED(OVS_KEY_ATTR_IPV4). That way, OVS datapath can modify several IP header fields (TTL, TOS, source and destination IP addresses) by a single invocation of packet_set_ipv4() in the odp_set_ipv4() function in the lib/odp-execute.c file. The packet_set_ipv4() function takes the new TOS, TTL and IP addresses as arguments, compares them with the actual ones and updates the fields if needed. This means that even if only TTL needs to be updated, each of the four IP header fields is passed to the callee and is compared to the actual field for each packet.
|
| The odp_set_ipv4() caller function possesses information about the fields that need to be updated in the 'mask' structure. The idea is to spare invocation of the packet_set_ipv4() function but use its code parts directly. So the 'mask' can be used to decide which IP header fields need to be updated. In addition, a faster packet processing can be achieved if the values of local variables are calculated right before their usage.
|
|        | T | T | I | I |               ||
|        | T | O | P | P |  Vanilla OVS  || + new patch
|        | L | S | s | d | (nsec/packet) || (nsec/packet)
| -------+---+---+---+---+---------------++---------------
| output |   |   |   |   |         67.19 ||         67.19
|        | X |   |   |   |         74.48 ||         68.78
|        |   | X |   |   |         74.42 ||         70.07
|        |   |   | X |   |         84.62 ||         78.03
|        |   |   |   | X |         84.25 ||         77.94
|        |   |   | X | X |         97.46 ||         91.86
|        | X |   | X | X |        100.42 ||         96.00
|        | X | X | X | X |        102.80 ||        100.73
|
| The table shows the average processing cost of packets in nanoseconds for the following actions: output; output + dec_ttl; output + mod_nw_tos; output + mod_nw_src; output + mod_nw_dst and some of their combinations. I ran each test five times. The values are the mean of the readings obtained.
|
| I added OVS_LIKELY to the 'if' condition for the TTL field, since as far as I know, this field will typically be decremented when any field of the IP header is modified.
|
| Signed-off-by: Zoltán Balogh <zoltan.balogh@ericsson.com>
| Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
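The idea of letting the mask decide which fields to touch can be sketched as below. This is a simplified model, not the odp_set_ipv4() code: the `Ipv4Fields` type and the function name are hypothetical, and checksum updates, which a real datapath must also perform, are left out.

```python
from dataclasses import dataclass


@dataclass
class Ipv4Fields:
    src: int = 0
    dst: int = 0
    tos: int = 0
    ttl: int = 0


def set_ipv4_masked(hdr, key, mask):
    """Update only the header fields whose mask is nonzero.

    Returns the names of the fields that were written, so a caller can
    see that untouched fields were never read or compared.
    """
    written = []
    for name in ("src", "dst", "tos", "ttl"):
        m = getattr(mask, name)
        if m:  # fields with an all-zero mask are skipped entirely
            old = getattr(hdr, name)
            setattr(hdr, name, (getattr(key, name) & m) | (old & ~m))
            written.append(name)
    return written
```

With a mask that covers only `ttl`, a dec_ttl-style update writes one field instead of passing all four to a generic setter.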
* lib/ovs-thread: set prefer writer lock for ovs_rwlock_init()  zangchuanqiang  2016-12-22  1  -1/+1
| An alternative "writer nonrecursive" rwlock allows recursive read-locks
| to succeed only if there are no threads waiting for the write-lock.
|
| ovs_rwlock_init() has a problem: the 'attr' object is never used to set
| the attributes of the ovs_rwlock 'l_', because l->lock is initialized
| with pthread_rwlock_init(&l->lock, NULL). The attr object needs to be
| passed to the pthread_rwlock_init() call in order to make use of it.
|
| Signed-off-by: zangchuanqiang <zangchuanqiang@huawei.com>
| Signed-off-by: Ben Pfaff <blp@ovn.org>
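As a rough sketch of the fix described in this commit, the attribute object has to reach pthread_rwlock_init() for the "writer nonrecursive" kind to take effect. The helper name `rwlock_init_prefer_writer()` is hypothetical and this is not the OVS function. The kind is set only where the glibc extension appears to be available, which is an assumption about the build environment.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* Requests the glibc *_NP rwlock extensions. */
#endif
#include <assert.h>
#include <pthread.h>

/* Hypothetical helper: initializes 'lock' so that waiting writers are
 * preferred over new readers, where the platform supports it.
 * Returns 0 on success or a positive errno value. */
static int rwlock_init_prefer_writer(pthread_rwlock_t *lock)
{
    pthread_rwlockattr_t attr;
    int error;

    assert(lock != NULL);

    error = pthread_rwlockattr_init(&attr);
    if (error) {
        return error;
    }
#ifdef PTHREAD_RWLOCK_WRITER_NONRECURSIVE_INITIALIZER_NP
    error = pthread_rwlockattr_setkind_np(
        &attr, PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP);
#endif
    if (!error) {
        /* Passing NULL here instead of &attr would silently discard the
         * kind chosen above, which is the bug the commit describes. */
        error = pthread_rwlock_init(lock, &attr);
    }
    pthread_rwlockattr_destroy(&attr);
    return error;
}
```

On platforms without the extension the helper falls back to default attributes, so callers should not rely on writer preference there.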
* table: correct documented default format in man pages  Lance Richardson  2016-12-22  2  -5/+5
| There are currently five users of the table formatting library, all of
| which default to "list" except for ovsdb-client, which defaults to
| "table". The library's current default is "table", and the table.man man
| page fragment only considers ovs-vsctl to use something other than
| "table" as a default. As a result, the man pages for ovn-sbctl and
| vtep-ctl are currently incorrect (these options aren't documented in the
| ovn-nbctl man page, which will need to be addressed in a future patch).
|
| Fix by making the library default format "list" and handling
| ovsdb-client as the exception.
|
| Signed-off-by: Lance Richardson <lrichard@redhat.com>
| Signed-off-by: Ben Pfaff <blp@ovn.org>
* ofproto-dpif-xlate: Adding IGMP/MLD checksum verification  Eelco Chaudron  2016-12-22  2  -0/+22
| When IGMP or MLD packets arrive, their content is used without the
| checksum being verified. With this change the checksum is verified, and
| the packet is not used for multicast snooping on failure.
|
| Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
| Signed-off-by: Ben Pfaff <blp@ovn.org>
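To make the check in this commit concrete, here is a minimal sketch of Internet checksum verification over an IGMP message. It assumes the standard one's-complement checksum and is not the code added to ofproto-dpif-xlate. The names `inet_checksum()` and `igmp_checksum_ok()` are hypothetical. MLD is not covered, because its ICMPv6 checksum also includes an IPv6 pseudo-header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One's-complement Internet checksum over 'len' bytes, read as big-endian
 * 16-bit words. Intended for short messages such as IGMP; the 32-bit
 * accumulator is not meant for buffers larger than 64 kB. */
static uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2) {
        sum += ((uint32_t) data[i] << 8) | data[i + 1];
    }
    if (i < len) {
        sum += (uint32_t) data[i] << 8;   /* Pad a trailing odd byte. */
    }
    while (sum >> 16) {
        sum = (sum & 0xffff) + (sum >> 16);   /* Fold the carries back in. */
    }
    return (uint16_t) ~sum;
}

/* A message whose checksum field is correct sums to 0xffff, so the
 * complemented result over the whole message is 0. */
static bool igmp_checksum_ok(const uint8_t *igmp, size_t len)
{
    assert(igmp != NULL);
    return len >= 8 && inet_checksum(igmp, len) == 0;
}
```

A snooping path following the commit's description would skip learning from any message for which such a check fails.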
* route-table: Stop netlink log message when routes withdrawn  Tony van der Peet  2016-12-22  1  -1/+1
| When a route is withdrawn (blackholed) the netlink message doesn't
| include an RTA_OIF element. This results in an "unexpected netlink
| message contents" log message because this element is not optional.
|
| Given that the netlink message will be ignored anyway, and subsequent
| error checking will cope with missing RTA_OIF, the element should be
| optional in order to suppress unnecessary log messages.
|
| Signed-off-by: Tony van der Peet <tony.vanderpeet@alliedtelesis.co.nz>
| Signed-off-by: Ben Pfaff <blp@ovn.org>
* lib/dpdk: No more deferred release  Aaron Conole  2016-12-21  1  -12/+5
| DPDK documentation was recently updated to reflect that DPDK does not
| hold any references to, nor take ownership of, the argv/argc elements.
| With that understanding, let's just release the memory asap.
|
| Signed-off-by: Aaron Conole <aconole@redhat.com>
| Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
* Documentation: fix some typos  Lance Richardson  2016-12-21  1  -1/+1
| s/deamon/daemon/
| s/dependant/dependent/
|
| Signed-off-by: Lance Richardson <lrichard@redhat.com>
| Signed-off-by: Ben Pfaff <blp@ovn.org>
* hash: Update murmurhash repo link in comments  Cian Ferriter  2016-12-21  1  -1/+1
| The MurmurHash code repo has moved from code.google to github. Update
| the link to reflect this.
|
| Signed-off-by: Cian Ferriter <cian.ferriter@intel.com>
| Signed-off-by: Ben Pfaff <blp@ovn.org>
* Windows: Add internal switch port per OVS bridge  Alin Serdean  2016-12-20  4  -1/+1352
| This patch updates the following commands in the vswitch:
| ovs-vsctl add-br br-test
| ovs-vsctl del-br br-test
|
| ovs-vsctl add-br br-test:
| This command will now create an internal port on the MSFT virtual switch
| using the WMI interface from Msvm_VirtualEthernetSwitchManagementService,
| leveraging the method AddResourceSettings.
| Before creating the actual port, the switch will be queried to see
| whether a port has already been created (good for restarts when
| restarting the vswitch daemon). If there is a port defined it will
| return success and log a message.
| After checking whether the port already exists, the command will also
| verify that the forwarding extension (windows datapath) is enabled and
| on a single switch. If it is not activated or if it is activated on
| multiple switches it will return an error and a message will be logged.
| After the port has been created on the switch, we will disable the
| adapter on the host and rename it to the corresponding OVS bridge name
| for consistency.
| The user will enable and set the values he wants after creation.
|
| ovs-vsctl del-br br-test:
| This command will remove an internal port on the MSFT virtual switch
| using the Msvm_VirtualEthernetSwitchManagementService class and
| executing the method RemoveResourceSettings.
|
| Both commands will block until the WMI job is finished. This allows us
| to guarantee that the ports are created and their names are set before
| issuing a netlink message to the windows datapath.
|
| This patch also includes helpers for normal WMI retrievals and
| initializations.
| Appveyor and documentation have been modified to include the libraries
| needed for COM objects.
|
| This patch was tested individually using IMallocSpy and CRT heap checks
| to ensure no new memory leaks are introduced.
|
| Tested on the following OS's:
| Windows 2012, Windows 2012r2, Windows 2016
|
| Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
| Acked-by: Paul Boca <pboca@cloudbasesolutions.com>
| Acked-by: Sairam Venugopal <vsairam@vmware.com>
| Signed-off-by: Gurucharan Shetty <guru@ovn.org>
* windows: Incorrect check while fetching adapter addresses  Alin Serdean  2016-12-20  1  -3/+3
| Checking for ERROR_INSUFFICIENT_BUFFER is incorrect per MSFT
| documentation:
| https://msdn.microsoft.com/en-us/library/windows/desktop/aa365915(v=vs.85).aspx
|
| Also, the initial call to GetAdaptersAddresses was wrong. In the case of
| a successful return 'all_addr' was not allocated, leading to a crash.
|
| Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
| Reported-by: Lior Baram <lior.baram@hpe.com>
| Acked-by: Sairam Venugopal <vsairam@vmware.com>
| Signed-off-by: Gurucharan Shetty <guru@ovn.org>
* ovsdb-idl: Change interface to conditional monitoring.  Ben Pfaff  2016-12-19  4  -115/+195
| Most users of OVSDB react to whatever is currently in their view of the
| database, as opposed to keeping track of changes and reacting to those
| changes individually. The interface to conditional monitoring was
| different, in that it expected the client to say what to add or remove
| from monitoring instead of what to monitor. This seemed reasonable at
| the time, but in practice it turns out that the usual approach actually
| works better, because the condition is generally a function of the data
| visible in the database. This commit changes the approach.
|
| This commit also changes the meaning of an empty condition for a table.
| Previously, an empty condition meant to replicate every row. Now, an
| empty condition means to replicate no rows. This is more convenient for
| code that gradually constructs conditions, because it does not need
| special cases for replicating nothing.
|
| This commit also changes the internal implementation of conditions from
| linked lists to arrays. I just couldn't see an advantage to using linked
| lists.
|
| Signed-off-by: Ben Pfaff <blp@ovn.org>
| Acked-by: Liran Schour <lirans@il.ibm.com>
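The new meaning of an empty condition can be illustrated with a small sketch. It assumes that a condition is an array of clauses and that a row is replicated when any clause matches. That "any clause" reading is an assumption made for the illustration, not something this commit message states. The types `struct clause` and `struct condition` and the function `condition_matches()` are hypothetical and are not the ovsdb-idl API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical clause: matches rows whose "name" column equals 'name'. */
struct clause {
    const char *name;
};

/* Hypothetical condition stored as an array rather than a linked list. */
struct condition {
    const struct clause *clauses;
    size_t n_clauses;
};

/* Returns true if any clause matches 'row_name'. With zero clauses the
 * loop body never runs, so an empty condition replicates no rows and the
 * caller needs no special case for "replicate nothing". */
static bool condition_matches(const struct condition *cond,
                              const char *row_name)
{
    assert(cond != NULL && row_name != NULL);

    for (size_t i = 0; i < cond->n_clauses; i++) {
        if (!strcmp(cond->clauses[i].name, row_name)) {
            return true;
        }
    }
    return false;
}
```

Code that builds a condition gradually can start from `n_clauses == 0` and append clauses as it learns which rows it needs.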
* ofp-actions: Add clone action.  William Tu  2016-12-19  1  -0/+89
| This patch adds the OpenFlow clone action with the following syntax:
| "clone([action][,action...])". The clone() action makes a copy of the
| current packet and executes the list of actions against the copy,
| without affecting the packet after the "clone(...)" action. In other
| words, the packet before the clone() and after the clone() is the same,
| no matter what actions are executed inside the clone().
|
| Use case 1:
| Set different fields and output to different ports without unset
| actions=
|   clone(mod_dl_src:<mac1>, output:1),
|   clone(mod_dl_dst:<mac2>, output:2), output:3
| Since each clone() has an independent packet, output:1 has only dl_src
| modified, output:2 has only dl_dst modified, and output:3 has the
| original packet.
|
| Similar to case 1:
| actions=
|   push_vlan(...), output:2, pop_vlan, push_vlan(...), output:3
| can be changed to
| actions=
|   clone(push_vlan(...), output:2),clone(push_vlan(...), output:3)
| without having to add pop_vlan.
|
| Use case 2: resubmit to another table without worrying about the packet
| being modified
|   actions=clone(resubmit(1,2)), ...
|
| Signed-off-by: William Tu <u9012063@gmail.com>
| [blp@ovn.org revised this to omit the "sample" action]
| Signed-off-by: Ben Pfaff <blp@ovn.org>
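The clone() semantics in this commit message can be pictured with a short sketch: the actions run against a private copy, so the caller's packet is unchanged afterwards. Everything here is hypothetical and is not OVS code, including `struct packet`, `action_fn`, `execute_clone()` and the two sample actions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical packet holding only the two Ethernet addresses. */
struct packet {
    uint8_t dl_src[6];
    uint8_t dl_dst[6];
};

typedef void action_fn(struct packet *);

/* Stand-in for an output port: remembers the last packet sent. */
static struct packet last_output;

static void output_action(struct packet *pkt)
{
    last_output = *pkt;
}

/* Stand-in for mod_dl_src with a fixed, made-up address. */
static void mod_dl_src_action(struct packet *pkt)
{
    static const uint8_t mac[6] = { 0x02, 0x00, 0x00, 0x00, 0x00, 0x01 };
    memcpy(pkt->dl_src, mac, sizeof mac);
}

/* Runs 'actions' on a copy of 'pkt'. The const qualifier documents that
 * the original packet cannot be modified from inside the clone. */
static void execute_clone(const struct packet *pkt,
                          action_fn *const actions[], size_t n_actions)
{
    struct packet copy;

    assert(pkt != NULL);
    copy = *pkt;
    for (size_t i = 0; i < n_actions; i++) {
        actions[i](&copy);
    }
}
```

Running `mod_dl_src_action` followed by `output_action` inside `execute_clone()` sends the modified copy, while a later output of the original packet still carries the old dl_src, as in use case 1.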
* ofp-actions: Use struct ext_action_header for appropriate actions.  Ben Pfaff  2016-12-19  1  -57/+39
| A few Open vSwitch extension actions have no fixed arguments but do have
| variable-length options that follow the header, and an upcoming commit
| will add another such action. There is little value in having
| individual structures for these actions, since they all have the same
| form, so this commit makes all of them use the existing struct
| ext_action_header.
|
| Signed-off-by: Ben Pfaff <blp@ovn.org>
| Acked-by: Mickey Spiegel <mickeys.dev@gmail.com>
* ovsdb-idl: Drop write-only member from struct ovsdb_idl_condition.  Ben Pfaff  2016-12-19  2  -7/+3
| The 'tc' member of struct ovsdb_idl_condition was written but never
| read, so remove it.
|
| Signed-off-by: Ben Pfaff <blp@ovn.org>
| Acked-by: Mickey Spiegel <mickeys.dev@gmail.com>