Commit message / Author / Age / Files / Lines
* actions: Add new "ct_clear" action.Ben Pfaff2017-01-215-0/+26
| | | | | Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Mickey Spiegel <mickeys.dev@gmail.com>
* actions: Make "next" action able to jump from egress to ingress pipeline.Ben Pfaff2017-01-217-48/+167
| | | | | | | This feature is useful for centralized gateways. Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Mickey Spiegel <mickeys.dev@gmail.com>
* actions: Introduce enum ovnact_pipeline.Ben Pfaff2017-01-212-22/+26
| | | | | | | | | This isn't used yet by the actions code, but an upcoming commit will introduce a user. This commit just adjusts ovn-trace to use this common type instead of its own local type. Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Mickey Spiegel <mickeys.dev@gmail.com>
* actions: Omit table number when possible for formatting "next" action.Ben Pfaff2017-01-213-26/+34
| | | | | | | | | | | | | | | | Until now, formatting the "next" action has always required including the table number, because the action struct didn't include enough context so that the formatter could decide whether the table number was the next table or some other table. This is more or less OK, but an upcoming commit will add a "pipeline" field to the "next" action, which means that the same policy there would require that the pipeline always be printed. That's a little obnoxious because 99+% of the time, the pipeline to be printed is the same pipeline that the flow is in and printing it would be distracting. So it's better to store some context to help with formatting. This commit begins adopting that policy for the existing table number field. Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Mickey Spiegel <mickeys.dev@gmail.com>
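The formatting policy described in this message can be sketched roughly as follows. This is an illustration in Python, not the C code from the commit; `format_next` and its parameters are hypothetical names, and the only context assumed is the number of the table that the flow is in.

```python
def format_next(current_table: int, target_table: int) -> str:
    """Format a "next" action as seen from a flow in `current_table`.

    Hypothetical helper: the real formatter lives in the OVN actions
    code and is written in C.
    """
    if target_table == current_table + 1:
        # Common case: the jump goes to the following table, so the
        # table number carries no information and is omitted.
        return "next;"
    # Any other destination has to be spelled out.
    return f"next({target_table});"


print(format_next(3, 4))
print(format_next(3, 10))
```

Storing the current table alongside the target is what lets the formatter tell the two cases apart.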
* actions: Separate action structures for "next" and "ct_next".Ben Pfaff2017-01-212-5/+16
| | | | | | | | | These actions aren't very similar but until now they both had the same action structure. These structures are going to diverge in an upcoming commit, so separate them now. Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Mickey Spiegel <mickeys.dev@gmail.com>
* actions: Add new OVN action "clone".Ben Pfaff2017-01-215-16/+84
| | | | | Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Mickey Spiegel <mickeys.dev@gmail.com>
* actions: Make "free" functions per-struct, not per-action.Ben Pfaff2017-01-211-73/+20
| | | | | | | | | | | In some cases multiple kinds of OVN action share the same structure. In all of these cases, a given kind of structure is freed one particular way (it would be confusing if this were not the case), so there's no benefit in having per-action free functions. Therefore, this commit switches to a free function per structure type. Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Mickey Spiegel <mickeys.dev@gmail.com>
* ovn-trace: Fix selection of table that "next" jumps to.Ben Pfaff2017-01-211-2/+2
| | | | | | | | | The common case is that "next" advances to the next table, but it can jump to any table. Reported-by: Mickey Spiegel <mickeys.dev@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Mickey Spiegel <mickeys.dev@gmail.com>
* actions: Make "arp { drop; };" acceptable.Ben Pfaff2017-01-202-10/+10
| | | | | | | | | | | | Before this commit, the OVN action parser would accept "arp {};" and then the formatter would format it back as "arp { drop; };", but the parser didn't accept the latter. There were basically two choices: make the parser accept "arp { drop; };" or make the formatter output "arp {};" (or both). This patch does (only) the former, and adds a test to avoid regression. Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Mickey Spiegel <mickeys.dev@gmail.com>
* lex: Make lexer_force_match() work for LEX_T_END.Ben Pfaff2017-01-202-6/+11
| | | | | | | | | | | | Without this change, lexer_force_match(lex, LEX_T_END) mostly works, except that in the failure case it emits an error that says "expecting `$'", which is a surprising error message. Arguably, lexer_force_end() could be removed entirely, but I don't see a real problem with the existing arrangement. Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Mickey Spiegel <mickeys.dev@gmail.com>
* actions: Fix "arp" and "nd_na" followed by another action.Ben Pfaff2017-01-202-5/+7
| | | | | | | | | | | | OVN logical actions are supposed to be padded to a multiple of 8 bytes, but the code for parsing "arp" and "nd_na" actions didn't do this properly. The result was that it worked OK if one of these actions was the last one in a sequence of logical actions, but failed badly if they were in the middle. This commit fixes the problem, adds assertions to make it harder for the problem to recur, and adds a test. Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Mickey Spiegel <mickeys.dev@gmail.com>
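As a rough illustration of the padding rule mentioned above, the following Python sketch appends encoded actions to a buffer and pads each one to a multiple of 8 bytes. The names `round_up` and `append_padded` are hypothetical, and the byte strings stand in for encoded actions; the actual fix is in the C action encoder.

```python
ALIGNMENT = 8  # actions are padded to a multiple of 8 bytes


def round_up(n: int, multiple: int = ALIGNMENT) -> int:
    """Round n up to the nearest multiple of `multiple`."""
    return (n + multiple - 1) // multiple * multiple


def append_padded(buf: bytearray, action: bytes) -> None:
    """Append one encoded action, then zero-pad it to the alignment.

    Assumes `buf` was already aligned before the call.
    """
    buf += action
    buf += bytes(round_up(len(action)) - len(action))
    # Assertion in the spirit of the ones the commit adds: a misaligned
    # buffer would corrupt whatever action is appended next.
    assert len(buf) % ALIGNMENT == 0


buf = bytearray()
append_padded(buf, b"x" * 13)  # stand-in for an action in the middle
append_padded(buf, b"y" * 8)   # the following action starts aligned
print(len(buf))
```

Without the padding, the second action would start at offset 13, which matches the symptom described: fine when the action is last, broken when it is in the middle.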
* tnl-neigh-cache: Force revalidation for a new neighbor entry.Ben Pfaff2017-01-202-1/+2
| | | | | | | | | | When a new ARP or ND entry was added, the code failed to force revalidation. This commit fixes the problem. Reported-by: László Sürü <laszlo.suru@ericsson.com> Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2017-January/327788.html Tested-by: László Sürü <laszlo.suru@ericsson.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* Documentation: Update DPDK doc after port naming change.Daniele Di Proietto2017-01-192-36/+43
| | | | | | | | | | | | options:dpdk-devargs is always required now. This commit also changes some of the names from 'dpdk0' to various others. netdev-dpdk/detach accepts a PCI id instead of a port name. CC: Ciara Loftus <ciara.loftus@intel.com> Fixes: 55e075e65ef9("netdev-dpdk: Arbitrary 'dpdk' port naming") Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ciara Loftus <ciara.loftus@intel.com>
* ovn: Introduce distributed gateway port and "chassisredirect" port bindingMickey Spiegel2017-01-1911-23/+945
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently OVN distributed logical routers achieve reachability to physical networks by passing through a "join" logical switch to a centralized gateway router, which then connects to another logical switch that has a localnet port connecting to the physical network. This patch adds logical port and port binding abstractions that allow an OVN distributed logical router to connect directly to a logical switch that has a localnet port connecting to the physical network. In this patch, this logical router port is called a "distributed gateway port". The primary design goal of distributed gateway ports is to allow as much traffic as possible to be handled locally on the hypervisor where a VM or container resides. Whenever possible, packets from the VM or container to the outside world should be processed completely on that VM's or container's hypervisor, eventually traversing a localnet port instance on that hypervisor to the physical network. Whenever possible, packets from the outside world to a VM or container should be directed through the physical network directly to the VM's or container's hypervisor, where the packet will enter the integration bridge through a localnet port. However, due to the implications of the use of L2 learning in the physical network, as well as the need to support advanced features such as one-to-many NAT (aka IP masquerading), where multiple logical IP addresses spread across multiple chassis are mapped to one external IP address, it will be necessary to handle some of the logical router processing on a specific chassis in a centralized manner. For this reason, the user must associate a chassis with each distributed gateway port. 
In order to allow for the distributed processing of some packets, distributed gateway ports need to be logical patch ports that effectively reside on every hypervisor, rather than "l3gateway" ports that are bound to a particular chassis. However, the flows associated with distributed gateway ports often need to be associated with physical locations. This is implemented in this patch (and subsequent patches) by adding "is_chassis_resident()" match conditions to several logical router flows. While most of the physical location dependent aspects of distributed gateway ports can be handled by restricting some flows to specific chassis, one additional mechanism is required. When a packet leaves the ingress pipeline and the logical egress port is the distributed gateway port, one of two different sets of actions is required at table 32: - If the packet can be handled locally on the sender's hypervisor (e.g. one-to-one NAT traffic), then the packet should just be resubmitted locally to table 33, in the normal manner for distributed logical patch ports. - However, if the packet needs to be handled on the chassis associated with the distributed gateway port (e.g. one-to-many SNAT traffic or non-NAT traffic), then table 32 must send the packet on a tunnel port to that chassis. In order to trigger the second set of actions, the "chassisredirect" type of southbound port_binding is introduced. Setting the logical egress port to the type "chassisredirect" logical port is simply a way to indicate that although the packet is destined for the distributed gateway port, it needs to be redirected to a different chassis. At table 32, packets with this logical egress port are sent to a specific chassis, in the same way that table 32 directs packets whose logical egress port is a VIF or a type "l3gateway" port to different chassis. Once the packet arrives at that chassis, table 33 resets the logical egress port to the value representing the distributed gateway port. 
For each distributed gateway port, there is one type "chassisredirect" port, in addition to the distributed logical patch port representing the distributed gateway port. A "chassisredirect" port represents a particular instance, bound to a specific chassis, of an otherwise distributed port. A "chassisredirect" port is associated with a chassis in the same manner as a "l3gateway" port. However, unlike "l3gateway" ports, "chassisredirect" ports have no associated IP or MAC addresses, and "chassisredirect" ports should never be used as the "inport". Any pipeline stages that depend on port specific IP or MAC addresses should be carried out in the context of the distributed gateway port's logical patch port. Although the abstraction represented by the "chassisredirect" port binding is generalized, in this patch the "chassisredirect" port binding is only created for NB logical router ports that specify the new "redirect-chassis" option. There is no explicit notion of a "chassisredirect" port in the NB database. The expectation is when capabilities are implemented that take advantage of "chassisredirect" ports (e.g. distributed gateway ports), flows specifying a "chassisredirect" port as the outport will be added as part of that capability. Signed-off-by: Mickey Spiegel <mickeys.dev@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* ovn: add is_chassis_resident match expression componentMickey Spiegel2017-01-199-17/+279
| | | | | | | | | | | | | | | | | | | | | This patch introduces a new match expression component is_chassis_resident(). Unlike match expression comparisons, is_chassis_resident is not pushed down to OpenFlow. It is a conditional that is evaluated in the controller during expr_simplify(), when it is replaced by a boolean expression. The is_chassis_resident conditional evaluates to "true" when the specified string identifies a port name that is resident on this controller chassis, i.e., the corresponding southbound database Port_Binding has a chassis column that matches this chassis. Otherwise it evaluates to "false". This allows higher level features to specify flows that are only installed on some chassis rather than on all chassis with the corresponding datapath. Suggested-by: Ben Pfaff <blp@ovn.org> Signed-off-by: Mickey Spiegel <mickeys.dev@gmail.com> Acked-by: Ben Pfaff <blp@ovn.org> Signed-off-by: Ben Pfaff <blp@ovn.org>
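The evaluation described above can be sketched in Python as follows. This is only a model of the idea, not the expr_simplify() code: expressions are nested tuples, `("cmp", ...)` stands for any ordinary comparison that would be pushed down to OpenFlow, and the port name `cr-lrp0` is a made-up example.

```python
def simplify(expr, resident_ports):
    """Replace is_chassis_resident() terms with booleans and fold them.

    `resident_ports` is the set of port names bound to this chassis.
    """
    if isinstance(expr, bool):
        return expr
    op = expr[0]
    if op == "is_chassis_resident":
        # Evaluated in the controller, never sent to the switch.
        return expr[1] in resident_ports
    if op == "cmp":
        return expr  # ordinary comparison, left untouched
    if op == "not":
        inner = simplify(expr[1], resident_ports)
        return (not inner) if isinstance(inner, bool) else ("not", inner)
    if op in ("and", "or"):
        absorbing = op == "or"  # True absorbs "or", False absorbs "and"
        kept = []
        for sub in expr[1:]:
            s = simplify(sub, resident_ports)
            if isinstance(s, bool):
                if s == absorbing:
                    return absorbing
                continue  # identity element, drop it
            kept.append(s)
        if not kept:
            return not absorbing
        return kept[0] if len(kept) == 1 else (op, *kept)
    raise ValueError(f"unknown operator: {op!r}")


flow = ("and", ("cmp", "ip4.dst == 10.0.0.1"),
        ("is_chassis_resident", "cr-lrp0"))
print(simplify(flow, {"cr-lrp0"}))  # port is resident: match survives
print(simplify(flow, set()))        # not resident: match is always false
```

A match that simplifies to false would simply not be installed on that chassis, which is how a flow ends up on some chassis and not others.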
* lacp: add test step for link recoveryShu Shen2017-01-191-0/+136
| | | | | | | | | An additional step is added to test case "lacp - negotiation" to ensure the bond port and its slave interfaces properly re-negotiate after a link previously down comes back. Signed-off-by: Shu Shen <shu.shen@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* ovn-nbctl: Ability to bootstrap CA certificate.Gurucharan Shetty2017-01-194-0/+38
| | | | | | | | | | | Utilities like ovs-vsctl can bootstrap a CA certificate, and it is useful for ovn-nbctl to have the same ability. One can then connect to the OVN NB database over SSL for transactions without having to copy over the certificate used by the ovsdb-server backing OVN NB. Signed-off-by: Gurucharan Shetty <guru@ovn.org> Acked-by: Lance Richardson <lrichard@redhat.com> Acked-by: Ben Pfaff <blp@ovn.org>
* faq: Document OVS packet buffering.Ben Pfaff2017-01-181-0/+32
| | | | | | | We get questions about this sometimes. Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Jarno Rajahalme <jarno@ovn.org>
* ofproto-dpif: Use acquire/release barriers with 'tables_version'.Jarno Rajahalme2017-01-181-3/+15
| | | | | | | | | | | | | | | | | | Use memory_order_release when updating the tables version number to make sure no memory accesses before the atomic_store (possibly relating to setting up the new version) are reordered to take place after the atomic_store, which makes the new version available to other threads. Correspondingly, use memory_order_acquire when reading the current tables_version to make sure no later memory accesses (possibly relating to the current version) are reordered to take place before the atomic_read to ensure that those memory accesses can not relate to an older version than returned by the atomic_read. Suggested-by: Daniele Di Proietto <ddiproietto@vmware.com> Fixes: 621b8064b7 ("ofproto: Infra for table versioning.") Signed-off-by: Jarno Rajahalme <jarno@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
* configuration.rst: Update the example of DPDK port's configurationBinbin Xu2017-01-181-4/+3
| | | | | | | | After the hotplug of DPDK ports, a valid dpdk-devargs must be specified; otherwise, the DPDK device will not be available. Signed-off-by: Binbin Xu <xu.binbin1@zte.com.cn> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
* ovn-ctl: Add bootstrap ovn-controller CA certificate option.Gurucharan Shetty2017-01-182-2/+19
| | | | | | | | | | | ovn-controller accepts the option --bootstrap-ca-cert. With this commit, ovn-ctl lets the user pass a value for it via the --ovn-controller-ssl-bootstrap-ca-cert option. Bootstrapping is useful for ovn-controller as you don't have to copy the controller's certificate (self-signed or otherwise) to every host. Signed-off-by: Gurucharan Shetty <guru@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org> Acked-by: Lance Richardson <lrichard@redhat.com>
* libX: add new release / version info tagsAaron Conole2017-01-1811-6/+147
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit uses the $PACKAGE_VERSION automake variable to construct a release and version info combination which sets the library name to be: libfoo-$(OVS_MAJOR_VERSION).so.$(OVS_MINOR_VERSION).0.$(OVS_MICRO_VERSION) where formerly, it was always: libfoo.so.1.0.0 This allows releases of Open vSwitch libraries to reflect which specific versions they came with, and sets up a pseudo ABI-versioning scheme. In this fashion, future releases of Open vSwitch could be installed alongside older releases, allowing 3rd party utilities linked against previous versions to continue to function. ex: $ ldd /path/to/utility linux-vdso.so.1 (0x00007ffe92cf6000) libopenvswitch-2.so.6 => /lib64/libopenvswitch-2.so.6 (0x00007f733b7a3000) libssl.so.10 => /lib64/libssl.so.10 (0x00007f733b530000) ... Note the library name and version information. Signed-off-by: Aaron Conole <aconole@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
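To make the naming scheme concrete, here is a small Python sketch that derives the names from a version string. `library_names` is a hypothetical helper, the version `2.6.1` is only an example input, and the shorter name it returns is inferred from the `libopenvswitch-2.so.6` entry in the ldd output quoted in the message rather than taken from the build system.

```python
def library_names(base, package_version):
    """Return (file name, name seen by the dynamic linker).

    Assumes a three-part MAJOR.MINOR.MICRO version string.
    """
    major, minor, micro = package_version.split(".")
    release = f"{base}-{major}"
    # Pattern from the commit message:
    #   libfoo-MAJOR.so.MINOR.0.MICRO
    file_name = f"{release}.so.{minor}.0.{micro}"
    # Assumption based on the ldd example: utilities link against the
    # name that stops at the minor version.
    link_name = f"{release}.so.{minor}"
    return file_name, link_name


print(library_names("libopenvswitch", "2.6.1"))
```

Because the major version is part of the base name and the minor version follows `.so.`, two releases with different minor versions get distinct file names and can coexist in the same directory.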
* ovn: document logical routers and logical patch ports in ovn-architectureMickey Spiegel2017-01-171-8/+140
| | | | | | | | This patch adds a description of logical routers and logical patch ports, including gateway routers, to ovn/ovn-architecture.7.xml. Signed-off-by: Mickey Spiegel <mickeys.dev@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* vlan.rst: Strip leftover HTML.Russell Bryant2017-01-171-1/+1
| | | | | | | | Strip a couple of closing HTML tags that were left over from when this doc was converted from the web site to RST. Signed-off-by: Russell Bryant <russell@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
* dpif-netdev: Avoids repeated addition of DP_STAT_LOST.nickcooper-zhangtonghao2017-01-161-1/+0
| | | | | | | | CC: Daniele Di Proietto <diproiettod@vmware.com> Fixes: 8aaa125dab66 ("dpif-netdev: Share emc and fast path output batches.") Signed-off-by: nickcooper-zhangtonghao <nic@opencloud.tech> Acked-by: Ben Pfaff <blp@ovn.org> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
* ovs-numa: Remove unused functions.Daniele Di Proietto2017-01-152-182/+0
| | | | | | | | ovs-numa doesn't need to keep the state of the pmd threads, it is an implementation detail of dpif-netdev. Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ilya Maximets <i.maximets@samsung.com>
* dpif-netdev: Centralized threads and queues handling code.Daniele Di Proietto2017-01-152-449/+450
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently we have three different code paths that deal with pmd threads and queues, in response to different inputs: 1. When a port is added 2. When a port is deleted 3. When the cpumask changes or a port must be reconfigured. 1. and 2. are carefully written to minimize disruption to the running datapath, while 3. brings down all the threads, reconfigures all the ports and restarts everything. This commit removes the three separate code paths by introducing the reconfigure_datapath() function, that takes care of adapting the pmd threads and queues to the current datapath configuration, no matter how we got there. This aims at simplifying maintenance and introduces a long overdue improvement: port reconfiguration (can happen quite frequently for dpdkvhost ports) is now done without shutting down the whole datapath, but just by temporarily removing the port that needs to be reconfigured (while the rest of the datapath is running). We now also recompute the rxq scheduling from scratch every time a port is added or deleted. This means that the queues will be more balanced, especially when dealing with explicit rxq-affinity from the user (without shutting down the threads and restarting them), but it also means that adding or deleting a port might cause existing queues to be moved between pmd threads. This negative effect can be avoided by taking into account the existing distribution when computing the new scheduling, but I considered code clarity and fast reconfiguration more important than optimizing port addition or removal (a port is added and removed only once, but can be reconfigured many times). Lastly, this commit moves the pmd threads state away from ovs-numa. Now the pmd threads state is kept only in dpif-netdev. 
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Co-authored-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Acked-by: Ilya Maximets <i.maximets@samsung.com>
* dpif-netdev: Use hmap for poll_list in pmd threads.Daniele Di Proietto2017-01-151-56/+112
| | | | | | | | | | | | | | | A future commit will use this to determine if a queue is already contained in a pmd thread. To keep the behavior unaltered we now have to sort queues before printing them in pmd_info_show_rxq(). Also this commit introduces 'struct polled_queue' that will be used exclusively in the fast path, uses 'struct dp_netdev_rxq' from 'struct rxq_poll' and uses 'rx' for 'netdev_rxq' and 'rxq' for 'dp_netdev_rxq'. Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ilya Maximets <i.maximets@samsung.com>
* ovs-numa: Add per numa and global counts in dump.Daniele Di Proietto2017-01-152-37/+77
| | | | | | | | They will be used by a future commit. Suggested-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ilya Maximets <i.maximets@samsung.com>
* ovs-numa: Don't use hmap_first_with_hash().Daniele Di Proietto2017-01-151-12/+14
| | | | | | | | I think it's better to iterate the hmap than to use hmap_first_with_hash(), because it handles hash collisions. Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ilya Maximets <i.maximets@samsung.com>
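The collision issue can be illustrated with a toy bucketed map in Python. This is not the OVS hmap API: the class `Hmap`, its methods, and the hash value 7 are all made up for the example. The point is only that two different keys may share a hash, so a lookup has to walk every node with that hash and compare keys.

```python
class Hmap:
    """Toy hash map that stores (hash, key, value) nodes in buckets."""

    def __init__(self, n_buckets=4):
        self.buckets = [[] for _ in range(n_buckets)]

    def _bucket(self, hash_value):
        return self.buckets[hash_value % len(self.buckets)]

    def insert(self, hash_value, key, value):
        self._bucket(hash_value).append((hash_value, key, value))

    def first_with_hash(self, hash_value):
        """Return the first node with this hash, whatever its key is."""
        for node in self._bucket(hash_value):
            if node[0] == hash_value:
                return node
        return None

    def find(self, hash_value, key):
        """Walk all nodes with this hash and compare the key as well."""
        for node in self._bucket(hash_value):
            if node[0] == hash_value and node[1] == key:
                return node
        return None


cores = Hmap()
cores.insert(7, 0, "core 0")
cores.insert(7, 1, "core 1")        # pretend both ids hash to 7
print(cores.first_with_hash(7)[2])  # first node, not necessarily core 1
print(cores.find(7, 1)[2])          # key comparison picks the right node
```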
* ovs-numa: Add new dump types.Daniele Di Proietto2017-01-152-1/+79
| | | | | | | | | | They will be used by a future commit. This patch introduces some code duplication which will be removed in a future commit. Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ilya Maximets <i.maximets@samsung.com>
* ovs-numa: New ovs_numa_dump_contains_core() function.Daniele Di Proietto2017-01-152-7/+28
| | | | | | | | | It will be used by a future commit. struct ovs_numa_dump now uses an hmap instead of a list to make ovs_numa_dump_contains_core() more efficient. Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ilya Maximets <i.maximets@samsung.com>
* dpctl: Avoid making assumptions on pmd threads.Daniele Di Proietto2017-01-155-155/+194
| | | | | | | | | | | | | | | | | | | Currently dpctl depends on the ovs-numa module to delete and create flows on different pmd threads for pmd devices. The next commits will move the pmd thread state from ovs-numa to dpif-netdev, so the ovs-numa interface will no longer be supported. Also, the assignment between ports and threads is an implementation detail of dpif-netdev; dpctl shouldn't know anything about it. This commit changes the dpif_flow_put() and dpif_flow_del() calls to iterate over all the pmd threads, if pmd_id is PMD_ID_NULL. A simple test is added. Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ilya Maximets <i.maximets@samsung.com>
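The behavior for PMD_ID_NULL might be pictured like this. The sketch is in Python and heavily simplified: each pmd thread's flow table is modeled as a plain dict keyed by core id, `flow_put` and `flow_del` are hypothetical stand-ins for the C functions named above, and `None` plays the role of PMD_ID_NULL.

```python
PMD_ID_NULL = None  # stand-in for the C constant


def _targets(pmd_tables, pmd_id):
    """Pick every pmd thread when no specific one is requested."""
    return list(pmd_tables) if pmd_id is PMD_ID_NULL else [pmd_id]


def flow_put(pmd_tables, key, actions, pmd_id=PMD_ID_NULL):
    for core in _targets(pmd_tables, pmd_id):
        pmd_tables[core][key] = actions


def flow_del(pmd_tables, key, pmd_id=PMD_ID_NULL):
    for core in _targets(pmd_tables, pmd_id):
        pmd_tables[core].pop(key, None)


tables = {0: {}, 1: {}, 2: {}}           # three pmd threads, by core id
flow_put(tables, "in_port(1)", "drop")   # lands on every pmd thread
flow_del(tables, "in_port(1)", pmd_id=1) # removed from one thread only
print(sorted(c for c, t in tables.items() if "in_port(1)" in t))
```

The caller no longer needs to know which thread polls which port, which is the dependency on ovs-numa that the commit removes from dpctl.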
* dpif-netdev: Make 'static_tx_qid' const.Daniele Di Proietto2017-01-151-6/+5
| | | | | | | | | Since the previous commit, 'static_tx_qid' doesn't need to be atomic and is actually never touched (except for initialization), so it can be made const. Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ilya Maximets <i.maximets@samsung.com>
* dpif-netdev: Create pmd threads for every numa node.Daniele Di Proietto2017-01-152-141/+69
| | | | | | | | | | | | | | | A lot of the complexity in the code that handles pmd threads and ports in dpif-netdev is due to the fact that we postpone the creation of pmd threads on a numa node until we have a port that needs to be polled on that particular node. Since the previous commit, a pmd thread with no ports will not consume any CPU, so it seems easier to create all the threads at once. This will also make future commits easier. Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ilya Maximets <i.maximets@samsung.com>
* dpif-netdev: Block pmd threads if there are no ports.Daniele Di Proietto2017-01-151-0/+15
| | | | | | | | | | | | | | | | | There's no reason for a pmd thread to perform its main loop if there are no queues in its poll_list. This commit introduces a seq object on which the pmd thread can be blocked, if there are no queues. When the main thread wants to reload a pmd thread it must now change the seq object (in case it's blocked) and set 'reload' to true. This is useful to avoid wasting CPU cycles and is also necessary for a future commit. Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ilya Maximets <i.maximets@samsung.com>
* dpif-netdev: Use a boolean instead of pmd->port_seq.Daniele Di Proietto2017-01-151-12/+7
| | | | | | | | | | | | There's no need for a sequence number, since the main thread has to wait for the pmd thread, so there's no chance that an update will be undetected. A seq object will be introduced for another purpose in the next commit, and changing this to boolean makes the code more readable. Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ilya Maximets <i.maximets@samsung.com>
* netdev-dpdk: Refactor construct and destruct.Daniele Di Proietto2017-01-151-45/+41
| | | | | | | | | | | | | | Some refactoring for _construct() and _destruct() methods: * Rename netdev_dpdk_init() to common_construct(). init() has a different meaning in the netdev context. * Remove DPDK_DEV_ETH and DPDK_DEV_VHOST checks in common_construct() and move them to different functions * Introduce common_destruct(). * Avoid taking 'dev->mutex' in construct and destruct: we're guaranteed to be the only thread with access to the object. Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ilya Maximets <i.maximets@samsung.com>
* netdev-dpdk: Start also dpdkr devices only once on port-add.Daniele Di Proietto2017-01-153-40/+38
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since commit 55e075e65ef9("netdev-dpdk: Arbitrary 'dpdk' port naming"), we don't call rte_eth_start() from netdev_open() anymore, we only call it from netdev_reconfigure(). This commit does that also for 'dpdkr' devices, and removes some useless code. Calling rte_eth_start() also from netdev_open() was unnecessary and wasteful. Not doing it reduces code duplication and makes adding a port faster (~900ms before the patch, ~400ms after). Another reason why this is useful is that some DPDK drivers might have problems with reconfiguration. For example, until DPDK commit 8618d19b52b1("net/vmxnet3: reallocate shared memzone on re-config"), vmxnet3 didn't support being restarted with a different number of queues. Technically, the netdev interface changed because before opening rxqs or calling netdev_send() the user must check if reconfiguration is required. This patch also documents that, even though no change to the userspace datapath (the only user) is required. Lastly, this patch makes sure the errors returned by ofproto_port_add (which includes the first port reconfiguration) are reported back to the database. Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ilya Maximets <i.maximets@samsung.com>
* netdev-dpdk: Don't call rte_dev_stop() in update_flags().Daniele Di Proietto2017-01-151-16/+12
| | | | | | | | | | | | | | | | | | | Calling rte_eth_dev_stop() while the device is running causes a crash. We could use rte_eth_dev_set_link_down(), but not every PMD implements that, and I found one NIC where that has no effect. Instead, this commit checks if the device has the NETDEV_UP flag when transmitting or receiving (similarly to what we do for vhostuser). I didn't notice any performance difference with this check in case the device is up. An alternative would be to remove the device queues from the pmd threads tx and receive cache, but that requires reconfiguration and I'd prefer to avoid it, because the change can come from OpenFlow. Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ilya Maximets <i.maximets@samsung.com>
* dpif-netdev: Don't try to output on a device without txqs.Daniele Di Proietto2017-01-153-36/+73
| | | | | | | | | | | | | | | | | Tunnel devices have 0 txqs and don't support netdev_send(). While netdev_send() simply returns EOPNOTSUPP, the XPS logic is still executed on output, and that might be confused by devices with no txqs. It seems better to have different structures in the fast path for ports that support netdev_{push,pop}_header (tunnel devices), and ports that support netdev_send. With this we can also remove a branch in netdev_send(). This is also necessary for a future commit, which starts DPDK devices without txqs. Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ilya Maximets <i.maximets@samsung.com>
* dpif-netdev: Take non_pmd_mutex to access tx cached ports.Daniele Di Proietto2017-01-151-0/+2
| | | | | | | | | | As documented in dp_netdev_pmd_thread, we must take non_pmd_mutex to access the tx port caches for the non pmd thread. Found by inspection. Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ilya Maximets <i.maximets@samsung.com>
* dpif-netdev: Fix memory leak.Daniele Di Proietto2017-01-151-0/+1
| | | | | | | | | | | | | | We keep all the per-port classifiers around, since they can be reused, but when a pmd thread is destroyed we should free them. Found using valgrind. Fixes: 3453b4d62a98("dpif-netdev: dpcls per in_port with sorted subtables") Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ilya Maximets <i.maximets@samsung.com> Acked-by: Ben Pfaff <blp@ovn.org>
* python: Catch exception "SSL.SysCallError" for send by SSL.Guoshuai Li2017-01-141-1/+3
| | | | | | | | | | | | When the OVSDB server is aborted, the SSL send function throws an SSL.SysCallError exception, which we need to catch so that we return its -errno. In addition, based on its parent class, the SSL.WantWriteError handler needs to return -EAGAIN, not EAGAIN. Signed-off-by: Guoshuai Li <ligs@dtdream.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
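A minimal sketch of the error mapping described in this message follows. To stay self-contained it does not import pyOpenSSL: `WantWriteError` and `SysCallError` below are local stand-ins for the SSL exceptions named above, `ssl_send` is a hypothetical function name, and the sketch assumes the errno is the first element of the exception's arguments.

```python
import errno


class WantWriteError(Exception):
    """Stand-in for SSL.WantWriteError."""


class SysCallError(Exception):
    """Stand-in for SSL.SysCallError; args[0] is assumed to be the errno."""


def ssl_send(sock, data):
    """Return the number of bytes sent, or a negative errno on failure."""
    try:
        return sock.send(data)
    except WantWriteError:
        # The write would block: report -EAGAIN (negative, like every
        # other error return) so the caller retries later.
        return -errno.EAGAIN
    except SysCallError as e:
        # For example, the peer went away: pass the errno on, negated.
        return -e.args[0]
```

With this convention a caller can treat any negative return value as an error and any non-negative value as a byte count.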
* Use PRIu32 format for ofp_port_tShu Shen2017-01-147-24/+24
| | | | | | | | | | Although ofp_port_t uses a 16-bit range, it is defined as a 32-bit type. The format strings throughout the code base used PRIu16 for ofp_port_t, which leads the compiler to emit -Wformat warnings on platforms that don't promote 16-bit to 32-bit integers, e.g., on macOS. Signed-off-by: Shu Shen <shu.shen@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* ovn: specify addresses of type "router" lsps as "router"Mickey Spiegel2017-01-135-3/+63
| | | | | | | | | | | | | | | | | | | | Currently in OVN, when a logical switch port of type "router" is created, the MAC and optionally IP addresses of the peer logical router port must be specified again as the addresses of the logical switch port. This patch allows the logical switch port's addresses to be specified as the string "router", rather than explicitly copying the logical router port's MAC and optionally IP addresses. The router addresses are used to populate the logical switch's destination lookup, and to populate op->lsp_addrs in ovn-northd.c, which in turn is used to generate logical switch ARP and ND replies. Since ipam already looks at logical router ports, the only ipam modification necessary is to skip logical switch ports with addresses "router". Signed-off-by: Mickey Spiegel <mickeys.dev@gmail.com> Acked-by: Ben Pfaff <blp@ovn.org> Signed-off-by: Ben Pfaff <blp@ovn.org>
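The substitution described above could be sketched as below. This is a Python model, not the ovn-northd.c code: ports are plain dicts, `resolve_lsp_addresses` is a hypothetical helper, and stripping the prefix length from the router port's networks is an assumption about how the addresses end up being represented.

```python
def resolve_lsp_addresses(lsp, router_ports):
    """Expand the string "router" into the peer router port's addresses.

    `router_ports` maps a logical router port name to a dict with a
    "mac" string and a "networks" list such as ["192.168.0.1/24"].
    """
    resolved = []
    for addr in lsp["addresses"]:
        if addr == "router":
            # The peer is named by the switch port's router-port option.
            peer = router_ports[lsp["options"]["router-port"]]
            ips = [net.split("/")[0] for net in peer["networks"]]
            resolved.append(" ".join([peer["mac"], *ips]))
        else:
            resolved.append(addr)  # explicit addresses pass through
    return resolved


lrps = {"lrp0": {"mac": "00:00:00:00:ff:01",
                 "networks": ["192.168.0.1/24"]}}
lsp = {"type": "router", "addresses": ["router"],
       "options": {"router-port": "lrp0"}}
print(resolve_lsp_addresses(lsp, lrps))
```

The benefit is that the MAC and IP addresses are written once, on the logical router port, and cannot drift out of sync with a manually copied value on the switch side.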
* db-ctl-base: Always support all tables in schema.Ben Pfaff2017-01-136-367/+223
| | | | | | | | | | | | | | | | | When one adds a new table to a database schema, it's easy to forget to add the table to the list of tables in the *ctl.c program. When this happens, the database commands for that program don't work on that table at all, even for commands like "list" and "create" that don't need any special help. This patch fixes that problem, by making sure that db-ctl-base always has the complete list of tables. Previously, each ctl_table_class pointed directly to the corresponding ovsdb_idl_table_class. With this patch, there are instead two parallel arrays, one of ovsdb_idl_table_classes and the other of ctl_table_classes. This change accounts for the bulk of the change to the db-ctl-base code. Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Lance Richardson <lrichard@redhat.com>
* travis: Update build list email address.Ben Pfaff2017-01-121-1/+1
| | | | | | | | The lists these days prefer an ovs- prefix. Currently all of the build emails are being dropped because it is missing. Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Andy Zhou <azhou@ovn.org>
* dpif: Simplify dpif_execute_helper_cb()Andy Zhou2017-01-121-19/+12
| | | | | | | | | | The may_steal flag is now used, so remove OVS_UNUSED. Since dp_packet_delete() handles the NULL pointer properly, we can drop a few tracking variables and make the code easier to follow. Signed-off-by: Andy Zhou <azhou@ovn.org> Acked-by: Jarno Rajahalme <jarno@ovn.org>
* netdev-vport: Do not log empty warnings on success.Daniele Di Proietto2017-01-121-4/+6
| | | | | | | | | | | | set_tunnel_config() always logs a warning, even on success. This shouldn't happen. Without this, some unit tests fail. Fixes: 9fff138ec3a6("netdev: Add 'errp' to set_config().") Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Antonio Fischetti <antonio.fischetti@intel.com> Acked-by: Ben Pfaff <blp@ovn.org>