path: root/lib/netdev.c
Commit message | Author | Age | Files | Lines
* netdev: Avoid leaking seq in netdev_open() error path. | Huanle Han | 2016-09-20 | 1 | -0/+1
  Signed-off-by: Huanle Han <hanxueluo@gmail.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
* ofproto: Honor mtu_request even for internal ports. | Daniele Di Proietto | 2016-09-02 | 1 | -0/+21
  By default Open vSwitch tries to configure the MTU of internal interfaces to match the bridge minimum, overriding any attempt by the user to configure it through standard system tools, or the database.
  While this works in many simple cases (there are probably many users that rely on this) it may create problems for more advanced use cases (like any overlay networks).
  This commit allows the user to override the default behavior by providing an explicit MTU in the mtu_request column in the Interface table.
  This means that Open vSwitch will now treat database MTU requests differently from MTU requests made through standard system tools (coming from `ip link` or `ifconfig`), but this seems the best way to remain compatible with old users while providing a more powerful interface.
  Suggested-by: Darrell Ball <dlu998@gmail.com>
  Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
  Acked-by: Ben Pfaff <blp@ovn.org>
  Tested-by: Joe Stringer <joe@ovn.org>
* Revert "netdev: do not allow devices to be opened with conflicting types" | Thadeu Lima de Souza Cascardo | 2016-08-16 | 1 | -7/+1
  This reverts commit d2fa6c676a13e86acc7f17261b2d87484f625d45.
  When doing a restart, the routing table will open ports as system, which prevents internal ports from being opened with the right type. That causes failures in creating the ports.
  We should revisit this patch after finding a proper fix on the routing table layer.
  Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
  Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
* netdev: Pass 'netdev_class' to ->run() and ->wait(). | Daniele Di Proietto | 2016-08-15 | 1 | -2/+2
  This will allow run() and wait() methods to be shared between different classes and still perform class-specific work.
  Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
  Acked-by: Ben Pfaff <blp@ovn.org>
* netdev: Make netdev_set_mtu() netdev parameter non-const. | Daniele Di Proietto | 2016-08-12 | 1 | -1/+1
  Every provider silently drops the const attribute when converting the parameter to the appropriate subclass. Might as well drop the const attribute from the parameter, since this is a "set" function.
  Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
  Acked-by: Ilya Maximets <i.maximets@samsung.com>
* netdev-provider: fix comments for netdev_rxq_recv | Mark Kavanagh | 2016-07-27 | 1 | -7/+8
  Commit 64839cf43 applies batch objects to netdev-providers, but some comments were not updated accordingly. Fix these:
  - replace 'pkts' with 'batch'
  - replace '*cnt' with 'batch->count'
  - replace MAX_RX_BATCH with NETDEV_MAX_BURST
  - remove superfluous whitespace
  Signed-off-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
  Acked-by: William Tu <u9012063@gmail.com>
  Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
* netdev: do not allow devices to be opened with conflicting types | Thadeu Lima de Souza Cascardo | 2016-07-27 | 1 | -1/+7
  When a device is already opened, netdev_open should verify that the types match, or else return an error.
  Otherwise, users might expect to open a device with a certain type and get a handle belonging to a different type.
  This also prevents certain conflicting configurations that would have a port of a certain type in the database and one of a different type on the system.
  For example, when adding an interface with a type other than system, and there is already a system interface with the same name, as the routing table will hold a reference to that system interface, some conflicts will arise. The netdev will be opened with the incorrect type and that will make vswitchd remove it, but adding it again will fail as it already exists.
  Failing earlier prevents some vswitchd loops in reconfiguring the interface.
  Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
  Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
* dpif-netdev: XPS (Transmit Packet Steering) implementation. | Ilya Maximets | 2016-07-27 | 1 | -5/+8
  If the number of CPUs in pmd-cpu-mask is not divisible by the number of queues, and in a few more complex situations, TX queue-ids may be distributed unfairly between PMD threads.
  For example, with 2 ports with 4 queues each and 6 CPUs in pmd-cpu-mask, the following distribution is possible:
  <------------------------------------------------------------------------>
  pmd thread numa_id 0 core_id 13:
      port: vhost-user1  queue-id: 1
      port: dpdk0        queue-id: 3
  pmd thread numa_id 0 core_id 14:
      port: vhost-user1  queue-id: 2
  pmd thread numa_id 0 core_id 16:
      port: dpdk0        queue-id: 0
  pmd thread numa_id 0 core_id 17:
      port: dpdk0        queue-id: 1
  pmd thread numa_id 0 core_id 12:
      port: vhost-user1  queue-id: 0
      port: dpdk0        queue-id: 2
  pmd thread numa_id 0 core_id 15:
      port: vhost-user1  queue-id: 3
  <------------------------------------------------------------------------>
  As we can see above, the dpdk0 port is polled by threads on cores 12, 13, 16 and 17.
  By design of dpif-netdev, only one TX queue-id is assigned to each pmd thread. These queue-ids are sequential, similar to core-ids, and a thread sends packets to the queue with exactly this queue-id regardless of port.
  In the previous example:
      pmd thread on core 12 will send packets to tx queue 0
      pmd thread on core 13 will send packets to tx queue 1
      ...
      pmd thread on core 17 will send packets to tx queue 5
  So, for the dpdk0 port, after truncating in netdev-dpdk:
      core 12 --> TX queue-id 0 % 4 == 0
      core 13 --> TX queue-id 1 % 4 == 1
      core 16 --> TX queue-id 4 % 4 == 0
      core 17 --> TX queue-id 5 % 4 == 1
  As a result only 2 of 4 queues are used.
  To fix this issue, a kind of XPS is implemented in the following way:
  * TX queue-ids are allocated dynamically.
  * When a PMD thread first tries to send packets to a new port, it allocates the least used TX queue for this port.
  * PMD threads periodically revalidate their allocated TX queue-ids. If a queue wasn't used in the last XPS_TIMEOUT_MS milliseconds, it is freed during revalidation.
  * XPS is not active when there are enough TX queues.
  Reported-by: Zhihong Wang <zhihong.wang@intel.com>
  Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
  Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
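The modulo truncation described in the XPS commit message can be checked with a small sketch. The helper `count_used_txqs` is hypothetical and not part of OVS; it assumes non-negative static queue ids and at most 64 device queues, and it only reproduces the arithmetic from the example (static ids 0, 1, 4, 5 on a 4-queue port).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not an OVS function: counts how many distinct device
 * TX queues are hit when each thread's static queue id is truncated modulo
 * the number of queues the device really has ('n_txq'). */
static size_t
count_used_txqs(const int *static_qids, size_t n, int n_txq)
{
    bool used[64] = { false };
    size_t n_used = 0;

    assert(n_txq > 0 && n_txq <= 64);
    for (size_t i = 0; i < n; i++) {
        int q = static_qids[i] % n_txq;   /* the truncation step */

        if (!used[q]) {
            used[q] = true;
            n_used++;
        }
    }
    return n_used;
}
```

With the ids of the threads polling dpdk0 (0, 1, 4 and 5) and `n_txq` set to 4, the helper reports two distinct queues, which is the imbalance the commit sets out to fix.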
* json: Move from lib to include/openvswitch. | Terry Wilson | 2016-07-22 | 1 | -1/+1
  To easily allow both in- and out-of-tree building of the Python wrapper for the OVS JSON parser (e.g. w/ pip), move json.h to include/openvswitch. This also requires moving lib/{hmap,shash}.h.
  Both hmap.h and shash.h were #include-ing "util.h" even though the headers themselves did not use anything from there, but rather from include/openvswitch/util.h. Fixing that required including util.h in several C files mostly due to OVS_NOT_REACHED and things like xmalloc.
  Signed-off-by: Terry Wilson <twilson@redhat.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
* netdev-provider: Apply batch object to netdev provider. | William Tu | 2016-07-21 | 1 | -4/+2
  Commit 1895cc8dbb64 ("dpif-netdev: create batch object") introduces batch process functions and 'struct dp_packet_batch' to associate with batch-level metadata. This patch applies the packet batch object to the netdev provider interface (dummy, Linux, BSD, and DPDK) so that batch APIs can be used in providers. With batch metadata visible in providers, optimizations can be introduced at per-batch level instead of per-packet.
  Tested-at: https://travis-ci.org/williamtu/ovs-travis/builds/145694197
  Signed-off-by: William Tu <u9012063@gmail.com>
  Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
* ofp-actions: Add truncate action. | William Tu | 2016-06-24 | 1 | -1/+4
  The patch adds a new action to support packet truncation. The new action is formatted as 'output(port=n,max_len=m)', meaning output to port n with the packet size being MIN(original_size, m).
  One use case is to enable port mirroring to send smaller packets to the destination port so that only useful packet information is mirrored/copied, saving some of the performance overhead of copying the entire packet payload. An example use case is below, and is also shown in the testcases:
  - Output to port 1 with max_len 100 bytes.
  - The output packet size on port 1 will be MIN(original_packet_size, 100).
    # ovs-ofctl add-flow br0 'actions=output(port=1,max_len=100)'
  - The scope of max_len is limited to the output action itself. The packet sizes of the subsequent output:1 and output:2 remain intact.
    # ovs-ofctl add-flow br0 \
          'actions=output(port=1,max_len=100),output:1,output:2'
  - The datapath actions show:
    # Datapath actions: trunc(100),1,1,2
  Tested-at: https://travis-ci.org/williamtu/ovs-travis/builds/140037134
  Signed-off-by: William Tu <u9012063@gmail.com>
  Acked-by: Pravin B Shelar <pshelar@ovn.org>
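The size rule and the per-action scope of max_len from the truncate commit can be sketched as follows. The struct, the `NO_TRUNC` sentinel and the function name are assumptions made for this illustration; they do not correspond to the OVS action encoding.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed sentinel marking a plain 'output:N' action without max_len. */
#define NO_TRUNC 0

/* Hypothetical model of one output action. */
struct output_action {
    int port;
    size_t max_len;     /* NO_TRUNC or the max_len of output(port=,max_len=). */
};

/* Returns the size of the packet emitted by 'a' for a packet of
 * 'original_size' bytes: MIN(original_size, max_len) when truncating,
 * the original size otherwise.  The limit applies to this action only. */
static size_t
output_size(const struct output_action *a, size_t original_size)
{
    assert(a != NULL);
    if (a->max_len == NO_TRUNC || original_size <= a->max_len) {
        return original_size;
    }
    return a->max_len;
}
```

For a 1500-byte packet and the flow `output(port=1,max_len=100),output:1,output:2`, this model yields 100 bytes for the first action and 1500 bytes for each of the two plain outputs.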
* netdev-native-tnl: Introduce ip_build_header() | Pravin B Shelar | 2016-05-23 | 1 | -4/+18
  The native tunneling code that builds tunnel headers is spread across two different modules, which makes the code hard to follow. The following patch refactors the code to move all of it to the netdev-native-tnl module.
  Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
  Acked-by: Jesse Gross <jesse@kernel.org>
* netdev-dpdk: Use ->reconfigure() call to change rx/tx queues. | Daniele Di Proietto | 2016-05-23 | 1 | -25/+11
  This introduces in dpif-netdev and netdev-dpdk the first use of the newly introduced reconfigure netdev call.
  When a request to change the number of queues comes, netdev-dpdk will remember this and notify the upper layer via netdev_request_reconfigure().
  The datapath, instead of periodically calling netdev_set_multiq(), can detect this and call reconfigure().
  This mechanism can also be used to:
  * Automatically match the number of rxq with the one provided by qemu via the new_device callback.
  * Provide a way to change the MTU of dpdk devices at runtime.
  * Move a DPDK vhost device to the proper NUMA socket.
  Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
  Tested-by: Ilya Maximets <i.maximets@samsung.com>
  Acked-by: Ilya Maximets <i.maximets@samsung.com>
* netdev: Add reconfigure request mechanism. | Daniele Di Proietto | 2016-05-23 | 1 | -0/+39
  A netdev provider, especially a PMD provider (like netdev DPDK) might not be able to change some of its parameters (such as MTU, or number of queues) without stopping everything and restarting.
  This commit introduces a mechanism that allows a netdev provider to request a restart (netdev_request_reconfigure()). The upper layer can be notified via netdev_wait_reconf_required() and netdev_is_reconf_required(). After closing all the rxqs the upper layer can finally call netdev_reconfigure(), to make sure that the new configuration is in place.
  This will be used by next commit to reconfigure rx and tx queues in netdev-dpdk.
  Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
  Tested-by: Ilya Maximets <i.maximets@samsung.com>
  Acked-by: Ilya Maximets <i.maximets@samsung.com>
  Acked-by: Mark Kavanagh <mark.b.kavanagh@intel.com>
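The request/check/apply handshake described in this commit message can be modelled with a pair of sequence counters. Everything below is a simplified sketch under assumptions: the `sketch_` names and the struct layout are hypothetical stand-ins for the roles of netdev_request_reconfigure(), netdev_is_reconf_required() and netdev_reconfigure(), and the real code may differ (for instance in how waiting and locking are handled).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical device model, not 'struct netdev'. */
struct sketch_netdev {
    uint64_t reconfigure_seq;       /* Bumped whenever a restart is requested. */
    uint64_t last_reconfigure_seq;  /* Value seen at the last reconfigure. */
    int requested_n_rxq;            /* Configuration waiting to be applied. */
    int n_rxq;                      /* Configuration currently in place. */
};

/* Provider side: remember the new value and ask the upper layer to restart. */
static void
sketch_set_n_rxq(struct sketch_netdev *dev, int n_rxq)
{
    assert(n_rxq > 0);
    dev->requested_n_rxq = n_rxq;
    dev->reconfigure_seq++;
}

/* Upper layer: has a restart been requested since the last reconfigure? */
static bool
sketch_is_reconf_required(const struct sketch_netdev *dev)
{
    return dev->reconfigure_seq != dev->last_reconfigure_seq;
}

/* Upper layer, after closing all rxqs: apply the pending configuration. */
static void
sketch_reconfigure(struct sketch_netdev *dev)
{
    dev->last_reconfigure_seq = dev->reconfigure_seq;
    dev->n_rxq = dev->requested_n_rxq;
}
```

The point of the split is that the provider never changes live state on its own; it only records the request, and the upper layer chooses a safe moment to apply it.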
* dpif-netdev: Fix memory leak in tunnel header pop action. | Pravin B Shelar | 2016-05-18 | 1 | -3/+4
  The tunnel header pop action can leak a batch of packets in case of error. The following patch fixes the error code path.
  Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
  Acked-by: Jesse Gross <jesse@kernel.org>
* dpif-netdev: create batch object | Pravin B Shelar | 2016-05-18 | 1 | -19/+17
  The DPDK datapath operates on batches of packets. To pass a batch of packets around we use a packets array and a count. The next patch needs to associate metadata with each batch of packets, so introduce a batch structure to make handling the metadata easier.
  Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
  Acked-by: Jesse Gross <jesse@kernel.org>
* netdev: Return number of packet from netdev_pop_header() | Pravin B Shelar | 2016-05-18 | 1 | -8/+6
  The current tunnel-pop API does not allow the netdev implementation to retain a packet, but STT can keep a packet from a batch of packets during TCP reassembly processing. To return the exact count of valid packets, STT needs the packet count parameter to be passed by reference.
  Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
  Acked-by: Jesse Gross <jesse@kernel.org>
* netdev: Initialise DPDK netdev classes only once | Ciara Loftus | 2016-05-17 | 1 | -1/+0
  DPDK netdev classes were being initialised twice, resulting in warning logs like so:
      netdev|WARN|attempted to register duplicate netdev provider: dpdk
  This commit removes one of the initialisation calls.
  Fixes: 0692257923fe ("netdev: Fix potential deadlock.")
  Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
* netdev: Fix potential deadlock. | Ben Pfaff | 2016-05-09 | 1 | -78/+48
  Until now, netdev_class_mutex and route_table_mutex could be taken in either order:
  * netdev_run() takes netdev_class_mutex, then netdev_vport_run() calls route_table_run(), which takes route_table_mutex.
  * route_table_init() takes route_table_mutex and then eventually calls netdev_open(), which takes netdev_class_mutex.
  This commit fixes the problem by converting the netdev_classes hmap, protected by netdev_class_mutex, into a cmap protected on the read side by RCU. Only a very small amount of code actually writes to the cmap in question, so it's a lot easier to understand the locking rules at that point. In particular, there's no need to take netdev_class_mutex from either netdev_run() or netdev_open(), so neither of the code paths above determines a lock ordering any longer.
  Reported-by: William Tu <u9012063@gmail.com>
  Reported-at: http://openvswitch.org/pipermail/discuss/2016-February/020216.html
  Signed-off-by: Ben Pfaff <blp@ovn.org>
  Acked-by: Ryan Moats <rmoats@us.ibm.com>
  Tested-by: William Tu <u9012063@gmail.com>
* Add support for extended netdev statistics based on RFC 2819. | mweglicx | 2016-05-06 | 1 | -0/+5
  Implementation of new statistics extension for DPDK ports:
  - Add new counters definition to netdev struct and open flow, based on RFC2819.
  - Initialize netdev statistics as "filtered out" before passing it to particular netdev implementation (because of that change, statistics which are not collected are reported as filtered out, and some unit tests were modified in this respect).
  - New statistics are retrieved using experimenter code and are printed as a result to ofctl dump-ports.
  - New counters are available for OpenFlow 1.4+.
  - Add new vendor id: INTEL_VENDOR_ID.
  - New statistics are printed to output via ofctl only if those are present in reply message.
  - Add new file header: include/openflow/intel-ext.h which contains new statistics definition.
  - Extended statistics are implemented only for dpdk-physical and dpdk-vhost port types.
  - Dpdk-physical implementation uses xstats to collect statistics.
  - Dpdk-vhost implements only part of statistics (RX packet sized based counters).
  Signed-off-by: Michal Weglicki <michalx.weglicki@intel.com>
  [blp@ovn.org made software devices more consistent]
  Signed-off-by: Ben Pfaff <blp@ovn.org>
* netdev-dpdk: Convert initialization from cmdline to db | Aaron Conole | 2016-04-29 | 1 | -2/+0
  Existing DPDK integration is provided by use of command line options which must be split out and passed to librte in a special manner. However, this forces any configuration to be passed by way of a special DPDK flag, and interferes with ovs+dpdk packaging solutions.
  This commit delays dpdk initialization until after the OVS database connection is established, at which point ovs initializes librte. It pulls all of the config data from the OVS database, and assembles a new argv/argc pair to be passed along.
  Signed-off-by: Aaron Conole <aconole@redhat.com>
  Acked-by: Kevin Traynor <kevin.traynor@intel.com>
  Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
* netdev: Verify ifa_addr is not NULL when iterating over getifaddrs. | Thadeu Lima de Souza Cascardo | 2016-03-30 | 1 | -6/+8
  Some point-to-point devices like TUN devices will not have an address, and while iterating over ifaddrs, its ifa_addr will be NULL. This patch fixes a crash when starting ovs-vswitchd on a system with such a device.
  Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@redhat.com>
  Fixes: a8704b502785 ("tunneling: Handle multiple ip address for given device.")
  Cc: Pravin B Shelar <pshelar@ovn.org>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
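The guard this fix describes looks roughly like the sketch below, written against the standard getifaddrs() interface. The function `count_ip_addrs` is a hypothetical example and is not the code from lib/netdev.c; it only shows where the NULL check has to sit before `ifa_addr` is dereferenced.

```c
#include <assert.h>
#include <ifaddrs.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical example: counts the addresses of 'family' (AF_INET or
 * AF_INET6) configured on the system.  Returns -1 if getifaddrs() fails. */
static int
count_ip_addrs(int family)
{
    struct ifaddrs *if_list;
    int n = 0;

    assert(family == AF_INET || family == AF_INET6);
    if (getifaddrs(&if_list) != 0) {
        return -1;
    }
    for (const struct ifaddrs *ifa = if_list; ifa; ifa = ifa->ifa_next) {
        if (ifa->ifa_addr == NULL) {
            /* Entries without an address exist, e.g. for some TUN devices.
             * Skip them instead of dereferencing a null pointer. */
            continue;
        }
        if (ifa->ifa_addr->sa_family == family) {
            n++;
        }
    }
    freeifaddrs(if_list);
    return n;
}
```

Calling `count_ip_addrs(AF_INET)` on a host with an address-less point-to-point device simply skips that entry.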
* list: Rename all functions in list.h with ovs_ prefix. | Ben Warren | 2016-03-30 | 1 | -4/+4
  This attempts to prevent namespace collisions with other list libraries
  Signed-off-by: Ben Warren <ben@skyportsystems.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
* list: Remove lib/list.h completely. | Ben Warren | 2016-03-30 | 1 | -1/+1
  All code is now in include/openvswitch/list.h.
  Signed-off-by: Ben Warren <ben@skyportsystems.com>
  Acked-by: Ryan Moats <rmoats@us.ibm.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
* netdev: remove netdev_get_in4() | Pravin B Shelar | 2016-03-24 | 1 | -42/+24
  Since a netdev can have multiple IP addresses, use the generic API netdev_get_addr_list(). This also makes it easier to handle IPv4 and IPv6 addresses across vswitchd layers.
  Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
  Acked-by: Ben Pfaff <blp@ovn.org>
* tunneling: Handle multiple ip address for given device. | Pravin B Shelar | 2016-03-24 | 1 | -12/+115
  A device can have multiple IP addresses, but netdev_get_in4/6() returns only one configured IPv6 address. The following patch fixes it. The OVS router is also updated to return the source IP address for a given destination. This is required when an interface has multiple IP addresses configured.
  Signed-off-by: Pravin B Shelar <pshelar@ovn.org>
  Acked-by: Ben Pfaff <blp@ovn.org>
* Move lib/dynamic-string.h to include/openvswitch directory | Ben Warren | 2016-03-19 | 1 | -1/+1
  Signed-off-by: Ben Warren <ben@skyportsystems.com>
  Signed-off-by: Ben Pfaff <blp@ovn.org>
* netdev: New field 'is_pmd' in netdev_class. | Ilya Maximets | 2016-03-16 | 1 | -4/+1
  Made to simplify creation of derived classes.
  Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
  Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
* netdev: Improve comments on netdev_rxq_recv(). | Ben Pfaff | 2016-03-07 | 1 | -16/+16
  The comment was incomplete in some ways and simply wrong in others.
  Also ensure that *cnt is set to 0 if an error is encountered. It's nice when callers can rely on this.
  Signed-off-by: Ben Pfaff <blp@ovn.org>
  Acked-by: Justin Pettit <jpettit@ovn.org>
* dpif-netdev: Add dpif-netdev/pmd-rxq-show appctl command. | Ilya Maximets | 2016-02-22 | 1 | -0/+6
  This command can be used to check the port/rxq assignment to pmd threads. For each pmd thread of the datapath it shows the list of queue-ids with port names.
  Additionally, the log message from pmd_thread_main() is extended with the queue-id, and the type of this message is changed from INFO to DBG.
  Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
  Acked-by: Flavio Leitner <fbl@sysclose.org>
  Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
* netdev: Free packets in netdev_send() for devices that don't support send. | Ben Pfaff | 2016-02-08 | 1 | -6/+14
  This manifested as a memory leak in test 898 "ofproto-dpif - sFlow packet sampling - tunnel set", which included an output to a tunnel vport that doesn't have an implementation of netdev_send().
  Reported-by: William Tu <u9012063@gmail.com>
  Reported-at: http://openvswitch.org/pipermail/dev/2016-February/065873.html
  Signed-off-by: Ben Pfaff <blp@ovn.org>
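The ownership rule behind this leak fix can be illustrated with a minimal model. All names here (`sketch_class`, `sketch_send`, the `may_steal` flag and the free counter) are assumptions made for the sketch and do not reproduce the netdev_send() signature; the idea is only that a send entry point which takes ownership of packets must release them even when the device class has no send implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical packet and device class types. */
struct sketch_packet {
    size_t size;
};

struct sketch_class {
    /* NULL when the device type cannot transmit. */
    int (*send)(struct sketch_packet **pkts, size_t n, bool may_steal);
};

static size_t sketch_n_freed;   /* Instrumentation for the example only. */

static void
sketch_packet_delete(struct sketch_packet *pkt)
{
    free(pkt);
    sketch_n_freed++;
}

/* Sends 'n' packets through 'cls'.  If 'may_steal' is true the callee owns
 * the packets, so they are freed here when no send function exists. */
static int
sketch_send(const struct sketch_class *cls, struct sketch_packet **pkts,
            size_t n, bool may_steal)
{
    assert(cls != NULL);
    if (!cls->send) {
        if (may_steal) {
            for (size_t i = 0; i < n; i++) {
                sketch_packet_delete(pkts[i]);
            }
        }
        return EOPNOTSUPP;
    }
    return cls->send(pkts, n, may_steal);
}
```

Without the loop in the error branch, every packet handed to a class that lacks a send function would be lost, which matches the kind of leak the commit message reports.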
* dpif-netdev: Allow different numbers of rx queues for different ports. | Ilya Maximets | 2016-02-04 | 1 | -0/+7
  Currently, all of the PMD netdevs can only have the same number of rx queues, which is specified in other_config:n-dpdk-rxqs.
  Fix that by introducing a new option for PMD interfaces, 'n_rxq', which specifies the maximum number of rx queues to be created for this interface.
  Example:
      ovs-vsctl set Interface dpdk0 options:n_rxq=8
  The old 'other_config:n-dpdk-rxqs' is deleted.
  Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
  Acked-by: Ben Pfaff <blp@ovn.org>
  Acked-by: Flavio Leitner <fbl@sysclose.org>
  Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
* userspace: Define and use struct eth_addr. | Jarno Rajahalme | 2015-08-28 | 1 | -4/+4
  Define struct eth_addr and use it instead of a uint8_t array for all ethernet addresses in OVS userspace. The struct is always the right size, and it can be assigned without an explicit memcpy, which makes code more readable.
  "struct eth_addr" is a good type name for this as many utility functions are already named accordingly.
  struct eth_addr can be accessed as bytes as well as ovs_be16's, which makes the struct 16-bit aligned. All use seems to be 16-bit aligned, so some algorithms on the ethernet addresses can be made a bit more efficient making use of this fact.
  As the struct fits into a register (in 64-bit systems) we pass it by value when possible.
  This patch also changes the few uses of Linux specific ETH_ALEN to OVS's own ETH_ADDR_LEN, and removes the OFP_ETH_ALEN, as it is no longer needed.
  This work stemmed from a desire to make all struct flow members assignable for unrelated exploration purposes. However, I think this might be a nice code readability improvement by itself.
  Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
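The properties this commit message lists (fixed size, plain assignment, byte and 16-bit views, pass by value) can be seen in a small stand-alone sketch. The type below is named `sketch_eth_addr` on purpose: it uses `uint16_t` in place of ovs_be16 and is an assumption about the general shape, not a copy of the OVS definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the struct described above: the same six bytes
 * viewed either as bytes or as three 16-bit words. */
struct sketch_eth_addr {
    union {
        uint8_t ea[6];
        uint16_t be16[3];
    };
};

_Static_assert(sizeof(struct sketch_eth_addr) == 6,
               "an Ethernet address must stay 6 bytes");

/* Compares two addresses passed by value, three 16-bit words at a time. */
static bool
sketch_eth_addr_equals(struct sketch_eth_addr a, struct sketch_eth_addr b)
{
    return a.be16[0] == b.be16[0]
           && a.be16[1] == b.be16[1]
           && a.be16[2] == b.be16[2];
}

/* True for ff:ff:ff:ff:ff:ff.  All-ones words look the same in any byte
 * order, so no conversion is needed for this particular test. */
static bool
sketch_eth_addr_is_broadcast(struct sketch_eth_addr a)
{
    return (a.be16[0] & a.be16[1] & a.be16[2]) == 0xffff;
}
```

Copying is a plain assignment (`struct sketch_eth_addr b = a;`), which is the readability gain the message mentions over an explicit memcpy on a byte array.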
* tunnels: Don't initialize unnecessary packet metadata. | Jesse Gross | 2015-07-01 | 1 | -1/+1
  The addition of Geneve options to packet metadata significantly expanded its size. It was reported that this can decrease performance for DPDK ports by up to 25% since we need to initialize the whole structure on each packet receive.
  It is not really necessary to zero out the entire structure because miniflow_extract() only copies the tunnel metadata when particular fields indicate that it is valid. Therefore, as long as we zero out these fields when the metadata is initialized and ensure that the rest of the structure is correctly set in the presence of a tunnel, we can avoid touching the tunnel fields on packet reception.
  Reported-by: Ciara Loftus <ciara.loftus@intel.com>
  Tested-by: Ciara Loftus <ciara.loftus@intel.com>
  Signed-off-by: Jesse Gross <jesse@nicira.com>
  Acked-by: Ben Pfaff <blp@nicira.com>
* Merge remote-tracking branch 'origin/master' into ovn4 | Justin Pettit | 2015-06-18 | 1 | -1/+2
|\
| * netdev-dpdk: add dpdk vhost-user ports | Ciara Loftus | 2015-06-14 | 1 | -1/+2
|   This patch adds support for a new port type to the userspace datapath called dpdkvhostuser. A new dpdkvhostuser port will create a unix domain socket which when provided to QEMU is used to facilitate communication between the virtio-net device on the VM and the OVS port on the host.
|   vhost-cuse ('dpdkvhost') ports are still available as 'dpdkvhostcuse' ports and will be enabled if vhost-cuse support is detected in the DPDK build specified during compilation of the switch. Otherwise, vhost-user ports are enabled.
|   Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
|   Acked-by: Flavio Leitner <fbl@redhat.com>
|   Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
* | netdev: Initialize at the beginning of netdev_unregister_provider(). | Ben Pfaff | 2015-06-16 | 1 | -0/+2
|/
  Otherwise, if netdev_unregister_provider() is called before any other netdev function, netdev_class_mutex is not initialized and the attempt to lock it aborts.
  This doesn't fix an existing bug but with the following commit --enable-dummy=system will make netdev_unregister_provider() the first netdev function to be called.
  Signed-off-by: Ben Pfaff <blp@nicira.com>
  Acked-by: Alex Wang <alexw@nicira.com>
* netdev-dpdk: Adapt the requested number of tx and rx queues. | Daniele Di Proietto | 2015-05-22 | 1 | -0/+10
  This commit changes the semantics of 'netdev_set_multiq()' to allow OVS DPDK to run on device with limited multi queue support.
  * If a netdev doesn't have the requested number of rxqs it can simply inform the datapath without failing.
  * If a netdev doesn't have the requested number of txqs it should try to create as many as possible and use locking.
  Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
  Signed-off-by: Ethan Jackson <ethan@nicira.com>
  Acked-by: Ethan Jackson <ethan@nicira.com>
* tunneling: Invalid packets should be cleared. | Jesse Gross | 2015-04-09 | 1 | -2/+1
  If we receive a packet with an invalid tunnel header, we should drop the packet without further processing. Currently we do this by removing any parsed tunnel metadata. However, this is not sufficient to stop processing - this only results in the packet getting dropped by chance when something usually runs across part of the packet that does not make sense. Since both the packet and its metadata are in an inconsistent state, it's also possible that the result is an ovs-vswitchd crash or forwarding of a mangled packet.
  Rather than clear the metadata, an alternate solution is to remove all of the packet data. This guarantees that the packet gets dropped during the next round of processing.
  Signed-off-by: Jesse Gross <jesse@nicira.com>
  Acked-by: Pravin B Shelar <pshelar@nicira.com>
* tunneling: Convert tunnel push/pop functions to act on single packets. | Jesse Gross | 2015-04-09 | 1 | -6/+28
  The userspace tunneling API for pushing and popping tunnel headers is currently based on processing batches of packets. However, there is no obvious way to take advantage of batching for these operations and so each tunnel operation has a pair of loops to process the batch.
  This changes the API to operate on single packets to enable better code reuse.
  Signed-off-by: Jesse Gross <jesse@nicira.com>
  Acked-by: Pravin B Shelar <pshelar@nicira.com>
* netdev: Fix user space tunneling for set_tunnel action. | Ricky Li | 2015-03-26 | 1 | -2/+4
  e.g. Set tunnel id for encapsulated VxLAN packet (out_key=flow):
      ovs-vsctl add-port int-br vxlan0 -- set interface vxlan0 \
          type=vxlan options:remote_ip=172.168.1.2 options:out_key=flow
      ovs-ofctl add-flow int-br in_port=LOCAL, icmp,\
          actions=set_tunnel:3, output:1
  (1 is the port# of vxlan0)
  Output tunnel ID should be modified to 3 with this patch.
  Signed-off-by: Ricky Li <ricky.li@intel.com>
  Acked-by: Pravin B Shelar <pshelar@nicira.com>
* netdev-dpdk: add dpdk vhost-cuse ports | Kevin Traynor | 2015-03-19 | 1 | -1/+2
  This patch adds support for a new port type to userspace datapath called dpdkvhost. This allows KVM (QEMU) to offload the servicing of virtio-net devices to its associated dpdkvhost port. Instructions for use are in INSTALL.DPDK.
  This has been tested on Intel multi-core platforms and with clients that have virtio-net interfaces.
  Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
  Signed-off-by: Kevin Traynor <kevin.traynor@intel.com>
  Signed-off-by: Maryam Tahhan <maryam.tahhan@intel.com>
  Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
* dp-packet: Remove ofpbuf dependency. | Pravin B Shelar | 2015-03-03 | 1 | -2/+2
  Currently dp-packet makes use of ofpbuf for managing packet buffers. That complicates ofpbuf. By making dp-packet independent of ofpbuf, both libraries can be optimized for their own use case. This avoids the mapping operation between ofpbuf and dp_packet in datapath upcalls.
  Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
  Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
  Acked-by: Ben Pfaff <blp@nicira.com>
* dpif_packet: Rename to dp_packet | Pravin B Shelar | 2015-03-03 | 1 | -4/+4
  dp_packet is a shorter and better name for the datapath packet structure.
  Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
  Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
* lib: Move vlog.h to <openvswitch/vlog.h> | Thomas Graf | 2014-12-15 | 1 | -1/+1
  A new function vlog_insert_module() is introduced to avoid using list_insert() from the vlog.h header.
  Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
  Acked-by: Ben Pfaff <blp@nicira.com>
* list: Rename struct list to struct ovs_list | Thomas Graf | 2014-12-15 | 1 | -1/+1
  struct list is a common name and can't be used in public headers.
  Signed-off-by: Thomas Graf <tgraf@noironetworks.com>
  Acked-by: Ben Pfaff <blp@nicira.com>
* dpif: Fix initialization order. | Pravin B Shelar | 2014-11-24 | 1 | -4/+0
  OVS router depends on tnl_conf_seq and all tunnel related components should be initialized before registering dpif implementations.
  Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
  Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
* openvswitch: Userspace tunneling. | Pravin B Shelar | 2014-11-12 | 1 | -0/+34
  The following patch adds support for userspace tunneling. Tunneling needs three more components: the first is a routing table, which is configured by caching kernel routes; the second is an ARP cache, which is built automatically by snooping ARP; and the third is a tunnel protocol table, which lists all listening protocols and is populated by vswitchd as tunnel ports are added. GRE and VXLAN protocol support is added in this patch.
  Tunneling works as follows:
  On packet receive, vswitchd checks if the packet is targeted to a tunnel port. If it is, vswitchd inserts a tunnel pop action which pops the header and sends the packet to the tunnel port.
  On packet xmit, rather than generating a Set tunnel action, it generates a tunnel push action which has the tunnel header data. The datapath can use the tunnel-push action data to generate the header for each packet and forward the packet to the output port. Since the tunnel-push action contains most of the packet header, vswitchd needs to look up the routing table and ARP table to build this action.
  Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
  Acked-by: Jarno Rajahalme <jrajahalme@nicira.com>
  Acked-by: Thomas Graf <tgraf@noironetworks.com>
  Acked-by: Ben Pfaff <blp@nicira.com>
* netdev-windows: New module. | Nithin Raju | 2014-10-06 | 1 | -0/+5
  In this patch, we add a lib/netdev-windows.c which mostly contains stub code and in subsequent patches, would use the netlink interface to query netdev information for a vport.
  The code implements netdev functionality for "internal" and "system" types of vports.
  Signed-off-by: Nithin Raju <nithin@vmware.com>
  Acked-by: Ankur Sharma <ankursharma@vmware.com>
  Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
  Tested-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com>
  Signed-off-by: Ben Pfaff <blp@nicira.com>
* netdev: Fix error check. | Alex Wang | 2014-09-25 | 1 | -1/+1
  Reported-by: Daniel Badea <daniel.badea@windriver.com>
  Signed-off-by: Alex Wang <alexw@nicira.com>
  Acked-by: Daniele Di Proietto <ddiproietto@vmware.com>