delta/openvswitch.git - github.com: openvswitch/ovs.git

	Commit message (Collapse)	Author	Age	Files	Lines
...
*	netdev-dummy: Add dummy-internal class.	Daniele Di Proietto	2016-08-15	1	-4/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	"internal" netdevs are treated specially in OVS (e.g. for MTU), but the dummy datapath remaps both "system" and "internal" devices to the same "dummy" netdev class, so there's no way to discern those in tests. This commit adds a new "dummy-internal" netdev type, which will be used by the dummy datapath for internal ports, so that other parts of the code can understand which ports are internal just by looking at the netdev object. The alternative solution, using the original interface type ("internal") instead of the translated netdev type ("dummy"), is harder to implement, because in so many places only the netdev object is available. Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ben Pfaff <blp@ovn.org>
*	dpif-netdev: dpcls per in_port with sorted subtables	Jan Scheurich	2016-08-12	1	-2/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The user-space datapath (dpif-netdev) consists of a first level "exact match cache" (EMC) matching on 5-tuples and the normal megaflow classifier. With many parallel packet flows (e.g. TCP connections) the EMC becomes inefficient and the OVS forwarding performance is determined by the megaflow classifier. The megaflow classifier (dpcls) consists of a variable number of hash tables (aka subtables), each containing megaflow entries with the same mask of packet header and metadata fields to match upon. A dpcls lookup matches a given packet against all subtables in sequence until it hits a match. As megaflow cache entries are by construction non-overlapping, the first match is the only match. Today the order of the subtables in the dpcls is essentially random so that on average a dpcls lookup has to visit N/2 subtables for a hit, when N is the total number of subtables. Even though every single hash-table lookup is fast, the performance of the current dpcls degrades when there are many subtables. How does the patch address this issue: In reality there is often a strong correlation between the ingress port and a small subset of subtables that have hits. The entire megaflow cache typically decomposes nicely into partitions that are hit only by packets entering from a range of similar ports (e.g. traffic from Phy -> VM vs. traffic from VM -> Phy). Therefore, maintaining a separate dpcls instance per ingress port with its subtable vector sorted by frequency of hits reduces the average number of subtables lookups in the dpcls to a minimum, even if the total number of subtables gets large. This is possible because megaflows always have an exact match on in_port, so every megaflow belongs to unique dpcls instance. For thread safety, the PMD thread needs to block out revalidators during the periodic optimization. We use ovs_mutex_trylock() to avoid blocking the PMD. To monitor the effectiveness of the patch we have enhanced the ovs-appctl dpif-netdev/pmd-stats-show command with an extra line "avg. subtable lookups per hit" to report the average number of subtable lookup needed for a megaflow match. Ideally, this should be close to 1 and almost all cases much smaller than N/2. The PMD tests have been adjusted to the additional line in pmd-stats-show. We have benchmarked a L3-VPN pipeline on top of a VXLAN overlay mesh. With pure L3 tenant traffic between VMs on different nodes the resulting netdev dpcls contains N=4 subtables. Each packet traversing the OVS datapath is subject to dpcls lookup twice due to the tunnel termination. Disabling the EMC, we have measured a baseline performance (in+out) of ~1.45 Mpps (64 bytes, 10K L4 packet flows). The average number of subtable lookups per dpcls match is 2.5. With the patch the average number of subtable lookups per dpcls match is reduced to 1 and the forwarding performance grows by ~50% to 2.13 Mpps. Even with EMC enabled, the patch improves the performance by 9% (for 1000 L4 flows) and 34% (for 50K+ L4 flows). As the actual number of subtables will often be higher in reality, we can assume that this is at the lower end of the speed-up one can expect from this optimization. Just running a parallel ping between the VXLAN tunnel endpoints increases the number of subtables and hence the average number of subtable lookups from 2.5 to 3.5 on master with a corresponding decrease of throughput to 1.2 Mpps. With the patch the parallel ping has no impact on average number of subtable lookups and performance. The performance gain is then ~75%. Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com> Acked-by: Antonio Fischetti <antonio.fischetti@intel.com> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
*	tests: Add new pmd test for pmd-rxq-affinity.	Daniele Di Proietto	2016-07-29	1	-0/+53
\| \| \| \| \| \| \| \|	This tests that the newly introduced pmd-rxq-affinity option works as intended, at least for a single port. Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ilya Maximets <i.maximets@samsung.com>
*	netdev-*: Do not use dp_packet_pad() in recv() functions.	Daniele Di Proietto	2016-07-29	1	-16/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	All the netdevs used by dpif-netdev (except for netdev-dpdk) have a dp_packet_pad() call in the receive function, probably because the userspace datapath couldn't handle properly short packets. This doesn't appear to be the case anymore. This commit removes the call to have a more consistent behavior with the kernel datapath. All the testsuite changes in this commit adjust the expectations for packet lengths in flow dumps and other stats. There's only one fix in ovn.at: one of the test_ip() functions generated an incomplete udp packet, which was not a problem until now, because of the padding. Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ben Pfaff <blp@ovn.org>
*	dpif-netdev: Execute conntrack action.	Daniele Di Proietto	2016-07-27	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	This commit implements the OVS_ACTION_ATTR_CT action in dpif-netdev. To allow ofproto-dpif to detect the conntrack feature, flow_put will not discard anymore flows with ct_* fields set. We still shouldn't allow flows with NAT bits set, since there is no support for NAT. Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Flavio Leitner <fbl@sysclose.org> Acked-by: Antonio Fischetti <antonio.fischetti@intel.com>
*	dpif-netdev: Introduce pmd-rxq-affinity.	Ilya Maximets	2016-07-27	1	-0/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	New 'other_config:pmd-rxq-affinity' field for Interface table to perform manual pinning of RX queues to desired cores. This functionality is required to achieve maximum performance because all kinds of ports have different cost of rx/tx operations and only user can know about expected workload on different ports. Example: # ./bin/ovs-vsctl set interface dpdk0 options:n_rxq=4 \ other_config:pmd-rxq-affinity="0:3,1:7,3:8" Queue #0 pinned to core 3; Queue #1 pinned to core 7; Queue #2 not pinned. Queue #3 pinned to core 8; It's decided to automatically isolate cores that have rxq explicitly assigned to them because it's useful to keep constant polling rate on some performance critical ports while adding/deleting other ports without explicit pinning of all ports. Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
*	tests: Fixed PMD tests on Windows	Paul Boca	2016-06-28	1	-2/+2
\| \| \| \| \| \| \| \| \| \|	CHECK_CPU_DISCOVERED check the log file now, not the stderr. On Windows the ovs-vswitchd output is logged only in log file, not to stderr. Tested both on Windows and Linux Signed-off-by: Paul-Daniel Boca <pboca@cloudbasesolutions.com> Acked-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Signed-off-by: Gurucharan Shetty <guru@ovn.org>
*	test: Add more pmd tests.	Daniele Di Proietto	2016-06-24	1	-0/+275
\| \| \| \| \| \| \| \|	These tests stress the pmd thread and multiqueue handling in dpif-netdev. Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ben Pfaff <blp@ovn.org>
*	dpif-netdev: Print installed flows in dpif format.	Jesse Gross	2016-06-13	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When debug logging is enabled, dpif-netdev can print each flow as it is installed, which it currently does using OpenFlow match formatting. Compared to ODP formatting, there generally isn't too much difference since the fields are largely the same but it is inconsistent with other logging in dpif-netdev as well as the analogous functions that deal with the kernel. However, in some cases there is a difference between the two formats, such as in the cases of input port or tunnel metadata. For input port, datapath format helped detect that the generated masks were incorrect. As for tunnels, at the moment, it's possible to convert between the two formats on demand as we have a global metadata table. In the future, though this won't be possible as the metadata table becomes per-bridge which the datapath won't have access to. Signed-off-by: Jesse Gross <jesse@kernel.org> Acked-by: Daniele Di Proietto <diproiettod@vmware.com>
*	testsuite: Add PMD specific tests.	Ilya Maximets	2016-06-07	1	-0/+182
	Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>