path: root/vswitchd
Commit message | Author | Age | Files | Lines
* vswitch.xml: Fix typo in documentation. | Ben Pfaff | 2017-10-23 | 1 | -1/+1
| | | | | Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Yi-Hung Wei <yihung.wei@gmail.com>
* vswitch.xml: Better document patch ports. | Ben Pfaff | 2017-10-12 | 1 | -1/+29
| | | | | | Reported-by: Hui Xiang <xianghuir@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Andy Zhou <azhou@ovn.org>
* bridge: Fix controller status update to passive connections | Andy Zhou | 2017-09-14 | 1 | -5/+11
| | | | | | | | | | | | The bug can cause ovs-vswitchd to crash (due to assert) when it is set up with a passive controller connection. Since only active connections are kept, the passive connection status update should be ignored and not trigger asserts. Fixes: 85c55772a453 ("bridge: Fix controller status update") Reported-by: Josh Bailey <josh@faucet.nz> Signed-off-by: Andy Zhou <azhou@ovn.org> Acked-by: Joe Stringer <joe@ovn.org>
* dpif-netdev: Add ovs-appctl dpif-netdev/pmd-rxq-rebalance. | Kevin Traynor | 2017-08-25 | 1 | -0/+2
| | | | | | | | | | | | | | | Rxqs consumed processing cycles are used to improve the balance of how rxqs are assigned to pmds. Currently some reconfiguration is needed to perform a reassignment. Add an ovs-appctl command to perform a new assignment in order to balance based on the latest rxq processing cycle information. Note: Jan requested this for testing purposes. Suggested-by: Jan Scheurich <jan.scheurich@ericsson.com> Signed-off-by: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Darrell Ball <dlu998@gmail.com>
* ovsschema: Fix line lengths. | Bhanuprakash Bodireddy | 2017-08-08 | 1 | -23/+44
| | | | | | | | | According to coding style the line lengths should be <=79. Fix the schema file and update the checksum and version number to reflect the change. Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* bridge: Avoid read of uninitialized data configuring Auto-Attach. | Ben Pfaff | 2017-08-03 | 1 | -1/+1
| | | | | | | Reported-by: "qintao (F)" <qintao5@huawei.com> Reported-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2017-April/044309.html Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Andy Zhou <azhou@ovn.org>
* Eliminate most shadowing for local variable names. | Ben Pfaff | 2017-08-02 | 1 | -19/+11
| | | | | | | | | | | | | | Shadowing is when a variable with a given name in an inner scope hides a different variable with the same name in a surrounding scope. This is generally undesirable because it can confuse programmers. This commit eliminates most of it. Found with -Wshadow=local in GCC 7. The repo is not really ready to enable this option by default because of a few cases that are harder to fix, and harmless, such as nested use of CMAP_FOR_EACH. Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Andy Zhou <azhou@ovn.org>
* vswitch.xml: Fix L2 balancing mentioning for balance-tcp bond. | Ilya Maximets | 2017-07-25 | 1 | -3/+2
| | | | | | | | | | | | | | L2 fields have not been used in the userspace hash action since commit 4f150744921f ("dpif-netdev: Use miniflow as a flow key."). In the kernel datapath, RSS (which does not include L2 by default for most NICs) was used from the beginning. This means that if recirculation is in use, L2 fields are not used for flow balancing. Fix the documentation accordingly. Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Signed-off-by: Andy Zhou <azhou@ovn.org>
* process: Consolidate process related APIs. | Bhanuprakash Bodireddy | 2017-07-13 | 1 | -250/+1
| | | | | | | | | | | | As part of retrieving system statistics, process status APIs along with helper functions were implemented. Some of them are very generic and can be reused by other subsystems. Move the APIs in system-stats.c to process.c and util.c and make them available. This patch doesn't change any functionality. Signed-off-by: Bhanuprakash Bodireddy <bhanuprakash.bodireddy@intel.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* bridge: Filter all zero mac when use ovs-vsctl to set mac | zhongbaisong | 2017-07-13 | 1 | -0/+3
| | | | | Signed-off-by: zhongbaisong <zhongbaisong@huawei.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* netdev-dpdk: Remove Rx checksum reconfigure. | Kevin Traynor | 2017-07-11 | 1 | -14/+0
| | | | | | | | | | | | | | | | | | | | Rx checksum offload is enabled by default on DPDK physical NICs where available, with reconfiguration through options:rx-checksum-offload. However, changing rx-checksum-offload did not result in a reconfiguration of the NIC and wrong status is reported for it. As there seems to be diminishing reasons why a user would want to disable Rx checksum offload, just remove the broken reconfiguration option. Fixes: 1a2bb11817a4 ("netdev-dpdk: Enable Rx checksum offloading feature on DPDK physical ports.") Reported-by: Kevin Traynor <ktraynor@redhat.com> Suggested-by: Sugesh Chandran <sugesh.chandran@intel.com> Acked-by: Darrell Ball <dlu998@gmail.com> Tested-by: Sugesh Chandran <sugesh.chandran@intel.com> Signed-off-by: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* vswitchd: Fix IFACE_STAT name error in iface_refresh_stats | Zhenyu Gao | 2017-07-11 | 1 | -2/+2
| | | | | | | | | | | | | | | The element of rx_1024_to_1522_packets has wrong name(rx_1024_to_1518_packets). Change it from rx_1024_to_1518_packets to rx_1024_to_1522_packets, it should record packets between 1024 to 1522. The element of tx_1024_to_1522_packets has wrong name(tx_1024_to_1518_packets). Change it from tx_1024_to_1518_packets to tx_1024_to_1522_packets, it should record packets between 1024 to 1522. CC: mweglicx <michalx.weglicki@intel.com> Fixes: d6e3feb57c44 ("Add support for extended netdev statistics based on RFC 2819.") Signed-off-by: Zhenyu Gao <sysugaozhenyu@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* dpif-netdev: Change definitions of 'idle' & 'processing' cycles | Ciara Loftus | 2017-07-06 | 1 | -1/+4
| | | | | | | | | | | | | | | Instead of counting all polling cycles as processing cycles, only count the cycles where packets were received from the polling. Signed-off-by: Georg Schmuecking <georg.schmuecking@ericsson.com> Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Co-authored-by: Georg Schmuecking <georg.schmuecking@ericsson.com> Acked-by: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Ian Stokes <ian.stokes@intel.com> Tested-by: Ian Stokes <ian.stokes@intel.com> Acked-by: Darrell Ball <dlu998@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* userspace: Handling of versatile tunnel ports | Ben Pfaff | 2017-06-27 | 1 | -17/+77
| | | | | | | | | | In netdev_gre_build_header(), the GRE protocol and the VXLAN next_protocol are set based on the packet_type of the flow. For an Ethernet packet, the value is set to ETP_TYPE_TEB. Otherwise, if the name space is OFPHTN_ETHERNET, it is set according to the name space type. Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* docs: Document that hw-offload is experimental. | Joe Stringer | 2017-06-19 | 1 | -0/+5
| | | | | | | | | | | | | | | | | | Currently, the set of flows that may be offloaded is very small compared to the overall capabilities of the OpenFlow support in OVS. In the majority of cases, if a user attempts to enable this flag they are unlikely to observe a performance increase, because for instance they lack the correct hardware; lack the correct kernel version; or their flow tables are too complex for the hardware to handle. To moderate expectations around this feature, describe it as experimental. Over time, we expect that the functionality and usefulness of this feature will grow and we should be in a better shape to revisit the status of this functionality after it has had some time to mature. Signed-off-by: Joe Stringer <joe@ovn.org> Acked-by: Simon Horman <simon.horman@netronome.com> Acked-by: Flavio Leitner <fbl@sysclose.org>
* other-config: Add tc-policy switch to control tc flower flag | Paul Blakey | 2017-06-15 | 1 | -0/+17
| | | | | | | | | | | | Add a new configuration tc-policy option that controls tc flower flag. Possible options are none, skip_sw, skip_hw. The default is none which is to insert the rule both to sw and hw. This option is only relevant if hw-offload is enabled. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Acked-by: Flavio Leitner <fbl@sysclose.org> Signed-off-by: Simon Horman <simon.horman@netronome.com>
* other-config: Add hw-offload switch to control netdev flow offloading | Paul Blakey | 2017-06-14 | 2 | -0/+16
| | | | | | | | | | | | Add a new configuration option - hw-offload that enables netdev flow api. Enabling this option will allow offloading flows using netdev implementation instead of the kernel datapath. This configuration option defaults to false - disabled. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Acked-by: Flavio Leitner <fbl@sysclose.org> Signed-off-by: Simon Horman <simon.horman@netronome.com>
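Illustration (not part of either commit): the two messages above describe how tc-policy and hw-offload interact. The following is a minimal Python sketch of that interaction, assuming the options arrive as a plain dictionary of strings. The function name `effective_tc_policy` is hypothetical, and the fallback to "none" for an unrecognized value is an assumption, since the commit messages do not say how invalid values are handled.

```python
# Values named in the tc-policy commit message.
VALID_TC_POLICIES = ("none", "skip_sw", "skip_hw")


def effective_tc_policy(other_config):
    """Return the tc-policy that would apply, or None if hw-offload is off.

    `other_config` is a dict of strings standing in for the Open_vSwitch
    other_config column (hypothetical representation for this sketch).
    """
    # hw-offload defaults to false; tc-policy is only relevant when it is on.
    if other_config.get("hw-offload", "false") != "true":
        return None

    # The default policy is "none": insert the rule in both sw and hw.
    policy = other_config.get("tc-policy", "none")
    if policy not in VALID_TC_POLICIES:
        # Assumption: unknown values fall back to the default.
        return "none"
    return policy
```

With this sketch, `effective_tc_policy({"hw-offload": "true", "tc-policy": "skip_sw"})` yields `"skip_sw"`, while an empty configuration yields `None` because offloading is disabled by default.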
* rstp: Add the 'ovs-appctl rstp/show' command. | nickcooper-zhangtonghao | 2017-06-08 | 1 | -1/+10
| | | | | | | | | The rstp/show command will help users and developers to get more details about rstp. This patch works together with the previous patches. Signed-off-by: nickcooper-zhangtonghao <nic@opencloud.tech> Signed-off-by: Ben Pfaff <blp@ovn.org>
* docs: Update dpdk vdev naming instructions. | Ciara Loftus | 2017-06-07 | 1 | -4/+5
| | | | | | Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Stephen Finucane <stephen@that.guru>
* bfd: Detect Multiplier configuration | Szucs Gabor | 2017-06-06 | 1 | -6/+13
| | | | | The mult value (bfd.DetectMult in RFC5880) is hard-coded to 3 in current openvswitch. As a consequence, the remote and local mult are the same. With this commit, the mult (Detect Multiplier/bfd.DetectMult/Detect Mult) can be set on each interface by setting mult=<value> in the bfd column of the Interface table of the ovsdb database. Example: ovs-vsctl set Interface p1 bfd:mult=4 sets mult=4 on the p1 interface. The modification is based on RFC5880 (June 2010). The relevant sections are: 4.1. Generic BFD Control Packet Format; 6.8.4. Calculating the Detection Time; 6.8.7. Transmitting BFD Control Packets; 6.8.12. Detect Multiplier Change. The mult value defaults to 3 if it is not set in ovsdb. This provides backward compatibility with previous openvswitch behaviour. RFC5880 says in 6.8.1 that DetectMult shall be a non-zero integer. In RFC5880 4.1, "Detect Mult" is 8 bits long, and it is declared as an 8-bit unsigned integer in bfd.c. Consequently the mult value shall be greater than 0 and less than 256. If an incorrect mult value is given in ovsdb, the default value (3) is set and a message about it is logged to ovs-vswitchd.log. A local or remote mult value change is also logged to ovs-vswitchd.log. Since the remote and local mult are no longer necessarily the same, the calculation of the detect time has been changed. According to RFC5880 6.8.4, the Detection Time is calculated using the mult value of the remote system. The detection time is recalculated when the remote mult changes. The BFD packet transmission jitter is different when mult=1, according to RFC5880 6.8.7: the maximum interval between transmitted bfd packets is 90% of the transmission interval. The value of the remote mult is printed in the last line of the output of the ovs-appctl bfd/show command with the label: Remote Detect Mult. There is a feature in openvswitch connected with forwarding_if_rx that is not part of RFC5880. This feature also uses the mult value, but it is not specified whether the local or the remote one, since they were the same in the original code. The relevant description in the code: /* When 'bfd->forwarding_if_rx' is set, at least one bfd control packet * is required to be received every 100 * bfd->cfg_min_rx. If bfd * control packet is not received within this interval, even if data * packets are received, the bfd->forwarding will still be false. */ Due to the lack of specification, the local mult value is used to calculate forwarding_if_rx_detect_time. This detect time is recalculated on a mult change if forwarding_if_rx is true and bfd is in the UP state. A new unit test has been added: "bfd - Edit the Detect Mult values". The following cases are tested: - Without setting mult, the mult is the default value (3). - Setting the lowest (1) and highest (255) valid mult values and detecting the remote mult value. - Setting an out-of-range mult value (0, 256) in ovsdb results in the default value being set in ovs-vswitchd. - Clearing a non-default mult value from ovsdb results in the default value being set in ovs-vswitchd. Signed-off-by: Gábor Szűcs <gabor.sz.cs@ericsson.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
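Illustration (not part of the commit): the validation and detection-time rules described above can be sketched in Python. This is a simplified model, not the code in bfd.c; the function names and the millisecond units are assumptions made for the example. The detection-time formula follows RFC5880 6.8.4, where the agreed interval is the greater of the local required min RX interval and the remote desired min TX interval.

```python
DEFAULT_MULT = 3  # default named in the commit message


def effective_mult(configured):
    """Map a configured bfd:mult string (or None) to the mult actually used.

    Values outside 1..255, or values that are not integers, fall back to
    the default of 3, as the commit message describes.
    """
    if configured is None:
        return DEFAULT_MULT
    try:
        value = int(configured)
    except ValueError:
        return DEFAULT_MULT
    return value if 1 <= value <= 255 else DEFAULT_MULT


def detection_time_ms(remote_mult, local_min_rx_ms, remote_min_tx_ms):
    """Detection time per RFC5880 6.8.4, using the remote system's mult."""
    agreed_interval_ms = max(local_min_rx_ms, remote_min_tx_ms)
    return remote_mult * agreed_interval_ms
```

For example, a remote mult of 4 with a local min RX of 1000 ms and a remote min TX of 100 ms gives a detection time of 4000 ms, because the larger of the two intervals is used.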
* userspace: add vxlan gpe support to vport | Georg Schmuecking | 2017-06-02 | 1 | -0/+17
| | | | | | | | | | | | | | This patch is based on the "datapath: enable vxlangpe creation in compat mode" from Yi Yang. It introduces an extension option "gpe" to the vxlan port in the netdev-dpdk datapath. A description of the vxlan gpe protocol was added to the header file lib/packets.h. In the vxlan-specific methods the different packets are introduced and handled. Added a VXLAN GPE tunnel push test. Signed-off-by: Yi Yang <yi.y.yang at intel.com> Signed-off-by: Georg Schmuecking <georg.schmuecking@ericsson.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* userspace: L3 tunnel support for GRE and LISP | Jan Scheurich | 2017-06-02 | 1 | -2/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a boolean "layer3" configuration option for tunnel vports. The layer3 option defaults to false for all ports except LISP. GRE ports accept both true and false for "layer3". A tunnel vport configured with layer3=true receives L3 packets, which are then converted to Ethernet packets by pushing a dummy Ethernet header at the ingress of the OpenFlow pipeline. The Ethernet header of a packet is stripped before sending to a layer3 tunnel vport. Presently a single GRE vport cannot carry both L2 and L3 packets. But it is possible to create two GRE vports representing the same GRE tunnel, one with layer3=false, the other with layer3=true. L2 packets from the tunnel are received on the first vport, L3 packets on the second. The controller must send packets to the layer3 GRE vport to tunnel them without their Ethernet header. Unit tests have been added to check the L3 tunnel handling. LISP tunnels are not yet supported by the netdev userspace datapath. Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Jiri Benc <jbenc@redhat.com> Signed-off-by: Yi Yang <yi.y.yang@intel.com> Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com> Co-authored-by: Zoltan Balogh <zoltan.balogh@ericsson.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* bridge: Fix memory leak in bridge_aa_update_trunks(). | Ben Pfaff | 2017-06-01 | 1 | -0/+2
| | | | | | | | Found by Coverity. Reported-at: https://scan3.coverity.com/reports.htm#v16889/p10449/fileInstanceId=14763131&defectInstanceId=4305313&mergedDefectId=180411 Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Justin Pettit <jpettit@ovn.org>
* rstp: Increment the rstp port num counter. | nickcooper-zhangtonghao | 2017-05-31 | 1 | -0/+4
| | | | | | | | This counter is supposed to prevent having too many RSTP ports, but nothing ever incremented it. Signed-off-by: nickcooper-zhangtonghao <nic@opencloud.tech> Signed-off-by: Ben Pfaff <blp@ovn.org>
* bridge: Fix controller status update | Andy Zhou | 2017-05-15 | 1 | -18/+15
| | | | | | | | | | | | | When multiple bridges connect to the same controller, the controller status should be maintained separately for each bridge. The current logic pushes status updates based only on the connection string, which is the same across multiple bridges when they connect to the same controller. Report-at: https://mail.openvswitch.org/pipermail/ovs-discuss/2017-May/044412.html Reported-by: Tulio Ribeiro <tribeiro@lasige.di.fc.ul.pt> Signed-off-by: Andy Zhou <azhou@ovn.org> Reviewed-by: Greg Rose <gvrose@8192@gmail.com>
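Illustration (not part of the commit): the problem described above is a key collision. The Python sketch below is a toy model, not the bridge.c logic; the bridge names, the controller target, and both helper functions are hypothetical. It shows why keying status by connection string alone loses information, and how keying by bridge and connection string keeps one entry per bridge.

```python
def status_by_target(connections):
    """Index controller status by connection string only (the buggy keying)."""
    status = {}
    for bridge, target, state in connections:
        status[target] = state  # later bridges overwrite earlier ones
    return status


def status_by_bridge_and_target(connections):
    """Index controller status per bridge, so each bridge keeps its own entry."""
    status = {}
    for bridge, target, state in connections:
        status[(bridge, target)] = state
    return status


# Two hypothetical bridges that point at the same controller.
CONNECTIONS = [
    ("br0", "tcp:127.0.0.1:6653", "ACTIVE"),
    ("br1", "tcp:127.0.0.1:6653", "BACKOFF"),
]
```

Applied to `CONNECTIONS`, the first function keeps a single entry (the state of br0 is lost), while the second keeps both.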
* vswitchd: Add --cleanup option to the 'appctl exit' command | Andy Zhou | 2017-05-03 | 4 | -12/+26
| | | | | | | | | | | | | | | | | | 'appctl exit' stops the running vswitchd daemon, without releasing the datapath resources (such as bridges and ports) that vswitchd has created. This is expected when vswitchd is to be relaunched, to reduce the perturbation of existing traffic and connections. However, when vswitchd is intended to be shut down permanently, it is desirable not to leak datapath resources. In theory, this can be achieved by removing the corresponding configurations from OVSDB before shutting down vswitchd. However, this is not always possible in practice. Sometimes it is convenient and robust for vswitchd to release all datapath resources that it has configured. Add the 'appctl exit --cleanup' option for this use case. Signed-off-by: Andy Zhou <azhou@ovn.org> Acked-by: Jarno Rajahalme <jarno@ovn.org>
* OpenFlow: Enable OpenFlow 1.4 by default. | Ben Pfaff | 2017-05-01 | 1 | -3/+3
| | | | | | | Open vSwitch now supports all OpenFlow 1.4 required features, so enable it by default. Signed-off-by: Ben Pfaff <blp@ovn.org>
* docs: Add some detail about dpdk-socket-mem. | Kevin Traynor | 2017-04-24 | 1 | -2/+3
| | | | | | | | | Using dpdk-socket-mem to allocate memory for some NUMA nodes but leaving it blank for subsequent ones is equivalent to assigning 0 MB of memory to those subsequent nodes. Document this behavior. Signed-off-by: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
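Illustration (not part of the commit): the documented equivalence can be shown with a small Python sketch. The helper `parse_socket_mem` is hypothetical and is not how OVS or DPDK parse the option; it only models the rule that NUMA nodes left out of a comma-separated dpdk-socket-mem value get 0 MB. Treating an empty field between commas as 0 is an extra assumption.

```python
def parse_socket_mem(value, numa_nodes):
    """Expand a dpdk-socket-mem style string to one MB value per NUMA node.

    Nodes that are not listed are padded with 0, matching the documented
    behaviour that omitted trailing nodes receive 0 MB.
    """
    parts = value.split(",") if value else []
    # Assumption: an empty field is treated the same as 0.
    mem = [int(p) if p.strip() else 0 for p in parts]
    # Pad the nodes that were left out with 0 MB.
    mem += [0] * (numa_nodes - len(mem))
    return mem
```

On a two-node system, `parse_socket_mem("1024", 2)` and `parse_socket_mem("1024,0", 2)` produce the same list.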
* bridge: Fix memory leak in port_configure() | Yi-Hung Wei | 2017-04-24 | 1 | -0/+1
| | | | | | | | | | | | | | | In testcase "ofproto-dpif - VLAN handling", valgrind reports a memory leak with the following call stack. xcalloc (util.c:95) bitmap_allocate (bitmap.h:51) vlan_bitmap_from_array (vlan-bitmap.c:32) port_configure (bridge.c:983) bridge_reconfigure (bridge.c:682) bridge_run (bridge.c:2993) main (ovs-vswitchd.c:111) Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* bridge: Log interface deletion | Andy Zhou | 2017-04-21 | 1 | -0/+3
| | | | | | | | | Currently interface additions are logged but not deletions. This makes system debugging, such as confirming that OVSDB transactions are replicated in a timely manner, harder than necessary. Signed-off-by: Andy Zhou <azhou@ovn.org> Acked-by: Jarno Rajahalme <jarno@ovn.org>
* use portable getpagesize() in system-stats | Alin Serdean | 2017-04-14 | 1 | -1/+1
| | | | | | | | Use the intended portable function defined above "get_page_size()" not "getpagesize()". Signed-off-by: Alin Gabriel Serdean <aserdean@cloudbasesolutions.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* ovs-ctl: Expose openvswitch run directory through ovsdb. | Robert Wojciechowicz | 2017-04-07 | 1 | -5/+13
| | | | | | | | | | | | | When using vhost-user client or server mode with OpenStack, Neutron needs to be able to construct the fully qualified socket path and pass it to Nova. While the relative vhost-user socket directory is exposed via the Open_vSwitch table in other_config:vhost-sock-dir, the openvswitch run directory that it is relative to is not. This patch adds it to the Open_vSwitch table as external_ids:rundir. Signed-off-by: Robert Wojciechowicz <robertx.wojciechowicz@intel.com> Acked-by: Sean K Mooney <sean.k.mooney@intel.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* stp: Add the 'ovs-appctl stp/show' command. | nickcooper-zhangtonghao | 2017-03-20 | 1 | -0/+4
| | | | | | | | | | The stp/show command will help users and developers to get more details about stp. This patch works together with the previous patch "stp: Change the api for next patch." Signed-off-by: nickcooper-zhangtonghao <nic@opencloud.tech> Co-authored-by: Ben Pfaff <blp@ovn.org> Signed-off-by: Ben Pfaff <blp@ovn.org>
* Fix format specifier technicalities. | Ben Pfaff | 2017-03-17 | 1 | -1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | Various printf() format specifiers in the tree had minor technical issues which the Mac OS build reported, e.g. here: https://s3.amazonaws.com/archive.travis-ci.org/jobs/208718342/log.txt These tend to fall into two categories of harmless warnings: 1. Wrong width for types that are all promoted to 'int'. For example, uint8_t and uint16_t are both promoted to 'int' as part of a call to printf(), but using PRIu8 for a uint16_t causes a warning. 2. Wrong format specifier for type promoted to 'int' due to arithmetic. For example, if 'x' is a uint8_t, then x >> 1 has type 'int' due to C's promotion rules, so the correct format specifier is %d and using PRIu8 will cause a warning. This commit fixes the warnings. I didn't see anything that rose to the level of a bug. These warnings only showed up on Mac OS X because of differences in the format specifiers that Mac OS uses for PRI*. Reported-by: Shu Shen <shu.shen@gmail.com> Acked-by: Daniele Di Proietto <diproiettod@vmware.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* stp: Use correct default for BPDU max age. | nickcooper-zhangtonghao | 2017-03-17 | 1 | -1/+1
| | | | | | | | The default max age should be 20 seconds, but this typo caused it to default to 2 seconds. Signed-off-by: nickcooper-zhangtonghao <nic@opencloud.tech> Signed-off-by: Ben Pfaff <blp@ovn.org>
* Add new port VLAN mode "dot1q-tunnel" | Eric Garver | 2017-03-17 | 3 | -6/+111
| | | | | | | | | | | | | | | - Example: ovs-vsctl set Port p1 vlan_mode=dot1q-tunnel tag=100 Pushes another VLAN 100 header on packets (tagged and untagged) on ingress, and pops it on egress. - Customer VLAN check: ovs-vsctl set Port p1 vlan_mode=dot1q-tunnel tag=100 cvlans=10,20 Only customer VLANs 10 and 20 are allowed. Co-authored-by: Xiao Liang <shaw.leon@gmail.com> Signed-off-by: Xiao Liang <shaw.leon@gmail.com> Signed-off-by: Eric Garver <e@erig.me> Signed-off-by: Ben Pfaff <blp@ovn.org>
* Add support for 802.1ad (QinQ tunneling) | Eric Garver | 2017-03-16 | 2 | -0/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Flow key handling changes: - Add VLAN header array in struct flow, to record multiple 802.1q VLAN headers. - Add dpif multi-VLAN capability probing. If datapath supports multi-VLAN, increase the maximum depth of nested OVS_KEY_ATTR_ENCAP. Refactor VLAN handling in dpif-xlate: - Introduce 'xvlan' to track VLAN stack during flow processing. - Input and output VLAN translation according to the xbundle type. Push VLAN action support: - Allow ethertype 0x88a8 in VLAN headers and push_vlan action. - Support push_vlan on dot1q packets. Use other_config:vlan-limit in table Open_vSwitch to limit maximum VLANs that can be matched. This allows us to preserve backwards compatibility. Add test cases for VLAN depth limit, Multi-VLAN actions and QinQ VLAN handling Co-authored-by: Thomas F Herbert <thomasfherbert@gmail.com> Signed-off-by: Thomas F Herbert <thomasfherbert@gmail.com> Co-authored-by: Xiao Liang <shaw.leon@gmail.com> Signed-off-by: Xiao Liang <shaw.leon@gmail.com> Signed-off-by: Eric Garver <e@erig.me> Signed-off-by: Ben Pfaff <blp@ovn.org>
* dpdk: Improve manpage for dpdk memory configuration. | nickcooper-zhangtonghao | 2017-03-03 | 1 | -6/+4
| | | | | Signed-off-by: nickcooper-zhangtonghao <nic@opencloud.tech> Signed-off-by: Ben Pfaff <blp@ovn.org>
* dpif-netdev: Conditional EMC insert | Ciara Loftus | 2017-02-16 | 1 | -0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | Unconditional insertion of EMC entries results in EMC thrashing at high numbers of parallel flows. When this occurs, the performance of the EMC often falls below that of the dpcls classifier, rendering the EMC practically useless. Instead of unconditionally inserting entries into the EMC when a miss occurs, use a 1% probability of insertion. This ensures that the most frequent flows have the highest chance of creating an entry in the EMC, and the probability of thrashing the EMC is also greatly reduced. The probability of insertion is configurable, via the other_config:emc-insert-inv-prob option. This value sets the average probability of insertion to 1/emc-insert-inv-prob. For example the following command changes the insertion probability to (on average) 1 in every 20 packets ie. 1/20 ie. 5%. ovs-vsctl set Open_vSwitch . other_config:emc-insert-inv-prob=20 Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Signed-off-by: Georg Schmuecking <georg.schmuecking@ericsson.com> Co-authored-by: Georg Schmuecking <georg.schmuecking@ericsson.com> Acked-by: Kevin Traynor <ktraynor@redhat.com> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
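Illustration (not part of the commit): the probabilistic insertion described above can be modelled in a few lines of Python. This is a sketch of the idea only, not the dpif-netdev implementation; the function name `should_insert` and the use of Python's `random` module are assumptions. An inverse probability of 100 corresponds to the 1% insertion probability mentioned in the message, and 20 corresponds to the 5% example.

```python
import random


def should_insert(inv_prob, rng):
    """Return True with an average probability of 1/inv_prob.

    `inv_prob` plays the role of other_config:emc-insert-inv-prob and must
    be at least 1 in this sketch; `rng` is a random.Random instance.
    """
    if inv_prob < 1:
        raise ValueError("inv_prob must be >= 1 in this sketch")
    # randrange(n) returns 0..n-1 uniformly, so 0 comes up once in n draws.
    return rng.randrange(inv_prob) == 0


def insertion_rate(inv_prob, trials, seed=0):
    """Estimate the observed insertion rate over a number of EMC misses."""
    rng = random.Random(seed)
    hits = sum(should_insert(inv_prob, rng) for _ in range(trials))
    return hits / trials
```

Because a frequently seen flow misses the EMC more often, it gets more draws and is therefore more likely to be inserted eventually, which is the effect the commit message describes.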
* vswitchd: Move config_ofproto_types call before bridge_add_port | Shashank Ram | 2017-02-15 | 1 | -2/+3
| | | | | | | | | | | | | | | | | | | | Currently, the call to config_ofproto_types() happens at the end of bridge_reconfigure(), after missing ofprotos and ports are created. However, it might be useful to make this call before adding missing ports through the dpif interface. With the current use case (dpif-netdev), this will save us a reconfiguration cycle. The call to config_ofproto_types() was introduced as a part of passing the Openvswitch other_config smap to dpif. However, if we want to do this before the ports are added, it needs to be done after ofproto_create() is called so that dpif_backer is added to all_dpif_backers list. Once the dpif_backer is added, the call to config_ofproto_types() will ensure that the set_config handler in dpif-netdev/netlink.c is called. Signed-off-by: Shashank Ram <rams@vmware.com> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
* Remove build-time generated files when "make clean" is run. | Justin Pettit | 2017-02-13 | 1 | -3/+3
| | | | | | | | | | | | "make clean" should remove all files generated by building a program, while "make distclean" should also remove files generated by configuring the program. Previously some generated files during the build process, such as man pages, were left behind when "make clean" was run. This commit only leaves configuration files after "make clean" is run, and removes all other generated files. Signed-off-by: Justin Pettit <jpettit@ovn.org> Acked-by: Ben Pfaff <blp@ovn.org>
* dpif-netdev: Pass Openvswitch other_config smap to dpif. | Daniele Di Proietto | 2017-02-03 | 1 | -1/+17
| | | | | | | | | | | | | | | | | Currently we parse the 'other_config' column in Openvswitch table in bridge.c. We extract the values (just 'pmd-cpu-mask' for now) and we pass them down to the datapath, via different layers. If we want to pass other values to dpif-netdev.c (like we recently discussed) we would have to touch ofproto.c, ofproto-dpif.c and dpif.c. This patch sends the entire other_config column to dpif-netdev, so that dpif-netdev can extract the values it's interested in. No functional change. Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ben Pfaff <blp@ovn.org>
* tunnel: Add support to configure pkt_mark | Pravin B Shelar | 2017-01-28 | 1 | -0/+6
| | | | | | | | | | | | | | | | Today packet mark action is broken for Tunnel ports with tunnel monitoring. User can write a flow to set pkt-mark for any tunnel traffic, but there is no way to set the packet mark for corresponding BFD traffic. Following patch introduces new option in OVSDB tunnel configuration so that user can set skb-mark for given tunnel endpoint. OVS would set the mark according to the skb-mark option for all tunnel traffic including packets generated by vSwitchd like tunnel monitoring BFD packet. Signed-off-by: Pravin B Shelar <pshelar@ovn.org> Acked-by: Jarno Rajahalme <jarno@ovn.org>
* ovs-fields: New manpage to document Open vSwitch and OpenFlow fields. | Ben Pfaff | 2017-01-25 | 1 | -3/+3
| | | | | | | | | There is still plenty of opportunity for improvement, but this new ovs-fields(7) manpage is much more comprehensive than ovs-ofctl(8) could be. Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Justin Pettit <jpettit@ovn.org>
* netdev-dpdk: Start also dpdkr devices only once on port-add. | Daniele Di Proietto | 2017-01-15 | 1 | -0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Since commit 55e075e65ef9("netdev-dpdk: Arbitrary 'dpdk' port naming"), we don't call rte_eth_start() from netdev_open() anymore, we only call it from netdev_reconfigure(). This commit does that also for 'dpdkr' devices, and removes some useless code. Calling rte_eth_start() also from netdev_open() was unnecessary and wasteful. Not doing it reduces code duplication and makes adding a port faster (~900ms before the patch, ~400ms after). Another reason why this is useful is that some DPDK drivers might have problems with reconfiguration. For example, until DPDK commit 8618d19b52b1("net/vmxnet3: reallocate shared memzone on re-config"), vmxnet3 didn't support being restarted with a different number of queues. Technically, the netdev interface changed because before opening rxqs or calling netdev_send() the user must check if reconfiguration is required. This patch also documents that, even though no change to the userspace datapath (the only user) is required. Lastly, this patch makes sure the errors returned by ofproto_port_add (which includes the first port reconfiguration) are reported back to the database. Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ilya Maximets <i.maximets@samsung.com>
* netdev-dpdk: Add support for virtual DPDK PMDs (vdevs) | Ciara Loftus | 2017-01-05 | 1 | -2/+7
| | | | | | | | | | | | | | | | | | | Prior to this commit, the 'dpdk' port type could only be used for physical DPDK devices. Now, virtual devices (or 'vdevs') are supported. 'vdev' devices are those which use virtual DPDK Poll Mode Drivers eg. null, pcap. To add a DPDK vdev, a valid 'dpdk-devargs' must be set for the given dpdk port. The format expected is 'eth_<driver_name><x>' where 'x' is a number between 0 and RTE_MAX_ETHPORTS -1. For example to add a port that uses the 'null' DPDK PMD driver: ovs-vsctl set Interface null0 options:dpdk-devargs=eth_null0 Not all DPDK vdevs have been verified to work at this point in time. Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Acked-by: Stephen Finucane <stephen@that.guru> # docs only Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
* netdev-dpdk: Arbitrary 'dpdk' port naming | Ciara Loftus | 2017-01-05 | 1 | -0/+8
| | | | | | | | | | | | | | | | | | | | | | | 'dpdk' ports no longer have naming restrictions. Now, instead of specifying the dpdk port ID as part of the name, the PCI address of the device must be specified via the 'dpdk-devargs' option. eg. ovs-vsctl add-port br0 my-port ovs-vsctl set Interface my-port type=dpdk options:dpdk-devargs=0000:06:00.3 The user must no longer hotplug attach DPDK ports by issuing the specific ovs-appctl netdev-dpdk/attach command. The hotplug is now automatically invoked when a valid PCI address is set in the dpdk-devargs. The format for ovs-appctl netdev-dpdk/detach command has changed in that the user now must specify the relevant PCI address as input instead of the port name. Signed-off-by: Ciara Loftus <ciara.loftus@intel.com> Signed-off-by: Kevin Traynor <ktraynor@redhat.com> Co-authored-by: Kevin Traynor <ktraynor@redhat.com> Acked-by: Stephen Finucane <stephen@that.guru> # docs only Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
* netdev-dpdk: Enable Rx checksum offloading feature on DPDK physical ports. | Sugesh Chandran | 2017-01-04 | 1 | -0/+14
| | | | | Add Rx checksum offloading feature support on DPDK physical ports. By default, Rx checksum offloading is enabled if the NIC supports it. However, checksum offloading can be turned OFF either while adding a new DPDK physical port to OVS or at runtime. Rx checksum offloading can be turned off by setting the parameter to 'false'. For example, to disable rx checksum offloading when adding a port: 'ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk options:rx-checksum-offload=false' OR (to disable at run time after the port has been added to OVS) 'ovs-vsctl set Interface dpdk0 options:rx-checksum-offload=false' Similarly, to turn ON rx checksum offloading at run time: 'ovs-vsctl set Interface dpdk0 options:rx-checksum-offload=true' Tx checksum offloading support is not implemented for the following reasons. 1) Checksum offloading and vectorization are mutually exclusive in the DPDK poll mode driver. Vector packet processing is turned OFF when checksum offloading is enabled, which causes a significant performance drop on the Tx side. 2) Normally, OVS generates the checksum for tunnel packets in software at the 'tunnel push' operation, where the tunnel headers are created. However, enabling Tx checksum offloading involves: *) Marking every packet for tx checksum offloading at 'tunnel_push' and recirculating. *) At the time of xmit, validating the same flag and instructing the NIC to do the checksum calculation. If the NIC doesn't support Tx checksum offloading, the checksum calculation has to be done in software before sending out the packets. No significant performance improvement was noticed with Tx checksum offloading, due to the overhead of additional validations plus non-vector packet processing. In some test scenarios, it introduces a performance drop too. Rx checksum offloading still offers an 8-9% improvement on VxLAN tunneling decapsulation even though the SSE vector Rx function is disabled in the DPDK poll mode driver. Signed-off-by: Sugesh Chandran <sugesh.chandran@intel.com> Acked-by: Jesse Gross <jesse@kernel.org> Acked-by: Pravin B Shelar <pshelar@ovn.org>
* vswitch.xml: Document reasonable range for MTU. | nickcooper-zhangtonghao | 2016-12-12 | 1 | -0/+6
| | | | | | | | | | | | | | According to RFC 791, every internet module must be able to forward a datagram of 68 octets without further fragmentation. This is because an internet header may be up to 60 octets, and the minimum fragment is 8 octets. The maximum size of IP packets is 65535 bytes. The range of MTU values allowed for the MTU configuration parameter is 68 to 65535. Signed-off-by: nickcooper-zhangtonghao <nic@opencloud.tech> [blp@ovn.org changed this to just a documentation patch] Signed-off-by: Ben Pfaff <blp@ovn.org>
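Illustration (not part of the commit): the arithmetic behind the documented range is 60 + 8 = 68 octets for the lower bound and 65535 bytes for the upper bound. A minimal Python sketch of that range check follows; the constant and function names are hypothetical, and the commit itself only changes documentation.

```python
MAX_IP_HEADER_OCTETS = 60   # largest internet header per RFC 791
MIN_FRAGMENT_OCTETS = 8     # smallest fragment per RFC 791

# Lower bound: a maximal header plus a minimal fragment.
MIN_MTU = MAX_IP_HEADER_OCTETS + MIN_FRAGMENT_OCTETS
# Upper bound: the maximum size of an IP packet.
MAX_MTU = 65535


def mtu_in_documented_range(mtu):
    """Return True if `mtu` lies within the documented 68..65535 range."""
    return MIN_MTU <= mtu <= MAX_MTU
```

Under this sketch, 1500 and 9000 are inside the range, while 67 and 65536 are not.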
* doc: Populate 'topics' section | Stephen Finucane | 2016-12-12 | 2 | -245/+0
| | | | | | | | | | | There are many docs that don't need to be kept at the top level, along with many more hidden in random folders. Move them all. This also allows us to add the '-W' flag to Sphinx, ensuring unindexed docs result in build failures. Signed-off-by: Stephen Finucane <stephen@that.guru> Signed-off-by: Ben Pfaff <blp@ovn.org>