summaryrefslogtreecommitdiff
path: root/lib
Commit message (Collapse)AuthorAgeFilesLines
* vlog: fix memory leak in vlog_set_log_file() functionDamijan Skvarc2019-10-021-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | memory leak happens in case previously closed log file was reopened again, for example: ovs-appctl vlog/close ovs-appctl vlog/reopen memory leak is reported by valgrind in a form: ==4463== 76 bytes in 1 blocks are definitely lost in loss record 322 of 344 ==4463== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==4463== by 0x534314: xmalloc (util.c:138) ==4463== by 0x534384: xmemdup0 (util.c:168) ==4463== by 0x53ACB9: vlog_set_log_file (vlog.c:403) ==4463== by 0x53AEDC: vlog_reopen_log_file (vlog.c:434) ==4463== by 0x53AF22: vlog_unixctl_reopen (vlog.c:683) ==4463== by 0x533730: process_command (unixctl.c:308) ==4463== by 0x533730: run_connection (unixctl.c:342) ==4463== by 0x533730: unixctl_server_run (unixctl.c:393) ==4463== by 0x4073AE: main (ovs-vswitchd.c:128) Signed-off-by: Damijan Skvarc <damjan.skvarc@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* odp-util: calc checksum of ip hdr for tunnel encapMartin Zhang2019-10-011-3/+7
| | | | | Signed-off-by: Martin Zhang <martinbj2008@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* fatal-signal: Catch SIGSEGV and print backtrace.William Tu2019-09-275-7/+114
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The patch catches the SIGSEGV signal and prints the backtrace using libunwind at the monitor daemon. This makes debugging easier when there is no debug symbol package or gdb installed on production systems. The patch works when the ovs-vswitchd compiles even without debug symbol (no -g option), because the object files still have function symbols. For example: |daemon_unix(monitor)|WARN|SIGSEGV detected, backtrace: |daemon_unix(monitor)|WARN|0x0000000000482752 <fatal_signal_handler+0x52> |daemon_unix(monitor)|WARN|0x00007fb4900734b0 <killpg+0x40> |daemon_unix(monitor)|WARN|0x00007fb49013974d <__poll+0x2d> |daemon_unix(monitor)|WARN|0x000000000052b348 <time_poll+0x108> |daemon_unix(monitor)|WARN|0x00000000005153ec <poll_block+0x8c> |daemon_unix(monitor)|WARN|0x000000000058630a <clean_thread_main+0x1aa> |daemon_unix(monitor)|WARN|0x00000000004ffd1d <ovsthread_wrapper+0x7d> |daemon_unix(monitor)|WARN|0x00007fb490b3b6ba <start_thread+0xca> |daemon_unix(monitor)|WARN|0x00007fb49014541d <clone+0x6d> |daemon_unix(monitor)|ERR|1 crashes: pid 122849 died, killed \ (Segmentation fault), core dumped, restarting However, if the object files' symbols are stripped, then we can only get init function plus offset value. This is still useful when trying to see if two bugs have the same root cause, Example: |daemon_unix(monitor)|WARN|SIGSEGV detected, backtrace: |daemon_unix(monitor)|WARN|0x0000000000482752 <_init+0x7d68a> |daemon_unix(monitor)|WARN|0x00007f5f7c8cf4b0 <killpg+0x40> |daemon_unix(monitor)|WARN|0x00007f5f7c99574d <__poll+0x2d> |daemon_unix(monitor)|WARN|0x000000000052b348 <_init+0x126280> |daemon_unix(monitor)|WARN|0x00000000005153ec <_init+0x110324> |daemon_unix(monitor)|WARN|0x0000000000407439 <_init+0x2371> |daemon_unix(monitor)|WARN|0x00007f5f7c8ba830 <__libc_start_main+0xf0> |daemon_unix(monitor)|WARN|0x0000000000408329 <_init+0x3261> |daemon_unix(monitor)|ERR|1 crashes: pid 106155 died, killed \ (Segmentation fault), core dumped, restarting Most C library functions are not async-signal-safe, meaning that it is not safe to call them from a signal handler, for example printf() or fflush(). To be async-signal-safe, the handler only collects the stack info using libunwind, which is signal-safe, and issues 'write' to the pipe, where the monitor thread reads and prints to ovs-vswitchd.log. Tested-at: https://travis-ci.org/williamtu/ovs-travis/builds/590503433 Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* ofproto-dpif-xlate: Translate timeout policy in ct actionYi-Hung Wei2019-09-265-20/+68
| | | | | | | | | | | | | | | | | | | | | | This patch derives the timeout policy based on ct zone from the internal data structure that we maintain on dpif layer. It also adds a system traffic test to verify the zone-based conntrack timeout feature. The test uses ovs-vsctl commands to configure the customized ICMP and UDP timeout on zone 5 to a shorter period. It then injects ICMP and UDP traffic to conntrack, and checks if the corresponding conntrack entry expires after the predefined timeout. Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> ofproto-dpif: Checks if datapath supports OVS_CT_ATTR_TIMEOUT This patch checks whether datapath supports OVS_CT_ATTR_TIMEOUT. With this check, ofproto-dpif-xlate can use this information to decide whether to translate the ct timeout policy. Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> Signed-off-by: Justin Pettit <jpettit@ovn.org>
* datapath: Add support for conntrack timeout policyYi-Hung Wei2019-09-262-4/+29
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch adds support for specifying a timeout policy for a connection in connection tracking system in kernel datapath. The timeout policy will be attached to a connection when the connection is committed to conntrack. This patch introduces a new odp field OVS_CT_ATTR_TIMEOUT in the ct action that specifies the timeout policy in the datapath. In the following patch, during the upcall process, the vswitchd will use the ct_zone to look up the corresponding timeout policy and fill OVS_CT_ATTR_TIMEOUT if it is available. The datapath code is from the following two net-next upstream commits. Upstream commit: commit 06bd2bdf19d2f3d22731625e1a47fa1dff5ac407 Author: Yi-Hung Wei <yihung.wei@gmail.com> Date: Tue Mar 26 11:31:14 2019 -0700 openvswitch: Add timeout support to ct action Add support for fine-grain timeout support to conntrack action. The new OVS_CT_ATTR_TIMEOUT attribute of the conntrack action specifies a timeout to be associated with this connection. If no timeout is specified, it acts as is, that is the default timeout for the connection will be automatically applied. Example usage: $ nfct timeout add timeout_1 inet tcp syn_sent 100 established 200 $ ovs-ofctl add-flow br0 in_port=1,ip,tcp,action=ct(commit,timeout=timeout_1) CC: Pravin Shelar <pshelar@ovn.org> CC: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net> commit 6d670497e01803b486aa72cc1a718401ab986896 Author: Dan Carpenter <dan.carpenter@oracle.com> Date: Tue Apr 2 09:53:14 2019 +0300 openvswitch: use after free in __ovs_ct_free_action() We free "ct_info->ct" and then use it on the next line when we pass it to nf_ct_destroy_timeout(). This patch swaps the order to avoid the use after free. Fixes: 06bd2bdf19d2 ("openvswitch: Add timeout support to ct action") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Yi-Hung Wei <yihung.wei@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> Signed-off-by: Justin Pettit <jpettit@ovn.org>
* simap: Add utility function to help compare two simaps.Ben Pfaff2019-09-262-1/+15
| | | | | Signed-off-by: Ben Pfaff <blp@ovn.org> Signed-off-by: Justin Pettit <jpettit@ovn.org>
* ct-dpif, dpif-netlink: Add conntrack timeout policy supportYi-Hung Wei2019-09-269-6/+1019
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch first defines the dpif interface for a datapath to support adding, deleting, getting and dumping conntrack timeout policy. The timeout policy is identified by a 4 bytes unsigned integer in datapath, and it currently support timeout for TCP, UDP, and ICMP protocols. Moreover, this patch provides the implementation for Linux kernel datapath in dpif-netlink. In Linux kernel, the timeout policy is maintained per L3/L4 protocol, and it is identified by 32 bytes null terminated string. On the other hand, in vswitchd, the timeout policy is a generic one that consists of all the supported L4 protocols. Therefore, one of the main task in dpif-netlink is to break down the generic timeout policy into 6 sub policies (ipv4 tcp, udp, icmp, and ipv6 tcp, udp, icmp), and push down the configuration using the netlink API in netlink-conntrack.c. This patch also adds missing symbols in the windows datapath so that the build on windows can pass. Appveyor CI: * https://ci.appveyor.com/project/YiHungWei/ovs/builds/26387754 Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> Acked-by: Alin Gabriel Serdean <aserdean@ovn.org> Signed-off-by: Justin Pettit <jpettit@ovn.org>
* ct-dpif: Export ct_dpif_format_ipproto()Yi-Hung Wei2019-09-262-2/+2
| | | | | | | | This function will be useful for following patches. Signed-off-by: Yi-Hung Wei <yihung.wei@gmail.com> Acked-by: Justin Pettit <jpettit@ovn.org> Signed-off-by: Justin Pettit <jpettit@ovn.org>
* dpdk: Remove unneeded log message copy.David Marchand2019-09-261-7/+5
| | | | | | | | | | No need to duplicate and null-terminate the passed buffer. We can directly give it to the vlog subsystem using a dynamic precision in the format string. Signed-off-by: David Marchand <david.marchand@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ian Stokes <ian.stokes@intel.com>
* flow: fix incorrect padding length checking in ipv6_sanity_checkYanqin Wei2019-09-251-1/+1
| | | | | | | | | | The padding length is (packet size - ipv6 header length - ipv6 plen). This patch fixes incorrect padding size checking in ipv6_sanity_check. Acked-by: William Tu <u9012063@gmail.com> Reviewed-by: Gavin Hu <Gavin.Hu@arm.com> Signed-off-by: Yanqin Wei <Yanqin.Wei@arm.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* treewide: Use packet batch APIsPaul Chaignon2019-09-257-17/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch replaces direct accesses to dp_packet_batch and dp_packet internal components by the appropriate API calls. It extends commit 1270b6e52 (treewide: Wider use of packet batch APIs). This patch was generated using the following semantic patch (cf. http://coccinelle.lip6.fr). // <smpl> @ dp_packet @ struct dp_packet_batch *b1; struct dp_packet_batch b2; struct dp_packet *p; expression e; @@ ( - b1->packets[b1->count++] = p; + dp_packet_batch_add(b1, p); | - b2.packets[b2.count++] = p; + dp_packet_batch_add(&b2, p); | - p->packet_type == htonl(PT_ETH) + dp_packet_is_eth(p) | - p->packet_type != htonl(PT_ETH) + !dp_packet_is_eth(p) | - b1->count == 0 + dp_packet_batch_is_empty(b1) | - !b1->count + dp_packet_batch_is_empty(b1) | b1->count = e; | b1->count++ | b2.count = e; | b2.count++ | - b1->count + dp_packet_batch_size(b1) | - b2.count + dp_packet_batch_size(&b2) ) // </smpl> Signed-off-by: Paul Chaignon <paul.chaignon@orange.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* conntrack: Add option to disable TCP sequence checking.Darrell Ball2019-09-2511-5/+164
| | | | | | | | | | | | | | | | This may be needed in some special cases, such as to support some hardware offload implementations. Note that disabling TCP sequence number verification is not an optimization in itself, but supporting some hardware offload implementations may offer better performance. TCP sequence number verification is enabled by default. This option is only available for the userspace datapath. Access to this option is presently provided via 'dpctl' commands as the need for this option is quite node specific, by virtue of which nics are in use on a given node. A test is added to verify this option. Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2019-May/359188.html Signed-off-by: Darrell Ball <dlu998@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* conntrack: Optimize recirculations.Darrell Ball2019-09-254-5/+76
| | | | | | | | | | | | Cache the 'conn' context and use it when it is valid. The cached 'conn' context will get reset if it is not expected to be valid; the cost to do this is negligible. Besides being most optimal, this also handles corner cases, such as decapsulation leading to the same tuple, as in tunnel VPN cases. A negative test is added to check the resetting of the cached 'conn'. Signed-off-by: Darrell Ball <dlu998@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* conntrack: Fix 'reverse_nat_packet()' variable datatype.Darrell Ball2019-09-241-1/+5
| | | | | | | | | | | | The datatype 'pad' in the function 'reverse_nat_packet()' was incorrectly declared as 'char' instead of 'uint8_t'. This can affect reverse natting of icmpX packets with padding > 127 bytes. At the same time, add some comments regarding 'extract_l3_ipvX' usage in this function. Found by inspection. Fixes: edd1bef468c0 ("dpdk: Add more ICMP Related NAT support.") Signed-off-by: Darrell Ball <dlu998@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* conntrack: Fix 'check_orig_tuple()' Valgrind false positive.Darrell Ball2019-09-241-4/+3
| | | | | | | | | | | | | | | | | Valgrind reported that 'pkt->md.ct_orig_tuple.ipv4.ipv4_proto' is uninitialized in 'check_orig_tuple()', if 'ct_state' is zero. Although this is true, the check is superceded, as even if it succeeds the check for natted packets based on 'ct_state' is an ORed condition and is intended to catch this case. The check is '!(pkt->md.ct_state & (CS_SRC_NAT | CS_DST_NAT))' which filters out all packets excepted natted ones. Move this check up to prevent the Valgrind complaint, which also helps performance and also remove recenlty added redundant check adding extra cycles. Fixes: f44733c527da ("conntrack: Validate accessing of conntrack data in pkt_metadata.") CC: Yifeng Sun <pkusunyifeng@gmail.com> Signed-off-by: Darrell Ball <dlu998@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* odp-util: fill IPv4 ver and head length for tnl_pushMartin Zhang2019-09-241-0/+1
| | | | | | | | | | | | | | | | | | When parse tnl_push, if IPv4 is used, we forget to fill the ipv4 version and ip header length fields. so there is a wrong ip header in the header of "struct ovs_action_push_tnl", which will caused wrong packdet sent by dpcl. test command: ovs-appctl dpctl/add-flow "in_port(1),eth_type(0x0800),ipv4(dst=9.9.9.6)" \ "tnl_push(tnl_port(2),header(size=50,type=4,eth(dst=08:00:27:2e:87:0d,src=98:03:9b:c6:d1:7c,dl_type=0x0800), \ ipv4(src=10.97.240.147,dst=10.96.74.33,proto=17,tos=0,ttl=64,frag=0x4000), \ udp(src=0,dst=4789,csum=0x0),vxlan(flags=0x8000000,vni=0x270f)),out_port(3)),4" Signed-off-by: Martin Zhang <martinbj2008@gmail.com> Signed-off-by: Dujie <dujie@didiglobal.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* stream_ssl: fix important memory leak in ssl_connect() functionDamijan Skvarc2019-09-231-4/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While checking valgrind reports after running "make check-valgrind" I have noticed reports for several tests similar to the following: .... ==5345== Memcheck, a memory error detector ==5345== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al. ==5345== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info ==5345== Command: ovsdb-client --private-key=/home/damijan.skvarc/doma/ovs/tests/testpki-privkey.pem --certificate=/home/damijan.skvarc/doma/ovs/tests/testpki-cert.pem --ca-cert=/home/damijan.skvarc/doma/ovs/tests/testpki-cacert.pem transact ssl:127.0.0.1:40111 \ \ \ ["ordinals", ==5345== \ \ \ \ \ \ {"op":\ "update", ==5345== \ \ \ \ \ \ \ "table":\ "ordinals", ==5345== \ \ \ \ \ \ \ "where":\ [["number",\ "==",\ 1]], ==5345== \ \ \ \ \ \ \ "row":\ {"number":\ 2,\ "name":\ "old\ two"}}, ==5345== \ \ \ \ \ \ {"op":\ "update", ==5345== \ \ \ \ \ \ \ "table":\ "ordinals", ==5345== \ \ \ \ \ \ \ "where":\ [["name",\ "==",\ "two"]], ==5345== \ \ \ \ \ \ \ "row":\ {"number":\ 1,\ "name":\ "old\ one"}}] ==5345== Parent PID: 5344 ==5345== ==5345== ==5345== HEAP SUMMARY: ==5345== in use at exit: 116,551 bytes in 3,341 blocks ==5345== total heap usage: 5,134 allocs, 1,793 frees, 412,290 bytes allocated ==5345== ==5345== 6,221 (184 direct, 6,037 indirect) bytes in 1 blocks are definitely lost in loss record 498 of 500 ==5345== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==5345== by 0x5105E77: CRYPTO_malloc (in /lib/x86_64-linux-gnu/libcrypto.so.1.0.0) ==5345== by 0x51E1D23: ??? (in /lib/x86_64-linux-gnu/libcrypto.so.1.0.0) ==5345== by 0x51E4861: ??? (in /lib/x86_64-linux-gnu/libcrypto.so.1.0.0) ==5345== by 0x51E5414: ASN1_item_ex_d2i (in /lib/x86_64-linux-gnu/libcrypto.so.1.0.0) ==5345== by 0x51E546A: ASN1_item_d2i (in /lib/x86_64-linux-gnu/libcrypto.so.1.0.0) ==5345== by 0x4E56B27: ??? (in /lib/x86_64-linux-gnu/libssl.so.1.0.0) ==5345== by 0x4E5BA11: ??? (in /lib/x86_64-linux-gnu/libssl.so.1.0.0) ==5345== by 0x4E65145: ??? (in /lib/x86_64-linux-gnu/libssl.so.1.0.0) ==5345== by 0x4522DF: ssl_connect (stream-ssl.c:530) ==5345== by 0x443D38: scs_connecting (stream.c:315) ==5345== by 0x443D38: stream_connect (stream.c:338) ==5345== by 0x443FA1: stream_open_block (stream.c:266) ==5345== by 0x40AB79: open_jsonrpc (ovsdb-client.c:507) ==5345== by 0x40AB79: open_rpc (ovsdb-client.c:143) ==5345== by 0x40B06B: do_transact__ (ovsdb-client.c:871) ==5345== by 0x40B245: do_transact (ovsdb-client.c:893) ==5345== by 0x405F76: main (ovsdb-client.c:282) ==5345== ==5345== LEAK SUMMARY: ==5345== definitely lost: 184 bytes in 1 blocks ==5345== indirectly lost: 6,037 bytes in 117 blocks ==5345== possibly lost: 0 bytes in 0 blocks ==5345== still reachable: 110,330 bytes in 3,223 blocks ==5345== suppressed: 0 bytes in 0 blocks ==5345== Reachable blocks (those to which a pointer was found) are not shown. ==5345== To see them, rerun with: --leak-check=full --show-leak-kinds=all ==5345== ==5345== For counts of detected and suppressed errors, rerun with: -v ==5345== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0) .... This report was extracted from "index uniqueness checking" test and complains about leaking memory in ovsdb-client application. The problem is not huge, since ovsdb-client is CLI tool which is constantly reinvoked/restarted, thus leaked memory is not accumulated. More problematic issue is that for the same test valgrind reports the similar problem also for ovsdb-server: .... ==5290== Memcheck, a memory error detector ==5290== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al. ==5290== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info ==5290== Command: ovsdb-server --log-file --detach --no-chdir --pidfile --private-key=/home/damijan.skvarc/doma/ovs/tests/testpki-privkey2.pem --certificate=/home/damijan.skvarc/doma/ovs/tests/testpki-cert2.pem --ca-cert=/home/damijan.skvarc/doma/ovs/tests/testpki-cacert.pem --remote=pssl:0:127.0.0.1 db ==5290== Parent PID: 5289 ==5290== ==5292== Warning: noted but unhandled ioctl 0x2403 with no size/direction hints. ==5292== This could cause spurious value errors to appear. ==5292== See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper. ==5292== Warning: noted but unhandled ioctl 0x2400 with no size/direction hints. ==5292== This could cause spurious value errors to appear. ==5292== See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper. ==5290== ==5290== HEAP SUMMARY: ==5290== in use at exit: 2,066 bytes in 48 blocks ==5290== total heap usage: 87 allocs, 39 frees, 14,152 bytes allocated ==5290== ==5290== LEAK SUMMARY: ==5290== definitely lost: 0 bytes in 0 blocks ==5290== indirectly lost: 0 bytes in 0 blocks ==5290== possibly lost: 0 bytes in 0 blocks ==5290== still reachable: 2,066 bytes in 48 blocks ==5290== suppressed: 0 bytes in 0 blocks ==5290== Reachable blocks (those to which a pointer was found) are not shown. ==5290== To see them, rerun with: --leak-check=full --show-leak-kinds=all ==5290== ==5290== For counts of detected and suppressed errors, rerun with: -v ==5290== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 1 from 1) ==5292== Warning: noted but unhandled ioctl 0x2401 with no size/direction hints. ==5292== This could cause spurious value errors to appear. ==5292== See README_MISSING_SYSCALL_OR_IOCTL for guidance on writing a proper wrapper. ==5292== ==5292== HEAP SUMMARY: ==5292== in use at exit: 164,018 bytes in 4,252 blocks ==5292== total heap usage: 17,910 allocs, 13,658 frees, 1,907,468 bytes allocated ==5292== ==5292== 49,720 (1,472 direct, 48,248 indirect) bytes in 8 blocks are definitely lost in loss record 580 of 580 ==5292== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==5292== by 0x5105E77: CRYPTO_malloc (in /lib/x86_64-linux-gnu/libcrypto.so.1.0.0) ==5292== by 0x51E1D23: ??? (in /lib/x86_64-linux-gnu/libcrypto.so.1.0.0) ==5292== by 0x51E4861: ??? (in /lib/x86_64-linux-gnu/libcrypto.so.1.0.0) ==5292== by 0x51E5414: ASN1_item_ex_d2i (in /lib/x86_64-linux-gnu/libcrypto.so.1.0.0) ==5292== by 0x51E546A: ASN1_item_d2i (in /lib/x86_64-linux-gnu/libcrypto.so.1.0.0) ==5292== by 0x4E53E00: ??? (in /lib/x86_64-linux-gnu/libssl.so.1.0.0) ==5292== by 0x4E55727: ??? (in /lib/x86_64-linux-gnu/libssl.so.1.0.0) ==5292== by 0x452C4B: ssl_connect (stream-ssl.c:530) ==5292== by 0x445B18: scs_connecting (stream.c:315) ==5292== by 0x445B18: stream_connect (stream.c:338) ==5292== by 0x445B91: stream_recv (stream.c:369) ==5292== by 0x432A9C: jsonrpc_recv.part.7 (jsonrpc.c:310) ==5292== by 0x433977: jsonrpc_recv (jsonrpc.c:1139) ==5292== by 0x433977: jsonrpc_session_recv (jsonrpc.c:1112) ==5292== by 0x40CCE3: ovsdb_jsonrpc_session_run (jsonrpc-server.c:553) ==5292== by 0x40CCE3: ovsdb_jsonrpc_session_run_all (jsonrpc-server.c:586) ==5292== by 0x40CCE3: ovsdb_jsonrpc_server_run (jsonrpc-server.c:401) ==5292== by 0x40682E: main_loop (ovsdb-server.c:209) ==5292== by 0x40682E: main (ovsdb-server.c:460) ==5292== ==5292== LEAK SUMMARY: ==5292== definitely lost: 1,472 bytes in 8 blocks ==5292== indirectly lost: 48,248 bytes in 936 blocks ==5292== possibly lost: 0 bytes in 0 blocks ==5292== still reachable: 114,298 bytes in 3,308 blocks ==5292== suppressed: 0 bytes in 0 blocks ==5292== Reachable blocks (those to which a pointer was found) are not shown. ==5292== To see them, rerun with: --leak-check=full --show-leak-kinds=all ==5292== ==5292== For counts of detected and suppressed errors, rerun with: -v ==5292== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 1 from 1) .... In this case ovsdb-server is running as daemon process (--detach option) and leaking memory is accumulated whenever ovsdb-client is reconnected. Within observed test ovsdb-client CLI tool connects 8 times to ovsdb-server. Leaked memory in ovsdb-client (for each invocation) is approx. 6K bytes, while leaked memory in ovsdb-server is aprox. 48Kbytes what is actually 8*6K. Thus per each connection both ovsdb-client and ovsdb-server leak approx. 6K bytes. I have done a small manual test to check if ovsdb-server is indeed accumulating leaked memory by dumping ovsdb-server in a loop: console1: ovsdb-server \ --log-file \ --detach --no-chdir --pidfile \ --private-key=testpki-privkey2.pem \ --certificate=testpki-cert2.pem \ --ca-cert=testpki-cacert.pem \ --remote=pssl:0:127.0.0.1 \ db while (true); do \ ovsdb-client \ --private-key=testpki-privkey.pem \ --certificate=testpki-cert.pem \ --ca-cert=testpki-cacert.pem \ dump ssl:127.0.0.1:42067; \ done console2: watch -n 0.5 'cat /proc/$(pidof ovsdb-server)/status | grep VmSize' In console2 it was evidently seen ovsdb-server is constantly leaking memory. After a while (i.e. after a certain number of reconnections) the OOM killer jumps out and kills ovsdb-server. Very similar situation was already noticed and described in https://github.com/openvswitch/ovs-issues/issues/168. There, the problem pops up while connecting controller to ovs-vswitchd daemon. Valgrind reports point to a problem in openssl library, however after studying openssl code for a while I have found out the problem is actually in ovs. When connection through SSL channel is taken place openssl library allocates memory for keeping track of certificate. Reference to this memory works very similar as std::shared_ptr pointer in recent C++ dialects. i.e. when allocated memory is referenced its reference counter is incremented and decremented after the memory is derefered. When reference counter becomes zero allocated memory is automatically deallocated. In openssl library environment certificate is retrieved by calling SSL_get_peer_certificate() where its reference counter is incremented. After retrieved certificate is not used any more its reference counter must be decremented by calling X509_free(). If not, allocated memory is never freed despite the ssl connection is properly closed. The problem was caused in stream-ssl.c in function ssl_connect(), which retrieves common peer name by calling SSL_get_peer_certificate() function and without calling X509_free() function afterwards. Signed-off-by: Damijan Skvarc <damjan.skvarc@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* conntrack: Validate accessing of conntrack data in pkt_metadataYifeng Sun2019-09-191-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Valgrind reported: 1305: ofproto-dpif - conntrack - ipv6 ==26942== Conditional jump or move depends on uninitialised value(s) ==26942== at 0x587C00: check_orig_tuple (conntrack.c:1006) ==26942== by 0x587C00: process_one (conntrack.c:1141) ==26942== by 0x587C00: conntrack_execute (conntrack.c:1220) ==26942== by 0x47B00F: dp_execute_cb (dpif-netdev.c:7305) ==26942== by 0x4AF756: odp_execute_actions (odp-execute.c:794) ==26942== by 0x477532: dp_netdev_execute_actions (dpif-netdev.c:7349) ==26942== by 0x477532: handle_packet_upcall (dpif-netdev.c:6630) ==26942== by 0x477532: fast_path_processing (dpif-netdev.c:6726) ==26942== by 0x47933C: dp_netdev_input__ (dpif-netdev.c:6814) ==26942== by 0x479AB8: dp_netdev_input (dpif-netdev.c:6852) ==26942== by 0x479AB8: dp_netdev_process_rxq_port (dpif-netdev.c:4287) ==26942== by 0x47A6A9: dpif_netdev_run (dpif-netdev.c:5264) ==26942== by 0x4324E7: type_run (ofproto-dpif.c:342) ==26942== by 0x41C5FE: ofproto_type_run (ofproto.c:1734) ==26942== by 0x40BAAC: bridge_run__ (bridge.c:2965) ==26942== by 0x410CF3: bridge_run (bridge.c:3029) ==26942== by 0x407614: main (ovs-vswitchd.c:127) ==26942== Uninitialised value was created by a heap allocation ==26942== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==26942== by 0x532574: xmalloc (util.c:138) ==26942== by 0x46CD62: dp_packet_new (dp-packet.c:153) ==26942== by 0x4A0431: eth_from_flow_str (netdev-dummy.c:1644) ==26942== by 0x4A0431: netdev_dummy_receive (netdev-dummy.c:1783) ==26942== by 0x531990: process_command (unixctl.c:308) ==26942== by 0x531990: run_connection (unixctl.c:342) ==26942== by 0x531990: unixctl_server_run (unixctl.c:393) ==26942== by 0x40761E: main (ovs-vswitchd.c:128) 1316: ofproto-dpif - conntrack - tcp port reuse ==24039== Conditional jump or move depends on uninitialised value(s) ==24039== at 0x587BF5: check_orig_tuple (conntrack.c:1004) ==24039== by 0x587BF5: process_one (conntrack.c:1141) ==24039== by 0x587BF5: conntrack_execute (conntrack.c:1220) ==24039== by 0x47B02F: dp_execute_cb (dpif-netdev.c:7306) ==24039== by 0x4AF7A6: odp_execute_actions (odp-execute.c:794) ==24039== by 0x47755B: dp_netdev_execute_actions (dpif-netdev.c:7350) ==24039== by 0x47755B: handle_packet_upcall (dpif-netdev.c:6631) ==24039== by 0x47755B: fast_path_processing (dpif-netdev.c:6727) ==24039== by 0x47935C: dp_netdev_input__ (dpif-netdev.c:6815) ==24039== by 0x479AD8: dp_netdev_input (dpif-netdev.c:6853) ==24039== by 0x479AD8: dp_netdev_process_rxq_port (dpif-netdev.c:4287) ==24039== by 0x47A6C9: dpif_netdev_run (dpif-netdev.c:5264) ==24039== by 0x4324F7: type_run (ofproto-dpif.c:342) ==24039== by 0x41C5FE: ofproto_type_run (ofproto.c:1734) ==24039== by 0x40BAAC: bridge_run__ (bridge.c:2965) ==24039== by 0x410CF3: bridge_run (bridge.c:3029) ==24039== by 0x407614: main (ovs-vswitchd.c:127) ==24039== Uninitialised value was created by a heap allocation ==24039== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==24039== by 0x5325C4: xmalloc (util.c:138) ==24039== by 0x46D144: dp_packet_new (dp-packet.c:153) ==24039== by 0x46D144: dp_packet_new_with_headroom (dp-packet.c:163) ==24039== by 0x51191E: eth_from_hex (packets.c:498) ==24039== by 0x4A03B9: eth_from_packet (netdev-dummy.c:1609) ==24039== by 0x4A03B9: netdev_dummy_receive (netdev-dummy.c:1765) ==24039== by 0x5319E0: process_command (unixctl.c:308) ==24039== by 0x5319E0: run_connection (unixctl.c:342) ==24039== by 0x5319E0: unixctl_server_run (unixctl.c:393) ==24039== by 0x40761E: main (ovs-vswitchd.c:128) According to comments in pkt_metadata_init(), conntrack data is valid only if pkt_metadata.ct_state != 0. This patch prevents check_orig_tuple() get called when conntrack data is uninitialized. Acked-by: William Tu <u9012063@gmail.com> Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* db-ctl-base: Free leaked ovsdb_datumYifeng Sun2019-09-191-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Valgrind reported: 2491: database commands -- negative checks ==19245== 36 (32 direct, 4 indirect) bytes in 1 blocks are definitely lost in loss record 36 of 53 ==19245== at 0x4C2FD5F: realloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==19245== by 0x431AB4: xrealloc (util.c:149) ==19245== by 0x41656D: ovsdb_datum_reallocate (ovsdb-data.c:1883) ==19245== by 0x41656D: ovsdb_datum_union (ovsdb-data.c:1961) ==19245== by 0x4107B2: cmd_add (db-ctl-base.c:1494) ==19245== by 0x406E2E: do_vsctl (ovs-vsctl.c:2626) ==19245== by 0x406E2E: main (ovs-vsctl.c:183) ==19252== 16 bytes in 1 blocks are definitely lost in loss record 9 of 52 ==19252== at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==19252== by 0x430F74: xmalloc (util.c:138) ==19252== by 0x414D07: clone_atoms (ovsdb-data.c:990) ==19252== by 0x4153F6: ovsdb_datum_clone (ovsdb-data.c:1012) ==19252== by 0x4104D3: cmd_remove (db-ctl-base.c:1564) ==19252== by 0x406E2E: do_vsctl (ovs-vsctl.c:2626) ==19252== by 0x406E2E: main (ovs-vsctl.c:183) This patch fixes them. Acked-by: William Tu <u9012063@gmail.com> Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* dns-resolve: Free 'struct ub_result' when callback returns error resultsYifeng Sun2019-09-191-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Valgrind reported: 1074: ofproto - flush flows, groups, and meters for controller change ==5499== 695 (288 direct, 407 indirect) bytes in 3 blocks are definitely lost in loss record 344 of 355 ==5499== at 0x4C2FB55: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so) ==5499== by 0x5E7F145: ??? (in /usr/lib/x86_64-linux-gnu/libunbound.so.2.4.0) ==5499== by 0x5E6EBDE: ub_resolve_async (in /usr/lib/x86_64-linux-gnu/libunbound.so.2.4.0) ==5499== by 0x55C739: resolve_async__.part.5 (dns-resolve.c:233) ==5499== by 0x55C85C: resolve_async__ (dns-resolve.c:261) ==5499== by 0x55C85C: resolve_callback__ (dns-resolve.c:262) ==5499== by 0x5E6FEF1: ub_process (in /usr/lib/x86_64-linux-gnu/libunbound.so.2.4.0) ==5499== by 0x55CAF3: dns_resolve (dns-resolve.c:153) ==5499== by 0x523864: parse_sockaddr_components_dns (socket-util.c:438) ==5499== by 0x523864: parse_sockaddr_components (socket-util.c:504) ==5499== by 0x524468: inet_parse_active (socket-util.c:541) ==5499== by 0x524564: inet_open_active (socket-util.c:579) ==5499== by 0x5959F9: tcp_open (stream-tcp.c:56) ==5499== by 0x529192: stream_open (stream.c:228) ==5499== by 0x529910: stream_open_with_default_port (stream.c:724) ==5499== by 0x595FAE: vconn_stream_open (vconn-stream.c:81) ==5499== by 0x535C9B: vconn_open (vconn.c:250) ==5499== by 0x517C59: reconnect (rconn.c:467) ==5499== by 0x5184C7: run_BACKOFF (rconn.c:492) ==5499== by 0x5184C7: rconn_run (rconn.c:660) ==5499== by 0x457FE8: ofservice_run (connmgr.c:1992) ==5499== by 0x457FE8: connmgr_run (connmgr.c:367) ==5499== by 0x41E0F5: ofproto_run (ofproto.c:1845) ==5499== by 0x40BA63: bridge_run__ (bridge.c:2971) In ub_resolve_async's callback function, 'struct ub_result' should be finally freed even if there is a resolving error. This patch fixes it. Acked-by: William Tu <u9012063@gmail.com> Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* dpif-netdev: Handle uninitialized value error for 'match.wc'Yifeng Sun2019-09-191-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Valgrind reported that match.wc was not initialized, as below: 1176: ofproto-dpif - fragment handling - actions ==21214== Conditional jump or move depends on uninitialised value(s) ==21214== at 0x4B77C1: odp_flow_key_from_flow__ (odp-util.c:6143) ==21214== by 0x46DB58: dp_netdev_upcall (dpif-netdev.c:6239) ==21214== by 0x4774A7: handle_packet_upcall (dpif-netdev.c:6608) ==21214== by 0x4774A7: fast_path_processing (dpif-netdev.c:6726) ==21214== by 0x47933C: dp_netdev_input__ (dpif-netdev.c:6814) ==21214== by 0x479AB8: dp_netdev_input (dpif-netdev.c:6852) ==21214== by 0x479AB8: dp_netdev_process_rxq_port (dpif-netdev.c:4287) ==21214== by 0x47A6A9: dpif_netdev_run (dpif-netdev.c:5264) ==21214== by 0x4324E7: type_run (ofproto-dpif.c:342) ==21214== by 0x41C5FE: ofproto_type_run (ofproto.c:1734) ==21214== by 0x40BAAC: bridge_run__ (bridge.c:2965) ==21214== by 0x410CF3: bridge_run (bridge.c:3029) ==21214== by 0x407614: main (ovs-vswitchd.c:127) ==21214== Uninitialised value was created by a stack allocation ==21214== at 0x4769C3: fast_path_processing (dpif-netdev.c:6672) 'match' is allocated on stack but its 'wc' is accessed in odp_flow_key_from_flow__ without proper initialization. This patch fixes it. Acked-by: William Tu <u9012063@gmail.com> Signed-off-by: Yifeng Sun <pkusunyifeng@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* db-ctl-base: Give better error messages for ambiguous abbreviations.Ben Pfaff2019-09-181-21/+37
| | | | | | | | | Tables and columns may be abbreviated to unique prefixes, but until now the error messages have just said there's more than one match. This commit makes the error messages list the possibilities. Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* sset: New function sset_join().Ben Pfaff2019-09-182-0/+34
| | | | | | | | | This will acquire its first user in an upcoming commit. This function follows the pattern set by svec_join(). Reviewed-by: Yifeng Sun <pkusunyifeng@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* Remove OVN.Mark Michelson2019-09-061-6/+6
| | | | | | | | | | | | | | | | OVN is separated into its own repo. This commit removes the OVN source, OVN tests, and OVN documentation. It also removes mentions of OVN from most documentation. The only place where OVN has been left is in changelogs/NEWS, since we shouldn't mess with the history of the project. There is an exception here. The ovsdb-cluster tests rely on ovn-nbctl and ovn-sbctl to run. Therefore those ovn utilities, as well as their dependencies remain in the repo with this commit. Acked-by: Numan Siddique <nusiddiq@redhat.com> Signed-off-by: Mark Michelson <mmichels@redhat.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* dpif-netdev: Add core id in the PMD thread name.Ilya Maximets2019-09-061-1/+8
| | | | | | | | | | | | | | | | | | This is highly useful to see on which core PMD is running by only looking at the thread name. Thread Id still allows to distinguish different threads running on the same core over the time: |dpif_netdev(pmd-c10/id:53)|DBG|Creating 2. subtable <...> |dpif_netdev(pmd-c10/id:53)|DBG|flow_add: <...>, actions:2 |dpif_netdev(pmd-c09/id:70)|DBG|Core 9 processing port <..> In gdb, top or any other utility it's useful to quickly catch up needed thread without parsing logs, memory or matching threads by port names they're handling. Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Reviewed-by: David Marchand <david.marchand@redhat.com> Acked-by: Eelco Chaudron <echaudro@redhat.com>
* dpdk: Use ovs-numa provided functions to manage thread affinity.Ilya Maximets2019-09-061-15/+12
| | | | | | | | | | This allows to decrease code duplication and avoid using Linux-specific functions (this might be useful in the future if we'll try to allow running OvS+DPDK on FreeBSD). Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Reviewed-by: David Marchand <david.marchand@redhat.com> Acked-by: William Tu <u9012063@gmail.com>
* dpif-netdev-perf: Fix TSC frequency for non-DPDK case.Ilya Maximets2019-09-066-18/+81
| | | | | | | | | | | | | | | | | | | | | | Unlike 'rte_get_tsc_cycles()' which doesn't need any specific initialization, 'rte_get_tsc_hz()' could be used only after successfull call to 'rte_eal_init()'. 'rte_eal_init()' estimates the TSC frequency for later use by 'rte_get_tsc_hz()'. Fairly said, we're not allowed to use 'rte_get_tsc_cycles()' before initializing DPDK too, but it works this way for now and provides correct results. This patch provides TSC frequency estimation code that will be used in two cases: * DPDK is not compiled in, i.e. DPDK_NETDEV not defined. * DPDK compiled in but not initialized, i.e. other_config:dpdk-init=false This change is mostly useful for AF_XDP netdev support, i.e. allows to use dpif-netdev/pmd-perf-show command and various PMD perf metrics. Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Reviewed-by: David Marchand <david.marchand@redhat.com> Acked-by: William Tu <u9012063@gmail.com>
* ovs-numa: Add dump based thread affinity functions.Ilya Maximets2019-09-062-5/+69
| | | | | | | | | New functions to get and set CPU affinity using CPU dumps. This will abstract OS specific implementation details from the cross-platform code. Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Reviewed-by: David Marchand <david.marchand@redhat.com>
* packets: Fix using outdated RSS hash after MPLS decapsulation.Nitin Katiyar2019-08-291-0/+4
| | | | | | | | | | | | | | | | | | | When a packet is received, the RSS hash is calculated if it is not already available. The Exact Match Cache (EMC) entry is then looked up using this RSS hash. When a MPLS encapsulated packet is received, the MPLS header is popped and the packet is recirculated. Since the RSS hash has not been invalidated here, the EMC lookup for all decapsulated packets will hit the same entry even though these packets will have different tuple values. This degrades performance severely as different inner packets from the same MPLS tunnel would hit the same EMC entry. This patch invalidates RSS hash (by resetting offload flags) after MPLS header is popped. Signed-off-by: Nitin Katiyar <nitin.katiyar@ericsson.com> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
* dpif-netdev: Fail port addition if reconfiguration failed.Ilya Maximets2019-08-291-1/+2
| | | | | | | | | | | | | | | | If the port was destroyed during the initial reconfiguration, we should report an error to the upper layers. Otherwise, successful addition of the port will be logged and upper layers will continue to configure this port. For example, the 'dpif' layer will try to initilaize flow API for this device. Fix that by checking for port existence after reconfiguration. We can't get the real error code here, so let's assume EINVAL. 'ovs-vsctl' will tell the user to check the logs for a real reason anyway. Fixes: e32971b8ddb4 ("dpif-netdev: Centralized threads and queues handling code.") Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Acked-by: Ian Stokes <ian.stokes@intel.com>
* conntrack: Fix ICMPv4 error data L4 length check.Darrell Ball2019-08-292-17/+27
| | | | | | | | | | | | | | | | | | | | The ICMPv4 error data L4 length check was found to be too strict for TCP, expecting a minimum of 20 rather than 8 bytes. This worked by hapenstance for other inner protocols. The approach is to explicitly handle the ICMPv4 error data L4 length check and to do this for all supported inner protocols in the same way. Making the code common between protocols also allows the existing ICMPv4 related UDP tests to cover TCP and ICMP inner protocol cases. Note that ICMPv6 does not have an 8 byte limit for error L4 data. Fixes: a489b16854b5 ("conntrack: New userspace connection tracker.") CC: Daniele Di Proietto <diproiettod@ovn.org> Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2019-August/361949.html Reported-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com> Signed-off-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com> Co-authored-by: Vishal Deep Ajmera <vishal.deep.ajmera@ericsson.com> Signed-off-by: Darrell Ball <dlu998@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* flow: save "vlan_hdrs" memset for untagged trafficYanqin Wei2019-08-281-1/+1
| | | | | | | | | | | For untagged traffic, it is unnecessary to clear vlan_hdrs as it costs 32B memset. So the patch improves it by postponing to clear vlan_hdrs until ethtype check. It can benefit both untagged and single-tagged traffic. From testing, it does not impact performance of dual-tagged traffic. Reviewed-by: Gavin Hu <Gavin.Hu@arm.com> Signed-off-by: Yanqin Wei <Yanqin.Wei@arm.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* flow: Reduce metadata connection state branches in miniflow_extractMalvika Gupta2019-08-281-7/+4
| | | | | | | | | | | This patch merges two separate if-else branches for metadata connection state into one if-else branch to improve performance. It gives an average performance improvement of ~3% on arm platforms and ~4.5% on x86 platforms. Signed-off-by: Malvika Gupta <malvika.gupta@arm.com> Reviewed-by: Yanqin Wei <yanqin.wei@arm.com> Reviewed-by: Gavin Hu <gavin.hu@arm.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* netdev-dpdk: Refactor vhost custom stats for extensibility.Ilya Maximets2019-08-261-10/+18
| | | | | | | | | | | | vHost interfaces currently has only one custom statistic, but there might be others in the near future. This refactoring makes the code work in the same way as it done for dpdk and afxdp stats to keep the common style over the different code places and makes it easily extensible for the new stats addition. Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Reviewed-by: David Marchand <david.marchand@redhat.com> Acked-by: Kevin Traynor <ktraynor@redhat.com>
* netdev-dpdk: Fix not reporting rx_oversize_errors in stats.Ilya Maximets2019-08-261-70/+32
| | | | | | | | | | | | | | | There is a big code duplication issue with DPDK xstats that led to missed "rx_oversize_errors" statistics. It's defined but not used. Fix that by actually using this stat along with code refactoring that will allow us to not make same mistakes in the future. Macro definitions are perfectly suitable to automate code generation in such cases and already used in a couple of places in OVS for similar purposes. Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Reviewed-by: David Marchand <david.marchand@redhat.com> Acked-by: Kevin Traynor <ktraynor@redhat.com> Acked-by: Ian Stokes <ian.stokes@intel.com>
* ovsdb-idl.c: Allows retry even when using a single remote.Han Zhou2019-08-211-8/+3
| | | | | | | | | | | | When clustered mode is used, the client needs to retry connecting to new servers when certain failures happen. Today it is allowed to retry new connection only if multiple remotes are used, which prevents using LB VIP with clustered nodes. This patch makes sure the retry logic works when using LB VIP: although same IP is used for retrying, the LB can actually redirect the connection to a new node. Signed-off-by: Han Zhou <hzhou8@ebay.com> Signed-off-by: Ben Pfaff <blp@ovn.org>
* sat-math: Do not use __builtin_s*_overflow() with sparse.Ben Pfaff2019-08-211-3/+3
| | | | | | | | | | | Some versions of sparse do not understand __builtin_saddll_overflow() and related GCC builtins for calculations with overflow detection. This patch avoids using them when sparse is in use. Reported-by: Justin Pettit <jpettit@ovn.org> Tested-by: Justin Pettit <jpettit@ovn.org> Acked-by: Justin Pettit <jpettit@ovn.org> Signed-off-by: Ben Pfaff <blp@ovn.org>
* Switch from internal port to all ports definedAlin Gabriel Serdean2019-08-131-1/+1
| | | | | | | | | | | | This patch changes the way we try to figure out if a port is defined on a given switch. Instead of looking only in the internal ports defined switch to all ports defined. This caused issues when trying to add a Hyper-V container port to a given OVS bridge. Reported-by: Danting Liu <dantingl@vmware.com> Signed-off-by: Alin Gabriel Serdean <aserdean@ovn.org> Acked-by: Anand Kumar <kumaranand@vmware.com>
* netdev-afxdp: fix corner case where umem entries were not releasedEelco Chaudron2019-08-081-6/+6
| | | | | | | | | If for some reason the last element in the batch was already pushed on the stack, none of the elements where pushed. This was leading to buffer starvation. Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
* ovs-tc: offload MPLS set actions to TC datapathJohn Hurley2019-08-013-0/+86
| | | | | | | | | | | | | | Recent modifications to TC allows the modifying of fields within the outermost MPLS header of a packet. OvS datapath rules impliment an MPLS set action by supplying a new MPLS header that should overwrite the current one. Convert the OvS datapath MPLS set action to a TC modify action and allow such rules to be offloaded to a TC datapath. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>
* ovs-tc: offload MPLS push actions to TC datapathJohn Hurley2019-08-014-0/+115
| | | | | | | | | | | | | | | | TC can now be used to push an MPLS header onto a packet. The MPLS label is the only information that needs to be passed here with the rest reverting to default values if none are supplied. OvS, however, gives the entire MPLS header to be pushed along with the MPLS protocol to use. TC can optionally accept these values so can be made replicate the OvS datapath rule. Convert OvS MPLS push datapath rules to TC format and offload to a TC datapath. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>
* ovs-tc: offload MPLS pop actions to TC datapathJohn Hurley2019-08-013-0/+83
| | | | | | | | | | | | TC now supports an action to pop the outer MPLS header from a packet. The next protocol after the header is required alongside this. Currently, OvS datapath rules also supply this information. Offload OvS MPLS pop actions to TC along with the next protocol. Signed-off-by: John Hurley <john.hurley@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com>
* dpif-netlink: Allow offloading of flows with dl_type 0x1234.Ilya Maximets2019-07-311-5/+0
| | | | | | | | | | | | | | | 'dpif_probe_feature()' always has DPIF_FP_PROBE flag set. Other probing code uses dpif_execute() with DPIF_OP_EXECUTE, hence never calls parse_flow_put(). Thus, this 'if' statement is wrong and should be removed as it only forbids offloading of the real legitimate flows with dl_type 0x1234. Dummy flows never reach this code. CC: Paul Blakey <paulb@mellanox.com> Fixes: 8b668ee3f0cc ("dpif-netlink: Use netdev flow put api to insert a flow") Reported-by: Eli Britstein <elibr@mellanox.com> Acked-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
* netdev-afxdp: Error when no XDP program loaded.William Tu2019-07-291-2/+6
| | | | | | | | netdev-afxdp requires XDP program to be loaded. When prog_id == 0, it indicates no XDP program, so return error and free resources. Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
* ovsdb-data: Don't put strings with digits in quotes.Ilya Maximets2019-07-251-1/+6
| | | | | | | | | | No need to use quotes for strings like "br0". Keeping UUIDs always in quotes to avoid different treatment of those that starts with digits and those that starts with letters. Signed-off-by: Ilya Maximets <i.maximets@samsung.com> Acked-by: Ben Pfaff <blp@ovn.org>
* netdev-afxdp: Convert AFXDP_DEBUG to custom stats.Ilya Maximets2019-07-243-19/+51
| | | | | | | | | | | These are valid statistics of a network interface and should be exposed via custom stats. The same MACRO trick as in vswitchd/bridge.c is used to reduce code duplication and easily add new stats if necessary in the future. Acked-by: William Tu <u9012063@gmail.com> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
* netdev-afxdp: Fix use of unconfigured device.Ilya Maximets2019-07-234-3/+33
| | | | | | | | | | | | | | | | | | | | | | | In case of failure of 'xsk_configure_all()', 'n_rxq' and 'xdpmode' will remain in a new state. This will result in successful reconfiguration (immediate return, because configuration is already applied) if 'netdev_reconfigure()' will be called again. Same issue was fixed previously for netdev-dpdk using 'dev->started' flag in commit: 606f66507250 ("netdev-dpdk: Don't use PMD driver if not configured successfully") Let's use similar approach with checking the 'dev->xsks' which only exists if configuration was successful. Additionally implemented 'netdev_afxdp_construct()' function to explicitly initialize all the specific fields and request the reconfiguration. CC: William Tu <u9012063@gmail.com> Fixes: 0de1b425962d ("netdev-afxdp: add new netdev type for AF_XDP.") Acked-by: William Tu <u9012063@gmail.com> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
* tnl-neigh-cache: Purge learnt neighbors when port/bridge is deletedVasu Dasari2019-07-222-0/+21
| | | | | | | | | | | | | | | | | Say an ARP entry is learnt on a OVS port and when such a port is deleted, learnt entry should be removed from the port. It would have be aged out after ARP ageout time. This code will clean up immediately. Added test case(tunnel - neighbor entry add and deletion) in tunnel.at, to verify neighbors are added and removed on deletion of a ports and bridges. Discussion for this addition is at: https://mail.openvswitch.org/pipermail/ovs-discuss/2019-June/048754.html Signed-off-by: Vasu Dasari <vdasari@gmail.com> Reviewed-by: Flavio Fernandes <flavio@flaviof.com> Reviewed-by: Ben Pfaff <blp@ovn.org> Signed-off-by: Ben Pfaff <blp@ovn.org>
* netdev-afxdp: add new netdev type for AF_XDP.William Tu2019-07-1914-109/+1667
| | | | | | | | | | | | | | | | The patch introduces experimental AF_XDP support for OVS netdev. AF_XDP, the Address Family of the eXpress Data Path, is a new Linux socket type built upon the eBPF and XDP technology. It is aims to have comparable performance to DPDK but cooperate better with existing kernel's networking stack. An AF_XDP socket receives and sends packets from an eBPF/XDP program attached to the netdev, by-passing a couple of Linux kernel's subsystems As a result, AF_XDP socket shows much better performance than AF_PACKET For more details about AF_XDP, please see linux kernel's Documentation/networking/af_xdp.rst. Note that by default, this feature is not compiled in. Signed-off-by: William Tu <u9012063@gmail.com> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
* ovs-thread: Add pthread spin lock support.William Tu2019-07-191-0/+31
| | | | | | | | | | The patch adds the basic spin lock functions: ovs_spin_{lock, try_lock, unlock, init, destroy}. Signed-off-by: William Tu <u9012063@gmail.com> Acked-by: Ilya Maximets <i.maximets@samsung.com> Acked-by: Ben Pfaff <blp@ovn.org> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>