summaryrefslogtreecommitdiff
path: root/Documentation/topics
diff options
context:
space:
mode:
authorEelco Chaudron <echaudro@redhat.com>2021-12-22 10:17:44 +0100
committerIlya Maximets <i.maximets@ovn.org>2022-01-18 00:46:30 +0100
commitff4c712d45d7b696039061b1821a39dcd72d88fb (patch)
treeb369343af4eafda3238c7216cbda0bc1af151013 /Documentation/topics
parent512fab8f21ebd336a0be983b7ba5a62114592c03 (diff)
downloadopenvswitch-ff4c712d45d7b696039061b1821a39dcd72d88fb.tar.gz
Documentation: Add USDT documentation and bpftrace example.
Add the USDT documentation and a bpftrace example using the bridge run USDT probes. Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Acked-by: Paolo Valerio <pvalerio@redhat.com> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Diffstat (limited to 'Documentation/topics')
-rw-r--r--Documentation/topics/index.rst1
-rw-r--r--Documentation/topics/usdt-probes.rst269
2 files changed, 270 insertions, 0 deletions
diff --git a/Documentation/topics/index.rst b/Documentation/topics/index.rst
index 3699fd5c4..90d4c66e6 100644
--- a/Documentation/topics/index.rst
+++ b/Documentation/topics/index.rst
@@ -56,3 +56,4 @@ OVS
idl-compound-indexes
ovs-extensions
userspace-tx-steering
+ usdt-probes
diff --git a/Documentation/topics/usdt-probes.rst b/Documentation/topics/usdt-probes.rst
new file mode 100644
index 000000000..8aa596b80
--- /dev/null
+++ b/Documentation/topics/usdt-probes.rst
@@ -0,0 +1,269 @@
+..
+ Licensed under the Apache License, Version 2.0 (the "License"); you may
+ not use this file except in compliance with the License. You may obtain
+ a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ License for the specific language governing permissions and limitations
+ under the License.
+
+ Convention for heading levels in Open vSwitch documentation:
+
+ ======= Heading 0 (reserved for the title in a document)
+ ------- Heading 1
+ ~~~~~~~ Heading 2
+ +++++++ Heading 3
+ ''''''' Heading 4
+
+ Avoid deeper levels because they do not render well.
+
+=============================================
+User Statically-Defined Tracing (USDT) probes
+=============================================
+
+Sometimes it's desired to troubleshoot one of OVS's components in the field.
+One of the techniques used is to add dynamic tracepoints, for example using
+perf_. However, the desired dynamic tracepoint and/or the desired variable,
+might not be available due to compiler optimizations.
+
+In this case, a well-thought-off, static tracepoint could be permanently added,
+so it's always available. For OVS we use the DTrace probe macro's, which have
+little to no overhead when disabled. Various tools exist to enable them. See
+some examples below.
+
+
+Compiling with USDT probes enabled
+----------------------------------
+
+Since USDT probes are compiled out by default, a compile-time option is
+available to include them. To add the probes to the generated code, use the
+following configure option ::
+
+ $ ./configure --enable-usdt-probes
+
+The following line should be seen in the configure output when the above option
+is used ::
+
+ checking whether USDT probes are enabled... yes
+
+
+Listing available probes
+------------------------
+
+There are various ways to display USDT probes available in a userspace
+application. Here we show three examples. All assuming ovs-vswitchd is in the
+search path with USDT probes enabled:
+
+You can use the **perf** tool as follows ::
+
+ $ perf buildid-cache --add $(which ovs-vswitchd)
+ $ perf list | grep sdt_
+ sdt_main:poll_block [SDT event]
+ sdt_main:run_start [SDT event]
+
+You can use the bpftrace_ tool ::
+
+ # bpftrace -l "usdt:$(which ovs-vswitchd):*"
+ usdt:/usr/sbin/ovs-vswitchd:main:poll_block
+ usdt:/usr/sbin/ovs-vswitchd:main:run_start
+
+.. note::
+
+ If you execute this on a running process,
+ ``bpftrace -lp $(pidof ovs-vswitchd) "usdt:*"`` , it will list all USDT
+ events, i.e., also the ones available in the used shared libraries.
+
+Finally, you can use the **tplist** tool which is part of the bcc_ framework ::
+
+ $ /usr/share/bcc/tools/tplist -vv -l $(which ovs-vswitchd)
+ b'main':b'poll_block' [sema 0x0]
+ location #1 b'/usr/sbin/ovs-vswitchd' 0x407fdc
+ b'main':b'run_start' [sema 0x0]
+ location #1 b'/usr/sbin/ovs-vswitchd' 0x407ff6
+
+
+Using probes
+------------
+
+We will use the OVS sandbox environment in combination with the probes shown
+above to try out some of the available trace tools. To start up the virtual
+environment use the ``make sandbox`` command. In addition we have to create
+a bridge to kick of the main bridge loop ::
+
+ $ ovs-vsctl add-br test_bridge
+ $ ovs-vsctl show
+ 055acdca-2f0c-4f6e-b542-f4b6d2c44e08
+ Bridge test_bridge
+ Port test_bridge
+ Interface test_bridge
+ type: internal
+
+perf
+~~~~
+
+Perf is using Linux uprobe based event tracing to for capturing the events.
+To enable the main:\* probes as displayed above and take an actual trace, you
+need to execute the following sequence of perf commands ::
+
+ # perf buildid-cache --add $(which ovs-vswitchd)
+
+ # perf list | grep sdt_
+ sdt_main:poll_block [SDT event]
+ sdt_main:run_start [SDT event]
+
+ # perf probe --add=sdt_main:poll_block --add=sdt_main:run_start
+ Added new events:
+ sdt_main:poll_block (on %poll_block in /usr/sbin/ovs-vswitchd)
+ sdt_main:run_start (on %run_start in /usr/sbin/ovs-vswitchd)
+
+ You can now use it in all perf tools, such as:
+
+ perf record -e sdt_main:run_start -aR sleep 1
+
+ # perf record -e sdt_main:run_start -e sdt_main:poll_block \
+ -p $(pidof ovs-vswitchd) sleep 30
+ [ perf record: Woken up 1 times to write data ]
+ [ perf record: Captured and wrote 0.039 MB perf.data (132 samples) ]
+
+ # perf script
+ ovs-vswitchd 8576 [011] 21031.340433: sdt_main:run_start: (407ff6)
+ ovs-vswitchd 8576 [011] 21031.340516: sdt_main:poll_block: (407fdc)
+ ovs-vswitchd 8576 [011] 21031.841726: sdt_main:run_start: (407ff6)
+ ovs-vswitchd 8576 [011] 21031.842088: sdt_main:poll_block: (407fdc)
+ ...
+
+Note that the above examples works with the sandbox environment, so make sure
+you execute the above command while in the sandbox shell!
+
+There are a lot more options available with perf, for example, the
+``--call-graph dwarf`` option, which would give you a call graph in the
+``perf script`` output. See the perf documentation for more information.
+
+One other interesting feature is that the perf data can be converted for use
+by the trace visualizer `Trace Compass`_. This can be done using the
+``--all --to-ctf`` option to the ``perf data convert`` tool.
+
+
+bpftrace
+~~~~~~~~
+
+bpftrace is a high-level tracing language based on eBPF, which can be used to
+script USDT probes. Here we will show a simple one-liner to display the
+USDT probes being hit. However, the script section below reference some more
+advanced bpftrace scripts.
+
+This is a simple bpftrace one-liner to show all ``main:*`` USDT probes ::
+
+ # bpftrace -p $(pidof ovs-vswitchd) -e \
+ 'usdt::main:* { printf("%s %u [%u] %u %s\n",
+ comm, pid, cpu, elapsed, probe); }'
+ Attaching 2 probes...
+ ovs-vswitchd 8576 [11] 203833199 usdt:main:run_start
+ ovs-vswitchd 8576 [11] 204086854 usdt:main:poll_block
+ ovs-vswitchd 8576 [11] 221611985 usdt:main:run_start
+ ovs-vswitchd 8576 [11] 221892019 usdt:main:poll_block
+
+
+bcc
+~~~
+
+The BPF Compiler Collection (BCC) is a set of tools and scripts that also use
+eBPF for tracing. The example below uses the ``trace`` tool to show the events
+while they are being generated ::
+
+ # /usr/share/bcc/tools/trace -T -p $(pidof ovs-vswitchd) \
+ 'u::main:run_start' 'u::main:poll_block'
+ TIME PID TID COMM FUNC
+ 15:49:06 8576 8576 ovs-vswitchd main:run_start
+ 15:49:06 8576 8576 ovs-vswitchd main:poll_block
+ 15:49:06 8576 8576 ovs-vswitchd main:run_start
+ 15:49:06 8576 8576 ovs-vswitchd main:poll_block
+ ^C
+
+
+Scripts
+-------
+To not have to re-invent the wheel when trying to debug complex OVS issues, a
+set of scripts are provided in the source repository. They are located in the
+``utilities/usdt-scripts/`` directory, and each script contains detailed
+information on how they should be used, and what information they provide.
+
+
+Available probes
+----------------
+The next sections describes all the available probes, their use case, and if
+used in any script, which one. Any new probes being added to OVS should get
+their own section. See the below "Adding your own probes" section for the
+used naming convention.
+
+Available probes in ``ovs_vswitchd``:
+
+- main:poll_block
+- main:run_start
+
+
+probe main:run_start
+~~~~~~~~~~~~~~~~~~~~
+
+**Description**:
+The ovs-vswitchd's main process contains a loop that runs every time some work
+needs to be done. This probe gets triggered every time the loop starts from the
+beginning. See also the ``main:poll_block`` probe below.
+
+**Arguments**:
+
+*None*
+
+**Script references**:
+
+- ``utilities/usdt-scripts/bridge_loop.bt``
+
+
+probe main:poll_block
+~~~~~~~~~~~~~~~~~~~~~
+
+**Description**:
+The ovs-vswitchd's main process contains a loop that runs every time some work
+needs to be done. This probe gets triggered every time the loop is done, and
+it's about to wait for being re-started by a poll_block() call returning.
+See also the ``main:run_start`` probe above.
+
+**Arguments**:
+
+*None*
+
+**Script references**:
+
+- ``utilities/usdt-scripts/bridge_loop.bt``
+
+
+Adding your own probes
+----------------------
+
+Adding your own probes is as simple as adding the ``OVS_USDT_PROBE()`` macro
+to the code. It's similar to the ``DTRACE_PROBExx`` macro's with the difference
+that it does automatically determine the number of optional arguments.
+
+The macro requires at least two arguments. The first one being the *provider*,
+and the second one being the *name*. To keep some consistency with the probe
+naming, please use the following convention. The *provider* should be the
+function name, and the *name* should be the name of the tracepoint. If you do
+function entry and exit like probes, please use ``entry`` and ``exit``.
+
+If, for some reason, you do not like to use the function name as a *provider*,
+please prefix it with ``__``, so we know it's not a function name.
+
+The remaining parameters, up to 10, can be variables, pointers, etc., that
+might be of interest to capture at this point in the code. Note that the
+provided variables can cause the compiler to be less effective in optimizing.
+
+
+
+.. _perf : https://developers.redhat.com/blog/2020/05/29/debugging-vhost-user-tx-contention-in-open-vswitch#
+.. _bpftrace : https://github.com/iovisor/bpftrace
+.. _bcc : https://github.com/iovisor/bcc
+.. _Trace Compass : https://www.eclipse.org/tracecompass/