| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ovs-vswitchd makes extensive use of RCU to defer freeing memory past the
latest time that it could be in use by a thread. Until now, ovs-vswitchd
has not waited for RCU callbacks to fire before exiting. This meant that
in many cases, when ovs-vswitchd exits, many blocks of memory are stuck in
RCU callback queues, which valgrind often reports as "possible" memory
leaks.
This commit adds a new function ovsrcu_exit() that waits and fires as many
RCU callbacks as it reasonably can. It can only do so for the thread that
calls it and the thread that calls the callbacks, but generally speaking
ovs-vswitchd shuts down other threads before it exits anyway, so this is
pretty good.
In my testing this eliminates most valgrind warnings for tests that run
ovs-vswitchd. This ought to make it easier to distinguish new leaks that
are real from existing non-leaks.
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: William Tu <u9012063@gmai.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With RCU in Open vSwitch it's very easy to protect objects accessed by
a pointer, but sometimes a pointer is not available.
One example is the vhost id for DPDK 16.07. Until DPDK 16.04 a pointer
was used to access a vhost device with RCU semantics. From DPDK 16.07
an integer id (which is an array index) is used to access a vhost
device. Ideally, we want the exact same RCU semantics that we had for
the pointer, on the integer (atomicity, memory barriers, behaviour
around quiescent states)
This commit implements a new type in ovs-rcu: ovsrcu_index. The newly
implemented ovsrcu_index_*() functions should be used to access the
type.
Even though we say "Do not, in general, declare a typedef for a struct,
union, or enum.", I think we're not in the "general" case.
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Tested-by: Ciara Loftus <ciara.loftus@intel.com>
Acked-by: Jarno Rajahalme <jarno@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The PMD thread needs to keep processing RX queues in order
to achieve maximum throughput. It also needs to sweep emc
cache and quiesce which use seq_mutex. That mutex can
eventually block the PMD thread causing latency spikes and
affecting the throughput.
Since there is no requirement for running those tasks at a
specific time, this patch extend seq API to allow tentative
locking instead.
Reported-by: Karl Rister <krister@redhat.com>
Co-authored-by: Karl Rister <krister@redhat.com>
Signed-off-by: Flavio Leitner <fbl@redhat.com>
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ovs_mutex_cond_wait() is used in many functions in dpif-netdev to
synchronize with pmd threads, but we can't guarantee that the callers do
not hold RCU references, so it's better to avoid quiescing.
In system_stats_thread_func() the code relied on ovs_mutex_cond_wait()
to introduce a quiescent state, so explicit calls to
ovsrcu_quiesce_start() and ovsrcu_quiesce_end() are added there.
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Tested-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A new thread must be started in a non quiescent state. There is a call
to ovsrcu_quiesce_end() in ovsthread_wrapper(), to enforce this.
ovs_thread_create(), instead, is executed in the parent thread. It must
call ovsrcu_quiesce_end() on its first invocation, to put the main
thread in a non quiescent state. On every other invocation, it doesn't
make sense to alter the calling thread state, so this commits wraps the
call to ovsrcu_quiesce_end() in an ovsthread_once construct.
This fixes a bug in ovs-rcu where the first call in the process to
ovsrcu_quiesce_start() will not be honored, because the calling thread
will need to create the 'urcu' thread (and creating a thread will
wrongly end its quiescent state).
ovsrcu_quiesce_start()
ovs_rcu_quiesced()
if (ovsthread_once_start(&once)) {
ovs_thread_create("urcu") /*This will end the quiescent state*/
}
This bug affects in particular ovs-vswitchd with DPDK.
In the DPDK case the first threads created are "vhost_thread" and
"dpdk_watchdog". If dpdk_watchdog is the first to call
ovsrcu_quiesce_start() (via xsleep()), the call is not honored and
the RCU grace period lasts at least for DPDK_PORT_WATCHDOG_INTERVAL
(5s on current master). If vhost_thread, on the other hand, is the
first to call ovsrcu_quiesce_start(), the call is not honored and the
RCU grace period lasts undefinitely, because no more calls to
ovsrcu_quiesce_start() are issued from vhost_thread.
For some reason (it's a race condition after all), on current master,
dpdk_watchdog will always be the first to call ovsrcu_quiesce_start(),
but with the upcoming DPDK database configuration changes, sometimes
vhost_thread will issue the first call to ovsrcu_quiesce_start().
Sample ovs-vswitchd.log:
2016-03-23T22:34:28.532Z|00004|ovs_rcu(urcu3)|WARN|blocked 8000 ms
waiting for vhost_thread2 to quiesce
2016-03-23T22:34:30.501Z|00118|ovs_rcu|WARN|blocked 8000 ms waiting for
vhost_thread2 to quiesce
2016-03-23T22:34:36.532Z|00005|ovs_rcu(urcu3)|WARN|blocked 16000 ms
waiting for vhost_thread2 to quiesce
2016-03-23T22:34:38.501Z|00119|ovs_rcu|WARN|blocked 16000 ms waiting for
vhost_thread2 to quiesce
The commit also adds a test for the ovs-rcu module to make sure that:
* A new thread is started in a non quiescent state.
* The first call to ovsrcu_quiesce_start() is honored.
* When a process becomes multithreaded the main thread is put in an
active state
Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
Acked-by: Ben Pfaff <blp@ovn.org>
|
|
|
|
|
|
| |
Suggested-by: William Tu <u9012063@gmail.com>
Signed-off-by: Ben Pfaff <blp@ovn.org>
Acked-by: Russell Bryant <russell@ovn.org>
|
|
|
|
|
|
|
|
| |
A comment referred to a "Usage" section but the section was named "Use".
This fixes the problem (also a grammar error).
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Alex Wang <alexw@nicira.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Postponed 'next' member poisoning was based on the faulty assumption
that postponed functions would be called in the order they were
postponed. This assumption holds only for the functions postponed by
any single thread. When functions are postponed by different
threads, there are no guarantees of the order in which the functions
may be called, or timing between those calls after the next grace
period has passed.
Given this, the postponed poisoning could have executed after
postponed destruction of the object containing the rculist element.
This bug was revealed after the memory leaks on rule deletion were
recently fixed.
This patch removes the postponed 'next' member poisoning and adds
documentation describing the ordering limitations in OVS RCU.
Alex Wang dug out the root cause of the resulting crashes, thanks!
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Alex Wang <alexw@nicira.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, OVSRCU_TYPE_INITIALIZER always initializes the RCU pointer
as NULL. There is no reason why the RCU pointer could not be
initialized with a non-NULL value, however, as statically allocated
memory is even more stable than required for RCU.
This patch changes the initializer to OVSRCU_INITIALIZER(VALUE), which
can take any pointer value as a parameter.
This allows rculist, which is introduced in a following patch, to
provide an initializer similar to the one in the normal list.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As ovsrcu_get() looks like a function call, it is reasonable for the
callers to expect that the arguments are evaluated only once.
CONST_CAST expands its 'POINTER' argument multiple times, and the
exact effect of this turned out to be compiler dependent. Fix this by
expanding the macro argument before CONST_CAST, and removing
unnecessary CONST_CASTs.
VMware-BZ: #1287651
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
|
|
|
|
|
|
| |
A following patch will add the first external user.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ovsrcu_init() was named after the atomic_init(), but the semantics are
different enough to warrant a different name. Basically C11
atomic_init is defined in a way that allows the implementation to
assign the value without any syncronization, so in theory stores via
atomic_init could be seen by others after stores via atomic_set, even
if the atomic_init appeared earlier in the code.
ovsrcu_set_hidden() can be used to set an RCU protected variable when
it is not yet accessible by any active reader, but will be made
visible later via an ovsrcu_set call on some other pointer.
This patch also adds a new ovsrcu_init() that can be used to initilize
RCU protected variables when the readers are not yet executing. The
new ovsrcu_init() is implemented with atomic_init(), so it does not
provide any kind of syncronization.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ovsrcu_set() and ovsrcu_init() look like functions, so users are
justified in expecting the arguments to be evaluated before any of
the body of ovsrcu_set or ovsrcu_init().
With ovs-atomic-pthreads, a fallback ovs-atomics implementation used
when no C11 atomics are available or with GCC older than 4.0, the
prior definitions led to nested mutex locking when the 'VALUE'
argument was an atomic_read(). This is resolved by ensuring function
argument semantics for the parameters. For non-GCC compilers this is
done with an inline function. GCC implementation of ovs-rcu does not
fix the RCU variable type, so a GCC macro needs to be used instead.
Reported-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
|
|
|
|
|
|
|
| |
'struct ovsrcu_pointer' is always used with the 'struct' keyword, so
remove the unneeded 'typedef'.
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
|
|
|
|
|
|
|
|
|
|
|
|
| |
lib/ovs-rcu.h had some of the comments duplicated.
Add ovsrcu_init() that can be used like ovsrcu_set() when the RCU
protected pointer is not yet visible any readers.
ovs-rcu internal initialization function is renamed as ovsrcu_init_module().
Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com>
Acked-by: Ben Pfaff <blp@nicira.com>
|
|
|
|
| |
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
|
|
|
|
|
|
| |
to NULL.
Signed-off-by: Ryan Wilson <wryan@nicira.com>
Acked-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
|
|
|
|
|
|
|
|
|
|
| |
Otherwise ovsrcu_synchronize() busy-waits in its loop because its
poll_block() un-quiesces, causing the global_seqno to increase, which is
what it waits for.
Reported-by: Alex Wang <alexw@nicira.com>
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Alex Wang <alexw@nicira.com>
|
|
RCU allows multiple threads to read objects in parallel without any
performance penalty. The following commit will introduce the first use.
Signed-off-by: Ben Pfaff <blp@nicira.com>
Acked-by: Andy Zhou <azhou@nicira.com>
|