summaryrefslogtreecommitdiff
path: root/lib/ovs-rcu.h
Commit message (Collapse)AuthorAgeFilesLines
* ovs-vswitchd: Fire RCU callbacks before exit to reduce memory leak warnings.Ben Pfaff2018-02-011-0/+2
| | | | | | | | | | | | | | | | | | | | | | ovs-vswitchd makes extensive use of RCU to defer freeing memory past the latest time that it could be in use by a thread. Until now, ovs-vswitchd has not waited for RCU callbacks to fire before exiting. This meant that in many cases, when ovs-vswitchd exits, many blocks of memory are stuck in RCU callback queues, which valgrind often reports as "possible" memory leaks. This commit adds a new function ovsrcu_exit() that waits and fires as many RCU callbacks as it reasonably can. It can only do so for the thread that calls it and the thread that calls the callbacks, but generally speaking ovs-vswitchd shuts down other threads before it exits anyway, so this is pretty good. In my testing this eliminates most valgrind warnings for tests that run ovs-vswitchd. This ought to make it easier to distinguish new leaks that are real from existing non-leaks. Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: William Tu <u9012063@gmai.com>
* ovs-rcu: Add new ovsrcu_index type.Daniele Di Proietto2016-08-031-0/+84
| | | | | | | | | | | | | | | | | | | | | | | With RCU in Open vSwitch it's very easy to protect objects accessed by a pointer, but sometimes a pointer is not available. One example is the vhost id for DPDK 16.07. Until DPDK 16.04 a pointer was used to access a vhost device with RCU semantics. From DPDK 16.07 an integer id (which is an array index) is used to access a vhost device. Ideally, we want the exact same RCU semantics that we had for the pointer, on the integer (atomicity, memory barriers, behaviour around quiescent states) This commit implements a new type in ovs-rcu: ovsrcu_index. The newly implemented ovsrcu_index_*() functions should be used to access the type. Even though we say "Do not, in general, declare a typedef for a struct, union, or enum.", I think we're not in the "general" case. Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Tested-by: Ciara Loftus <ciara.loftus@intel.com> Acked-by: Jarno Rajahalme <jarno@ovn.org>
* dpif-netdev: Remove PMD latency on seq_mutexFlavio Leitner2016-07-081-0/+1
| | | | | | | | | | | | | | | | | The PMD thread needs to keep processing RX queues in order to achieve maximum throughput. It also needs to sweep emc cache and quiesce which use seq_mutex. That mutex can eventually block the PMD thread causing latency spikes and affecting the throughput. Since there is no requirement for running those tasks at a specific time, this patch extend seq API to allow tentative locking instead. Reported-by: Karl Rister <krister@redhat.com> Co-authored-by: Karl Rister <krister@redhat.com> Signed-off-by: Flavio Leitner <fbl@redhat.com> Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com>
* ovs-thread: Do not quiesce in ovs_mutex_cond_wait().Daniele Di Proietto2016-05-231-2/+1
| | | | | | | | | | | | | | | ovs_mutex_cond_wait() is used in many functions in dpif-netdev to synchronize with pmd threads, but we can't guarantee that the callers do not hold RCU references, so it's better to avoid quiescing. In system_stats_thread_func() the code relied on ovs_mutex_cond_wait() to introduce a quiescent state, so explicit calls to ovsrcu_quiesce_start() and ovsrcu_quiesce_end() are added there. Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Tested-by: Ilya Maximets <i.maximets@samsung.com> Acked-by: Ilya Maximets <i.maximets@samsung.com> Acked-by: Ben Pfaff <blp@ovn.org>
* ovs-thread: Do not always end quiescent state in ovs_thread_create().Daniele Di Proietto2016-03-251-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A new thread must be started in a non quiescent state. There is a call to ovsrcu_quiesce_end() in ovsthread_wrapper(), to enforce this. ovs_thread_create(), instead, is executed in the parent thread. It must call ovsrcu_quiesce_end() on its first invocation, to put the main thread in a non quiescent state. On every other invocation, it doesn't make sense to alter the calling thread state, so this commits wraps the call to ovsrcu_quiesce_end() in an ovsthread_once construct. This fixes a bug in ovs-rcu where the first call in the process to ovsrcu_quiesce_start() will not be honored, because the calling thread will need to create the 'urcu' thread (and creating a thread will wrongly end its quiescent state). ovsrcu_quiesce_start() ovs_rcu_quiesced() if (ovsthread_once_start(&once)) { ovs_thread_create("urcu") /*This will end the quiescent state*/ } This bug affects in particular ovs-vswitchd with DPDK. In the DPDK case the first threads created are "vhost_thread" and "dpdk_watchdog". If dpdk_watchdog is the first to call ovsrcu_quiesce_start() (via xsleep()), the call is not honored and the RCU grace period lasts at least for DPDK_PORT_WATCHDOG_INTERVAL (5s on current master). If vhost_thread, on the other hand, is the first to call ovsrcu_quiesce_start(), the call is not honored and the RCU grace period lasts undefinitely, because no more calls to ovsrcu_quiesce_start() are issued from vhost_thread. For some reason (it's a race condition after all), on current master, dpdk_watchdog will always be the first to call ovsrcu_quiesce_start(), but with the upcoming DPDK database configuration changes, sometimes vhost_thread will issue the first call to ovsrcu_quiesce_start(). Sample ovs-vswitchd.log: 2016-03-23T22:34:28.532Z|00004|ovs_rcu(urcu3)|WARN|blocked 8000 ms waiting for vhost_thread2 to quiesce 2016-03-23T22:34:30.501Z|00118|ovs_rcu|WARN|blocked 8000 ms waiting for vhost_thread2 to quiesce 2016-03-23T22:34:36.532Z|00005|ovs_rcu(urcu3)|WARN|blocked 16000 ms waiting for vhost_thread2 to quiesce 2016-03-23T22:34:38.501Z|00119|ovs_rcu|WARN|blocked 16000 ms waiting for vhost_thread2 to quiesce The commit also adds a test for the ovs-rcu module to make sure that: * A new thread is started in a non quiescent state. * The first call to ovsrcu_quiesce_start() is honored. * When a process becomes multithreaded the main thread is put in an active state Signed-off-by: Daniele Di Proietto <diproiettod@vmware.com> Acked-by: Ben Pfaff <blp@ovn.org>
* ovs-rcu: Improve comments on ovsrcu_postpone().Ben Pfaff2016-01-251-2/+4
| | | | | | Suggested-by: William Tu <u9012063@gmail.com> Signed-off-by: Ben Pfaff <blp@ovn.org> Acked-by: Russell Bryant <russell@ovn.org>
* ovs-rcu: Comment fixes.Ben Pfaff2015-06-121-4/+4
| | | | | | | | A comment referred to a "Usage" section but the section was named "Use". This fixes the problem (also a grammar error). Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Alex Wang <alexw@nicira.com>
* rculist: Remove postponed poisoning.Jarno Rajahalme2015-06-111-1/+14
| | | | | | | | | | | | | | | | | | | | | | | Postponed 'next' member poisoning was based on the faulty assumption that postponed functions would be called in the order they were postponed. This assumption holds only for the functions postponed by any single thread. When functions are postponed by different threads, there are no guarantees of the order in which the functions may be called, or timing between those calls after the next grace period has passed. Given this, the postponed poisoning could have executed after postponed destruction of the object containing the rculist element. This bug was revealed after the memory leaks on rule deletion were recently fixed. This patch removes the postponed 'next' member poisoning and adds documentation describing the ordering limitations in OVS RCU. Alex Wang dug out the root cause of the resulting crashes, thanks! Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Alex Wang <alexw@nicira.com>
* lib/ovs-rcu: Support static initialization.Jarno Rajahalme2014-10-271-2/+2
| | | | | | | | | | | | | | | Currently, OVSRCU_TYPE_INITIALIZER always initializes the RCU pointer as NULL. There is no reason why the RCU pointer could not be initialized with a non-NULL value, however, as statically allocated memory is even more stable than required for RCU. This patch changes the initializer to OVSRCU_INITIALIZER(VALUE), which can take any pointer value as a parameter. This allows rculist, which is introduced in a following patch, to provide an initializer similar to the one in the normal list. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
* lib/ovs-rcu: evaluate argument of ovsrcu_get only once.Jarno Rajahalme2014-07-211-3/+4
| | | | | | | | | | | | | As ovsrcu_get() looks like a function call, it is reasonable for the callers to expect that the arguments are evaluated only once. CONST_CAST expands its 'POINTER' argument multiple times, and the exact effect of this turned out to be compiler dependent. Fix this by expanding the macro argument before CONST_CAST, and removing unnecessary CONST_CASTs. VMware-BZ: #1287651 Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
* lib/ovs-rcu: Export ovsrcu_synchronize().Jarno Rajahalme2014-07-111-0/+4
| | | | | | A following patch will add the first external user. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
* lib/ovs-rcu: Rename ovsrcu_init() as ovsrcu_set_hidden().Jarno Rajahalme2014-06-031-7/+13
| | | | | | | | | | | | | | | | | | | | ovsrcu_init() was named after the atomic_init(), but the semantics are different enough to warrant a different name. Basically C11 atomic_init is defined in a way that allows the implementation to assign the value without any syncronization, so in theory stores via atomic_init could be seen by others after stores via atomic_set, even if the atomic_init appeared earlier in the code. ovsrcu_set_hidden() can be used to set an RCU protected variable when it is not yet accessible by any active reader, but will be made visible later via an ovsrcu_set call on some other pointer. This patch also adds a new ovsrcu_init() that can be used to initilize RCU protected variables when the readers are not yet executing. The new ovsrcu_init() is implemented with atomic_init(), so it does not provide any kind of syncronization. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
* lib/ovs-rcu: Evaluate parameters before ovsrcu_set and ovsrcu_init.Jarno Rajahalme2014-06-031-2/+23
| | | | | | | | | | | | | | | | | ovsrcu_set() and ovsrcu_init() look like functions, so users are justified in expecting the arguments to be evaluated before any of the body of ovsrcu_set or ovsrcu_init(). With ovs-atomic-pthreads, a fallback ovs-atomics implementation used when no C11 atomics are available or with GCC older than 4.0, the prior definitions led to nested mutex locking when the 'VALUE' argument was an atomic_read(). This is resolved by ensuring function argument semantics for the parameters. For non-GCC compilers this is done with an inline function. GCC implementation of ovs-rcu does not fix the RCU variable type, so a GCC macro needs to be used instead. Reported-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp> Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
* ovs-rcu: Remove the extra 'typedef' keyword.Jarno Rajahalme2014-06-031-1/+1
| | | | | | | 'struct ovsrcu_pointer' is always used with the 'struct' keyword, so remove the unneeded 'typedef'. Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
* lib/ovs-rcu: Fix documentation, add ovsrcu_init().Jarno Rajahalme2014-05-281-29/+23
| | | | | | | | | | | | lib/ovs-rcu.h had some of the comments duplicated. Add ovsrcu_init() that can be used like ovsrcu_set() when the RCU protected pointer is not yet visible any readers. ovs-rcu internal initialization function is renamed as ovsrcu_init_module(). Signed-off-by: Jarno Rajahalme <jrajahalme@nicira.com> Acked-by: Ben Pfaff <blp@nicira.com>
* ovs-rcu: Fix typo in comment.Ben Pfaff2014-05-191-1/+1
| | | | Signed-off-by: Ben Pfaff <blp@nicira.com>
* ovs-rcu: Add OVSRCU_TYPE_INITIALIZER which initializes OVSRCU_TYPE variables ↵Ryan Wilson2014-05-191-0/+2
| | | | | | | | to NULL. Signed-off-by: Ryan Wilson <wryan@nicira.com> Acked-by: Alex Wang <alexw@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com>
* timeval: Preserve quiescence across time_poll().Ben Pfaff2014-04-281-0/+1
| | | | | | | | | | Otherwise ovsrcu_synchronize() busy-waits in its loop because its poll_block() un-quiesces, causing the global_seqno to increase, which is what it waits for. Reported-by: Alex Wang <alexw@nicira.com> Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Alex Wang <alexw@nicira.com>
* ovs-rcu: New library.Ben Pfaff2014-03-181-0/+182
RCU allows multiple threads to read objects in parallel without any performance penalty. The following commit will introduce the first use. Signed-off-by: Ben Pfaff <blp@nicira.com> Acked-by: Andy Zhou <azhou@nicira.com>