|
Add a lockless multi-producer/single-consumer (MPSC), linked-list based,
intrusive, unbounded queue that does not require deferred memory
management.
The queue is designed to improve the specific MPSC setup. A benchmark
accompanies the unit tests to measure the difference in this configuration.
A single reader thread polls the queue while N writers enqueue elements
as fast as possible. The mpsc-queue is compared against the regular ovs-list
as well as the guarded list. The latter usually offers a slight improvement
by batching the element removal, however the mpsc-queue is faster.
The average is of each producer threads time:
$ ./tests/ovstest test-mpsc-queue benchmark 3000000 1
Benchmarking n=3000000 on 1 + 1 threads.
type\thread: Reader 1 Avg
mpsc-queue: 167 167 167 ms
list(spin): 89 80 80 ms
list(mutex): 745 745 745 ms
guarded list: 788 788 788 ms
$ ./tests/ovstest test-mpsc-queue benchmark 3000000 2
Benchmarking n=3000000 on 1 + 2 threads.
type\thread: Reader 1 2 Avg
mpsc-queue: 98 97 94 95 ms
list(spin): 185 171 173 172 ms
list(mutex): 203 199 203 201 ms
guarded list: 269 269 188 228 ms
$ ./tests/ovstest test-mpsc-queue benchmark 3000000 3
Benchmarking n=3000000 on 1 + 3 threads.
type\thread: Reader 1 2 3 Avg
mpsc-queue: 76 76 65 76 72 ms
list(spin): 246 110 240 238 196 ms
list(mutex): 542 541 541 539 540 ms
guarded list: 535 535 507 511 517 ms
$ ./tests/ovstest test-mpsc-queue benchmark 3000000 4
Benchmarking n=3000000 on 1 + 4 threads.
type\thread: Reader 1 2 3 4 Avg
mpsc-queue: 73 68 68 68 68 68 ms
list(spin): 294 275 279 277 282 278 ms
list(mutex): 346 309 287 345 302 310 ms
guarded list: 378 319 334 378 351 345 ms
Signed-off-by: Gaetan Rivet <grive@u256.net>
Reviewed-by: Eli Britstein <elibr@nvidia.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
|