[libc++][spaceship] Implement `lexicographical_compare_three_way`

The implementation makes use of the freedom added by LWG 3410. We have two variants of this algorithm:

* A fast path for random access iterators: This fast path computes the maximum number of loop iterations up-front and does not compare the iterators against their limits on every loop iteration.
* A basic implementation for all other iterators: This implementation compares the iterators against their limits in every loop iteration. However, it still takes advantage of the freedom added by LWG 3410 to avoid the unnecessary additional iterator comparisons that P1614R2 originally specified.

https://godbolt.org/z/7xbMEen5e shows the benefit of the fast path: The hot loop generated for `lexicographical_compare_three_way3` is tighter than the one generated for `lexicographical_compare_three_way1`. The added benchmark illustrates how this leads to a 30% to 50% performance improvement on integer vectors.

Implements part of P1614R2 "The Mothership has Landed"

Fixes LWG 3410 and LWG 3350

Differential Revision: https://reviews.llvm.org/D131395
author: Adrian Vogelsgesang <avogelsgesang@tableau.com> 2022-08-04 15:21:27 -0700
committer: Adrian Vogelsgesang <avogelsgesang@salesforce.com> 2023-02-12 14:51:08 -0800
commit: 2a06757a200cc8dd4c3aeca98509d50d75bb4a27 (patch)
tree: 140e7a9d72c815e222a83ccb3662d7329afbf0b6 /libcxx/benchmarks
parent: 2e6430666caf303b84dd281442533d2f3b6ad1b2 (diff)
download: llvm-2a06757a200cc8dd4c3aeca98509d50d75bb4a27.tar.gz
2 files changed, 97 insertions, 0 deletions
diff --git a/libcxx/benchmarks/CMakeLists.txt b/libcxx/benchmarks/CMakeLists.txt
index 7eb76ac6370f..2c3c39ef5345 100644
--- a/libcxx/benchmarks/CMakeLists.txt
+++ b/libcxx/benchmarks/CMakeLists.txt
@@ -186,6 +186,7 @@ set(BENCHMARK_TESTS
     formatter_int.bench.cpp
     function.bench.cpp
     join_view.bench.cpp
+    lexicographical_compare_three_way.bench.cpp
     map.bench.cpp
     monotonic_buffer.bench.cpp
     ordered_set.bench.cpp
diff --git a/libcxx/benchmarks/lexicographical_compare_three_way.bench.cpp b/libcxx/benchmarks/lexicographical_compare_three_way.bench.cpp
new file mode 100644
index 000000000000..e58134adaf21
--- /dev/null
+++ b/libcxx/benchmarks/lexicographical_compare_three_way.bench.cpp
@@ -0,0 +1,96 @@
+//===----------------------------------------------------------------------===//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include <algorithm>
+
+#include "benchmark/benchmark.h"
+#include "test_iterators.h"
+
+static void BM_lexicographical_compare_three_way_slow_path(benchmark::State& state) {
+  auto size = state.range(0);
+  std::vector<int> v1;
+  v1.resize(size);
+  // v2 is identical except for the last value.
+  // This means, that `lexicographical_compare_three_way` actually has to
+  // compare the complete vector and cannot bail out early.
+  std::vector<int> v2 = v1;
+  v2.back() += 1;
+  int* b1 = v1.data();
+  int* e1 = b1 + v1.size();
+  int* b2 = v2.data();
+  int* e2 = b2 + v2.size();
+
+  for (auto _ : state) {
+    auto cmp = std::compare_three_way();
+    benchmark::DoNotOptimize(std::__lexicographical_compare_three_way_slow_path(b1, e1, b2, e2, cmp));
+  }
+}
+
+BENCHMARK(BM_lexicographical_compare_three_way_slow_path)->RangeMultiplier(4)->Range(1, 1 << 20);
+
+static void BM_lexicographical_compare_three_way_fast_path(benchmark::State& state) {
+  auto size = state.range(0);
+  std::vector<int> v1;
+  v1.resize(size);
+  // v2 is identical except for the last value.
+  // This means, that `lexicographical_compare_three_way` actually has to
+  // compare the complete vector and cannot bail out early.
+  std::vector<int> v2 = v1;
+  v2.back() += 1;
+  int* b1 = v1.data();
+  int* e1 = b1 + v1.size();
+  int* b2 = v2.data();
+  int* e2 = b2 + v2.size();
+
+  for (auto _ : state) {
+    auto cmp = std::compare_three_way();
+    benchmark::DoNotOptimize(std::__lexicographical_compare_three_way_fast_path(b1, e1, b2, e2, cmp));
+  }
+}
+
+BENCHMARK(BM_lexicographical_compare_three_way_fast_path)->RangeMultiplier(4)->Range(1, 1 << 20);
+
+template <class IteratorT>
+static void BM_lexicographical_compare_three_way(benchmark::State& state) {
+  auto size = state.range(0);
+  std::vector<int> v1;
+  v1.resize(size);
+  // v2 is identical except for the last value.
+  // This means, that `lexicographical_compare_three_way` actually has to
+  // compare the complete vector and cannot bail out early.
+  std::vector<int> v2 = v1;
+  v2.back() += 1;
+  auto b1 = IteratorT{v1.data()};
+  auto e1 = IteratorT{v1.data() + v1.size()};
+  auto b2 = IteratorT{v2.data()};
+  auto e2 = IteratorT{v2.data() + v2.size()};
+
+  for (auto _ : state) {
+    benchmark::DoNotOptimize(std::lexicographical_compare_three_way(b1, e1, b2, e2));
+  }
+}
+
+// Type alias to make sure the `*` does not appear in the benchmark name.
+// A `*` would confuse the Python test runner running this google benchmark.
+using IntPtr = int*;
+
+// `lexicographical_compare_three_way` has a fast path for random access iterators.
+BENCHMARK_TEMPLATE(BM_lexicographical_compare_three_way, IntPtr)->RangeMultiplier(4)->Range(1, 1 << 20);
+BENCHMARK_TEMPLATE(BM_lexicographical_compare_three_way, random_access_iterator<IntPtr>)
+    ->RangeMultiplier(4)
+    ->Range(1, 1 << 20);
+BENCHMARK_TEMPLATE(BM_lexicographical_compare_three_way, cpp17_input_iterator<IntPtr>)
+    ->RangeMultiplier(4)
+    ->Range(1, 1 << 20);
+
+int main(int argc, char** argv) {
+  benchmark::Initialize(&argc, argv);
+  if (benchmark::ReportUnrecognizedArguments(argc, argv))
+    return 1;
+
+  benchmark::RunSpecifiedBenchmarks();
+}