path: root/libcxx/benchmarks
Commit message | Author | Age | Files | Lines
* [libc++abi] Improve performance of __dynamic_cast | Sirui Mu | 2023-03-19 | 2 | -0/+173
  The original `__dynamic_cast` implementation does not use the ABI-provided `src2dst_offset` parameter which helps improve performance on the hot paths. This patch improves the performance on the hot paths in `__dynamic_cast` by leveraging hints provided by the `src2dst_offset` parameter.

  This patch also includes a performance benchmark suite for the `__dynamic_cast` implementation.

  Reviewed By: philnik, ldionne, #libc, #libc_abi, avogelsgesang
  Spies: mikhail.ramalho, avogelsgesang, xingxue, libcxx-commits
  Differential Revision: https://reviews.llvm.org/D138005
* [libc++] Optimize std::ranges::{min, max} for types that are cheap to copy | Nikolas Klauser | 2023-03-11 | 2 | -0/+71
  Don't forward to `min_element` for small types that are trivially copyable, and instead use a naive loop that keeps track of the smallest element (as opposed to an iterator to the smallest element). This allows the compiler to vectorize the loop in some cases.

  Reviewed By: #libc, ldionne
  Spies: ldionne, libcxx-commits
  Differential Revision: https://reviews.llvm.org/D143596
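To illustrate the idea in the commit above, here is a minimal sketch of a loop that tracks the smallest value rather than an iterator to it. The name `min_by_value` and the restriction to `std::vector` are assumptions made for this example; this is not the libc++ implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical helper (not libc++ code): returns the smallest element by
// value instead of returning an iterator to it.
template <class T>
T min_by_value(const std::vector<T>& values) {
  static_assert(std::is_trivially_copyable_v<T>,
                "this sketch targets cheap-to-copy types");
  assert(!values.empty() && "the range must not be empty");
  T smallest = values[0];
  for (std::size_t i = 1; i < values.size(); ++i) {
    // Keeping a copy of the value avoids iterator bookkeeping in the loop
    // body, which gives the optimizer a chance to vectorize it.
    if (values[i] < smallest)
      smallest = values[i];
  }
  return smallest;
}
```

Calling `min_by_value` on a vector of integers returns the minimum value directly. Whether the loop is actually vectorized depends on the compiler and flags.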
* [libc++] Forward to std::memcmp for trivially comparable types in equal | Nikolas Klauser | 2023-02-21 | 2 | -1/+48
  Reviewed By: #libc, ldionne
  Spies: ldionne, Mordante, libcxx-commits
  Differential Revision: https://reviews.llvm.org/D139554
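As a rough illustration of the technique named in the commit title above, the sketch below compares two integer ranges with a single `std::memcmp` call. The helper name `equal_bytewise` and the restriction to non-bool integer types are assumptions for this example; libc++'s actual dispatch logic is not shown here.

```cpp
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical helper (not libc++'s internal dispatch): compares two ranges
// bytewise. This is only valid for types whose equality means "same object
// representation", which holds for integer types on mainstream platforms.
template <class T>
bool equal_bytewise(const T* first1, const T* last1, const T* first2) {
  static_assert(std::is_integral_v<T> && !std::is_same_v<T, bool>,
                "sketch restricted to non-bool integer types");
  const std::size_t count = static_cast<std::size_t>(last1 - first1);
  // Short-circuit empty ranges so memcmp is never handed a null pointer.
  return count == 0 || std::memcmp(first1, first2, count * sizeof(T)) == 0;
}
```

The second range is assumed to be at least as long as the first, as with the three-iterator form of `std::equal`.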
* [libc++][spaceship] Implement `lexicographical_compare_three_way` | Adrian Vogelsgesang | 2023-02-12 | 2 | -0/+97
  The implementation makes use of the freedom added by LWG 3410. We have two variants of this algorithm:
  * A fast path for random access iterators: This fast path computes the maximum number of loop iterations up-front and does not compare the iterators against their limits on every loop iteration.
  * A basic implementation for all other iterators: This implementation compares the iterators against their limits in every loop iteration. However, it still takes advantage of the freedom added by LWG 3410 to avoid unnecessary additional iterator comparisons, as originally specified by P1614R2.

  https://godbolt.org/z/7xbMEen5e shows the benefit of the fast path: The hot loop generated for `lexicographical_compare_three_way3` is tighter than the one for `lexicographical_compare_three_way1`. The added benchmark illustrates how this leads to a 30% - 50% performance improvement on integer vectors.

  Implements part of P1614R2 "The Mothership has Landed"

  Fixes LWG 3410 and LWG 3350

  Differential Revision: https://reviews.llvm.org/D131395
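The fast path described above can be sketched as follows. This is an illustration only: the function name `compare_three_way_fast` is hypothetical, and it returns an `int` sign so that it compiles as C++17, whereas the real algorithm returns the comparison category produced by `operator<=>`.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical sketch of the random-access fast path (not the libc++ source).
// Returns -1, 0, or 1 for less, equal, or greater.
inline int compare_three_way_fast(const std::vector<int>& a,
                                  const std::vector<int>& b) {
  // Compute the trip count once, so the loop body needs no end-of-range
  // checks on either input.
  const std::size_t common = std::min(a.size(), b.size());
  for (std::size_t i = 0; i < common; ++i) {
    if (a[i] != b[i])
      return a[i] < b[i] ? -1 : 1;
  }
  // All shared positions are equal: the shorter range orders first.
  if (a.size() == b.size())
    return 0;
  return a.size() < b.size() ? -1 : 1;
}
```

The basic variant for other iterator categories would instead test both iterators against their ends on each iteration, which is the extra work the fast path avoids.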
* [libc++] Enable segmented iterator optimizations for join_view::iterator | Nikolas Klauser | 2023-01-20 | 2 | -0/+78
  Reviewed By: ldionne, #libc
  Spies: libcxx-commits
  Differential Revision: https://reviews.llvm.org/D138413
* [libc++] Refactor deque::iterator algorithm optimizations | Nikolas Klauser | 2023-01-19 | 2 | -0/+233
  This has multiple benefits:
  - The optimizations are also performed for the `ranges::` versions of the algorithms
  - Code duplication is reduced
  - It is simpler to add this optimization for other segmented iterators, like `ranges::join_view::iterator`
  - Algorithm code is removed from `<deque>`

  Reviewed By: ldionne, huixie90, #libc
  Spies: mstorsjo, sstefan1, EricWF, libcxx-commits, mgorny
  Differential Revision: https://reviews.llvm.org/D132505
* [libc++] Make pmr::monotonic_buffer_resource bump down | Nikolas Klauser | 2023-01-12 | 2 | -0/+29
  Bumping down is significantly faster than bumping up. This is ABI breaking, but the ABI of `pmr::monotonic_buffer_resource` was only stabilized in this release cycle, so we can still change it.

  For a more detailed explanation why bumping down is better, see https://fitzgeraldnick.com/2019/11/01/always-bump-downwards.html.

  Reviewed By: ldionne, #libc
  Spies: libcxx-commits
  Differential Revision: https://reviews.llvm.org/D141435
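A minimal sketch of what "bumping down" means, assuming a fixed caller-provided buffer. The class `BumpDownArena` is hypothetical and only illustrates the pointer arithmetic; it is not how `pmr::monotonic_buffer_resource` is implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical bump-down arena over a caller-provided buffer.
class BumpDownArena {
public:
  BumpDownArena(void* buffer, std::size_t size)
      : begin_(reinterpret_cast<std::uintptr_t>(buffer)),
        cursor_(begin_ + size) {}

  // `align` must be a power of two. Returns nullptr when the request does
  // not fit in the remaining space.
  void* allocate(std::size_t bytes, std::size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    if (bytes > cursor_ - begin_)
      return nullptr;
    // Move the cursor down by the size, then round down to the alignment
    // with a single mask. Bumping up needs a round-up plus an extra add.
    const std::uintptr_t candidate =
        (cursor_ - bytes) & ~(static_cast<std::uintptr_t>(align) - 1);
    if (candidate < begin_)
      return nullptr;
    cursor_ = candidate;
    return reinterpret_cast<void*>(candidate);
  }

private:
  std::uintptr_t begin_;   // lowest usable address
  std::uintptr_t cursor_;  // everything at or above this is allocated
};
```

Because the cursor moves towards lower addresses, alignment is a subtract followed by a mask, which is the cheaper sequence the linked article argues for.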
* [libc++] Extend check for non-ASCII characters to src/, test/ and benchmarks/ | Louis Dionne | 2022-08-23 | 2 | -3/+3
  Differential Revision: https://reviews.llvm.org/D132180
* [libc++] Implements Unicode grapheme clustering | Mark de Wever | 2022-07-20 | 1 | -140/+244
  This implements the grapheme clustering as required by
  P1868R2 width: clarifying units of width and precision in std::format

  This was omitted in the initial patch, but the paper was marked as completed. This really completes the paper.

  Reviewed By: ldionne, #libc
  Differential Revision: https://reviews.llvm.org/D126971
* [libc++] Treat incomplete features just like other experimental features | Louis Dionne | 2022-07-19 | 1 | -17/+9
  In particular remove the ability to expel incomplete features from the library at configure-time, since this can now be done through the _LIBCPP_ENABLE_EXPERIMENTAL macro.

  Also, never provide symbols related to incomplete features inside the dylib, instead provide them in c++experimental.a (this changes the symbols list, but not for any configuration that should have shipped).

  Differential Revision: https://reviews.llvm.org/D128928
* [libc++][format] Adds integral formatter benchmarks. | Mark de Wever | 2022-07-12 | 2 | -0/+209
  This is a preparation to look at possible performance improvements.

  Reviewed By: #libc, ldionne
  Differential Revision: https://reviews.llvm.org/D129421
* [libc++] Don't call key_eq in unordered_map/set rehashing routine | Ivan Trofimov | 2022-07-10 | 2 | -0/+49
  Currently, a container's key_eq may be called when rehashing happens, which is redundant for containers with unique keys.

  Reviewed By: #libc, philnik, Mordante
  Differential Revision: https://reviews.llvm.org/D128021
* [libc++][ranges] Implement modifying heap algorithms: | Konstantin Varlamov | 2022-07-08 | 6 | -0/+198
  - `ranges::make_heap`;
  - `ranges::push_heap`;
  - `ranges::pop_heap`;
  - `ranges::sort_heap`.

  Differential Revision: https://reviews.llvm.org/D128115
* [libc++][ranges] Implement `ranges::stable_sort`. | Konstantin Varlamov | 2022-07-01 | 2 | -0/+40
  Differential Revision: https://reviews.llvm.org/D127834
* [libc++] Don't force -O2 when building the benchmarks | Louis Dionne | 2022-06-17 | 1 | -1/+1
  The optimization level used when building the benchmarks should match the optimization level of the current build. Otherwise, we can end up mixing an -O3 or -O0 optimized dylib with benchmarks built with -O2, which is really misleading.

  Differential Revision: https://reviews.llvm.org/D127987
* [libc++][ranges] Implement `ranges::sort`. | Konstantin Varlamov | 2022-06-16 | 2 | -0/+40
  Differential Revision: https://reviews.llvm.org/D127557
* [libc++][format] Implement format-string. | Mark de Wever | 2022-06-11 | 1 | -3/+13
  Implements the compile-time checking of the formatting arguments.

  Completes:
  - P2216 std::format improvements

  Reviewed By: #libc, ldionne
  Differential Revision: https://reviews.llvm.org/D121530
* [libc++][ranges] Implement ranges::binary_search and ranges::{lower, upper}_bound | Nikolas Klauser | 2022-06-06 | 2 | -0/+43
  Reviewed By: Mordante, var-const, ldionne, #libc
  Spies: sstefan1, libcxx-commits, mgorny
  Differential Revision: https://reviews.llvm.org/D121964
* [libcxx] Add sort.bench.cpp to libcxx/benchmarks/CMakeLists.txt | Hans Wennborg | 2022-05-24 | 1 | -1/+2
  It was forgotten in D124740.

  Differential revision: https://reviews.llvm.org/D126297
* [libc++] Replace modulus operations in std::seed_seq::generate with conditional checks. | Laramie Leavitt | 2022-05-24 | 1 | -0/+33
  Abseil benchmarks suggest that the conditional checks result in faster code (4-5x) as they are compiled into conditional move instructions (cmov on x86).

  Reviewed By: #libc, philnik, Mordante
  Spies: pengfei, Mordante, philnik, libcxx-commits
  Differential Revision: https://reviews.llvm.org/D125329
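The general pattern behind this change can be sketched as below: when an index is known to stay within one wrap of the bound, a compare-and-subtract can stand in for `%`. The helper `advance_wrapped` and its preconditions are assumptions for this illustration, not the code in `std::seed_seq::generate`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not the seed_seq code): advances a ring index by
// `step`. Valid only while index < n and step <= n, so one subtraction is
// enough to wrap.
inline std::size_t advance_wrapped(std::size_t index, std::size_t step,
                                   std::size_t n) {
  assert(index < n && step <= n);
  std::size_t next = index + step;
  // Equivalent to (index + step) % n under the precondition above, but a
  // compiler can lower it to a compare plus conditional move instead of an
  // integer division.
  if (next >= n)
    next -= n;
  return next;
}
```

Whether a conditional move is actually emitted depends on the compiler and target. The 4-5x figure quoted above comes from the commit message, not from this sketch.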
* [libc++] Granularize algorithm benchmarks | Nikolas Klauser | 2022-05-19 | 10 | -179/+365
  Reviewed By: ldionne, #libc
  Spies: libcxx-commits, mgorny, mgrang
  Differential Revision: https://reviews.llvm.org/D124740
* [runtimes][NFC] Remove dead code for Standalone builds | Louis Dionne | 2022-05-13 | 1 | -6/+0
  Standalone builds have been deprecated and then removed for a while now. Trying to use standalone builds leads to a fatal CMake error, so this code is all dead. Remove it to clean things up.

  Differential Revision: https://reviews.llvm.org/D125561
* [libc++][ranges] Implement ranges::minmax and ranges::minmax_element | Nikolas Klauser | 2022-04-14 | 1 | -0/+16
  Reviewed By: var-const, #libc, ldionne
  Spies: sstefan1, ldionne, BRevzin, libcxx-commits, mgorny
  Differential Revision: https://reviews.llvm.org/D120637
* Introduce branchless sorting functions for sort3, sort4 and sort5. | Marco Gelmi | 2022-04-08 | 1 | -14/+8
  We are introducing branchless variants for sort3, sort4 and sort5. These sorting functions have been generated using Reinforcement Learning and aim to replace __sort3, __sort4 and __sort5 variants for integral types.

  The libc++ benchmarks were run on isolated machines for Skylake, ARM and AMD architectures and achieve statistically significant improvement in sorting random integers on test cases from sort1 to sort262144 for uint32 and uint64.

  A full performance overview for Intel Skylake, AMD and Arm can be found here: https://bit.ly/3AtesYf

  Reviewed By: ldionne, #libc, philnik
  Spies: daniel.mankowitz, mgrang, Quuxplusone, andreamichi, philnik, libcxx-commits, nilayvaish, kristof.beyls
  Differential Revision: https://reviews.llvm.org/D118029
* [libc++] Don't manually override NDEBUG in the dylib build | Louis Dionne | 2022-03-09 | 1 | -1/+0
  LIBCXX_ENABLE_ASSERTIONS does not have any relationship to the `assert` macro -- it only controls assertions that are internal to the library. Playing around with `NDEBUG` only muddies the picture further than it already is.

  Also, remove a failing assertion in the benchmarks. That assertion had never been exercised because we defined `NDEBUG` manually, and it was failing since we introduced the ability to generate a benchmark vector with the Quicksort adversary ordering (which is obviously not sorted).

  This was split off of https://llvm.org/D121123.

  Differential Revision: https://reviews.llvm.org/D121244
* [libc++] Fix modules and benchmarks CI builds when incomplete features are disabled | Louis Dionne | 2022-02-08 | 1 | -0/+5
  Differential Revision: https://reviews.llvm.org/D119036
* [libc++][format] Adds formatter floating-point. | Mark de Wever | 2022-01-24 | 1 | -0/+241
  This properly implements the formatter for floating-point types.

  Completes:
  - P1652R1 Printf corner cases in std::format
  - LWG 3250 std::format: # (alternate form) for NaN and inf
  - LWG 3243 std::format and negative zeroes

  Implements parts of:
  - P0645 Text Formatting

  Reviewed By: #libc, ldionne, vitaut
  Differential Revision: https://reviews.llvm.org/D114001
* [libc++] Fix benchmark failure | Louis Dionne | 2022-01-24 | 1 | -2/+2
* [libc++] [bench] Stop using uniform_int_distribution<char> in benchmarks. | Arthur O'Dwyer | 2022-01-17 | 3 | -7/+5
  Reviewed as part of D114920.
* Re-Reland "[benchmarks] Move libcxx's fork of google/benchmark and llvm/utils'" | Mircea Trofin | 2021-12-07 | 1 | -2/+2
  This reverts commit b2fbd45d2395f1f6ef39db72b7156724fc101e40. D114922 fixed the reason of the 2nd revert. This patch also re-applies 39e9f5d3685f3cfca0df072928ad96d973704dff.

  Differential Revision: https://reviews.llvm.org/D112012
* [libc++][format] Adds formatting benchmarks. | Mark de Wever | 2021-11-28 | 4 | -0/+286
  These benchmarks will be used to test the performance impact of the next set of optimization patches.

  Reviewed By: ldionne, #libc
  Differential Revision: https://reviews.llvm.org/D110501
* Revert "Reland "[benchmarks] Move libcxx's fork of google/benchmark and llvm/utils'""" | Mircea Trofin | 2021-11-16 | 1 | -2/+2
  This reverts commit 1ee32055ea1dd4db70d1939cbd4f5105c2dce160.

  We hit additional bot failures; in particular, Fuchsia's seems to be related to how CMakeLists are ingested, see https://ci.chromium.org/ui/p/fuchsia/builders/toolchain.ci/clang-linux-x64/b8830380874445931681/overview
* Reland "[benchmarks] Move libcxx's fork of google/benchmark and llvm/utils'"" | Mircea Trofin | 2021-11-16 | 1 | -2/+2
  This reverts commit e7568b68da8a216dc22cdc1c6d8903c94096c846 and relands c6f7b720ecfa6db40c648eb05e319f8a817110e9.

  The culprit: we missed that libc also had a dependency on one of the copies of `google-benchmark`.

  Also opportunistically fixed indentation from the previous change.

  Differential Revision: https://reviews.llvm.org/D112012
* Revert "[benchmarks] Move libcxx's fork of google/benchmark and llvm/utils'" | Mircea Trofin | 2021-11-16 | 1 | -2/+2
  This reverts commit c6f7b720ecfa6db40c648eb05e319f8a817110e9.

  Some buildbots are failing, will investigate and reland.

  Example:
  https://lab.llvm.org/buildbot#builders/138/builds/14067
  https://lab.llvm.org/buildbot#builders/73/builds/20159
* [benchmarks] Move libcxx's fork of google/benchmark and llvm/utils' under third-party | Mircea Trofin | 2021-11-16 | 1 | -2/+2
  This change:
  - moves the libcxx copy of `google/benchmark` to `third-party/benchmark`
  - points the 2 uses of the library (libcxx and llvm/utils) to this copy

  We picked the libcxx copy because it is the most up to date.

  Differential Revision: https://reviews.llvm.org/D112012
* [libc++] Add introsort to avoid O(n^2) behavior | Nilay Vaish | 2021-11-16 | 1 | -3/+53
  This commit adds a benchmark that tests std::sort on adversarial inputs, and uses introsort in std::sort to avoid O(n^2) behavior on adversarial inputs.

  For inputs where partitions are unbalanced even after 2 log(n) pivots have been selected, the algorithm switches to heap sort to avoid the possibility of spending O(n^2) time on sorting the input. Benchmark results show that the introsort implementation does significantly better.

  Benchmarking results before this change. Time represents the sorting time required per element:

  ----------------------------------------------------------------------------
  Benchmark                                      Time       CPU    Iterations
  ----------------------------------------------------------------------------
  BM_Sort_uint32_QuickSortAdversary_1         3.75 ns    3.74 ns    187432960
  BM_Sort_uint32_QuickSortAdversary_4         3.05 ns    3.05 ns    231211008
  BM_Sort_uint32_QuickSortAdversary_16        2.45 ns    2.45 ns    288096256
  BM_Sort_uint32_QuickSortAdversary_64        32.8 ns    32.8 ns     21495808
  BM_Sort_uint32_QuickSortAdversary_256        132 ns     132 ns      5505024
  BM_Sort_uint32_QuickSortAdversary_1024       498 ns     497 ns      1572864
  BM_Sort_uint32_QuickSortAdversary_16384     3846 ns    3845 ns       262144
  BM_Sort_uint32_QuickSortAdversary_262144   61431 ns   61400 ns       262144
  BM_Sort_uint64_QuickSortAdversary_1         3.93 ns    3.92 ns    181141504
  BM_Sort_uint64_QuickSortAdversary_4         3.10 ns    3.09 ns    222560256
  BM_Sort_uint64_QuickSortAdversary_16        2.50 ns    2.50 ns    283639808
  BM_Sort_uint64_QuickSortAdversary_64        33.2 ns    33.2 ns     21757952
  BM_Sort_uint64_QuickSortAdversary_256        132 ns     132 ns      5505024
  BM_Sort_uint64_QuickSortAdversary_1024       478 ns     477 ns      1572864
  BM_Sort_uint64_QuickSortAdversary_16384     3932 ns    3930 ns       262144
  BM_Sort_uint64_QuickSortAdversary_262144   61646 ns   61615 ns       262144

  Benchmarking results after this change:

  ----------------------------------------------------------------------------
  Benchmark                                      Time       CPU    Iterations
  ----------------------------------------------------------------------------
  BM_Sort_uint32_QuickSortAdversary_1         6.31 ns    6.30 ns    107741184
  BM_Sort_uint32_QuickSortAdversary_4         4.51 ns    4.50 ns    158859264
  BM_Sort_uint32_QuickSortAdversary_16        3.00 ns    3.00 ns    223608832
  BM_Sort_uint32_QuickSortAdversary_64        44.8 ns    44.8 ns     15990784
  BM_Sort_uint32_QuickSortAdversary_256       69.0 ns    68.9 ns      9961472
  BM_Sort_uint32_QuickSortAdversary_1024       118 ns     118 ns      6029312
  BM_Sort_uint32_QuickSortAdversary_16384      175 ns     175 ns      4194304
  BM_Sort_uint32_QuickSortAdversary_262144     210 ns     210 ns      3407872
  BM_Sort_uint64_QuickSortAdversary_1         6.75 ns    6.73 ns    103809024
  BM_Sort_uint64_QuickSortAdversary_4         4.53 ns    4.53 ns    160432128
  BM_Sort_uint64_QuickSortAdversary_16        2.98 ns    2.97 ns    234356736
  BM_Sort_uint64_QuickSortAdversary_64        44.3 ns    44.3 ns     15990784
  BM_Sort_uint64_QuickSortAdversary_256       69.2 ns    69.2 ns     10223616
  BM_Sort_uint64_QuickSortAdversary_1024       119 ns     119 ns      6029312
  BM_Sort_uint64_QuickSortAdversary_16384      173 ns     173 ns      4194304
  BM_Sort_uint64_QuickSortAdversary_262144     212 ns     212 ns      3407872

  Differential Revision: https://reviews.llvm.org/D113413
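The fallback rule described in this commit (give up on quicksort after roughly 2 log(n) partitioning steps and switch to heap sort) can be sketched as follows. Everything in the `sketch` namespace is hypothetical: the pivot choice, the small-range cutoff of 16, and the three-way partition are assumptions for this example and do not reflect libc++'s actual `std::sort`.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

namespace sketch {

using Iter = std::vector<int>::iterator;

inline void introsort_impl(Iter first, Iter last, std::size_t depth_budget) {
  while (last - first > 16) {
    if (depth_budget == 0) {
      // Too many partitioning steps without finishing: fall back to heap
      // sort, which is O(n log n) regardless of the input order.
      std::make_heap(first, last);
      std::sort_heap(first, last);
      return;
    }
    --depth_budget;
    const int pivot = *(first + (last - first) / 2);
    // Three-way partition: [ < pivot | == pivot | > pivot ]. The middle
    // part is never empty, so both remaining subranges shrink.
    Iter lower = std::partition(first, last,
                                [pivot](int x) { return x < pivot; });
    Iter upper = std::partition(lower, last,
                                [pivot](int x) { return !(pivot < x); });
    introsort_impl(first, lower, depth_budget);
    first = upper;  // loop on the right part instead of recursing
  }
  // Small ranges: insertion sort.
  for (Iter i = first; i != last; ++i)
    std::rotate(std::upper_bound(first, i, *i), i, i + 1);
}

inline void introsort(std::vector<int>& v) {
  // Budget of 2 * floor(log2(n)) partitioning steps.
  std::size_t budget = 0;
  for (std::size_t n = v.size(); n > 1; n >>= 1)
    budget += 2;
  introsort_impl(v.begin(), v.end(), budget);
}

}  // namespace sketch
```

The depth budget is what bounds the worst case: an adversarial input can make each partition step unbalanced, but it can only do so a logarithmic number of times before heap sort takes over.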
* [libc++] Disallow running the libc++ benchmarks in standalone builds | Louis Dionne | 2021-11-11 | 1 | -0/+6
  We are trying to remove duplication of third-party code in https://reviews.llvm.org/D112012, which will move the Google Benchmark code outside of the `libcxx/` directory. That breaks running the benchmarks in the Standalone build.

  Since we have deprecated the Standalone build anyway, this patch just removes support for the benchmark in Standalone mode until we remove that mode entirely.

  Differential Revision: https://reviews.llvm.org/D113503
* [libc++][cmake] Improves benchmark build. | Mark de Wever | 2021-11-09 | 1 | -2/+3
  The CMake dependencies don't properly list the libc++ headers. When a libc++ header is modified, the affected benchmarks aren't rebuilt. This makes testing benchmarks tricky and can lead to accidentally testing without the latest modifications.

  This change causes CMake to determine the proper dependencies.

  This shouldn't affect the CI build.

  Reviewed By: #libc, ldionne
  Differential Revision: https://reviews.llvm.org/D113419
* Ensure newlines at the end of files (NFC) | Kazu Hirata | 2021-10-23 | 1 | -1/+1
* [libc++][format] Implement Unicode support. | Mark de Wever | 2021-10-02 | 1 | -0/+196
  This adds the width estimation functions to the std-format-spec.

  Implements parts of:
  - P0645 Text Formatting
  - P1868 width: clarifying units of width and precision in std::format

  Reviewed By: #libc, ldionne, vitaut
  Differential Revision: https://reviews.llvm.org/D103413
* [libc++] Remove "// -*- C++ -*-" comments from all .cpp files. NFCI. | Arthur O'Dwyer | 2021-10-01 | 1 | -1/+0
  Even if these comments have a benefit in .h files (for editors that care about language but can't be configured to treat .h as C++ code), they certainly have no benefit for files with the .cpp extension.

  Discussed in D110794.
* [libc++] Require a C++20 capable compiler. | Mark de Wever | 2021-09-24 | 1 | -1/+1
  This enforces that libcxx and its benchmarks are compiled by a C++20-capable compiler. Based on review comments in D103413.

  Differential Revision: https://reviews.llvm.org/D110338
* [libc++] Use CMake interface targets to setup benchmark flags | Louis Dionne | 2021-09-23 | 1 | -53/+26
  This is a re-application of da0592e4c8df which was reverted in 1454018dc1d9 because it was incompatible with older CMakes. Instead, disable the benchmarks when CMake is too old to support those idioms.

  Differential Revision: https://reviews.llvm.org/D110285
* [test] Migrate -gcc-toolchain with space separator to --gcc-toolchain= | Fangrui Song | 2021-08-20 | 1 | -1/+1
  Space separated driver options are uncommon but Clang traditionally did not do a good job. --gcc-toolchain= is the preferred form.
* [libcxx][nfc] prefixes test type `input_iterator` with `cpp17_` | Christopher Di Bella | 2021-05-02 | 1 | -1/+1
  C++20 revised the definition of what it means to be an iterator. While all _Cpp17InputIterators_ satisfy `std::input_iterator`, the reverse isn't true. D100271 introduces a new test adaptor to accommodate this new definition (`cpp20_input_iterator`).

  In order to help readers immediately distinguish which input iterator adaptor is _Cpp17InputIterator_, the current `input_iterator` adaptor has been prefixed with `cpp17_`.

  Differential Revision: https://reviews.llvm.org/D101242
* [libc++] Fixes std::to_chars for bases != 10. | Mark de Wever | 2021-04-29 | 1 | -0/+58
  While working on D70631, Microsoft's unit tests discovered an issue. Our `std::to_chars` implementation for bases != 10 uses the range `[first,last)` as temporary buffer. This violates the contract for to_chars:

  [charconv.to.chars]/1 http://eel.is/c++draft/charconv#to.chars-1
  `to_chars_result to_chars(char* first, char* last, see below value, int base = 10);`

  "If the member ec of the return value is such that the value is equal to the value of a value-initialized errc, the conversion was successful and the member ptr is the one-past-the-end pointer of the characters written."

  Our implementation modifies the range `[member ptr, last)`, which causes Microsoft's test to fail. Their test verifies the buffer `[member ptr, last)` is unchanged. (The test is only done when the conversion is successful.)

  While looking at the code I noticed the performance for bases != 10 also is suboptimal. This is tracked in D97705.

  This patch fixes the issue and adds a benchmark. This benchmark will be used as baseline for D97705.

  Reviewed By: #libc, Quuxplusone, zoecarver
  Differential Revision: https://reviews.llvm.org/D100722
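The property that the failing test relied on can be checked with a small sketch like the one below. The function `tail_is_untouched` is a hypothetical helper written for this illustration; it is not the Microsoft test or the test added in the patch.

```cpp
#include <charconv>
#include <cstddef>
#include <system_error>

// Hypothetical checker: returns true when std::to_chars succeeds and leaves
// every byte in [result.ptr, last) exactly as it was before the call.
inline bool tail_is_untouched(unsigned value, int base) {
  constexpr char sentinel = '#';  // never produced by to_chars
  char buffer[64];
  for (char& c : buffer)
    c = sentinel;

  char* const last = buffer + sizeof(buffer);
  const std::to_chars_result result = std::to_chars(buffer, last, value, base);
  if (result.ec != std::errc{})
    return false;
  // The written characters end at result.ptr; everything after it should
  // still hold the sentinel.
  for (const char* p = result.ptr; p != last; ++p)
    if (*p != sentinel)
      return false;
  return true;
}
```

An implementation that uses the output range as scratch space, as described in the commit message, would be expected to fail this check for bases other than 10.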
* [libc++] Support per-target __config_site in per-target runtime build | Petr Hosek | 2021-04-28 | 1 | -0/+4
  When using the per-target runtime build, it may be desirable to have different __config_site headers for each target where all targets cannot share a single configuration.

  The layout used for libc++ headers after this change is:

  ```
  include/
    c++/
      v1/
        <libc++ headers except for __config_site>
    <target1>/
      c++/
        v1/
          __config_site
    <target2>/
      c++/
        v1/
          __config_site
    <other targets>
  ```

  This is the most optimal layout since it avoids duplication: the only header that's per-target is __config_site, all other headers are shared across targets. This also means that we now need two -isystem flags: one for the target-agnostic headers and one for the target specific headers.

  Differential Revision: https://reviews.llvm.org/D89013
* [libc++] Include <__config_site> from <__config> | Louis Dionne | 2021-03-30 | 1 | -2/+1
  Prior to this patch, we would generate a fancy <__config> header by concatenating <__config_site> and <__config>. This complicates the build system and also increases the difference between what's tested and what's actually installed.

  This patch removes that complexity and instead simply installs <__config_site> alongside the libc++ headers. <__config_site> is then included by <__config>, which is much simpler. Doing this also opens the door to having different <__config_site> headers depending on the target, which was impossible before.

  It does change the workflow for testing header-only changes to libc++. Previously, we would run `lit` against the headers in libcxx/include. After this patch, we run it against a fake installation root of the headers (containing a proper <__config_site> header). This brings us closer to testing what we actually install, which is good, however it does mean that we have to update that root before testing header changes. Thus, we now need to run `ninja check-cxx-deps` before running `lit` by hand.

  Differential Revision: https://reviews.llvm.org/D97572
* Revert "[libc++] Require C++20 to build the benchmarks." | Mark de Wever | 2021-02-09 | 1 | -2/+2
  There are build bots without C++20 support building the benchmarks.

  This reverts commit 34acc91642440b8e4bad17acfdbb1314c8f2043e.
* [libc++] Require C++20 to build the benchmarks. | Mark de Wever | 2021-02-09 | 1 | -2/+2
  Some work-in-progress patches for the format header contain benchmarks. The format header requires C++20 to build. This is a preparation to make it easy to add these benchmarks.

  Reviewed By: ldionne, #libc
  Differential Revision: https://reviews.llvm.org/D96057