| Commit message | Author | Age | Files | Lines |
| |
The original `__dynamic_cast` implementation does not use the ABI-provided `src2dst_offset` parameter, which carries hints that can speed up the hot paths. This patch uses those hints to improve performance on the hot paths in `__dynamic_cast`. It also adds a performance benchmark suite for the `__dynamic_cast` implementation.
Reviewed By: philnik, ldionne, #libc, #libc_abi, avogelsgesang
Spies: mikhail.ramalho, avogelsgesang, xingxue, libcxx-commits
Differential Revision: https://reviews.llvm.org/D138005
| |
Don't forward to `min_element` for small types that are trivially copyable, and instead use a naive loop that keeps track of the smallest element (as opposed to an iterator to the smallest element). This allows the compiler to vectorize the loop in some cases.
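As a rough illustration of the idea (not the code from this patch), the loop below tracks the smallest value directly instead of an iterator to it. The name `min_value` and the pointer-based interface are assumptions made for this sketch only.

```cpp
#include <type_traits>

// Hypothetical helper, not libc++'s internal code: returns the smallest
// value of a non-empty range by tracking the value itself rather than an
// iterator to it.
template <class T>
T min_value(const T* first, const T* last) {
  static_assert(std::is_trivially_copyable_v<T>,
                "this sketch targets small, cheap-to-copy types");
  T smallest = *first; // precondition: first != last
  for (++first; first != last; ++first) {
    // Selecting between two values avoids carrying a pointer across
    // iterations, which gives the compiler a chance to vectorize.
    smallest = *first < smallest ? *first : smallest;
  }
  return smallest;
}
```

Whether the loop is actually vectorized depends on the compiler, the target and the element type.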
Reviewed By: #libc, ldionne
Spies: ldionne, libcxx-commits
Differential Revision: https://reviews.llvm.org/D143596
| |
Reviewed By: #libc, ldionne
Spies: ldionne, Mordante, libcxx-commits
Differential Revision: https://reviews.llvm.org/D139554
| |
The implementation makes use of the freedom added by LWG 3410. We have
two variants of this algorithm:
* a fast path for random access iterators: This fast path computes the
maximum number of loop iterations up-front and does not compare the
iterators against their limits on every loop iteration.
* A basic implementation for all other iterators: This implementation
compares the iterators against their limits in every loop iteration.
However, it still takes advantage of the freedom added by LWG 3410 to
avoid unnecessary additional iterator comparisons, as originally
specified by P1614R2.
https://godbolt.org/z/7xbMEen5e shows the benefit of the fast path:
The hot loop generated for `lexicographical_compare_three_way3` is
tighter than the one for `lexicographical_compare_three_way1`. The added
benchmark illustrates how this leads to a 30% - 50% performance
improvement on integer vectors.
Implements part of P1614R2 "The Mothership has Landed"
Fixes LWG 3410 and LWG 3350
Differential Revision: https://reviews.llvm.org/D131395
| |
Reviewed By: ldionne, #libc
Spies: libcxx-commits
Differential Revision: https://reviews.llvm.org/D138413
| |
This has multiple benefits:
- The optimizations are also performed for the `ranges::` versions of the algorithms
- Code duplication is reduced
- It is simpler to add this optimization for other segmented iterators,
like `ranges::join_view::iterator`
- Algorithm code is removed from `<deque>`
Reviewed By: ldionne, huixie90, #libc
Spies: mstorsjo, sstefan1, EricWF, libcxx-commits, mgorny
Differential Revision: https://reviews.llvm.org/D132505
| |
Bumping down is significantly faster than bumping up. This is ABI breaking, but the ABI of `pmr::monotonic_buffer_resource` was only stabilized in this release cycle, so we can still change it.
For a more detailed explanation why bumping down is better, see https://fitzgeraldnick.com/2019/11/01/always-bump-downwards.html.
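To illustrate the technique (this is a sketch, not the `pmr::monotonic_buffer_resource` code), a bump-down arena subtracts the size and then rounds the address down with a single mask. The type `BumpDownArena` is hypothetical, and it returns `nullptr` on exhaustion instead of falling back to an upstream resource.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical minimal arena that hands out memory from the end of a
// buffer toward its beginning.
struct BumpDownArena {
  char* begin; // lowest usable address
  char* cur;   // current top; allocations move this toward begin

  BumpDownArena(char* buffer, std::size_t size)
      : begin(buffer), cur(buffer + size) {}

  // align must be a power of two.
  void* allocate(std::size_t bytes, std::size_t align) {
    const auto low = reinterpret_cast<std::uintptr_t>(begin);
    auto addr = reinterpret_cast<std::uintptr_t>(cur);
    if (addr - low < bytes)
      return nullptr;                                   // not enough room
    addr -= bytes;                                      // bump down
    addr &= ~(static_cast<std::uintptr_t>(align) - 1);  // round down to align
    if (addr < low)
      return nullptr;                                   // alignment pushed us out
    cur = begin + (addr - low);
    return cur;
  }
};
```

Rounding down is a single mask, whereas bumping up needs an add followed by a mask to round the pointer up before the size is added.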
Reviewed By: ldionne, #libc
Spies: libcxx-commits
Differential Revision: https://reviews.llvm.org/D141435
| |
Differential Revision: https://reviews.llvm.org/D132180
| |
This implements the Grapheme clustering as required by
P1868R2 width: clarifying units of width and precision in std::format
This was omitted in the initial patch, but the paper was marked as completed. This really completes the paper.
Reviewed By: ldionne, #libc
Differential Revision: https://reviews.llvm.org/D126971
| |
In particular remove the ability to expel incomplete features from the
library at configure-time, since this can now be done through the
_LIBCPP_ENABLE_EXPERIMENTAL macro.
Also, never provide symbols related to incomplete features inside the
dylib, instead provide them in c++experimental.a (this changes the
symbols list, but not for any configuration that should have shipped).
Differential Revision: https://reviews.llvm.org/D128928
| |
This is a preparation to look at possible performance improvements.
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D129421
| |
Currently, a container's key_eq may be called when rehashing happens, which is redundant for containers with unique keys.
Reviewed By: #libc, philnik, Mordante
Differential Revision: https://reviews.llvm.org/D128021
| |
- `ranges::make_heap`;
- `ranges::push_heap`;
- `ranges::pop_heap`;
- `ranges::sort_heap`.
Differential Revision: https://reviews.llvm.org/D128115
| |
Differential Revision: https://reviews.llvm.org/D127834
| |
The optimization level used when building the benchmarks should
match the optimization level of the current build. Otherwise, we
can end up mixing an -O3 or -O0 optimized dylib with benchmarks
built with -O2, which is really misleading.
Differential Revision: https://reviews.llvm.org/D127987
| |
Differential Revision: https://reviews.llvm.org/D127557
| |
Implements the compile-time checking of the formatting arguments.
Completes:
- P2216 std::format improvements
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D121530
| |
upper}_bound
Reviewed By: Mordante, var-const, ldionne, #libc
Spies: sstefan1, libcxx-commits, mgorny
Differential Revision: https://reviews.llvm.org/D121964
| |
It was forgotten in D124740.
Differential revision: https://reviews.llvm.org/D126297
| |
conditional checks.
Abseil benchmarks suggest that the conditional checks result in faster code (4-5x)
as they are compiled into conditional move instructions (cmov on x86).
Reviewed By: #libc, philnik, Mordante
Spies: pengfei, Mordante, philnik, libcxx-commits
Differential Revision: https://reviews.llvm.org/D125329
| |
Reviewed By: ldionne, #libc
Spies: libcxx-commits, mgorny, mgrang
Differential Revision: https://reviews.llvm.org/D124740
| |
Standalone builds were deprecated and then removed a while ago.
Trying to use standalone builds leads to a fatal CMake error, so this
code is all dead. Remove it to clean things up.
Differential Revision: https://reviews.llvm.org/D125561
| |
Reviewed By: var-const, #libc, ldionne
Spies: sstefan1, ldionne, BRevzin, libcxx-commits, mgorny
Differential Revision: https://reviews.llvm.org/D120637
| |
We are introducing branchless variants for sort3, sort4 and sort5.
These sorting functions have been generated using Reinforcement
Learning and aim to replace __sort3, __sort4 and __sort5 variants
for integral types.
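For illustration only, here is a hand-written branchless sort3 built from a plain three-element sorting network. It is not the generated instruction sequence from this patch, and the names `cond_swap` and `sort3_branchless` are hypothetical.

```cpp
#include <cstdint>

// Hypothetical compare-exchange expressed as two selects. Compilers can
// typically lower these to conditional moves instead of a branch.
inline void cond_swap(std::uint32_t& a, std::uint32_t& b) {
  const bool swap_needed = b < a;
  const std::uint32_t lo = swap_needed ? b : a;
  const std::uint32_t hi = swap_needed ? a : b;
  a = lo;
  b = hi;
}

// Sorts three integers in ascending order with a fixed sequence of
// compare-exchanges, independent of the input values.
inline void sort3_branchless(std::uint32_t& x, std::uint32_t& y, std::uint32_t& z) {
  cond_swap(y, z); // now y <= z
  cond_swap(x, z); // z holds the maximum
  cond_swap(x, y); // order the remaining two
}
```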
The libc++ benchmarks were run on isolated machines for Skylake, ARM and
AMD architectures and achieve statistically significant improvement in
sorting random integers on test cases from sort1 to sort262144 for
uint32 and uint64.
A full performance overview for Intel Skylake, AMD and Arm can be
found here: https://bit.ly/3AtesYf
Reviewed By: ldionne, #libc, philnik
Spies: daniel.mankowitz, mgrang, Quuxplusone, andreamichi, philnik, libcxx-commits, nilayvaish, kristof.beyls
Differential Revision: https://reviews.llvm.org/D118029
| |
LIBCXX_ENABLE_ASSERTIONS does not have any relationship to the `assert`
macro -- it only controls assertions that are internal to the library.
Playing around with `NDEBUG` only muddies the picture further than it
already is.
Also, remove a failing assertion in the benchmarks. That assertion had
never been exercised because we defined `NDEBUG` manually, and it was
failing since we introduced the ability to generate a benchmark vector
with the Quicksort adversary ordering (which is obviously not sorted).
This was split off of https://llvm.org/D121123.
Differential Revision: https://reviews.llvm.org/D121244
| |
disabled
Differential Revision: https://reviews.llvm.org/D119036
| |
This properly implements the formatter for floating-point types.
Completes:
- P1652R1 Printf corner cases in std::format
- LWG 3250 std::format: # (alternate form) for NaN and inf
- LWG 3243 std::format and negative zeroes
Implements parts of:
- P0645 Text Formatting
Reviewed By: #libc, ldionne, vitaut
Differential Revision: https://reviews.llvm.org/D114001
| |
| |
Reviewed as part of D114920.
| |
This reverts commit b2fbd45d2395f1f6ef39db72b7156724fc101e40. D114922
fixed the cause of the 2nd revert.
This patch also re-applies 39e9f5d3685f3cfca0df072928ad96d973704dff.
Differential Revision: https://reviews.llvm.org/D112012
| |
These benchmarks will be used to test the performance impact of the next
set of optimization patches.
Reviewed By: ldionne, #libc
Differential Revision: https://reviews.llvm.org/D110501
| |
llvm/utils'"""
This reverts commit 1ee32055ea1dd4db70d1939cbd4f5105c2dce160.
We hit additional bot failures; in particular, Fuchsia's seems to be
related to how CMakeLists are ingested, see https://ci.chromium.org/ui/p/fuchsia/builders/toolchain.ci/clang-linux-x64/b8830380874445931681/overview
| |
This reverts commit e7568b68da8a216dc22cdc1c6d8903c94096c846 and relands
c6f7b720ecfa6db40c648eb05e319f8a817110e9.
The culprit was: missed that libc also had a dependency on one of the
copies of `google-benchmark`
Also opportunistically fixed indentation from prev. change.
Differential Revision: https://reviews.llvm.org/D112012
| |
This reverts commit c6f7b720ecfa6db40c648eb05e319f8a817110e9.
Some buildbots are failing, will investigate and reland.
Example:
https://lab.llvm.org/buildbot#builders/138/builds/14067
https://lab.llvm.org/buildbot#builders/73/builds/20159
| |
under third-party
This change:
- moves the libcxx copy of `google/benchmark` to
`third-party/benchmark`
- points the 2 uses of the library (libcxx and llvm/utils) to this copy
We picked the libcxx copy because it is the most up to date.
Differential Revision: https://reviews.llvm.org/D112012
| |
This commit adds a benchmark that tests std::sort on adversarial inputs,
and uses introsort in std::sort to avoid O(n^2) behavior on adversarial
inputs.
For inputs where partitions are still unbalanced after 2 log(n) pivots
have been selected, the algorithm switches to heap sort to avoid the
possibility of spending O(n^2) time on sorting the input.
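The control flow described above can be sketched roughly as follows. This is a simplified illustration under assumed names (`sketch::introsort`, `introsort_impl`), with a middle-element pivot and a three-way partition. It is not the libc++ implementation.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

namespace sketch {

// Hypothetical helper: quicksort on v[lo, hi) that falls back to heap sort
// once the budget of partitioning levels is used up.
inline void introsort_impl(std::vector<unsigned>& v, std::size_t lo,
                           std::size_t hi, unsigned depth_budget) {
  while (hi - lo > 1) {
    auto first = v.begin() + static_cast<std::ptrdiff_t>(lo);
    auto last = v.begin() + static_cast<std::ptrdiff_t>(hi);
    if (depth_budget == 0) {
      // Too many unbalanced partitions: heap sort bounds this range
      // at O(n log n).
      std::make_heap(first, last);
      std::sort_heap(first, last);
      return;
    }
    --depth_budget;
    const unsigned pivot = v[lo + (hi - lo) / 2];
    // After the two passes: [first, mid1) < pivot, [mid1, mid2) == pivot,
    // [mid2, last) > pivot. The middle part is never empty.
    auto mid1 = std::partition(first, last,
                               [pivot](unsigned x) { return x < pivot; });
    auto mid2 = std::partition(mid1, last,
                               [pivot](unsigned x) { return !(pivot < x); });
    // Recurse into the left part, loop on the right part.
    introsort_impl(v, lo, static_cast<std::size_t>(mid1 - v.begin()),
                   depth_budget);
    lo = static_cast<std::size_t>(mid2 - v.begin());
  }
}

inline void introsort(std::vector<unsigned>& v) {
  unsigned log2n = 0;
  for (std::size_t n = v.size(); n > 1; n >>= 1)
    ++log2n;
  introsort_impl(v, 0, v.size(), 2 * log2n); // budget of 2 log(n) levels
}

} // namespace sketch
```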
Benchmark results show that the intro sort implementation does
significantly better.
Benchmarking results before this change. Time represents the sorting
time required per element:
-------------------------------------------------------------------------
Benchmark                                     Time        CPU  Iterations
-------------------------------------------------------------------------
BM_Sort_uint32_QuickSortAdversary_1        3.75 ns    3.74 ns   187432960
BM_Sort_uint32_QuickSortAdversary_4        3.05 ns    3.05 ns   231211008
BM_Sort_uint32_QuickSortAdversary_16       2.45 ns    2.45 ns   288096256
BM_Sort_uint32_QuickSortAdversary_64       32.8 ns    32.8 ns    21495808
BM_Sort_uint32_QuickSortAdversary_256       132 ns     132 ns     5505024
BM_Sort_uint32_QuickSortAdversary_1024      498 ns     497 ns     1572864
BM_Sort_uint32_QuickSortAdversary_16384    3846 ns    3845 ns      262144
BM_Sort_uint32_QuickSortAdversary_262144  61431 ns   61400 ns      262144
BM_Sort_uint64_QuickSortAdversary_1        3.93 ns    3.92 ns   181141504
BM_Sort_uint64_QuickSortAdversary_4        3.10 ns    3.09 ns   222560256
BM_Sort_uint64_QuickSortAdversary_16       2.50 ns    2.50 ns   283639808
BM_Sort_uint64_QuickSortAdversary_64       33.2 ns    33.2 ns    21757952
BM_Sort_uint64_QuickSortAdversary_256       132 ns     132 ns     5505024
BM_Sort_uint64_QuickSortAdversary_1024      478 ns     477 ns     1572864
BM_Sort_uint64_QuickSortAdversary_16384    3932 ns    3930 ns      262144
BM_Sort_uint64_QuickSortAdversary_262144  61646 ns   61615 ns      262144
Benchmarking results after this change:
-------------------------------------------------------------------------
Benchmark                                     Time        CPU  Iterations
-------------------------------------------------------------------------
BM_Sort_uint32_QuickSortAdversary_1        6.31 ns    6.30 ns   107741184
BM_Sort_uint32_QuickSortAdversary_4        4.51 ns    4.50 ns   158859264
BM_Sort_uint32_QuickSortAdversary_16       3.00 ns    3.00 ns   223608832
BM_Sort_uint32_QuickSortAdversary_64       44.8 ns    44.8 ns    15990784
BM_Sort_uint32_QuickSortAdversary_256      69.0 ns    68.9 ns     9961472
BM_Sort_uint32_QuickSortAdversary_1024      118 ns     118 ns     6029312
BM_Sort_uint32_QuickSortAdversary_16384     175 ns     175 ns     4194304
BM_Sort_uint32_QuickSortAdversary_262144    210 ns     210 ns     3407872
BM_Sort_uint64_QuickSortAdversary_1        6.75 ns    6.73 ns   103809024
BM_Sort_uint64_QuickSortAdversary_4        4.53 ns    4.53 ns   160432128
BM_Sort_uint64_QuickSortAdversary_16       2.98 ns    2.97 ns   234356736
BM_Sort_uint64_QuickSortAdversary_64       44.3 ns    44.3 ns    15990784
BM_Sort_uint64_QuickSortAdversary_256      69.2 ns    69.2 ns    10223616
BM_Sort_uint64_QuickSortAdversary_1024      119 ns     119 ns     6029312
BM_Sort_uint64_QuickSortAdversary_16384     173 ns     173 ns     4194304
BM_Sort_uint64_QuickSortAdversary_262144    212 ns     212 ns     3407872
Differential Revision: https://reviews.llvm.org/D113413
| |
We are trying to remove duplication of third-party code in
https://reviews.llvm.org/D112012, which will move the Google
Benchmark code outside of the `libcxx/` directory. That breaks
running the benchmarks in the Standalone build. Since we have
deprecated the Standalone build anyway, this patch just removes
support for the benchmark in Standalone mode until we remove that
mode entirely.
Differential Revision: https://reviews.llvm.org/D113503
| |
The CMake dependencies don't properly list the libc++ headers. When a
libc++ header is modified the affected benchmarks aren't rebuilt. This
makes testing benchmarks tricky and can lead to accidentally testing
without the latest modifications. This change causes CMake to
determine the proper dependencies.
This shouldn't affect the CI build.
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D113419
| |
| |
This adds the width estimation functions to the std-format-spec.
Implements parts of:
- P0645 Text Formatting
- P1868 width: clarifying units of width and precision in std::format
Reviewed By: #libc, ldionne, vitaut
Differential Revision: https://reviews.llvm.org/D103413
| |
Even if these comments have a benefit in .h files (for editors that
care about language but can't be configured to treat .h as C++ code),
they certainly have no benefit for files with the .cpp extension.
Discussed in D110794.
| |
This enforces that libcxx and its benchmarks are compiled by a C++20 capable
compiler. Based on review comments in D103413.
Differential Revision: https://reviews.llvm.org/D110338
| |
This is a re-application of da0592e4c8df which was reverted in
1454018dc1d9 because it was incompatible with older CMakes.
Instead, disable the benchmarks when CMake is too old to
support those idioms.
Differential Revision: https://reviews.llvm.org/D110285
| |
Space-separated driver options are uncommon, and Clang traditionally
did not handle them well. --gcc-toolchain= is the preferred form.
| |
C++20 revised the definition of what it means to be an iterator. While
all _Cpp17InputIterators_ satisfy `std::input_iterator`, the reverse
isn't true. D100271 introduces a new test adaptor to accommodate this
new definition (`cpp20_input_iterator`).
In order to help readers immediately distinguish which input iterator
adaptor is _Cpp17InputIterator_, the current `input_iterator` adaptor
has been prefixed with `cpp17_`.
Differential Revision: https://reviews.llvm.org/D101242
| |
While working on D70631, Microsoft's unit tests discovered an issue.
Our `std::to_chars` implementation for bases != 10 uses the range
`[first,last)` as temporary buffer. This violates the contract for
to_chars:
[charconv.to.chars]/1 http://eel.is/c++draft/charconv#to.chars-1
`to_chars_result to_chars(char* first, char* last, see below value, int base = 10);`
"If the member ec of the return value is such that the value is equal to
the value of a value-initialized errc, the conversion was successful and
the member ptr is the one-past-the-end pointer of the characters
written."
Our implementation modifies the range `[member ptr, last)`, which causes
Microsoft's test to fail. Their test verifies the buffer
`[member ptr, last)` is unchanged. (The test is only done when the
conversion is successful.)
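A check in the spirit of the one described above could look like the sketch below. It is not Microsoft's actual test. The helper name `to_chars_leaves_tail_untouched` and the `'#'` sentinel are assumptions made for illustration.

```cpp
#include <array>
#include <charconv>
#include <system_error>

// Hypothetical check: fill the buffer with a sentinel, convert, then
// verify that nothing in [result.ptr, last) was overwritten.
inline bool to_chars_leaves_tail_untouched(unsigned value, int base) {
  std::array<char, 64> buf;
  buf.fill('#'); // '#' is never produced as a digit by to_chars
  char* const last = buf.data() + buf.size();
  const std::to_chars_result r = std::to_chars(buf.data(), last, value, base);
  if (r.ec != std::errc{})
    return false; // the contract is only checked for successful conversions
  for (const char* p = r.ptr; p != last; ++p) {
    if (*p != '#')
      return false; // the implementation used the tail as scratch space
  }
  return true;
}
```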
While looking at the code I noticed the performance for bases != 10 also
is suboptimal. This is tracked in D97705.
This patch fixes the issue and adds a benchmark. This benchmark will be
used as baseline for D97705.
Reviewed By: #libc, Quuxplusone, zoecarver
Differential Revision: https://reviews.llvm.org/D100722
| |
When using the per-target runtime build, it may be desirable to have
different __config_site headers for each target where all targets cannot
share a single configuration.
The layout used for libc++ headers after this change is:
```
include/
c++/
v1/
<libc++ headers except for __config_site>
<target1>/
c++/
v1/
__config_site
<target2>/
c++/
v1/
__config_site
<other targets>
```
This layout is preferable since it avoids duplication: the only
header that's per-target is __config_site, and all other headers are
shared across targets. This also means that we now need two
-isystem flags: one for the target-agnostic headers and one for
the target-specific headers.
Differential Revision: https://reviews.llvm.org/D89013
| |
Prior to this patch, we would generate a fancy <__config> header by
concatenating <__config_site> and <__config>. This complicates the
build system and also increases the difference between what's tested
and what's actually installed.
This patch removes that complexity and instead simply installs <__config_site>
alongside the libc++ headers. <__config_site> is then included by <__config>,
which is much simpler. Doing this also opens the door to having different
<__config_site> headers depending on the target, which was impossible before.
It does change the workflow for testing header-only changes to libc++.
Previously, we would run `lit` against the headers in libcxx/include.
After this patch, we run it against a fake installation root of the
headers (containing a proper <__config_site> header). This brings us
closer to testing what we actually install, which is good; however, it
does mean that we have to update that root before testing header changes.
Thus, we now need to run `ninja check-cxx-deps` before running `lit` by
hand.
Differential Revision: https://reviews.llvm.org/D97572
| |
There are build bots without C++20 support building the benchmarks.
This reverts commit 34acc91642440b8e4bad17acfdbb1314c8f2043e.
| |
Some work-in-progress patches for the format header contain benchmarks.
The format header requires C++20 to build. This is a preparation to make
it easy to add these benchmarks.
Reviewed By: ldionne, #libc
Differential Revision: https://reviews.llvm.org/D96057