| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
VMASKMOVDQU supports 32bit/64bit version in 64bitmode, previously we prefer to use VMASKMOVDQU64 in 64bitmode because the 32bit one need 0x67 prefix.
After D150436, asm match table changed a little, which makes in 64bit mode "vmaskmovdqu %xmm0, %xmm1" will match VMASKMOVDQU other than VMASKMOVDQU64, this patch correct the asm match order for this instruction.
Reviewed By: skan
Differential Revision: https://reviews.llvm.org/D150835
|
|
|
|
|
|
|
|
| |
The last use was removed by:
commit e98944ed47acd04279184343017aa2bf34999111
Author: Stanislav Mekhanoshin <Stanislav.Mekhanoshin@amd.com>
Date: Mon Mar 11 17:04:35 2019 +0000
|
|
|
|
| |
X86EncodingOptimization.cpp, NFCI
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When in ColumnLimit 0, the formatter looks for MustBreakBefore in the
line in order to check if a line is allowed to be merged onto one line.
However, since MustBreakBefore is really a property of the gap between
the token and the one previously, I belive the check is erroneous in
checking all the tokens in a line, since whether the previous line ended
with a forced line break should have no effect on whether the current
line is allowed to merge with the next one.
This patch changes the check to skip the first token in
`LineJoiner.containsMustBreak`.
This patch also changes a test, which is not ideal, but I believe the
test also suffered from this bug. The test case in question sets
AllowShortFunctionsOnASingleLine to "Empty", but the empty function in
said test case isn't merged to a single line, because of the very same
bug this patch fixes.
Fixes https://github.com/llvm/llvm-project/issues/62721
Reviewed By: HazardyKnusperkeks, owenpan, MyDeveloperDay
Differential Revision: https://reviews.llvm.org/D150614
|
|
|
|
|
|
|
|
|
|
|
| |
Apply the check for the constraint `C1172` to `unlock-stmt`,
`change-team-stmt`, `end-team-stmt`, and `critical-stmt`, which
all have `sync-stat-lists` and so `C1172` applies to them. Add
a test to check the `sync-stat-lists` for these 4 statements.
Reviewed By: PeteSteinfeld
Differential Revision: https://reviews.llvm.org/D150745
|
|
|
|
|
|
|
|
| |
This patch is similar to https://reviews.llvm.org/D150550, it adds accurate predicates for SDNode patterns depending on vector type.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D150754
|
|
|
|
| |
Preparatory change for D148094.
|
|
|
|
|
|
|
|
|
| |
We should specify that this is the pid_t as defined in the lldb
namespace, not some other pid_t. This doesn't really affect builds but
it makes writing tooling against the SBAPI easier.
I have verified that this does not change the emitted mangled name and
does not break ABI.
|
|
|
|
|
|
|
|
|
| |
For some reason, even though D150822 passed the buildbot, it failed to
catch this test
Reviewed By: anlunx
Differential Revision: https://reviews.llvm.org/D150830
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
After a07b135ce0c0111bd83450b5dc29ef0381cdbc39, we always pass
-coverage-notes-file/-coverage-data-file for driver options
-ftest-coverage/-fprofile-arcs/--coverage. As a bonus, we can make the following
simplification to cc1 options:
* `-ftest-coverage -coverage-notes-file a.gcno` => `-coverage-notes-file a.gcno`
* `-fprofile-arcs -coverage-data-file a.gcda` => `-coverage-data-file a.gcda`
and remove EmitCovNotes/EmitCovArcs.
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch changes the `Process` struct to only provide the functions
expected to be visible by the interface. So, now we only export the
open, reset, and size query functions. This prevents users of the
interface from messing with the internals of the process, so now the
only existing failure mode is mismatched send and recieve calls.
Reviewed By: michaelrj
Differential Revision: https://reviews.llvm.org/D150703
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The tools `amdgpu-arch` and `nvptx-arch` are used to query the supported
GPUs on a system to implement features like `--offload-arch=native` as
well as generally being useful for setting up tests. However, we
currently directly link these if they are availible. This patch removes
this because it causes many problems on the user not having the libaries
present or misconfigured at build time. Since these are built
unconditionally we shoudl keep the dependencies away from clang.
Fixes https://github.com/llvm/llvm-project/issues/62784
Reviewed By: ye-luo
Differential Revision: https://reviews.llvm.org/D150807
|
|
|
|
| |
masked gather/scatter
|
| |
|
| |
|
|
|
|
|
|
| |
This reverts commit 3b6f7e45a20990fdbc2b43dc08457fc79d53bd39.
See comments on D150645.
|
|
|
|
|
|
|
|
| |
The GPU tests weren't updated when rebasing D150330, so this patch fixes that.
Reviewed By: anlunx
Differential Revision: https://reviews.llvm.org/D150822
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Current assert wiht /EHa:
A single unwind edge may only enter one EH pad
invoke void @llvm.seh.try.begin()
to label %invoke.cont1 unwind label %catch.dispatch2
Current IR:
%1 = catchpad within %0 [ptr null, i32 0, ptr null]
invoke void @llvm.seh.try.begin()
to label %invoke.cont5 unwind label %catch.dispatch2
The problem is the invoke to llvm.seh.try.begin() missing "funclet"
Accodring: https://llvm.org/docs/LangRef.html#ob-funclet
If any "funclet" EH pads have been entered but not exited (per the
description in the EH doc), it is undefined behavior to execute a
call or invoke.
To fix the problem, when emit seh_try_begin, call EmitSehTryScopeBegin,
instead of calling EmitRuntimeCallOrInvoke for proper "funclet"
gerenration.
Differential Revision: https://reviews.llvm.org/D150340
|
|
|
|
|
| |
Test doesn't require asserts.
Remove a CHECK line in preparation for an upcoming change.
|
|
|
|
|
|
|
|
|
|
|
|
| |
lexical contexts
Substitute constraints only for declarations with different lexical contexts.
This results in avoiding the substitution of constraints during the redeclaration check
inside a class (and by product caching the wrong substitution result).
Test plan: ninja check-all
Differential revision: https://reviews.llvm.org/D150730
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit is part of the migration of towards the new STEA syntax/design. In particular, this commit includes the following changes:
* Renaming compiler-internal functions/methods:
* `SparseTensorEncodingAttr::{getDimLevelType => getLvlTypes}`
* `Merger::{getDimLevelType => getLvlType}` (for consistency)
* `sparse_tensor::{getDimLevelType => buildLevelType}` (to help reduce confusion vs actual getter methods)
* Renaming external facets to match:
* the STEA parser and printer
* the C and Python bindings
* PyTACO
However, the actual renaming of the `DimLevelType` itself (along with all the "dlt" names) will be handled in a separate commit.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D150330
|
|
|
|
|
|
|
|
|
|
| |
This function is used to add unit test and hermetic test framework libraries.
It avoids the duplicated code to add compile options to each every test
framework libraries.
Reviewed By: jhuber6
Differential Revision: https://reviews.llvm.org/D150727
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The `irdl.any_of` operation represent a constraint that is satisfied
if any of its subconstraint is satisfied.
For instance, in the following example:
```
%0 = irdl.is f32
%1 = irdl.is f64
%2 = irdl.any_of(f32, f64)
```
`%2` can only be satisfied by `f32` or `f64`.
Note that the verification algorithm required by `irdl.any_of` is
non-trivial, since we want that the order of arguments of
`irdl.any_of` to not matter. For this reason, our registration
algorithm fails if two constraints used by `any_of` might be
satisfied by the same `Attribute`. This is approximated by checking
the possible `Attribute` bases of each constraints.
Depends on D145734
Reviewed By: Mogball
Differential Revision: https://reviews.llvm.org/D145735
|
|
|
|
|
|
|
| |
- Change header guard after 147a61618989
- Fix file header
- Remove the `_MutexType` template parameter, I did not see this used
anywhere on llvm.org or on Apple's downstream forks.
|
|
|
|
|
|
|
|
|
| |
interceptors [NFC]'
It was reported in https://reviews.llvm.org/D150708 that my patch has broken
stage2/hwasan check: https://lab.llvm.org/buildbot/#/builders/236/builds/4069
Reverting that patch (and the followup fixes) until I can investigate this further
|
|
|
|
|
|
| |
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D150733
|
|
|
|
| |
This seems better suited for Utility than Core
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Allow parsing GPU-side variadic functions when we're compiling with CUDA-9 or
newer. We still do not allow accessing variadic arguments.
CUDA-9 was the version which introduced PTX-6.0 which allows implementing
variadic functions, so older versions can't have variadics in principle.
This is required for dealing with headers in recent CUDA versions that rely on
variadic function declarations in some of the templated code in libcu++.
E.g. https://github.com/llvm/llvm-project/issues/58410
Differential Revision: https://reviews.llvm.org/D150718
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Resolve a FIXME.
When -ftest-profile, -fprofile-arcs, or --coverage is specified and the
driver performs both compilation and linking phases, derive the .gcno &
.gcda file names from -o and -dumpdir.
`clang --coverage d/a.c d/b.c -o e/x && e/x` will now emit
`e/x-[ab].gc{no,da}` like GCC.
For -fprofile-dir=, we make the deliberate decision to not mangle the
input filename if relative.
The following script demonstrates the .gcno and .gcda filenames.
```
PATH=/tmp/Rel/bin:$PATH # adapt according to your build directory
mkdir -p d e f
echo 'int main() {}' > d/a.c
echo > d/b.c
a() { rm $@ || exit 1; }
clang --coverage d/a.c d/b.c && ./a.out
a a-[ab].gc{no,da}
clang --coverage d/a.c d/b.c -o e/x && e/x
a e/x-[ab].gc{no,da}
clang --coverage d/a.c d/b.c -o e/x -dumpdir f/ && e/x
a f/[ab].gc{no,da}
clang --coverage -fprofile-dir=f d/a.c d/b.c -o e/x && e/x
a e/x-[ab].gcno f/e/x-[ab].gcda
clang -c --coverage d/a.c d/b.c && clang --coverage a.o b.o && ./a.out
a [ab].gc{no,da}
clang -c --coverage -fprofile-dir=f d/a.c d/b.c && clang --coverage a.o b.o && ./a.out
a [ab].gcno f/[ab].gcda
clang -c --coverage d/a.c -o e/xa.o && clang --coverage e/xa.o && ./a.out
a e/xa.gc{no,da}
clang -c --coverage d/a.c -o e/xa.o -dumpdir f/g && clang --coverage e/xa.o && ./a.out
a f/ga.gc{no,da}
```
The gcov code accidentally claims -c and -S, so -fsyntax-only -c/-S and
-E -c/-S don't lead to a -Wunused-command-line-argument warning. Keep
the unintended code for now.
|
|
|
|
|
|
|
|
|
|
|
|
| |
MSVC has a `__cpuidex` function implemented to call the underlying cpuid
instruction which accepts a leaf, subleaf, and data array that the output
data is written into. This patch adds this functionality into clang
under the cpuid.h header. This also makes clang match GCC's behavior.
GCC has had `__cpuidex` in its cpuid.h since 2020.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D150646
|
| |
|
|
|
|
|
|
|
|
|
|
| |
These tests rely on an unintended behavior that when the driver performs both
compilation and linking phases, the .gcno & .gcda files are placed in PWD. The
behavior will be fixed to respect -o (match -ftime-trace, -gsplit-dwarf, and
GCC).
Add -dumpdir ./ so that the tests will work with or without the behavior change,
and make it easy to compare the coverage behavior with GCC.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, `CCState::AllocateStack` always allocated stack space by increasing
offsets. For targets with stack growing up (away from zero) it is more
convenient to allocate arguments by decreasing offsets, so that the first
argument is at the top of the stack. This is important when calling a function
with variable number of arguments: the callee does not know the size of the
stack, but must be able to access "fixed" arguments. For that to work, the
"fixed" arguments should have fixed offsets relative to the stack top, i.e. the
variadic arguments area should be at the stack bottom (at lowest addresses).
The in-tree target with stack growing up is AMDGPU, but it allocates
arguments by increasing addresses. It does not support variadic arguments.
A drive-by change is to promote stack size/offset to 64-bit integer.
This is what MachineFrameInfo expects.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D149575
|
|
|
|
|
|
|
|
|
|
|
|
| |
The term "next stack offset" is misleading because the next argument is
not necessarily allocated at this offset due to alignment constrains.
It also does not make much sense when allocating arguments at negative
offsets (introduced in a follow-up patch), because the returned offset
would be past the end of the next argument.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D149566
|
|
|
|
|
|
|
|
| |
constexpr (!SANITIZER_FUCHSIA)
This prevents spamming the build log with unused InitLoadedGlobals when building fuchsia runtimes.
Differential Revision: https://reviews.llvm.org/D150737
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously for overloaded functions we'd show:
Provides: foo, bar bar bar bar
The symbol name is duplicated
==> only show unique names, since we're not displaying the signature
Commas are missing
==> fix the logic which was checking for "last element" by value
(though after the above fix this bug is dead anyway)
While here, remove a redundant bounds check before take_front().
Differential Revision: https://reviews.llvm.org/D150683
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add a representation for firstprivate clause modeled on the
private representation added in D150622.
The firstprivate recipe operation has an additional mandatory
region representing a sequences of operations needed to copy
the initial value to the created private copy.
Depends on D150622
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D150729
|
|
|
|
|
|
|
|
|
|
| |
at VL=2
In general, VL=2 vectors are very questionable profitability wise. For constants specifically, our inability to materialize many vector constants cheaply biases us strongly towards unprofitability at VL=2.
This hook is very close to the x86 implementation. The difference is that X86 whitelists stores of zeros, and we're better off letting that stay scalar at VL=2.
Differential Revision: https://reviews.llvm.org/D150798
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently privatization is not well represented in the OpenACC dialect. This
patch is a prototype to model privatization with a dedicated operation that
declares how the private value is created and destroyed.
The newly introduced operation is `acc.private.recipe` and it declares
an OpenACC privatization. The operation requires one mandatory and
one optional region.
1. The initializer region specifies how to create and initialize a new
private value. For example in Fortran, a derived-type might have a
default initialization. The region has an argument that contains the
value that need to be privatized. This is useful if the type is not
a known at compile time and the private value is needed to create its
copy.
2. The destroy region specifies how to destruct the value when it reaches
its end of life. It takes the privatized value as argument.
A same privatization can be used for multiple operand if they have the same
type and do not require a specific default initialization.
Example:
```mlir
acc.private.recipe @privatization_f32 : f32 init {
^bb0(%0: f32):
// init region contains a sequence of operations to create and initialize the
// copy if needed.
} destroy {
^bb0(%0: f32):
// destroy region contains a sequences of operations to destruct the created
// copy.
}
// The privatization symbol is then used in the corresponding operation.
acc.parallel private(@privatization_f32 -> %a : f32) {
}
```
Reviewed By: razvanlupusoru, jeanPerier
Differential Revision: https://reviews.llvm.org/D150622
|
|
|
|
| |
Differential revision: https://reviews.llvm.org/D150720
|
|
|
|
|
|
|
|
| |
Use `llvm::make_range` convenience wrapper from ADT.
Reviewed By: #bolt, rafauler
Differential Revision: https://reviews.llvm.org/D145887
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
identifier.
Previously we lexed into a SmallVector and unlexed the tokens if
the parsing failed.
This patch gets rid of the SmallVector and the unlexing.
If the first token fails to parse, return MatchFail. This allows us
to fallback to parsing as an immediate. If we successfully parsed the
first token, use ParseFail if any later tokens fail to parse. This
avoids needing to UnLex the tokens.
I've used a state machine to keep track of what component we've
parsed so far.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D150753
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D150799
|
|
|
|
| |
Differential Revision: https://reviews.llvm.org/D150237
|
|
|
|
|
|
|
|
|
|
|
|
| |
MacOS build of LLVM with OpenMP enabled fails with an error
that it doesn't know what std::abs is. Fix by including <cmath>
so that the relevant function declaration is included.
No functional change intended.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D150687
|
|
|
|
|
|
|
|
| |
Implement P2505R5(https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p2505r5.html)
Reviewed By: #libc, philnik, ldionne
Differential Revision: https://reviews.llvm.org/D140911
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit ebc111b08bddca55d5f7e560a20bdb2c913d80cb.
Sorry, I forgot to append Phabricator reversion when land D140911, I try to revert this change and reland.
Reviewed By: #libc, ldionne, EricWF
Differential Revision: https://reviews.llvm.org/D150793
|