| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 65429b9af6a2c99d340ab2dcddd41dab201f399c.
Broke several projects, see https://reviews.llvm.org/D144509#4347562 onwards.
Also reverts follow-up commit "[OpenMP] Compile assembly files as ASM, not C"
This reverts commit 4072c8aee4c89c4457f4f30d01dc9bb4dfa52559.
Also reverts fix attempt "[cmake] Set CMP0091 to fix Windows builds after the cmake_minimum_required bump"
This reverts commit 7d47dac5f828efd1d378ba44a97559114f00fb64.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In a real-world case with functions that have many, many
R_RISCV_CALL_PLT relocations due to asan and ubsan
instrumentation, all these can be relaxed by an instruction and
the net result is more than 65536 bytes of reduction in the
output .text section that totals about 1.2MiB in final size.
This changes InputSection to use a 32-bit field for bytesDropped.
The RISCV relaxation keeps track in a 64-bit field and detects
32-bit overflow as it previously detected 16-bit overflow. It
doesn't seem likely that 32-bit overflow will arise, but it's not
inconceivable and it's cheap enough to detect it.
This unfortunately increases the size of InputSection on 64-bit
hosts by a word, but that seems hard to avoid.
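As a rough illustration of the overflow check described above (not the actual lld code; `narrowBytesDropped` is a hypothetical helper name), the 64-bit running total can be narrowed to the 32-bit field like this:

```cpp
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical helper: the relaxation pass accumulates dropped bytes in a
// 64-bit counter; before storing into a 32-bit field, check that it fits.
// Returns std::nullopt when the total would overflow 32 bits.
std::optional<uint32_t> narrowBytesDropped(uint64_t total) {
  if (total > std::numeric_limits<uint32_t>::max())
    return std::nullopt;
  return static_cast<uint32_t>(total);
}
```

A total above 65535 (the case that motivated this change) now fits, and only totals of 4GiB or more are rejected.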
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D150722
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch migrates uses of StringRef::{starts,ends}with_insensitive
to StringRef::{starts,ends}_with_insensitive so that we can use names
similar to those used in std::string_view.
Note that the llvm/ directory has migrated in commit
6c3ea866e93003e16fc55d3b5cedd3bc371d1fde.
I'll post a separate patch to deprecate
StringRef::{starts,ends}with_insensitive.
Differential Revision: https://reviews.llvm.org/D150506
|
|
|
|
|
|
|
|
|
| |
MS link accepts *.obj with ehcont bit set only. LLD should match this
behavior too.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D150508
|
|
|
|
|
|
|
| |
A code-review comment to change a couple of CHECK to CHECK-NEXT that I
forgot to apply prior to committing.
Differential Revision: https://reviews.llvm.org/D150445
|
|
|
|
|
|
|
|
|
|
| |
In D72756 the change to add INPUT_SECTION_FLAGS inadvertently
removed the line to parse the program header assignment information for
OutputSections within an OVERLAY.
This change adds back the missing line and adds a test for it.
Differential Revision: https://reviews.llvm.org/D150445
|
|
|
|
|
|
| |
The owner of the last two failing buildbots updated CMake.
This reverts commit e8e8707b4aa6e4cc04c0cffb2de01d2de71165fc.
|
|
|
|
|
|
|
|
|
|
| |
Replace some RF_ flags with integer literals.
Rewrite the isWrite/isExec block to make the code block order reflect
the section order.
Rewrite some imprecise comments.
This is NFC, if we don't count invalid cases such as non-writable TLS
and non-writable RELRO.
|
| |
|
|
|
|
|
|
|
|
|
|
| |
I noticed that we are converting llvm.public.type.test to regular
llvm.type.test too early, and thus not updating those in imported
functions. This would result in losing out on WPD opportunities. Move
the update to after function importing, and improve test to cover this
case.
Differential Revision: https://reviews.llvm.org/D150326
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The generic ABI says:
> Padding is present, if necessary, to ensure 8 or 4-byte alignment for the next note entry (depending on whether the file is a 64-bit or 32-bit object). Such padding is not included in descsz.
Our parsing code currently aligns n_namesz. Fix the bug by aligning the start
offset of the descriptor instead. This issue has been benign because the primary
uses of sh_addralign=8 notes are `.note.gnu.property`, where
`sizeof(Elf_Nhdr) + sizeof("GNU") = 16` (already aligned by 8).
In practice, many 64-bit systems incorrectly use sh_addralign=4 notes.
We can use sh_addralign (= p_align) to decide the descriptor padding.
Treat an alignment of 0 and 1 as 4. This approach matches modern GNU readelf
(since 2018).
We have a few tests incorrectly using sh_addralign=0. We may make our behavior
stricter after fixing these tests.
Linux kernel dumped core files use `p_align=0` notes, so we need to support the
case for compatibility.
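A minimal sketch of the offset computation described above, assuming a 12-byte note header (three 32-bit words); `noteAlign` and `descOffset` are hypothetical names, not the parser's real functions:

```cpp
#include <cstdint>

// Size of Elf_Nhdr: n_namesz, n_descsz and n_type, each a 32-bit word.
constexpr uint64_t kNhdrSize = 12;

// Round v up to a multiple of a (a must be non-zero).
uint64_t alignTo(uint64_t v, uint64_t a) { return (v + a - 1) / a * a; }

// Treat sh_addralign (or p_align) values of 0 and 1 as 4.
uint64_t noteAlign(uint64_t shAddralign) {
  return shAddralign <= 1 ? 4 : shAddralign;
}

// Offset of the descriptor from the start of the note entry: the padding is
// applied to the start of the descriptor, not to n_namesz alone.
uint64_t descOffset(uint64_t nameSize, uint64_t shAddralign) {
  return alignTo(kNhdrSize + nameSize, noteAlign(shAddralign));
}
```

For a "GNU" name (4 bytes including the terminator), the offset is 16 under both 4-byte and 8-byte alignment, which is why the old behavior was benign for `.note.gnu.property`.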
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D150022
|
|
|
|
|
|
| |
Unfortunately not all buildbots are updated.
This reverts commit ffb807ab5375b3f78df198dc5d4302b3b552242f.
|
|
|
|
|
|
| |
All build bots should be updated now.
This reverts commit 44d38022ab29a3156349602733b3459df5beef93.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The libz compression library on SystemZ by default makes use of the
platform's hardware-accelerated compression facility. This is much
faster than the regular software implementation, but often results in
slightly different outputs. This causes failures with the
compressed-debug-level test case.
To fix this, set the DFLTCC environment variable to zero while running
tests on SystemZ, which prevents use of hardware compression and falls
back to the software implementation.
Reviewed by: MaskRay
Differential Revision: https://reviews.llvm.org/D149273
|
|
|
|
|
|
|
| |
Before this patch, export entries with an empty RVA were displayed in the output. In some cases, when the module had exports with sparse ordinals, `llvm-objdump` used to print a lot of `0 0` lines.
We now skip over these empty entries in the output, just as `dumpbin` or binutils `objdump` does.
Differential Revision: https://reviews.llvm.org/D149610
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The `valueDelta` map was used to calculate the symbol value deltas from
the previous iteration. Since the symbol values themselves are also
updated every iteration, the following invariant holds:
```
sa[i].offset == sa[i].d->value + valueDelta[sa[i].d]
```
Note that `sa[i].offset` contains the original value of `sa[i].d` and is
never changed.
This means that the current way of updating symbol values can be
rewritten to not need the `valueDelta` map:
```
sa[i].d->value -= delta - valueDelta.find(sa[i].d)->second;
<=> (replace invariant)
sa[i].d->value -= delta - (sa[i].offset - sa[i].d->value);
<=>
sa[i].d->value = sa[i].d->value - (delta - (sa[i].offset - sa[i].d->value));
<=>
sa[i].d->value = sa[i].d->value - delta + sa[i].offset - sa[i].d->value;
<=>
sa[i].d->value = sa[i].offset - delta;
```
This patch implements this simplification. I believe this improves the
readability of the code as it took me quite some time to understand the
use of `valueDelta`. It might also have a slight performance benefit as
it removes one iteration over all relocations every relax iteration.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D149735
|
|
|
|
| |
The previous pattern was matching the RVA `0` to the first character of `0x1010`. Make sure now that the entire export entry is matched.
|
|
|
|
|
|
|
|
|
| |
OutputSection::checkDynRelAddends() incorrectly reports an internal
linker error for large addends on 32-bit targets. This is caused by the
lack of sign extension in DynamicReloc::computeAddend() for 32-bit
addends.
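The missing step can be sketched as follows. This is an illustration only; `normalizeAddend` is a hypothetical helper and not the signature of DynamicReloc::computeAddend():

```cpp
#include <cstdint>

// Hypothetical helper: on a 32-bit target only the low 32 bits of the addend
// are written to the output, so the value must be sign-extended from 32 bits
// before it is compared against what was written.
int64_t normalizeAddend(int64_t addend, bool is64) {
  if (is64)
    return addend;
  // Truncate to 32 bits, then reinterpret as signed (two's complement; this
  // conversion is guaranteed from C++20 and is what mainstream compilers do).
  return static_cast<int32_t>(static_cast<uint32_t>(addend));
}
```

Without the sign extension, an addend such as 0xFFFFFFFF would be compared as 4294967295 rather than -1.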
Differential Revision: https://reviews.llvm.org/D149347
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
const std::string&"
This reverts commit c117c2c8ba4afd45a006043ec6dd858652b2ffcc.
itaniumDemangle calls std::strlen with the results of
std::string_view::data() which may not be NUL-terminated. This causes
lld/test/wasm/why-extract.s to fail when "expensive checks" are enabled
via -DLLVM_ENABLE_EXPENSIVE_CHECKS=ON. See D149675 for further
discussion. Back this out until the individual demanglers are converted
to use std::string_view.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
std::string&
As suggested by @erichkeane in
https://reviews.llvm.org/D141451#inline-1429549
There's potential for a lot more cleanups around these APIs. This is
just a start.
Callers need to be more careful about sub-expressions producing strings
that don't outlast the expression using ``llvm::demangle``. Add a
release note.
Reviewed By: MaskRay, #lld-macho
Differential Revision: https://reviews.llvm.org/D149104
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds a check that threadIndex is used only with threads
created by ThreadPoolExecutor. This helps catch two types of errors:
1. If a thread is not created by ThreadPoolExecutor, its index may clash
with the index of another thread. Using threadIndex in that case may
lead to a data race.
2. The index of the main thread (threadIndex == 0) currently clashes with
the index of thread0 in ThreadPoolExecutor threads. That may lead
to a data race if the main thread and thread0 are executed concurrently.
This patch allows executing tasks on the main thread only when
parallel::strategy.ThreadsRequested == 1. In all other cases,
assertions check that threadIndex != UINT_MAX (i.e. that the task
is executed on a thread created by ThreadPoolExecutor).
Differential Revision: https://reviews.llvm.org/D148916
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
output section
In a link map, the input section name gives more information. See the updated
merge-entsize.s for an example. The output file is unchanged.
Compiler generated input sections with the SHF_MERGE flag have names such as
.rodata.str1.1 and .rodata.cstN, and are not affected by -fdata-sections.
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D149466
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, most Mach-O tests fail when executed on a big-endian host. This is
because the Mach-O back-end does not perform the necessary byte swaps when
accessing the (little-endian) binary file format.
For now, simply consider all Mach-O tests unsupported on big-endian hosts,
to enable running the test suite at all on such hosts.
Reviewed by: oontvoo
Differential Revision: https://reviews.llvm.org/D149270
|
|
|
|
|
|
| |
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D149318
|
|
|
|
| |
This reverts commit 176cc70abe8d85df9aae223e0b35ce65238c4333.
|
|
|
|
| |
Build is broken after ee9cbe35.
|
|
|
|
|
| |
This reverts commit f2404d589ece81b029c607af011c372d52bff8d2,
which causes failures on Windows.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Relocations R_AVR_LO8_LDI_GS/R_AVR_HI8_LDI_GS (indirect calls
via function pointers) only cover range 128KiB. They are
equivalent to R_AVR_LO8_LDI_PM/R_AVR_HI8_LDI_PM within this
range.
But for function addresses beyond this range, GNU-ld emits
trampolines. And this patch implements corresponding thunks
for them in lld.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D147364
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The libz compression library on SystemZ by default makes use of the
platform's hardware-accelerated compression facility. This is much
faster than the regular software implementation, but often results in
slightly different outputs. This causes failures with the
compressed-debug-level test case.
To fix this, run this test while setting the DFLTCC environment
variable to zero, which prevents use of hardware compression and falls
back to the software implementation. (This should not have any effect
on other platforms.)
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D149273
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When running the LLD test suite on a big-endian host, the
COFF/pdb-framedata.yaml test case currently fails.
As it turns out, this is because code in DebugSHandler::finish
intended to relocate RvaStart entries of FDO records does not
work correctly when compiled for a big-endian host.
Fixed by always reading file data in little-endian mode.
Reviewed By: aganea
Differential Revision: https://reviews.llvm.org/D149268
|
| |
|
|
|
|
|
|
|
|
| |
This also fixes check prefix NO which is pointless in symtab.test
Reviewed By: mstorsjo
Differential Revision: https://reviews.llvm.org/D149235
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
--remap-inputs-file= can be specified multiple times, each naming a
remap file that contains `from-glob=to-file` lines or `#`-led comments.
('=' is used as a separator a la -fdebug-prefix-map=)
--remap-inputs-file= can be used to:
* replace an input file. E.g. `"*/libz.so=exp/libz.so"` can replace a resolved
`-lz` without updating the input file list or (if used) a response file.
When debugging an application where a bug is isolated to one single
input file, this option gives a convenient way to test fixes.
* remove an input file with `/dev/null` (changed to `NUL` on Windows), e.g.
`"a.o=/dev/null"`. A build system may add unneeded dependencies.
This option gives a convenient way to test the result of removing some inputs.
`--remap-inputs=a.o=aa.o` can be specified to provide one pattern without using
an extra file.
(bash/zsh process substitution is handy for specifying a pattern without using
a remap file, e.g. `--remap-inputs-file=<(printf 'a.o=aa.o')`, but it may be
unavailable in some systems. An extra file can be inconvenient for a build
system.)
Exact patterns are tested before wildcard patterns. In case of a tie, the first
pattern wins. This is an implementation detail that users should not rely on.
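The lookup order described above can be sketched as below. This is a simplified model, not lld's implementation: `globMatch` only understands `*` and `?` (lld's real glob syntax is richer), and `RemapRules` and `remapInput` are hypothetical names.

```cpp
#include <optional>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Simplified glob: '*' matches any run of characters, '?' matches exactly one.
bool globMatch(std::string_view pat, std::string_view s) {
  if (pat.empty())
    return s.empty();
  if (pat[0] == '*')
    return globMatch(pat.substr(1), s) ||
           (!s.empty() && globMatch(pat, s.substr(1)));
  return !s.empty() && (pat[0] == '?' || pat[0] == s[0]) &&
         globMatch(pat.substr(1), s.substr(1));
}

// Each rule is one `from-glob=to-file` line, in file order.
using RemapRules = std::vector<std::pair<std::string, std::string>>;

bool hasWildcard(std::string_view pat) {
  return pat.find_first_of("*?") != std::string_view::npos;
}

// Exact patterns are tried first, then wildcard patterns; within each group
// the first matching rule wins.
std::optional<std::string> remapInput(const RemapRules &rules,
                                      std::string_view path) {
  for (const auto &[from, to] : rules)
    if (!hasWildcard(from) && std::string_view(from) == path)
      return to;
  for (const auto &[from, to] : rules)
    if (hasWildcard(from) && globMatch(from, path))
      return to;
  return std::nullopt;
}
```

With rules `*.o=b.o` followed by `a.o=/dev/null`, the input `a.o` maps to `/dev/null` even though the wildcard rule comes first in the file.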
Co-authored-by: Marco Elver <elver@google.com>
Link: https://discourse.llvm.org/t/rfc-support-exclude-inputs/70070
Reviewed By: melver, peter.smith
Differential Revision: https://reviews.llvm.org/D148859
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch makes it possible to specify that some tasks should be
executed in sequential order. This removes the need for a
conditional to separate sequential tasks:
  TaskGroup tg;
  for () {
    if(condition)      ==>   tg.spawn([](){fn();}, condition)
      fn();
    else
      tg.spawn([](){fn();});
  }
It also prevents execution on the main thread, which allows adding
checks for the getThreadIndex() function discussed in D142318.
The patch also replaces std::stack with std::deque in the
ThreadPoolExecutor to have natural execution order in case
(parallel::strategy.ThreadsRequested == 1).
Differential Revision: https://reviews.llvm.org/D148728
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
MSVC link.exe added this flag and MS STL started using this flag in
.drectve [1] when compiling with Clang with asan enabled, as reported
on https://github.com/llvm/llvm-project/issues/56300. This causes issues
with lld-link because it rejects any unknown flags in .drectve sections.
As dc07867dc9991c982bd3441da19d6fcc16ea54d6 noted, when using Clang
as the driver it explicitly passes the proper asan libraries. Therefore
it should be acceptable to ignore this flag in lld-link to at least
unbreak building with clang-cl and linking with lld-link.
[1]: https://github.com/microsoft/STL/blob/faaf094ee16bcbfb2c8d612fdb9334bcdef2fd0a/stl/inc/__msvc_sanitizer_annotate_container.hpp#L35
Differential Revision: https://reviews.llvm.org/D149023
|
|
|
|
|
|
|
| |
This is a small QoL improvement suggested by FrancescElies in
https://github.com/llvm/llvm-project/issues/56300#issuecomment-1172104966.
Differential Revision: https://reviews.llvm.org/D149022
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The AArch64 branch immediate instruction has a 128MiB range. This
makes it suitable for use as a short range thunk, in the same way as
short thunks are implemented in Arm and PPC. This patch adds
support for short range thunks to AArch64.
Adding short range thunk support should mean that OutputSections
can grow to nearly 256 MiB in size without needing long-range
indirect branches.
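As a sketch of the range test involved (hypothetical helper name, not the lld code): the branch immediate is a signed 26-bit field scaled by 4, so a target is reachable if the byte offset lies in [-2^27, 2^27).

```cpp
#include <cstdint>

// Hypothetical helper: returns true if a direct AArch64 branch at address
// `src` can reach `dst`. The immediate is imm26 * 4, i.e. +/-128MiB.
// Instruction addresses are assumed to be 4-byte aligned.
bool inBranchRange(uint64_t src, uint64_t dst) {
  int64_t off = static_cast<int64_t>(dst - src);
  return off >= -(int64_t(1) << 27) && off < (int64_t(1) << 27);
}
```

A short thunk placed within range of the caller can itself branch up to 128MiB further, which is where the "nearly 256 MiB" figure comes from.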
Differential Revision: https://reviews.llvm.org/D148701
|
|
|
|
|
|
|
|
|
|
|
| |
extension thunk.
Add BTI to those PLT entries that are accessed via a range extension thunk, because such thunks perform an indirect call.
Fixes: #62140
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D148704
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
In particular, make it `foo.a(foo.o)$ARCHIVE_OFFSET`. The goal is to
make it more similar to ld64's implementation, which uses the
`foo.a(foo.o)$MODULE_ID` format. We dump some of these names in LTO
code, so matching ld64's format is helpful. This format is also more
similar to LLD-ELF's, which is `foo.a(foo.o at $ARCHIVE_OFFSET)`.
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D148828
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When --threads= is unspecified, we set it to
`parallel::strategy.compute_thread_count()`, which uses
sched_getaffinity (Linux)/cpuset_getaffinity (FreeBSD)/std::thread::hardware_concurrency (others).
With extensive testing on many machines (many configurations from
{aarch64,x86-64} x {Linux,FreeBSD,Windows} x allocators(native,mimalloc,rpmalloc) combinations)
with varying workloads, we discovered that when the concurrency is larger than
16, the linking process is slower than using --threads=16 because parallelism
overhead outweighs the gains. This is particularly harmful for machines with
many cores or when the link job competes with other jobs.
Cap parallel::strategy when --threads= is unspecified.
For some workloads changing the concurrency from 8 to 16 has nearly no improvement.
--thinlto-jobs= is unchanged since ThinLTO backend compiles are embarrassingly
parallel.
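The selection logic amounts to the following sketch (`pickThreadCount` is a hypothetical name; `available` stands for the value of compute_thread_count()):

```cpp
#include <algorithm>
#include <optional>

// Hypothetical helper: an explicit --threads= value is honored as-is;
// otherwise the detected concurrency is capped at 16.
unsigned pickThreadCount(std::optional<unsigned> requested,
                         unsigned available) {
  if (requested)
    return *requested;
  return std::min(available, 16u);
}
```

Users on large machines who do benefit from more threads can still pass --threads= explicitly.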
Link: https://discourse.llvm.org/t/avoidable-overhead-from-threading-by-default/69160
Reviewed By: peter.smith, andrewng
Differential Revision: https://reviews.llvm.org/D147493
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As pointed out in
https://discourse.llvm.org/t/undeterministic-thin-index-file/69985, the
block count added to distributed ThinLTO index files breaks incremental
builds on ThinLTO - if any linked file has a different number of BBs,
then the accumulated sum placed in the index files will change, causing
all ThinLTO backend compiles to be redone.
The block count is only used for scaling of partial sample profiles, and
was added in D80403 for D79831.
This patch simply removes this field from the index files of non partial
sample profile compiles, which is NFC on the output of the compiler.
We subsequently need to see if this can be removed for partial sample
profiles without significant performance loss, or redesigned in a way
that does not destroy caching.
Differential Revision: https://reviews.llvm.org/D148746
|
|
|
|
|
|
|
|
|
| |
is unspecified"
This reverts commit da68d2164efcc1f5e57f090e2ae2219056b120a0.
This change is correct, but left a `config->threadCount` use that is error-prone
and may harm performance when parallel::strategy.compute_thread_count() > 16.
|
|
|
|
|
|
| |
Missed by the original commit of D147016 which updated the DataLayout for Power.
Differential Revision: https://reviews.llvm.org/D147016
|
|
|
|
|
|
| |
This reverts commit 1ef4c3c859728008cf707cad8d67f45ae5070ae1.
Two buildbots still haven't been updated.
|
|
|
|
|
|
| |
This reverts commit 92523a35a827539db8557bbc3ecab7f9ea3f6ade.
Reland to see whether CIs are updated.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Embedded systems that do not use an ELF loader locate the
.ARM.exidx exception table via linker defined __exidx_start and
__exidx_end rather than use the PT_ARM_EXIDX program header. This
means that some linker scripts, such as the picolibc C library's
linker script, do not have the .ARM.exidx sections at offset 0 in
the OutputSection. For example:
  .except_unordered : {
      . = ALIGN(8);
      PROVIDE(__exidx_start = .);
      *(.ARM.exidx*)
      PROVIDE(__exidx_end = .);
  } >flash AT>flash :text
This is within the specification of Arm exception tables, and is
handled correctly by ld.bfd.
This patch has 2 parts. The first updates the writing of the data
of the .ARM.exidx SyntheticSection to account for a non-zero
OutputSection offset. The second part makes the PT_ARM_EXIDX program
header generation a special case so that it covers only the
SyntheticSection and not the parent OutputSection. While this is not
strictly necessary for programs locating the exception tables via the
symbols, a header covering the whole OutputSection may cause ELF
utilities that locate the exception tables via the PT_ARM_EXIDX
program header to fail. This does not seem to be the case for GNU and
LLVM readelf, which seem to look for the SHT_ARM_EXIDX section.
Differential Revision: https://reviews.llvm.org/D148033
|
|
|
|
|
|
|
|
|
|
| |
This actually simplifies the code by performing a pre-pass of the stub
objects prior to LTO.
This should be the final change needed before we can make the switch
on the emscripten side: https://github.com/emscripten-core/emscripten/pull/18905
Differential Revision: https://reviews.llvm.org/D148287
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
R_RISCV_HI20/R_RISCV_LO12_I/R_RISCV_LO12_S.
This implements support for relaxing these relocations to use the GP
register to compute addresses of globals in the .sdata and .sbss
sections.
This feature is off by default and must be enabled by passing
--relax-gp to the linker.
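The basic eligibility test for such a relaxation can be sketched as below, assuming the usual signed 12-bit immediate of RISC-V I-type and S-type instructions. `fitsGpRelative` is a hypothetical name, and the real linker has to account for further details not shown here.

```cpp
#include <cstdint>

// Hypothetical helper: a symbol can be addressed as gp + imm12 only if its
// distance from the GP value fits in a signed 12-bit immediate.
bool fitsGpRelative(uint64_t sym, uint64_t gp) {
  int64_t off = static_cast<int64_t>(sym - gp);
  return off >= -2048 && off <= 2047;
}
```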
The GP register might not always be the "global pointer". It can
be used for other purposes. See discussion here
https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/371
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D143673
|