author Jez Ng <jezng@fb.com> 2023-03-10 22:28:36 -0500
committer Jez Ng <jezng@fb.com> 2023-03-11 01:40:14 -0500
commit bb69a66ced27918f894fb0d5e58e22fda6958d99
tree d7fd4815650ae105b21bf42e6526dce5d7f39354 /lld/test
parent 3b4cb1e96c645bb833fe710856479c31383859bb
[lld-macho] Coalesce local symbol aliases along with their aliased weak def

This supersedes {D139069}. In some ways we are now closer to ld64's
behavior: we previously only did this coalescing for private-label symbols,
but now we do it for all locals, just like ld64.

However, we no longer generate weak binds when a local alias to a weak symbol
is referenced. This is merely for implementation simplicity; it's not clear to
me that any real-world programs depend on us emulating this behavior.

The problem with the previous approach is that we ended up with duplicate
references to the same symbol instance in our InputFiles, which translated
into duplicate symbols in our output. While we could work around that problem
by performing a dedup step before emitting the symbol table, it seems cleaner
to not generate duplicate references in the first place.

Numbers for chromium_framework on my 16 Core Intel Mac Pro:

               base             diff             difference (95% CI)
    sys_time   2.243 ± 0.093    2.231 ± 0.066    [ -2.5% .. +1.4%]
    user_time  6.529 ± 0.087    6.080 ± 0.050    [ -7.5% .. -6.3%]
    wall_time  6.928 ± 0.175    6.474 ± 0.112    [ -7.7% .. -5.4%]
    samples    26               31

Yep, that's a massive win... because it turns out that {D140606} and
{D139069} caused a regression (of about the same size.) I just didn't think to
measure them back then. I'm guessing all the extra symbols we have been
emitting did not help perf at all...

Reviewed By: lgrey

Differential Revision: https://reviews.llvm.org/D145455
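
To make the coalescing behavior described above concrete, here is a toy model in Python. It is only a sketch and not lld's actual implementation: the `Chunk` class and `link` function are hypothetical names invented for illustration, and the model assumes every symbol sits at the start of its own subsection (as with .subsections_via_symbols). When a second weak definition is discarded, its local aliases are re-pointed at the surviving definition, so the duplicate data is not retained and no extra reference to the weak symbol itself is created.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Chunk:
    """Hypothetical subsection: some bytes plus the symbols labelling its start."""
    size: int
    locals_: List[str]          # local symbols at offset 0
    weak: Optional[str] = None  # weak global definition at offset 0, if any


def link(files, base=0x1000):
    """Lay out chunks in order, keeping only the first chunk per weak name.

    Local aliases of a discarded weak chunk are redirected to the address
    of the surviving chunk instead of keeping their own copy alive.
    Returns (symbol_table, total_data_size).
    """
    winners = {}  # weak name -> address of the prevailing definition
    symtab = []   # (address, binding, name)
    addr = base
    for chunks in files:
        for chunk in chunks:
            if chunk.weak is not None and chunk.weak in winners:
                # Coalesced away: emit no data, just re-point the aliases.
                target = winners[chunk.weak]
                symtab.extend((target, "l", name) for name in chunk.locals_)
                continue
            if chunk.weak is not None:
                winners[chunk.weak] = addr
            symtab.extend((addr, "l", name) for name in chunk.locals_)
            addr += chunk.size
    # Each weak symbol appears exactly once, after the locals.
    symtab.extend((a, "w", name) for name, a in winners.items())
    return symtab, addr - base


# Two inputs shaped like weak-then-local.s and local-then-weak.s in the test.
files = [
    [Chunk(8, ["_alias"], weak="_weak"), Chunk(8, ["_ref"])],
    [Chunk(8, ["_alias"], weak="_weak"), Chunk(8, ["_ref"])],
]
symtab, size = link(files)
for address, binding, name in symtab:
    print(f"{address:#x} {binding} {name}")
print("data size:", size)
```

In this model the two inputs produce 24 bytes of data (one copy of the weak contents plus two `_ref` slots), with both `_alias` locals resolving to the same address as `_weak`. That is the same layout the CHECK lines in local-alias-to-weak.s expect.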
Diffstat (limited to 'lld/test')
-rw-r--r-- lld/test/MachO/local-alias-to-weak.s | 149
-rw-r--r-- lld/test/MachO/private-label-alias.s | 105
2 files changed, 149 insertions, 105 deletions
diff --git a/lld/test/MachO/local-alias-to-weak.s b/lld/test/MachO/local-alias-to-weak.s
new file mode 100644
index 000000000000..feb4c0a2e429
--- /dev/null
+++ b/lld/test/MachO/local-alias-to-weak.s
@@ -0,0 +1,149 @@
+# REQUIRES: x86
+## This test checks that when we coalesce weak definitions, their local symbol
+## aliases don't cause the coalesced data to be retained. This was
+## motivated by MC's aarch64 backend which automatically creates `ltmp<N>`
+## symbols at the start of each .text section. These symbols are frequently
+## aliases of other symbols created by clang or other inputs to MC. I've chosen
+## to explicitly create them here since we can then reference those symbols for
+## a more complete test.
+##
+## Not retaining the data matters for more than just size -- we have a use case
+## that depends on proper data coalescing to emit a valid file format. We also
+## need this behavior to properly deduplicate the __objc_protolist section;
+## failure to do this can result in dyld crashing on iOS 13.
+##
+## Finally, ld64 does all this regardless of whether .subsections_via_symbols is
+## specified. We don't. But again, given how rare the lack of that directive is
+## (I've only seen it from hand-written assembly inputs), I don't think we need
+## to worry about it.
+
+# RUN: rm -rf %t; split-file %s %t
+# RUN: llvm-mc -filetype=obj -triple=x86_64-apple-darwin %t/weak-then-local.s -o %t/weak-then-local.o
+# RUN: llvm-mc -filetype=obj -triple=x86_64-apple-darwin %t/local-then-weak.s -o %t/local-then-weak.o
+# RUN: llvm-mc -filetype=obj -triple=x86_64-apple-darwin %t/no-subsections.s -o %t/no-subsections.o
+# RUN: llvm-mc -filetype=obj -triple=x86_64-apple-darwin %t/no-dead-strip.s -o %t/no-dead-strip.o
+
+# RUN: %lld -lSystem -dylib %t/weak-then-local.o %t/local-then-weak.o -o %t/test1
+# RUN: llvm-objdump --macho --syms --section="__DATA,__data" --weak-bind %t/test1 | FileCheck %s
+# RUN: %lld -lSystem -dylib %t/local-then-weak.o %t/weak-then-local.o -o %t/test2
+# RUN: llvm-objdump --macho --syms --section="__DATA,__data" --weak-bind %t/test2 | FileCheck %s
+
+## Check that we only have one copy of 0x123 in the data, not two.
+# CHECK: Contents of (__DATA,__data) section
+# CHECK-NEXT: 0000000000001000 23 01 00 00 00 00 00 00 00 10 00 00 00 00 00 00 {{$}}
+# CHECK-NEXT: 0000000000001010 00 10 00 00 00 00 00 00 {{$}}
+# CHECK-EMPTY:
+# CHECK-NEXT: SYMBOL TABLE:
+# CHECK-NEXT: 0000000000001000 l O __DATA,__data _alias
+# CHECK-NEXT: 0000000000001008 l O __DATA,__data _ref
+# CHECK-NEXT: 0000000000001000 l O __DATA,__data _alias
+# CHECK-NEXT: 0000000000001010 l O __DATA,__data _ref
+# CHECK-NEXT: 0000000000001000 w O __DATA,__data _weak
+# CHECK-NEXT: 0000000000000000 *UND* dyld_stub_binder
+# CHECK-EMPTY:
+## Even though the references were to the non-weak `_alias` symbols, ld64 still
+## emits weak binds as if they were the `_weak` symbol itself. We do not. I
+## don't know of any programs that rely on this behavior, so I'm just
+## documenting it here.
+# CHECK-NEXT: Weak bind table:
+# CHECK-NEXT: segment section address type addend symbol
+# CHECK-EMPTY:
+
+# RUN: %lld -lSystem -dylib %t/local-then-weak.o %t/no-subsections.o -o %t/sub-nosub
+# RUN: llvm-objdump --macho --syms --section="__DATA,__data" %t/sub-nosub | FileCheck %s --check-prefix SUB-NOSUB
+
+## This test case demonstrates a shortcoming of LLD: If .subsections_via_symbols
+## isn't enabled, we don't elide the contents of coalesced weak symbols if they
+## are part of a section that has other non-coalesced symbols. In contrast, ld64
+## does elide the contents.
+# SUB-NOSUB: Contents of (__DATA,__data) section
+# SUB-NOSUB-NEXT: 0000000000001000 23 01 00 00 00 00 00 00 00 10 00 00 00 00 00 00
+# SUB-NOSUB-NEXT: 0000000000001010 00 00 00 00 00 00 00 00 23 01 00 00 00 00 00 00
+# SUB-NOSUB-EMPTY:
+# SUB-NOSUB-NEXT: SYMBOL TABLE:
+# SUB-NOSUB-NEXT: 0000000000001000 l O __DATA,__data _alias
+# SUB-NOSUB-NEXT: 0000000000001008 l O __DATA,__data _ref
+# SUB-NOSUB-NEXT: 0000000000001010 l O __DATA,__data _zeros
+# SUB-NOSUB-NEXT: 0000000000001000 l O __DATA,__data _alias
+# SUB-NOSUB-NEXT: 0000000000001000 w O __DATA,__data _weak
+# SUB-NOSUB-NEXT: 0000000000000000 *UND* dyld_stub_binder
+
+# RUN: %lld -lSystem -dylib %t/no-subsections.o %t/local-then-weak.o -o %t/nosub-sub
+# RUN: llvm-objdump --macho --syms --section="__DATA,__data" %t/nosub-sub | FileCheck %s --check-prefix NOSUB-SUB
+
+# NOSUB-SUB: Contents of (__DATA,__data) section
+# NOSUB-SUB-NEXT: 0000000000001000 00 00 00 00 00 00 00 00 23 01 00 00 00 00 00 00
+# NOSUB-SUB-NEXT: 0000000000001010 08 10 00 00 00 00 00 00 {{$}}
+# NOSUB-SUB-EMPTY:
+# NOSUB-SUB-NEXT: SYMBOL TABLE:
+# NOSUB-SUB-NEXT: 0000000000001000 l O __DATA,__data _zeros
+# NOSUB-SUB-NEXT: 0000000000001008 l O __DATA,__data _alias
+# NOSUB-SUB-NEXT: 0000000000001008 l O __DATA,__data _alias
+# NOSUB-SUB-NEXT: 0000000000001010 l O __DATA,__data _ref
+# NOSUB-SUB-NEXT: 0000000000001008 w O __DATA,__data _weak
+# NOSUB-SUB-NEXT: 0000000000000000 *UND* dyld_stub_binder
+
+## Verify that we don't drop any flags that the aliases have (such as
+## .no_dead_strip). This is a regression test. We previously had subsections
+## that were mistakenly stripped.
+
+# RUN: %lld -lSystem -dead_strip %t/no-dead-strip.o -o %t/no-dead-strip
+# RUN: llvm-objdump --macho --section-headers %t/no-dead-strip | FileCheck %s \
+# RUN: --check-prefix=NO-DEAD-STRIP
+# NO-DEAD-STRIP: __data 00000010
+
+#--- weak-then-local.s
+.globl _weak
+.weak_definition _weak
+.data
+_weak:
+_alias:
+ .quad 0x123
+
+_ref:
+ .quad _alias
+
+.subsections_via_symbols
+
+#--- local-then-weak.s
+.globl _weak
+.weak_definition _weak
+.data
+_alias:
+_weak:
+ .quad 0x123
+
+_ref:
+ .quad _alias
+
+.subsections_via_symbols
+
+#--- no-subsections.s
+.globl _weak
+.weak_definition _weak
+.data
+_zeros:
+.space 8
+
+_weak:
+_alias:
+ .quad 0x123
+
+#--- no-dead-strip.s
+.globl _main
+
+_main:
+ ret
+
+.data
+.no_dead_strip l_foo, l_bar
+
+_foo:
+l_foo:
+ .quad 0x123
+
+l_bar:
+_bar:
+ .quad 0x123
+
+.subsections_via_symbols
diff --git a/lld/test/MachO/private-label-alias.s b/lld/test/MachO/private-label-alias.s
deleted file mode 100644
index c5eb6277206c..000000000000
--- a/lld/test/MachO/private-label-alias.s
+++ /dev/null
@@ -1,105 +0,0 @@
-# REQUIRES: x86
-## This test checks that when we coalesce weak definitions, any flag-less
-## private-label aliases to those weak defs don't cause the coalesced data to be
-## retained. This test explicitly creates those private-label symbols, but it
-## was actually motivated by MC's aarch64 backend which automatically creates
-## them when emitting object files. I've chosen to explicitly create them here
-## since we can then reference those symbols for a more complete test.
-##
-## Not retaining the data matters for more than just size -- we have a use case
-## that depends on proper data coalescing to emit a valid file format.
-##
-## ld64 actually treats all local symbol aliases (not just the private ones) the
-## same way. But implementing this is harder -- we would have to create those
-## symbols first (so we can emit their names later), but we would have to
-## ensure the linker correctly shuffles them around when their aliasees get
-## coalesced. Emulating the behavior of weak binds for non-private symbols would
-## be even trickier. Let's just deal with private-label symbols for now until we
-## find a use case for more general local symbols.
-##
-## Finally, ld64 does all this regardless of whether .subsections_via_symbols is
-## specified. We don't. But again, given how rare the lack of that directive is
-## (I've only seen it from hand-written assembly inputs), I don't think we need
-## to worry about it.
-
-# RUN: rm -rf %t; split-file %s %t
-# RUN: llvm-mc -filetype=obj -triple=x86_64-apple-darwin %t/weak-then-private.s -o %t/weak-then-private.o
-# RUN: llvm-mc -filetype=obj -triple=x86_64-apple-darwin %t/private-then-weak.s -o %t/private-then-weak.o
-# RUN: llvm-mc -filetype=obj -triple=x86_64-apple-darwin %t/no-dead-strip.s -o %t/no-dead-strip.o
-# RUN: %lld -dylib %t/weak-then-private.o %t/private-then-weak.o -o %t/test1
-# RUN: %lld -dylib %t/private-then-weak.o %t/weak-then-private.o -o %t/test2
-# RUN: %lld -dead_strip %t/no-dead-strip.o -o %t/no-dead-strip
-
-# RUN: llvm-objdump --macho --syms --section="__DATA,__data" --weak-bind %t/test1 | FileCheck %s
-# RUN: llvm-objdump --macho --syms --section="__DATA,__data" --weak-bind %t/test2 | FileCheck %s
-
-## Check that we only have one copy of 0x123 in the data, not two.
-# CHECK: Contents of (__DATA,__data) section
-# CHECK-NEXT: 0000000000001000 23 01 00 00 00 00 00 00 00 10 00 00 00 00 00 00 {{$}}
-# CHECK-NEXT: 0000000000001010 00 10 00 00 00 00 00 00 {{$}}
-# CHECK-EMPTY:
-# CHECK-NEXT: SYMBOL TABLE:
-# CHECK-NEXT: 0000000000001008 l O __DATA,__data _ref
-# CHECK-NEXT: 0000000000001010 l O __DATA,__data _ref
-# CHECK-NEXT: 0000000000001000 w O __DATA,__data _weak
-# CHECK-NEXT: 0000000000000000 *UND* dyld_stub_binder
-# CHECK-EMPTY:
-## Even though the references were to the non-weak `l_ignored` aliases, we
-## should still emit weak binds as if they were the `_weak` symbol itself.
-# CHECK-NEXT: Weak bind table:
-# CHECK-NEXT: segment section address type addend symbol
-# CHECK-NEXT: __DATA __data 0x00001008 pointer 0 _weak
-# CHECK-NEXT: __DATA __data 0x00001010 pointer 0 _weak
-
-## Verify that we don't drop any flags that private-label aliases have (such as
-## .no_dead_strip). This is a regression test. We previously had subsections
-## that were mistakenly stripped.
-
-# RUN: llvm-objdump --macho --section-headers %t/no-dead-strip | FileCheck %s \
-# RUN: --check-prefix=NO-DEAD-STRIP
-# NO-DEAD-STRIP: __data 00000010
-
-#--- weak-then-private.s
-.globl _weak
-.weak_definition _weak
-.data
-_weak:
-l_ignored:
- .quad 0x123
-
-_ref:
- .quad l_ignored
-
-.subsections_via_symbols
-
-#--- private-then-weak.s
-.globl _weak
-.weak_definition _weak
-.data
-l_ignored:
-_weak:
- .quad 0x123
-
-_ref:
- .quad l_ignored
-
-.subsections_via_symbols
-
-#--- no-dead-strip.s
-.globl _main
-
-_main:
- ret
-
-.data
-.no_dead_strip l_foo, l_bar
-
-_foo:
-l_foo:
- .quad 0x123
-
-l_bar:
-_bar:
- .quad 0x123
-
-.subsections_via_symbols