path: root/src/cmd/5l
Commit message | Author | Age | Files | Lines
* [dev.power64] 6g,9g: formatters for Prog and Addr details | Austin Clements | 2014-11-14 | 1 | -0/+2
| The pretty printers for these make it hard to understand what's actually in the fields of these structures. These "ugly printers" show exactly what's in each field, which can be useful for understanding and debugging code.
|
| LGTM=rsc
| R=rsc
| CC=golang-codereviews
| https://codereview.appspot.com/175780043
* cmd/5l, cmd/6l, cmd/8l: fix nacl binary corruption bug | Russ Cox | 2014-08-27 | 1 | -2/+2
| NaCl requires the addition of a 32-byte "halt sled" at the end of the text segment. This means that segtext.len is actually 32 bytes shorter than reality. The computation of the file offset of the end of the data segment did not take this 32 bytes into account, so if len and len+32 rounded up (by 64k) to different values, the symbol table overwrote the last page of the data segment.
|
| The last page of the data segment is usually the C .string symbols, which contain the strings used in error prints by the runtime. So when this happens, your program probably crashes, and then when it does, you get binary garbage instead of all the usual prints.
|
| The chance of hitting this with a randomly sized text segment is 32 in 65536, or 1 in 2048. If you add or remove ANY code while trying to debug this problem, you're overwhelmingly likely to bump the text segment one way or the other and make the bug disappear.
|
| Correct all the computations to use segdata.fileoff+segdata.filelen instead of trying to rederive segdata.fileoff.
|
| This fixes the failure during the nacl/amd64p32 build.
|
| TBR=iant
| CC=golang-codereviews
| https://codereview.appspot.com/135050043
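The "32 in 65536" figure can be checked with a short sketch. The names `round_up`, `layout_disagrees`, `HALT_SLED`, and `ALIGN` are illustrative assumptions, not identifiers from the linker; the sketch only models the statement that len and len+32 round up (by 64k) to different values.

```python
HALT_SLED = 32      # bytes NaCl appends to the text segment (from the commit message)
ALIGN = 64 * 1024   # 64k rounding unit (from the commit message)


def round_up(n: int, align: int) -> int:
    """Round n up to the next multiple of align."""
    return (n + align - 1) // align * align


def layout_disagrees(text_len: int) -> bool:
    """True when rounding len and len+32 yields different offsets."""
    return round_up(text_len, ALIGN) != round_up(text_len + HALT_SLED, ALIGN)


# Count the problematic lengths within one 64k window of text-segment sizes.
bad = sum(layout_disagrees(n) for n in range(1, ALIGN + 1))
print(bad, "of", ALIGN)  # prints "32 of 65536"
```

Only the last 32 lengths before each 64k boundary disagree, which is the 1-in-2048 chance quoted in the message.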
* liblink, cmd/dist, cmd/5l: introduce %^ and move C_* constants. | Shenghou Ma | 2014-08-06 | 2 | -46/+51
| This helps certain diagnostics and also removes duplicated enums as a side effect.
|
| LGTM=dave, rsc
| R=rsc, dave
| CC=golang-codereviews
| https://codereview.appspot.com/115060044
* cmd/5l, cmd/6l, cmd/8l, cmd/ld: remove unused code, consolidate enums | Shenghou Ma | 2014-08-06 | 2 | -95/+8
| LGTM=rsc
| R=rsc, iant
| CC=golang-codereviews
| https://codereview.appspot.com/120220043
* cmd/5l: remove unused noop.c | Shenghou Ma | 2014-07-26 | 2 | -44/+0
| LGTM=dave
| R=rsc, dave
| CC=golang-codereviews
| https://codereview.appspot.com/116330043
* cmd/5l, cmd/6l, cmd/8l: remove mkenam. | Shenghou Ma | 2014-07-26 | 1 | -45/+0
| Unused. cmd/dist will generate enams as liblink/anames[568].c.
|
| LGTM=rsc
| R=rsc
| CC=golang-codereviews
| https://codereview.appspot.com/119940043
* cmd/5c, cmd/5g, cmd/5l, liblink: nacl/arm support | Shenghou Ma | 2014-07-10 | 3 | -1/+15
| LGTM=dave, rsc
| R=rsc, iant, dave
| CC=golang-codereviews
| https://codereview.appspot.com/108360043
* build: annotations and modifications for c2go | Russ Cox | 2014-07-02 | 1 | -50/+63
| The main changes fall into a few patterns:
|
| 1. Replace #define with enum.
| 2. Add /*c2go */ comment giving effect of #define. This is necessary for function-like #defines and non-enum-able #defined constants. (Not all compilers handle negative or large enums.)
| 3. Add extra braces in struct initializer. (c2go does not implement the full rules.)
|
| This is enough to let c2go typecheck the source tree. There may be more changes once it is doing other semantic analyses.
|
| LGTM=minux, iant
| R=minux, dave, iant
| CC=golang-codereviews
| https://codereview.appspot.com/106860045
* runtime: use duff zero and copy to initialize memory | Keith Randall | 2014-05-07 | 1 | -0/+2
| benchmark                 old ns/op     new ns/op     delta
| BenchmarkCopyFat512            1307           329   -74.83%
| BenchmarkCopyFat256             666           169   -74.62%
| BenchmarkCopyFat1024           2617           671   -74.36%
| BenchmarkCopyFat128             343          89.0   -74.05%
| BenchmarkCopyFat64              182          48.9   -73.13%
| BenchmarkCopyFat32              103          28.8   -72.04%
| BenchmarkClearFat128            102          46.6   -54.31%
| BenchmarkClearFat512            344           167   -51.45%
| BenchmarkClearFat64            50.5          26.5   -47.52%
| BenchmarkClearFat256            147          87.2   -40.68%
| BenchmarkClearFat32            22.7          16.4   -27.75%
| BenchmarkClearFat1024           511           662   +29.55%
|
| Fixes issue 7624
|
| LGTM=rsc
| R=golang-codereviews, khr, bradfitz, josharian, dave, rsc
| CC=golang-codereviews
| https://codereview.appspot.com/92760044
* runtime, cmd/ld, cmd/5l, run.bash: enable external linking on FreeBSD/ARM. | Shenghou Ma | 2014-04-21 | 1 | -0/+1
| Update issue 7331
|
| LGTM=dave, iant
| R=golang-codereviews, dave, gobot, iant
| CC=golang-codereviews
| https://codereview.appspot.com/89520043
* liblink, cmd/ld: reenable nosplit checking and test | Russ Cox | 2014-04-16 | 1 | -7/+7
| The new code is adapted from the Go 1.2 nosplit code, but it does not have the bug reported in issue 7623:
|
| g% go run nosplit.go
| g% go1.2 run nosplit.go
| BUG
| rejected incorrectly:
|     main 0 call f; f 120
|
|     linker output:
|     # _/tmp/go-test-nosplit021064539
|     main.main: nosplit stack overflow
|         120 guaranteed after split check in main.main
|         112 on entry to main.f
|         -8 after main.f uses 120
|
| g%
|
| Fixes issue 6931.
| Fixes issue 7623.
|
| LGTM=iant
| R=golang-codereviews, iant, ality
| CC=golang-codereviews, r
| https://codereview.appspot.com/88190043
* liblink: remove arch-specific constants from file format | Russ Cox | 2014-04-14 | 2 | -36/+26
| The relocation and automatic variable types were using arch-specific numbers. Introduce portable enumerations instead.
|
| To the best of my knowledge, these are the only arch-specific bits left in the new object file format.
|
| Remove now, before Go 1.3, because file formats are forever.
|
| LGTM=iant
| R=iant
| CC=golang-codereviews
| https://codereview.appspot.com/87670044
* cmd/gc: shorten temporary lifetimes when possible | Russ Cox | 2014-04-01 | 1 | -0/+1
| The new channel and map runtime routines take pointers to values, typically temporaries. Without help, the compiler cannot tell when those temporaries stop being needed, because it isn't sure what happened to the pointer. Arrange to insert explicit VARKILL instructions for these temporaries so that the liveness analysis can avoid seeing them as "ambiguously live".
|
| The change is made in order.c, which was already in charge of introducing temporaries to preserve the order-of-evaluation guarantees. Now its job has expanded to include introducing temporaries as needed by runtime routines, and then also inserting the VARKILL annotations for all these temporaries, so that their lifetimes can be shortened.
|
| In order to do its job for the map runtime routines, order.c arranges that all map lookups or map assignments have the form:
|
|     x = m[k]
|     x, y = m[k]
|     m[k] = x
|
| where x, y, and k are simple variables (often temporaries). Likewise, receiving from a channel is now always:
|
|     x = <-c
|
| In order to provide the map guarantee, order.c is responsible for rewriting x op= y into x = x op y, so that m[k] += z becomes
|
|     t = m[k]
|     t2 = t + z
|     m[k] = t2
|
| While here, fix a few bugs in order.c's traversal: it was failing to walk into select and switch case bodies, so order of evaluation guarantees were not preserved in those situations. Added tests to test/reorder2.go.
|
| Fixes issue 7671.
|
| In gc/popt's temporary-merging optimization, allow merging of temporaries with their address taken as long as the liveness ranges do not intersect. (There is a good chance of that now that we have VARKILL annotations to limit the liveness range.)
|
| Explicitly killing temporaries cuts the number of ambiguously live temporaries that must be zeroed in the godoc binary from 860 to 711, or -17%. There is more work to be done, but this is a good checkpoint.
|
| Update issue 7345
|
| LGTM=khr
| R=khr
| CC=golang-codereviews
| https://codereview.appspot.com/81940043
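The rewrite of `m[k] op= z` described in this message can be sketched as a tiny source-to-source lowering. This is a toy model operating on strings, not the compiler's node representation; the function name `lower_map_op_assign` is hypothetical, and the temporaries are named `t` and `t2` only to match the message.

```python
from typing import List


def lower_map_op_assign(m: str, k: str, op: str, z: str) -> List[str]:
    """Lower `m[k] op= z` into a load, a plain operation, and a store."""
    return [
        f"t = {m}[{k}]",     # x = m[k] form: map lookup into a temporary
        f"t2 = t {op} {z}",  # ordinary arithmetic on simple variables
        f"{m}[{k}] = t2",    # m[k] = x form: map assignment from a temporary
    ]


for stmt in lower_map_op_assign("m", "k", "+", "z"):
    print(stmt)
```

Every map access in the output has one of the three simple forms listed in the message, which is what would let the temporaries be given explicit end-of-life annotations.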
* all: final merge of NaCl tree | Russ Cox | 2014-02-27 | 2 | -0/+7
| This CL replays the following one CL from the rsc-go13nacl repo. This is the last replay CL: after this CL the main repo will have everything the rsc-go13nacl repo did. Changes made to the main repo after the rsc-go13nacl repo branched off probably mean that NaCl doesn't actually work after this CL, but all the code is now moved over and just needs to be redebugged.
|
| ---
| cmd/6l, cmd/8l, cmd/ld: support for Native Client
|
| See golang.org/s/go13nacl for design overview.
|
| This CL is publicly visible but not CC'ed to golang-dev, to avoid distracting from the preparation of the Go 1.2 release.
|
| This CL and the others will be checked into my rsc-go13nacl clone repo for now, and I will send CLs against the main repo early in the Go 1.3 development.
|
| R≡khr
| https://codereview.appspot.com/15750044
| ---
|
| LGTM=bradfitz, dave, iant
| R=dave, bradfitz, iant
| CC=golang-codereviews
| https://codereview.appspot.com/69040044
* cmd/gc: rename AFATVARDEF to AVARDEF | Russ Cox | 2014-02-13 | 1 | -1/+1
| The "fat" referred to being used for multiword values only. We're going to use it for non-fat values sometimes too.
|
| No change other than the renaming.
|
| TBR=iant
| CC=golang-codereviews
| https://codereview.appspot.com/63650043
* cmd/cc, cmd/gc, cmd/ld: consolidate print format routines | Anthony Martin | 2014-02-12 | 2 | -440/+3
| We now use the %A, %D, %P, and %R routines from liblink across the board.
|
| Fixes issue 7178.
| Fixes issue 7055.
|
| LGTM=iant
| R=golang-codereviews, gobot, rsc, dave, iant, remyoudompheng
| CC=golang-codereviews
| https://codereview.appspot.com/49170043
|
| Committer: Russ Cox <rsc@golang.org>
* include, liblink, cmd/6l, cmd/ld: part 1 of solaris/amd64 linker changes. | Shenghou Ma | 2014-02-09 | 1 | -0/+1
| rsc suggested that we split the whole linker changes into three parts. This is the first one, mostly dealing with adding Hsolaris.
|
| LGTM=iant
| R=golang-codereviews, iant, dave
| CC=golang-codereviews
| https://codereview.appspot.com/54210050
* liblink, cmd/5l: restore flag_shared | Elias Naur | 2014-02-03 | 1 | -1/+1
| CL 56120043 fixed and cleaned up TLS on ARM after introducing liblink, but left flag_shared broken. This CL restores the (unsupported) flag_shared behaviour by simply rewriting access to $runtime.tlsgm(SB) with runtime.tlsgm(SB), to compensate for the extra indirection when going from the R_ARM_TLS_LE32 relocation to the R_ARM_TLS_IE32 relocation.
|
| Also, remove unnecessary symbol lookup left after 56120043.
|
| LGTM=iant
| R=iant, rsc
| CC=golang-codereviews
| https://codereview.appspot.com/57000043
|
| Committer: Ian Lance Taylor <iant@golang.org>
* liblink, cmd/5a, cmd/5l: restore cgo on older ARM processors | Elias Naur | 2014-02-03 | 1 | -0/+2
| CL 56120043 fixed TLS handling on ARM after the introduction of liblink but left older ARM processors broken.
|
| Before liblink, the MRC instruction was replaced with a fallback on older ARMs. CL 56120043 removed that, because the rewrite matched bit patterns on the AWORD pseudo-instruction and could therefore change unrelated AWORDs that happened to match.
|
| This CL adds an AMRC instruction to encode both MRC and MCR previously encoded as AWORDs. Then, in liblink, the AMRC instructions are either rewritten to AWORD, or, on goarm < 7, replaced with a branch to the fallback.
|
| ./all.bash completes successfully on an ARMv7 with either GOARM=7 or GOARM=5. I have verified that the fallback is indeed present in both runtime.save_gm and runtime.load_gm when GOARM=5 but not when GOARM=7.
|
| If all goes well, this should fix the armv5 builders.
|
| LGTM=iant
| R=iant, rsc
| CC=golang-codereviews
| https://codereview.appspot.com/55540044
|
| Committer: Ian Lance Taylor <iant@golang.org>
* cmd/ld: move instruction selection + layout into compilers, assemblers | Russ Cox | 2013-12-16 | 1 | -3/+1
| - new object file reader/writer (liblink/objfile.c)
| - remove old object file writing routines
| - add pcdata iterator
| - remove all trace of "line number stack" and "path fragments" from object files, linker (!!!)
| - dwarf now writes a single "compilation unit" instead of one per package
|
| This CL disables the check for chains of no-split functions that could overflow the stack red zone. A future CL will attack the problem of reenabling that check (issue 6931).
|
| This CL is just the liblink and cmd/ld changes. There are minor associated adjustments in CL 37030045. Each depends on the other.
|
| R=golang-dev, dave, iant
| CC=golang-dev
| https://codereview.appspot.com/39680043
* runtime: remove non-extern decls of runtime.goarm | Russ Cox | 2013-12-09 | 1 | -1/+1
| The linker is in charge of providing the one true declaration.
|
| R=golang-dev, dave, r
| CC=golang-dev
| https://codereview.appspot.com/39560043
* liblink: create new library based on linker code | Russ Cox | 2013-12-08 | 11 | -5074/+216
| There is an enormous amount of code moving around in this CL, but the code is the same, and it is invoked in the same ways. This CL is preparation for the new linker structure, not the new structure itself.
|
| The new library's definition is in include/link.h.
|
| The main change is the use of a Link structure to hold all the linker-relevant state, replacing the smattering of global variables. The Link structure should both make it clearer which state must be carried around and make it possible to parallelize more easily later.
|
| The main body of the linker has moved into the architecture-independent cmd/ld directory. That includes the list of known header types, so the distinction between Hplan9x32 and Hplan9x64 is removed (no other header type distinguished 32- and 64-bit formats), and code for unused formats such as ipaq kernels has been deleted.
|
| The code being deleted from 5l, 6l, and 8l reappears in liblink or in ld. Because multiple files are being merged in the liblink directory, it is not possible to show the diffs nicely in hg.
|
| The Prog and Addr structures have been unified into an architecture-independent form and moved to link.h, where they will be shared by all tools: the assemblers, the compilers, and the linkers. The unification makes it possible to write architecture-independent traversal of Prog lists, among other benefits.
|
| The Sym structures cannot be unified: they are too fundamentally different between the linker and the compilers. Instead, liblink defines an LSym - a linker Sym - to be used in the Prog and Addr structures, and the linker now refers exclusively to LSyms. The compilers will keep using their own syms but will fill out the corresponding LSyms in the Prog and Addr structures.
|
| Although code from 5l, 6l, and 8l is now in a single library, the code has been arranged so that only one architecture needs to be linked into a particular program: 5l will not contain the code needed for x86 instruction layout, for example.
|
| The object file writing code in liblink/obj.c is from cmd/gc/obj.c.
|
| Preparation for golang.org/s/go13linker work.
|
| This CL does not build by itself. It depends on 35740044 and will be submitted at the same time.
|
| R=iant
| CC=golang-dev
| https://codereview.appspot.com/35790044
* cmd/5g, cmd/5l, cmd/6g, cmd/6l, cmd/8g, cmd/8l, cmd/gc, runtime: generate pointer maps by liveness analysis | Carl Shapiro | 2013-12-05 | 1 | -0/+1
| This change allows the garbage collector to examine stack slots that are determined as live and containing a pointer value by the garbage collector. This results in a mean reduction of 65% in the number of stack slots scanned during an invocation of "GOGC=1 all.bash".
|
| Unfortunately, this does not yet allow garbage collection to be precise for the stack slots computed as live. Pointers confound the determination of what definitions reach a given instruction. In general, this problem is not solvable without runtime cost but some advanced cooperation from the compiler might mitigate common cases.
|
| R=golang-dev, rsc, cshapiro
| CC=golang-dev
| https://codereview.appspot.com/14430048
* cmd/5l, runtime: fix divide for profiling tracebacks on ARM | Russ Cox | 2013-10-31 | 1 | -5/+18
| Two bugs:
|
| 1. The first iteration of the traceback always uses LR when provided, which it is (only) during a profiling signal, but in fact LR is correct only if the stack frame has not been allocated yet. Otherwise an intervening call may have changed LR, and the saved copy in the stack frame should be used. Fix in traceback_arm.c.
|
| 2. The division runtime call adds 8 bytes to the stack. In order to keep the traceback routines happy, it must copy the saved LR into the new 0(SP). Change
|
|     SUB $8, SP
|
| into
|
|     MOVW 0(SP), R11 // r11 is temporary, for use by linker
|     MOVW.W R11, -8(SP)
|
| to update SP and 0(SP) atomically, so that the traceback always sees a saved LR at 0(SP).
|
| Fixes issue 6681.
|
| R=golang-dev, r
| CC=golang-dev
| https://codereview.appspot.com/19910044
* undo CL 19810043 / 352f3b7c9664 | Russ Cox | 2013-10-31 | 1 | -15/+36
| The CL causes misc/cgo/test to fail randomly. I suspect that the problem is the use of a division instruction in usleep, which can be called while trying to acquire an m and therefore cannot store the denominator in m. The solution to that would be to rewrite the code to use a magic multiply instead of a divide, but now we're getting pretty far off the original code.
|
| Go back to the original in preparation for a different, less efficient but simpler fix.
|
| ««« original CL description
| cmd/5l, runtime: make ARM integer division profiler-friendly
|
| The implementation of division constructed non-standard stack frames that could not be handled by the traceback routines.
|
| CL 13239052 left the frames non-standard but fixed them for the specific case of a divide-by-zero panic. A profiling signal can arrive at any time, so that fix is not sufficient.
|
| Change the division to store the extra argument in the M struct instead of in a new stack slot. That keeps the frames bog standard at all times.
|
| Also fix a related bug in the traceback code: when starting a traceback, the LR register should be ignored if the current function has already allocated its stack frame and saved the original LR on the stack. The stack copy should be used, as the LR register may have been modified.
|
| Combined, these make the torture test from issue 6681 pass.
|
| Fixes issue 6681.
|
| R=golang-dev, r, josharian
| CC=golang-dev
| https://codereview.appspot.com/19810043
| »»»
|
| TBR=r
| CC=golang-dev
| https://codereview.appspot.com/20350043
* cmd/5l, runtime: make ARM integer division profiler-friendly | Russ Cox | 2013-10-30 | 1 | -36/+15
| The implementation of division constructed non-standard stack frames that could not be handled by the traceback routines.
|
| CL 13239052 left the frames non-standard but fixed them for the specific case of a divide-by-zero panic. A profiling signal can arrive at any time, so that fix is not sufficient.
|
| Change the division to store the extra argument in the M struct instead of in a new stack slot. That keeps the frames bog standard at all times.
|
| Also fix a related bug in the traceback code: when starting a traceback, the LR register should be ignored if the current function has already allocated its stack frame and saved the original LR on the stack. The stack copy should be used, as the LR register may have been modified.
|
| Combined, these make the torture test from issue 6681 pass.
|
| Fixes issue 6681.
|
| R=golang-dev, r, josharian
| CC=golang-dev
| https://codereview.appspot.com/19810043
* cmd/gc: support -installsuffix in the compiler and builder | Dave Day | 2013-10-03 | 1 | -0/+1
| Add the -installsuffix flag to gc and {5,6,8}l, which overrides -race for the suffix if both are supplied. Pass this flag from the go tool for build and install.
|
| R=rsc
| CC=golang-dev
| https://codereview.appspot.com/14246044
* cmd/5l: fix handling of RET.EQ in wrapper function | Russ Cox | 2013-09-13 | 1 | -0/+9
| Keith is too clever for me.
|
| R=ken2
| CC=golang-dev, khr
| https://codereview.appspot.com/13272050
* runtime, cmd/gc, cmd/ld: ignore method wrappers in recover | Russ Cox | 2013-09-12 | 1 | -0/+59
| Bug #1: Issue 5406 identified an interesting case:
|
|     defer iface.M()
|
| may end up calling a wrapper that copies an indirect receiver from the iface value and then calls the real M method. That's two calls down, not just one, and so recover() == nil always in the real M method, even during a panic.
|
| [For the purposes of this entire discussion, a wrapper's implementation is a function containing an ordinary call, not the optimized tail call form that is sometimes possible. The tail call does not create a second frame, so it is already handled correctly.]
|
| Fix this bug by introducing g->panicwrap, which counts the number of bytes on current stack segment that are due to wrapper calls that should not count against the recover check. All wrapper functions must now adjust g->panicwrap up on entry and back down on exit. This adds slightly to their expense; on the x86 it is a single instruction at entry and exit; on the ARM it is three. However, the alternative is to make a call to recover depend on being able to walk the stack, which I very much want to avoid. We have enough problems walking the stack for garbage collection and profiling. Also, if performance is critical in a specific case, it is already faster to use a pointer receiver and avoid this kind of wrapper entirely.
|
| Bug #2: The old code, which did not consider the possibility of two calls, already contained a check to see if the call had split its stack and so the panic-created segment was one behind the current segment. In the wrapper case, both of the two calls might split their stacks, so the panic-created segment can be two behind the current segment.
|
| Fix this by propagating the Stktop.panic flag forward during stack splits instead of looking backward during recover.
|
| Fixes issue 5406.
|
| R=golang-dev, iant
| CC=golang-dev
| https://codereview.appspot.com/13367052
* cmd/5l, cmd/6l, cmd/8l: refactor stack split code | Russ Cox | 2013-09-11 | 1 | -133/+144
| Pull the stack split generation into its own function. This will make an upcoming change to fix recover easier to digest.
|
| R=ken2
| CC=golang-dev
| https://codereview.appspot.com/13611044
* cmd/5l,cmd/6l,cmd/8l: fix dragonflydynld path | Joel Sing | 2013-08-31 | 1 | -1/+1
| R=golang-dev, bradfitz, dave
| CC=golang-dev
| https://codereview.appspot.com/13225043
* libbio, all cmd: consistently use BGETC/BPUTC instead of Bgetc/Bputc | Dmitriy Vyukov | 2013-08-30 | 1 | -9/+9
| Also introduce BGET2/4, BPUT2/4 as they are widely used. Slightly improve BGETC/BPUTC implementation. This gives ~5% CPU time improvement on go install -a -p1 std.
|
| Before:
| real          user          sys
| 0m23.561s     0m16.625s     0m5.848s
| 0m23.766s     0m16.624s     0m5.846s
| 0m23.742s     0m16.621s     0m5.868s
| after:
| 0m22.999s     0m15.841s     0m5.889s
| 0m22.845s     0m15.808s     0m5.850s
| 0m22.889s     0m15.832s     0m5.848s
|
| R=golang-dev, r
| CC=golang-dev
| https://codereview.appspot.com/12745047
* cmd/5l,cmd/8l: unbreak arm and 386 linkers | Joel Sing | 2013-08-24 | 1 | -0/+1
| Add dragonflydynld to 5l and 8l so that they compile again.
|
| R=golang-dev, bradfitz
| CC=golang-dev
| https://codereview.appspot.com/12739048
* cmd/gc: &x panics if x does | Russ Cox | 2013-08-15 | 1 | -0/+1
| See golang.org/s/go12nil.
|
| This CL is about getting all the right checks inserted. A followup CL will add an optimization pass to remove redundant checks.
|
| R=ken2
| CC=golang-dev
| https://codereview.appspot.com/12970043
* runtime, cmd/ld: Add ARM external linking and implement -shared in terms of external linking | Elias Naur | 2013-08-14 | 8 | -37/+201
| This CL is an aggregate of 10271047, 10499043, 9733044. Descriptions of each follow:
|
| 10499043
| runtime,cmd/ld: Merge TLS symbols and teach 5l about ARM TLS
|
| This CL prepares for external linking support on ARM. The pseudo-symbols runtime.g and runtime.m are merged into a single runtime.tlsgm symbol. When linking externally, the offset of a thread local variable is stored at a memory location instead of being embedded into an offset of a ldr instruction. With a single runtime.tlsgm symbol for both g and m, only one such offset is needed.
|
| The larger part of this CL moves TLS code from gcc compiled to internally compiled. The TLS code now uses the modern MRC instruction, and 5l is taught about TLS fallbacks in case the instruction is not available or appropriate.
|
| 10271047
| This CL adds support for -linkmode external to 5l.
|
| For 5l itself, use addrel to allow for D_CALL relocations to be handled by the host linker. Of the cases listed in rsc's comment in issue 4069, only cases 5 and 63 needed an update. One of the TODO: addrel cases was since replaced, and the rest of the cases are either covered by indirection through addpool (cases with LTO or LFROM flags) or stubs (case 74). The addpool cases are covered because addpool emits AWORD instructions, which in turn are handled by case 11.
|
| In the runtime, change the argv argument in the rt0* functions slightly to be a pointer to the argv list, instead of relying on a particular location of argv.
|
| 9733044
| The -shared flag to 6l outputs a shared library, implemented in Go and callable from non-Go programs such as C.
|
| The main part of this CL changes the thread local storage model. Go uses the fastest and least general mode, local exec. TLS data in shared libraries normally requires at least the local dynamic mode; however, this CL instead opts for using the initial exec mode. Initial exec mode is faster than local dynamic mode and can be used in linux since the linker has reserved a limited amount of TLS space for performance sensitive TLS code.
|
| Initial exec mode requires an extra load from the GOT table to determine the TLS offset. This penalty will not be paid if ld is not in -shared mode, since TLS accesses will be reduced to local exec.
|
| The elf sections .init_array and .rela.init_array are added to register the Go runtime entry with cgo at library load time.
|
| The "hidden" attribute is added to Cgo functions called from Go, since Go does not generate calls through the GOT table, and adding non-GOT relocations for a global function is not supported by gcc. Cgo symbols don't need to be global and avoiding the GOT table is also faster.
|
| The changes to 8l only remove code relevant to the old -shared mode where internal linking was used.
|
| This CL only addresses the low level linker work. It can be submitted by itself, but to be useful, the runtime changes in CL 9738047 are also needed.
|
| Design discussion at https://groups.google.com/forum/?fromgroups#!topic/golang-nuts/zmjXkGrEx6Q
|
| Fixes issue 5590.
|
| R=rsc
| CC=golang-dev
| https://codereview.appspot.com/12871044
|
| Committer: Russ Cox <rsc@golang.org>
* cmd/5l: fix encoding of new MOVB, MOVH instructions | Russ Cox | 2013-08-12 | 1 | -1/+1
| They are just like MOVW and should be setting only two register fields, not three.
|
| R=ken2
| CC=golang-dev, remyoudompheng
| https://codereview.appspot.com/12781043
* cmd/5c, cmd/5g, cmd/5l: turn MOVB, MOVH into plain moves, optimize short arithmetic. | Rémy Oudompheng | 2013-08-09 | 2 | -2/+4
| Pseudo-instructions MOVBS and MOVHS are used to clarify the semantics of short integers vs. registers:
| * 8-bit and 16-bit values in registers are assumed to always be zero-extended or sign-extended depending on their type.
| * MOVB is truncation or move of an already extended value between registers.
| * MOVBU enforces zero-extension at the destination (register).
| * MOVBS enforces sign-extension at the destination (register).
| And similarly for MOVH/MOVHS/MOVHU.
|
| The linker is adapted to assemble MOVB and MOVH to an ordinary mov. Also a peephole pass in 5g that aims at eliminating redundant zero/sign extensions is improved.
|
| encoding/binary:
| benchmark                         old ns/op    new ns/op    delta
| BenchmarkReadSlice1000Int32s         220387       217185   -1.45%
| BenchmarkReadStruct                   12839        12910   +0.55%
| BenchmarkReadInts                      5692         5534   -2.78%
| BenchmarkWriteInts                     6137         6016   -1.97%
| BenchmarkPutUvarint32                   257          241   -6.23%
| BenchmarkPutUvarint64                   812          754   -7.14%
|
| benchmark                          old MB/s     new MB/s  speedup
| BenchmarkReadSlice1000Int32s          18.15        18.42    1.01x
| BenchmarkReadStruct                    5.45         5.42    0.99x
| BenchmarkReadInts                      5.27         5.42    1.03x
| BenchmarkWriteInts                     4.89         4.99    1.02x
| BenchmarkPutUvarint32                 15.56        16.57    1.06x
| BenchmarkPutUvarint64                  9.85        10.60    1.08x
|
| crypto/des:
| benchmark                         old ns/op    new ns/op    delta
| BenchmarkEncrypt                       7002         5169  -26.18%
| BenchmarkDecrypt                       7015         5195  -25.94%
|
| benchmark                          old MB/s     new MB/s  speedup
| BenchmarkEncrypt                       1.14         1.55    1.36x
| BenchmarkDecrypt                       1.14         1.54    1.35x
|
| strconv:
| benchmark                         old ns/op    new ns/op    delta
| BenchmarkAtof64Decimal                  457          385  -15.75%
| BenchmarkAtof64Float                    574          479  -16.55%
| BenchmarkAtof64FloatExp                1035          906  -12.46%
| BenchmarkAtof64Big                     1793         1457  -18.74%
| BenchmarkAtof64RandomBits              2267         2066   -8.87%
| BenchmarkAtof64RandomFloats            1416         1194  -15.68%
| BenchmarkAtof32Decimal                  451          379  -15.96%
| BenchmarkAtof32Float                    547          435  -20.48%
| BenchmarkAtof32FloatExp                1095          986   -9.95%
| BenchmarkAtof32Random                  1154         1006  -12.82%
| BenchmarkAtoi                          1415         1380   -2.47%
| BenchmarkAtoiNeg                       1414         1401   -0.92%
| BenchmarkAtoi64                        1744         1671   -4.19%
| BenchmarkAtoi64Neg                     1737         1662   -4.32%
|
| Fixes issue 1837.
|
| R=rsc, dave, bradfitz
| CC=golang-dev
| https://codereview.appspot.com/12424043
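The difference between zero-extension (MOVBU, MOVHU) and sign-extension (MOVBS, MOVHS) at the destination register can be illustrated with a small model of a 32-bit register. The function names below mirror the mnemonics for readability only; they are a hypothetical sketch of the arithmetic, not code from 5l or 5g.

```python
REG_MASK = 0xFFFFFFFF  # model a 32-bit ARM register


def zero_extend(value: int, bits: int) -> int:
    """Keep the low `bits` bits; the upper register bits become zero."""
    return value & ((1 << bits) - 1)


def sign_extend(value: int, bits: int) -> int:
    """Keep the low `bits` bits and copy the top one into the upper register bits."""
    low = value & ((1 << bits) - 1)
    sign_bit = 1 << (bits - 1)
    return ((low ^ sign_bit) - sign_bit) & REG_MASK


def movbu(value: int) -> int:
    return zero_extend(value, 8)


def movbs(value: int) -> int:
    return sign_extend(value, 8)


def movhu(value: int) -> int:
    return zero_extend(value, 16)


def movhs(value: int) -> int:
    return sign_extend(value, 16)


print(hex(movbu(0x1F0)), hex(movbs(0x1F0)))      # 0xf0 0xfffffff0
print(hex(movhu(0x18001)), hex(movhs(0x18001)))  # 0x8001 0xffff8001
```

If a register is already known to hold an extended value of the right type, a second extension leaves it unchanged, which is the kind of redundancy the peephole pass mentioned above is described as removing.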
* cmd/5c, cmd/5g, cmd/5l: introduce MOVBS and MOVHS instructions. | Rémy Oudompheng | 2013-08-08 | 4 | -10/+38
| MOVBS and MOVHS are defined as duplicates of MOVB and MOVH, and perform sign-extension moving.
|
| No change is made to code generation.
|
| Update issue 1837
|
| R=rsc, bradfitz
| CC=golang-dev
| https://codereview.appspot.com/12682043
* cmd/ld: Put the textflag constants in a separate file. | Keith Randall | 2013-08-07 | 1 | -6/+1
| We can then include this file in assembly to replace cryptic constants like "7" with meaningful constants like "(NOPROF|DUPOK|NOSPLIT)".
|
| Converting just pkg/runtime/asm*.s for now. Dropping NOPROF and DUPOK from lots of places where they aren't needed. More .s files to come in a subsequent changelist.
|
| A nonzero number in the textflag field now means "has not been converted yet".
|
| R=golang-dev, daniel.morsing, rsc, khr
| CC=golang-dev
| https://codereview.appspot.com/12568043
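The move from "7" to "(NOPROF|DUPOK|NOSPLIT)" is an ordinary bit-flag decoding. In the sketch below the individual values 1, 2, and 4 are an assumption: the message only implies that the three flags combine to 7. The helper `describe` is hypothetical.

```python
# Assumed bit assignments; the commit message only states that
# NOPROF|DUPOK|NOSPLIT corresponds to the constant 7.
FLAGS = [
    ("NOPROF", 1),
    ("DUPOK", 2),
    ("NOSPLIT", 4),
]


def describe(textflag: int) -> str:
    """Render a numeric textflag as a symbolic OR-expression."""
    names = [name for name, bit in FLAGS if textflag & bit]
    return "(" + "|".join(names) + ")" if names else "0"


print(describe(7))  # (NOPROF|DUPOK|NOSPLIT)
print(describe(4))  # (NOSPLIT)
```

Spelling the flags out also makes it visible when a flag such as NOPROF or DUPOK is present without being needed, which a bare "7" hides.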
* runtime: use funcdata to supply garbage collection information | Russ Cox | 2013-07-19 | 4 | -49/+0
| This CL introduces a FUNCDATA number for runtime-specific garbage collection metadata, changes the C and Go compilers to emit that metadata, and changes the runtime to expect it.
|
| The old pseudo-instructions that carried this information are gone, as is the linker code to process them.
|
| R=golang-dev, dvyukov, cshapiro
| CC=golang-dev
| https://codereview.appspot.com/11406044
* cmd/ld, runtime: use new contiguous pcln table | Russ Cox | 2013-07-18 | 1 | -1/+0
| R=golang-dev, r, dave
| CC=golang-dev
| https://codereview.appspot.com/11494043
* cmd/5l, cmd/6l, cmd/8l: accept PCDATA instruction in input | Russ Cox | 2013-07-16 | 3 | -3/+13
| The portable code in cmd/ld already knows how to process it, we just have to ignore it during code generation.
|
| R=ken2
| CC=golang-dev
| https://codereview.appspot.com/11363043
* cmd/ld, runtime: new in-memory symbol table format | Russ Cox | 2013-07-16 | 3 | -3/+10
| Design at http://golang.org/s/go12symtab.
|
| This enables some cleanup of the garbage collector metadata that will be done in future CLs.
|
| This CL does not move the old symtab and pclntab back into an unmapped section of the file. That's a bit tricky and will be done separately.
|
| Fixes issue 4020.
|
| R=golang-dev, dave, cshapiro, iant, r
| CC=golang-dev, nigeltao
| https://codereview.appspot.com/11085043
* cmd/ld: fix large stack split for preempt checkRuss Cox2013-07-121-5/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the stack frame size is larger than the known-unmapped region at the bottom of the address space, then the stack split prologue cannot use the usual condition: SP - size >= stackguard because SP - size may wrap around to a very large number. Instead, if the stack frame is large, the prologue tests: SP - stackguard >= size (This ends up being a few instructions more expensive, so we don't do it always.) Preemption requests register by setting stackguard to a very large value, so that the first test (SP - size >= stackguard) cannot possibly succeed. Unfortunately, that same very large value causes a wraparound in the second test (SP - stackguard >= size), making it succeed incorrectly. To avoid *that* wraparound, we have to amend the test: stackguard != StackPreempt && SP - stackguard >= size This test is only used for functions with large frames, which essentially always split the stack, so the cost of the few instructions is noise. This CL and CL 11085043 together fix the known issues with preemption, at the beginning of a function, so we will be able to try turning it on again. R=ken2 CC=golang-dev https://codereview.appspot.com/11205043
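The three conditions described above can be compared with unsigned arithmetic. This is a minimal sketch, not the linker's generated prologue: the function names are hypothetical, addresses are modeled as uint64, and the `stackPreempt` constant is only an illustrative "very large value", not necessarily the runtime's actual sentinel.

```go
package main

import "fmt"

// stackPreempt is an illustrative stand-in for the very large
// stackguard value that signals a preemption request (assumed value).
const stackPreempt uint64 = 0xfffffffffffffade

// usualCheck is the small-frame test: SP - size >= stackguard.
// It misbehaves when size > SP, because SP - size wraps around.
func usualCheck(sp, size, guard uint64) bool {
	return sp-size >= guard
}

// largeFrameNaive is the rearranged test: SP - stackguard >= size.
// It avoids the first wraparound but wraps when guard is stackPreempt.
func largeFrameNaive(sp, size, guard uint64) bool {
	return sp-guard >= size
}

// largeFrameCheck is the amended test from the commit message:
// stackguard != StackPreempt && SP - stackguard >= size.
func largeFrameCheck(sp, size, guard uint64) bool {
	return guard != stackPreempt && sp-guard >= size
}

func main() {
	// Case 1: frame larger than SP itself; the stack does not fit.
	sp, size, guard := uint64(0x10000), uint64(0x20000), uint64(0x8000)
	fmt.Println("usual check says fits:     ", usualCheck(sp, size, guard))
	fmt.Println("large-frame check says fits:", largeFrameCheck(sp, size, guard))

	// Case 2: preemption requested; every check should report "no fit"
	// so that the prologue calls into the runtime.
	sp, size, guard = 0x7fff0000, 0x20000, stackPreempt
	fmt.Println("naive large check says fits:", largeFrameNaive(sp, size, guard))
	fmt.Println("amended check says fits:    ", largeFrameCheck(sp, size, guard))
}
```

A result of true means "enough stack, skip the call to morestack", so a wrongly true result is the dangerous direction in both cases.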
* cmd/ld: place read-only data in non-executable segmentRuss Cox2013-07-111-3/+10
| | | | | | R=golang-dev, dave, r CC=golang-dev, nigeltao https://codereview.appspot.com/10713043
* cmd/5l, cmd/6l, cmd/8l: increase error buffer sizeRuss Cox2013-07-111-1/+1
| | | | | | | | | | | STRINGSZ (200) is fine for lines generated by things like instruction dumps, but an error containing a couple file names can easily exceed that, especially on Macs with the ridiculous default $TMPDIR. R=ken2 CC=golang-dev https://codereview.appspot.com/11199043
* runtime: record proper goroutine state during stack splitRuss Cox2013-06-271-106/+70
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Until now, the goroutine state has been scattered during the execution of newstack and oldstack. It's all there, and those routines know how to get back to a working goroutine, but other pieces of the system, like stack traces, do not. If something does interrupt the newstack or oldstack execution, the rest of the system can't understand the goroutine. For example, if newstack decides there is an overflow and calls throw, the stack tracer wouldn't dump the goroutine correctly. For newstack to save a useful state snapshot, it needs to be able to rewind the PC in the function that triggered the split back to the beginning of the function. (The PC is a few instructions in, just after the call to morestack.) To make that possible, we change the prologues to insert a jmp back to the beginning of the function after the call to morestack. That is, the prologue used to be roughly: TEXT myfunc check for split jmpcond nosplit call morestack nosplit: sub $xxx, sp Now an extra instruction is inserted after the call: TEXT myfunc start: check for split jmpcond nosplit call morestack jmp start nosplit: sub $xxx, sp The jmp is not executed directly. It is decoded and simulated by runtime.rewindmorestack to discover the beginning of the function, and then the call to morestack returns directly to the start label instead of to the jump instruction. So logically the jmp is still executed, just not by the cpu. The prologue thus repeats in the case of a function that needs a stack split, but against the cost of the split itself, the extra few instructions are noise. 
The repeated prologue has the nice effect of making a stack split double-check that the new stack is big enough: if morestack happens to return on a too-small stack, we'll now notice before corruption happens. The ability for newstack to rewind to the beginning of the function should help preemption too. If newstack decides that it was called for preemption instead of a stack split, it now has the goroutine state correctly paused if rescheduling is needed, and when the goroutine can run again, it can return to the start label on its original stack and re-execute the split check. Here is an example of a split stack overflow showing the full trace, without any special cases in the stack printer. (This one was triggered by making the split check incorrect.) runtime: newstack framesize=0x0 argsize=0x18 sp=0x6aebd0 stack=[0x6b0000, 0x6b0fa0] morebuf={pc:0x69f5b sp:0x6aebd8 lr:0x0} sched={pc:0x68880 sp:0x6aebd0 lr:0x0 ctxt:0x34e700} runtime: split stack overflow: 0x6aebd0 < 0x6b0000 fatal error: runtime: split stack overflow goroutine 1 [stack split]: runtime.mallocgc(0x290, 0x100000000, 0x1) /Users/rsc/g/go/src/pkg/runtime/zmalloc_darwin_amd64.c:21 fp=0x6aebd8 runtime.new() /Users/rsc/g/go/src/pkg/runtime/zmalloc_darwin_amd64.c:682 +0x5b fp=0x6aec08 go/build.(*Context).Import(0x5ae340, 0xc210030c71, 0xa, 0xc2100b4380, 0x1b, ...) /Users/rsc/g/go/src/pkg/go/build/build.go:424 +0x3a fp=0x6b00a0 main.loadImport(0xc210030c71, 0xa, 0xc2100b4380, 0x1b, 0xc2100b42c0, ...) /Users/rsc/g/go/src/cmd/go/pkg.go:249 +0x371 fp=0x6b01a8 main.(*Package).load(0xc21017c800, 0xc2100b42c0, 0xc2101828c0, 0x0, 0x0, ...) /Users/rsc/g/go/src/cmd/go/pkg.go:431 +0x2801 fp=0x6b0c98 main.loadPackage(0x369040, 0x7, 0xc2100b42c0, 0x0) /Users/rsc/g/go/src/cmd/go/pkg.go:709 +0x857 fp=0x6b0f80 ----- stack segment boundary ----- main.(*builder).action(0xc2100902a0, 0x0, 0x0, 0xc2100e6c00, 0xc2100e5750, ...) 
/Users/rsc/g/go/src/cmd/go/build.go:539 +0x437 fp=0x6b14a0 main.(*builder).action(0xc2100902a0, 0x0, 0x0, 0xc21015b400, 0x2, ...) /Users/rsc/g/go/src/cmd/go/build.go:528 +0x1d2 fp=0x6b1658 main.(*builder).test(0xc2100902a0, 0xc210092000, 0x0, 0x0, 0xc21008ff60, ...) /Users/rsc/g/go/src/cmd/go/test.go:622 +0x1b53 fp=0x6b1f68 ----- stack segment boundary ----- main.runTest(0x5a6b20, 0xc21000a020, 0x2, 0x2) /Users/rsc/g/go/src/cmd/go/test.go:366 +0xd09 fp=0x6a5cf0 main.main() /Users/rsc/g/go/src/cmd/go/main.go:161 +0x4f9 fp=0x6a5f78 runtime.main() /Users/rsc/g/go/src/pkg/runtime/proc.c:183 +0x92 fp=0x6a5fa0 runtime.goexit() /Users/rsc/g/go/src/pkg/runtime/proc.c:1266 fp=0x6a5fa8 And here is a seg fault during oldstack: SIGSEGV: segmentation violation PC=0x1b2a6 runtime.oldstack() /Users/rsc/g/go/src/pkg/runtime/stack.c:159 +0x76 runtime.lessstack() /Users/rsc/g/go/src/pkg/runtime/asm_amd64.s:270 +0x22 goroutine 1 [stack unsplit]: fmt.(*pp).printArg(0x2102e64e0, 0xe5c80, 0x2102c9220, 0x73, 0x0, ...) /Users/rsc/g/go/src/pkg/fmt/print.go:818 +0x3d3 fp=0x221031e6f8 fmt.(*pp).doPrintf(0x2102e64e0, 0x12fb20, 0x2, 0x221031eb98, 0x1, ...) /Users/rsc/g/go/src/pkg/fmt/print.go:1183 +0x15cb fp=0x221031eaf0 fmt.Sprintf(0x12fb20, 0x2, 0x221031eb98, 0x1, 0x1, ...) /Users/rsc/g/go/src/pkg/fmt/print.go:234 +0x67 fp=0x221031eb40 flag.(*stringValue).String(0x2102c9210, 0x1, 0x0) /Users/rsc/g/go/src/pkg/flag/flag.go:180 +0xb3 fp=0x221031ebb0 flag.(*FlagSet).Var(0x2102f6000, 0x293d38, 0x2102c9210, 0x143490, 0xa, ...) /Users/rsc/g/go/src/pkg/flag/flag.go:633 +0x40 fp=0x221031eca0 flag.(*FlagSet).StringVar(0x2102f6000, 0x2102c9210, 0x143490, 0xa, 0x12fa60, ...) /Users/rsc/g/go/src/pkg/flag/flag.go:550 +0x91 fp=0x221031ece8 flag.(*FlagSet).String(0x2102f6000, 0x143490, 0xa, 0x12fa60, 0x0, ...) /Users/rsc/g/go/src/pkg/flag/flag.go:563 +0x87 fp=0x221031ed38 flag.String(0x143490, 0xa, 0x12fa60, 0x0, 0x161950, ...) 
/Users/rsc/g/go/src/pkg/flag/flag.go:570 +0x6b fp=0x221031ed80 testing.init() /Users/rsc/g/go/src/pkg/testing/testing.go:-531 +0xbb fp=0x221031edc0 strings_test.init() /Users/rsc/g/go/src/pkg/strings/strings_test.go:1115 +0x62 fp=0x221031ef70 main.init() strings/_test/_testmain.go:90 +0x3d fp=0x221031ef78 runtime.main() /Users/rsc/g/go/src/pkg/runtime/proc.c:180 +0x8a fp=0x221031efa0 runtime.goexit() /Users/rsc/g/go/src/pkg/runtime/proc.c:1269 fp=0x221031efa8 goroutine 2 [runnable]: runtime.MHeap_Scavenger() /Users/rsc/g/go/src/pkg/runtime/mheap.c:438 runtime.goexit() /Users/rsc/g/go/src/pkg/runtime/proc.c:1269 created by runtime.main /Users/rsc/g/go/src/pkg/runtime/proc.c:166 rax 0x23ccc0 rbx 0x23ccc0 rcx 0x0 rdx 0x38 rdi 0x2102c0170 rsi 0x221032cfe0 rbp 0x221032cfa0 rsp 0x7fff5fbff5b0 r8 0x2102c0120 r9 0x221032cfa0 r10 0x221032c000 r11 0x104ce8 r12 0xe5c80 r13 0x1be82baac718 r14 0x13091135f7d69200 r15 0x0 rip 0x1b2a6 rflags 0x10246 cs 0x2b fs 0x0 gs 0x0 Fixes issue 5723. R=r, dvyukov, go.peter.90, dave, iant CC=golang-dev https://codereview.appspot.com/10360048
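The "decoded and simulated" jmp described in the message above can be sketched for x86, where a short jump is encoded as opcode 0xEB with an 8-bit displacement and a near jump as 0xE9 with a 32-bit displacement, both relative to the next instruction. This is only a sketch under those assumptions: `rewindTarget` is a hypothetical name, code is modeled as a byte slice with offsets instead of real addresses, and the actual runtime.rewindmorestack may differ in detail.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// rewindTarget decodes the jmp at code[pc] and returns the offset it
// would transfer control to, without executing it. In the prologue
// described above, that target is the start label of the function.
func rewindTarget(code []byte, pc int) (int, error) {
	if pc < 0 || pc >= len(code) {
		return 0, errors.New("pc out of range")
	}
	switch code[pc] {
	case 0xE9: // jmp rel32: 1 opcode byte + 4 displacement bytes
		if pc+5 > len(code) {
			return 0, errors.New("truncated jmp rel32")
		}
		disp := int32(binary.LittleEndian.Uint32(code[pc+1 : pc+5]))
		return pc + 5 + int(disp), nil
	case 0xEB: // jmp rel8: 1 opcode byte + 1 displacement byte
		if pc+2 > len(code) {
			return 0, errors.New("truncated jmp rel8")
		}
		disp := int8(code[pc+1])
		return pc + 2 + int(disp), nil
	}
	return 0, errors.New("instruction at pc is not a jmp")
}

func main() {
	// Three placeholder bytes standing in for "check; jmpcond; call",
	// followed by a short jmp back to offset 0 (the start label).
	code := []byte{0x90, 0x90, 0x90, 0xEB, 0xFB}
	target, err := rewindTarget(code, 3)
	fmt.Println(target, err)
}
```

Simulating the jump rather than executing it lets the saved goroutine state already point at the function's first instruction, which is the snapshot the stack tracer and a later preemption need.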
* cmd/gc: move genembedtramp into portable codeRuss Cox2013-06-111-4/+20
| | | | | | | | | | | | | | | | | | | | | | Requires adding new linker instruction RET f(SB) meaning return but then immediately call f. This is what you'd use to implement a tail call after fiddling with the arguments, but the compiler only uses it in genwrapper. This CL eliminates the copy-and-paste genembedtramp functions from 5g/8g/6g and makes the code run on ARM for the first time. It removes a small special case for function generation, which should help Carl a bit, but at the same time it does not bother to implement general tail call optimization, which we do not want anyway. Fixes issue 5627. R=ken2 CC=golang-dev https://codereview.appspot.com/10057044
* cmd/5l: use BLX for BL (Rx).Shenghou Ma2013-06-112-12/+10
| | | | | | | | | | | | Fixes issue 5111. Update issue 4718. This CL makes BL (Rx) use BLX Rx instead of the two-instruction sequence: MOV LR, PC; MOV PC, Rx. R=cshapiro, rsc CC=dave, gobot, golang-dev https://codereview.appspot.com/9669045
* cmd/5l: use guaranteed undefined instruction for UNDEF to match [68]l.Shenghou Ma2013-06-111-5/+3
| | | | | | R=golang-dev, dave, rsc CC=golang-dev https://codereview.appspot.com/10085050