path: root/src/cmd/5g
Commit message | Author | Age | Files | Lines
* [dev.cc] cmd/5g,cmd/6g,cmd/9g: fix warnings on Plan 9David du Colombier2014-11-251-1/+1
| warning: src/cmd/5g/reg.c:461 format mismatch d VLONG, arg 5
| warning: src/cmd/6g/reg.c:396 format mismatch d VLONG, arg 5
| warning: src/cmd/9g/reg.c:440 format mismatch d VLONG, arg 5
|
| LGTM=minux R=rsc, minux CC=golang-codereviews
| https://codereview.appspot.com/179300043
* [dev.cc] all: merge default (e4ab8f908aac) into dev.ccRuss Cox2014-11-204-55/+89
|\ | | | | | | | | | | TBR=austin CC=golang-codereviews https://codereview.appspot.com/179040044
| * [dev.cc] all: merge dev.power64 (f57928630b36) into dev.ccRuss Cox2014-11-201-1/+1
| |\ | | | | | | | | | | | | | | | | | | | | | This will be the last dev.power64 merge; we'll finish on dev.cc. TBR=austin CC=golang-codereviews https://codereview.appspot.com/175420043
| | * [dev.cc] all: merge dev.power64 (7667e41f3ced) into dev.ccRuss Cox2014-11-141-1/+1
| | |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is to reduce the delta between dev.cc and dev.garbage to just garbage collector changes. These are the files that had merge conflicts and have been edited by hand: malloc.go mem_linux.go mgc.go os1_linux.go proc1.go panic1.go runtime1.go LGTM=austin R=austin CC=golang-codereviews https://codereview.appspot.com/174180043
| * | | [dev.power64] 5g,6g,8g,9g: debug prints for regopt pass 6 and paint2Austin Clements2014-11-142-15/+32
| | | | These were very helpful in understanding the regions and register selection when porting regopt to 9g. Add them to the other compilers (and improve 9g's successor debug print). LGTM=rsc R=rsc CC=golang-codereviews https://codereview.appspot.com/174130043
| * | | [dev.power64] 5g,6g,8g: synchronize documentation for regopt structuresAustin Clements2014-11-141-6/+22
| |/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I added several comments to the regopt-related structures when porting it to 9g. Synchronize those comments back in to the other compilers. LGTM=rsc R=rsc CC=golang-codereviews https://codereview.appspot.com/175720043
| * | [dev.power64] 5g: fix mistaken bit-wise AND in regoptAustin Clements2014-11-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Replace a bit-wise AND with a logical one. This happened to work before because bany returns 0 or 1, but the intent here is clearly logical (and this makes 5g match with 6g and 8g). LGTM=rsc R=rsc CC=golang-codereviews https://codereview.appspot.com/172850043
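The reason the bit-wise form only "happened to work" can be shown with a minimal Go sketch. The compiler itself is C; the two helper names below are hypothetical and stand in for combining two int-valued predicates. With operands limited to 0 or 1 the two forms agree, but they diverge as soon as a nonzero value other than 1 appears.

```go
package main

import "fmt"

// bitwiseAnd mirrors the old code: two int "truth values" combined with &.
func bitwiseAnd(a, b int) bool { return a&b != 0 }

// logicalAnd mirrors the fixed code: each operand is tested for nonzero.
func logicalAnd(a, b int) bool { return a != 0 && b != 0 }

func main() {
	// Operands restricted to 0 or 1: both forms agree.
	fmt.Println(bitwiseAnd(1, 1), logicalAnd(1, 1))
	// A nonzero operand other than 1: 2&1 is 0, so the forms disagree.
	fmt.Println(bitwiseAnd(2, 1), logicalAnd(2, 1))
}
```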
| * | [dev.power64] 5g: fix etype and width of itable AddrsAustin Clements2014-11-061-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For OITAB nodes, 5g's naddr was setting the wrong etype and failing to set the width of the resulting Addr. LGTM=rsc R=rsc CC=golang-codereviews https://codereview.appspot.com/171220043
| * | [dev.power64] gc: fix etype of stringsAustin Clements2014-11-061-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The etype of references to strings was being incorrectly set to TINT32 on all platforms. Change it to TSTRING. It seems this doesn't matter for compilation, since x86 uses LEA instructions to load string addresses and arm and power64 disassemble the string into its constituent pieces (with the correct types), but it helps when debugging. LGTM=rsc R=rsc CC=golang-codereviews https://codereview.appspot.com/170100043
| * | [dev.power64] gc: convert Bits to a uint64 arrayAustin Clements2014-11-042-31/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | So far all of our architectures have had at most 32 registers, so we've been able to use entry 0 in the Bits uint32 array directly as a register mask. Power64 has 64 registers, so this converts Bits to a uint64 array so we can continue to use entry 0 directly as a register mask on Power64. LGTM=rsc R=rsc CC=golang-codereviews https://codereview.appspot.com/169060043
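As a rough illustration of why the word size matters, here is a minimal Go sketch of a bit set whose entry 0 doubles as a register mask. The type layout, the `bitsWords` length, and the method names are assumptions made for the example, not the compiler's actual definitions.

```go
package main

import "fmt"

// bitsWords is a hypothetical length; the real array size is not stated here.
const bitsWords = 2

// Bits is a sketch of a fixed-size bit set backed by uint64 words.
type Bits struct {
	b [bitsWords]uint64
}

// Set turns on bit n.
func (s *Bits) Set(n uint) { s.b[n/64] |= 1 << (n % 64) }

// Has reports whether bit n is on.
func (s *Bits) Has(n uint) bool { return s.b[n/64]&(1<<(n%64)) != 0 }

// RegMask returns entry 0, usable directly as a mask of up to 64 registers.
func (s *Bits) RegMask() uint64 { return s.b[0] }

func main() {
	var s Bits
	s.Set(3)
	s.Set(63) // register 63 would not fit in a uint32 word
	fmt.Printf("%#x\n", s.RegMask())
}
```

With uint32 words, register 63 would land in entry 1 and entry 0 alone could no longer serve as the full mask.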
* | | cmd/5g: fix bit mask for div/mod routines clobbering R12Ian Lance Taylor2014-11-091-1/+1
| |/ |/| | | | | | | | | | | | | | | | | | | | | This patch is based only on reading the code. I have not tried to construct a test case. Fixes issue 9077. LGTM=minux R=minux CC=golang-codereviews https://codereview.appspot.com/172110043
* | 5g: don't generate reg variables for direct-called functionsAustin Clements2014-11-051-1/+1
|/ | | | | | | | | | | | The test intended to skip direct calls when creating registerization variables was testing p->to.type instead of p->to.name, so it always failed, causing regopt to create unnecessary variables for these names. LGTM=rsc R=rsc CC=golang-codereviews https://codereview.appspot.com/169110043
* cmd/5g, cmd/8g: make 'out of registers' a fatal errorRuss Cox2014-09-161-2/+2
| | | | | | | | | | There's no point in continuing. We will only get confused. 6g already makes this fatal. LGTM=dave, minux, iant R=iant, dave, minux CC=golang-codereviews https://codereview.appspot.com/140660043
* cmd/cc, cmd/gc: stop generating 'argsize' PCDATARuss Cox2014-09-123-30/+0
| | | | | | | | | | | | | | The argsize PCDATA was specifying the number of bytes passed to a function call, so that if the function did not specify its argument count, the garbage collector could use the call site information to scan those bytes conservatively. We don't do that anymore, so stop generating the information. LGTM=khr R=khr CC=golang-codereviews https://codereview.appspot.com/139530043
* build: adjustments for move from src/pkg to srcRuss Cox2014-09-083-4/+4
| | | | | | | | | | | | | | | | | | | This CL adjusts code referring to src/pkg to refer to src. Immediately after submitting this CL, I will submit a change doing 'hg mv src/pkg/* src'. That change will be too large to review with Rietveld but will contain only the 'hg mv'. This CL will break the build. The followup 'hg mv' will fix it. For more about the move, see golang.org/s/go14nopkg. LGTM=r R=r CC=golang-codereviews https://codereview.appspot.com/134570043
* cmd/gc, runtime: treat slices and strings like pointers in garbage collectionRuss Cox2014-08-252-2/+0
| | | | | | | | | | | | | | | | | | | Before, a slice with cap=0 or a string with len=0 might have its base pointer pointing beyond the actual slice/string data into the next block. The collector had to ignore slices and strings with cap=0 in order to avoid misinterpreting the base pointer. Now, a slice with cap=0 or a string with len=0 still has a base pointer pointing into the actual slice/string data, no matter what. The collector can now always scan the pointer, which means strings and slices are no longer special. Fixes issue 8404. LGTM=khr, josharian R=josharian, khr, dvyukov CC=golang-codereviews https://codereview.appspot.com/112570044
* cmd/5g, cmd/6g, cmd/8g: clear Addr node when registerizingJosh Bleecher Snyder2014-08-241-0/+1
| Update issue 8525
|
| Some temporary variables that were fully registerized nevertheless had stack space allocated for them because the Addrs were still marked as having associated nodes.
|
| Distribution of stack space reserved for temporary variables while running make.bash (6g):
|
| BEFORE
| 40.89%  7026  allocauto: 0 to 0
|  7.83%  1346  allocauto: 0 to 24
|  7.22%  1241  allocauto: 0 to 8
|  6.30%  1082  allocauto: 0 to 16
|  4.96%   853  allocauto: 0 to 56
|  4.59%   789  allocauto: 0 to 32
|  2.97%   510  allocauto: 0 to 40
|  2.32%   399  allocauto: 0 to 48
|  2.10%   360  allocauto: 0 to 64
|  1.91%   328  allocauto: 0 to 72
|
| AFTER
| 48.49%  8332  allocauto: 0 to 0
|  9.52%  1635  allocauto: 0 to 16
|  5.28%   908  allocauto: 0 to 48
|  4.80%   824  allocauto: 0 to 32
|  4.73%   812  allocauto: 0 to 8
|  3.38%   581  allocauto: 0 to 24
|  2.35%   404  allocauto: 0 to 40
|  2.32%   399  allocauto: 0 to 64
|  1.65%   284  allocauto: 0 to 56
|  1.34%   230  allocauto: 0 to 72
|
| LGTM=rsc R=rsc CC=dave, dvyukov, golang-codereviews, minux
| https://codereview.appspot.com/126160043
* cmd/5g, cmd/8g: registerize small structs and arraysRuss Cox2014-08-241-1/+72
| | | | | | | | | | | | | | | cmd/6g has been doing this for a long time. Arrays are still problematic on 5g because the addressing for t[0] where local var t has type [3]uintptr takes the address of t. That's issue 8125. Fixes issue 8123. LGTM=josharian R=josharian, dave CC=golang-codereviews https://codereview.appspot.com/102890046
* cmd/5g: remove dead code.Shenghou Ma2014-07-261-2/+1
| | | | | | | LGTM=bradfitz, dave R=rsc, bradfitz, dave CC=golang-codereviews https://codereview.appspot.com/115070043
* cmd/gc: mark auxiliary symbols as containing no pointersDmitriy Vyukov2014-07-231-5/+2
| These symbols contain no pointers, but they are marked as if they did. The immediate problem is that this breaks the new GC: the symbols look as if they hold pointers, yet they are not necessarily pointer-aligned. LGTM=rsc R=golang-codereviews, dave, josharian, khr, rsc CC=golang-codereviews, iant, khr, rlh https://codereview.appspot.com/116060043
* cmd/5c, cmd/5g, cmd/5l, liblink: nacl/arm supportShenghou Ma2014-07-104-3/+11
| | | | | | | LGTM=dave, rsc R=rsc, iant, dave CC=golang-codereviews https://codereview.appspot.com/108360043
* build: annotations and modifications for c2goRuss Cox2014-07-026-8/+39
| The main changes fall into a few patterns:
|
| 1. Replace #define with enum.
| 2. Add /*c2go */ comment giving effect of #define. This is necessary for function-like #defines and non-enum-able #defined constants. (Not all compilers handle negative or large enums.)
| 3. Add extra braces in struct initializer. (c2go does not implement the full rules.)
|
| This is enough to let c2go typecheck the source tree. There may be more changes once it is doing other semantic analyses.
|
| LGTM=minux, iant R=minux, dave, iant CC=golang-codereviews
| https://codereview.appspot.com/106860045
* cmd/6g: treat vardef-initialized fat variables as live at callsRuss Cox2014-05-301-2/+55
| This CL forces the optimizer to preserve some memory stores that would be redundant except that a stack scan due to garbage collection or stack copying might look at them during a function call. As such, it forces additional memory writes and therefore slows down the execution of some programs, especially garbage-heavy programs that are already limited by memory bandwidth.
|
| The slowdown can be as much as 7% for end-to-end benchmarks.
|
| These numbers are from running go1.test -test.benchtime=5s three times, taking the best (lowest) ns/op for each benchmark. I am excluding benchmarks with time/op < 10us to focus on macro effects. All benchmarks are on amd64.
|
| Comparing tip (a27f34c771cb) against this CL on an Intel Core i5 MacBook Pro:
|
| benchmark                        old ns/op    new ns/op    delta
| BenchmarkBinaryTree17            3876500413   3856337341   -0.52%
| BenchmarkFannkuch11              2965104777   2991182127   +0.88%
| BenchmarkGobDecode               8563026      8788340      +2.63%
| BenchmarkGobEncode               5050608      5267394      +4.29%
| BenchmarkGzip                    431191816    434168065    +0.69%
| BenchmarkGunzip                  107873523    110563792    +2.49%
| BenchmarkHTTPClientServer        85036        86131        +1.29%
| BenchmarkJSONEncode              22143764     22501647     +1.62%
| BenchmarkJSONDecode              79646916     85658808     +7.55%
| BenchmarkMandelbrot200           4720421      4700108      -0.43%
| BenchmarkGoParse                 4651575      4712247      +1.30%
| BenchmarkRegexpMatchMedium_1K    71986        73490        +2.09%
| BenchmarkRegexpMatchHard_1K      111018       117495       +5.83%
| BenchmarkRevcomp                 648798723    659352759    +1.63%
| BenchmarkTemplate                112673009    112819078    +0.13%
|
| Comparing tip (a27f34c771cb) against this CL on an Intel Xeon E5520:
|
| BenchmarkBinaryTree17            5461110720   5393104469   -1.25%
| BenchmarkFannkuch11              4314677151   4327177615   +0.29%
| BenchmarkGobDecode               11065853     11235272     +1.53%
| BenchmarkGobEncode               6500065      6959837      +7.07%
| BenchmarkGzip                    647478596    671769097    +3.75%
| BenchmarkGunzip                  139348579    141096376    +1.25%
| BenchmarkHTTPClientServer        69376        73610        +6.10%
| BenchmarkJSONEncode              30172320     31796106     +5.38%
| BenchmarkJSONDecode              113704905    114239137    +0.47%
| BenchmarkMandelbrot200           6032730      6003077      -0.49%
| BenchmarkGoParse                 6775251      6405995      -5.45%
| BenchmarkRegexpMatchMedium_1K    111832       113895       +1.84%
| BenchmarkRegexpMatchHard_1K      161112       168420       +4.54%
| BenchmarkRevcomp                 876363406    892319935    +1.82%
| BenchmarkTemplate                146273096    148998339    +1.86%
|
| Just to get a sense of where we are compared to the previous release, here are the same benchmarks comparing Go 1.2 to this CL.
|
| Comparing Go 1.2 against this CL on an Intel Core i5 MacBook Pro:
|
| BenchmarkBinaryTree17            4370077662   3856337341   -11.76%
| BenchmarkFannkuch11              3347052657   2991182127   -10.63%
| BenchmarkGobDecode               8791384      8788340      -0.03%
| BenchmarkGobEncode               4968759      5267394      +6.01%
| BenchmarkGzip                    437815669    434168065    -0.83%
| BenchmarkGunzip                  94604099     110563792    +16.87%
| BenchmarkHTTPClientServer        87798        86131        -1.90%
| BenchmarkJSONEncode              22818243     22501647     -1.39%
| BenchmarkJSONDecode              97182444     85658808     -11.86%
| BenchmarkMandelbrot200           4733516      4700108      -0.71%
| BenchmarkGoParse                 5054384      4712247      -6.77%
| BenchmarkRegexpMatchMedium_1K    67612        73490        +8.69%
| BenchmarkRegexpMatchHard_1K      107321       117495       +9.48%
| BenchmarkRevcomp                 733270055    659352759    -10.08%
| BenchmarkTemplate                109304977    112819078    +3.21%
|
| Comparing Go 1.2 against this CL on an Intel Xeon E5520:
|
| BenchmarkBinaryTree17            5986953594   5393104469   -9.92%
| BenchmarkFannkuch11              4861139174   4327177615   -10.98%
| BenchmarkGobDecode               11830997     11235272     -5.04%
| BenchmarkGobEncode               6608722      6959837      +5.31%
| BenchmarkGzip                    661875826    671769097    +1.49%
| BenchmarkGunzip                  138630019    141096376    +1.78%
| BenchmarkHTTPClientServer        71534        73610        +2.90%
| BenchmarkJSONEncode              30393609     31796106     +4.61%
| BenchmarkJSONDecode              139645860    114239137    -18.19%
| BenchmarkMandelbrot200           5988660      6003077      +0.24%
| BenchmarkGoParse                 6974092      6405995      -8.15%
| BenchmarkRegexpMatchMedium_1K    111331       113895       +2.30%
| BenchmarkRegexpMatchHard_1K      165961       168420       +1.48%
| BenchmarkRevcomp                 995049292    892319935    -10.32%
| BenchmarkTemplate                145623363    148998339    +2.32%
|
| Fixes issue 8036.
|
| LGTM=khr R=golang-codereviews, josharian, khr CC=golang-codereviews, iant, r
| https://codereview.appspot.com/99660044
* cmd/gc: fix x=x crashRuss Cox2014-05-292-0/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | [Same as CL 102820043 except applied changes to 6g/gsubr.c also to 5g/gsubr.c and 8g/gsubr.c. The problem I had last night trying to do that was that 8g's copy of nodarg has different (but equivalent) control flow and I was pasting the new code into the wrong place.] Description from CL 102820043: The 'nodarg' function is used to obtain a Node* representing a function argument or result. It returned a brand new Node*, but that violates the guarantee in most places in the compiler that two Node*s refer to the same variable if and only if they are the same Node* pointer. Reestablish that invariant by making nodarg return a preexisting named variable if present. Having fixed that, avoid any copy during x=x in componentgen, because the VARDEF we emit before the copy marks the lhs x as dead incorrectly. The change in walk.c avoids modifying the result of nodarg. This was the only place in the compiler that did so. Fixes issue 8097. LGTM=khr R=golang-codereviews, khr CC=golang-codereviews, iant, khr, r https://codereview.appspot.com/103750043
* undo CL 102820043 / b0ce6dbafc18Russ Cox2014-05-281-8/+0
| Breaks 386 and arm builds. The obvious reason is that this CL only edited 6g/gsubr.c and failed to edit 5g/gsubr.c and 8g/gsubr.c. However, the obvious CL applying the same edit to those files (CL 101900043) causes mysterious build failures in various of the standard package tests, usually involving reflect. Something deep and subtle is broken but only on the 32-bit systems.
|
| Undo this CL for now.
|
| ««« original CL description
| cmd/gc: fix x=x crash
|
| The 'nodarg' function is used to obtain a Node* representing a function argument or result. It returned a brand new Node*, but that violates the guarantee in most places in the compiler that two Node*s refer to the same variable if and only if they are the same Node* pointer. Reestablish that invariant by making nodarg return a preexisting named variable if present.
|
| Having fixed that, avoid any copy during x=x in componentgen, because the VARDEF we emit before the copy marks the lhs x as dead incorrectly.
|
| The change in walk.c avoids modifying the result of nodarg. This was the only place in the compiler that did so.
|
| Fixes issue 8097.
|
| LGTM=r, khr R=golang-codereviews, r, khr CC=golang-codereviews, iant
| https://codereview.appspot.com/102820043
| »»»
|
| TBR=r CC=golang-codereviews, khr
| https://codereview.appspot.com/95660043
* cmd/gc: fix x=x crashRuss Cox2014-05-281-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | The 'nodarg' function is used to obtain a Node* representing a function argument or result. It returned a brand new Node*, but that violates the guarantee in most places in the compiler that two Node*s refer to the same variable if and only if they are the same Node* pointer. Reestablish that invariant by making nodarg return a preexisting named variable if present. Having fixed that, avoid any copy during x=x in componentgen, because the VARDEF we emit before the copy marks the lhs x as dead incorrectly. The change in walk.c avoids modifying the result of nodarg. This was the only place in the compiler that did so. Fixes issue 8097. LGTM=r, khr R=golang-codereviews, r, khr CC=golang-codereviews, iant https://codereview.appspot.com/102820043
* cmd/gc: correct handling of globals, func args, resultsRuss Cox2014-05-151-42/+22
| Globals, function arguments, and results are special cases in registerization.
|
| Globals must be flushed aggressively, because nearly any operation can cause a panic, and the recovery code must see the latest values. Globals also must be loaded aggressively, because nearly any store through a pointer might be updating a global: the compiler cannot see all the "address of" operations on globals, especially exported globals. To accomplish this, mark all globals as having their address taken, which effectively disables registerization.
|
| If a function contains a defer statement, the function results must be flushed aggressively, because nearly any operation can cause a panic, and the deferred code may call recover, causing the original function to return the current values of its function results. To accomplish this, mark all function results as having their address taken if the function contains any defer statements. This causes not just aggressive flushing but also aggressive loading. The aggressive loading is overkill but the best we can do in the current code.
|
| Function arguments must be considered live at all safe points in a function, because garbage collection always preserves them: they must be up-to-date in order to be preserved correctly. Accomplish this by marking them live at all call sites. An earlier attempt at this marked function arguments as having their address taken, which disabled registerization completely, making programs slower. This CL's solution allows registerization while preserving safety. The benchmark speedup is caused by being able to registerize again (the earlier CL lost the same amount).
|
| benchmark               old ns/op   new ns/op   delta
| BenchmarkEqualPort32    61.4        56.0        -8.79%
|
| benchmark               old MB/s    new MB/s    speedup
| BenchmarkEqualPort32    521.56      570.97      1.09x
|
| Fixes issue 1304. (again)
| Fixes issue 7944. (again)
| Fixes issue 7984.
| Fixes issue 7995.
|
| LGTM=khr R=golang-codereviews, khr CC=golang-codereviews, iant, r
| https://codereview.appspot.com/97500044
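The language behavior behind the defer rule can be seen in a small Go program. This is only an illustration of why results must be up to date in memory at any point that can panic; the function name `result` is made up for the example.

```go
package main

import "fmt"

// result assigns to its named result and then panics. The deferred function
// recovers, so the caller receives whatever n held at the time of the panic.
func result() (n int) {
	defer func() { _ = recover() }()
	n = 1
	var p *int
	n = 2
	n += *p // nil dereference panics before n is updated again
	return n
}

func main() {
	fmt.Println(result()) // prints 2
}
```

If the store `n = 2` were kept only in a register, the recovered return would observe a stale value, which is why results are flushed aggressively in functions that contain defer.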
* cmd/gc: fix liveness vs regopt mismatch for input variablesRuss Cox2014-05-122-8/+14
| The inputs to a function are marked live at all times in the liveness bitmaps, so that the garbage collector will not free the things they point at and reuse the pointers, so that the pointers shown in stack traces are guaranteed not to have been recycled.
|
| Unfortunately, no one told the register optimizer that the inputs need to be preserved at all call sites. If a function is done with a particular input value, the optimizer will stop preserving it across calls. For single-word values this just means that the value recorded might be stale. For multi-word values like slices, the value recorded could be only partially stale: it can happen that, say, the cap was updated but not the len, or that the len was updated but not the base pointer. Either of these possibilities (and others) would make the garbage collector misinterpret memory, leading to memory corruption.
|
| This came up in a real program, in which the garbage collector's 'slice len ≤ slice cap' check caught the inconsistency.
|
| Fixes issue 7944.
|
| LGTM=iant R=golang-codereviews, iant CC=golang-codereviews, khr
| https://codereview.appspot.com/100370045
* cmd/gc: alias more variables during register allocationJosh Bleecher Snyder2014-05-122-1/+3
| This is joint work with Daniel Morsing.
|
| In order for the register allocator to alias two variables, they must have the same width, stack offset, and etype. Code generation was altering a variable's etype in a few places. This prevented the variable from being moved to a register, which in turn prevented peephole optimization. This failure to alias was very common, with almost 23,000 instances just running make.bash.
|
| This phenomenon was not visible in the register allocation debug output because the variables that failed to alias had the same name. The debugging-only change to bits.c fixes this by printing the variable number with its name.
|
| This CL fixes the source of all etype mismatches for 6g, all but one case for 8g, and depressingly few cases for 5g. (I believe that extending CL 6819083 to 5g is a prerequisite.) Fixing the remaining cases in 8g and 5g is work for the future.
|
| The etype mismatch fixes are:
|
|   * [gc] Slicing changed the type of the base pointer into a uintptr in order to perform arithmetic on it. Instead, support addition directly on pointers.
|   * [*g] OSPTR was giving type uintptr to slice base pointers; undo that. This arose, for example, while compiling copy(dst, src).
|   * [8g] 64 bit float conversion was assigning int64 type during codegen, overwriting the existing uint64 type.
|
| Note that some etype mismatches are appropriate, such as a struct with a single field or an array with a single element.
|
| With these fixes, the number of registerizations that occur while running make.bash for 6g increases ~10%. Hello world binary size shrinks ~1.5%. Running all benchmarks in the standard library shows performance improvements ranging from nominal to substantive (>10%); a full comparison using 6g on my laptop is available at https://gist.github.com/josharian/8f9b5beb46667c272064. The microbenchmarks must be taken with a grain of salt; see issue 7920. The few benchmarks that show real regressions are likely due to issue 7920. I manually examined the generated code for the top few regressions and none had any assembly output changes. The few benchmarks that show extraordinary improvements are likely also due to issue 7920.
|
| Performance results from 8g appear similar to 6g.
|
| 5g shows no performance improvements. This is not surprising, given the discussion above.
|
| Update issue 7316
|
| LGTM=rsc R=rsc, daniel.morsing, bradfitz CC=dave, golang-codereviews
| https://codereview.appspot.com/91850043
|
| Committer: Russ Cox <rsc@golang.org>
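The aliasing condition described in this commit (same width, stack offset, and etype) can be sketched in Go as follows. The `variable` record and `canAlias` helper are hypothetical simplifications for illustration; the compiler's real data structures are C and carry more state.

```go
package main

import "fmt"

// variable is a hypothetical, simplified record of what the allocator compares.
type variable struct {
	name   string
	offset int64  // stack offset
	width  int64  // size in bytes
	etype  string // element type, e.g. "ptr" or "uintptr"
}

// canAlias reports whether two references may be treated as one variable.
// Note that the name plays no part, which is why same-named references
// that failed to alias were hard to spot in the debug output.
func canAlias(a, b variable) bool {
	return a.offset == b.offset && a.width == b.width && a.etype == b.etype
}

func main() {
	base := variable{name: "x", offset: 16, width: 8, etype: "ptr"}
	retyped := variable{name: "x", offset: 16, width: 8, etype: "uintptr"}
	fmt.Println(canAlias(base, base))    // same etype: may alias
	fmt.Println(canAlias(base, retyped)) // etype altered by codegen: no alias
}
```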
* cmd/gc: don't give credit for NOPs during register allocationJosh Bleecher Snyder2014-05-091-11/+13
| | | | | | | | | | | | | | | The register allocator decides which variables should be placed into registers by charging for each load/store and crediting for each use, and then selecting an allocation with minimal cost. NOPs will be eliminated, however, so using a variable in a NOP should not generate credit. Issue 7867 arises from attempted registerization of multi-word variables because they are used in NOPs. By not crediting for that use, they will no longer be considered for registerization. This fix could theoretically lead to better register allocation, but NOPs are rare relative to other instructions. Fixes issue 7867. LGTM=rsc R=rsc CC=golang-codereviews https://codereview.appspot.com/94810044
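The charge-and-credit idea can be sketched as a toy scoring pass. This is a deliberately simplified model under stated assumptions: unit weights, string opcodes, and the `instr` and `netBenefit` names are all invented for the example, and the real allocator's accounting is more involved.

```go
package main

import "fmt"

// instr is a hypothetical instruction that touches one variable.
type instr struct {
	op string // "USE", "LOAD", "STORE", or "NOP"
	v  string // variable name
}

// netBenefit credits each real use of a variable and charges each load or
// store. A NOP earns no credit, since it will be eliminated later.
func netBenefit(prog []instr) map[string]int {
	benefit := make(map[string]int)
	for _, in := range prog {
		switch in.op {
		case "USE":
			benefit[in.v]++
		case "LOAD", "STORE":
			benefit[in.v]--
		case "NOP":
			benefit[in.v] += 0 // record the variable, but give no credit
		}
	}
	return benefit
}

func main() {
	prog := []instr{
		{"LOAD", "x"}, {"USE", "x"}, {"USE", "x"},
		{"NOP", "wide"}, {"NOP", "wide"},
	}
	b := netBenefit(prog)
	fmt.Println(b["x"], b["wide"]) // x has a net credit; wide has none
}
```

Under this model a variable referenced only by NOPs never accumulates credit, so it is not selected for registerization.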
* runtime: use duff zero and copy to initialize memoryKeith Randall2014-05-075-30/+141
| benchmark                old ns/op   new ns/op   delta
| BenchmarkCopyFat512      1307        329         -74.83%
| BenchmarkCopyFat256      666         169         -74.62%
| BenchmarkCopyFat1024     2617        671         -74.36%
| BenchmarkCopyFat128      343         89.0        -74.05%
| BenchmarkCopyFat64       182         48.9        -73.13%
| BenchmarkCopyFat32       103         28.8        -72.04%
| BenchmarkClearFat128     102         46.6        -54.31%
| BenchmarkClearFat512     344         167         -51.45%
| BenchmarkClearFat64      50.5        26.5        -47.52%
| BenchmarkClearFat256     147         87.2        -40.68%
| BenchmarkClearFat32      22.7        16.4        -27.75%
| BenchmarkClearFat1024    511         662         +29.55%
|
| Fixes issue 7624
|
| LGTM=rsc R=golang-codereviews, khr, bradfitz, josharian, dave, rsc CC=golang-codereviews
| https://codereview.appspot.com/92760044
* cmd/5g, cmd/6g, cmd/8g: preserve wide values in large functionsRuss Cox2014-04-161-0/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | In large functions with many variables, the register optimizer may give up and choose not to track certain variables at all. In this case, the "nextinnode" information linking together all the words from a given variable will be incomplete, and the result may be that only some of a multiword value is preserved across a call. That confuses the garbage collector, so don't do that. Instead, mark those variables as having their address taken, so that they will be preserved at all calls. It's overkill, but correct. Tested by hand using the 6g -S output to see that it does fix the buggy generated code leading to the issue 7726 failure. There is no automated test because I managed to break the compiler while writing a test (see issue 7727). I will check in a test along with the fix to issue 7727. Fixes issue 7726. LGTM=khr R=khr, bradfitz, dave CC=golang-codereviews https://codereview.appspot.com/85200043
* cmd/5g, cmd/8g: fix buildRuss Cox2014-04-011-6/+6
| | | | | | | | Botched during CL 83090046. TBR=khr CC=golang-codereviews https://codereview.appspot.com/83070046
* cmd/gc: shorten more temporary lifetimesRuss Cox2014-04-011-1/+3
| 1. In functions with heap-allocated result variables or with defer statements, the return sequence requires more than just a single RET instruction. There is an optimization that arranges for all returns to jump to a single copy of the return epilogue in this case. Unfortunately, that optimization is fundamentally incompatible with PC-based liveness information: it takes PCs at many different points in the function and makes them all land at one PC, making the combined liveness information at that target PC a mess. Disable this optimization, so that each return site gets its own copy of the 'call deferreturn' and the copying of result variables back from the heap. This removes quite a few spurious 'ambiguously live' variables.
|
| 2. Let orderexpr allocate temporaries that are passed by address to a function call and then die on return, so that we can arrange an appropriate VARKILL.
|    2a. Do this for ... slices.
|    2b. Do this for closure structs.
|    2c. Do this for runtime.concatstring, which is the implementation of large string additions. Change representation of OADDSTR to an explicit list in typecheck to avoid reconstructing list in both walk and order.
|
| 3. Let orderexpr allocate the temporary variable copies used for range loops, so that they can be killed when the loop is over. Similarly, let it allocate the temporary holding the map iterator.
|
| CL 81940043 reduced the number of ambiguously live temps in the godoc binary from 860 to 711. This CL reduces the number to 121. Still more to do, but another good checkpoint.
|
| Update issue 7345
|
| LGTM=khr R=khr CC=golang-codereviews
| https://codereview.appspot.com/83090046
* cmd/gc: shorten temporary lifetimes when possibleRuss Cox2014-04-014-4/+6
| The new channel and map runtime routines take pointers to values, typically temporaries. Without help, the compiler cannot tell when those temporaries stop being needed, because it isn't sure what happened to the pointer. Arrange to insert explicit VARKILL instructions for these temporaries so that the liveness analysis can avoid seeing them as "ambiguously live".
|
| The change is made in order.c, which was already in charge of introducing temporaries to preserve the order-of-evaluation guarantees. Now its job has expanded to include introducing temporaries as needed by runtime routines, and then also inserting the VARKILL annotations for all these temporaries, so that their lifetimes can be shortened.
|
| In order to do its job for the map runtime routines, order.c arranges that all map lookups or map assignments have the form:
|
|     x = m[k]
|     x, y = m[k]
|     m[k] = x
|
| where x, y, and k are simple variables (often temporaries). Likewise, receiving from a channel is now always:
|
|     x = <-c
|
| In order to provide the map guarantee, order.c is responsible for rewriting x op= y into x = x op y, so that m[k] += z becomes
|
|     t = m[k]
|     t2 = t + z
|     m[k] = t2
|
| While here, fix a few bugs in order.c's traversal: it was failing to walk into select and switch case bodies, so order of evaluation guarantees were not preserved in those situations. Added tests to test/reorder2.go.
|
| Fixes issue 7671.
|
| In gc/popt's temporary-merging optimization, allow merging of temporaries with their address taken as long as the liveness ranges do not intersect. (There is a good chance of that now that we have VARKILL annotations to limit the liveness range.)
|
| Explicitly killing temporaries cuts the number of ambiguously live temporaries that must be zeroed in the godoc binary from 860 to 711, or -17%. There is more work to be done, but this is a good checkpoint.
|
| Update issue 7345
|
| LGTM=khr R=khr CC=golang-codereviews
| https://codereview.appspot.com/81940043
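The rewrite of m[k] += z into three simple statements can be checked at the source level with a short Go program. The function names are made up for the example; the point is only that the expanded form, in which every map operation stands alone with simple operands, computes the same result as the compound assignment.

```go
package main

import "fmt"

// addAssign is the form a programmer writes.
func addAssign(m map[string]int, k string, z int) {
	m[k] += z
}

// addAssignOrdered spells out the expanded form: one map lookup into a
// temporary, the arithmetic on temporaries, then one map assignment.
func addAssignOrdered(m map[string]int, k string, z int) {
	t := m[k]
	t2 := t + z
	m[k] = t2
}

func main() {
	a := map[string]int{"a": 1}
	b := map[string]int{"a": 1}
	addAssign(a, "a", 41)
	addAssignOrdered(b, "a", 41)
	fmt.Println(a["a"], b["a"]) // prints 42 42
}
```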
* cmd/gc: liveness-related bug fixes (Russ Cox, 2014-03-27, 2 files, -16/+117)
  1. On entry to a function, only zero the ambiguously live stack variables. Before, we were zeroing all stack variables containing pointers. The zeroing is pretty inefficient right now (issue 7624), but there are also too many stack variables detected as ambiguously live (issue 7345), and that must be addressed before deciding how to improve the zeroing code. (Changes in 5g/ggen.c, 6g/ggen.c, 8g/ggen.c, gc/pgen.c.) Fixes issue 7647.
  2. Make the regopt word-based liveness analysis preserve the whole-variable liveness property expected by the garbage collection bitmap liveness analysis. That is, if the regopt liveness decides that one word in a struct needs to be preserved, make sure it preserves the entire struct. This is particularly important for multiword values such as strings, slices, and interfaces, in which all the words need to be present in order to understand the meaning. (Changes in 5g/reg.c, 6g/reg.c, 8g/reg.c.) Fixes issue 7591.
  3. Make the regopt word-based liveness analysis treat a variable as having its address taken - which makes it preserved across all future calls - whenever n->addrtaken is set, for consistency with the gc bitmap liveness analysis, even if there is no machine instruction actually taking the address. In this case n->addrtaken is incorrect (a nicer way to put it is overconservative), and ideally there would be no such cases, but they can happen and the two analyses need to agree. (Changes in 5g/reg.c, 6g/reg.c, 8g/reg.c; test in bug484.go.) Fixes crashes found by turning off "zero everything" in step 1.
  4. Remove spurious VARDEF annotations. As the comment in gc/pgen.c explains, the VARDEF must immediately precede the initialization. It cannot be too early, and it cannot be too late. In particular, if a function call sits between the VARDEF and the actual machine instructions doing the initialization, the variable will be treated as live during that function call even though it is uninitialized, leading to problems. (Changes in gc/gen.c; test in live.go.) Fixes crashes found by turning off "zero everything" in step 1.
  5. Do not treat loading the address of a wide value as a signal that the value must be initialized. Instead depend on the existence of a VARDEF or the first actual read/write of a word in the value. If the load is in order to pass the address to a function that does the actual initialization, treating the load as an implicit VARDEF causes the same problems as described in step 4. The alternative is to arrange to zero every such value before passing it to the real initialization function, but this is a much easier and more efficient change. (Changes in gc/plive.c.) Fixes crashes found by turning off "zero everything" in step 1.
  6. Treat wide input parameters with their address taken as initialized on entry to the function. Otherwise they look "ambiguously live" and we will try to emit code to zero them. (Changes in gc/plive.c.) Fixes crashes found by turning off "zero everything" in step 1.
  7. An array of length 0 has no pointers, even if the element type does. Without this change, the zeroing code complains when asked to clear a 0-length array. (Changes in gc/reflect.c.)
  LGTM=khr
  R=khr
  CC=golang-codereviews
  https://codereview.appspot.com/80160044
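The whole-variable property in item 2 can be illustrated with a small Go sketch. This is not the code in reg.c; the `variable` type, the `widenLiveness` function, and the word layout are hypothetical names chosen for illustration. It assumes liveness is tracked as one boolean per stack word and widens any partially live variable to cover all of its words.

```go
package main

import "fmt"

// variable describes a stack variable as a run of machine words.
// Hypothetical type for illustration; not the compiler's data structure.
type variable struct {
	name  string
	start int // index of the first word
	width int // number of words
}

// widenLiveness returns a copy of live in which every variable that has
// at least one live word has all of its words marked live.
func widenLiveness(vars []variable, live []bool) []bool {
	out := make([]bool, len(live))
	copy(out, live)
	for _, v := range vars {
		partlyLive := false
		for i := v.start; i < v.start+v.width; i++ {
			if live[i] {
				partlyLive = true
				break
			}
		}
		if partlyLive {
			for i := v.start; i < v.start+v.width; i++ {
				out[i] = true
			}
		}
	}
	return out
}

func main() {
	vars := []variable{
		{"s", 0, 2}, // a string: pointer word and length word
		{"n", 2, 1}, // a single-word integer
		{"b", 3, 3}, // a slice: pointer, length, capacity
	}
	// Only the length word of s was found live by the word-based analysis.
	live := []bool{false, true, false, false, false, false}
	fmt.Println(widenLiveness(vars, live))
}
```

In this sketch the pointer word of `s` becomes live along with its length word, while `n` and `b` stay dead because none of their words were live.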
* cmd/gc, cmd/5g, cmd/6g, cmd/8g: introduce linkarchinit and add amd64p32 support (Dave Cheney, 2014-03-07, 1 file, -0/+5)
  Replaces CL 70000043.
  Introduce linkarchinit() from cmd/ld.
  For cmd/6g, switch to the amd64p32 linker model if we are building under nacl/amd64p32.
  LGTM=rsc
  R=rsc
  CC=golang-codereviews
  https://codereview.appspot.com/71330045
* cmd/ld: clear unused ctxt before morestack (Russ Cox, 2014-03-04, 1 file, -0/+2)
  For non-closure functions, the context register is uninitialized on entry and will not be used, but morestack saves it and then the garbage collector treats it as live. This can be a source of memory leaks if the context register points at otherwise dead memory. Avoid this by introducing a parallel set of morestack functions that clear the context register, and use those for the non-closure functions.
  I hope this will help with some of the finalizer flakiness, but it probably won't.
  Fixes issue 7244.
  LGTM=dvyukov
  R=khr, dvyukov
  CC=golang-codereviews
  https://codereview.appspot.com/71030044
* 5g, 8g: remove dead code (Josh Bleecher Snyder, 2014-02-25, 2 files, -5/+1)
  maxstksize is superfluous and appears to be vestigial. 6g does not use it.
  c >= 4 cannot occur, because c = w % 4.
  LGTM=rsc
  R=rsc
  CC=golang-codereviews
  https://codereview.appspot.com/68750043
* all: merge NaCl branch (part 1) (Dave Cheney, 2014-02-25, 1 file, -0/+1)
  See golang.org/s/go13nacl for design overview.
  This CL is the mostly mechanical changes from rsc's Go 1.2 based NaCl branch, specifically 39cb35750369 to 500771b477cf from https://code.google.com/r/rsc-go13nacl. This CL does not include working NaCl support; there are probably two or three more large merges to come.
  CL 15750044 is not included as it involves more invasive changes to the linker which will need to be merged separately.
  The exact change lists included are:
  15050047: syscall: support for Native Client
  15360044: syscall: unzip implementation for Native Client
  15370044: syscall: Native Client SRPC implementation
  15400047: cmd/dist, cmd/go, go/build, test: support for Native Client
  15410048: runtime: support for Native Client
  15410049: syscall: file descriptor table for Native Client
  15410050: syscall: in-memory file system for Native Client
  15440048: all: update +build lines for Native Client port
  15540045: cmd/6g, cmd/8g, cmd/gc: support for Native Client
  15570045: os: support for Native Client
  15680044: crypto/..., hash/crc32, reflect, sync/atomic: support for amd64p32
  15690044: net: support for Native Client
  15690048: runtime: support for fake time like on Go Playground
  15690051: build: disable various tests on Native Client
  LGTM=rsc
  R=rsc
  CC=golang-codereviews
  https://codereview.appspot.com/68150047
  Committer: Russ Cox <rsc@golang.org>
* cmd/5g, cmd/8g: zero ambiguously live values on entry (Russ Cox, 2014-02-19, 1 file, -0/+38)
  The code here is being restored after its deletion in CL 14430048.
  I restored the copy in cmd/6g in CL 56430043 but neglected the other two.
  This is the reason that enabling precisestack only worked on amd64.
  LGTM=r
  R=r
  CC=golang-codereviews
  https://codereview.appspot.com/66170043
* cmd/gc: correct liveness for fat variables (Russ Cox, 2014-02-15, 4 files, -9/+28)
  The VARDEF placement must be before the initialization but after any final use. If you have something like
      s = ... using s ...
  the rhs must be evaluated, then the VARDEF, then the lhs assigned.
  There is a large comment in pgen.c on gvardef explaining this in more detail.
  This CL also includes Ian's suggestions from earlier CLs, namely commenting the use of mode in link.h and fixing the precedence of the ~r check in dcl.c.
  This CL enables the check that if liveness analysis decides a variable is live on entry to the function, that variable must be a function parameter (not a result, and not a local variable). If this check fails, it indicates a bug in the liveness analysis or in the generated code being analyzed.
  The race detector generates invalid code for append(x, y...). The code declares a temporary t and then uses cap(t) before initializing t. The new liveness check catches this bug and stops the compiler from writing out the buggy code. Consequently, this CL disables the race detector tests in run.bash until the race detector bug can be fixed (golang.org/issue/7334).
  Except for the race detector bug, the liveness analysis check does not detect any problems (this CL and the previous CLs fixed all the detected problems).
  The net test still fails with GOGC=0 but the rest of the tests now pass or time out (because GOGC=0 is so slow).
  TBR=iant
  CC=golang-codereviews
  https://codereview.appspot.com/64170043
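The entry check described in this message (a variable live on entry must be a parameter) can be sketched in Go as follows. This is an illustration only, not the compiler's implementation; `varClass`, `variable`, and `checkEntryLiveness` are hypothetical names, and the liveness result is assumed to be given as one boolean per variable.

```go
package main

import "fmt"

// varClass says where a variable comes from. Hypothetical for illustration.
type varClass int

const (
	param varClass = iota
	result
	local
)

type variable struct {
	name string
	kind varClass
}

// checkEntryLiveness returns the names of variables that the analysis
// reports as live on entry although they are not function parameters.
// A non-empty result points at a bug in the analysis or in the code
// being analyzed.
func checkEntryLiveness(vars []variable, liveAtEntry []bool) []string {
	var bad []string
	for i, v := range vars {
		if liveAtEntry[i] && v.kind != param {
			bad = append(bad, v.name)
		}
	}
	return bad
}

func main() {
	vars := []variable{{"x", param}, {"y", param}, {"t", local}}
	// Models the append(x, y...) case from the message: the temporary t
	// is read (cap(t)) before it is initialized, so it looks live on entry.
	fmt.Println(checkEntryLiveness(vars, []bool{true, true, true}))
}
```

In this sketch only the temporary `t` is reported, since `x` and `y` are parameters and are allowed to be live on entry.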
* cmd/gc: handle variable initialization by block move in liveness (Russ Cox, 2014-02-13, 1 file, -0/+12)
  Any initialization of a variable by a block copy or block zeroing or by multiple assignments (componentwise copying or zeroing of a multiword variable) needs to emit a VARDEF. These cases were not.
  Fixes issue 7205.
  TBR=iant
  CC=golang-codereviews
  https://codereview.appspot.com/63650044
* cmd/5g, cmd/8g: fix build (Russ Cox, 2014-02-13, 1 file, -2/+1)
  The test added in CL 63630043 fails on 5g and 8g because they were not emitting the VARDEF instruction when clearing a fat value by clearing the components. 6g had the call in the right place.
  Hooray tests.
  TBR=iant
  CC=golang-codereviews
  https://codereview.appspot.com/63660043
* cmd/gc: rename AFATVARDEF to AVARDEF (Russ Cox, 2014-02-13, 3 files, -3/+3)
  The "fat" referred to being used for multiword values only. We're going to use it for non-fat values sometimes too.
  No change other than the renaming.
  TBR=iant
  CC=golang-codereviews
  https://codereview.appspot.com/63650043
* cmd/5g: fix regopt bug in copyprop (Russ Cox, 2014-02-13, 2 files, -25/+69)
  copyau1 was assuming that it could deduce the type of the middle register p->reg from the type of the left or right argument: in CMPF F1, F2, the p->reg==2 must be a D_FREG because p->from is F1, and in CMP R1, R2, the p->reg==2 must be a D_REG because p->from is R1.
  This heuristic fails for CMP $0, R2, which was causing copyau1 not to recognize p->reg==2 as a reference to R2, which was keeping it from properly renaming the register use when substituting registers.
  cmd/5c has the right approach: look at the opcode p->as to decide the kind of register. It is unclear where 5g's copyau1 came from; perhaps it was an attempt to avoid expanding 5c's a2type to include new instructions used only by 5g.
  Copy a2type from cmd/5c, expand to include additional instructions, and make it crash the compiler if asked about an instruction it does not understand (avoid silent bugs in the future if new instructions are added).
  Should fix current arm build breakage.
  While we're here, fix the print statements dumping the pred and succ info in the asm listing to pass an int arg to %.4ud (Prog.pc is a vlong now, due to the liblink merge).
  TBR=ken2
  CC=golang-codereviews
  https://codereview.appspot.com/62730043
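The opcode-based approach described above can be sketched in Go. This is a simplified illustration, not a2type itself: the `opcode` and `regKind` types and the `middleRegKind` function are hypothetical, and only the two opcodes named in the message (CMP and CMPF) are covered. The point it shows is that the register kind comes from the opcode alone, so an immediate operand such as $0 cannot mislead it, and that an unknown opcode stops the program instead of producing a silent guess.

```go
package main

import "fmt"

// opcode and regKind are hypothetical stand-ins for p->as and the
// D_REG / D_FREG distinction.
type opcode string

type regKind int

const (
	intReg regKind = iota
	floatReg
)

// middleRegKind decides the kind of the middle register from the opcode
// only, never from the other operands. It panics on an opcode it does
// not know, so a newly added instruction cannot be mishandled silently.
func middleRegKind(op opcode) regKind {
	switch op {
	case "CMP":
		return intReg
	case "CMPF":
		return floatReg
	}
	panic(fmt.Sprintf("middleRegKind: unknown opcode %q", op))
}

func main() {
	// CMP $0, R2: the first operand is a constant, yet the opcode still
	// says the middle register is an integer register.
	fmt.Println(middleRegKind("CMP") == intReg)
	fmt.Println(middleRegKind("CMPF") == floatReg)
}
```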
* cmd/5g: fix print format in peephole debugging (Anthony Martin, 2014-02-12, 1 file, -2/+2)
  Fixes issue 7294.
  LGTM=minux.ma, dave, bradfitz
  R=golang-codereviews, minux.ma, dave, bradfitz
  CC=golang-codereviews
  https://codereview.appspot.com/61370043
* cmd/cc, cmd/gc, cmd/ld: consolidate print format routines (Anthony Martin, 2014-02-12, 6 files, -359/+7)
  We now use the %A, %D, %P, and %R routines from liblink across the board.
  Fixes issue 7178.
  Fixes issue 7055.
  LGTM=iant
  R=golang-codereviews, gobot, rsc, dave, iant, remyoudompheng
  CC=golang-codereviews
  https://codereview.appspot.com/49170043
  Committer: Russ Cox <rsc@golang.org>
* cmd/6g, cmd/8g, cmd/5g: make the undefined instruction have no successors (Daniel Morsing, 2014-02-11, 1 file, -1/+1)
  The UNDEF instruction was listed in the instruction data as having the next instruction in the stream as its successor. This confused the optimizer into adding a load where it wasn't needed, in turn confusing the liveness analysis pass for GC bitmaps into thinking that the variable was live.
  Fixes issue 7229.
  LGTM=iant, rsc
  R=golang-codereviews, bradfitz, iant, dave, rsc
  CC=golang-codereviews
  https://codereview.appspot.com/56910045
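The effect of giving UNDEF no successors can be shown with a small Go sketch of a flow graph. This is a deliberately simplified model, not the compilers' instruction tables: `instr`, `successors`, and `reachable` are hypothetical names, and the model assumes straight-line code with no branches.

```go
package main

import "fmt"

// instr is a hypothetical instruction record; only the opcode matters here.
type instr struct {
	op string
}

// successors returns the indices of instructions that may run after
// prog[i]. UNDEF never falls through, so it has no successors.
func successors(prog []instr, i int) []int {
	if prog[i].op == "UNDEF" || i+1 >= len(prog) {
		return nil
	}
	return []int{i + 1}
}

// reachable marks every instruction reachable from the first one by
// following successor edges.
func reachable(prog []instr) []bool {
	seen := make([]bool, len(prog))
	if len(prog) == 0 {
		return seen
	}
	work := []int{0}
	for len(work) > 0 {
		i := work[len(work)-1]
		work = work[:len(work)-1]
		if seen[i] {
			continue
		}
		seen[i] = true
		work = append(work, successors(prog, i)...)
	}
	return seen
}

func main() {
	prog := []instr{{"MOVW"}, {"UNDEF"}, {"MOVW"}}
	// The MOVW after UNDEF is not reachable by fall-through, so a later
	// analysis has no reason to keep values alive for it.
	fmt.Println(reachable(prog))
}
```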
* cmd/gc: bypass DATA instruction for data initialized to integer constant (Russ Cox, 2013-12-20, 1 file, -21/+0)
  Eventually we will want to bypass DATA for everything, but the relocations are not standardized well enough across architectures to make that possible.
  This did not help as much as I expected, but it is definitely better. It shaves maybe 1-2% off all.bash depending on how much you trust the timings of a single run:
  Before: 241.139r 362.702u 112.967s
  After: 234.339r 359.623u 111.045s
  R=golang-codereviews, gobot, r, iant
  CC=golang-codereviews
  https://codereview.appspot.com/44650043