summaryrefslogtreecommitdiff
path: root/compiler/nativeGen/X86
Commit message (Collapse)AuthorAgeFilesLines
* A bunch of typofixesGabor Greif2017-09-261-1/+1
|
* compiler: introduce custom "GhcPrelude" PreludeHerbert Valerio Riedel2017-09-196-0/+12
| | | | | | | | | | | | | | | | | | This switches the compiler/ component to get compiled with -XNoImplicitPrelude and a `import GhcPrelude` is inserted in all modules. This is motivated by the upcoming "Prelude" re-export of `Semigroup((<>))` which would cause lots of name clashes in every modulewhich imports also `Outputable` Reviewers: austin, goldfire, bgamari, alanz, simonmar Reviewed By: bgamari Subscribers: goldfire, rwbarton, thomie, mpickering, bgamari Differential Revision: https://phabricator.haskell.org/D3989
* nativeGen: Consistently use blockLbl to generate CLabels from BlockIdsBen Gamari2017-09-192-6/+6
| | | | | | | | | | | | | | | This fixes #14221, where the NCG and the DWARF code were apparently giving two different names to the same block. Test Plan: Validate with DWARF support enabled. Reviewers: simonmar, austin Subscribers: rwbarton, thomie GHC Trac Issues: #14221 Differential Revision: https://phabricator.haskell.org/D3977
* Fix typos in diagnostics, testsuite and commentsGabor Greif2017-09-071-1/+1
|
* Add debugPprTypeSimon Peyton Jones2017-08-311-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | We pretty-print a type by converting it to an IfaceType and pretty-printing that. But (a) that's a bit indirect, and (b) delibrately loses information about (e.g.) the kind on the /occurrences/ of a type variable So this patch implements debugPprType, which pretty prints the type directly, with no fancy formatting. It's just used for debugging. I took the opportunity to refactor the debug-pretty-printing machinery a little. In particular, define these functions and use them: ifPprDeubug :: SDoc -> SDOc -> SDoc -- Says what to do with and without -dppr-debug whenPprDebug :: SDoc -> SDoc -- Says what to do with -dppr-debug; without is empty getPprDebug :: (Bool -> SDoc) -> SDoc getPprDebug used to be called sdocPprDebugWith whenPprDebug used to be called ifPprDebug So a lot of files get touched in a very mechanical way
* nativeGen: Don't index into linked listsBen Gamari2017-08-291-4/+6
| | | | | | | | | | | There were a couple places where we indexed into linked lists of register names. Replace these with arrays. Reviewers: austin Subscribers: rwbarton, thomie Differential Revision: https://phabricator.haskell.org/D3893
* Add support for producing position-independent executablesBen Gamari2017-08-221-2/+2
| | | | | | | | | | | | | | | | Previously due to #12759 we disabled PIE support entirely. However, this breaks the user's ability to produce PIEs. Add an explicit flag, -fPIE, allowing the user to build PIEs. Test Plan: Validate Reviewers: rwbarton, austin, simonmar Subscribers: trommler, simonmar, trofi, jrtc27, thomie GHC Trac Issues: #12759, #13702 Differential Revision: https://phabricator.haskell.org/D3589
* Drop GHC 7.10 compatibilityRyan Scott2017-08-011-2/+0
| | | | | | | | | | | | | | | | GHC 8.2.1 is out, so now GHC's support window only extends back to GHC 8.0. This means we can delete gobs of code that was only used for GHC 7.10 support. Hooray! Test Plan: ./validate Reviewers: hvr, bgamari, austin, goldfire, simonmar Reviewed By: bgamari Subscribers: Phyx, rwbarton, thomie Differential Revision: https://phabricator.haskell.org/D3781
* Hoopl: remove dependency on Hoopl packageMichal Terepeta2017-06-233-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This copies the subset of Hoopl's functionality needed by GHC to `cmm/Hoopl` and removes the dependency on the Hoopl package. The main motivation for this change is the confusing/noisy interface between GHC and Hoopl: - Hoopl has `Label` which is GHC's `BlockId` but different than GHC's `CLabel` - Hoopl has `Unique` which is different than GHC's `Unique` - Hoopl has `Unique{Map,Set}` which are different than GHC's `Uniq{FM,Set}` - GHC has its own specialized copy of `Dataflow`, so `cmm/Hoopl` is needed just to filter the exposed functions (filter out some of the Hoopl's and add the GHC ones) With this change, we'll be able to simplify this significantly. It'll also be much easier to do invasive changes (Hoopl is a public package on Hackage with users that depend on the current behavior) This should introduce no changes in functionality - it merely copies the relevant code. Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com> Test Plan: ./validate Reviewers: austin, bgamari, simonmar Reviewed By: bgamari, simonmar Subscribers: simonpj, kavon, rwbarton, thomie Differential Revision: https://phabricator.haskell.org/D3616
* nativeGen: Use SSE2 SQRT instructionBen Gamari2017-04-283-7/+15
| | | | | | | | | | Reviewers: austin, dfeuer Subscribers: dfeuer, rwbarton, thomie GHC Trac Issues: #13629 Differential Revision: https://phabricator.haskell.org/D3508
* x86 nativeGen: Fix test with mask in range [128,255] (#13425)Reid Barton2017-03-241-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | My commit bdb0c43c7 optimized the encoding of instructions to test tag bits, but it did not always set exactly the same condition codes since the testb instruction does a single-byte comparison, rather than a full-word comparison. It would be correct to optimize the expression `x .&. 128 > 0` to the sequence testb $128, %al seta %al ; note: 'a' for unsigned comparison, ; not 'g' for signed comparison but the pretty-printer is not the right place to make this kind of context-sensitive optimization. Test Plan: harbormaster Reviewers: trofi, austin, bgamari, dfeuer Reviewed By: trofi, dfeuer Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D3359
* implement missing Fabs{32,64} on i386 NCG and UNREGSergei Trofimovich2017-03-101-2/+2
| | | | | | | | | | | Noticed breakage as build failure on i386 freebsd build bot: http://haskell.inf.elte.hu/builders/freebsd-i386-head/1267/10.html ghc-stage1: panic! (the 'impossible' happened) (GHC version 8.1.20170310 for i386-portbld-freebsd): outOfLineCmmOp: MO_F64_Fabs not supported here Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
* Generate better fp abs for X86 and llvm with default cmm otherwiseDominic Steinitz2017-03-072-4/+40
| | | | | | | | | | | | | | | | | | | | | | | Currently we have this in libraries/base/GHC/Float.hs: ``` abs x | x == 0 = 0 -- handles (-0.0) | x > 0 = x | otherwise = negateFloat x ``` But 3-4 years ago it was noted that this was inefficient: https://mail.haskell.org/pipermail/libraries/2013-April/019690.html We can generate better code for X86 and llvm and for others generate some custom cmm code which is similar to what the compiler generates now. Reviewers: austin, simonmar, hvr, bgamari Reviewed By: bgamari Subscribers: dfeuer, thomie Differential Revision: https://phabricator.haskell.org/D3265
* Spelling only [ci skip]Gabor Greif2017-02-231-1/+1
|
* Honour -dsuppress-uniques more thoroughlySimon Peyton Jones2017-02-171-6/+6
| | | | | | | | | | | I found that tests parser/should_compile/DumpRenamedAst and friends were printing uniques, which makes the test fragile. But -dsuppress-uniques made no difference! It turned out that pprName wasn't properly consulting Opt_SuppressUniques. This patch fixes the problem, and updates those three tests to use -dsuppress-uniques
* Debug: Use local symbols for unwind points (#13278)Ben Gamari2017-02-142-3/+4
| | | | | | | | | | | | | | | | | | | While this apparently didn't matter on Linux, the OS X toolchain seems to treat local and external symbols differently during linking. Namely, the linker assumes that an external symbol marks the beginning of a new, unused procedure, and consequently drops it. Fixes regression introduced in D2741. Test Plan: `debug` testcase on OS X Reviewers: austin, simonmar, rwbarton Reviewed By: rwbarton Subscribers: rwbarton, thomie Differential Revision: https://phabricator.haskell.org/D3135
* Cmm: Add support for undefined unwinding statementsBen Gamari2017-02-081-3/+3
| | | | | | | | | | | | | | And use to mark `stg_stack_underflow_frame`, which we are unable to determine a caller from. To simplify parsing at the moment we steal the `return` keyword to indicate an undefined unwind value. Perhaps this should be revisited. Reviewers: scpmw, simonmar, austin, erikd Subscribers: dfeuer, thomie Differential Revision: https://phabricator.haskell.org/D2738
* Generalize CmmUnwind and pass unwind information through NCGBen Gamari2017-02-083-13/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As discussed in D1532, Trac Trac #11337, and Trac Trac #11338, the stack unwinding information produced by GHC is currently quite approximate. Essentially we assume that register values do not change at all within a basic block. While this is somewhat true in normal Haskell code, blocks containing foreign calls often break this assumption. This results in unreliable call stacks, especially in the code containing foreign calls. This is worse than it sounds as unreliable unwinding information can at times result in segmentation faults. This patch set attempts to improve this situation by tracking unwinding information with finer granularity. By dispensing with the assumption of one unwinding table per block, we allow the compiler to accurately represent the areas surrounding foreign calls. Towards this end we generalize the representation of unwind information in the backend in three ways, * Multiple CmmUnwind nodes can occur per block * CmmUnwind nodes can now carry unwind information for multiple registers (while not strictly necessary; this makes emitting unwinding information a bit more convenient in the compiler) * The NCG backend is given an opportunity to modify the unwinding records since it may need to make adjustments due to, for instance, native calling convention requirements for foreign calls (see #11353). This sets the stage for resolving #11337 and #11338. Test Plan: Validate Reviewers: scpmw, simonmar, austin, erikd Subscribers: qnikst, thomie Differential Revision: https://phabricator.haskell.org/D2741
* BlockId: remove BlockMap and BlockSet synonymsMichal Terepeta2016-12-082-4/+5
| | | | | | | | | | | | | | | | | | | | This continues removal of `BlockId` module in favor of Hoopl's `Label`. Most of the changes here are mechanical, apart from the orphan `Outputable` instances for `LabelMap` and `LabelSet`. For now I've moved them to `cmm/Hoopl`, since it's already trying to manage all imports from Hoopl (to avoid any collisions). Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com> Test Plan: validate Reviewers: bgamari, austin, simonmar Reviewed By: simonmar Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D2800
* Reduce the size of string literals in binaries.Thijs Alkemade2016-12-061-5/+21
| | | | | | | | | | | | | | | Removed the alignment for strings and mark then as cstring sections in the generated asm so the linker can merge duplicate sections. Reviewers: rwbarton, trofi, austin, trommler, simonmar, hvr, bgamari Reviewed By: hvr, bgamari Subscribers: simonpj, hvr, thomie Differential Revision: https://phabricator.haskell.org/D1290 GHC Trac Issues: #9577
* Correct spelling of command-line option in commentGabor Greif2016-11-171-1/+1
|
* Inline compiler/NOTES into X86/Ppr.hsMatthew Pickering2016-11-161-0/+18
| | | | | | | | Reviewers: austin, bgamari Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D2721
* CodeGen X86: fix unsafe foreign calls wrt inliningSylvain HENRY2016-10-011-36/+75
| | | | | | | | | | | | | | | | | | | | | | | | | Foreign calls (unsafe and safe) interact badly with inlining and register passing ABIs (see #11792 and #12614): the inlined code to compute a parameter of the call may overwrite a register already set to pass a preceding parameter. With this patch, we compute all parameters which are not simple expressions before assigning them to fixed registers required by the ABI. Test Plan: - Add test (test both reg and stack parameters) - Validate Reviewers: osa1, bgamari, austin, simonmar Reviewed By: simonmar Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D2263 GHC Trac Issues: #11792, #12614
* Fix codegen bug in PIC version of genSwitch (#12433)Simon Marlow2016-09-151-3/+6
| | | | | | | | | | | | | | | | | | Summary: * getNonClobberedReg instead of getSomeReg, because the reg needs to survive across t_code * Use a new reg for the table offset calculation instead of clobbering the reg returned by expr (this was the bug affecting #12433) Test Plan: New unit test; validate Reviewers: rwbarton, bgamari, austin, erikd Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D2529 GHC Trac Issues: #12433
* Revert "codeGen: Remove binutils<2.17 hack, fixes T11758"Simon Peyton Jones2016-08-192-3/+35
| | | | | | | This reverts commit e3e2e49a8f6952e1c8a19321c729c17b294d8c92. I'm reverting because it makes ghc-stage2 seg-fault on 64-bit Windows machines. Even ghc-stage2 --version seg-faults.
* codeGen: Remove binutils<2.17 hack, fixes T11758Alex Dzyoba2016-08-052-35/+3
| | | | | | | | | | | | | | | | | | | There was a complication on the x86_64 platform, where pointers were 64 bits, but the tools didn't support 64-bit relative relocations. This was true before binutils 2.17, which nowadays is quite standart (even CentOs 5 is shipped with 2.17). Hacks were removed from x86 genSwitch and asm pretty printer. Also [x86-64-relative] note was dropped from includes/rts/storage/InfoTables.h as it's not referenced anywhere now. Reviewers: austin, simonmar, rwbarton, erikd, bgamari Reviewed By: simonmar, erikd, bgamari Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D2426
* Reduce default for -fmax-pmcheck-iterations from 1e7 to 2e6Herbert Valerio Riedel2016-04-101-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | The commit 28f951edfe50ea5182065144340061ec326781f5 introduced the `-fmax-pmcheck-iterations` flag and set the default limit to 1e7 iterations. However, this value is still high enough that it can result GHC to exhibit memory spikes beyond 1 GiB of RAM usage (heap profile showed several `(:)`s, as well as `THUNK_2_0`, and `PmCon` during the memory spikes) A value of 2e6 seems to be a safer upper bound which still manages to let the checker not run into the limit in most cases. Test Plan: Validate, try building a few Hackage packages Reviewers: austin, gkaracha, bgamari Reviewed By: bgamari Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D2095
* Restore original alignment for info tablesSimon Brenner2016-01-271-0/+9
| | | | | | | | | | | | | | | | | | | | This was broken in 4a32bf925b8aba7885d9c745769fe84a10979a53, meaning that info tables and subsequent code are no longer guaranteed to have the recommended alignment. Split up the section header and section alignment printers, and print an appropriate alignment directive before each info table. Fixes Trac #11486 Reviewers: austin, bgamari, rwbarton Reviewed By: bgamari, rwbarton Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D1847 GHC Trac Issues: #11486
* Replace calls to `ptext . sLit` with `text`Jan Stolarek2016-01-181-49/+49
| | | | | | | | | | | | | | | | | | | | Summary: In the past the canonical way for constructing an SDoc string literal was the composition `ptext . sLit`. But for some time now we have function `text` that does the same. Plus it has some rules that optimize its runtime behaviour. This patch takes all uses of `ptext . sLit` in the compiler and replaces them with calls to `text`. The main benefits of this patch are clener (shorter) code and less dependencies between module, because many modules now do not need to import `FastString`. I don't expect any performance benefits - we mostly use SDocs to report errors and it seems there is little to be gained here. Test Plan: ./validate Reviewers: bgamari, austin, goldfire, hvr, alanz Subscribers: goldfire, thomie, mpickering Differential Revision: https://phabricator.haskell.org/D1784
* Support multiple debug output levelsBen Gamari2015-11-231-3/+3
| | | | | | | | | We now only strip block information from DebugBlocks when compiling with `-g1`, intended to be used when only minimal debug information is desired. `-g2` is assumed when `-g` is passed without any integer argument. Differential Revision: https://phabricator.haskell.org/D1281
* Implement function-sections for Haskell code, #8405Simon Brenner2015-11-122-100/+72
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds a flag -split-sections that does similar things to -split-objs, but using sections in single object files instead of relying on the Satanic Splitter and other abominations. This is very similar to the GCC flags -ffunction-sections and -fdata-sections. The --gc-sections linker flag, which allows unused sections to actually be removed, is added to all link commands (if the linker supports it) so that space savings from having base compiled with sections can be realized. Supported both in LLVM and the native code-gen, in theory for all architectures, but really tested on x86 only. In the GHC build, a new SplitSections variable enables -split-sections for relevant parts of the build. Test Plan: validate with both settings of SplitSections Reviewers: dterei, Phyx, austin, simonmar, thomie, bgamari Reviewed By: simonmar, thomie, bgamari Subscribers: hsyl20, erikd, kgardas, thomie Differential Revision: https://phabricator.haskell.org/D1242 GHC Trac Issues: #8405
* Add subWordC# on x86ishNikita Karetnikov2015-10-311-5/+10
| | | | | | | | | | | | | | | This adds a subWordC# primop which implements subtraction with overflow reporting. Reviewers: tibbe, goldfire, rwbarton, bgamari, austin, hvr Reviewed By: bgamari Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D1334 GHC Trac Issues: #10962
* x86 codegen: don't generate location commentsSergei Trofimovich2015-10-291-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | The following source snippet 'module A where x */* y = 42' when being compiled with '-g' option emits syntactically invalid comment for GNU as: .text .align 8 .loc 1 3 1 /* */* */ Fixed by not emitting comments at all. We already suppress all asm comments in 'X86/Ppr.hs'. Signed-off-by: Sergei Trofimovich <siarheit@google.com> Test Plan: added test and check it works Reviewers: scpmw, simonmar, austin, bgamari Reviewed By: simonmar Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D1386 GHC Trac Issues: #10667
* Rename package key to unit ID, and installed package ID to component ID.Edward Z. Yang2015-10-141-4/+4
| | | | | | Comes with Haddock submodule update. Signed-off-by: Edward Z. Yang <ezyang@cs.stanford.edu>
* Annotate CmmBranch with an optional likely targetSimon Marlow2015-09-231-3/+4
| | | | | | | | | | | | | | | | | Summary: This allows the code generator to give hints to later code generation steps about which branch is most likely to be taken. Right now it is only taken into account in one place: a special case in CmmContFlowOpt that swapped branches over to maximise the chance of fallthrough, which is now disabled when there is a likelihood setting. Test Plan: validate Reviewers: austin, simonpj, bgamari, ezyang, tibbe Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D1273
* CodeGen: fix typo in error messageThomas Miedema2015-09-121-1/+1
|
* Refactor: delete most of the module FastTypesThomas Miedema2015-08-211-22/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverses some of the work done in #1405, and goes back to the assumption that the bootstrap compiler understands GHC-haskell. In particular: * use MagicHash instead of _ILIT and _CLIT * pattern matching on I# if possible, instead of using iUnbox unnecessarily * use Int#/Char#/Addr# instead of the following type synonyms: - type FastInt = Int# - type FastChar = Char# - type FastPtr a = Addr# * inline the following functions: - iBox = I# - cBox = C# - fastChr = chr# - fastOrd = ord# - eqFastChar = eqChar# - shiftLFastInt = uncheckedIShiftL# - shiftR_FastInt = uncheckedIShiftRL# - shiftRLFastInt = uncheckedIShiftRL# * delete the following unused functions: - minFastInt - maxFastInt - uncheckedIShiftRA# - castFastPtr - panicDocFastInt and pprPanicFastInt * rename panicFastInt back to panic# These functions remain, since they actually do something: * iUnbox * bitAndFastInt * bitOrFastInt Test Plan: validate Reviewers: austin, bgamari Subscribers: rwbarton Differential Revision: https://phabricator.haskell.org/D1141 GHC Trac Issues: #1405
* Delete FastBoolThomas Miedema2015-08-213-6/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reverses some of the work done in Trac #1405, and assumes GHC is smart enough to do its own unboxing of booleans now. I would like to do some more performance measurements, but the code changes can be reviewed already. Test Plan: With a perf build: ./inplace/bin/ghc-stage2 nofib/spectral/simple/Main.hs -fforce-recomp +RTS -t --machine-readable before: ``` [("bytes allocated", "1300744864") ,("num_GCs", "302") ,("average_bytes_used", "8811118") ,("max_bytes_used", "24477464") ,("num_byte_usage_samples", "9") ,("peak_megabytes_allocated", "64") ,("init_cpu_seconds", "0.001") ,("init_wall_seconds", "0.001") ,("mutator_cpu_seconds", "2.833") ,("mutator_wall_seconds", "4.283") ,("GC_cpu_seconds", "0.960") ,("GC_wall_seconds", "0.961") ] ``` after: ``` [("bytes allocated", "1301088064") ,("num_GCs", "310") ,("average_bytes_used", "8820253") ,("max_bytes_used", "24539904") ,("num_byte_usage_samples", "9") ,("peak_megabytes_allocated", "64") ,("init_cpu_seconds", "0.001") ,("init_wall_seconds", "0.001") ,("mutator_cpu_seconds", "2.876") ,("mutator_wall_seconds", "4.474") ,("GC_cpu_seconds", "0.965") ,("GC_wall_seconds", "0.979") ] ``` CPU time seems to be up a bit, but I'm not sure. Unfortunately CPU time measurements are rather noisy. Reviewers: austin, bgamari, rwbarton Subscribers: nomeata Differential Revision: https://phabricator.haskell.org/D1143 GHC Trac Issues: #1405
* Fix todo in compiler/nativeGen: Rename Size to Formatmarkus2015-07-074-576/+584
| | | | | | | | | | | | | | | | | | This commit renames the Size module in the native code generator to Format, as proposed by a todo, as well as adjusting parameter names in other modules that use it. Test Plan: validate Reviewers: austin, simonmar, bgamari Reviewed By: simonmar, bgamari Subscribers: bgamari, simonmar, thomie Projects: #ghc Differential Revision: https://phabricator.haskell.org/D865
* Encode alignment in MO_Memcpy and friendsBen Gamari2015-06-161-17/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: Alignment needs to be a compile-time constant. Previously the code generators had to jump through hoops to ensure this was the case as the alignment was passed as a CmmExpr in the arguments list. Now we take care of this up front. This fixes #8131. Authored-by: Reid Barton <rwbarton@gmail.com> Dusted-off-by: Ben Gamari <ben@smart-cactus.org> Tests for T8131 Test Plan: Validate Reviewers: rwbarton, austin Reviewed By: rwbarton, austin Subscribers: bgamari, carter, thomie Differential Revision: https://phabricator.haskell.org/D624 GHC Trac Issues: #8131
* Refactor the story around switches (#10137)Joachim Breitner2015-03-301-6/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This re-implements the code generation for case expressions at the Stg → Cmm level, both for data type cases as well as for integral literal cases. (Cases on float are still treated as before). The goal is to allow for fancier strategies in implementing them, for a cleaner separation of the strategy from the gritty details of Cmm, and to run this later than the Common Block Optimization, allowing for one way to attack #10124. The new module CmmSwitch contains a number of notes explaining this changes. For example, it creates larger consecutive jump tables than the previous code, if possible. nofib shows little significant overall improvement of runtime. The rather large wobbling comes from changes in the code block order (see #8082, not much we can do about it). But the decrease in code size alone makes this worthwhile. ``` Program Size Allocs Runtime Elapsed TotalMem Min -1.8% 0.0% -6.1% -6.1% -2.9% Max -0.7% +0.0% +5.6% +5.7% +7.8% Geometric Mean -1.4% -0.0% -0.3% -0.3% +0.0% ``` Compilation time increases slightly: ``` -1 s.d. ----- -2.0% +1 s.d. ----- +2.5% Average ----- +0.3% ``` The test case T783 regresses a lot, but it is the only one exhibiting any regression. The cause is the changed order of branches in an if-then-else tree, which makes the hoople data flow analysis traverse the blocks in a suboptimal order. Reverting that gets rid of this regression, but has a consistent, if only very small (+0.2%), negative effect on runtime. So I conclude that this test is an extreme outlier and no reason to change the code. Differential Revision: https://phabricator.haskell.org/D720
* Generate DWARF unwind informationPeter Wortmann2014-12-161-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | This tells debuggers such as GDB how to "unwind" a program state, which allows them to walk the stack up. Notes: * The code is quite general, perhaps unnecessarily so. Unless we get more unwind information, only the first case of pprSetUnwind will get used - and pprUnwindExpr and pprUndefUnwind will never be called. It just so happens that this is a point where we can get a lot of features cheaply, even if we don't use them. * When determining what location to show for a return address, most debuggers check the map for "rip-1", assuming that's where the "call" instruction is. For tables-next-to-code, that happens to always be the end of an info table. We therefore cheat a bit here by shifting .debug_frame information so it covers the end of the info table, as well as generating a .loc directive for the info table data. Debuggers will still show the wrong label for the return address, though. Haven't found a way around that one yet. (From Phabricator D396)
* Generate DWARF info sectionPeter Wortmann2014-12-161-3/+12
| | | | | | | | | | | | | | | | This is where we actually make GHC emit DWARF code. The info section contains all the general meta information bits as well as an entry for every block of native code. Notes: * We need quite a few new labels in order to properly address starts and ends of blocks. * Thanks to Nathan Howell for taking the iniative to get our own Haskell language ID for DWARF! (From Phabricator D396)
* Generate .loc/.file directives from source ticksPeter Wortmann2014-12-163-1/+23
| | | | | | | | | | | | | | | | | | | | | | | | This generates DWARF, albeit indirectly using the assembler. This is the easiest (and, apparently, quite standard) method of generating the .debug_line DWARF section. Notes: * Note we have to make sure that .file directives appear correctly before the respective .loc. Right now we ppr them manually, which makes them absent from dumps. Fixing this would require .file to become a native instruction. * We have to pass a lot of things around the native code generator. I know Ian did quite a bit of refactoring already, but having one common monad could *really* simplify things here... * To support SplitObjcs, we need to emit/reset all DWARF data at every split. We use the occassion to move split marker generation to cmmNativeGenStream as well, so debug data extraction doesn't have to choke on it. (From Phabricator D396)
* Add unwind information to CmmPeter Wortmann2014-12-161-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Unwind information allows the debugger to discover more information about a program state, by allowing it to "reconstruct" other states of the program. In practice, this means that we explain to the debugger how to unravel stack frames, which comes down mostly to explaining how to find their Sp and Ip register values. * We declare yet another new constructor for CmmNode - and this time there's actually little choice, as unwind information can and will change mid-block. We don't actually make use of these capabilities, and back-end support would be tricky (generate new labels?), but it feels like the right way to do it. * Even though we only use it for Sp so far, we allow CmmUnwind to specify unwind information for any register. This is pretty cheap and could come in useful in future. * We allow full CmmExpr expressions for specifying unwind values. The advantage here is that we don't have to make up new syntax, and can e.g. use the WDS macro directly. On the other hand, the back-end will now have to simplify the expression until it can sensibly be converted into DWARF byte code - a process which might fail, yielding NCG panics. On the other hand, when you're writing Cmm by hand you really ought to know what you're doing. (From Phabricator D169)
* Tick scopesPeter Wortmann2014-12-161-1/+2
| | | | | | | | | | | | | | | | | | | | | | This patch solves the scoping problem of CmmTick nodes: If we just put CmmTicks into blocks we have no idea what exactly they are meant to cover. Here we introduce tick scopes, which allow us to create sub-scopes and merged scopes easily. Notes: * Given that the code often passes Cmm around "head-less", we have to make sure that its intended scope does not get lost. To keep the amount of passing-around to a minimum we define a CmmAGraphScoped type synonym here that just bundles the scope with a portion of Cmm to be assembled later. * We introduce new scopes at somewhat random places, aligning with getCode calls. This works surprisingly well, but we might have to add new scopes into the mix later on if we find things too be too coarse-grained. (From Phabricator D169)
* Source notes (Cmm support)Peter Wortmann2014-12-161-0/+1
| | | | | | | | | | | | | | | | | | | | This patch adds CmmTick nodes to Cmm code. This is relatively straight-forward, but also not very useful, as many blocks will simply end up with no annotations whatosever. Notes: * We use this design over, say, putting ticks into the entry node of all blocks, as it seems to work better alongside existing optimisations. Now granted, the reason for this is that currently GHC's main Cmm optimisations seem to mainly reorganize and merge code, so this might change in the future. * We have the Cmm parser generate a few source notes as well. This is relatively easy to do - worst part is that it complicates the CmmParse implementation a bit. (From Phabricator D169)
* Allow -dead_strip linking on platforms with .subsections_via_symbolsMoritz Angermann2014-11-191-6/+1
| | | | | | | | | | | | | | | | | Summary: This allows to link objects produced with the llvm code generator to be linked with -dead_strip. This applies to at least the iOS cross compiler and OS X compiler. Signed-off-by: Moritz Angermann <moritz@lichtzwerge.de> Test Plan: Create a ffi library and link it with -dead_strip. If the resulting binary does not crash, the patch works as advertised. Reviewers: rwbarton, simonmar, hvr, dterei, mzero, ezyang, austin Reviewed By: dterei, ezyang, austin Subscribers: thomie, mzero, simonmar, ezyang, carter Differential Revision: https://phabricator.haskell.org/D206
* Per-thread allocation counters and limitsSimon Marlow2014-11-123-21/+60
| | | | | | | | This reverts commit f0fcc41d755876a1b02d1c7c79f57515059f6417. New changes: now works on 32-bit platforms too. I added some basic support for 64-bit subtraction and comparison operations to the x86 NCG.
* Revert "Place static closures in their own section."Edward Z. Yang2014-10-201-4/+0
| | | | | | | | | | This reverts commit b23ba2a7d612c6b466521399b33fe9aacf5c4f75. Conflicts: compiler/cmm/PprCmmDecl.hs compiler/nativeGen/PPC/Ppr.hs compiler/nativeGen/SPARC/Ppr.hs compiler/nativeGen/X86/Ppr.hs