summaryrefslogtreecommitdiff
path: root/compiler/nativeGen/X86
Commit message (Collapse)AuthorAgeFilesLines
...
* stack: fix stack allocations on WindowsTamar Christina2018-07-181-11/+89
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: On Windows one is not allowed to drop the stack by more than a page size. The reason for this is that the OS only allocates enough stack till what the TEB specifies. After that a guard page is placed and the rest of the virtual address space is unmapped. The intention is that doing stack allocations will cause you to hit the guard which will then map the next page in and move the guard. This is done to prevent what in the Linux world is known as stack clash vulnerabilities https://access.redhat.com/security/cve/cve-2017-1000364. There are modules in GHC for which the liveliness analysis thinks the reserved 8KB of spill slots isn't enough. One being DynFlags and the other being Cabal. Though I think the Cabal one is likely a bug: ``` 4d6544: 81 ec 00 46 00 00 sub $0x4600,%esp 4d654a: 8d 85 94 fe ff ff lea -0x16c(%ebp),%eax 4d6550: 3b 83 1c 03 00 00 cmp 0x31c(%ebx),%eax 4d6556: 0f 82 de 8d 02 00 jb 4ff33a <_cLpg_info+0x7a> 4d655c: c7 45 fc 14 3d 50 00 movl $0x503d14,-0x4(%ebp) 4d6563: 8b 75 0c mov 0xc(%ebp),%esi 4d6566: 83 c5 fc add $0xfffffffc,%ebp 4d6569: 66 f7 c6 03 00 test $0x3,%si 4d656e: 0f 85 a6 d7 02 00 jne 503d1a <_cLpb_info+0x6> 4d6574: 81 c4 00 46 00 00 add $0x4600,%esp ``` It allocates nearly 18KB of spill slots for a simple 4 line function and doesn't even use it. Note that this doesn't happen on x64 or when making a validate build. Only when making a build without a validate and build.mk. This and the allocation in DynFlags means the stack allocation will jump over the guard page into unmapped memory areas and GHC or an end program segfaults. The pagesize on x86 Windows is 4KB which means we hit it very easily for these two modules, which explains the total DOA of GHC 32bit for the past 3 releases and the "random" segfaults on Windows. ``` 0:000> bp 00503d29 0:000> gn Breakpoint 0 hit WARNING: Stack overflow detected. The unwound frames are extracted from outside normal stack bounds. eax=03b6b9c9 ebx=00dc90f0 ecx=03cac48c edx=03cac43d esi=03b6b9c9 edi=03abef40 eip=00503d29 esp=013e96fc ebp=03cf8f70 iopl=0 nv up ei pl nz na po nc cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000202 setup+0x103d29: 00503d29 89442440 mov dword ptr [esp+40h],eax ss:002b:013e973c=???????? WARNING: Stack overflow detected. The unwound frames are extracted from outside normal stack bounds. WARNING: Stack overflow detected. The unwound frames are extracted from outside normal stack bounds. 0:000> !teb TEB at 00384000 ExceptionList: 013effcc StackBase: 013f0000 StackLimit: 013eb000 ``` This doesn't fix the liveliness analysis but does fix the allocations, by emitting a function call to `__chkstk_ms` when doing allocations of larger than a page, this will make sure the stack is probed every page so the kernel maps in the next page. `__chkstk_ms` is provided by `libGCC`, which is under the `GNU runtime exclusion license`, so it's safe to link against it, even for proprietary code. (Technically we already do since we link compiled C code in.) For allocations smaller than a page we drop the stack and probe the new address. This avoids the function call and still makes sure we hit the guard if needed. PS: In case anyone is Wondering why we didn't notice this before, it's because we only test x86_64 and on Windows 10. On x86_64 the page size is 8KB and also the kernel is a bit more lenient on Windows 10 in that it seems to catch the segfault and resize the stack if it was unmapped: ``` 0:000> t eax=03b6b9c9 ebx=00dc90f0 ecx=03cac48c edx=03cac43d esi=03b6b9c9 edi=03abef40 eip=00503d2d esp=013e96fc ebp=03cf8f70 iopl=0 nv up ei pl nz na po nc cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000202 setup+0x103d2d: 00503d2d 8b461b mov eax,dword ptr [esi+1Bh] ds:002b:03b6b9e4=03cac431 0:000> !teb TEB at 00384000 ExceptionList: 013effcc StackBase: 013f0000 StackLimit: 013e9000 ``` Likely Windows 10 has a guard page larger than previous versions. This fixes the stack allocations, and as soon as I get the time I will look at the liveliness analysis. I find it highly unlikely that simple Cabal function requires ~2200 spill slots. Test Plan: ./validate Reviewers: simonmar, bgamari Reviewed By: bgamari Subscribers: AndreasK, rwbarton, thomie, carter GHC Trac Issues: #15154 Differential Revision: https://phabricator.haskell.org/D4917
* Allow aligning of cmm procs at specific boundryklebinger.andreas@gmx.at2018-06-031-0/+7
| | | | | | | | | | | | | | | | | | | | Allows to align CmmProcs at the given boundries. It makes performance usually worse but can be helpful to limit the effect of a unrelated function B becoming faster/slower after changing function A. Test Plan: ci, using it. Reviewers: bgamari Reviewed By: bgamari Subscribers: rwbarton, thomie, carter GHC Trac Issues: #15148 Differential Revision: https://phabricator.haskell.org/D4706
* Change jump targets in JMP_TBL from blocks to X86.JumpDest.Andreas Klebinger2018-05-302-12/+19
| | | | | | | | | | | | | | | | | | | | | Jump tables always point to blocks when we first generate them. However there are rare situations where we can shortcut one of these blocks to a static address during the asm shortcutting pass. While we already updated the data section accordingly this patch also extends this to the references stored in JMP_TBL. Test Plan: ci Reviewers: bgamari Reviewed By: bgamari Subscribers: thomie, carter GHC Trac Issues: #15104 Differential Revision: https://phabricator.haskell.org/D4595
* Allow CmmLabelDiffOff with different widthsSimon Marlow2018-05-164-7/+8
| | | | | | | | | | | | | | | | | | Summary: This change makes it possible to generate a static 32-bit relative label offset on x86_64. Currently we can only generate word-sized label offsets. This will be used in D4634 to shrink info tables. See D4632 for more details. Test Plan: See D4632 Reviewers: bgamari, niteria, michalt, erikd, jrtc27, osa1 Subscribers: thomie, carter Differential Revision: https://phabricator.haskell.org/D4633
* Add 'addWordC#' PrimOpSebastian Graf2018-05-051-0/+3
| | | | | | | | | | | | | | | | | | | This is mostly for congruence with 'subWordC#' and '{add,sub}IntC#'. I found 'plusWord2#' while implementing this, which both lacks documentation and has a slightly different specification than 'addWordC#', which means the generic implementation is unnecessarily complex. While I was at it, I also added lacking meta-information on PrimOps and refactored 'subWordC#'s generic implementation to be branchless. Reviewers: bgamari, simonmar, jrtc27, dfeuer Reviewed By: bgamari, dfeuer Subscribers: dfeuer, thomie, carter Differential Revision: https://phabricator.haskell.org/D4592
* Update JMP_TBL targets during shortcutting in X86 NCG.Andreas Klebinger2018-04-131-8/+20
| | | | | | | | | | | | | | | | | | | | | | Without updating the JMP_TBL information the block list in JMP_TBL contained blocks which were eliminated in some circumstances. The actual assembly generation doesn't look at these fields so this didn't cause any bugs yet. However as long as we carry this information around we should make an effort to keep it correct. Especially since it's useful for debugging purposes and can be used for passes near the end of the codegen pipeline. In particular it's used by jumpDestsOfInstr which without these changes returns the wrong destinations. Test Plan: ci Reviewers: bgamari Subscribers: thomie, carter Differential Revision: https://phabricator.haskell.org/D4566
* [RFC] nativeGen: Add support for MO_SS_Conv_W32_W64 on i386Ben Gamari2018-03-191-0/+14
| | | | | | | | | | | | | | | | This is required by D4288. However, this only handles i386; we will likely also need to do the same for PPC and SPARC, lest they break when D4288 is re-merged. Test Plan: Validate Reviewers: simonmar Reviewed By: simonmar Subscribers: rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4362
* Improve X86CodeGen's pprASCII.HE, Tao2018-02-061-13/+34
| | | | | | | | | | | | | | | | | | | | | The original implementation generates a list of SDoc then concatenates them using `hcat`. For memory optimization, we can transform the given literal string into escaped string the construct SDoc directly. This optimization will decreate the memory allocation when there's big literal strings in haskell code, see Trac #14741. Signed-off-by: HE, Tao <sighingnow@gmail.com> Reviewers: bgamari, mpickering, simonpj Reviewed By: simonpj Subscribers: simonpj, rwbarton, thomie, carter GHC Trac Issues: #14741 Differential Revision: https://phabricator.haskell.org/D4384
* Mark xmm6 as caller saved in the register allocator for windows.klebinger.andreas@gmx.at2018-01-311-2/+4
| | | | | | | | | | | | | | | | | | This prevents the register being picked up as a scratch register. Otherwise the allocator would be free to use it before a call. This fixes #14619. Test Plan: ci, repro case on #14619 Reviewers: bgamari, Phyx, erikd, simonmar, RyanGlScott, simonpj Reviewed By: Phyx, RyanGlScott, simonpj Subscribers: simonpj, RyanGlScott, Phyx, rwbarton, thomie, carter GHC Trac Issues: #14619 Differential Revision: https://phabricator.haskell.org/D4348
* Handle the likely:True case in CmmContFlowOptklebinger.andreas@gmx.at2018-01-261-0/+3
| | | | | | | | | | | | | | | | | | It's better to fall through to the likely case than to jump to it. We optimize for this in CmmContFlowOpt when likely:False. This commit extends the logic there to handle cases with likely:True as well. Test Plan: ci Reviewers: bgamari, simonmar Reviewed By: bgamari Subscribers: simonmar, alexbiehl, rwbarton, thomie, carter Differential Revision: https://phabricator.haskell.org/D4306
* Add new mbmi and mbmi2 compiler flagsJohn Ky2018-01-213-0/+91
| | | | | | | | | | | | | | | | | | | | | | | | | | This adds support for the bit deposit and extraction operations provided by the BMI and BMI2 instruction set extensions on modern amd64 machines. Implement x86 code generator for pdep and pext. Properly initialise bmiVersion field. pdep and pext test cases Fix pattern match for pdep and pext instructions Fix build of pdep and pext code for 32-bit architectures Test Plan: Validate Reviewers: austin, simonmar, bgamari, angerman Reviewed By: bgamari Subscribers: trommler, carter, angerman, thomie, rwbarton, newhoggy GHC Trac Issues: #14206 Differential Revision: https://phabricator.haskell.org/D4236
* cmm: Use LocalBlockLabel instead of AsmTempLabel to represent blocksBen Gamari2017-11-281-4/+4
| | | | | | | | | | blockLbl was originally changed in 8b007abbeb3045900a11529d907a835080129176 to use mkTempAsmLabel to fix an inconsistency resulting in #14221. However, this breaks the C code generator, which doesn't support AsmTempLabels (#14454). Instead let's try going the other direction: use a new CLabel variety, LocalBlockLabel. Then we can teach the C code generator to deal with these as well.
* Revert "Add new mbmi and mbmi2 compiler flags"Ben Gamari2017-11-223-91/+0
| | | | | | This broke the 32-bit build. This reverts commit f5dc8ccc29429d0a1d011f62b6b430f6ae50290c.
* Add new mbmi and mbmi2 compiler flagsJohn Ky2017-11-153-0/+91
| | | | | | | | | | | | | | | | | This adds support for the bit deposit and extraction operations provided by the BMI and BMI2 instruction set extensions on modern amd64 machines. Test Plan: Validate Reviewers: austin, simonmar, bgamari, hvr, goldfire, erikd Reviewed By: bgamari Subscribers: goldfire, erikd, trommler, newhoggy, rwbarton, thomie GHC Trac Issues: #14206 Differential Revision: https://phabricator.haskell.org/D4063
* Turn `compareByteArrays#` out-of-line primop into inline primopalexbiehl2017-10-291-0/+1
| | | | | | | | | | | | Depends on D4090 Reviewers: austin, bgamari, erikd, simonmar, alexbiehl Reviewed By: bgamari Subscribers: rwbarton, thomie Differential Revision: https://phabricator.haskell.org/D4091
* Add -falignment-sanitization flagBen Gamari2017-10-291-0/+18
| | | | | | | | | | | | | | | | | | | Here we add a flag to instruct the native code generator to add alignment checks in all info table dereferences. This is helpful in catching pointer tagging issues. Thanks to @jrtc27 for uncovering the tagging issues on Sparc which inspired this flag. Test Plan: Validate Reviewers: simonmar, austin, erikd Reviewed By: simonmar Subscribers: rwbarton, trofi, thomie, jrtc27 Differential Revision: https://phabricator.haskell.org/D4101
* A bunch of typofixesGabor Greif2017-09-261-1/+1
|
* compiler: introduce custom "GhcPrelude" PreludeHerbert Valerio Riedel2017-09-196-0/+12
| | | | | | | | | | | | | | | | | | This switches the compiler/ component to get compiled with -XNoImplicitPrelude and a `import GhcPrelude` is inserted in all modules. This is motivated by the upcoming "Prelude" re-export of `Semigroup((<>))` which would cause lots of name clashes in every modulewhich imports also `Outputable` Reviewers: austin, goldfire, bgamari, alanz, simonmar Reviewed By: bgamari Subscribers: goldfire, rwbarton, thomie, mpickering, bgamari Differential Revision: https://phabricator.haskell.org/D3989
* nativeGen: Consistently use blockLbl to generate CLabels from BlockIdsBen Gamari2017-09-192-6/+6
| | | | | | | | | | | | | | | This fixes #14221, where the NCG and the DWARF code were apparently giving two different names to the same block. Test Plan: Validate with DWARF support enabled. Reviewers: simonmar, austin Subscribers: rwbarton, thomie GHC Trac Issues: #14221 Differential Revision: https://phabricator.haskell.org/D3977
* Fix typos in diagnostics, testsuite and commentsGabor Greif2017-09-071-1/+1
|
* Add debugPprTypeSimon Peyton Jones2017-08-311-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | We pretty-print a type by converting it to an IfaceType and pretty-printing that. But (a) that's a bit indirect, and (b) delibrately loses information about (e.g.) the kind on the /occurrences/ of a type variable So this patch implements debugPprType, which pretty prints the type directly, with no fancy formatting. It's just used for debugging. I took the opportunity to refactor the debug-pretty-printing machinery a little. In particular, define these functions and use them: ifPprDeubug :: SDoc -> SDOc -> SDoc -- Says what to do with and without -dppr-debug whenPprDebug :: SDoc -> SDoc -- Says what to do with -dppr-debug; without is empty getPprDebug :: (Bool -> SDoc) -> SDoc getPprDebug used to be called sdocPprDebugWith whenPprDebug used to be called ifPprDebug So a lot of files get touched in a very mechanical way
* nativeGen: Don't index into linked listsBen Gamari2017-08-291-4/+6
| | | | | | | | | | | There were a couple places where we indexed into linked lists of register names. Replace these with arrays. Reviewers: austin Subscribers: rwbarton, thomie Differential Revision: https://phabricator.haskell.org/D3893
* Add support for producing position-independent executablesBen Gamari2017-08-221-2/+2
| | | | | | | | | | | | | | | | Previously due to #12759 we disabled PIE support entirely. However, this breaks the user's ability to produce PIEs. Add an explicit flag, -fPIE, allowing the user to build PIEs. Test Plan: Validate Reviewers: rwbarton, austin, simonmar Subscribers: trommler, simonmar, trofi, jrtc27, thomie GHC Trac Issues: #12759, #13702 Differential Revision: https://phabricator.haskell.org/D3589
* Drop GHC 7.10 compatibilityRyan Scott2017-08-011-2/+0
| | | | | | | | | | | | | | | | GHC 8.2.1 is out, so now GHC's support window only extends back to GHC 8.0. This means we can delete gobs of code that was only used for GHC 7.10 support. Hooray! Test Plan: ./validate Reviewers: hvr, bgamari, austin, goldfire, simonmar Reviewed By: bgamari Subscribers: Phyx, rwbarton, thomie Differential Revision: https://phabricator.haskell.org/D3781
* Hoopl: remove dependency on Hoopl packageMichal Terepeta2017-06-233-3/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This copies the subset of Hoopl's functionality needed by GHC to `cmm/Hoopl` and removes the dependency on the Hoopl package. The main motivation for this change is the confusing/noisy interface between GHC and Hoopl: - Hoopl has `Label` which is GHC's `BlockId` but different than GHC's `CLabel` - Hoopl has `Unique` which is different than GHC's `Unique` - Hoopl has `Unique{Map,Set}` which are different than GHC's `Uniq{FM,Set}` - GHC has its own specialized copy of `Dataflow`, so `cmm/Hoopl` is needed just to filter the exposed functions (filter out some of the Hoopl's and add the GHC ones) With this change, we'll be able to simplify this significantly. It'll also be much easier to do invasive changes (Hoopl is a public package on Hackage with users that depend on the current behavior) This should introduce no changes in functionality - it merely copies the relevant code. Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com> Test Plan: ./validate Reviewers: austin, bgamari, simonmar Reviewed By: bgamari, simonmar Subscribers: simonpj, kavon, rwbarton, thomie Differential Revision: https://phabricator.haskell.org/D3616
* nativeGen: Use SSE2 SQRT instructionBen Gamari2017-04-283-7/+15
| | | | | | | | | | Reviewers: austin, dfeuer Subscribers: dfeuer, rwbarton, thomie GHC Trac Issues: #13629 Differential Revision: https://phabricator.haskell.org/D3508
* x86 nativeGen: Fix test with mask in range [128,255] (#13425)Reid Barton2017-03-241-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | My commit bdb0c43c7 optimized the encoding of instructions to test tag bits, but it did not always set exactly the same condition codes since the testb instruction does a single-byte comparison, rather than a full-word comparison. It would be correct to optimize the expression `x .&. 128 > 0` to the sequence testb $128, %al seta %al ; note: 'a' for unsigned comparison, ; not 'g' for signed comparison but the pretty-printer is not the right place to make this kind of context-sensitive optimization. Test Plan: harbormaster Reviewers: trofi, austin, bgamari, dfeuer Reviewed By: trofi, dfeuer Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D3359
* implement missing Fabs{32,64} on i386 NCG and UNREGSergei Trofimovich2017-03-101-2/+2
| | | | | | | | | | | Noticed breakage as build failure on i386 freebsd build bot: http://haskell.inf.elte.hu/builders/freebsd-i386-head/1267/10.html ghc-stage1: panic! (the 'impossible' happened) (GHC version 8.1.20170310 for i386-portbld-freebsd): outOfLineCmmOp: MO_F64_Fabs not supported here Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
* Generate better fp abs for X86 and llvm with default cmm otherwiseDominic Steinitz2017-03-072-4/+40
| | | | | | | | | | | | | | | | | | | | | | | Currently we have this in libraries/base/GHC/Float.hs: ``` abs x | x == 0 = 0 -- handles (-0.0) | x > 0 = x | otherwise = negateFloat x ``` But 3-4 years ago it was noted that this was inefficient: https://mail.haskell.org/pipermail/libraries/2013-April/019690.html We can generate better code for X86 and llvm and for others generate some custom cmm code which is similar to what the compiler generates now. Reviewers: austin, simonmar, hvr, bgamari Reviewed By: bgamari Subscribers: dfeuer, thomie Differential Revision: https://phabricator.haskell.org/D3265
* Spelling only [ci skip]Gabor Greif2017-02-231-1/+1
|
* Honour -dsuppress-uniques more thoroughlySimon Peyton Jones2017-02-171-6/+6
| | | | | | | | | | | I found that tests parser/should_compile/DumpRenamedAst and friends were printing uniques, which makes the test fragile. But -dsuppress-uniques made no difference! It turned out that pprName wasn't properly consulting Opt_SuppressUniques. This patch fixes the problem, and updates those three tests to use -dsuppress-uniques
* Debug: Use local symbols for unwind points (#13278)Ben Gamari2017-02-142-3/+4
| | | | | | | | | | | | | | | | | | | While this apparently didn't matter on Linux, the OS X toolchain seems to treat local and external symbols differently during linking. Namely, the linker assumes that an external symbol marks the beginning of a new, unused procedure, and consequently drops it. Fixes regression introduced in D2741. Test Plan: `debug` testcase on OS X Reviewers: austin, simonmar, rwbarton Reviewed By: rwbarton Subscribers: rwbarton, thomie Differential Revision: https://phabricator.haskell.org/D3135
* Cmm: Add support for undefined unwinding statementsBen Gamari2017-02-081-3/+3
| | | | | | | | | | | | | | And use to mark `stg_stack_underflow_frame`, which we are unable to determine a caller from. To simplify parsing at the moment we steal the `return` keyword to indicate an undefined unwind value. Perhaps this should be revisited. Reviewers: scpmw, simonmar, austin, erikd Subscribers: dfeuer, thomie Differential Revision: https://phabricator.haskell.org/D2738
* Generalize CmmUnwind and pass unwind information through NCGBen Gamari2017-02-083-13/+54
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As discussed in D1532, Trac Trac #11337, and Trac Trac #11338, the stack unwinding information produced by GHC is currently quite approximate. Essentially we assume that register values do not change at all within a basic block. While this is somewhat true in normal Haskell code, blocks containing foreign calls often break this assumption. This results in unreliable call stacks, especially in the code containing foreign calls. This is worse than it sounds as unreliable unwinding information can at times result in segmentation faults. This patch set attempts to improve this situation by tracking unwinding information with finer granularity. By dispensing with the assumption of one unwinding table per block, we allow the compiler to accurately represent the areas surrounding foreign calls. Towards this end we generalize the representation of unwind information in the backend in three ways, * Multiple CmmUnwind nodes can occur per block * CmmUnwind nodes can now carry unwind information for multiple registers (while not strictly necessary; this makes emitting unwinding information a bit more convenient in the compiler) * The NCG backend is given an opportunity to modify the unwinding records since it may need to make adjustments due to, for instance, native calling convention requirements for foreign calls (see #11353). This sets the stage for resolving #11337 and #11338. Test Plan: Validate Reviewers: scpmw, simonmar, austin, erikd Subscribers: qnikst, thomie Differential Revision: https://phabricator.haskell.org/D2741
* BlockId: remove BlockMap and BlockSet synonymsMichal Terepeta2016-12-082-4/+5
| | | | | | | | | | | | | | | | | | | | This continues removal of `BlockId` module in favor of Hoopl's `Label`. Most of the changes here are mechanical, apart from the orphan `Outputable` instances for `LabelMap` and `LabelSet`. For now I've moved them to `cmm/Hoopl`, since it's already trying to manage all imports from Hoopl (to avoid any collisions). Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com> Test Plan: validate Reviewers: bgamari, austin, simonmar Reviewed By: simonmar Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D2800
* Reduce the size of string literals in binaries.Thijs Alkemade2016-12-061-5/+21
| | | | | | | | | | | | | | | Removed the alignment for strings and mark then as cstring sections in the generated asm so the linker can merge duplicate sections. Reviewers: rwbarton, trofi, austin, trommler, simonmar, hvr, bgamari Reviewed By: hvr, bgamari Subscribers: simonpj, hvr, thomie Differential Revision: https://phabricator.haskell.org/D1290 GHC Trac Issues: #9577
* Correct spelling of command-line option in commentGabor Greif2016-11-171-1/+1
|
* Inline compiler/NOTES into X86/Ppr.hsMatthew Pickering2016-11-161-0/+18
| | | | | | | | Reviewers: austin, bgamari Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D2721
* CodeGen X86: fix unsafe foreign calls wrt inliningSylvain HENRY2016-10-011-36/+75
| | | | | | | | | | | | | | | | | | | | | | | | | Foreign calls (unsafe and safe) interact badly with inlining and register passing ABIs (see #11792 and #12614): the inlined code to compute a parameter of the call may overwrite a register already set to pass a preceding parameter. With this patch, we compute all parameters which are not simple expressions before assigning them to fixed registers required by the ABI. Test Plan: - Add test (test both reg and stack parameters) - Validate Reviewers: osa1, bgamari, austin, simonmar Reviewed By: simonmar Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D2263 GHC Trac Issues: #11792, #12614
* Fix codegen bug in PIC version of genSwitch (#12433)Simon Marlow2016-09-151-3/+6
| | | | | | | | | | | | | | | | | | Summary: * getNonClobberedReg instead of getSomeReg, because the reg needs to survive across t_code * Use a new reg for the table offset calculation instead of clobbering the reg returned by expr (this was the bug affecting #12433) Test Plan: New unit test; validate Reviewers: rwbarton, bgamari, austin, erikd Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D2529 GHC Trac Issues: #12433
* Revert "codeGen: Remove binutils<2.17 hack, fixes T11758"Simon Peyton Jones2016-08-192-3/+35
| | | | | | | This reverts commit e3e2e49a8f6952e1c8a19321c729c17b294d8c92. I'm reverting because it makes ghc-stage2 seg-fault on 64-bit Windows machines. Even ghc-stage2 --version seg-faults.
* codeGen: Remove binutils<2.17 hack, fixes T11758Alex Dzyoba2016-08-052-35/+3
| | | | | | | | | | | | | | | | | | | There was a complication on the x86_64 platform, where pointers were 64 bits, but the tools didn't support 64-bit relative relocations. This was true before binutils 2.17, which nowadays is quite standart (even CentOs 5 is shipped with 2.17). Hacks were removed from x86 genSwitch and asm pretty printer. Also [x86-64-relative] note was dropped from includes/rts/storage/InfoTables.h as it's not referenced anywhere now. Reviewers: austin, simonmar, rwbarton, erikd, bgamari Reviewed By: simonmar, erikd, bgamari Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D2426
* Reduce default for -fmax-pmcheck-iterations from 1e7 to 2e6Herbert Valerio Riedel2016-04-101-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | The commit 28f951edfe50ea5182065144340061ec326781f5 introduced the `-fmax-pmcheck-iterations` flag and set the default limit to 1e7 iterations. However, this value is still high enough that it can result GHC to exhibit memory spikes beyond 1 GiB of RAM usage (heap profile showed several `(:)`s, as well as `THUNK_2_0`, and `PmCon` during the memory spikes) A value of 2e6 seems to be a safer upper bound which still manages to let the checker not run into the limit in most cases. Test Plan: Validate, try building a few Hackage packages Reviewers: austin, gkaracha, bgamari Reviewed By: bgamari Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D2095
* Restore original alignment for info tablesSimon Brenner2016-01-271-0/+9
| | | | | | | | | | | | | | | | | | | | This was broken in 4a32bf925b8aba7885d9c745769fe84a10979a53, meaning that info tables and subsequent code are no longer guaranteed to have the recommended alignment. Split up the section header and section alignment printers, and print an appropriate alignment directive before each info table. Fixes Trac #11486 Reviewers: austin, bgamari, rwbarton Reviewed By: bgamari, rwbarton Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D1847 GHC Trac Issues: #11486
* Replace calls to `ptext . sLit` with `text`Jan Stolarek2016-01-181-49/+49
| | | | | | | | | | | | | | | | | | | | Summary: In the past the canonical way for constructing an SDoc string literal was the composition `ptext . sLit`. But for some time now we have function `text` that does the same. Plus it has some rules that optimize its runtime behaviour. This patch takes all uses of `ptext . sLit` in the compiler and replaces them with calls to `text`. The main benefits of this patch are clener (shorter) code and less dependencies between module, because many modules now do not need to import `FastString`. I don't expect any performance benefits - we mostly use SDocs to report errors and it seems there is little to be gained here. Test Plan: ./validate Reviewers: bgamari, austin, goldfire, hvr, alanz Subscribers: goldfire, thomie, mpickering Differential Revision: https://phabricator.haskell.org/D1784
* Support multiple debug output levelsBen Gamari2015-11-231-3/+3
| | | | | | | | | We now only strip block information from DebugBlocks when compiling with `-g1`, intended to be used when only minimal debug information is desired. `-g2` is assumed when `-g` is passed without any integer argument. Differential Revision: https://phabricator.haskell.org/D1281
* Implement function-sections for Haskell code, #8405Simon Brenner2015-11-122-100/+72
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds a flag -split-sections that does similar things to -split-objs, but using sections in single object files instead of relying on the Satanic Splitter and other abominations. This is very similar to the GCC flags -ffunction-sections and -fdata-sections. The --gc-sections linker flag, which allows unused sections to actually be removed, is added to all link commands (if the linker supports it) so that space savings from having base compiled with sections can be realized. Supported both in LLVM and the native code-gen, in theory for all architectures, but really tested on x86 only. In the GHC build, a new SplitSections variable enables -split-sections for relevant parts of the build. Test Plan: validate with both settings of SplitSections Reviewers: dterei, Phyx, austin, simonmar, thomie, bgamari Reviewed By: simonmar, thomie, bgamari Subscribers: hsyl20, erikd, kgardas, thomie Differential Revision: https://phabricator.haskell.org/D1242 GHC Trac Issues: #8405
* Add subWordC# on x86ishNikita Karetnikov2015-10-311-5/+10
| | | | | | | | | | | | | | | This adds a subWordC# primop which implements subtraction with overflow reporting. Reviewers: tibbe, goldfire, rwbarton, bgamari, austin, hvr Reviewed By: bgamari Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D1334 GHC Trac Issues: #10962
* x86 codegen: don't generate location commentsSergei Trofimovich2015-10-291-3/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | The following source snippet 'module A where x */* y = 42' when being compiled with '-g' option emits syntactically invalid comment for GNU as: .text .align 8 .loc 1 3 1 /* */* */ Fixed by not emitting comments at all. We already suppress all asm comments in 'X86/Ppr.hs'. Signed-off-by: Sergei Trofimovich <siarheit@google.com> Test Plan: added test and check it works Reviewers: scpmw, simonmar, austin, bgamari Reviewed By: simonmar Subscribers: thomie Differential Revision: https://phabricator.haskell.org/D1386 GHC Trac Issues: #10667
* Rename package key to unit ID, and installed package ID to component ID.Edward Z. Yang2015-10-141-4/+4
| | | | | | Comes with Haddock submodule update. Signed-off-by: Edward Z. Yang <ezyang@cs.stanford.edu>