path: root/compiler/cmm
Commit message    Author    Age    Files    Lines
...
* SIMD primops are now generated using schemas that are polymorphic inGeoffrey Mainland2013-09-222-0/+19
| | | | | | | | | | | | | width and element type. SIMD primops are now polymorphic in vector size and element type, but only internally to the compiler. More specifically, utils/genprimopcode has been extended so that it "knows" about SIMD vectors. This allows us to, for example, write a single definition for the "add two vectors" primop in primops.txt.pp and have it instantiated at many vector types. This generates a primop in GHC.Prim for each vector type at which "add two vectors" is instantiated, but only one data constructor for the PrimOp data type, so the code generator is much, much simpler.
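The "one constructor, many primops" idea can be sketched as below. This is a simplified model, not the output of utils/genprimopcode: the names `VecCat`, `instantiations`, `vecTypeName`, and `addPrimOps`, the naming scheme, and the chosen list of vector types are all assumptions made for illustration.

```haskell
-- Hypothetical, simplified model of a vector primop family.
data VecCat = IntVec | WordVec | FloatVec
  deriving (Eq, Show)

-- One constructor covers every instantiation of "add two vectors":
-- element category, lane count, lane width in bits.
data PrimOp = VecAddOp VecCat Int Int
  deriving (Eq, Show)

-- The (category, lanes, width) triples at which the schema is
-- instantiated (an illustrative selection, not GHC's actual list).
instantiations :: [(VecCat, Int, Int)]
instantiations =
  [ (IntVec, 4, 32), (IntVec, 2, 64), (FloatVec, 4, 32), (FloatVec, 2, 64) ]

-- Name of the vector type for one instantiation (naming scheme assumed).
vecTypeName :: VecCat -> Int -> Int -> String
vecTypeName cat n w = base ++ "X" ++ show n ++ "#"
  where
    base = case cat of
      IntVec   -> "Int" ++ show w
      WordVec  -> "Word" ++ show w
      FloatVec -> if w == 32 then "Float" else "Double"

-- Many user-facing primop names, but a single PrimOp constructor.
addPrimOps :: [(String, PrimOp)]
addPrimOps =
  [ ("plus" ++ vecTypeName c n w, VecAddOp c n w) | (c, n, w) <- instantiations ]
```

A code generator written against this model needs only one case for `VecAddOp`, whatever the number of instantiations.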
* Do not assume that XMM registers are used to pass floating point arguments.Geoffrey Mainland2013-09-221-15/+24
| | | | | | | | | | | On x86-32, the C calling convention specifies that when SSE2 is enabled, vector arguments are passed in xmm* registers; however, float and double arguments are still passed on the stack. This patch allows us to make the same choice for GHC. Even when SSE2 is enabled, we don't want to pass Float and Double arguments in registers because this would change the ABI and break the ability to link with code that was compiled without -msse2. The next patch will enable passing vector arguments in xmm registers on x86-32.
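A toy model of the choice described above, assuming a hypothetical `assignArgs` function and simplified `ArgTy` and `Loc` types (none of these are GHC's actual calling-convention code). Stack slots are numbered by index rather than byte offset to keep the sketch short.

```haskell
-- Toy model; all names here are assumptions for illustration.
data ArgTy = FloatArg | DoubleArg | VecArg
  deriving (Eq, Show)

data Loc = XmmReg Int | StackSlot Int
  deriving (Eq, Show)

-- Assign locations left to right. Vectors always compete for XMM
-- registers; Float and Double do so only when passFloatsInXmm is True.
-- Anything that does not get a register goes to the next stack slot.
assignArgs :: Bool -> Int -> [ArgTy] -> [Loc]
assignArgs passFloatsInXmm numXmm = go 1 0
  where
    go _ _ [] = []
    go r s (a : rest)
      | wantsReg a && r <= numXmm = XmmReg r : go (r + 1) s rest
      | otherwise                 = StackSlot s : go r (s + 1) rest
    wantsReg VecArg = True
    wantsReg _      = passFloatsInXmm
```

With `passFloatsInXmm = False`, `assignArgs False 3 [FloatArg, VecArg]` places the float on the stack and the vector in the first XMM register, which is the x86-32 behaviour the message describes.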
* Comments onlyJan Stolarek2013-09-201-0/+1
|
* 80 columnsSimon Marlow2013-09-141-5/+8
|
* Rename -ddump-cmm-rewrite to -ddump-cmm-sinkJan Stolarek2013-09-131-1/+1
| | | | This makes it consistent with the corresponding -cmm-sink flag
* Improve sinking passJan Stolarek2013-09-123-42/+216
| This commit does two things:
|
|   * Allows duplicating of global registers and literals by inlining
|     them. Previously we would only inline a global register or literal
|     if it was used only once.
|
|   * Changes the method of determining conflicts between a node and an
|     assignment. The new method has two advantages. It relies on the
|     DefinerOfRegs and UserOfRegs typeclasses, so if the set of registers
|     defined or used by a node should ever change, the `conflicts`
|     function will use the changed definition. This definition also
|     catches more cases than the previous one (namely CmmCall and
|     CmmForeignCall), which is a step towards making it possible to run
|     the sinking pass before stack layout (currently this doesn't work).
|
| This patch also adds a lot of comments that are the result of an about
| two-week-long investigation of how the sinking pass works and why it
| does what it does.
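A minimal sketch of a typeclass-driven conflict check. The class names `DefinerOfRegs` and `UserOfRegs` come from the message above, but their methods, the toy `Node` type, and this `conflicts` definition are assumptions of the sketch. It considers registers only and ignores the memory and stack conflicts a real sinking pass must also handle.

```haskell
import qualified Data.Set as Set

type Reg = String

-- Toy node type standing in for Cmm nodes (an assumption of this sketch).
data Node
  = Assign Reg [Reg]   -- destination register, registers read by the RHS
  | Call [Reg] [Reg]   -- result registers, argument registers
  deriving (Eq, Show)

class DefinerOfRegs a where
  definedRegs :: a -> Set.Set Reg

class UserOfRegs a where
  usedRegs :: a -> Set.Set Reg

instance DefinerOfRegs Node where
  definedRegs (Assign r _)  = Set.singleton r
  definedRegs (Call rs _)   = Set.fromList rs

instance UserOfRegs Node where
  usedRegs (Assign _ rs) = Set.fromList rs
  usedRegs (Call _ args) = Set.fromList args

-- Can the assignment `r = rhs` (rhs reads rhsRegs) be moved past node?
-- It conflicts if the node overwrites something the RHS reads, or if
-- the node reads or overwrites r itself.
conflicts :: (DefinerOfRegs n, UserOfRegs n) => (Reg, Set.Set Reg) -> n -> Bool
conflicts (r, rhsRegs) node =
     not (Set.null (definedRegs node `Set.intersection` rhsRegs))
  || r `Set.member` usedRegs node
  || r `Set.member` definedRegs node
```

Because `conflicts` only talks to the two classes, adding a new node constructor means updating the instances, not the conflict check.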
* Fix AMP warnings.Austin Seipp2013-09-112-0/+18
| | | | | Authored-by: David Luposchainsky <dluposchainsky@gmail.com> Signed-off-by: Austin Seipp <austin@well-typed.com>
* Drop proc-points that don't exist in the graph (#8205)Jan Stolarek2013-09-112-20/+50
| | | | | | | | On some architectures it might happen that stack layout pass will invalidate the list of calculated procpoints by dropping some of them. We fix this by checking whether a proc-point is in a graph at the beginning of proc-point analysis. This is a speculative fix for #8205.
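The check described above amounts to filtering the proc-point set against the labels present in the graph. A minimal sketch, assuming a graph represented as a map from labels to blocks (the `Graph` type and `dropMissingProcPoints` are hypothetical names, not GHC's):

```haskell
import qualified Data.Map as Map
import qualified Data.Set as Set

type Label = Int

-- A graph modelled as a map from block labels to block contents.
type Graph block = Map.Map Label block

-- Keep only the proc-points whose label still names a block in the graph.
dropMissingProcPoints :: Graph block -> Set.Set Label -> Set.Set Label
dropMissingProcPoints graph = Set.filter (`Map.member` graph)
```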
* Remove dead codeJan Stolarek2013-09-101-7/+1
|
* Add basic support for GHCJSAustin Seipp2013-09-061-0/+1
| This patch encompasses most of the basic infrastructure for GHCJS. It
| includes:
|
|   * A new extension, -XJavaScriptFFI
|   * A new architecture, ArchJavaScript
|   * Parser and lexer support for 'foreign import javascript', only
|     available under -XJavaScriptFFI, using ArchJavaScript.
|   * As a knock-on, there is also a new 'WayCustom' constructor in
|     DynFlags, so clients of the GHC API can add custom 'tags' to their
|     built files. This should be useful for other users as well.
|
| The remaining changes are really just the resulting fallout, making sure
| all the cases are handled appropriately for DynFlags and Platform.
|
| Authored-by: Luite Stegeman <stegeman@gmail.com>
| Signed-off-by: Austin Seipp <aseipp@pobox.com>
* Fix definition of DefinerOfRegs for CmmForeignCallJan Stolarek2013-09-042-8/+78
| | | | And update comments
* Comments and type synonym in CmmSinkJan Stolarek2013-09-032-22/+35
|
* Comments onlyJan Stolarek2013-09-021-20/+40
|
* Whitespaces and comment formattingJan Stolarek2013-08-291-3/+7
|
* Strings and comments only: 'to to ' fixesGabor Greif2013-08-221-1/+1
| | | | I'd still prefer if a native English speaker would check them.
* Only use real XMM registers when assigning arguments.Geoffrey Mainland2013-08-061-5/+4
| | | | | | | | My original change to the calling convention mistakenly used all 6 XMM registers---which live in the global register table---on x86 (32 bit). This royally screwed up the floating point code generated for that platform because floating point arguments were passed in global registers instead of on the stack!
* Rename SSE -> XMM for consistency.Geoffrey Mainland2013-08-061-13/+13
| | | | | We were using SSE in some places and XMM in others. Better to keep a consistent naming scheme.
* Implement "roles" into GHC.Richard Eisenberg2013-08-021-2/+40
| | | | | | | | | | | | | | | | Roles are a solution to the GeneralizedNewtypeDeriving type-safety problem. Roles were first described in the "Generative type abstraction" paper, by Stephanie Weirich, Dimitrios Vytiniotis, Simon PJ, and Steve Zdancewic. The implementation is a little different than that paper. For a quick primer, check out Note [Roles] in Coercion. Also see http://ghc.haskell.org/trac/ghc/wiki/Roles and http://ghc.haskell.org/trac/ghc/wiki/RolesImplementation For a more formal treatment, check out docs/core-spec/core-spec.pdf. This fixes Trac #1496, #4846, #7148.
* Fix a bug in stack layout with safe foreign calls (#8083)Simon Marlow2013-07-246-20/+21
| | | | | | | We weren't properly tracking the number of stack arguments in the continuation of a foreign call. It happened to work when the continuation was not a join point, but when it was a join point we were using the wrong amount of stack fixup.
* Temporarily disable common block elimination; fixes #8083 for nowIan Lynagh2013-07-231-3/+5
|
* Add support for byte endian swapping for Word 16/32/64.Austin Seipp2013-07-172-0/+2
|   * Exposes bSwap{,16,32,64}# primops
|   * Add a new machop: MO_BSwap
|   * Use a Stg implementation (hs_bswap{16,32,64}) for other
|     implementation in NCG.
|   * Generate bswap in X86 NCG for 32 and 64 bits, and for 16 bits,
|     bswap+shr instead of using xchg.
|   * Generate llvm.bswap intrinsics in llvm codegen.
|
| Authored-by: Vincent Hanquez <tab@snarc.org>
| Signed-off-by: Austin Seipp <aseipp@pobox.com>
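The "bswap+shr" strategy for 16 bits can be illustrated at the Haskell level: widen to 32 bits, swap all four bytes, then shift the result right by 16. The sketch below uses `byteSwap16` and `byteSwap32` from `Data.Word` in base; `bswap16ViaBswap32` and `agreesWithLibrary` are hypothetical names, and this models the arithmetic only, not the instructions the NCG emits.

```haskell
import Data.Bits (shiftR)
import Data.Word (Word16, Word32, byteSwap16, byteSwap32)

-- 16-bit swap expressed as a 32-bit swap followed by a right shift.
bswap16ViaBswap32 :: Word16 -> Word16
bswap16ViaBswap32 w = fromIntegral (byteSwap32 widened `shiftR` 16)
  where
    widened = fromIntegral w :: Word32

-- Exhaustive comparison against the library's 16-bit swap.
agreesWithLibrary :: Bool
agreesWithLibrary =
  all (\w -> bswap16ViaBswap32 w == byteSwap16 w) [minBound .. maxBound]
```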
* Fix many ASSERT uses under Clang.Austin Seipp2013-06-181-1/+1
| | | | | | Clang doesn't like whitespace between macro and arguments. Signed-off-by: Austin Seipp <aseipp@pobox.com>
* Revert "Add support for byte endian swapping for Word 16/32/64."Simon Peyton Jones2013-06-112-2/+0
| | | | This reverts commit 1c5b0511a89488f5280523569d45ee61c0d09ffa.
* Add support for byte endian swapping for Word 16/32/64.Ian Lynagh2013-06-092-0/+2
|   * Exposes bSwap{,16,32,64}# primops
|   * Add a new machop: MO_BSwap
|   * Use a Stg implementation (hs_bswap{16,32,64}) for other
|     implementation in NCG.
|   * Generate bswap in X86 NCG for 32 and 64 bits, and for 16 bits,
|     bswap+shr instead of using xchg.
|   * Generate llvm.bswap intrinsics in llvm codegen.
|
| Patch from Vincent Hanquez.
* Implement cardinality analysisSimon Peyton Jones2013-06-061-2/+1
| This major patch implements the cardinality analysis described
| in our paper "Higher order cardinality analysis". It is joint
| work with Ilya Sergey and Dimitrios Vytiniotis.
|
| The basic idea is to augment the absence-analysis part of the demand
| analyser so that it can tell when something is used
|     never
|     at most once
|     some other way
|
| The "at most once" information is used
|     a) to enable transformations, and
|        in particular to identify one-shot lambdas
|     b) to allow updates on thunks to be omitted.
|
| There are two new flags, mainly there so you can do performance
| comparisons:
|     -fkill-absence   stops GHC doing absence analysis at all
|     -fkill-one-shot  stops GHC spotting one-shot lambdas
|                      and single-entry thunks
|
| The big changes are:
|
| * The Demand type is substantially refactored. In particular
|   the UseDmd is factored as follows
|       data UseDmd
|         = UCall Count UseDmd
|         | UProd [MaybeUsed]
|         | UHead
|         | Used
|
|       data MaybeUsed = Abs | Use Count UseDmd
|
|       data Count = One | Many
|
|   Notice that UCall recurses straight to UseDmd, whereas
|   UProd goes via MaybeUsed.
|
|   The "Count" embodies the "at most once" or "many" idea.
|
| * The demand analyser itself was refactored a lot
|
| * The previously ad-hoc stuff in the occurrence analyser for foldr and
|   build goes away entirely. Before if we had build (\cn -> ...x... )
|   then the "\cn" was hackily made one-shot (by spotting 'build' as
|   special). That's essential to allow x to be inlined. Now the
|   occurrence analyser propagates info gotten from 'build's strictness
|   signature (so build isn't special); and that strictness sig is
|   in turn derived entirely automatically. Much nicer!
|
| * The ticky stuff is improved to count single-entry thunks separately.
|
| One shortcoming is that there is no DEBUG way to spot if an
| allegedly-single-entry thunk is actually entered more than once. It
| would not be hard to generate a bit of code to check for this, and it
| would be reassuring. But it's fiddly and I have not done it.
|
| Despite all this fuss, the performance numbers are rather under-whelming.
| See the paper for more discussion.
|
|         nucleic2          -0.8%    -10.9%      0.10      0.10     +0.0%
|           sphere          -0.7%     -1.5%      0.08      0.08     +0.0%
| --------------------------------------------------------------------------------
|              Min          -4.7%    -10.9%     -9.3%     -9.3%    -50.0%
|              Max          -0.4%     +0.5%     +2.2%     +2.3%     +7.4%
|   Geometric Mean          -0.8%     -0.2%     -1.3%     -1.3%     -1.8%
|
| I don't quite know how much credence to place in the runtime changes,
| but movement seems generally in the right direction.
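To make the two uses of the "at most once" information concrete, here is a small sketch built on the data types quoted in the message. The helper functions `isOneShotCall` and `canOmitUpdate` are hypothetical and are not GHC's demand-analyser API; they only show how a `Count` of `One` could drive each decision.

```haskell
-- Data types as quoted in the commit message.
data Count = One | Many
  deriving (Eq, Show)

data UseDmd
  = UCall Count UseDmd
  | UProd [MaybeUsed]
  | UHead
  | Used
  deriving (Eq, Show)

data MaybeUsed = Abs | Use Count UseDmd
  deriving (Eq, Show)

-- Hypothetical helper: a lambda is one-shot if it is called at most once.
isOneShotCall :: UseDmd -> Bool
isOneShotCall (UCall One _) = True
isOneShotCall _             = False

-- Hypothetical helper: a thunk that is never used, or used at most once,
-- does not need to be updated with its value after evaluation.
canOmitUpdate :: MaybeUsed -> Bool
canOmitUpdate Abs          = True
canOmitUpdate (Use One _)  = True
canOmitUpdate (Use Many _) = False
```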
* Comments and white space onlySimon Peyton Jones2013-06-061-3/+3
|
* Fix the GHC package DLL-splittingIan Lynagh2013-05-141-2/+2
| | | | | | | There's now an internal -dll-split flag, which we use to tell GHC how the GHC package is split into 2 separate DLLs. This is used by Packages.isDllName to determine whether a call is within the same DLL, or whether it is a call to another DLL.
* Make the current module available to labelDynamicIan Lynagh2013-05-131-2/+2
| | | | It doesn't actually use it yet
* Treat foreign imported things in CMM as being in this packageIan Lynagh2013-05-091-1/+1
| | | | | | They used to be treated as being in an external package, which went wrong on Windows (it tried to call them via an imp wrapper, rather than calling them directly).
* In CMM, only allow foreign calls to labels, not arbitrary expressionsIan Lynagh2013-04-244-28/+19
| | | | | | | | | I'm not sure if we want to make this change permanently, but for now it fixes the unreg build. I've also removed some redundant special-case code that generated prototypes for foreign functions. The standard pprTempAndExternDecls now generates them.
* Merge branch 'master' of http://darcs.haskell.org/ghcSimon Peyton Jones2013-04-1913-114/+89
|\
| * Whitespace only in CmmNodeIan Lynagh2013-04-141-21/+14
| |
| * Merge branch 'master' of darcs.haskell.org:/srv/darcs//ghcIan Lynagh2013-04-062-19/+8
| |\
| | * Rewrite usingInconsistentPicReg as a table for clarityGabor Greif2013-04-061-5/+5
| | | | | | | | | | | | No change in functionality intended
| | * Derive instance Eq for CmmNodeGabor Greif2013-04-061-14/+3
| | |
| * | Detab modules with tabs on 5 lines or fewerIan Lynagh2013-04-063-32/+13
| |/
| * Fix typosGabor Greif2013-04-061-3/+3
| |
| * ticky enhancementsNicolas Frisby2013-03-292-14/+43
| |   * the new StgCmmArgRep module breaks a dependency cycle; I also
| |     untabified it, but made no real changes
| |   * updated the documentation in the wiki and changed the user guide
| |     to point there
| |   * moved the allocation enters for ticky and CCS to after the heap
| |     check
| |       * I left LDV where it was, which was before the heap check at
| |         least once, since I have no idea what it is
| |   * standardized all (active?) ticky alloc totals to bytes
| |   * in order to avoid double counting, StgCmmLayout.adjustHpBackwards
| |     no longer bumps ALLOC_HEAP_ctr
| |   * I resurrected the SLOW_CALL counters
| |       * the new module StgCmmArgRep breaks the cyclic dependency
| |         between Layout and Ticky (which the SLOW_CALL counters cause)
| |       * renamed them SLOW_CALL_fast_<pattern> and VERY_SLOW_CALL
| |   * added ALLOC_RTS_ctr and _tot ticky counters
| |       * e.g. allocation by Storage.c:allocate or a BUILD_PAP in
| |         stg_ap_*_info
| |   * resurrected ticky counters for ALLOC_THK, ALLOC_PAP, and ALLOC_PRIM
| |   * added -ticky and -DTICKY_TICKY in ways.mk for debug ways
| |   * added a ticky counter for total LNE entries
| |   * new flags for ticky: -ticky-allocd -ticky-dyn-thunk -ticky-LNE
| |       * all off by default
| |       * -ticky-allocd: tracks allocation *of* closure in addition to
| |         allocation *by* that closure
| |       * -ticky-dyn-thunk tracks dynamic thunks as if they were functions
| |       * -ticky-LNE tracks LNEs as if they were functions
| |   * updated the ticky report format, including making the argument
| |     categories (more?) accurate again
| |   * the printed name for things in the report includes the unique of
| |     their ticky parent as well as whether they are not top-level
| * Remove unnecessary warnings suppressions, fixes ticket #7756; thanks monoidal for submitting.Edward Z. Yang2013-03-095-11/+1
| | Signed-off-by: Edward Z. Yang <ezyang@mit.edu>
| * Remove warning-suppression (not needed)Simon Peyton Jones2013-03-091-5/+0
| |
| * Remove unused functions cmmConstrTag, cmmGetTagSimon Peyton Jones2013-03-091-7/+4
| | | | | | | | | | Patch offered by Boris Sukholitko <boriss@gmail.com> Trac #7757
| * commentsSimon Marlow2013-03-051-2/+3
| |
* | Comment onlySimon Peyton Jones2013-04-191-1/+1
|/
* Mimic OldCmm basic block ordering in the LLVM backend.Geoffrey Mainland2013-02-011-1/+30
| | | | | | | | | In OldCmm, the false case of a conditional was a fallthrough. In Cmm, conditionals have both true and false successors. When we convert Cmm to LLVM, we now first re-order Cmm blocks so that the false successor of a conditional occurs next in the list of basic blocks, i.e., it is a fallthrough, just like it (necessarily) did in OldCmm. Surprisingly, this can make a big performance difference.
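A minimal sketch of such a re-ordering on a toy block representation. The `Term` and `Block` types and the `reorder` function are assumptions made for illustration, not the LLVM backend's code. Blocks are emitted in their original order, except that the false successor of each conditional is pulled forward to come immediately after it whenever it has not been emitted already.

```haskell
import qualified Data.Map as Map
import qualified Data.Set as Set

type Label = Int

-- Block terminators; CondBranch carries the true then the false successor.
data Term = Goto Label | CondBranch Label Label | Return
  deriving (Eq, Show)

type Block = (Label, Term)

-- Emit blocks in input order, but visit the false successor of every
-- conditional right after the conditional so that it falls through.
reorder :: [Block] -> [Block]
reorder blocks = go Set.empty (map fst blocks)
  where
    table = Map.fromList blocks
    go _ [] = []
    go seen (l : ls)
      | l `Set.member` seen = go seen ls
      | otherwise =
          case Map.lookup l table of
            Nothing -> go seen ls   -- label with no block: skip it
            Just t  -> (l, t) : go (Set.insert l seen) (next t ++ ls)
    next (CondBranch _ f) = [f]
    next _                = []
```

For the input `[(1, CondBranch 2 3), (2, Return), (3, Goto 2)]`, block 3 is moved ahead of block 2 so that it directly follows the conditional in block 1.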
* Add prefetch primops.Geoffrey Mainland2013-02-012-0/+5
|
* Add support for passing SSE vectors in registers.Geoffrey Mainland2013-02-015-19/+51
| | | | | | | This patch adds support for 6 XMM registers on x86-64 which overlap with the F and D registers and may hold 128-bit wide SIMD vectors. Because there is not a good way to attach type information to STG registers, we aggressively bitcast in the LLVM back-end.
* Add the Int32X4# primitive type and associated primops.Paul Monday2013-02-012-0/+52
|
* Add the Float32X4# primitive type and associated primops.Geoffrey Mainland2013-02-012-0/+59
| This patch lays the groundwork needed for primop support for SIMD
| vectors. In addition to the groundwork, we add support for the FloatX4#
| primitive type and associated primops.
|
|   * Add the FloatX4# primitive type and associated primops.
|   * Add CodeGen support for Float vectors.
|   * Compile vector operations to LLVM vector operations in the LLVM
|     code generator.
|   * Make the x86 native backend fail gracefully when encountering
|     vector primops.
|   * Only generate primop wrappers for vector primops when using LLVM.
* Always pass vector values on the stack.Geoffrey Mainland2013-02-011-10/+24
| | | | | Vector values are now always passed on the stack. This isn't particularly efficient, but it will have to do for now.
* Add a bits128 type to C--.Geoffrey Mainland2013-02-012-0/+5
|