summaryrefslogtreecommitdiff
path: root/compiler/llvmGen/LlvmCodeGen/CodeGen.hs
Commit message (Collapse)AuthorAgeFilesLines
* Stop exporting, and stop using, functions marked as deprecatedThomas Miedema2014-09-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | Don't export `getUs` and `getUniqueUs`. `UniqSM` has a `MonadUnique` instance: instance MonadUnique UniqSM where getUniqueSupplyM = getUs getUniqueM = getUniqueUs getUniquesM = getUniquesUs Commandline-fu used: git grep -l 'getUs\>' | grep -v compiler/basicTypes/UniqSupply.lhs | xargs sed -i 's/getUs/getUniqueSupplyM/g git grep -l 'getUniqueUs\>' | grep -v combiler/basicTypes/UniqSupply.lhs | xargs sed -i 's/getUniqueUs/getUniqueM/g' Follow up on b522d3a3f970a043397a0d6556ca555648e7a9c3 Reviewed By: austin, hvr Differential Revision: https://phabricator.haskell.org/D220
* `M-x delete-trailing-whitespace` & `M-x untabify`...Herbert Valerio Riedel2014-08-311-7/+7
| | | | | | ...some files more or less recently touched by me [ci skip]
* Add MO_AddIntC, MO_SubIntC MachOps and implement in X86 backendReid Barton2014-08-231-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | Summary: These MachOps are used by addIntC# and subIntC#, which in turn are used in integer-gmp when adding or subtracting small Integers. The following benchmark shows a ~6% speedup after this commit on x86_64 (building GHC with BuildFlavour=perf). {-# LANGUAGE MagicHash #-} import GHC.Exts import Criterion.Main count :: Int -> Integer count (I# n#) = go n# 0 where go :: Int# -> Integer -> Integer go 0# acc = acc go n# acc = go (n# -# 1#) $! acc + 1 main = defaultMain [bgroup "count" [bench "100" $ whnf count 100]] Differential Revision: https://phabricator.haskell.org/D140
* Implement new CLZ and CTZ primops (re #9340)Herbert Valerio Riedel2014-08-141-1/+8
| | | | | | | | | | | | | | | | | | | | | | | | This implements the new primops clz#, clz32#, clz64#, ctz#, ctz32#, ctz64# which provide efficient implementations of the popular count-leading-zero and count-trailing-zero respectively (see testcase for a pure Haskell reference implementation). On x86, NCG as well as LLVM generates code based on the BSF/BSR instructions (which need extra logic to make the 0-case well-defined). Test Plan: validate and succesful tests on i686 and amd64 Reviewers: rwbarton, simonmar, ezyang, austin Subscribers: simonmar, relrod, ezyang, carter Differential Revision: https://phabricator.haskell.org/D144 GHC Trac Issues: #9340
* Re-add more primops for atomic ops on byte arraysJohan Tibell2014-06-301-24/+47
| | | | | | | | | | | | | | | | | | | | | | | This is the second attempt to add this functionality. The first attempt was reverted in 950fcae46a82569e7cd1fba1637a23b419e00ecd, due to register allocator failure on x86. Given how the register allocator currently works, we don't have enough registers on x86 to support cmpxchg using complicated addressing modes. Instead we fall back to a simpler addressing mode on x86. Adds the following primops: * atomicReadIntArray# * atomicWriteIntArray# * fetchSubIntArray# * fetchOrIntArray# * fetchXorIntArray# * fetchAndIntArray# Makes these pre-existing out-of-line primops inline: * fetchAddIntArray# * casIntArray#
* Revert "Add more primops for atomic ops on byte arrays"Johan Tibell2014-06-261-47/+24
| | | | | | | | This commit caused the register allocator to fail on i386. This reverts commit d8abf85f8ca176854e9d5d0b12371c4bc402aac3 and 04dd7cb3423f1940242fdfe2ea2e3b8abd68a177 (the second being a fix to the first).
* Add more primops for atomic ops on byte arraysJohan Tibell2014-06-241-24/+47
| | | | | | | | | | | | | | | | | | | Summary: Add more primops for atomic ops on byte arrays Adds the following primops: * atomicReadIntArray# * atomicWriteIntArray# * fetchSubIntArray# * fetchOrIntArray# * fetchXorIntArray# * fetchAndIntArray# Makes these pre-existing out-of-line primops inline: * fetchAddIntArray# * casIntArray#
* Some typos in commentsGabor Greif2014-06-111-1/+1
|
* Add LANGUAGE pragmas to compiler/ source filesHerbert Valerio Riedel2014-05-151-3/+2
| | | | | | | | | | | | | | | | | | In some cases, the layout of the LANGUAGE/OPTIONS_GHC lines has been reorganized, while following the convention, to - place `{-# LANGUAGE #-}` pragmas at the top of the source file, before any `{-# OPTIONS_GHC #-}`-lines. - Moreover, if the list of language extensions fit into a single `{-# LANGUAGE ... -#}`-line (shorter than 80 characters), keep it on one line. Otherwise split into `{-# LANGUAGE ... -#}`-lines for each individual language extension. In both cases, try to keep the enumeration alphabetically ordered. (The latter layout is preferable as it's more diff-friendly) While at it, this also replaces obsolete `{-# OPTIONS ... #-}` pragma occurences by `{-# OPTIONS_GHC ... #-}` pragmas.
* Avoid trivial cases of NondecreasingIndentationHerbert Valerio Riedel2014-05-151-2/+2
| | | | | | | This cleanup allows the following refactoring commit to avoid adding a few `{-# LANGUAGE NondecreasingIndentation #-}` pragmas. Signed-off-by: Herbert Valerio Riedel <hvr@gnu.org>
* Add support for prefetch with locality levels.Austin Seipp2013-10-011-4/+7
| | | | | | | | | | | | | | | | | This patch adds support for several new primitive operations which support using processor-specific instructions to help guide data and cache locality decisions. We have levels ranging from [0..3] For LLVM, we generate llvm.prefetch intrinsics at the proper locality level (similar to GCC.) For x86 we generate prefetch{NTA, t2, t1, t0} instructions. On SPARC and PowerPC, the locality levels are ignored. This closes #8256. Authored-by: Carter Tazio Schonwald <carter.schonwald@gmail.com> Signed-off-by: Austin Seipp <austin@well-typed.com>
* Pass 512-bit-wide vectors in registers.Geoffrey Mainland2013-09-221-0/+1
|
* Pass 256-bit-wide vectors in registers.Geoffrey Mainland2013-09-221-0/+1
|
* SIMD primops are now generated using schemas that are polymorphic inGeoffrey Mainland2013-09-221-0/+6
| | | | | | | | | | | | | width and element type. SIMD primops are now polymorphic in vector size and element type, but only internally to the compiler. More specifically, utils/genprimopcode has been extended so that it "knows" about SIMD vectors. This allows us to, for example, write a single definition for the "add two vectors" primop in primops.txt.pp and have it instantiated at many vector types. This generates a primop in GHC.Prim for each vector type at which "add two vectors" is instantiated, but only one data constructor for the PrimOp data type, so the code generator is much, much simpler.
* TyposKrzysztof Gogolewski2013-09-201-2/+2
|
* Add basic support for GHCJSAustin Seipp2013-09-061-0/+1
| | | | | | | | | | | | | | | | | | | This patch encompasses most of the basic infrastructure for GHCJS. It includes: * A new extension, -XJavaScriptFFI * A new architecture, ArchJavaScript * Parser and lexer support for 'foreign import javascript', only available under -XJavaScriptFFI, using ArchJavaScript. * As a knock-on, there is also a new 'WayCustom' constructor in DynFlags, so clients of the GHC API can add custom 'tags' to their built files. This should be useful for other users as well. The remaining changes are really just the resulting fallout, making sure all the cases are handled appropriately for DynFlags and Platform. Authored-by: Luite Stegeman <stegeman@gmail.com> Signed-off-by: Austin Seipp <aseipp@pobox.com>
* Add support for byte endian swapping for Word 16/32/64.Austin Seipp2013-07-171-26/+35
| | | | | | | | | | | | | * Exposes bSwap{,16,32,64}# primops * Add a new machop: MO_BSwap * Use a Stg implementation (hs_bswap{16,32,64}) for other implementation in NCG. * Generate bswap in X86 NCG for 32 and 64 bits, and for 16 bits, bswap+shr instead of using xchg. * Generate llvm.bswap intrinsics in llvm codegen. Authored-by: Vincent Hanquez <tab@snarc.org> Signed-off-by: Austin Seipp <aseipp@pobox.com>
* Fix llvm.prefetch instrinct for old LLVM versionsPeter Wortmann2013-07-051-3/+6
| | | | | | Seems the last parameter to llvm.prefectch was added in LLVM 3.0. Signed-off-by: David Terei <davidterei@gmail.com>
* Major Llvm refactoringPeter Wortmann2013-06-271-547/+558
| | | | | | | | | | | | | | | | | | | | | | This combined patch reworks the LLVM backend in a number of ways: 1. Most prominently, we introduce a LlvmM monad carrying the contents of the old LlvmEnv around. This patch completely removes LlvmEnv and refactors towards standard library monad combinators wherever possible. 2. Support for streaming - we can now generate chunks of Llvm for Cmm as it comes in. This might improve our speed. 3. To allow streaming, we need a more flexible way to handle forward references. The solution (getGlobalPtr) unifies LlvmCodeGen.Data and getHsFunc as well. 4. Skip alloca-allocation for registers that are actually never written. LLVM will automatically eliminate these, but output is smaller and friendlier to human eyes this way. 5. We use LlvmM to collect references for llvm.used. This allows places other than cmmProcLlvmGens to generate entries.
* Extend globals to aliasesPeter Wortmann2013-06-271-9/+9
| | | | | Also give them a proper constructor - getGlobalVar and getGlobalValue map directly to the accessors.
* Use SDoc for all LLVM pretty-printingPeter Wortmann2013-06-271-13/+14
| | | | | | | This patch reworks some parts of the LLVM pretty-printing code that were still using Show and String. Now we should be using SDoc and Outputable throughout. Note that many get*Name functions become pp*Name here as a side-effect.
* Iteration on dterei's metadata designPeter Wortmann2013-06-271-2/+2
| | | | | | | | | | | | | | | | | | | | | - MetaArgs is not needed, as variables are already meta data - Same goes for MetaVal - its only reason for existing seems to be to support LLVM's strange pretty-printing for meta-data annotations, and I feel that is better to keep the data structure clean and handle it in the pretty-printing instead. - Rename "MetaData" to "MetaAnnot". Meta-data is still meta-data when it is not associated with an expression or statement - for example compile unit data for debugging. I feel the old name was a bit misleading. - Make the renamed MetaAnnot a proper data type instead of a type alias for a pair. - Rename "MetaExpr" constructor to "MetaStruct". As the data is much more like a LLVM structure (not array, as it can contain values). - Fix a warning
* Rework LLVM metadata representation to be more accurate.David Terei2013-06-271-4/+4
|
* Comment out function; consequence of reverting a553f18Simon Peyton Jones2013-06-111-2/+2
|
* Revert "Add support for byte endian swapping for Word 16/32/64."Simon Peyton Jones2013-06-111-36/+24
| | | | This reverts commit 1c5b0511a89488f5280523569d45ee61c0d09ffa.
* Fix warningsIan Lynagh2013-06-091-0/+3
|
* Add support for byte endian swapping for Word 16/32/64.Ian Lynagh2013-06-091-24/+36
| | | | | | | | | | | | * Exposes bSwap{,16,32,64}# primops * Add a new machops MO_BSwap * Use a Stg implementation (hs_bswap{16,32,64}) for other implementation in NCG. * Generate bswap in X86 NCG for 32 and 64 bits, and for 16 bits, bswap+shr instead of using xchg. * Generate llvm.bswap intrinsics in llvm codegen. Patch from Vincent Hanquez.
* Mimic OldCmm basic block ordering in the LLVM backend.Geoffrey Mainland2013-02-011-1/+1
| | | | | | | | | In OldCmm, the false case of a conditional was a fallthrough. In Cmm, conditionals have both true and false successors. When we convert Cmm to LLVM, we now first re-order Cmm blocks so that the false successor of a conditional occurs next in the list of basic blocks, i.e., it is a fallthrough, just like it (necessarily) did in OldCmm. Surprisingly, this can make a big performance difference.
* Add prefetch primops.Geoffrey Mainland2013-02-011-0/+21
|
* Add support for passing SSE vectors in registers.Geoffrey Mainland2013-02-011-32/+89
| | | | | | | This patch adds support for 6 XMM registers on x86-64 which overlap with the F and D registers and may hold 128-bit wide SIMD vectors. Because there is not a good way to attach type information to STG registers, we aggressively bitcast in the LLVM back-end.
* Add the Int32X4# primitive type and associated primops.Paul Monday2013-02-011-0/+23
|
* Add the Float32X4# primitive type and associated primops.Geoffrey Mainland2013-02-011-0/+43
| | | | | | | | | | | | | This patch lays the groundwork needed for primop support for SIMD vectors. In addition to the groundwork, we add support for the FloatX4# primitive type and associated primops. * Add the FloatX4# primitive type and associated primops. * Add CodeGen support for Float vectors. * Compile vector operations to LLVM vector operations in the LLVM code generator. * Make the x86 native backend fail gracefully when encountering vector primops. * Only generate primop wrappers for vector primops when using LLVM.
* Always pass vector values on the stack.Geoffrey Mainland2013-02-011-0/+11
| | | | | Vector values are now always passed on the stack. This isn't particularly efficient, but it will have to do for now.
* fix validate-breaking warningSimon Marlow2013-01-231-1/+0
|
* Fix our handling of literals and types in LLVM (#7575).David Terei2013-01-231-35/+29
| | | | | | | | | | | | This bug was introduced in the recent fix for #7571, that extended some existing infastructure in the LLVM backend that handled the conflict between LLVM's return type from comparison operations (i1) and what GHC expects (word). By extending it to handle literals though, we forced all literals to be i1 or word, breaking other code. This patch resolves this breakage and handles #7571 still, cleaning up the code for both a little. The overall approach is not ideal but changing that is left for the future.
* Ensure the LLVM codegen correctly handles literals in a branch. #7571Austin Seipp2013-01-221-16/+70
| | | | | | | | | | | | | | We need to be sure that when generating code for literals, we properly narrow the type of the literal to i1. See Note [Literals and branch conditions] in the LlvmCodeGen.CodeGen module. This occurs rarely as the optimizer will remove conditional branches with literals, however we can get this situation occurring with hand written Cmm code. This fixes Trac #7571. Signed-off-by: David Terei <davidterei@gmail.com>
* Fix LLVM code generated for word2Float# and word2Double#.Geoffrey Mainland2013-01-031-2/+6
|
* Implement word2Float# and word2Double#Johan Tibell2012-12-131-0/+12
|
* handle MO_Touch, and generate no code for it.Simon Marlow2012-11-121-0/+3
|
* Fix warningsSimon Marlow2012-11-121-2/+3
|
* Remove OldCmm, convert backends to consume new CmmSimon Marlow2012-11-121-95/+102
| | | | | | | | | | | | | | | | | | This removes the OldCmm data type and the CmmCvt pass that converts new Cmm to OldCmm. The backends (NCGs, LLVM and C) have all been converted to consume new Cmm. The main difference between the two data types is that conditional branches in new Cmm have both true/false successors, whereas in OldCmm the false case was a fallthrough. To generate slightly better code we occasionally need to invert a conditional to ensure that the branch-not-taken becomes a fallthrough; this was previously done in CmmCvt, and it is now done in CmmContFlowOpt. We could go further and use the Hoopl Block representation for native code, which would mean that we could use Hoopl's postorderDfs and analyses for native code, but for now I've left it as is, using the old ListGraph representation for native code.
* Generate correct LLVM for the new register allocation scheme.Geoffrey Mainland2012-10-301-32/+31
| | | | | | | | | | | | | We now have accurate global register liveness information attached to all Cmm procedures and jumps. With this patch, the LLVM back end uses this information to pass only the live floating point (F and D) registers on tail calls. This makes the LLVM back end compatible with the new register allocation strategy. Ideally the GHC LLVM calling convention would put all registers that are always live first in the parameter sequence. Unfortunately the specification is written so that on x86-64 SpLim (always live) is passed after the R registers. Therefore we must always pass *something* in the R registers, so we pass the LLVM value undef.
* Attach global register liveness info to Cmm procedures.Geoffrey Mainland2012-10-301-2/+2
| | | | | | | All Cmm procedures now include the set of global registers that are live on procedure entry, i.e., the global registers used to pass arguments to the procedure. Only global registers that are use to pass arguments are included in this list.
* Cmm jumps always have live register information.Geoffrey Mainland2012-10-301-3/+3
| | | | Jumps now always have live register information attached, so drop Maybes.
* Remove the old codegenSimon Marlow2012-10-191-1/+1
| | | | | Except for CgUtils.fixStgRegisters that is used in the NCG and LLVM backends, and should probably be moved somewhere else.
* Some alpha renamingIan Lynagh2012-10-161-1/+1
| | | | | Mostly d -> g (matching DynFlag -> GeneralFlag). Also renamed if* to when*, matching the Haskell if/when names
* Produce new-style Cmm from the Cmm parserSimon Marlow2012-10-081-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The main change here is that the Cmm parser now allows high-level cmm code with argument-passing and function calls. For example: foo ( gcptr a, bits32 b ) { if (b > 0) { // we can make tail calls passing arguments: jump stg_ap_0_fast(a); } return (x,y); } More details on the new cmm syntax are in Note [Syntax of .cmm files] in CmmParse.y. The old syntax is still more-or-less supported for those occasional code fragments that really need to explicitly manipulate the stack. However there are a couple of differences: it is now obligatory to give a list of live GlobalRegs on every jump, e.g. jump %ENTRY_CODE(Sp(0)) [R1]; Again, more details in Note [Syntax of .cmm files]. I have rewritten most of the .cmm files in the RTS into the new syntax, except for AutoApply.cmm which is generated by the genapply program: this file could be generated in the new syntax instead and would probably be better off for it, but I ran out of enthusiasm. Some other changes in this batch: - The PrimOp calling convention is gone, primops now use the ordinary NativeNodeCall convention. This means that primops and "foreign import prim" code must be written in high-level cmm, but they can now take more than 10 arguments. - CmmSink now does constant-folding (should fix #7219) - .cmm files now go through the cmmPipeline, and as a result we generate better code in many cases. All the object files generated for the RTS .cmm files are now smaller. Performance should be better too, but I haven't measured it yet. - RET_DYN frames are removed from the RTS, lots of code goes away - we now have some more canned GC points to cover unboxed-tuples with 2-4 pointers, which will reduce code size a little.
* Move wORD_SIZE into platformConstantsIan Lynagh2012-09-161-2/+3
|
* Pass DynFlags down to llvmWordIan Lynagh2012-09-161-58/+75
|
* Pass DynFlags down to gcWordIan Lynagh2012-09-121-3/+4
|