path: root/compiler/llvmGen/LlvmCodeGen
Commit message (Author, Date; files changed, lines -/+)
...
* Add support for byte endian swapping for Word 16/32/64. (Ian Lynagh, 2013-06-09; 1 file, -24/+36)
    * Exposes the bSwap{,16,32,64}# primops.
    * Adds a new MachOp, MO_BSwap.
    * Falls back to an Stg implementation (hs_bswap{16,32,64}) for code generators that do not implement it natively.
    * Generates bswap in the X86 NCG for 32 and 64 bits, and bswap+shr for 16 bits instead of using xchg.
    * Generates llvm.bswap intrinsics in the LLVM codegen.
    Patch from Vincent Hanquez.
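    As a rough illustration of the operation these primops expose (a minimal sketch, not the commit's implementation; bswap32Ref is a hypothetical name, not the primop or the hs_bswap32 symbol):

        import Data.Bits (shiftL, shiftR, (.&.), (.|.))
        import Data.Word (Word32)

        -- Reverse the byte order of a 32-bit word.
        bswap32Ref :: Word32 -> Word32
        bswap32Ref w =
              (w `shiftR` 24)                    -- byte 3 -> byte 0
          .|. ((w `shiftR` 8) .&. 0x0000FF00)    -- byte 2 -> byte 1
          .|. ((w `shiftL` 8) .&. 0x00FF0000)    -- byte 1 -> byte 2
          .|. (w `shiftL` 24)                    -- byte 0 -> byte 3

    For example, bswap32Ref 0x11223344 evaluates to 0x44332211; a native bswap instruction or the llvm.bswap intrinsic performs the same transformation in a single operation.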
* Add iOS specific module layout entry to LLVM codegen; fixes #7721 (Ian Lynagh, 2013-03-02; 1 file, -0/+3)
    Patch from Stephen Blackheath.
* Mimic OldCmm basic block ordering in the LLVM backend. (Geoffrey Mainland, 2013-02-01; 1 file, -1/+1)
    In OldCmm, the false case of a conditional was a fallthrough. In Cmm, conditionals have both true and false successors. When we convert Cmm to LLVM, we now first re-order Cmm blocks so that the false successor of a conditional occurs next in the list of basic blocks, i.e., it is a fallthrough, just like it (necessarily) did in OldCmm. Surprisingly, this can make a big performance difference.
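    One way to picture the reordering (a hypothetical sketch with made-up Block/Exit types, not GHC's Cmm representation or its actual algorithm): a depth-first layout that always visits the false successor first places it immediately after its conditional, so it becomes a fallthrough.

        import qualified Data.Map as M
        import qualified Data.Set as S

        data Exit  = Cond Int Int          -- false successor, true successor
                   | Goto Int
                   | Ret
        data Block = Block { blockId :: Int, blockExit :: Exit }

        successors :: Exit -> [Int]
        successors (Cond f t) = [f, t]     -- visit the false branch first
        successors (Goto l)   = [l]
        successors Ret        = []

        -- Depth-first layout from the entry block: because the false successor
        -- is pushed first, it is emitted directly after its conditional when
        -- possible. Blocks unreachable from the entry are dropped in this
        -- simplified sketch.
        layout :: Int -> [Block] -> [Block]
        layout entry blocks = go S.empty [entry]
          where
            byId = M.fromList [(blockId b, b) | b <- blocks]
            go _ [] = []
            go seen (l:ls)
              | l `S.member` seen         = go seen ls
              | Just b <- M.lookup l byId = b : go (S.insert l seen)
                                                   (successors (blockExit b) ++ ls)
              | otherwise                 = go seen ls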
* Add prefetch primops. (Geoffrey Mainland, 2013-02-01; 1 file, -0/+21)
* Add support for passing SSE vectors in registers. (Geoffrey Mainland, 2013-02-01; 3 files, -37/+102)
    This patch adds support for 6 XMM registers on x86-64 which overlap with the F and D registers and may hold 128-bit wide SIMD vectors. Because there is not a good way to attach type information to STG registers, we aggressively bitcast in the LLVM back-end.
* Add the Int32X4# primitive type and associated primops. (Paul Monday, 2013-02-01; 1 file, -0/+23)
* Add the Float32X4# primitive type and associated primops. (Geoffrey Mainland, 2013-02-01; 1 file, -0/+43)
    This patch lays the groundwork needed for primop support for SIMD vectors. In addition to the groundwork, we add support for the FloatX4# primitive type and associated primops.
    * Add the FloatX4# primitive type and associated primops.
    * Add CodeGen support for Float vectors.
    * Compile vector operations to LLVM vector operations in the LLVM code generator.
    * Make the x86 native backend fail gracefully when encountering vector primops.
    * Only generate primop wrappers for vector primops when using LLVM.
* Always pass vector values on the stack. (Geoffrey Mainland, 2013-02-01; 1 file, -0/+11)
    Vector values are now always passed on the stack. This isn't particularly efficient, but it will have to do for now.
* Add Cmm support for representing 128-bit-wide SIMD vectors. (Geoffrey Mainland, 2013-02-01; 2 files, -1/+10)
* Add support for cross-compiling to Android (Nathan, 2013-01-24; 1 file, -0/+3)
    Signed-off-by: David Terei <davidterei@gmail.com>
* fix validate-breaking warning (Simon Marlow, 2013-01-23; 1 file, -1/+0)
* Fix our handling of literals and types in LLVM (#7575). (David Terei, 2013-01-23; 1 file, -35/+29)
    This bug was introduced in the recent fix for #7571, which extended some existing infrastructure in the LLVM backend that handled the conflict between LLVM's return type from comparison operations (i1) and what GHC expects (word). By extending it to handle literals, though, we forced all literals to be i1 or word, breaking other code. This patch resolves that breakage while still handling #7571, cleaning up the code for both a little. The overall approach is not ideal, but changing that is left for the future.
* Ensure the LLVM codegen correctly handles literals in a branch. #7571 (Austin Seipp, 2013-01-22; 1 file, -16/+70)
    We need to be sure that when generating code for literals, we properly narrow the type of the literal to i1. See Note [Literals and branch conditions] in the LlvmCodeGen.CodeGen module. This occurs rarely, as the optimizer will remove conditional branches with literals; however, we can get this situation with hand-written Cmm code. This fixes Trac #7571.
    Signed-off-by: David Terei <davidterei@gmail.com>
* Up supported LLVM version to 3.3. (David Terei, 2013-01-18; 1 file, -1/+1)
    Actual support is in progress, but we will accept bugs against this version. LLVM 3.2 seems in good shape at this point anyway.
* Fix LLVM code generated for word2Float# and word2Double#. (Geoffrey Mainland, 2013-01-03; 1 file, -2/+6)
* Implement word2Float# and word2Double# (Johan Tibell, 2012-12-13; 1 file, -0/+12)
* handle MO_Touch, and generate no code for it. (Simon Marlow, 2012-11-12; 1 file, -0/+3)
* Fix warnings (Simon Marlow, 2012-11-12; 1 file, -2/+3)
* Remove OldCmm, convert backends to consume new Cmm (Simon Marlow, 2012-11-12; 4 files, -98/+105)
    This removes the OldCmm data type and the CmmCvt pass that converts new Cmm to OldCmm. The backends (NCGs, LLVM and C) have all been converted to consume new Cmm.
    The main difference between the two data types is that conditional branches in new Cmm have both true/false successors, whereas in OldCmm the false case was a fallthrough. To generate slightly better code we occasionally need to invert a conditional to ensure that the branch-not-taken becomes a fallthrough; this was previously done in CmmCvt, and it is now done in CmmContFlowOpt.
    We could go further and use the Hoopl Block representation for native code, which would mean that we could use Hoopl's postorderDfs and analyses for native code, but for now I've left it as is, using the old ListGraph representation for native code.
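    The inversion described above can be pictured with a tiny stand-alone sketch (hypothetical types and names, not GHC's CmmContFlowOpt code): negate the condition and swap the successors, so the block laid out next is reached via the branch-not-taken edge.

        -- Hypothetical conditional-branch representation, for illustration only.
        data CondBranch = CondBranch
          { condText    :: String  -- textual stand-in for the condition expression
          , trueTarget  :: Int
          , falseTarget :: Int
          } deriving Show

        -- Negate the condition and swap the successors.
        invertCond :: CondBranch -> CondBranch
        invertCond (CondBranch c t f) = CondBranch ("!(" ++ c ++ ")") f t

        -- Ensure the block laid out next is the branch-not-taken (fallthrough)
        -- case, inverting the conditional if necessary.
        ensureFallthrough :: Int -> CondBranch -> CondBranch
        ensureFallthrough next br
          | falseTarget br == next = br             -- already falls through
          | trueTarget  br == next = invertCond br  -- make 'next' the not-taken edge
          | otherwise              = br             -- neither successor follows; leave it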
* Generate correct LLVM for the new register allocation scheme. (Geoffrey Mainland, 2012-10-30; 3 files, -48/+57)
    We now have accurate global register liveness information attached to all Cmm procedures and jumps. With this patch, the LLVM back end uses this information to pass only the live floating point (F and D) registers on tail calls. This makes the LLVM back end compatible with the new register allocation strategy.
    Ideally the GHC LLVM calling convention would put all registers that are always live first in the parameter sequence. Unfortunately the specification is written so that on x86-64 SpLim (always live) is passed after the R registers. Therefore we must always pass *something* in the R registers, so we pass the LLVM value undef.
* Draw STG F and D registers from the same pool of available SSE registers on x86-64. (Geoffrey Mainland, 2012-10-30; 1 file, -0/+6)
    On x86-64 F and D registers are both drawn from SSE registers, so there is no reason not to draw them from the same pool of available SSE registers. This means that whereas previously a function could only receive two Double arguments in registers even if it did not have any Float arguments, now it can receive up to 6 arguments that are any mix of Float and Double in registers.
    This patch breaks the LLVM back end. The next patch will fix this breakage.
* Attach global register liveness info to Cmm procedures. (Geoffrey Mainland, 2012-10-30; 2 files, -3/+3)
    All Cmm procedures now include the set of global registers that are live on procedure entry, i.e., the global registers used to pass arguments to the procedure. Only global registers that are used to pass arguments are included in this list.
* Cmm jumps always have live register information. (Geoffrey Mainland, 2012-10-30; 1 file, -3/+3)
    Jumps now always have live register information attached, so drop Maybes.
* Remove the old codegen (Simon Marlow, 2012-10-19; 2 files, -2/+2)
    Except for CgUtils.fixStgRegisters, which is used in the NCG and LLVM backends and should probably be moved somewhere else.
* Some alpha renaming (Ian Lynagh, 2012-10-16; 1 file, -1/+1)
    Mostly d -> g (matching DynFlag -> GeneralFlag). Also renamed if* to when*, matching the Haskell if/when names.
* Produce new-style Cmm from the Cmm parser (Simon Marlow, 2012-10-08; 1 file, -1/+0)
    The main change here is that the Cmm parser now allows high-level cmm code with argument-passing and function calls. For example:

        foo ( gcptr a, bits32 b )
        {
          if (b > 0) {
             // we can make tail calls passing arguments:
             jump stg_ap_0_fast(a);
          }
          return (x,y);
        }

    More details on the new cmm syntax are in Note [Syntax of .cmm files] in CmmParse.y.
    The old syntax is still more-or-less supported for those occasional code fragments that really need to explicitly manipulate the stack. However there are a couple of differences: it is now obligatory to give a list of live GlobalRegs on every jump, e.g.

        jump %ENTRY_CODE(Sp(0)) [R1];

    Again, more details in Note [Syntax of .cmm files].
    I have rewritten most of the .cmm files in the RTS into the new syntax, except for AutoApply.cmm which is generated by the genapply program: this file could be generated in the new syntax instead and would probably be better off for it, but I ran out of enthusiasm.
    Some other changes in this batch:
    - The PrimOp calling convention is gone, primops now use the ordinary NativeNodeCall convention. This means that primops and "foreign import prim" code must be written in high-level cmm, but they can now take more than 10 arguments.
    - CmmSink now does constant-folding (should fix #7219)
    - .cmm files now go through the cmmPipeline, and as a result we generate better code in many cases. All the object files generated for the RTS .cmm files are now smaller. Performance should be better too, but I haven't measured it yet.
    - RET_DYN frames are removed from the RTS, lots of code goes away
    - we now have some more canned GC points to cover unboxed-tuples with 2-4 pointers, which will reduce code size a little.
* Move wORD_SIZE into platformConstants (Ian Lynagh, 2012-09-16; 3 files, -10/+11)
* Pass DynFlags down to llvmWord (Ian Lynagh, 2012-09-16; 5 files, -91/+111)
* Pass DynFlags down to gcWord (Ian Lynagh, 2012-09-12; 2 files, -5/+6)
* Pass DynFlags down to bWord (Ian Lynagh, 2012-09-12; 2 files, -11/+17)
    I've switched to passing DynFlags rather than Platform, as (a) it's simpler to not have to extract targetPlatform in so many places, and (b) it may be useful to have DynFlags around in future.
* Remove some CPP from llvmGen/LlvmCodeGen/Ppr.hs (Ian Lynagh, 2012-08-28; 1 file, -35/+24)
    I changed the behaviour slightly, e.g. i386/FreeBSD will no longer fall through and use the Linux "i386-pc-linux-gnu", but will get the final empty case instead. I assume that that's the right thing to do.
* Move activeStgRegs into CodeGen.Platform (Ian Lynagh, 2012-08-21; 2 files, -16/+25)
* Fix inverted test for platformUnregisterised (should fix the optllvm breakage) (Simon Marlow, 2012-08-21; 1 file, -2/+2)
* Define callerSaves for all platforms (Ian Lynagh, 2012-08-07; 1 file, -5/+5)
    This means that we now generate the same code whatever platform we are on, which should help avoid changes on one platform breaking the build on another. It's also another step towards full cross-compilation.
* Add "Unregisterised" as a field in the settings fileIan Lynagh2012-08-072-14/+16
| | | | | | To explicitly choose whether you want an unregisterised build you now need to use the "--enable-unregisterised"/"--disable-unregisterised" configure flags.
* New codegen: do not split proc-points when using the NCG (Simon Marlow, 2012-07-30; 1 file, -2/+3)
    Proc-point splitting is only required by backends that do not support having proc-points within a code block (that is, everything except the native backend, i.e. LLVM and C). Not doing proc-point splitting saves some compilation time, and might produce slightly better code in some cases.
* Warn if using unsupported version of LLVM. (David Terei, 2012-06-25; 1 file, -2/+9)
* Fix #6158. (David Terei, 2012-06-25; 1 file, -3/+15)
    LLVM 3.1 doesn't like certain constructions that 3.0 and earlier did, so we avoid them.
* Remove some more redundant Platform arguments (Ian Lynagh, 2012-06-20; 1 file, -3/+3)
* Use SDoc rather than Doc in LLVM (Ian Lynagh, 2012-06-12; 3 files, -9/+14)
    In particular, this makes life simpler when we want to use a general GHC SDoc in the middle of some LLVM.
* Add a quotRemWord2 primop (Ian Lynagh, 2012-04-21; 1 file, -6/+7)
    It allows you to do (high, low) `quotRem` d, provided high < d. Currently it only has an inefficient fallback implementation.
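    For reference, the intended semantics can be written in ordinary Haskell via Integer (a minimal sketch under the same high < d precondition; quotRemWord2Ref is a hypothetical name, not the primop or its fallback):

        import Data.Bits (finiteBitSize)
        import Data.Word (Word)

        -- Divide the two-word value high*2^wordBits + low by d, returning
        -- (quotient, remainder). Requires high < d so the quotient fits in a Word.
        quotRemWord2Ref :: Word -> Word -> Word -> (Word, Word)
        quotRemWord2Ref high low d = (fromInteger q, fromInteger r)
          where
            wordBits = finiteBitSize (0 :: Word)
            n        = toInteger high * 2 ^ wordBits + toInteger low
            (q, r)   = n `quotRem` toInteger d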
* Fix the unregisterised build; fixes #5901 (Ian Lynagh, 2012-02-27; 1 file, -5/+6)
* Add a 2-word-multiply operator (Ian Lynagh, 2012-02-24; 1 file, -0/+1)
    Currently no NCGs support it.
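    The operator's semantics, for reference (a sketch via Integer with a hypothetical name, not the MachOp's implementation): the full double-width product of two Words, split into high and low words.

        import Data.Bits (finiteBitSize)
        import Data.Word (Word)

        -- Full product of two Words as a (high, low) pair of Words.
        timesWord2Ref :: Word -> Word -> (Word, Word)
        timesWord2Ref x y = (fromInteger (p `quot` base), fromInteger (p `rem` base))
          where
            base = 2 ^ finiteBitSize (0 :: Word)
            p    = toInteger x * toInteger y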
* Add a Word add-with-carry primop (Ian Lynagh, 2012-02-23; 1 file, -11/+10)
    No special-casing in any NCGs yet.
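    Again for reference only, the operation can be expressed in plain Haskell (a minimal sketch; addWithCarryRef is a hypothetical name, and the (sum, carry) result order here need not match the primop's actual signature):

        import Data.Word (Word)

        -- Add two Words, returning the wrapped sum and a carry-out flag (0 or 1).
        addWithCarryRef :: Word -> Word -> (Word, Word)
        addWithCarryRef x y = (s, carry)
          where
            s     = x + y                   -- wraps modulo 2^wordSize
            carry = if s < x then 1 else 0  -- the sum wrapped iff it is below x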
* Add a primop for unsigned quotRem; part of #5598 (Ian Lynagh, 2012-02-17; 1 file, -0/+1)
    Only amd64 has an efficient implementation currently.
* Define a quotRem CallishMachOp; fixes #5598 (Ian Lynagh, 2012-02-14; 1 file, -5/+10)
    This means we no longer do a division twice when we are using quotRem (on platforms on which the op is supported; currently only amd64).
* Improve support for LLVM >= 3.0 write barrier. (#5814) (David Terei, 2012-01-30; 1 file, -2/+5)
* llvmGen: Use new fence instruction (Ben Gamari, 2012-01-30; 1 file, -9/+17)
    Signed-off-by: David Terei <davidterei@gmail.com>
* Fix validation error (David Terei, 2012-01-12; 1 file, -2/+2)
* Add '-freg-liveness' flag to control if STG liveness information is used for optimisation (enabled by default). (David Terei, 2012-01-12; 2 files, -16/+27)