| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
| |
|
|\
| |
| |
| |
| | |
Conflicts:
compiler/types/Coercion.lhs
|
| | |
|
|\ \
| |/ |
|
| |
| |
| |
| | |
Prep for #709
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The main payload of this patch is to extend CPR so that it
detects when a function always returns a result constructed
with the *same* constructor, even if the constructor comes from
a sum type. This doesn't matter very often, but it does improve
some things (results below).
Binary sizes increase a little bit, I think because there are more
wrappers. This with -split-objs. Without split-ojbs binary sizes
increased by 6% even for HelloWorld.hs. It's hard to see exactly why,
but I think it was because System.Posix.Types.o got included in the
linked binary, whereas it didn't before.
Program Size Allocs Runtime Elapsed TotalMem
fluid +1.8% -0.3% 0.01 0.01 +0.0%
tak +2.2% -0.2% 0.02 0.02 +0.0%
ansi +1.7% -0.3% 0.00 0.00 +0.0%
cacheprof +1.6% -0.3% +0.6% +0.5% +1.4%
parstof +1.4% -4.4% 0.00 0.00 +0.0%
reptile +2.0% +0.3% 0.02 0.02 +0.0%
----------------------------------------------------------------------
Min +1.1% -4.4% -4.7% -4.7% -15.0%
Max +2.3% +0.3% +8.3% +9.4% +50.0%
Geometric Mean +1.9% -0.1% +0.6% +0.7% +0.3%
Other things in this commit
~~~~~~~~~~~~~~~~~~~~~~~~~~~
* Got rid of the Lattice class in Demand
* Refactored the way that products and newtypes are
decomposed (no change in functionality)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There's only a single compiler backend now, so the 'z' suffix means
nothing. Also, the flags were confusingly named ('cmm-foo' vs
'foo-cmm',) and counter-intuitively, '-ddump-cmm' did not do at all what
you expected since the new backend went live.
Basically, all of the -ddump-cmmz-* flags are now -ddump-cmm-*. Some were
renamed to be more consistent.
This doesn't update the manual; it already mentions '-ddump-cmm' and
that flag implies all the others anyway, which is probably what you
want.
Signed-off-by: Austin Seipp <mad.one@gmail.com>
|
| |
|
| |
|
| |
|
|
|
|
| |
The workaround described in note [darwin-x86-pic] applies to Darwin/PPC too.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Top-level indirections are often generated when there is a cast, e.g.
foo :: T
foo = bar `cast` (some coercion)
For these we were generating a full-blown CAF, which is a fair chunk
of code.
This patch makes these indirections generate a single IND_STATIC
closure (4 words) instead. This is exactly what the CAF would
evaluate to eventually anyway, we're just shortcutting the whole
process.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts the compiler parts of
commit 7b594a5d7ac29972db39228e9c8b7f384313f39b
Author: David Terei <davidterei@gmail.com>
Date: Mon Nov 21 12:05:18 2011 -0800
Remove registerised code for dead architectures: mips, ia64, alpha,
hppa1, m68k
In particular, we want to know whether bewareLoadStoreAlignment should
return True or False for them.
It also reverts
commit 3fc68b5c356b39b2b52a86d953367d0021c13262
Author: Simon Marlow <marlowsd@gmail.com>
Date: Wed Jan 4 11:44:02 2012 +0000
Remove missing archs (mipseb, mipsel, alpha) (#5734)
It doesn't hurt to map these to ArchUnknown since we don't need to
know anything specific about them, and adding them would be a pain
(there are a bunch of places where we have to case-match on all the
arches to avoid warnings).
|
|
|
|
|
|
|
|
| |
The Slow calling convention passes the closure in R1, but we were
ignoring this and hoping it would work, which it often did. However,
this bug seems to have been the cause of #7192, because the
graph-colouring allocator is more sensitive to having correct liveness
information on jumps.
|
|
|
|
| |
This fixes a CmmLint complaint when doing proc-point splitting.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This removes the OldCmm data type and the CmmCvt pass that converts
new Cmm to OldCmm. The backends (NCGs, LLVM and C) have all been
converted to consume new Cmm.
The main difference between the two data types is that conditional
branches in new Cmm have both true/false successors, whereas in OldCmm
the false case was a fallthrough. To generate slightly better code we
occasionally need to invert a conditional to ensure that the
branch-not-taken becomes a fallthrough; this was previously done in
CmmCvt, and it is now done in CmmContFlowOpt.
We could go further and use the Hoopl Block representation for native
code, which would mean that we could use Hoopl's postorderDfs and
analyses for native code, but for now I've left it as is, using the
old ListGraph representation for native code.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
x86-64.
On x86-64 F and D registers are both drawn from SSE registers, so there is no
reason not to draw them from the same pool of available SSE registers. This
means that whereas previously a function could only receive two Double arguments
in registers even if it did not have any Float arguments, now it can receive up
to 6 arguments that are any mix of Float and Double in registers.
This patch breaks the LLVM back end. The next patch will fix this breakage.
|
|
|
|
|
|
|
| |
All Cmm procedures now include the set of global registers that are live on
procedure entry, i.e., the global registers used to pass arguments to the
procedure. Only global registers that are use to pass arguments are included in
this list.
|
|
|
|
|
|
| |
We would like to calculate register liveness for global registers as well as
local registers, so this patch generalizes the existing infrastructure to set
the stage.
|
|
|
|
| |
Jumps now always have live register information attached, so drop Maybes.
|
|
|
|
|
|
|
| |
Fixes this, when building unregisterised:
rts/dist/build/AutoApply.hc:87:1:
error: ‘stg_ap_v_entry’ undeclared (first use in this function)
|
| |
|
| |
|
| |
|
| |
|
|\ |
|
| | |
|
| |
| |
| |
| | |
(PicBaseReg + lit) + N ==> PicBaseReg + (lit+N)
|
|\ \
| |/ |
|
| | |
|
|/ |
|
|
|
|
|
|
|
| |
We were making an aggressive assumption that foreign calls cannot
clobber heap or stack memory, which for the majority of foreign calls
is true, but we violate the assumption in the implementation of
primops in the RTS. This was causing crashes in some STM tests.
|
|
|
|
|
| |
Except for CgUtils.fixStgRegisters that is used in the NCG and LLVM
backends, and should probably be moved somewhere else.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were being inconsistent about how we tested whether dump flags
were enabled; in particular, sometimes we also checked the verbosity,
and sometimes we didn't.
This lead to oddities such as "ghc -v4" printing an "Asm code" section
which didn't contain any code, and "-v4" enabled some parts of
"-ddump-deriv" but not others.
Now all the tests use dopt, which also takes the verbosity into account
as appropriate.
|
|
|
|
|
| |
Mostly d -> g (matching DynFlag -> GeneralFlag).
Also renamed if* to when*, matching the Haskell if/when names
|
|
|
|
|
| |
This avoids confusion due to [DynFlag] and DynFlags being completely
different types.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The main change here is that the Cmm parser now allows high-level cmm
code with argument-passing and function calls. For example:
foo ( gcptr a, bits32 b )
{
if (b > 0) {
// we can make tail calls passing arguments:
jump stg_ap_0_fast(a);
}
return (x,y);
}
More details on the new cmm syntax are in Note [Syntax of .cmm files]
in CmmParse.y.
The old syntax is still more-or-less supported for those occasional
code fragments that really need to explicitly manipulate the stack.
However there are a couple of differences: it is now obligatory to
give a list of live GlobalRegs on every jump, e.g.
jump %ENTRY_CODE(Sp(0)) [R1];
Again, more details in Note [Syntax of .cmm files].
I have rewritten most of the .cmm files in the RTS into the new
syntax, except for AutoApply.cmm which is generated by the genapply
program: this file could be generated in the new syntax instead and
would probably be better off for it, but I ran out of enthusiasm.
Some other changes in this batch:
- The PrimOp calling convention is gone, primops now use the ordinary
NativeNodeCall convention. This means that primops and "foreign
import prim" code must be written in high-level cmm, but they can
now take more than 10 arguments.
- CmmSink now does constant-folding (should fix #7219)
- .cmm files now go through the cmmPipeline, and as a result we
generate better code in many cases. All the object files generated
for the RTS .cmm files are now smaller. Performance should be
better too, but I haven't measured it yet.
- RET_DYN frames are removed from the RTS, lots of code goes away
- we now have some more canned GC points to cover unboxed-tuples with
2-4 pointers, which will reduce code size a little.
|
| |
|
| |
|
|
|
|
|
| |
This is a hopefully temporary measure until the new SRT design is
implemeented.
|
| |
|
| |
|
| |
|