| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is mostly for congruence with 'subWordC#' and '{add,sub}IntC#'.
I found 'plusWord2#' while implementing this, which both lacks
documentation and has a slightly different specification than
'addWordC#', which means the generic implementation is unnecessarily
complex.
While I was at it, I also added lacking meta-information on PrimOps
and refactored 'subWordC#'s generic implementation to be branchless.
Reviewers: bgamari, simonmar, jrtc27, dfeuer
Reviewed By: bgamari, dfeuer
Subscribers: dfeuer, thomie, carter
Differential Revision: https://phabricator.haskell.org/D4592
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds support for the bit deposit and extraction operations provided
by the BMI and BMI2 instruction set extensions on modern amd64 machines.
Implement x86 code generator for pdep and pext. Properly initialise
bmiVersion field.
pdep and pext test cases
Fix pattern match for pdep and pext instructions
Fix build of pdep and pext code for 32-bit architectures
Test Plan: Validate
Reviewers: austin, simonmar, bgamari, angerman
Reviewed By: bgamari
Subscribers: trommler, carter, angerman, thomie, rwbarton, newhoggy
GHC Trac Issues: #14206
Differential Revision: https://phabricator.haskell.org/D4236
|
|
|
|
|
|
| |
This broke the 32-bit build.
This reverts commit f5dc8ccc29429d0a1d011f62b6b430f6ae50290c.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds support for the bit deposit and extraction operations provided
by the BMI and BMI2 instruction set extensions on modern amd64 machines.
Test Plan: Validate
Reviewers: austin, simonmar, bgamari, hvr, goldfire, erikd
Reviewed By: bgamari
Subscribers: goldfire, erikd, trommler, newhoggy, rwbarton, thomie
GHC Trac Issues: #14206
Differential Revision: https://phabricator.haskell.org/D4063
|
|
|
|
|
|
|
|
|
|
|
|
| |
Depends on D4090
Reviewers: austin, bgamari, erikd, simonmar, alexbiehl
Reviewed By: bgamari
Subscribers: rwbarton, thomie
Differential Revision: https://phabricator.haskell.org/D4091
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Here we add a flag to instruct the native code generator to add
alignment checks in all info table dereferences. This is helpful in
catching pointer tagging issues.
Thanks to @jrtc27 for uncovering the tagging issues on Sparc which
inspired this flag.
Test Plan: Validate
Reviewers: simonmar, austin, erikd
Reviewed By: simonmar
Subscribers: rwbarton, trofi, thomie, jrtc27
Differential Revision: https://phabricator.haskell.org/D4101
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This switches the compiler/ component to get compiled with
-XNoImplicitPrelude and a `import GhcPrelude` is inserted in all
modules.
This is motivated by the upcoming "Prelude" re-export of
`Semigroup((<>))` which would cause lots of name clashes in every
modulewhich imports also `Outputable`
Reviewers: austin, goldfire, bgamari, alanz, simonmar
Reviewed By: bgamari
Subscribers: goldfire, rwbarton, thomie, mpickering, bgamari
Differential Revision: https://phabricator.haskell.org/D3989
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This copies the subset of Hoopl's functionality needed by GHC to
`cmm/Hoopl` and removes the dependency on the Hoopl package.
The main motivation for this change is the confusing/noisy interface
between GHC and Hoopl:
- Hoopl has `Label` which is GHC's `BlockId` but different than
GHC's `CLabel`
- Hoopl has `Unique` which is different than GHC's `Unique`
- Hoopl has `Unique{Map,Set}` which are different than GHC's
`Uniq{FM,Set}`
- GHC has its own specialized copy of `Dataflow`, so `cmm/Hoopl` is
needed just to filter the exposed functions (filter out some of the
Hoopl's and add the GHC ones)
With this change, we'll be able to simplify this significantly.
It'll also be much easier to do invasive changes (Hoopl is a public
package on Hackage with users that depend on the current behavior)
This should introduce no changes in functionality - it merely
copies the relevant code.
Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
Test Plan: ./validate
Reviewers: austin, bgamari, simonmar
Reviewed By: bgamari, simonmar
Subscribers: simonpj, kavon, rwbarton, thomie
Differential Revision: https://phabricator.haskell.org/D3616
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Consider one-line module
module B (v) where v = "hello"
in -fvia-C mode it generates code like
static char gibberish_str[] = "hello";
It resides in data section (precious resource on ia64!).
The patch switches genrator to emit:
static const char gibberish_str[] = "hello";
Other types if symbols that gained 'const' qualifier are:
- info tables (from haskell and CMM)
- static reference tables (from haskell and CMM)
Cleanups along the way:
- fixed info tables defined in .cmm to reside in .rodata
- split out closure declaration into 'IC_' / 'EC_'
- added label declaration (based on label type) right before
each label definition (based on section type) so that C
compiler could check if declaration and definition matches
at definition site.
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
Test Plan: ran testsuite on unregisterised x86_64 compiler
Reviewers: simonmar, ezyang, austin, bgamari, erikd
Reviewed By: bgamari, erikd
Subscribers: rwbarton, thomie
GHC Trac Issues: #8996
Differential Revision: https://phabricator.haskell.org/D3481
|
|
|
|
| |
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
|
|
|
|
|
|
|
|
|
|
|
| |
Noticed breakage as build failure on i386 freebsd build bot:
http://haskell.inf.elte.hu/builders/freebsd-i386-head/1267/10.html
ghc-stage1: panic! (the 'impossible' happened)
(GHC version 8.1.20170310 for i386-portbld-freebsd):
outOfLineCmmOp: MO_F64_Fabs not supported here
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently we have this in libraries/base/GHC/Float.hs:
```
abs x | x == 0 = 0 -- handles (-0.0)
| x > 0 = x
| otherwise = negateFloat x
```
But 3-4 years ago it was noted that this was inefficient:
https://mail.haskell.org/pipermail/libraries/2013-April/019690.html
We can generate better code for X86 and llvm and for others generate
some custom cmm code which is similar to what the compiler generates
now.
Reviewers: austin, simonmar, hvr, bgamari
Reviewed By: bgamari
Subscribers: dfeuer, thomie
Differential Revision: https://phabricator.haskell.org/D3265
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The fundamental problem with `type UniqSet = UniqFM` is that `UniqSet`
has a key invariant `UniqFM` does not. For example, `fmap` over
`UniqSet` will generally produce nonsense.
* Upgrade `UniqSet` from a type synonym to a newtype.
* Remove unused and shady `extendVarSet_C` and `addOneToUniqSet_C`.
* Use cached unique in `tyConsOfType` by replacing
`unitNameEnv (tyConName tc) tc` with `unitUniqSet tc`.
Reviewers: austin, hvr, goldfire, simonmar, niteria, bgamari
Reviewed By: niteria
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D3146
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Typical UNREG build failure looks like that:
ghc-unreg/includes/Stg.h:226:46: error:
note: in definition of macro 'EI_'
#define EI_(X) extern StgWordArray (X) GNU_ATTRIBUTE(aligned (8))
^
|
226 | #define EI_(X) extern StgWordArray (X) GNU_ATTRIBUTE(aligned (8))
| ^
/tmp/ghc10489_0/ghc_3.hc:1754:6: error:
note: previous definition of 'ghczmprim_GHCziTypes_zdtcTyCon2_bytes' was here
char ghczmprim_GHCziTypes_zdtcTyCon2_bytes[] = "TyCon";
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
1754 | char ghczmprim_GHCziTypes_zdtcTyCon2_bytes[] = "TyCon";
| ^
As we see here "_bytes" string literals are defined as 'char []'
array, not 'StgWord []'.
The change special-cases "_bytes" string literals to have
correct declaration type.
Signed-off-by: Sergei Trofimovich <siarheit@google.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is specifically for the C backend on Sparc64 (which has
no native backend) but is also required for Sparc when building
un-registerised.
Bug reported via Debian (patch included):
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=842780
Test Plan: validate
Reviewers: hvr, Phyx, bgamari, austin, simonmar
Reviewed By: Phyx
Subscribers: jrtc27, thomie
Differential Revision: https://phabricator.haskell.org/D2661
GHC Trac Issues: #12793
|
|
|
|
|
|
|
| |
There should be no performance impact of switching to the
deterministic set here.
GHC Trac: #4012
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Looking at more failures on m68k (Trac #11395)
I've noticed the arith001 and arith012 test failures.
(--host=x86_64-linux --target=m68k-linux).
The following example was enough to reproduce a problem:
v :: Float
v = 43
main = print v
m68k binaries printed '0.0' instead of '43.0'.
The bug here is how we encode Floats and Double
as Words with the same binary representation.
Floats:
Before the patch we just coerced Float to Int.
That breaks when we cross-compile from
64-bit LE to 32-bit BE.
The patch fixes conversion by accounting for padding.
when we extend 32-bit value to 64-bit value (LE and BE
do it slightly differently).
Doubles:
Before the patch Doubles were coerced to a pair of Ints
(not correct as x86_64 can hold Double in one Int) and
then trucated this pair of Ints to pair of Word32.
The patch fixes conversion by always decomposing in
Word32 and accounting for host endianness (newly
introduced hostBE) and target endianness (wORDS_BIGENDIAN).
I've tested this patch on Double and Float conversion on
--host=x86_64-linux --target=m68k-linux
crosscompiler. It fixes 10 tests related to printing Floats
and Doubles.
Thanks to Bertram Felgenhauer who poined out this probem.
Signed-off-by: Sergei Trofimovich <siarheit@google.com>
Test Plan: checked some examples manually, fixed 10 tests in test suite
Reviewers: int-e, austin, bgamari
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D1990
GHC Trac Issues: #11395
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before the patch both Cmm and C symbols were declared
with 'EF_' macro:
#define EF_(f) extern StgFunPtr f()
but for Cmm symbols we know exact prototypes.
The patch splits there prototypes in to:
#define EFF_(f) void f() /* See Note [External function prototypes] */
#define EF_(f) StgFunPtr f(void)
Cmm functions are 'EF_' (External Functions),
C functions are 'EFF_' (External Foreign Functions).
While at it changed external C function prototype
to return 'void' to workaround ghc bug on m68k.
Described in detail in Trac #11395.
This makes simple tests work on m68k-linux target!
Thanks to Michael Karcher for awesome analysis
happening in Trac #11395.
Signed-off-by: Sergei Trofimovich <siarheit@google.com>
Test Plan: ran "hello world" on m68k successfully
Reviewers: simonmar, austin, bgamari
Reviewed By: bgamari
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D1975
GHC Trac Issues: #11395
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The bug is observed on m68k-linux target as crash
in RTS:
-- a.hs:
main = print 43
$ inplace/bin/ghc-stage1 --make -debug a.hs
$ ./a
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x80463b0a in LOOKS_LIKE_INFO_PTR_NOT_NULL (p=32858)
at includes/rts/storage/ClosureMacros.h:248
(gdb) bt
#0 0x80463b0a in LOOKS_LIKE_INFO_PTR_NOT_NULL (p=32858)
at includes/rts/storage/ClosureMacros.h:248
#1 0x80463b46 in LOOKS_LIKE_INFO_PTR (p=32858)
at includes/rts/storage/ClosureMacros.h:253
#2 0x80463b6c in LOOKS_LIKE_CLOSURE_PTR (
p=0x805aac6e <stg_dummy_ret_closure>)
at includes/rts/storage/ClosureMacros.h:258
#3 0x80463e4c in initStorage ()
at rts/sm/Storage.c:121
#4 0x8043ffb4 in hs_init_ghc (...)
at rts/RtsStartup.c:181
#5 0x80455982 in hs_main (...)
at rts/RtsMain.c:51
#6 0x80003c1c in main ()
GHC assumes last 2 pointer bits are tags on 32-bit targets.
But here 'stg_dummy_ret_closure' address violates the assumption:
LOOKS_LIKE_CLOSURE_PTR (p=0x805aac6e <stg_dummy_ret_closure>)
I've added compiler hint for static StgClosure objects to
align closures at least by their natural alignment (what GHC assumes).
See Note [StgWord alignment].
Signed-off-by: Sergei Trofimovich <siarheit@google.com>
Test Plan: ran basic test on m68k qemu, it got past ASSERTs
Reviewers: simonmar, austin, bgamari
Reviewed By: bgamari
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D1974
GHC Trac Issues: #11395
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
In the past the canonical way for constructing an SDoc string literal was the
composition `ptext . sLit`. But for some time now we have function `text` that
does the same. Plus it has some rules that optimize its runtime behaviour.
This patch takes all uses of `ptext . sLit` in the compiler and replaces them
with calls to `text`. The main benefits of this patch are clener (shorter) code
and less dependencies between module, because many modules now do not need to
import `FastString`. I don't expect any performance benefits - we mostly use
SDocs to report errors and it seems there is little to be gained here.
Test Plan: ./validate
Reviewers: bgamari, austin, goldfire, hvr, alanz
Subscribers: goldfire, thomie, mpickering
Differential Revision: https://phabricator.haskell.org/D1784
|
|
|
|
|
|
| |
Starting with GHC 7.10 and base-4.8, `Monad` implies `Applicative`,
which allows to simplify some definitions to exploit the superclass
relationship. This a first refactoring to that end.
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since GHC 8.1/8.2 only needs to be bootstrap-able by GHC 7.10 and
GHC 8.0 (and GHC 8.2), we can now finally drop all that pre-AMP
compatibility CPP-mess for good!
Reviewers: austin, goldfire, bgamari
Subscribers: goldfire, thomie, erikd
Differential Revision: https://phabricator.haskell.org/D1724
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds a subWordC# primop which implements subtraction with overflow
reporting.
Reviewers: tibbe, goldfire, rwbarton, bgamari, austin, hvr
Reviewed By: bgamari
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D1334
GHC Trac Issues: #10962
|
|
|
|
|
|
|
|
| |
The patch makes
$ make test TEST="debug T10667"
not to fail on CmmStack code generation phase
Signed-off-by: Sergei Trofimovich <siarheit@google.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch refactors pure/(*>) and return/(>>) in MRP-friendly way, i.e.
such that the explicit definitions for `return` and `(>>)` match the
MRP-style default-implementation, i.e.
return = pure
and
(>>) = (*>)
This way, e.g. all `return = pure` definitions can easily be grepped and
removed in GHC 8.1;
Test Plan: Harbormaster
Reviewers: goldfire, alanz, bgamari, quchen, austin
Reviewed By: quchen, austin
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D1312
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This allows the code generator to give hints to later code generation
steps about which branch is most likely to be taken. Right now it
is only taken into account in one place: a special case in
CmmContFlowOpt that swapped branches over to maximise the chance of
fallthrough, which is now disabled when there is a likelihood setting.
Test Plan: validate
Reviewers: austin, simonpj, bgamari, ezyang, tibbe
Subscribers: thomie
Differential Revision: https://phabricator.haskell.org/D1273
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
Alignment needs to be a compile-time constant. Previously the code
generators had to jump through hoops to ensure this was the case as the
alignment was passed as a CmmExpr in the arguments list. Now we take
care of this up front.
This fixes #8131.
Authored-by: Reid Barton <rwbarton@gmail.com>
Dusted-off-by: Ben Gamari <ben@smart-cactus.org>
Tests for T8131
Test Plan: Validate
Reviewers: rwbarton, austin
Reviewed By: rwbarton, austin
Subscribers: bgamari, carter, thomie
Differential Revision: https://phabricator.haskell.org/D624
GHC Trac Issues: #8131
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Summary:
Alignment needs to be a compile-time constant. Previously the code
generators had to jump through hoops to ensure this was the case as the
alignment was passed as a CmmExpr in the arguments list. Now we take
care of this up front.
This fixes #8131.
Authored-by: Reid Barton <rwbarton@gmail.com>
Dusted-off-by: Ben Gamari <ben@smart-cactus.org>
Tests for T8131
Test Plan: Validate
Reviewers: rwbarton, austin
Reviewed By: rwbarton, austin
Subscribers: bgamari, carter, thomie
Differential Revision: https://phabricator.haskell.org/D624
GHC Trac Issues: #8131
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
jakzale on #ghc reported a build failure
when ported GHC on a new target.
The code 'pprHexVal (2^32) W32' emits '0xU'
which is invalid C.
I've introduced bug in
43f1b2ecd1960fa7377cf55a2b97c66059a701ef
when added literal truncation. That truncation
is a new source of zeros.
Signed-off-by: Sergei Trofimovich <siarheit@google.com>
Test Plan: added test and tested on UNREG ghc
Reviewers: austin
Reviewed By: austin
Subscribers: thomie, bgamari
Differential Revision: https://phabricator.haskell.org/D987
GHC Trac Issues: #10518
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This re-implements the code generation for case expressions at the Stg →
Cmm level, both for data type cases as well as for integral literal
cases. (Cases on float are still treated as before).
The goal is to allow for fancier strategies in implementing them, for a
cleaner separation of the strategy from the gritty details of Cmm, and
to run this later than the Common Block Optimization, allowing for one
way to attack #10124. The new module CmmSwitch contains a number of
notes explaining this changes. For example, it creates larger
consecutive jump tables than the previous code, if possible.
nofib shows little significant overall improvement of runtime. The
rather large wobbling comes from changes in the code block order
(see #8082, not much we can do about it). But the decrease in code size
alone makes this worthwhile.
```
Program Size Allocs Runtime Elapsed TotalMem
Min -1.8% 0.0% -6.1% -6.1% -2.9%
Max -0.7% +0.0% +5.6% +5.7% +7.8%
Geometric Mean -1.4% -0.0% -0.3% -0.3% +0.0%
```
Compilation time increases slightly:
```
-1 s.d. ----- -2.0%
+1 s.d. ----- +2.5%
Average ----- +0.3%
```
The test case T783 regresses a lot, but it is the only one exhibiting
any regression. The cause is the changed order of branches in an
if-then-else tree, which makes the hoople data flow analysis traverse
the blocks in a suboptimal order. Reverting that gets rid of this
regression, but has a consistent, if only very small (+0.2%), negative
effect on runtime. So I conclude that this test is an extreme outlier
and no reason to change the code.
Differential Revision: https://phabricator.haskell.org/D720
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch solves the scoping problem of CmmTick nodes: If we just put
CmmTicks into blocks we have no idea what exactly they are meant to
cover. Here we introduce tick scopes, which allow us to create
sub-scopes and merged scopes easily.
Notes:
* Given that the code often passes Cmm around "head-less", we have to
make sure that its intended scope does not get lost. To keep the amount
of passing-around to a minimum we define a CmmAGraphScoped type synonym
here that just bundles the scope with a portion of Cmm to be assembled
later.
* We introduce new scopes at somewhat random places, aligning with
getCode calls. This works surprisingly well, but we might have to
add new scopes into the mix later on if we find things too be too
coarse-grained.
(From Phabricator D169)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds CmmTick nodes to Cmm code. This is relatively
straight-forward, but also not very useful, as many blocks will simply
end up with no annotations whatosever.
Notes:
* We use this design over, say, putting ticks into the entry node of all
blocks, as it seems to work better alongside existing optimisations.
Now granted, the reason for this is that currently GHC's main Cmm
optimisations seem to mainly reorganize and merge code, so this might
change in the future.
* We have the Cmm parser generate a few source notes as well. This is
relatively easy to do - worst part is that it complicates the CmmParse
implementation a bit.
(From Phabricator D169)
|
|
|
|
| |
Signed-off-by: Austin Seipp <austin@well-typed.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
This includes pretty much all the changes needed to make `Applicative`
a superclass of `Monad` finally. There's mostly reshuffling in the
interests of avoid orphans and boot files, but luckily we can resolve
all of them, pretty much. The only catch was that
Alternative/MonadPlus also had to go into Prelude to avoid this.
As a result, we must update the hsc2hs and haddock submodules.
Signed-off-by: Austin Seipp <austin@well-typed.com>
Test Plan: Build things, they might not explode horribly.
Reviewers: hvr, simonmar
Subscribers: simonmar
Differential Revision: https://phabricator.haskell.org/D13
|
|
|
|
|
|
| |
...some files more or less recently touched by me
[ci skip]
|
|
|
|
|
|
|
| |
No need to emit (now empty) those special markers.
Markers were needed only in registerised -fvia-C mode.
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
On amd64/UNREG build there is many failing tests trying
to deal with 'Integer' types.
Looking at 'integerConversions' test I've observed
invalid C code generated by GHC.
Cmm code
CInt a = -1; (a == -1)
yields 'False' with optimisations enabled via the following C code:
StgWord64 a = (StgWord32)0xFFFFffffFFFFffffu; (a == 0xFFFFffffFFFFffffu)
The patch fixes it by shrinking emitted literals to required sizes:
StgWord64 a = (StgWord32)0xFFFFffffu; (a == 0xFFFFffffu)
Thanks to Reid Barton for tracking down and fixing the issue.
Signed-off-by: Sergei Trofimovich <slyfox@gentoo.org>
Test Plan: validate on UNREG build (amd64, x86)
Reviewers: simonmar, rwbarton, austin
Subscribers: hvr, simonmar, ezyang, carter
Differential Revision: https://phabricator.haskell.org/D173
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
These MachOps are used by addIntC# and subIntC#, which in turn are
used in integer-gmp when adding or subtracting small Integers. The
following benchmark shows a ~6% speedup after this commit on x86_64
(building GHC with BuildFlavour=perf).
{-# LANGUAGE MagicHash #-}
import GHC.Exts
import Criterion.Main
count :: Int -> Integer
count (I# n#) = go n# 0
where go :: Int# -> Integer -> Integer
go 0# acc = acc
go n# acc = go (n# -# 1#) $! acc + 1
main = defaultMain [bgroup "count"
[bench "100" $ whnf count 100]]
Differential Revision: https://phabricator.haskell.org/D140
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This implements the new primops
clz#, clz32#, clz64#,
ctz#, ctz32#, ctz64#
which provide efficient implementations of the popular
count-leading-zero and count-trailing-zero respectively
(see testcase for a pure Haskell reference implementation).
On x86, NCG as well as LLVM generates code based on the BSF/BSR
instructions (which need extra logic to make the 0-case well-defined).
Test Plan: validate and succesful tests on i686 and amd64
Reviewers: rwbarton, simonmar, ezyang, austin
Subscribers: simonmar, relrod, ezyang, carter
Differential Revision: https://phabricator.haskell.org/D144
GHC Trac Issues: #9340
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the second attempt to add this functionality. The first
attempt was reverted in 950fcae46a82569e7cd1fba1637a23b419e00ecd, due
to register allocator failure on x86. Given how the register
allocator currently works, we don't have enough registers on x86 to
support cmpxchg using complicated addressing modes. Instead we fall
back to a simpler addressing mode on x86.
Adds the following primops:
* atomicReadIntArray#
* atomicWriteIntArray#
* fetchSubIntArray#
* fetchOrIntArray#
* fetchXorIntArray#
* fetchAndIntArray#
Makes these pre-existing out-of-line primops inline:
* fetchAddIntArray#
* casIntArray#
|
|
|
|
|
|
|
|
| |
This commit caused the register allocator to fail on i386.
This reverts commit d8abf85f8ca176854e9d5d0b12371c4bc402aac3 and
04dd7cb3423f1940242fdfe2ea2e3b8abd68a177 (the second being a fix to
the first).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Summary:
Add more primops for atomic ops on byte arrays
Adds the following primops:
* atomicReadIntArray#
* atomicWriteIntArray#
* fetchSubIntArray#
* fetchOrIntArray#
* fetchXorIntArray#
* fetchAndIntArray#
Makes these pre-existing out-of-line primops inline:
* fetchAddIntArray#
* casIntArray#
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In some cases, the layout of the LANGUAGE/OPTIONS_GHC lines has been
reorganized, while following the convention, to
- place `{-# LANGUAGE #-}` pragmas at the top of the source file, before
any `{-# OPTIONS_GHC #-}`-lines.
- Moreover, if the list of language extensions fit into a single
`{-# LANGUAGE ... -#}`-line (shorter than 80 characters), keep it on one
line. Otherwise split into `{-# LANGUAGE ... -#}`-lines for each
individual language extension. In both cases, try to keep the
enumeration alphabetically ordered.
(The latter layout is preferable as it's more diff-friendly)
While at it, this also replaces obsolete `{-# OPTIONS ... #-}` pragma
occurences by `{-# OPTIONS_GHC ... #-}` pragmas.
|
|
|
|
| |
Signed-off-by: Erik de Castro Lopo <erikd@mega-nerd.com>
|
|
|
|
|
|
| |
This way CPP conditionals can be avoided for the transition period.
Signed-off-by: Herbert Valerio Riedel <hvr@gnu.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds support for several new primitive operations which
support using processor-specific instructions to help guide data and
cache locality decisions. We have levels ranging from [0..3]
For LLVM, we generate llvm.prefetch intrinsics at the proper locality
level (similar to GCC.)
For x86 we generate prefetch{NTA, t2, t1, t0} instructions. On SPARC and
PowerPC, the locality levels are ignored.
This closes #8256.
Authored-by: Carter Tazio Schonwald <carter.schonwald@gmail.com>
Signed-off-by: Austin Seipp <austin@well-typed.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
width and element type.
SIMD primops are now polymorphic in vector size and element type, but
only internally to the compiler. More specifically, utils/genprimopcode
has been extended so that it "knows" about SIMD vectors. This allows us
to, for example, write a single definition for the "add two vectors"
primop in primops.txt.pp and have it instantiated at many vector types.
This generates a primop in GHC.Prim for each vector type at which "add
two vectors" is instantiated, but only one data constructor for the
PrimOp data type, so the code generator is much, much simpler.
|
|
|
|
|
| |
Authored-by: David Luposchainsky <dluposchainsky@gmail.com>
Signed-off-by: Austin Seipp <austin@well-typed.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch encompasses most of the basic infrastructure for GHCJS. It
includes:
* A new extension, -XJavaScriptFFI
* A new architecture, ArchJavaScript
* Parser and lexer support for 'foreign import javascript', only
available under -XJavaScriptFFI, using ArchJavaScript.
* As a knock-on, there is also a new 'WayCustom' constructor in
DynFlags, so clients of the GHC API can add custom 'tags' to their
built files. This should be useful for other users as well.
The remaining changes are really just the resulting fallout, making sure
all the cases are handled appropriately for DynFlags and Platform.
Authored-by: Luite Stegeman <stegeman@gmail.com>
Signed-off-by: Austin Seipp <aseipp@pobox.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Exposes bSwap{,16,32,64}# primops
* Add a new machop: MO_BSwap
* Use a Stg implementation (hs_bswap{16,32,64}) for other implementation
in NCG.
* Generate bswap in X86 NCG for 32 and 64 bits, and for 16 bits, bswap+shr
instead of using xchg.
* Generate llvm.bswap intrinsics in llvm codegen.
Authored-by: Vincent Hanquez <tab@snarc.org>
Signed-off-by: Austin Seipp <aseipp@pobox.com>
|