| Commit message | Author | Age | Files | Lines |
|
- No need to distinguish between gcc-llvm and clang. First of all,
gcc-llvm is quite old and surely unmaintained by now. Second of all,
none of the code actually cares about that distinction!
Now, it does make sense to consider multiple C frontends for LLVM in
the form of clang vs clang-cl (same clang, yes, but tweaked
interface). But this is better handled in terms of "gccish vs
msvcish" and "is LLVM", yielding 4 combinations. Therefore, I don't
think it is useful saving the existing code for that.
- Get the remaining CC_LLVM_BACKEND, and also TABLES_NEXT_TO_CODE in
mk/config.h the normal way, rather than hacking it post-hoc. No point
keeping these special cases around for no reason.
- Get rid of hand-rolled `die` function and just use `AC_MSG_ERROR`.
- Abstract check + flag override for unregisterised and tables next to
code.
Oh, and as part of the above I also renamed/combined some variables
where it felt appropriate.
- GccIsClang -> CcLlvmBackend. This is for `AC_SUBST`, like the other
camel case ones. It was never about gcc-llvm, or Apple's renamed clang,
to be clear.
- llvm_CC_FLAVOR -> CC_LLVM_BACKEND. This is for `AC_DEFINE`, like the
other all-caps snake case ones. llvm_CC_FLAVOR was just silly
indirection *and* an odd name to boot.
|
This allows the stage1 compiler (which needs to run on the build
platform and produce code for the host) to depend upon properties of the
target. This is wrong. However, it's no more wrong than it was
previously and @Erichson2314 is working on fixing this so I'm going to
remove the guard so we can finally bootstrap HEAD with ghc-8.8 (see
issue #17146).
|
To avoid polluting the macro namespace
|
They are only used in a file we construct directly, so just skip CPP.
|
The generated headers are now generated per stage, which means we can
skip hacks like `ghc_boot_platform.h` and just have that be the stage 0
header as proper. In general, stages are to be embraced: freely generate
everything in each stage but then just build what you depend on, and
everything is symmetrical and efficient. Trying to avoid stages because
bootstrapping is a mind bender just creates tons of bespoke
mini-mind-benders that add up to something far crazier.
Hadrian was pretty close to this "stage-major" approach already, and so
was fairly easy to fix. Make needed more work, however: it did know
about stages so at least there was a scaffold, but few packages except
for the compiler cared, and the compiler used its own counting system.
That said, make and Hadrian now work more similarly, which is good for
the transition to Hadrian. The merits of embracing stages aside, the
change may be worthwhile for easing that transition alone.
|
Zeroes heap memory after the GC has freed it.
|
It doesn't need it, and it shouldn't need it or else multi-target will
break.
|
This commit starts renaming some flip-bit-related functions for the
generalised heap traversal code and adds provisions for sharing the
per-closure profiling header field currently used exclusively for retainer
profiling with other heap traversal profiling modes.
|
The `defined(DEBUG_RETAINER) == true` branch doesn't even compile anymore
because 1) retainerSet was renamed to RetainerSet and 2) even if I fix that
the context in Rts.h seems to have changed such that it's not in scope. If
3) I fix that 'flip' is still not in scope :) At that point I just gave up.
|
This updates the documentation of the MIN_PAYLOAD_SIZE constant and adds
a new Note [Mark bits in mark-compact collector] explaining why the
mark-compact collector uses two bits per object and why we need
MIN_PAYLOAD_SIZE.
|
Until 0472f0f6a92395d478e9644c0dbd12948518099f there was a meaningful
host vs target distinction (though it wasn't used right, in genapply).
After that, they did not differ in meaningful ways, so it's best to
keep just one.
|
This patch adds a new eventlog event which indicates the start of
a biographical profiler sample. These are different to normal events as
they also include the timestamp of when the census took place. This is
because the LDV profiler only emits samples at the end of the run.
Now all the different profiling modes emit consumable events to the
eventlog.
|
Add StgToCmm module hierarchy. Platform modules that are used in several
other places (NCG, LLVM codegen, Cmm transformations) are put into
GHC.Platform.
|
Some were using `True` / `False`, a legacy of when they were in
`Config.hs`. See #16914 / d238d3062a9858 for a similar problem.
Also clean up the configure variable names for consistency and clarity
while we're at it. "Target" makes clear we are talking about outputted
code, not where GHC itself runs.
|
`TablesNextToCode` is now substituted by configure, where it has the
correct defaults and error handling. Nowhere else needs to duplicate
that, though we may want the compiler to guard against bogus settings
files.
I renamed it from `GhcEnableTablesNextToCode` to `TablesNextToCode` to:
- Help me guard against any unfixed usages
- Remove any lingering connotation that this flag needs to be combined
with `GhcUnregisterised`.
Original reviewers:
Original subscribers: TerrorJack, rwbarton, carter
Original Differential Revision: https://phabricator.haskell.org/D5082
|
Effects as I measured them:
RTS size: +0.1%
Compile times: -0.5%
Runtime nofib: -1.1%
The nofib runtime result seems to come mostly from the `CS` benchmark,
which is very sensitive to alignment changes, so this is likely
overrepresented.
However, the compile time changes are realistic.
This is related to #16961.
|
Now that the target macros are not being used, we remove them. This
prevents target hardcoding regressions.
|
Unfortunately this will require more work; register allocation is
quite broken.
This reverts commit acd795583625401c5554f8e04ec7efca18814011.
|
Instead, following @angerman's suggestion, put them in the config file.
Maybe we could re-key llvm-targets someday, but this is good for now.
|
These prevent multi-target builds. They were gotten rid of in several ways:
1. In the compiler itself, replacing `#if` with runtime `if`. In these
cases, we care about the target platform still, but the target platform
is dynamic so we must delay the elimination to run time.
2. In the compiler itself, replacing `TARGET` with `HOST`. There was
just one bit of this, in some code splitting strings representing lists
of paths. These paths are used by GHC itself, and not by the compiled
binary. (They are compiler lookup paths, rather than RPATHS or something
that does matter to the compiled binary, and thus would legitimately
be target-sensitive.) As such, the path-splitting method only depends on
where GHC runs and not where code it produces runs. This should have
been `HOST` all along.
3. Changing the RTS. The RTS doesn't care about the target platform,
full stop.
4. `includes/stg/HaskellMachRegs.h`: This file is also included in the
genapply executable. This is tricky because the RTS's host platform
really is that utility's target platform, so that utility really, really
isn't multi-target either. But at least it isn't an installed part of
GHC, just a one-off tool used when building the RTS. Lying with the
`HOST` to a one-off program (genapply) that isn't installed doesn't seem so bad.
It's certainly better than the other way around of lying to the RTS
though not to genapply. The RTS is more important, and it is installed,
*and* this header is installed as part of the RTS.
|
This adds support for constructing vector types from Float#, Double#, etc.
and performing arithmetic operations on them.
Cleaned-Up-By: Ben Gamari <ben@well-typed.com>
|
Here the following changes are introduced:
- A read barrier machine op is added to Cmm.
- The order in which a closure's fields are read and written is changed.
- Memory barriers are added to RTS code to ensure correctness on
out-of-order machines with weak memory ordering.
Cmm has a new CallishMachOp called MO_ReadBarrier. On weak memory machines, this
is lowered to an instruction that ensures memory reads that occur after said
instruction in program order are not performed before reads coming before said
instruction in program order. On machines with strong memory ordering properties
(e.g. X86, SPARC in TSO mode) no such instruction is necessary, so
MO_ReadBarrier is simply erased. However, such an instruction is necessary on
weakly ordered machines, e.g. ARM and PowerPC.
Weak memory ordering has consequences for how closures are observed and mutated.
For example, consider a closure that needs to be updated to an indirection. In
order for the indirection to be safe for concurrent observers to enter, said
observers must read the indirection's info table before they read the
indirectee. Furthermore, the entering observer makes assumptions about the
closure based on its info table contents, e.g. an INFO_TYPE of IND implies the
closure has an indirectee pointer that is safe to follow.
When a closure is updated with an indirection, both its info table and its
indirectee must be written. With weak memory ordering, these two writes can be
arbitrarily reordered, and perhaps even interleaved with other threads' reads
and writes (in the absence of memory barrier instructions). Consider this
example of a bad reordering:
- An updater writes to a closure's info table (INFO_TYPE is now IND).
- A concurrent observer branches upon reading the closure's INFO_TYPE as IND.
- A concurrent observer reads the closure's indirectee and enters it. (!!!)
- An updater writes the closure's indirectee.
Here the update to the indirectee comes too late and the concurrent observer has
jumped off into the abyss. Speculative execution can also cause us issues;
consider:
- An observer is about to case on a value in a closure's info table.
- The observer speculatively reads one or more of the closure's fields.
- An updater writes to the closure's info table.
- The observer takes a branch based on the new info table value, but with the
old closure fields!
- The updater writes to the closure's other fields, but it's too late.
Because of these effects, reads and writes to a closure's info table must be
ordered carefully with respect to reads and writes to the closure's other
fields, and memory barriers must be placed to ensure that reads and writes occur
in program order. Specifically, updates to a closure must follow the following
pattern:
- Update the closure's (non-info table) fields.
- Write barrier.
- Update the closure's info table.
Observing a closure's fields must follow the following pattern:
- Read the closure's info pointer.
- Read barrier.
- Read the closure's (non-info table) fields.
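The two patterns above can be sketched in portable C11. This is only an illustration, not the RTS code: `Closure`, `INFO_THUNK`, `INFO_IND`, `init_closure`, `update_with_indirection` and `follow` are hypothetical names, an `int` stands in for the info pointer, and C11 release/acquire fences stand in for the write and read barriers (they are somewhat stronger than pure store-store and load-load barriers).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for RTS types; all names here are illustrative. */
enum { INFO_THUNK = 1, INFO_IND = 2 };

typedef struct Closure {
    _Atomic int     info;        /* plays the role of the info pointer */
    struct Closure *indirectee;  /* a non-info-table field */
} Closure;

void init_closure(Closure *c)
{
    assert(c != NULL);
    atomic_init(&c->info, INFO_THUNK);
    c->indirectee = NULL;
}

/* Updater: fields first, then a write barrier, then the info word. */
void update_with_indirection(Closure *c, Closure *target)
{
    assert(c != NULL && target != NULL);
    c->indirectee = target;
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&c->info, INFO_IND, memory_order_relaxed);
}

/* Observer: info word first, then a read barrier, then the fields. */
Closure *follow(Closure *c)
{
    assert(c != NULL);
    int info = atomic_load_explicit(&c->info, memory_order_relaxed);
    atomic_thread_fence(memory_order_acquire);
    /* The indirectee is only read once the info word says it is valid. */
    return info == INFO_IND ? c->indirectee : c;
}
```

An observer that sees `INFO_IND` is guaranteed to see the indirectee written before the release fence; an observer that still sees `INFO_THUNK` never touches the field.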
This patch updates RTS code to obey this pattern. This should fix long-standing
SMP bugs on ARM (specifically newer aarch64 microarchitectures supporting
out-of-order execution) and PowerPC. This fixes issue #15449.
Co-Authored-By: Ben Gamari <ben@well-typed.com>
|
This implements the correct fix for #11627 by skipping over the slop
(which is zeroed) rather than adding special case logic for LARGE
ARR_WORDS which runs the risk of not performing a correct census by
ignoring any subsequent blocks.
This approach implements similar logic to that in Sanity.c.
|
Previously we would pass flags intended for the C compiler to the C++
compiler (see #16738). This would cause, for instance, `-std=gnu99` to
be passed to the C++ compiler, causing spurious test failures. Fix this
by maintaining a separate set of flags for C++ compilation invocations.
|
The linter now enforces our preference for `#if defined()` and
`#if !defined()`.
|
As discussed in #16744, both the Make and Hadrian build systems
have special code to always pass -eventlog whenever -prof or -debug
are passed. However, there is some similar logic in the RTS itself only
for defining TRACING when the DEBUG macro is defined, but no such logic
is implemented to define TRACING when the PROFILING macro is defined.
This patch adds such logic and therefore fixes #16744.
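A minimal sketch of the kind of preprocessor logic described, written in the `#if defined()` style. It is not the actual RTS header: the `#define PROFILING` line stands in for passing `-DPROFILING` on the command line, and `tracing_enabled` is a hypothetical helper that exists only to make the result observable.

```c
#include <assert.h>  /* only needed for the check in the usage note */

/* Assumption: stands in for -DPROFILING on the compiler command line. */
#define PROFILING 1

/* Enable TRACING whenever either the debug or the profiling way is built. */
#if defined(DEBUG) || defined(PROFILING)
#if !defined(TRACING)
#define TRACING 1
#endif
#endif

/* Hypothetical helper: reports whether TRACING ended up defined. */
int tracing_enabled(void)
{
#if defined(TRACING)
    return 1;
#else
    return 0;
#endif
}
```

With `PROFILING` defined as above, `assert(tracing_enabled() == 1);` holds; removing that define (and not defining `DEBUG`) makes the helper return 0.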
|
This allows a user to observe how long a sampling period lasts so that
the time taken can be removed from the profiling output.
Fixes #16697
|
After the previous commit, `Settings` is just a thin wrapper around
other groups of settings. While `Settings` is used by GHC-the-executable
to initialize `DynFlags`, in principle another consumer of
GHC-the-library could initialize `DynFlags` a different way. It
therefore doesn't make sense for `DynFlags` itself (library code) to
separate the settings that typically come from `Settings` from the
settings that typically don't.
|
This commit splits out a subset of GhcException which do not depend on
pretty printing (SDoc), as a new datatype called
PlainGhcException. These exceptions can be caught as GhcException,
because 'fromException' will convert them.
The motivation for this change is that the Panic module
transitively depends on many modules, primarily due to pretty printing
code. It's on the order of about 130 modules. This large set of
dependencies has a few implications:
1. To avoid cycles / use of boot files, these dependencies cannot
throw GhcException.
2. There are some utility modules that use UnboxedTuples and also use
`panic`. This means that when loading GHC into GHCi, about 130
additional modules would need to be compiled instead of
interpreted. Splitting the non-pprint exception throwing into a new
module resolves this issue. See #13101
|
This was a bit unclear as we use both one-based and zero-based
tags in GHC.
[skip ci]
|
1. If GHC is to be multi-target, these cannot be baked in at compile
time.
2. Compile-time flags have a higher maintenance cost than run-time flags.
3. The old way mixes build system implementation (various bootstrapping
details) with the thing being built. E.g. GHC doesn't need to care
about which integer library *will* be used---this is purely a crutch
so the build system doesn't need to pass flags later when using that
library.
4. Experience with cross compilation in Nixpkgs has shown things work
nicer when compilers can *optionally* delegate the bootstrapping to the
package manager. The package manager knows the entire end-goal build
plan, and thus can make top-down decisions on bootstrapping. GHC can
just worry about GHC, not even core libraries like base and ghc-prim!
|
The primop stgFloatToWord32 was sign-extending the 32-bit word, resulting
in weird negative Word32s. Zero-extend them instead.
Closes #16617.
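To illustrate the difference, here is a small sketch of how widening a float's bit pattern through a signed type sign-extends it, while widening through an unsigned type zero-extends it. This is not the RTS implementation of `stgFloatToWord32`; `float_bits`, `widen_sign_extended` and `widen_zero_extended` are hypothetical names, and the sketch assumes a 32-bit `float` and a two's-complement target.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a float as a 32-bit pattern. */
static uint32_t float_bits(float f)
{
    uint32_t bits;
    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* Problematic widening: going through a signed 32-bit type copies
 * bit 31 (the float's sign bit) into the upper 32 bits. */
uint64_t widen_sign_extended(float f)
{
    return (uint64_t)(int64_t)(int32_t)float_bits(f);
}

/* Intended widening: the upper 32 bits are always zero. */
uint64_t widen_zero_extended(float f)
{
    return (uint64_t)float_bits(f);
}
```

For `-1.0f`, whose bit pattern is `0xBF800000`, the first function yields `0xFFFFFFFFBF800000` and the second yields `0x00000000BF800000`; for non-negative floats the two agree.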
|
Get "Tables next to code" from the settings file instead.
|
The bulk of the work was done in #712, making settings be make/Hadrian
controlled. This commit then just moves the unlit command rules in
make/Hadrian from the `Config.hs` generator to the `settings` generator
in each build system.
I think this is a good change because the crucial benefit is *settings*
don't affect the build: ghc gets one baby step closer to being a regular
cabal executable, and make/Hadrian just maintains settings as part of
bootstrapping.
|
This allows it to eventually become stage-specific.
|
- Remove redundant casting in evacuate_static_object
- Remove redundant parens in STATIC_LINK
- Fix a typo in GC.c
|
* simplifies registers to have GPR, Float and Double, by removing the SSE2 and X87 constructors
* makes -msse2 assumed/default for x86 platforms, fixing a long-standing nondeterminism in rounding
behavior in 32-bit Haskell code
* removes the 80-bit floating point representation from the supported float sizes
* there's still 1 tiny bit of x87 support needed,
for handling float and double return values in FFI calls wrt the C ABI on x86_32,
but this one piece does not leak into the rest of NCG.
* Lots of code that's not been touched in a long time got deleted as a
consequence of all of this
All in all, this change paves the way towards a lot of further
improvements in how GHC handles floating point computations, along with
making the native code gen more accessible to a larger pool of contributors.
|
This:
- Hoists part of the condition outside of the initialization loop in
`stg_newSmallArrayzh`.
- Annotates one of the unlikely branches as unlikely, also in
`stg_newSmallArrayzh`.
- Adds a couple of annotations to `allocateMightFail` indicating which
branches are likely to be taken.
Together this gives about 5% improvement.
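As a rough illustration of the two techniques named above (branch-likelihood annotations and hoisting a loop-invariant condition), here is a sketch in plain C. It is not the code of `stg_newSmallArrayzh` or `allocateMightFail`: `LIKELY`, `UNLIKELY`, `alloc_words` and `init_words` are hypothetical, and the macros assume a GCC-compatible compiler for `__builtin_expect`, falling back to a no-op otherwise.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#if defined(__GNUC__)
#define LIKELY(x)   __builtin_expect(!!(x), 1)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define LIKELY(x)   (x)
#define UNLIKELY(x) (x)
#endif

/* Hypothetical allocator: returns NULL on a bad request or on failure,
 * both of which are expected to be rare. */
uintptr_t *alloc_words(size_t n)
{
    if (UNLIKELY(n == 0 || n > SIZE_MAX / sizeof(uintptr_t))) {
        return NULL;
    }
    return malloc(n * sizeof(uintptr_t));
}

/* Fill the first n words of arr. Whether to clear or to use `init`
 * does not change between iterations, so it is decided once, outside
 * the loop, rather than tested on every element. */
void init_words(uintptr_t *arr, size_t n, int clear, uintptr_t init)
{
    assert(arr != NULL || n == 0);
    const uintptr_t value = clear ? 0 : init;  /* hoisted decision */
    for (size_t i = 0; i < n; i++) {
        arr[i] = value;
    }
}
```

The annotations only guide code layout and never change behaviour, so any speed-up has to be measured on the target.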
Signed-off-by: Michal Terepeta <michal.terepeta@gmail.com>
|
This commit includes the necessary changes in code and
documentation to support a primop that reverses a word's
bits. It also includes a test.
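For readers unfamiliar with the operation, bit reversal can be sketched in C as follows. This is an illustration of the operation only, not how the primop is implemented in the code generator; `reverse_bits` and `reverse_bits32` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Straightforward version: reverse the low `width` bits of x. */
uint64_t reverse_bits(uint64_t x, unsigned width)
{
    assert(width >= 1 && width <= 64);
    uint64_t out = 0;
    for (unsigned i = 0; i < width; i++) {
        out = (out << 1) | (x & 1);  /* move the lowest bit of x to out */
        x >>= 1;
    }
    return out;
}

/* Branch-free 32-bit version: swap adjacent bits, then pairs,
 * then nibbles, then bytes, then half-words. */
uint32_t reverse_bits32(uint32_t x)
{
    x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1);
    x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2);
    x = ((x >> 4) & 0x0F0F0F0Fu) | ((x & 0x0F0F0F0Fu) << 4);
    x = ((x >> 8) & 0x00FF00FFu) | ((x & 0x00FF00FFu) << 8);
    x = (x >> 16) | (x << 16);
    return x;
}
```

For example, `reverse_bits32(0x00000001u)` is `0x80000000u`, and applying either function twice with the same width returns the original value.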
|
This moves all URL references to Trac Wiki to their corresponding
GitLab counterparts.
This substitution is classified as follows:
1. Automated substitution using sed with Ben's mapping rule [1]
Old: ghc.haskell.org/trac/ghc/wiki/XxxYyy...
New: gitlab.haskell.org/ghc/ghc/wikis/xxx-yyy...
2. Manual substitution for URLs containing `#` index
Old: ghc.haskell.org/trac/ghc/wiki/XxxYyy...#Zzz
New: gitlab.haskell.org/ghc/ghc/wikis/xxx-yyy...#zzz
3. Manual substitution for strings starting with `Commentary`
Old: Commentary/XxxYyy...
New: commentary/xxx-yyy...
See also !539
[1]: https://gitlab.haskell.org/bgamari/gitlab-migration/blob/master/wiki-mapping.json
|
This function allows the user to compute the (non-transitive) size of a
heap object in words. The "closure" in the name is admittedly confusing
but we are stuck with this nomenclature at this point.
|
We make liveness information for global registers
available on `JMP` and `BCTR`, which were the last instructions
missing. With complete liveness information we do not need to
reserve global registers in `freeReg` anymore. Moreover we
assign R9 and R10 to callee saves registers.
Cleanup by removing `Reg_Su`, which was unused, from `freeReg`
and removing unused register definitions.
The calculation of the number of floating point registers is too
conservative. Just follow X86 and specify the constants directly.
Overall on PowerPC this results in 0.3 % smaller code size in nofib
while runtime is slightly better in some tests.
|
This moves all URL references to Trac tickets to their corresponding
GitLab counterparts.
|
Summary:
This re-applies {D5195} with fixes for i386:
* Fix unused label warnings, see {D5230} or {D5273}
* Fix a silly bug introduced by moving `#if`
{P190}
Add an RTS option -xp to load PIC objects anywhere in the address space. We do
this by relaxing the requirement that the result of `mmapForLinker` be
<0x80000000 and implying USE_CONTIGUOUS_MMAP.
We also need to change calls to `ocInit` and `ocGetNames` to avoid
dangling pointers when the address of `oc->image` is changed by
`ocAllocateSymbolExtra`.
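The dangling-pointer hazard is the usual one when a buffer can move. A minimal sketch, not the linker's code: `Object`, `grow_image` and `byte_at` are hypothetical, and `realloc` stands in for whatever relocates the image. Positions inside the image are kept as offsets and resolved against the current base on every access, instead of caching pointers taken before the move.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an object-code record. */
typedef struct {
    char  *image;  /* base address; may change when the image grows */
    size_t size;
} Object;

/* Grow the image by `extra` zeroed bytes. The block may move, so any
 * pointer into the old image is invalid afterwards. Returns 0 on success. */
int grow_image(Object *oc, size_t extra)
{
    char *moved = realloc(oc->image, oc->size + extra);
    if (moved == NULL) {
        return -1;
    }
    memset(moved + oc->size, 0, extra);
    oc->image = moved;
    oc->size += extra;
    return 0;
}

/* Access by offset, always going through the current oc->image. */
char byte_at(const Object *oc, size_t offset)
{
    assert(offset < oc->size);
    return oc->image[offset];
}
```

A caller that needs a position across a call to `grow_image` keeps the offset and calls `byte_at` afterwards, rather than holding on to `oc->image + offset`.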
Test Plan:
See {D5195}, also test under i386:
```
$ uname -a
Linux watashi-arch32 4.18.5-arch1-1.0-ARCH #1 SMP PREEMPT Tue Aug 28
20:45:30 CEST 2018 i686 GNU/Linux
$ cd testsuite/tests/th/ && make test
...
```
Will run `./validate` on the stacked diff.
Reviewers: simonmar, bgamari, alpmestan, trommler, hvr, erikd
Reviewed By: simonmar
Subscribers: rwbarton, carter
Differential Revision: https://phabricator.haskell.org/D5289
|