| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
| |
This commit renames the `getErrorMessages` and
`getMessages` function in the parser code to `getPsErrorMessages` and
`getPsMessages`, to avoid import conflicts, as we have already
`getErrorMessages` and `getMessages` defined in `GHC.Types.Error`.
Fixes #19920.
Update haddock submodule
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce LogFlags as a independent subset of DynFlags used for logging.
As a consequence in many places we don't have to pass both Logger and
DynFlags anymore.
The main reason for this refactoring is that I want to refactor the
systools interfaces: for now many systools functions use DynFlags both
to use the Logger and to fetch their parameters (e.g. ldInputs for the
linker). I'm interested in refactoring the way they fetch their
parameters (i.e. use dedicated XxxOpts data types instead of DynFlags)
for #19877. But if I did this refactoring before refactoring the Logger,
we would have duplicate parameters (e.g. ldInputs from DynFlags and
linkerInputs from LinkerOpts). Hence this patch first.
Some flags don't really belong to LogFlags because they are subsystem
specific (e.g. most DumpFlags). For example -ddump-asm should better be
passed in NCGConfig somehow. This patch doesn't fix this tight coupling:
the dump flags are part of the UI but they are passed all the way down
for example to infer the file name for the dumps.
Because LogFlags are a subset of the DynFlags, we must update the former
when the latter changes (not so often). As a consequence we now use
accessors to read/write DynFlags in HscEnv instead of using `hsc_dflags`
directly.
In the process I've also made some subsystems less dependent on DynFlags:
- CmmToAsm: by passing some missing flags via NCGConfig (see new fields
in GHC.CmmToAsm.Config)
- Core.Opt.*:
- by passing -dinline-check value into UnfoldingOpts
- by fixing some Core passes interfaces (e.g. CallArity, FloatIn)
that took DynFlags argument for no good reason.
- as a side-effect GHC.Core.Opt.Pipeline.doCorePass is much less
convoluted.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In which we add a new code generator to the Glasgow Haskell
Compiler. This codegen supports ELF and Mach-O targets, thus covering
Linux, macOS, and BSDs in principle. It was tested only on macOS and
Linux. The NCG follows a similar structure as the other native code
generators we already have, and should therfore be realtively easy to
follow.
It supports most of the features required for a proper native code
generator, but does not claim to be perfect or fully optimised. There
are still opportunities for optimisations.
Metric Decrease:
ManyAlternatives
ManyConstructors
MultiLayerModules
PmSeriesG
PmSeriesS
PmSeriesT
PmSeriesV
T10421
T10421a
T10858
T11195
T11276
T11303b
T11374
T11822
T12227
T12545
T12707
T13035
T13253
T13253-spj
T13379
T13701
T13719
T14683
T14697
T15164
T15630
T16577
T17096
T17516
T17836
T17836b
T17977
T17977b
T18140
T18282
T18304
T18478
T18698a
T18698b
T18923
T1969
T3064
T5030
T5321FD
T5321Fun
T5631
T5642
T5837
T783
T9198
T9233
T9630
T9872d
T9961
WWRec
Metric Increase:
T4801
|
|
|
|
|
|
|
|
| |
The stg_ctoi_t and stg_ret_t procedures which convert unboxed
tuples between the bytecode an native calling convention were
causing a panic when using the LLVM backend.
Fixes #19591
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit converts the lexers and all the parser machinery to use the
new parser types and diagnostics infrastructure. Furthermore, it cleans
up the way the parser code was emitting hints.
As a result of this systematic approach, the test output of the
`InfixAppPatErr` and `T984` tests have been changed. Previously they
would emit a `SuggestMissingDo` hint, but this was not at all helpful in
resolving the error, and it was even confusing by just looking at the
original program that triggered the errors.
Update haddock submodule
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Suppose a safe call: myCall(x,y,z)
It is lowered into three unsafe calls in Cmm:
r = suspendThread(...);
myCall(x,y,z);
resumeThread(r);
Consider the following situation for myCall arguments:
x = Sp[..] -- stack
y = Hp[..] -- heap
z = R1 -- global register
r = suspendThread(...);
myCall(x,y,z);
resumeThread(r);
The sink pass assumes that unsafe calls clobber memory (heap and stack),
hence x and y assignments are not sunk after `suspendThread`. The sink
pass also correctly handles global register clobbering for all unsafe
calls, except `suspendThread`!
`suspendThread` is special because it releases the capability the thread
is running on. Hence the sink pass must also take into account global
registers that are mapped into memory (in the capability).
In the example above, we could get:
r = suspendThread(...);
z = R1
myCall(x,y,z);
resumeThread(r);
But this transformation isn't valid if R1 is (BaseReg->rR1) as BaseReg
is invalid between suspendThread and resumeThread. This caused argument
corruption at least with the C backend ("unregisterised") in #19237.
Fix #19237
|
|
|
|
|
|
|
|
|
|
| |
Replace uses of WARN macro with calls to:
warnPprTrace :: Bool -> SDoc -> a -> a
Remove the now unused HsVersions.h
Bump haddock submodule
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch cleans up the complexity around WW's `mk_absent_let` by
broadening the scope of `LitRubbish`. Rubbish literals now store the
`PrimRep` they represent and are ultimately lowered in Cmm.
This in turn allows absent literals of `VecRep` or `VoidRep`. The latter
allows absent literals for unlifted coercions, as requested in #18983.
I took the liberty to rewrite and clean up `Note [Absent fillers]` and
`Note [Rubbish values]` to account for the new implementation and to
make them more orthogonal in their description.
I didn't add a new regression test, as `T18982` already contains the
test in the ticket and its test output changes as expected.
Fixes #18983.
|
|
|
|
|
| |
Metric Increase:
MultiLayerModules
|
|
|
|
|
|
|
|
| |
GHCi needs to know the types of all breakpoints, but it's
not possible to get the exprType of any expression in STG.
This is preparation for the upcoming change to make GHCi
bytecode from STG instead of Core.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This new flag embeds a lookup table from the address of an info table
to information about that info table.
The main interface for consulting the map is the `lookupIPE` C function
> InfoProvEnt * lookupIPE(StgInfoTable *info)
The `InfoProvEnt` has the following structure:
> typedef struct InfoProv_{
> char * table_name;
> char * closure_desc;
> char * ty_desc;
> char * label;
> char * module;
> char * srcloc;
> } InfoProv;
>
> typedef struct InfoProvEnt_ {
> StgInfoTable * info;
> InfoProv prov;
> struct InfoProvEnt_ *link;
> } InfoProvEnt;
The source positions are approximated in a similar way to the source
positions for DWARF debugging information. They are only approximate but
in our experience provide a good enough hint about where the problem
might be. It is therefore recommended to use this flag in conjunction
with `-g<n>` for more accurate locations.
The lookup table is also emitted into the eventlog when it is available
as it is intended to be used with the `-hi` profiling mode.
Using this flag will significantly increase the size of the resulting
object file but only by a factor of 2-3x in our experience.
|
|
|
|
|
|
|
|
|
| |
This commit renames parser's Error and Warning types (and their
constructors) to have a 'Ps' prefix, so that this would play nicely
when more errors and warnings for other phases of the pipeline will
be added. This will make more explicit which is the particular type
of error and warning we are dealing with, and will be more informative
for users to see in the generated Haddock.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The unit database cache, the home unit and the unit state were stored in
DynFlags while they ought to be stored in the compiler session state
(HscEnv). This patch fixes this.
It introduces a new UnitEnv type that should be used in the future to
handle separate unit environments (especially host vs target units).
Related to #17957
Bump haddock submodule
|
|
|
|
|
|
|
| |
The HomeUnitId is only used by the Cmm parser and this one has access to
the DynFlags, so it can grab the UnitId of the HomeUnit from them.
Bump haddock submodule
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Haskell and Cmm parsers/lexers now report errors and warnings using ADTs
defined in GHC.Parser.Errors. They can be printed using functions in
GHC.Parser.Errors.Ppr.
Some of the errors provide hints with a separate ADT (e.g. to suggest to
turn on some extension). For now, however, hints are not consistent
across all messages. For example some errors contain the hints in the
main message. I didn't want to change any message with this patch. I
expect these changes to be discussed and implemented later.
Surprisingly, this patch enhances performance. On CI
(x86_64/deb9/hadrian, ghc/alloc):
parsing001 -11.5%
T13719 -2.7%
MultiLayerModules -3.5%
Naperian -3.1%
Bump haddock submodule
Metric Decrease:
MultiLayerModules
Naperian
T13719
parsing001
|
|
|
|
| |
Bump haddock submodule
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some types need a Platform value to be pretty-printed: CLabel, Cmm
types, instructions, etc.
Before this patch they had an Outputable instance and the Platform value
was obtained via sdocWithDynFlags. It meant that the *renderer* of the
SDoc was responsible of passing the appropriate Platform value (e.g. via
the DynFlags given to showSDoc). It put the burden of passing the
Platform value on the renderer while the generator of the SDoc knows the
Platform it is generating the SDoc for and there is no point passing a
different Platform at rendering time.
With this patch, we introduce a new OutputableP class:
class OutputableP a where
pdoc :: Platform -> a -> SDoc
With this class we still have some polymorphism as we have with `ppr`
(i.e. we can use `pdoc` on a variety of types instead of having a
dedicated `pprXXX` function for each XXX type).
One step closer removing `sdocWithDynFlags` (#10143) and supporting
several platforms (#14335).
|
|
|
|
|
|
|
|
| |
In preparation for the next version of 'happy', c95920 added a qualified
import to GHC/Parser.y but for some reason neglected GHC/Cmm/Parser.y
This patch adds the missing qualified import to GHC/Cmm/Parser.y and
also adds a clarifying comment to explain why this import is needed.
|
|
|
|
|
|
|
|
|
| |
- put panic related functions into GHC.Utils.Panic
- put trace related functions using DynFlags in GHC.Driver.Ppr
One step closer making Outputable fully independent of DynFlags.
Bump haddock submodule
|
|
|
|
|
|
|
|
|
|
| |
Platform constant wrappers took a DynFlags parameter, hence implicitly
used the target platform constants. We removed them to allow support
for several platforms at once (#14335) and to avoid having to pass
the full DynFlags to every function (#17957).
Metric Decrease:
T4801
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes #17667 and should help to avoid such issues going forward.
The changes are mostly mechanical in nature. With two notable
exceptions.
* The register allocator.
The register allocator references registers by distinct uniques.
However they come from the types of VirtualReg, Reg or Unique in
various places. As a result we sometimes cast the key type of the
map and use functions which operate on the now typed map but take
a raw Unique as actual key. The logic itself has not changed it
just becomes obvious where we do so now.
* <Type>Env Modules.
As an example a ClassEnv is currently queried using the types `Class`,
`Name`, and `TyCon`. This is safe since for a distinct class value all
these expressions give the same unique.
getUnique cls
getUnique (classTyCon cls)
getUnique (className cls)
getUnique (tcName $ classTyCon cls)
This is for the most part contained within the modules defining the
interface. However it requires us to play dirty when we are given a
`Name` to lookup in a `UniqFM Class a` map. But again the logic did
not change and it's for the most part hidden behind the Env Module.
Some of these cases could be avoided by refactoring but this is left
for future work.
We also bump the haddock submodule as it uses UniqFM.
|
|
|
|
|
|
|
|
|
|
|
| |
SCC profiling was enabled in a convoluted way: if WayProf was enabled,
Opt_SccProfilingOn general flag was set (in
`GHC.Driver.Ways.wayGeneralFlags`), and then this flag was queried in
various places.
There is no need to go via general flags, so this patch defines a
`sccProfilingEnabled :: DynFlags -> Bool` helper function that just
checks whether WayProf is enabled.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, if a .cmm file *not in the RTS* contained something like:
```cmm
section "rodata" { msg : bits8[] "Test\n"; }
```
It would get compiled by CmmToC into:
```c
ERW_(msg);
const char msg[] = "Test\012";
```
and fail with:
```
/tmp/ghc32129_0/ghc_4.hc:5:12: error:
error: conflicting types for \u2018msg\u2019
const char msg[] = "Test\012";
^~~
In file included from /tmp/ghc32129_0/ghc_4.hc:3:0: error:
/tmp/ghc32129_0/ghc_4.hc:4:6: error:
note: previous declaration of \u2018msg\u2019 was here
ERW_(msg);
^
/builds/hsyl20/ghc/_build/install/lib/ghc-8.11.0.20200605/lib/../lib/x86_64-linux-ghc-8.11.0.20200605/rts-1.0/include/Stg.h:253:46: error:
note: in definition of macro \u2018ERW_\u2019
#define ERW_(X) extern StgWordArray (X)
^
```
See the rationale for this on https://gitlab.haskell.org/ghc/ghc/-/wikis/commentary/compiler/backends/ppr-c#prototypes
Now we don't generate these extern declarations (ERW_, etc.) for
top-level data. It shouldn't change anything for the RTS (the only place
we use .cmm files) as it is already special cased in
`GHC.Cmm.CLabel.needsCDecl`. And hand-written Cmm can use explicit
extern declarations when needed.
Note that it allows `cgrun069` test to pass with CmmToC (cf #15467).
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We don't want to save both Fn and Dn register sets on x86-64 as they are
aliased to the same arch register (XMMn).
Moreover, when SAVE_STGREGS was used in conjunction with `jump foo [*]`
which makes a set of Cmm registers alive so that they cover all arch
registers used to pass parameter, we could have Fn, Dn and XMMn alive at
the same time. It made the LLVM code generator choke (see #17920).
Now `SAVE_REGS/RESTORE_REGS` and `jump foo [*]` use the same set of
registers.
|
|
|
|
|
|
|
| |
tablesNextToCode is a platform setting and doesn't belong into DynFlags
(#17957). Doing this is also a prerequisite to fix #14335 where we deal
with two platforms (target and host) that may have different platform
settings.
|
|
|
|
|
|
|
| |
The initial version was rewritten by Tamar Christina.
It was rewritten in large parts by Andreas Klebinger.
Co-authored-by: Andreas Klebinger <klebinger.andreas@gmx.at>
|
|
|
|
|
|
|
|
| |
* use UnitId instead of String to identify wired-in units
* use UnitId instead of Unit in the backend (Unit are only use by
Backpack to produce type-checked interfaces, not real code)
* rename lookup functions for consistency
* documentation
|
|
|
|
|
| |
* rename thisPackage into homeUnit
* document and refactor several Backpack things
|
|
|
|
|
|
|
| |
Introduce GHC.Unit.* hierarchy for everything concerning units, packages
and modules.
Update Haddock submodule
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Over the years the unit management code has been modified a lot to keep
up with changes in Cabal (e.g. support for several library components in
the same package), to integrate BackPack, etc. I found it very hard to
understand as the terminology wasn't consistent, was referring to past
concepts, etc.
The terminology is now explained as clearly as I could in the Note
"About Units" and the code is refactored to reflect it.
-------------------
Many names were misleading: UnitId is not an Id but could be a virtual
unit (an indefinite one instantiated on the fly), IndefUnitId
constructor may contain a definite instantiated unit, etc.
* Rename IndefUnitId into InstantiatedUnit
* Rename IndefModule into InstantiatedModule
* Rename UnitId type into Unit
* Rename IndefiniteUnitId constructor into VirtUnit
* Rename DefiniteUnitId constructor into RealUnit
* Rename packageConfigId into mkUnit
* Rename getPackageDetails into unsafeGetUnitInfo
* Rename InstalledUnitId into UnitId
Remove references to misleading ComponentId: a ComponentId is just an
indefinite unit-id to be instantiated.
* Rename ComponentId into IndefUnitId
* Rename ComponentDetails into UnitPprInfo
* Fix display of UnitPprInfo with empty version: this is now used for
units dynamically generated by BackPack
Generalize several types (Module, Unit, etc.) so that they can be used
with different unit identifier types: UnitKey, UnitId, Unit, etc.
* GenModule: Module, InstantiatedModule and InstalledModule are now
instances of this type
* Generalize DefUnitId, IndefUnitId, Unit, InstantiatedUnit,
PackageDatabase
Replace BackPack fake "hole" UnitId by a proper HoleUnit constructor.
Add basic support for UnitKey. They should be used more in the future to
avoid mixing them up with UnitId as we do now.
Add many comments.
Update Haddock submodule
|
|
|
|
|
|
|
| |
Update Haddock submodule
Metric Increase:
haddock.compiler
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* SysTools
* Parser
* GHC.Builtin
* GHC.Iface.Recomp
* Settings
Update Haddock submodule
Metric Decrease:
Naperian
parsing001
|
|
|
|
|
|
|
|
|
|
| |
In !2959 we noticed that there was some redundant code (in GHC.Cmm.Utils
and GHC.Cmm.StgToCmm.Utils) used to deal with `CmmStatics` datatype
(before SRT generation) and `RawCmmStatics` datatype (after SRT
generation).
This patch removes this redundant code by using a single GADT for
(Raw)CmmStatics.
|
|
|
|
|
|
|
| |
Update Haddock submodule
Metric Increase:
haddock.compiler
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use Platform instead of DynFlags when possible:
* `tARGET_MIN_INT` et al. replaced with `platformMinInt` et al.
* no more DynFlags in PreRules: added a new `RuleOpts` datatype
* don't use `wORD_SIZE` in the compiler
* make `wordAlignment` use `Platform`
* make `dOUBLE_SIZE` a constant
Metric Decrease:
T13035
T1969
|
|
|
|
|
|
|
|
| |
Metric Decrease:
ManyConstructors
T12707
T13035
T1969
|
|
|
|
| |
From the notes.ghc.drop list found using weeder in #17713
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When GHC is parsing a file generated by a tool, e.g. by the C preprocessor, the
tool may insert #line pragmas to adjust the locations reported to the user.
As the result, the locations recorded in RealSrcLoc are not monotonic. Elements
that appear later in the StringBuffer are not guaranteed to have a higher
line/column number.
In fact, there are no guarantees whatsoever, as #line pragmas can arbitrarily
modify locations. This lack of guarantees makes ideas such as #17544
infeasible.
This patch adds an additional bit of information to every SrcLoc:
newtype BufPos = BufPos { bufPos :: Int }
A BufPos represents the location in the StringBuffer, unaffected by any
pragmas.
Updates haddock submodule.
Metric Increase:
haddock.Cabal
haddock.base
haddock.compiler
MultiLayerModules
Naperian
parsing001
T12150
|
|
|
|
| |
Update haddock submodule
|
|
|
|
| |
submodule updates: nofib, haddock
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There were two issues with this instance:
* its existence meant that a pattern match failure in the P monad would
produce a user-visible parse error, but the error message would not be
helpful to the user
* due to the MFP migration strategy, we had to use CPP in Lexer.x,
and that created issues for #17750
Updates haddock submodule.
|
|
|
|
| |
Update haddock submodule
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch removes all CafInfo predictions and various hacks to preserve
predicted CafInfos from the compiler and assigns final CafInfos to
interface Ids after code generation. SRT analysis is extended to support
static data, and Cmm generator is modified to allow generating
static_link fields after SRT analysis.
This also fixes `-fcatch-bottoms`, which introduces error calls in case
expressions in CorePrep, which runs *after* CoreTidy (which is where we
decide on CafInfos) and turns previously non-CAFFY things into CAFFY.
Fixes #17648
Fixes #9718
Evaluation
==========
NoFib
-----
Boot with: `make boot mode=fast`
Run: `make mode=fast EXTRA_RUNTEST_OPTS="-cachegrind" NoFibRuns=1`
--------------------------------------------------------------------------------
Program Size Allocs Instrs Reads Writes
--------------------------------------------------------------------------------
CS -0.0% 0.0% -0.0% -0.0% -0.0%
CSD -0.0% 0.0% -0.0% -0.0% -0.0%
FS -0.0% 0.0% -0.0% -0.0% -0.0%
S -0.0% 0.0% -0.0% -0.0% -0.0%
VS -0.0% 0.0% -0.0% -0.0% -0.0%
VSD -0.0% 0.0% -0.0% -0.0% -0.5%
VSM -0.0% 0.0% -0.0% -0.0% -0.0%
anna -0.1% 0.0% -0.0% -0.0% -0.0%
ansi -0.0% 0.0% -0.0% -0.0% -0.0%
atom -0.0% 0.0% -0.0% -0.0% -0.0%
awards -0.0% 0.0% -0.0% -0.0% -0.0%
banner -0.0% 0.0% -0.0% -0.0% -0.0%
bernouilli -0.0% 0.0% -0.0% -0.0% -0.0%
binary-trees -0.0% 0.0% -0.0% -0.0% -0.0%
boyer -0.0% 0.0% -0.0% -0.0% -0.0%
boyer2 -0.0% 0.0% -0.0% -0.0% -0.0%
bspt -0.0% 0.0% -0.0% -0.0% -0.0%
cacheprof -0.0% 0.0% -0.0% -0.0% -0.0%
calendar -0.0% 0.0% -0.0% -0.0% -0.0%
cichelli -0.0% 0.0% -0.0% -0.0% -0.0%
circsim -0.0% 0.0% -0.0% -0.0% -0.0%
clausify -0.0% 0.0% -0.0% -0.0% -0.0%
comp_lab_zift -0.0% 0.0% -0.0% -0.0% -0.0%
compress -0.0% 0.0% -0.0% -0.0% -0.0%
compress2 -0.0% 0.0% -0.0% -0.0% -0.0%
constraints -0.0% 0.0% -0.0% -0.0% -0.0%
cryptarithm1 -0.0% 0.0% -0.0% -0.0% -0.0%
cryptarithm2 -0.0% 0.0% -0.0% -0.0% -0.0%
cse -0.0% 0.0% -0.0% -0.0% -0.0%
digits-of-e1 -0.0% 0.0% -0.0% -0.0% -0.0%
digits-of-e2 -0.0% 0.0% -0.0% -0.0% -0.0%
dom-lt -0.0% 0.0% -0.0% -0.0% -0.0%
eliza -0.0% 0.0% -0.0% -0.0% -0.0%
event -0.0% 0.0% -0.0% -0.0% -0.0%
exact-reals -0.0% 0.0% -0.0% -0.0% -0.0%
exp3_8 -0.0% 0.0% -0.0% -0.0% -0.0%
expert -0.0% 0.0% -0.0% -0.0% -0.0%
fannkuch-redux -0.0% 0.0% -0.0% -0.0% -0.0%
fasta -0.0% 0.0% -0.0% -0.0% -0.0%
fem -0.0% 0.0% -0.0% -0.0% -0.0%
fft -0.0% 0.0% -0.0% -0.0% -0.0%
fft2 -0.0% 0.0% -0.0% -0.0% -0.0%
fibheaps -0.0% 0.0% -0.0% -0.0% -0.0%
fish -0.0% 0.0% -0.0% -0.0% -0.0%
fluid -0.1% 0.0% -0.0% -0.0% -0.0%
fulsom -0.0% 0.0% -0.0% -0.0% -0.0%
gamteb -0.0% 0.0% -0.0% -0.0% -0.0%
gcd -0.0% 0.0% -0.0% -0.0% -0.0%
gen_regexps -0.0% 0.0% -0.0% -0.0% -0.0%
genfft -0.0% 0.0% -0.0% -0.0% -0.0%
gg -0.0% 0.0% -0.0% -0.0% -0.0%
grep -0.0% 0.0% -0.0% -0.0% -0.0%
hidden -0.0% 0.0% -0.0% -0.0% -0.0%
hpg -0.1% 0.0% -0.0% -0.0% -0.0%
ida -0.0% 0.0% -0.0% -0.0% -0.0%
infer -0.0% 0.0% -0.0% -0.0% -0.0%
integer -0.0% 0.0% -0.0% -0.0% -0.0%
integrate -0.0% 0.0% -0.0% -0.0% -0.0%
k-nucleotide -0.0% 0.0% -0.0% -0.0% -0.0%
kahan -0.0% 0.0% -0.0% -0.0% -0.0%
knights -0.0% 0.0% -0.0% -0.0% -0.0%
lambda -0.0% 0.0% -0.0% -0.0% -0.0%
last-piece -0.0% 0.0% -0.0% -0.0% -0.0%
lcss -0.0% 0.0% -0.0% -0.0% -0.0%
life -0.0% 0.0% -0.0% -0.0% -0.0%
lift -0.0% 0.0% -0.0% -0.0% -0.0%
linear -0.1% 0.0% -0.0% -0.0% -0.0%
listcompr -0.0% 0.0% -0.0% -0.0% -0.0%
listcopy -0.0% 0.0% -0.0% -0.0% -0.0%
maillist -0.0% 0.0% -0.0% -0.0% -0.0%
mandel -0.0% 0.0% -0.0% -0.0% -0.0%
mandel2 -0.0% 0.0% -0.0% -0.0% -0.0%
mate -0.0% 0.0% -0.0% -0.0% -0.0%
minimax -0.0% 0.0% -0.0% -0.0% -0.0%
mkhprog -0.0% 0.0% -0.0% -0.0% -0.0%
multiplier -0.0% 0.0% -0.0% -0.0% -0.0%
n-body -0.0% 0.0% -0.0% -0.0% -0.0%
nucleic2 -0.0% 0.0% -0.0% -0.0% -0.0%
para -0.0% 0.0% -0.0% -0.0% -0.0%
paraffins -0.0% 0.0% -0.0% -0.0% -0.0%
parser -0.1% 0.0% -0.0% -0.0% -0.0%
parstof -0.1% 0.0% -0.0% -0.0% -0.0%
pic -0.0% 0.0% -0.0% -0.0% -0.0%
pidigits -0.0% 0.0% -0.0% -0.0% -0.0%
power -0.0% 0.0% -0.0% -0.0% -0.0%
pretty -0.0% 0.0% -0.3% -0.4% -0.4%
primes -0.0% 0.0% -0.0% -0.0% -0.0%
primetest -0.0% 0.0% -0.0% -0.0% -0.0%
prolog -0.0% 0.0% -0.0% -0.0% -0.0%
puzzle -0.0% 0.0% -0.0% -0.0% -0.0%
queens -0.0% 0.0% -0.0% -0.0% -0.0%
reptile -0.0% 0.0% -0.0% -0.0% -0.0%
reverse-complem -0.0% 0.0% -0.0% -0.0% -0.0%
rewrite -0.0% 0.0% -0.0% -0.0% -0.0%
rfib -0.0% 0.0% -0.0% -0.0% -0.0%
rsa -0.0% 0.0% -0.0% -0.0% -0.0%
scc -0.0% 0.0% -0.3% -0.5% -0.4%
sched -0.0% 0.0% -0.0% -0.0% -0.0%
scs -0.0% 0.0% -0.0% -0.0% -0.0%
simple -0.1% 0.0% -0.0% -0.0% -0.0%
solid -0.0% 0.0% -0.0% -0.0% -0.0%
sorting -0.0% 0.0% -0.0% -0.0% -0.0%
spectral-norm -0.0% 0.0% -0.0% -0.0% -0.0%
sphere -0.0% 0.0% -0.0% -0.0% -0.0%
symalg -0.0% 0.0% -0.0% -0.0% -0.0%
tak -0.0% 0.0% -0.0% -0.0% -0.0%
transform -0.0% 0.0% -0.0% -0.0% -0.0%
treejoin -0.0% 0.0% -0.0% -0.0% -0.0%
typecheck -0.0% 0.0% -0.0% -0.0% -0.0%
veritas -0.0% 0.0% -0.0% -0.0% -0.0%
wang -0.0% 0.0% -0.0% -0.0% -0.0%
wave4main -0.0% 0.0% -0.0% -0.0% -0.0%
wheel-sieve1 -0.0% 0.0% -0.0% -0.0% -0.0%
wheel-sieve2 -0.0% 0.0% -0.0% -0.0% -0.0%
x2n1 -0.0% 0.0% -0.0% -0.0% -0.0%
--------------------------------------------------------------------------------
Min -0.1% 0.0% -0.3% -0.5% -0.5%
Max -0.0% 0.0% -0.0% -0.0% -0.0%
Geometric Mean -0.0% -0.0% -0.0% -0.0% -0.0%
--------------------------------------------------------------------------------
Program Size Allocs Instrs Reads Writes
--------------------------------------------------------------------------------
circsim -0.1% 0.0% -0.0% -0.0% -0.0%
constraints -0.0% 0.0% -0.0% -0.0% -0.0%
fibheaps -0.0% 0.0% -0.0% -0.0% -0.0%
gc_bench -0.0% 0.0% -0.0% -0.0% -0.0%
hash -0.0% 0.0% -0.0% -0.0% -0.0%
lcss -0.0% 0.0% -0.0% -0.0% -0.0%
power -0.0% 0.0% -0.0% -0.0% -0.0%
spellcheck -0.0% 0.0% -0.0% -0.0% -0.0%
--------------------------------------------------------------------------------
Min -0.1% 0.0% -0.0% -0.0% -0.0%
Max -0.0% 0.0% -0.0% -0.0% -0.0%
Geometric Mean -0.0% +0.0% -0.0% -0.0% -0.0%
Manual inspection of programs in testsuite/tests/programs
---------------------------------------------------------
I built these programs with a bunch of dump flags and `-O` and compared
STG, Cmm, and Asm dumps and file sizes.
(Below the numbers in parenthesis show number of modules in the program)
These programs have identical compiler (same .hi and .o sizes, STG, and
Cmm and Asm dumps):
- Queens (1), andre_monad (1), cholewo-eval (2), cvh_unboxing (3),
andy_cherry (7), fun_insts (1), hs-boot (4), fast2haskell (2),
jl_defaults (1), jq_readsPrec (1), jules_xref (1), jtod_circint (4),
jules_xref2 (1), lennart_range (1), lex (1), life_space_leak (1),
bargon-mangler-bug (7), record_upd (1), rittri (1), sanders_array (1),
strict_anns (1), thurston-module-arith (2), okeefe_neural (1),
joao-circular (6), 10queens (1)
Programs with different compiler outputs:
- jl_defaults (1): For some reason GHC HEAD marks a lot of top-level
`[Int]` closures as CAFFY for no reason. With this patch we no longer
make them CAFFY and generate less SRT entries. For some reason Main.o
is slightly larger with this patch (1.3%) and the executable sizes are
the same. (I'd expect both to be smaller)
- launchbury (1): Same as jl_defaults: top-level `[Int]` closures marked
as CAFFY for no reason. Similarly `Main.o` is 1.4% larger but the
executable sizes are the same.
- galois_raytrace (13): Differences are in the Parse module. There are a
lot, but some of the changes are caused by the fact that for some
reason (I think a bug) GHC HEAD marks the dictionary for `Functor
Identity` as CAFFY. Parse.o is 0.4% larger, the executable size is the
same.
- north_array: We now generate less SRT entries because some of array
primops used in this program like `NewArrayOp` get eliminated during
Stg-to-Cmm and turn some CAFFY things into non-CAFFY. Main.o gets 24%
larger (9224 bytes from 9000 bytes), executable sizes are the same.
- seward-space-leak: Difference in this program is better shown by this
smaller example:
module Lib where
data CDS
= Case [CDS] [(Int, CDS)]
| Call CDS CDS
instance Eq CDS where
Case sels1 rets1 == Case sels2 rets2 =
sels1 == sels2 && rets1 == rets2
Call a1 b1 == Call a2 b2 =
a1 == a2 && b1 == b2
_ == _ =
False
In this program GHC HEAD builds a new SRT for the recursive group of
`(==)`, `(/=)` and the dictionary closure. Then `/=` points to `==`
in its SRT field, and `==` uses the SRT object as its SRT. With this
patch we use the closure for `/=` as the SRT and add `==` there. Then
`/=` gets an empty SRT field and `==` points to `/=` in its SRT
field.
This change looks fine to me.
Main.o gets 0.07% larger, executable sizes are identical.
head.hackage
------------
head.hackage's CI script builds 428 packages from Hackage using this
patch with no failures.
Compiler performance
--------------------
The compiler perf tests report that the compiler allocates slightly more
(worst case observed so far is 4%). However most programs in the test
suite are small, single file programs. To benchmark compiler performance
on something more realistic I build Cabal (the library, 236 modules)
with different optimisation levels. For the "max residency" row I run
GHC with `+RTS -s -A100k -i0 -h` for more accurate numbers. Other rows
are generated with just `-s`. (This is because `-i0` causes running GC
much more frequently and as a result "bytes copied" gets inflated by
more than 25x in some cases)
* -O0
| | GHC HEAD | This MR | Diff |
| --------------- | -------------- | -------------- | ------ |
| Bytes allocated | 54,413,350,872 | 54,701,099,464 | +0.52% |
| Bytes copied | 4,926,037,184 | 4,990,638,760 | +1.31% |
| Max residency | 421,225,624 | 424,324,264 | +0.73% |
* -O1
| | GHC HEAD | This MR | Diff |
| --------------- | --------------- | --------------- | ------ |
| Bytes allocated | 245,849,209,992 | 246,562,088,672 | +0.28% |
| Bytes copied | 26,943,452,560 | 27,089,972,296 | +0.54% |
| Max residency | 982,643,440 | 991,663,432 | +0.91% |
* -O2
| | GHC HEAD | This MR | Diff |
| --------------- | --------------- | --------------- | ------ |
| Bytes allocated | 291,044,511,408 | 291,863,910,912 | +0.28% |
| Bytes copied | 37,044,237,616 | 36,121,690,472 | -2.49% |
| Max residency | 1,071,600,328 | 1,086,396,256 | +1.38% |
Extra compiler allocations
--------------------------
Runtime allocations of programs are as reported above (NoFib section).
The compiler now allocates more than before. Main source of allocation
in this patch compared to base commit is the new SRT algorithm
(GHC.Cmm.Info.Build). Below is some of the extra work we do with this
patch, numbers generated by profiled stage 2 compiler when building a
pathological case (the test 'ManyConstructors') with '-O2':
- We now sort the final STG for a module, which means traversing the
entire program, generating free variable set for each top-level
binding, doing SCC analysis, and re-ordering the program. In
ManyConstructors this step allocates 97,889,952 bytes.
- We now do SRT analysis on static data, which in a program like
ManyConstructors causes analysing 10,000 bindings that we would
previously just skip. This step allocates 70,898,352 bytes.
- We now maintain an SRT map for the entire module as we compile Cmm
groups:
data ModuleSRTInfo = ModuleSRTInfo
{ ...
, moduleSRTMap :: SRTMap
}
(SRTMap is just a strict Map from the 'containers' library)
This map gets an entry for most bindings in a module (exceptions are
THUNKs and CAFFY static functions). For ManyConstructors this map
gets 50015 entries.
- Once we're done with code generation we generate a NameSet from SRTMap
for the non-CAFFY names in the current module. This set gets the same
number of entries as the SRTMap.
- Finally we update CafInfos in ModDetails for the non-CAFFY Ids, using
the NameSet generated in the previous step. This usually does the
least amount of allocation among the work listed here.
Only place with this patch where we do less work in the CAF analysis in
the tidying pass (CoreTidy). However that doesn't save us much, as the
pass still needs to traverse the whole program and update IdInfos for
other reasons. Only thing we don't here do is the `hasCafRefs` pass over
the RHS of bindings, which is a stateless pass that returns a boolean
value, so it doesn't allocate much.
(Metric changes blow are all increased allocations)
Metric changes
--------------
Metric Increase:
ManyAlternatives
ManyConstructors
T13035
T14683
T1969
T9961
|
|
|