| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
Update Haddock submodule
Metric Increase:
haddock.compiler
|
|
|
|
|
|
|
|
|
|
| |
* replace `DynFlags` parameters with `SDocContext` parameters for a few
Ppr related functions: `bufLeftRenderSDoc`, `printSDoc`,
`printSDocLn`, `showSDocOneLine`.
* remove the use of `pprCols :: DynFlags -> Int` in Outputable. We
already have the information via `sdocLineLength :: SDocContext ->
Int`
|
| |
|
|
|
|
| |
Update Haddock submodule
|
|
|
|
|
|
|
|
| |
This allows us to remove several bits of CPP that are either always
true or no longer reachable. As an added bonus, we no longer need to
worry about importing `Control.Monad.Fail.fail` qualified to avoid
clashing with `Control.Monad.fail`, since the latter is now the same
as the former.
|
|
|
|
|
|
|
| |
Update Haddock submodule
Metric Increase:
haddock.compiler
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This read causes NULL dereferencing when len is 0.
Fixes #17909
In the reproducer in #17909 this bug is triggered as follows:
- SimplOpt.dealWithStringLiteral is called with a single-char string
("=" in #17909)
- tailFS gets called on the FastString of the single-char string.
- tailFS checks the length of the string, which is 1, and calls
mkFastStringByteString on the tail of the ByteString, which is an
empty ByteString as the original ByteString has only one char.
- ByteString's unsafeUseAsCStringLen returns (NULL, 0) for the empty
ByteString, which is passed to mkFastStringWith.
- mkFastStringWith gets hash of the NULL pointer via hashStr, which
fails on empty strings because of this bug.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch disentangles a bit more DynFlags from the native code
generator (CmmToAsm).
In more details:
- add a new NCGConfig datatype in GHC.CmmToAsm.Config which contains the
configuration of a native code generation session
- explicitly pass NCGConfig/Platform arguments when necessary
- as a consequence `sdocWithPlatform` is gone and there are only a few
`sdocWithDynFlags` left
- remove the use of `unsafeGlobalDynFlags` from GHC.CmmToAsm.CFG
- remove `sdocDebugLevel` (now we pass the debug level via NCGConfig)
There are still some places where DynFlags is used, especially because
of pretty-printing (CLabel), because of Cmm helpers (such as
`cmmExprType`) and because of `Outputable` instance for the
instructions. These are left for future refactoring as this patch is
already big.
|
|
|
|
| |
From the notes.ghc.drop list found using weeder in #17713
|
|
|
|
|
|
|
|
|
|
| |
I see quite a few warnings like:
WARNING: file compiler/utils/Util.hs, line 593
Over-long elem in unionLists
But the call stack is uninformative. Better to add HasDebugCallStack
to isIn. Ditto isn'tIn.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When GHC is parsing a file generated by a tool, e.g. by the C preprocessor, the
tool may insert #line pragmas to adjust the locations reported to the user.
As the result, the locations recorded in RealSrcLoc are not monotonic. Elements
that appear later in the StringBuffer are not guaranteed to have a higher
line/column number.
In fact, there are no guarantees whatsoever, as #line pragmas can arbitrarily
modify locations. This lack of guarantees makes ideas such as #17544
infeasible.
This patch adds an additional bit of information to every SrcLoc:
newtype BufPos = BufPos { bufPos :: Int }
A BufPos represents the location in the StringBuffer, unaffected by any
pragmas.
Updates haddock submodule.
Metric Increase:
haddock.Cabal
haddock.base
haddock.compiler
MultiLayerModules
Naperian
parsing001
T12150
|
|
|
|
| |
Update haddock submodule
|
| |
|
|
|
|
| |
submodule updates: nofib, haddock
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Remove several uses of `sdocWithDynFlags`. The remaining ones are mostly
CodeGen related (e.g. depend on target platform constants) and will be
fixed separately.
Metric Decrease:
T12425
T9961
WWRec
T1969
T14683
|
|
|
|
|
|
|
| |
The original reason this was disabled should be fixed by the previous
commit.
This reverts commit 1c1b63d63efe8b0f789aa7d5b87cfac3edd213eb.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch removes all CafInfo predictions and various hacks to preserve
predicted CafInfos from the compiler and assigns final CafInfos to
interface Ids after code generation. SRT analysis is extended to support
static data, and Cmm generator is modified to allow generating
static_link fields after SRT analysis.
This also fixes `-fcatch-bottoms`, which introduces error calls in case
expressions in CorePrep, which runs *after* CoreTidy (which is where we
decide on CafInfos) and turns previously non-CAFFY things into CAFFY.
Fixes #17648
Fixes #9718
Evaluation
==========
NoFib
-----
Boot with: `make boot mode=fast`
Run: `make mode=fast EXTRA_RUNTEST_OPTS="-cachegrind" NoFibRuns=1`
--------------------------------------------------------------------------------
Program Size Allocs Instrs Reads Writes
--------------------------------------------------------------------------------
CS -0.0% 0.0% -0.0% -0.0% -0.0%
CSD -0.0% 0.0% -0.0% -0.0% -0.0%
FS -0.0% 0.0% -0.0% -0.0% -0.0%
S -0.0% 0.0% -0.0% -0.0% -0.0%
VS -0.0% 0.0% -0.0% -0.0% -0.0%
VSD -0.0% 0.0% -0.0% -0.0% -0.5%
VSM -0.0% 0.0% -0.0% -0.0% -0.0%
anna -0.1% 0.0% -0.0% -0.0% -0.0%
ansi -0.0% 0.0% -0.0% -0.0% -0.0%
atom -0.0% 0.0% -0.0% -0.0% -0.0%
awards -0.0% 0.0% -0.0% -0.0% -0.0%
banner -0.0% 0.0% -0.0% -0.0% -0.0%
bernouilli -0.0% 0.0% -0.0% -0.0% -0.0%
binary-trees -0.0% 0.0% -0.0% -0.0% -0.0%
boyer -0.0% 0.0% -0.0% -0.0% -0.0%
boyer2 -0.0% 0.0% -0.0% -0.0% -0.0%
bspt -0.0% 0.0% -0.0% -0.0% -0.0%
cacheprof -0.0% 0.0% -0.0% -0.0% -0.0%
calendar -0.0% 0.0% -0.0% -0.0% -0.0%
cichelli -0.0% 0.0% -0.0% -0.0% -0.0%
circsim -0.0% 0.0% -0.0% -0.0% -0.0%
clausify -0.0% 0.0% -0.0% -0.0% -0.0%
comp_lab_zift -0.0% 0.0% -0.0% -0.0% -0.0%
compress -0.0% 0.0% -0.0% -0.0% -0.0%
compress2 -0.0% 0.0% -0.0% -0.0% -0.0%
constraints -0.0% 0.0% -0.0% -0.0% -0.0%
cryptarithm1 -0.0% 0.0% -0.0% -0.0% -0.0%
cryptarithm2 -0.0% 0.0% -0.0% -0.0% -0.0%
cse -0.0% 0.0% -0.0% -0.0% -0.0%
digits-of-e1 -0.0% 0.0% -0.0% -0.0% -0.0%
digits-of-e2 -0.0% 0.0% -0.0% -0.0% -0.0%
dom-lt -0.0% 0.0% -0.0% -0.0% -0.0%
eliza -0.0% 0.0% -0.0% -0.0% -0.0%
event -0.0% 0.0% -0.0% -0.0% -0.0%
exact-reals -0.0% 0.0% -0.0% -0.0% -0.0%
exp3_8 -0.0% 0.0% -0.0% -0.0% -0.0%
expert -0.0% 0.0% -0.0% -0.0% -0.0%
fannkuch-redux -0.0% 0.0% -0.0% -0.0% -0.0%
fasta -0.0% 0.0% -0.0% -0.0% -0.0%
fem -0.0% 0.0% -0.0% -0.0% -0.0%
fft -0.0% 0.0% -0.0% -0.0% -0.0%
fft2 -0.0% 0.0% -0.0% -0.0% -0.0%
fibheaps -0.0% 0.0% -0.0% -0.0% -0.0%
fish -0.0% 0.0% -0.0% -0.0% -0.0%
fluid -0.1% 0.0% -0.0% -0.0% -0.0%
fulsom -0.0% 0.0% -0.0% -0.0% -0.0%
gamteb -0.0% 0.0% -0.0% -0.0% -0.0%
gcd -0.0% 0.0% -0.0% -0.0% -0.0%
gen_regexps -0.0% 0.0% -0.0% -0.0% -0.0%
genfft -0.0% 0.0% -0.0% -0.0% -0.0%
gg -0.0% 0.0% -0.0% -0.0% -0.0%
grep -0.0% 0.0% -0.0% -0.0% -0.0%
hidden -0.0% 0.0% -0.0% -0.0% -0.0%
hpg -0.1% 0.0% -0.0% -0.0% -0.0%
ida -0.0% 0.0% -0.0% -0.0% -0.0%
infer -0.0% 0.0% -0.0% -0.0% -0.0%
integer -0.0% 0.0% -0.0% -0.0% -0.0%
integrate -0.0% 0.0% -0.0% -0.0% -0.0%
k-nucleotide -0.0% 0.0% -0.0% -0.0% -0.0%
kahan -0.0% 0.0% -0.0% -0.0% -0.0%
knights -0.0% 0.0% -0.0% -0.0% -0.0%
lambda -0.0% 0.0% -0.0% -0.0% -0.0%
last-piece -0.0% 0.0% -0.0% -0.0% -0.0%
lcss -0.0% 0.0% -0.0% -0.0% -0.0%
life -0.0% 0.0% -0.0% -0.0% -0.0%
lift -0.0% 0.0% -0.0% -0.0% -0.0%
linear -0.1% 0.0% -0.0% -0.0% -0.0%
listcompr -0.0% 0.0% -0.0% -0.0% -0.0%
listcopy -0.0% 0.0% -0.0% -0.0% -0.0%
maillist -0.0% 0.0% -0.0% -0.0% -0.0%
mandel -0.0% 0.0% -0.0% -0.0% -0.0%
mandel2 -0.0% 0.0% -0.0% -0.0% -0.0%
mate -0.0% 0.0% -0.0% -0.0% -0.0%
minimax -0.0% 0.0% -0.0% -0.0% -0.0%
mkhprog -0.0% 0.0% -0.0% -0.0% -0.0%
multiplier -0.0% 0.0% -0.0% -0.0% -0.0%
n-body -0.0% 0.0% -0.0% -0.0% -0.0%
nucleic2 -0.0% 0.0% -0.0% -0.0% -0.0%
para -0.0% 0.0% -0.0% -0.0% -0.0%
paraffins -0.0% 0.0% -0.0% -0.0% -0.0%
parser -0.1% 0.0% -0.0% -0.0% -0.0%
parstof -0.1% 0.0% -0.0% -0.0% -0.0%
pic -0.0% 0.0% -0.0% -0.0% -0.0%
pidigits -0.0% 0.0% -0.0% -0.0% -0.0%
power -0.0% 0.0% -0.0% -0.0% -0.0%
pretty -0.0% 0.0% -0.3% -0.4% -0.4%
primes -0.0% 0.0% -0.0% -0.0% -0.0%
primetest -0.0% 0.0% -0.0% -0.0% -0.0%
prolog -0.0% 0.0% -0.0% -0.0% -0.0%
puzzle -0.0% 0.0% -0.0% -0.0% -0.0%
queens -0.0% 0.0% -0.0% -0.0% -0.0%
reptile -0.0% 0.0% -0.0% -0.0% -0.0%
reverse-complem -0.0% 0.0% -0.0% -0.0% -0.0%
rewrite -0.0% 0.0% -0.0% -0.0% -0.0%
rfib -0.0% 0.0% -0.0% -0.0% -0.0%
rsa -0.0% 0.0% -0.0% -0.0% -0.0%
scc -0.0% 0.0% -0.3% -0.5% -0.4%
sched -0.0% 0.0% -0.0% -0.0% -0.0%
scs -0.0% 0.0% -0.0% -0.0% -0.0%
simple -0.1% 0.0% -0.0% -0.0% -0.0%
solid -0.0% 0.0% -0.0% -0.0% -0.0%
sorting -0.0% 0.0% -0.0% -0.0% -0.0%
spectral-norm -0.0% 0.0% -0.0% -0.0% -0.0%
sphere -0.0% 0.0% -0.0% -0.0% -0.0%
symalg -0.0% 0.0% -0.0% -0.0% -0.0%
tak -0.0% 0.0% -0.0% -0.0% -0.0%
transform -0.0% 0.0% -0.0% -0.0% -0.0%
treejoin -0.0% 0.0% -0.0% -0.0% -0.0%
typecheck -0.0% 0.0% -0.0% -0.0% -0.0%
veritas -0.0% 0.0% -0.0% -0.0% -0.0%
wang -0.0% 0.0% -0.0% -0.0% -0.0%
wave4main -0.0% 0.0% -0.0% -0.0% -0.0%
wheel-sieve1 -0.0% 0.0% -0.0% -0.0% -0.0%
wheel-sieve2 -0.0% 0.0% -0.0% -0.0% -0.0%
x2n1 -0.0% 0.0% -0.0% -0.0% -0.0%
--------------------------------------------------------------------------------
Min -0.1% 0.0% -0.3% -0.5% -0.5%
Max -0.0% 0.0% -0.0% -0.0% -0.0%
Geometric Mean -0.0% -0.0% -0.0% -0.0% -0.0%
--------------------------------------------------------------------------------
Program Size Allocs Instrs Reads Writes
--------------------------------------------------------------------------------
circsim -0.1% 0.0% -0.0% -0.0% -0.0%
constraints -0.0% 0.0% -0.0% -0.0% -0.0%
fibheaps -0.0% 0.0% -0.0% -0.0% -0.0%
gc_bench -0.0% 0.0% -0.0% -0.0% -0.0%
hash -0.0% 0.0% -0.0% -0.0% -0.0%
lcss -0.0% 0.0% -0.0% -0.0% -0.0%
power -0.0% 0.0% -0.0% -0.0% -0.0%
spellcheck -0.0% 0.0% -0.0% -0.0% -0.0%
--------------------------------------------------------------------------------
Min -0.1% 0.0% -0.0% -0.0% -0.0%
Max -0.0% 0.0% -0.0% -0.0% -0.0%
Geometric Mean -0.0% +0.0% -0.0% -0.0% -0.0%
Manual inspection of programs in testsuite/tests/programs
---------------------------------------------------------
I built these programs with a bunch of dump flags and `-O` and compared
STG, Cmm, and Asm dumps and file sizes.
(Below the numbers in parenthesis show number of modules in the program)
These programs have identical compiler (same .hi and .o sizes, STG, and
Cmm and Asm dumps):
- Queens (1), andre_monad (1), cholewo-eval (2), cvh_unboxing (3),
andy_cherry (7), fun_insts (1), hs-boot (4), fast2haskell (2),
jl_defaults (1), jq_readsPrec (1), jules_xref (1), jtod_circint (4),
jules_xref2 (1), lennart_range (1), lex (1), life_space_leak (1),
bargon-mangler-bug (7), record_upd (1), rittri (1), sanders_array (1),
strict_anns (1), thurston-module-arith (2), okeefe_neural (1),
joao-circular (6), 10queens (1)
Programs with different compiler outputs:
- jl_defaults (1): For some reason GHC HEAD marks a lot of top-level
`[Int]` closures as CAFFY for no reason. With this patch we no longer
make them CAFFY and generate less SRT entries. For some reason Main.o
is slightly larger with this patch (1.3%) and the executable sizes are
the same. (I'd expect both to be smaller)
- launchbury (1): Same as jl_defaults: top-level `[Int]` closures marked
as CAFFY for no reason. Similarly `Main.o` is 1.4% larger but the
executable sizes are the same.
- galois_raytrace (13): Differences are in the Parse module. There are a
lot, but some of the changes are caused by the fact that for some
reason (I think a bug) GHC HEAD marks the dictionary for `Functor
Identity` as CAFFY. Parse.o is 0.4% larger, the executable size is the
same.
- north_array: We now generate less SRT entries because some of array
primops used in this program like `NewArrayOp` get eliminated during
Stg-to-Cmm and turn some CAFFY things into non-CAFFY. Main.o gets 24%
larger (9224 bytes from 9000 bytes), executable sizes are the same.
- seward-space-leak: Difference in this program is better shown by this
smaller example:
module Lib where
data CDS
= Case [CDS] [(Int, CDS)]
| Call CDS CDS
instance Eq CDS where
Case sels1 rets1 == Case sels2 rets2 =
sels1 == sels2 && rets1 == rets2
Call a1 b1 == Call a2 b2 =
a1 == a2 && b1 == b2
_ == _ =
False
In this program GHC HEAD builds a new SRT for the recursive group of
`(==)`, `(/=)` and the dictionary closure. Then `/=` points to `==`
in its SRT field, and `==` uses the SRT object as its SRT. With this
patch we use the closure for `/=` as the SRT and add `==` there. Then
`/=` gets an empty SRT field and `==` points to `/=` in its SRT
field.
This change looks fine to me.
Main.o gets 0.07% larger, executable sizes are identical.
head.hackage
------------
head.hackage's CI script builds 428 packages from Hackage using this
patch with no failures.
Compiler performance
--------------------
The compiler perf tests report that the compiler allocates slightly more
(worst case observed so far is 4%). However most programs in the test
suite are small, single file programs. To benchmark compiler performance
on something more realistic I build Cabal (the library, 236 modules)
with different optimisation levels. For the "max residency" row I run
GHC with `+RTS -s -A100k -i0 -h` for more accurate numbers. Other rows
are generated with just `-s`. (This is because `-i0` causes running GC
much more frequently and as a result "bytes copied" gets inflated by
more than 25x in some cases)
* -O0
| | GHC HEAD | This MR | Diff |
| --------------- | -------------- | -------------- | ------ |
| Bytes allocated | 54,413,350,872 | 54,701,099,464 | +0.52% |
| Bytes copied | 4,926,037,184 | 4,990,638,760 | +1.31% |
| Max residency | 421,225,624 | 424,324,264 | +0.73% |
* -O1
| | GHC HEAD | This MR | Diff |
| --------------- | --------------- | --------------- | ------ |
| Bytes allocated | 245,849,209,992 | 246,562,088,672 | +0.28% |
| Bytes copied | 26,943,452,560 | 27,089,972,296 | +0.54% |
| Max residency | 982,643,440 | 991,663,432 | +0.91% |
* -O2
| | GHC HEAD | This MR | Diff |
| --------------- | --------------- | --------------- | ------ |
| Bytes allocated | 291,044,511,408 | 291,863,910,912 | +0.28% |
| Bytes copied | 37,044,237,616 | 36,121,690,472 | -2.49% |
| Max residency | 1,071,600,328 | 1,086,396,256 | +1.38% |
Extra compiler allocations
--------------------------
Runtime allocations of programs are as reported above (NoFib section).
The compiler now allocates more than before. Main source of allocation
in this patch compared to base commit is the new SRT algorithm
(GHC.Cmm.Info.Build). Below is some of the extra work we do with this
patch, numbers generated by profiled stage 2 compiler when building a
pathological case (the test 'ManyConstructors') with '-O2':
- We now sort the final STG for a module, which means traversing the
entire program, generating free variable set for each top-level
binding, doing SCC analysis, and re-ordering the program. In
ManyConstructors this step allocates 97,889,952 bytes.
- We now do SRT analysis on static data, which in a program like
ManyConstructors causes analysing 10,000 bindings that we would
previously just skip. This step allocates 70,898,352 bytes.
- We now maintain an SRT map for the entire module as we compile Cmm
groups:
data ModuleSRTInfo = ModuleSRTInfo
{ ...
, moduleSRTMap :: SRTMap
}
(SRTMap is just a strict Map from the 'containers' library)
This map gets an entry for most bindings in a module (exceptions are
THUNKs and CAFFY static functions). For ManyConstructors this map
gets 50015 entries.
- Once we're done with code generation we generate a NameSet from SRTMap
for the non-CAFFY names in the current module. This set gets the same
number of entries as the SRTMap.
- Finally we update CafInfos in ModDetails for the non-CAFFY Ids, using
the NameSet generated in the previous step. This usually does the
least amount of allocation among the work listed here.
Only place with this patch where we do less work in the CAF analysis in
the tidying pass (CoreTidy). However that doesn't save us much, as the
pass still needs to traverse the whole program and update IdInfos for
other reasons. Only thing we don't here do is the `hasCafRefs` pass over
the RHS of bindings, which is a stateless pass that returns a boolean
value, so it doesn't allocate much.
(Metric changes blow are all increased allocations)
Metric changes
--------------
Metric Increase:
ManyAlternatives
ManyConstructors
T13035
T14683
T1969
T9961
|
|
|
|
|
|
| |
incomplete-uni-patterns and incomplete-record-updates will be in -Wall at a
future date, so prepare for that by disabling those warnings on files that
trigger them.
|
|
|
|
|
|
|
|
|
| |
In https://gitlab.haskell.org/ghc/ghc/merge_requests/2192#note_246551
Simon convinced me that ignoring type variables existentially bound by
data constructors have to be the same way as value binders.
Sadly I couldn't think of a regression test, but I'm confident that this
change strictly improves on the status quo.
|
| |
|
|
|
|
|
|
|
|
|
| |
This changes GHC's treatment of so-called Naughty Quantification
Candidates to issue errors, instead of zapping to Any.
Close #16775.
No new test cases, because existing ones cover this well.
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Add 'dumpAction' hook to DynFlags.
It allows GHC API users to catch dumped intermediate codes and
information. The format of the dump (Core, Stg, raw text, etc.) is now
reported allowing easier automatic handling.
* Add 'traceAction' hook to DynFlags.
Some dumps go through the trace mechanism (for instance unfoldings that
have been considered for inlining). This is problematic because:
1) dumps aren't written into files even with -ddump-to-file on
2) dumps are written on stdout even with GHC API
3) in this specific case, dumping depends on unsafe globally stored
DynFlags which is bad for GHC API users
We introduce 'traceAction' hook which allows GHC API to catch those
traces and to avoid using globally stored DynFlags.
* Avoid dumping empty logs via dumpAction/traceAction (but still write
empty files to keep the existing behavior)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before this patch, GHC always printed the * kind unparenthesized.
This led to two issues:
1. Sometimes GHC printed invalid or incorrect code.
For example, GHC would print: type F @* x = x
when it meant to print: type F @(*) x = x
In the former case, instead of a kind application we were getting a
type operator (@*).
2. Sometimes GHC printed kinds that were correct but hard to read.
Should Either * Int be read as Either (*) Int
or as (*) Either Int ?
This depends on whether -XStarIsType is enabled, but it would be
easier if we didn't have to check for the flag when reading the code.
We can solve both problems by assigning (*) a different precedence. Note
that Haskell98 kinds are not affected:
((* -> *) -> *) -> * does NOT become (((*) -> (*)) -> (*)) -> (*)
The parentheses are added when (*) is used in a function argument
position:
F * * * becomes F (*) (*) (*)
F A * B becomes F A (*) B
Proxy * becomes Proxy (*)
a * -> * becomes a (*) -> *
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we checked all imported type family equations
for injectivity. This is very silly. Now, we check only
for conflicts.
Before I could even imagine doing the fix, I needed to untangle
several functions that were (in my opinion) overly complicated.
It's still not quite as perfect as I'd like, but it's good enough
for now.
Test case: typecheck/should_compile/T17405
|
|
|
|
|
|
|
|
| |
GHC Proposal #229 changes the lexical rules of Haskell, which may
require slight whitespace adjustments in certain cases.
This patch changes formatting in a few places in GHC and its testsuite
in a way that enables it to compile under the proposed rules.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is part two of fixing #17334.
There are two parts to this commit:
- A bugfix for computing loop levels
- A bugfix of basic block invariants in the NCG.
-----------------------------------------------------------
In the first bug we ended up with a CFG of the sort: [A -> B -> C]
This was represented via maps as fromList [(A,B),(B,C)] and later
transformed into a adjacency array. However the transformation did
not include block C in the array (since we only looked at the keys of
the map).
This was still fine until we tried to look up successors for C and tried
to read outside of the array bounds when accessing C.
In order to prevent this in the future I refactored to code to include
all nodes as keys in the map representation. And make this a invariant
which is checked in a few places.
Overall I expect this to make the code more robust as now any failed
lookup will represent an error, versus failed lookups sometimes being
expected and sometimes not.
In terms of performance this makes some things cheaper (getting a list
of all nodes) and others more expensive (adding a new edge). Overall
this adds up to no noteable performance difference.
-----------------------------------------------------------
Part 2: When the NCG generated a new basic block, it did
not always insert a NEWBLOCK meta instruction in the stream which
caused a quite subtle bug.
During instruction selection a statement `s`
in a block B with control of the sort: B -> C
will sometimes result in control
flow of the sort:
┌ < ┐
v ^
B -> B1 ┴ -> C
as is the case for some atomic operations.
Now to keep the CFG in sync when introducing B1 we clearly
want to insert it between B and C. However there is
a catch when we have to deal with self loops.
We might start with code and a CFG of these forms:
loop:
stmt1 ┌ < ┐
.... v ^
stmtX loop ┘
stmtY
....
goto loop:
Now we introduce B1:
┌ ─ ─ ─ ─ ─┐
loop: │ ┌ < ┐ │
instrs v │ │ ^
.... loop ┴ B1 ┴ ┘
instrsFromX
stmtY
goto loop:
This is simple, all outgoing edges from loop now simply
start from B1 instead and the code generator knows which
new edges it introduced for the self loop of B1.
Disaster strikes if the statement Y follows the same pattern.
If we apply the same rule that all outgoing edges change then
we end up with:
loop ─> B1 ─> B2 ┬─┐
│ │ └─<┤ │
│ └───<───┘ │
└───────<────────┘
This is problematic. The edge B1->B1 is modified as expected.
However the modification is wrong!
The assembly in this case looked like this:
_loop:
<instrs>
_B1:
...
cmpxchgq ...
jne _B1
<instrs>
<end _B1>
_B2:
...
cmpxchgq ...
jne _B2
<instrs>
jmp loop
There is no edge _B2 -> _B1 here. It's still a self loop onto _B1.
The problem here is that really B1 should be two basic blocks.
Otherwise we have control flow in the *middle* of a basic block.
A contradiction!
So to account for this we add yet another basic block marker:
_B:
<instrs>
_B1:
...
cmpxchgq ...
jne _B1
jmp _B1'
_B1':
<instrs>
<end _B1>
_B2:
...
Now when inserting B2 we will only look at the outgoing edges of B1' and
everything will work out nicely.
You might also wonder why we don't insert jumps at the end of _B1'. There is
no way another block ends up jumping to the labels _B1 or _B2 since they are
essentially invisible to other blocks. View them as control flow labels local
to the basic block if you'd like.
Not doing this ultimately caused (part 2 of) #17334.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes #16512.
There are lots of parts of this patch:
* The main payload is in FamInst. See
Note [Coverage condition for injective type families] there
for the overview. But it doesn't fix the bug.
* We now bump the reduction depth every time we discharge
a CFunEqCan. See Note [Flatten when discharging CFunEqCan]
in TcInteract.
* Exploration of this revealed a new, easy to maintain invariant
for CTyEqCans. See Note [Almost function-free] in TcRnTypes.
* We also realized that type inference for injectivity was a
bit incomplete. This means we exchanged lookupFlattenTyVar for
rewriteTyVar. See Note [rewriteTyVar] in TcFlatten. The new
function is monadic while the previous one was pure, necessitating
some faff in TcInteract. Nothing too bad.
* zonkCt did not maintain invariants on CTyEqCan. It's not worth
the bother doing so, so we just transmute CTyEqCans to
CNonCanonicals.
* The pure unifier was finding the fixpoint of the returned
substitution, even when doing one-way matching (in tcUnifyTysWithTFs).
Fixed now.
Test cases: typecheck/should_fail/T16512{a,b}
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For backends maintaining the CFG during codegen
we can now find loops and their nesting level.
This is based on the Cmm CFG and dominator analysis.
As a result we can estimate edge frequencies a lot better
for methods, resulting in far better code layout.
Speedup on nofib: ~1.5%
Increase in compile times: ~1.9%
To make this feasible this commit adds:
* Dominator analysis based on the Lengauer-Tarjan Algorithm.
* An algorithm estimating global edge frequences from branch
probabilities - In CFG.hs
A few static branch prediction heuristics:
* Expect to take the backedge in loops.
* Expect to take the branch NOT exiting a loop.
* Expect integer vs constant comparisons to be false.
We also treat heap/stack checks special for branch prediction
to avoid them being treated as loops.
|
|
|
|
| |
To avoid polluting the macro namespace
|
|
|
|
|
|
|
|
| |
- Remove unneeded ones
- Use <..> for inter-package.
Besides general clean up, helps distinguish between the RTS we link
against vs the RTS we compile for.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit refactors interface file generation to allow information
from the later passed (NCG, STG) to be stored in interface files.
We achieve this by splitting interface file generation into two parts:
* Partial interfaces, built based on the result of the core pipeline
* A fully instantiated interface, which also contains the final
fingerprints and can optionally contain information produced by the backend.
This change is required by !1304 and !1530.
-dynamic-too handling is refactored too: previously when generating code
we'd branch on -dynamic-too *before* code generation, but now we do it
after.
(Original code written by @AndreasK in !1530)
Performance
~~~~~~~~~~~
Before this patch interface files where created and immediately flushed
to disk which made space leaks impossible.
With this change we instead use NFData to force all iface related data
structures to avoid space leaks.
In the process of refactoring it was discovered that the code in the
ToIface Module allocated a lot of thunks which were immediately forced
when writing/forcing the interface file. So we made this module more
strict to avoid creating many of those thunks.
Bottom line is that allocations go down by about ~0.1% compared to
master.
Residency is not meaningfully different after this patch.
Runtime was not benchmarked.
Co-Authored-By: Andreas Klebinger <klebinger.andreas@gmx.at>
Co-Authored-By: Ömer Sinan Ağacan <omer@well-typed.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Using EvVars for capturing type constraints implied side-effects in DsM
when we just wanted to *construct* type constraints.
But giving names to type constraints is only necessary when passing
Givens to the type checker, of which the majority of the pattern match
checker should be unaware.
Thus, we simply generate `newtype TyCt = TyCt PredType`, which are
nicely stateless. But at the same time this means we have to allocate
EvVars when we want to query the type oracle! So we keep the type oracle
state as `newtype TyState = TySt (Bag EvVar)`, which nicely makes a
distinction between new, unchecked `TyCt`s and the inert set in
`TyState`.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we had an elaborate mechanism for selecting the warnings to
generate in the presence of different `COMPLETE` matching groups that,
albeit finely-tuned, produced wrong results from an end user's
perspective in some cases (#13363).
The underlying issue is that at the point where the `ConVar` case has to
commit to a particular `COMPLETE` group, there's not enough information
to do so and the status quo was to just enumerate all possible complete
sets nondeterministically. The `getResult` function would then pick the
outcome according to metrics defined in accordance to the user's guide.
But crucially, it lacked knowledge about the order in which affected
clauses appear, leading to the surprising behavior in #13363.
In !1010 we taught the term oracle to reason about literal values a
variable can certainly not take on. This MR extends that idea to
`ConLike`s and thereby fixes #13363: Instead of committing to a
particular `COMPLETE` group in the `ConVar` case, we now split off the
matching constructor incrementally and record the newly covered case as
a refutable shape in the oracle. Whenever the set of refutable shapes
covers any `COMPLETE` set, the oracle recognises vacuosity of the
uncovered set.
This patch goes a step further: Since at this point the information
in value abstractions is merely a cut down representation of what the
oracle knows, value abstractions degenerate to a single `Id`, the
semantics of which is determined by the oracle state `Delta`.
Value vectors become lists of `[Id]` given meaning to by a single
`Delta`, value set abstractions (of which the uncovered set is an
instance) correspond to a union of `Delta`s which instantiate the
same `[Id]` (akin to models of formula).
Fixes #11528 #13021, #13363, #13965, #14059, #14253, #14851, #15753, #17096, #17149
-------------------------
Metric Decrease:
ManyAlternatives
T11195
-------------------------
|
|
|
|
|
| |
07ee15915d5a0d6d1aeee137541eec6e9c153e65 started the transition, but the
job was never finished.
|
|
|
|
| |
Haven't been used since 16206a6603e87e15d61c57456267c5f7ba68050e.
|
|
|
|
|
|
| |
1) FastStrings are always UTF-8 encoded now.
2) Clarify what is meant by "hashed"
3) Add mention of lazy z-enc
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Having an IORef in FastString to memoize the z-encoded version is
unecessary because there is this amazing thing Haskell can do natively,
it's called "lazyness" :)
We simply remove the UNPACK and strictness annotations from the constructor
field corresponding to the z-encoding, making it lazy, and store the
(pure) z-encoded string there.
The only complication here is 'hasZEncoding' which allows cheking if a
z-encoding was computed for a given string. Since this is only used for
compiler performance statistics though it's not actually necessary to have
the current per-string granularity.
Instead I add a global IORef counter to the FastStringTable and use
unsafePerformIO to increment the counter whenever a lazy z-encoding is
forced.
|
|
|
|
|
| |
Those constructors have been added after GHC 8.8. The version guards
in `binary` are correct, see https://github.com/kolmodin/binary/pull/167/files.
|