| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In #23208 we observed that the demand signature of a binder occuring in a RULE
wasn't unleashed, leading to a transitively used binder being discarded as
absent. The solution was to use the same code path that we already use for
handling exported bindings.
See the changes to `Note [Absence analysis for stable unfoldings and RULES]`
for more details.
I took the chance to factor out the old notion of a `PlusDmdArg` (a pair of a
`VarEnv Demand` and a `Divergence`) into `DmdEnv`, which fits nicely into our
existing framework. As a result, I had to touch quite a few places in the code.
This refactoring exposed a few small bugs around correct handling of bottoming
demand environments. As a result, some strictness signatures now mention uniques
that weren't there before which caused test output changes to T13143, T19969 and
T22112. But these tests compared whole -ddump-simpl listings which is a very
fragile thing to begin with. I changed what exactly they test for based on the
symptoms in the corresponding issues.
There is a single regression in T18894 because we are more conservative around
stable unfoldings now. Unfortunately it is not easily fixed; let's wait until
there is a concrete motivation before invest more time.
Fixes #23208.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit caced75765472a1a94453f2e5a439dba0d04a265.
It seems the patch "Don't keep exit join points so much" is causing
wide-spread regressions in the bytestring library benchmarks. If I
revert it then the 9.6 numbers are better on average than 9.4.
See https://gitlab.haskell.org/ghc/ghc/-/issues/22893#note_479525
-------------------------
Metric Decrease:
MultiComponentModules
MultiComponentModulesRecomp
MultiLayerModules
MultiLayerModulesRecomp
MultiLayerModulesTH_Make
T12150
T13386
T13719
T21839c
T3294
parsing001
-------------------------
|
|
|
|
|
| |
This seems like a good idea either way, but is mostly motivated by a
patch where this avoids a module loop.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As #22725 shows, in worker/wrapper we must add the void argument
/last/, not first. See GHC.Core.Opt.WorkWrap.Utils
Note [Worker/wrapper needs to add void arg last].
That led me to to study GHC.Core.Opt.SpecConstr
Note [SpecConstr needs to add void args first] which suggests the
opposite! And indeed I think it's the other way round for SpecConstr
-- or more precisely the void arg must precede the "extra_bndrs".
That led me to some refactoring of GHC.Core.Opt.SpecConstr.calcSpecInfo.
|
|
|
|
|
|
|
|
|
| |
This patch fixes #22634. Because we don't have TYPE/CONSTRAINT
polymorphism, we need two error functions rather than one.
I took the opportunity to rname runtimeError to impossibleError,
to line up with mkImpossibleExpr, and avoid confusion with the
genuine runtime-error-constructing functions.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This big patch addresses the rats-nest of issues that have plagued
us for years, about the relationship between Type and Constraint.
See #11715/#21623.
The main payload of the patch is:
* To introduce CONSTRAINT :: RuntimeRep -> Type
* To make TYPE and CONSTRAINT distinct throughout the compiler
Two overview Notes in GHC.Builtin.Types.Prim
* Note [TYPE and CONSTRAINT]
* Note [Type and Constraint are not apart]
This is the main complication.
The specifics
* New primitive types (GHC.Builtin.Types.Prim)
- CONSTRAINT
- ctArrowTyCon (=>)
- tcArrowTyCon (-=>)
- ccArrowTyCon (==>)
- funTyCon FUN -- Not new
See Note [Function type constructors and FunTy]
and Note [TYPE and CONSTRAINT]
* GHC.Builtin.Types:
- New type Constraint = CONSTRAINT LiftedRep
- I also stopped nonEmptyTyCon being built-in; it only needs to be wired-in
* Exploit the fact that Type and Constraint are distinct throughout GHC
- Get rid of tcView in favour of coreView.
- Many tcXX functions become XX functions.
e.g. tcGetCastedTyVar --> getCastedTyVar
* Kill off Note [ForAllTy and typechecker equality], in (old)
GHC.Tc.Solver.Canonical. It said that typechecker-equality should ignore
the specified/inferred distinction when comparein two ForAllTys. But
that wsa only weakly supported and (worse) implies that we need a separate
typechecker equality, different from core equality. No no no.
* GHC.Core.TyCon: kill off FunTyCon in data TyCon. There was no need for it,
and anyway now we have four of them!
* GHC.Core.TyCo.Rep: add two FunTyFlags to FunCo
See Note [FunCo] in that module.
* GHC.Core.Type. Lots and lots of changes driven by adding CONSTRAINT.
The key new function is sORTKind_maybe; most other changes are built
on top of that.
See also `funTyConAppTy_maybe` and `tyConAppFun_maybe`.
* Fix a longstanding bug in GHC.Core.Type.typeKind, and Core Lint, in
kinding ForAllTys. See new tules (FORALL1) and (FORALL2) in GHC.Core.Type.
(The bug was that before (forall (cv::t1 ~# t2). blah), where
blah::TYPE IntRep, would get kind (TYPE IntRep), but it should be
(TYPE LiftedRep). See Note [Kinding rules for types] in GHC.Core.Type.
* GHC.Core.TyCo.Compare is a new module in which we do eqType and cmpType.
Of course, no tcEqType any more.
* GHC.Core.TyCo.FVs. I moved some free-var-like function into this module:
tyConsOfType, visVarsOfType, and occCheckExpand. Refactoring only.
* GHC.Builtin.Types. Compiletely re-engineer boxingDataCon_maybe to
have one for each /RuntimeRep/, rather than one for each /Type/.
This dramatically widens the range of types we can auto-box.
See Note [Boxing constructors] in GHC.Builtin.Types
The boxing types themselves are declared in library ghc-prim:GHC.Types.
GHC.Core.Make. Re-engineer the treatment of "big" tuples (mkBigCoreVarTup
etc) GHC.Core.Make, so that it auto-boxes unboxed values and (crucially)
types of kind Constraint. That allows the desugaring for arrows to work;
it gathers up free variables (including dictionaries) into tuples.
See Note [Big tuples] in GHC.Core.Make.
There is still work to do here: #22336. But things are better than
before.
* GHC.Core.Make. We need two absent-error Ids, aBSENT_ERROR_ID for types of
kind Type, and aBSENT_CONSTRAINT_ERROR_ID for vaues of kind Constraint.
Ditto noInlineId vs noInlieConstraintId in GHC.Types.Id.Make;
see Note [inlineId magic].
* GHC.Core.TyCo.Rep. Completely refactor the NthCo coercion. It is now called
SelCo, and its fields are much more descriptive than the single Int we used to
have. A great improvement. See Note [SelCo] in GHC.Core.TyCo.Rep.
* GHC.Core.RoughMap.roughMatchTyConName. Collapse TYPE and CONSTRAINT to
a single TyCon, so that the rough-map does not distinguish them.
* GHC.Core.DataCon
- Mainly just improve documentation
* Some significant renamings:
GHC.Core.Multiplicity: Many --> ManyTy (easier to grep for)
One --> OneTy
GHC.Core.TyCo.Rep TyCoBinder --> GHC.Core.Var.PiTyBinder
GHC.Core.Var TyCoVarBinder --> ForAllTyBinder
AnonArgFlag --> FunTyFlag
ArgFlag --> ForAllTyFlag
GHC.Core.TyCon TyConTyCoBinder --> TyConPiTyBinder
Many functions are renamed in consequence
e.g. isinvisibleArgFlag becomes isInvisibleForAllTyFlag, etc
* I refactored FunTyFlag (was AnonArgFlag) into a simple, flat data type
data FunTyFlag
= FTF_T_T -- (->) Type -> Type
| FTF_T_C -- (-=>) Type -> Constraint
| FTF_C_T -- (=>) Constraint -> Type
| FTF_C_C -- (==>) Constraint -> Constraint
* GHC.Tc.Errors.Ppr. Some significant refactoring in the TypeEqMisMatch case
of pprMismatchMsg.
* I made the tyConUnique field of TyCon strict, because I
saw code with lots of silly eval's. That revealed that
GHC.Settings.Constants.mAX_SUM_SIZE can only be 63, because
we pack the sum tag into a 6-bit field. (Lurking bug squashed.)
Fixes
* #21530
Updates haddock submodule slightly.
Performance changes
~~~~~~~~~~~~~~~~~~~
I was worried that compile times would get worse, but after
some careful profiling we are down to a geometric mean 0.1%
increase in allocation (in perf/compiler). That seems fine.
There is a big runtime improvement in T10359
Metric Decrease:
LargeRecord
MultiLayerModulesTH_OneShot
T13386
T13719
Metric Increase:
T8095
|
|
|
|
|
| |
Introduces GHC.Prelude.Basic which can be used in modules which are a
dependency of the ppr code.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes #21229 properly, by avoiding doing a
binder-swap on dictionary Ids. This is pretty subtle, and explained
in Note [Care with binder-swap on dictionaries].
Test is already in simplCore/should_run/T21229
This allows us to restore a feature to the specialiser that we had
to revert: see Note [Specialising polymorphic dictionaries].
(This is done in a separate patch.)
I also modularised things, using a new function scrutBinderSwap_maybe
in all the places where we are (effectively) doing a binder-swap,
notably
* Simplify.Iteration.addAltUnfoldings
* SpecConstr.extendCaseBndrs
In Simplify.Iteration.addAltUnfoldings I also eliminated a guard
Many <- idMult case_bndr
because we concluded, in #22123, that it was doing no good.
|
|
|
|
|
|
|
| |
We do so by having an explicit folding function that doesn't need to
allocate intermediate lists first.
Fixes #22196
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When doing performance debugging on #22084 / !8901, I found that the
algorithm in SpecConstr.decreaseSpecCount was so aggressive that if
there were /more/ specialisations available for an outer function,
that could more or less kill off specialisation for an /inner/
function. (An example was in nofib/spectral/fibheaps.)
This patch makes it a bit more aggressive, by dividing by 2, rather
than by the number of outer specialisations.
This makes the program bigger, temporarily:
T19695(normal) ghc/alloc +11.3% BAD
because we get more specialisation. But lots of other programs
compile a bit faster and the geometric mean in perf/compiler
is 0.0%.
Metric Increase:
T19695
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were religiously keeping exit join points throughout, which
had some bad effects (#21148, #22084).
This MR does two things:
* Arranges that exit join points are inhibited from inlining
only in /one/ Simplifier pass (right after Exitification).
See Note [Be selective about not-inlining exit join points]
in GHC.Core.Opt.Exitify
It's not a big deal, but it shaves 0.1% off compile times.
* Inline used-once non-recursive join points very aggressively
Given join j x = rhs in
joinrec k y = ....j x....
where this is the only occurrence of `j`, we want to inline `j`.
(Unless sm_keep_exits is on.)
See Note [Inline used-once non-recursive join points] in
GHC.Core.Opt.Simplify.Utils
This is just a tidy-up really. It doesn't change allocation, but
getting rid of a binding is always good.
Very effect on nofib -- some up and down.
|
|
|
|
|
|
|
|
|
|
|
| |
Part of proposal 475 (https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0475-tuple-syntax.rst)
Moves all tuples to GHC.Tuple.Prim
Updates ghc-prim version (and bumps bounds in dependents)
updates haddock submodule
updates deepseq submodule
updates text submodule
|
| |
|
|
|
|
|
|
|
| |
This fixes various typos and spelling mistakes
in the compiler.
Fixes #21891
|
|
|
|
|
|
| |
It turns out Solo is a very recent addition to base, so for older GHC
versions we just defined it inline here the one place we use it in the
compiler.
|
|
|
|
|
|
|
| |
The use of Solo here allows us to force the selection into the SCE to obtain
the Subst but without forcing the substitution to be applied. The resulting thunk
is placed into a lazy field which is rarely forced, so forcing it regresses
peformance.
|
|
|
|
|
|
|
|
| |
This reverts commit 851d8dd89a7955864b66a3da8b25f1dd88a503f8.
This commit was originally reverted due to an increase in space usage.
This was diagnosed as because the SCE increased in size and that was
being retained by another leak. See #22102
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As #21763 showed, we were over-specialising in some cases, when
the function involved was doing a simple 'eval', but not taking
the value apart, or branching on it.
This MR fixes the problem. See Note [Do not specialise evals].
Nofib barely budges, except that spectral/cichelli allocates about
3% less.
Compiler bytes-allocated improves a bit
geo. mean -0.1%
minimum -0.5%
maximum +0.0%
The -0.5% is on T11303b, for what it's worth.
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 415468fef8a3e9181b7eca86de0e05c0cce31729.
This refactoring introduced quite a severe residency regression (900MB
live from 650MB live when compiling mmark), see #21993 for a reproducer
and more discussion.
Ticket #21993
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch removes the TCvSubst data type and instead uses Subst as
the environment for both term and type level substitution. This
change is partially motivated by the existential type proposal,
which will introduce types that contain expressions and therefore
forces us to carry around an "IdSubstEnv" even when substituting for
types. It also reduces the amount of code because "Subst" and
"TCvSubst" share a lot of common operations. There isn't any
noticeable impact on performance (geo. mean for ghc/alloc is around
0.0% but we have -94 loc and one less data type to worry abount).
Currently, the "TCvSubst" data type for substitution on types is
identical to the "Subst" data type except the former doesn't store
"IdSubstEnv". Using "Subst" for type-level substitution means there
will be a redundant field stored in the data type. However, in cases
where the substitution starts from the expression, using "Subst" for
type-level substitution saves us from having to project "Subst" into a
"TCvSubst". This probably explains why the allocation is mostly even
despite the redundant field.
The patch deletes "TCvSubst" and moves "Subst" and its relevant
functions from "GHC.Core.Subst" into "GHC.Core.TyCo.Subst".
Substitution on expressions is still defined in "GHC.Core.Subst" so we
don't have to expose the definition of "Expr" in the hs-boot file that
"GHC.Core.TyCo.Subst" must import to refer to "IdSubstEnv" (whose
codomain is "CoreExpr"). Most functions named fooTCvSubst are renamed
into fooSubst with a few exceptions (e.g. "isEmptyTCvSubst" is a
distinct function from "isEmptySubst"; the former ignores the
emptiness of "IdSubstEnv"). These exceptions mainly exist for
performance reasons and will go away when "Expr" and "Type" are
mutually recursively defined (we won't be able to take those
shortcuts if we can't make the assumption that expressions don't
appear in types).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch addresses #21831, point 2. See
Note [generaliseDictPats] in SpecConstr
I took the opportunity to refactor the construction of specialisation
rules a bit, so that the rule name says what type we are specialising
at.
Surprisingly, there's a 20% decrease in compile time for test
perf/compiler/T18223. I took a look at it, and the code size seems the
same throughout. I did a quick ticky profile which seemed to show a
bit less substitution going on. Hmm. Maybe it's the "don't do
eta-expansion in stable unfoldings" patch, which is part of the
same MR as this patch.
Anyway, since it's a move in the right direction, I didn't think it
was worth looking into further.
Metric Decrease:
T18223
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch, provoked by #21457, simplifies SpecConstr by treating
top-level and nested bindings uniformly (see the new scBind).
* Eliminates the mysterious scTopBindEnv
* Refactors scBind to handle top-level and nested definitions
uniformly.
* But, for now at least, continues the status quo of not doing
SpecConstr for top-level non-recursive bindings. (In contrast
we do specialise nested non-recursive bindings, although the
original paper did not; see Note [Local let bindings].)
I tried the effect of specialising top-level non-recursive
bindings (which is now dead easy to switch on, unlike before)
but found some regressions, so I backed off. See !8135.
It's a pure refactoring. I think it'll do a better job in a few
cases, but there is no regression test.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We used to put OtherCon unfoldings on lambda binders of workers
and sometimes also join points/specializations with with the
assumption that since the wrapper would force these arguments
once we execute the RHS they would indeed be in WHNF.
This was wrong for reasons detailed in #21472. So now we purge
evaluated unfoldings from *all* lambda binders.
This fixes #21472, but at the cost of sometimes not using as efficient a
calling convention. It can also change inlining behaviour as some
occurances will no longer look like value arguments when they did
before.
As consequence we also change how we compute CBV information for
arguments slightly. We now *always* determine the CBV convention
for arguments during tidy. Earlier in the pipeline we merely mark
functions as candidates for having their arguments treated as CBV.
As before the process is described in the relevant notes:
Note [CBV Function Ids]
Note [Attaching CBV Marks to ids]
Note [Never put `OtherCon` unfoldigns on lambda binders]
-------------------------
Metric Decrease:
T12425
T13035
T18223
T18223
T18923
MultiLayerModulesTH_OneShot
Metric Increase:
WWRec
-------------------------
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
too hard
Progress towards #17957
Because of `CoreM`, I did not move the `DynFlags` and `HscEnv` to other
modules as thoroughly as I usually do. This does mean that risk of
`DynFlags` "creeping back in" is higher than it usually is.
After we do the same process to the other Core passes, and then figure
out what we want to do about `CoreM`, we can finish the job started
here.
That is a good deal more work, however, so it certainly makes sense to
land this now.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch addresses a relatively obscure situation that arose
when chasing perf regressions in !7847, which itself is fixing
It does two things:
* SpecConstr can specialise on ($df d1 d2) dictionary arguments
* FloatOut no longer checks argument strictness
See Note [Specialising on dictionaries] in GHC.Core.Opt.SpecConstr.
A test case is difficult to construct, but it makes a big difference
in nofib/real/eff/VSM, at least when we have the patch for #21286
installed. (The latter stops worker/wrapper for dictionary arguments).
There is a spectacular, but slightly illusory, improvement in
runtime perf on T15426. I have documented the specifics in
T15426 itself.
Metric Decrease:
T15426
|
|
|
|
|
|
|
|
| |
- Remove groupWithName (unused)
- Use the RuntimeRepType synonym where possible
- Replace getUniqueM + mkSysLocalOrCoVar with mkSysLocalOrCoVarM
No functional changes.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
applications
The main fix is that in addVoidWorkerArg we now add the argument to the front.
This fixes #21448.
-------------------------
Metric Decrease:
T16875
-------------------------
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In GHC.Core.Opt.Specialise.bindAuxiliaryDict we were unnecessarily
calling `extendInScope` to bring into scope variables that were
/already/ in scope. Worse, GHC.Core.Subst.extendInScope strangely
deleted the newly-in-scope variables from the substitution -- and that
was fatal in #21391.
I removed the redundant calls to extendInScope.
More ambitiously, I changed GHC.Core.Subst.extendInScope (and cousins)
to stop deleting variables from the substitution. I even changed the
names of the function to extendSubstInScope (and cousins) and audited
all the calls to check that deleting from the substitution was wrong.
In fact there are very few such calls, and they are all about
introducing a fresh non-in-scope variable. These are "OutIds"; it is
utterly wrong to mess with the "InId" substitution.
I have not added a Note, because I'm deleting wrong code, and it'd be
distracting to document a bug.
|
|
|
|
| |
Fixes #20935 and #20924
|
|
|
|
|
|
|
|
|
|
|
|
| |
A new pragma, `OPAQUE`, that ensures that every call of a named
function annotated with an `OPAQUE` pragma remains a call of that
named function, not some name-mangled variant.
Implements GHC proposal 0415:
https://github.com/ghc-proposals/ghc-proposals/blob/master/proposals/0415-opaque-pragma.rst
This commit also updates the haddock submodule to handle the newly
introduced lexer tokens corresponding to the OPAQUE pragma.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As #20837 pointed out, `isLiftedType_maybe` returned `Just False` in
many situations where it should return `Nothing`, because it didn't
take into account type families or type variables.
In this patch, we fix this issue. We rename `isLiftedType_maybe` to
`typeLevity_maybe`, which now returns a `Levity` instead of a boolean.
We now return `Nothing` for types with kinds of the form
`TYPE (F a1 ... an)` for a type family `F`, as well as
`TYPE (BoxedRep l)` where `l` is a type variable.
This fix caused several other problems, as other parts of the compiler
were relying on `isLiftedType_maybe` returning a `Just` value, and were
now panicking after the above fix. There were two main situations in
which panics occurred:
1. Issues involving the let/app invariant. To uphold that invariant,
we need to know whether something is lifted or not. If we get an
answer of `Nothing` from `isLiftedType_maybe`, then we don't know
what to do. As this invariant isn't particularly invariant, we
can change the affected functions to not panic, e.g. by behaving
the same in the `Just False` case and in the `Nothing` case
(meaning: no observable change in behaviour compared to before).
2. Typechecking of data (/newtype) constructor patterns. Some programs
involving patterns with unknown representations were accepted, such
as T20363. Now that we are stricter, this caused further issues,
culminating in Core Lint errors. However, the behaviour was
incorrect the whole time; the incorrectness only being revealed by
this change, not triggered by it.
This patch fixes this by overhauling where the representation
polymorphism involving pattern matching are done. Instead of doing
it in `tcMatches`, we instead ensure that the `matchExpected`
functions such as `matchExpectedFunTys`, `matchActualFunTySigma`,
`matchActualFunTysRho` allow return argument pattern types which
have a fixed RuntimeRep (as defined in Note [Fixed RuntimeRep]).
This ensures that the pattern matching code only ever handles types
with a known runtime representation. One exception was that
patterns with an unknown representation type could sneak in via
`tcConPat`, which points to a missing representation-polymorphism
check, which this patch now adds.
This means that we now reject the program in #20363, at least until
we implement PHASE 2 of FixedRuntimeRep (allowing type families in
RuntimeRep positions). The aforementioned refactoring, in which
checks have been moved to `matchExpected` functions, is a first
step in implementing PHASE 2 for patterns.
Fixes #20837
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Issue #21150 shows that worker/wrapper allocated a worker function for a
function with multiple calls that said "called at most once" when the first
argument was absent. That's bad!
This patch makes it so that WW preserves at least one non-one-shot value lambda
(see `Note [Preserving float barriers]`) by passing around `void#` in place of
absent arguments.
Fixes #21150.
Since the fix is pretty similar to `Note [Protecting the last value argument]`,
I put the logic in `mkWorkerArgs`. There I realised (#21204) that
`-ffun-to-thunk` is basically useless with `-ffull-laziness`, so I deprecated
the flag, simplified and split into `needsVoidWorkerArg`/`addVoidWorkerArg`.
SpecConstr is another client of that API.
Fixes #21204.
Metric Decrease:
T14683
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This does three major things:
* Enforce the invariant that all strict fields must contain tagged
pointers.
* Try to predict the tag on bindings in order to omit tag checks.
* Allows functions to pass arguments unlifted (call-by-value).
The former is "simply" achieved by wrapping any constructor allocations with
a case which will evaluate the respective strict bindings.
The prediction is done by a new data flow analysis based on the STG
representation of a program. This also helps us to avoid generating
redudant cases for the above invariant.
StrictWorkers are created by W/W directly and SpecConstr indirectly.
See the Note [Strict Worker Ids]
Other minor changes:
* Add StgUtil module containing a few functions needed by, but
not specific to the tag analysis.
-------------------------
Metric Decrease:
T12545
T18698b
T18140
T18923
LargeRecord
Metric Increase:
LargeRecord
ManyAlternatives
ManyConstructors
T10421
T12425
T12707
T13035
T13056
T13253
T13253-spj
T13379
T15164
T18282
T18304
T18698a
T1969
T20049
T3294
T4801
T5321FD
T5321Fun
T783
T9233
T9675
T9961
T19695
WWRec
-------------------------
|
| |
|
|
|
|
| |
This makes it more similar to pprTrace, pprPanic etc.
|
| |
|
|
|
|
|
| |
`DataConEnv` will prove to be useful in another place besides
`GHC.Core.Opt.SpecConstr` in a follow-up commit.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes some abundant reboxing of `DynFlags` in
`GHC.HsToCore.Match.Literal.warnAboutOverflowedLit` (which was the topic
of #19407) by introducing a Boxity analysis to GHC, done as part of demand
analysis. This allows to accurately capture ad-hoc unboxing decisions previously
made in worker/wrapper in demand analysis now, where the boxity info can
propagate through demand signatures.
See the new `Note [Boxity analysis]`. The actual fix for #19407 is described in
`Note [No lazy, Unboxed demand in demand signature]`, but
`Note [Finalising boxity for demand signature]` is probably a better entry-point.
To support the fix for #19407, I had to change (what was)
`Note [Add demands for strict constructors]` a bit
(now `Note [Unboxing evaluated arguments]`). In particular, we now take care of
it in `finaliseBoxity` (which is only called from demand analaysis) instead of
`wantToUnboxArg`.
I also had to resurrect `Note [Product demands for function body]` and rename
it to `Note [Unboxed demand on function bodies returning small products]` to
avoid huge regressions in `join004` and `join007`, thereby fixing #4267 again.
See the updated Note for details.
A nice side-effect is that the worker/wrapper transformation no longer needs to
look at strictness info and other bits such as `InsideInlineableFun` flags
(needed for `Note [Do not unbox class dictionaries]`) at all. It simply collects
boxity info from argument demands and interprets them with a severely simplified
`wantToUnboxArg`. All the smartness is in `finaliseBoxity`, which could be moved
to DmdAnal completely, if it wasn't for the call to `dubiousDataConInstArgTys`
which would be awkward to export.
I spent some time figuring out the reason for why `T16197` failed prior to my
amendments to `Note [Unboxing evaluated arguments]`. After having it figured
out, I minimised it a bit and added `T16197b`, which simply compares computed
strictness signatures and thus should be far simpler to eyeball.
The 12% ghc/alloc regression in T11545 is because of the additional `Boxity`
field in `Poly` and `Prod` that results in more allocation during `lubSubDmd`
and `plusSubDmd`. I made sure in the ticky profiles that the number of calls
to those functions stayed the same. We can bear such an increase here, as we
recently improved it by -68% (in b760c1f).
T18698* regress slightly because there is more unboxing of dictionaries
happening and that causes Lint (mostly) to allocate more.
Fixes #19871, #19407, #4267, #16859, #18907 and #13331.
Metric Increase:
T11545
T18698a
T18698b
Metric Decrease:
T12425
T16577
T18223
T18282
T4267
T9961
|
|
|
|
|
|
|
|
| |
Now that Outputable is independent of DynFlags, we can put tracing
functions using SDocs into their own module that doesn't transitively
depend on any GHC.Driver.* module.
A few modules needed to be moved to avoid loops in DEBUG mode.
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Replace uses of WARN macro with calls to:
warnPprTrace :: Bool -> SDoc -> a -> a
Remove the now unused HsVersions.h
Bump haddock submodule
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is no reason to use CPP. __LINE__ and __FILE__ macros are now
better replaced with GHC's CallStack. As a bonus, assert error messages
now contain more information (function name, column).
Here is the mapping table (HasCallStack omitted):
* ASSERT: assert :: Bool -> a -> a
* MASSERT: massert :: Bool -> m ()
* ASSERTM: assertM :: m Bool -> m ()
* ASSERT2: assertPpr :: Bool -> SDoc -> a -> a
* MASSERT2: massertPpr :: Bool -> SDoc -> m ()
* ASSERTM2: assertPprM :: m Bool -> SDoc -> m ()
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In GHC.Core.Opt.SpecConstr.spec_one we were giving join-points an
incorrect join-arity -- this was fallout from
commit c71b220491a6ae46924cc5011b80182bcc773a58
Author: Simon Peyton Jones <simonpj@microsoft.com>
Date: Thu Apr 8 23:36:24 2021 +0100
Improvements in SpecConstr
* Allow under-saturated calls to specialise
See Note [SpecConstr call patterns]
This just allows a bit more specialisation to take place.
and showed up in #19780. I refactored the code to make the new
function calcSpecInfo which treats join points separately.
In doing this I discovered two other small bugs:
* In the Var case of argToPat we were treating UnkOcc as
uninteresting, but (by omission) NoOcc as interesting. As a
result we were generating SpecConstr specialisations for functions
with unused arguments. But the absence anlyser does that much
better; doing it here just generates more code. Easily fixed.
* The lifted/unlifted test in GHC.Core.Opt.WorkWrap.Utils.mkWorkerArgs
was back to front (#19794). Easily fixed.
* In the same function, mkWorkerArgs, we were adding an extra argument
nullary join points, which isn't necessary. I added a test for
this. That in turn meant I had to remove an ASSERT in
CoreToStg.mkStgRhs for nullary join points, which was always bogus
but now trips; I added a comment to explain.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1. `text` is as efficient as `ptext . sLit` thanks to the rewrite rules
2. `text` is visually nicer than `ptext . sLit`
3. `ptext . sLit` encourages using one `ptext` for several `sLit` as in:
ptext $ case xy of
... -> sLit ...
... -> sLit ...
which may allocate SDoc's TextBeside constructors at runtime instead
of sharing them into CAFs.
|
|
|
|
|
|
|
|
|
| |
Plus a few minor refactorings:
* Introduce `normSplitTyConApp_maybe` to Core.Utils
* Reduce boolean blindness in the Bool argument to `wantToUnbox`
* Let `wantToUnbox` also decide when to drop an argument, cleaning up
`mkWWstr_one`
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Allow under-saturated calls to specialise
See Note [SpecConstr call patterns]
This just allows a bit more specialisation to take place.
* Don't discard calls from un-specialised RHSs. This was
a plain bug in `specialise`, again leading to loss of
specialisation. Refactoring yields an `otherwise`
case that is easier to grok.
* I refactored CallPat to become a proper data type, not a tuple.
All this came up when I was working on eta-reduction. The ticket
is #19672.
The nofib results are mostly zero, with a couple of big wins:
Program Size Allocs Runtime Elapsed TotalMem
--------------------------------------------------------------------------------
awards +0.2% -0.1% -18.7% -18.8% 0.0%
comp_lab_zift +0.2% -0.2% -23.9% -23.9% 0.0%
fft2 +0.2% -1.0% -34.9% -36.6% 0.0%
hpg +0.2% -0.3% -18.4% -18.4% 0.0%
mate +0.2% -15.7% -19.3% -19.3% +11.1%
parser +0.2% +0.6% -16.3% -16.3% 0.0%
puzzle +0.4% -19.7% -33.7% -34.0% 0.0%
rewrite +0.2% -0.5% -20.7% -20.7% 0.0%
--------------------------------------------------------------------------------
Min +0.2% -19.7% -48.1% -48.9% 0.0%
Max +0.4% +0.6% -1.2% -1.1% +11.1%
Geometric Mean +0.2% -0.4% -21.0% -21.1% +0.1%
I investigated the 0.6% increase on 'parser'. It comes because SpecConstr
has a limit of 3 specialisations. With HEAD, hsDoExpr has 2
specialisations, and then a further several from the specialised
bodies, of which 1 is picked. With this patch we get 3
specialisations right off the bat, so we discard all from the
recursive calls. Turns out that that's not the best choice, but there
is no way to tell that. I'm accepting it.
NB: these figures actually come from this patch plus the preceding one for
StgCSE, but I think the gains come from SpecConstr.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In #19597, we also settled on the following renamings:
* `idStrictness` -> `idDmdSig`,
`strictnessInfo` -> `dmdSigInfo`,
`HsStrictness` -> `HsDmdSig`
* `idCprInfo` -> `idCprSig`,
`cprInfo` -> `cprSigInfo`,
`HsCpr` -> `HsCprSig`
Fixes #19597.
|
|
|
|
|
| |
This commit adds the `lint:compiler` Hadrian target to the CI runner.
It does also fixes hints in the compiler/ and libraries/base/ codebases.
|
|
|
|
|
| |
Metric Increase:
MultiLayerModules
|