summaryrefslogtreecommitdiff
path: root/compiler/GHC/Stg/Lint.hs
Commit message (Collapse)AuthorAgeFilesLines
* Refactored Simplify passDominik Peteler2022-07-221-5/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * Removed references to driver from GHC.Core.LateCC, GHC.Core.Simplify namespace and GHC.Core.Opt.Stats. Also removed services from configuration records. * Renamed GHC.Core.Opt.Simplify to GHC.Core.Opt.Simplify.Iteration. * Inlined `simplifyPgm` and renamed `simplifyPgmIO` to `simplifyPgm` and moved the Simplify driver to GHC.Core.Opt.Simplify. * Moved `SimplMode` and `FloatEnable` to GHC.Core.Opt.Simplify.Env. * Added a configuration record `TopEnvConfig` for the `SimplTopEnv` environment in GHC.Core.Opt.Simplify.Monad. * Added `SimplifyOpts` and `SimplifyExprOpts`. Provide initialization functions for those in a new module GHC.Driver.Config.Core.Opt.Simplify. Also added initialization functions for `SimplMode` to that module. * Moved `CoreToDo` and friends to a new module GHC.Core.Pipeline.Types and the counting types and functions (`SimplCount` and `Tick`) to new module GHC.Core.Opt.Stats. * Added getter functions for the fields of `SimplMode`. The pedantic bottoms option and the platform are retrieved from the ArityOpts and RuleOpts and the getter functions allow us to retrieve values from `SpecEnv` without the knowledge where the data is stored exactly. * Moved the coercion optimization options from the top environment to `SimplMode`. This way the values left in the top environment are those dealing with monadic functionality, namely logging, IO related stuff and counting. Added a note "The environments of the Simplify pass". * Removed `CoreToDo` from GHC.Core.Lint and GHC.CoreToStg.Prep and got rid of `CoreDoSimplify`. Pass `SimplifyOpts` in the `CoreToDo` type instead. * Prep work before removing `InteractiveContext` from `HscEnv`.
* Don't mark lambda binders as OtherConAndreas Klebinger2022-06-271-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We used to put OtherCon unfoldings on lambda binders of workers and sometimes also join points/specializations with with the assumption that since the wrapper would force these arguments once we execute the RHS they would indeed be in WHNF. This was wrong for reasons detailed in #21472. So now we purge evaluated unfoldings from *all* lambda binders. This fixes #21472, but at the cost of sometimes not using as efficient a calling convention. It can also change inlining behaviour as some occurances will no longer look like value arguments when they did before. As consequence we also change how we compute CBV information for arguments slightly. We now *always* determine the CBV convention for arguments during tidy. Earlier in the pipeline we merely mark functions as candidates for having their arguments treated as CBV. As before the process is described in the relevant notes: Note [CBV Function Ids] Note [Attaching CBV Marks to ids] Note [Never put `OtherCon` unfoldigns on lambda binders] ------------------------- Metric Decrease: T12425 T13035 T18223 T18223 T18923 MultiLayerModulesTH_OneShot Metric Increase: WWRec -------------------------
* Misc cleanupKrzysztof Gogolewski2022-05-161-10/+7
| | | | | | | | - Remove groupWithName (unused) - Use the RuntimeRepType synonym where possible - Replace getUniqueM + mkSysLocalOrCoVar with mkSysLocalOrCoVarM No functional changes.
* StgLint: Check that functions are applied to compatible runtime repsAndreas Klebinger2022-05-011-21/+132
| | | | | We use compatibleRep to compare reps, and avoid checking functions with levity polymorphic types because of #21399.
* StgLint: Lint constructor applications and strict workers for arity.Andreas Klebinger2022-04-141-4/+25
| | | | | | | This will mean T9208 when run with lint will return a lint error instead of resulting in a panic. Fixes #21117
* Drop the app invariantghc-9.5-startJoachim Breitner2022-04-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | previously, GHC had the "let/app-invariant" which said that the RHS of a let or the argument of an application must be of lifted type or ok for speculation. We want this on let to freely float them around, and we wanted that on app to freely convert between the two (e.g. in beta-reduction or inlining). However, the app invariant meant that simple code didn't stay simple and this got in the way of rules matching. By removing the app invariant, this thus fixes #20554. The new invariant is now called "let-can-float invariant", which is hopefully easier to guess its meaning correctly. Dropping the app invariant means that everywhere where we effectively do beta-reduction (in the two simplifiers, but also in `exprIsConApp_maybe` and other innocent looking places) we now have to check if the argument must be evaluated (unlifted and side-effecting), and analyses have to be adjusted to the new semantics of `App`. Also, `LetFloats` in the simplifier can now also carry such non-floating bindings. The fix for DmdAnal, refine by Sebastian, makes functions with unlifted arguments strict in these arguments, which changes some signatures. This causes some extra calls to `exprType` and `exprOkForSpeculation`, so some perf benchmarks regress a bit (while others improve). Metric Decrease: T9020 Metric Increase: LargeRecord T12545 T15164 T16577 T18223 T5642 T9961 Co-authored-by: Sebastian Graf <sebastian.graf@kit.edu>
* GenStgAlt 3-tuple synonym --> Record typedoyougnu2022-03-021-6/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit alters GenStgAlt from a type synonym to a Record with field accessors. In pursuit of #21078, this is not a required change but cleans up several areas for nicer code in the upcoming js-backend, and in GHC itself. GenStgAlt: 3-tuple -> record Stg.Utils: GenStgAlt 3-tuple -> record Stg.Stats: StgAlt 3-tuple --> record Stg.InferTags.Rewrite: StgAlt 3-tuple -> record Stg.FVs: GenStgAlt 3-tuple -> record Stg.CSE: GenStgAlt 3-tuple -> record Stg.InferTags: GenStgAlt 3-tuple --> record Stg.Debug: GenStgAlt 3-tuple --> record Stg.Lift.Analysis: GenStgAlt 3-tuple --> record Stg.Lift: GenStgAlt 3-tuple --> record ByteCode.Instr: GenStgAlt 3-tuple --> record Stg.Syntax: add GenStgAlt helper functions Stg.Unarise: GenStgAlt 3-tuple --> record Stg.BcPrep: GenStgAlt 3-tuple --> record CoreToStg: GenStgAlt 3-tuple --> record StgToCmm.Expr: GenStgAlt 3-tuple --> record StgToCmm.Bind: GenStgAlt 3-tuple --> record StgToByteCode: GenStgAlt 3-tuple --> record Stg.Lint: GenStgAlt 3-tuple --> record Stg.Syntax: strictify GenStgAlt GenStgAlt: add haddock, some cleanup fixup: remove calls to pure, single ViewPattern StgToByteCode: use case over viewpatterns
* Tag inference work.Andreas Klebinger2022-02-121-1/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This does three major things: * Enforce the invariant that all strict fields must contain tagged pointers. * Try to predict the tag on bindings in order to omit tag checks. * Allows functions to pass arguments unlifted (call-by-value). The former is "simply" achieved by wrapping any constructor allocations with a case which will evaluate the respective strict bindings. The prediction is done by a new data flow analysis based on the STG representation of a program. This also helps us to avoid generating redudant cases for the above invariant. StrictWorkers are created by W/W directly and SpecConstr indirectly. See the Note [Strict Worker Ids] Other minor changes: * Add StgUtil module containing a few functions needed by, but not specific to the tag analysis. ------------------------- Metric Decrease: T12545 T18698b T18140 T18923 LargeRecord Metric Increase: LargeRecord ManyAlternatives ManyConstructors T10421 T12425 T12707 T13035 T13056 T13253 T13253-spj T13379 T15164 T18282 T18304 T18698a T1969 T20049 T3294 T4801 T5321FD T5321Fun T783 T9233 T9675 T9961 T19695 WWRec -------------------------
* Purge DynFlags from GHC.StgJohn Ericson2022-02-061-7/+4
| | | | | Also derive some more instances. GHC doesn't need them, but downstream consumers may need to e.g. put stuff in maps.
* Increase type sharingBen Gamari2021-11-171-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes #20541 by making mkTyConApp do more sharing of types. In particular, replace * BoxedRep Lifted ==> LiftedRep * BoxedRep Unlifted ==> UnliftedRep * TupleRep '[] ==> ZeroBitRep * TYPE ZeroBitRep ==> ZeroBitType In each case, the thing on the right is a type synonym for the thing on the left, declared in ghc-prim:GHC.Types. See Note [Using synonyms to compress types] in GHC.Core.Type. The synonyms for ZeroBitRep and ZeroBitType are new, but absolutely in the same spirit as the other ones. (These synonyms are mainly for internal use, though the programmer can use them too.) I also renamed GHC.Core.Ty.Rep.isVoidTy to isZeroBitTy, to be compatible with the "zero-bit" nomenclature above. See discussion on !6806. There is a tricky wrinkle: see GHC.Core.Types Note [Care using synonyms to compress types] Compiler allocation decreases by up to 0.8%.
* Dynflags: introduce DiagOptsSylvain Henry2021-07-011-27/+35
| | | | | | | | | | | | | | | | | | | | | | Use DiagOpts for diagnostic options instead of directly querying DynFlags (#17957). Surprising performance improvements on CI: T4801(normal) ghc/alloc 313236344.0 306515216.0 -2.1% GOOD T9961(normal) ghc/alloc 384502736.0 380584384.0 -1.0% GOOD ManyAlternatives(normal) ghc/alloc 797356128.0 786644928.0 -1.3% ManyConstructors(normal) ghc/alloc 4389732432.0 4317740880.0 -1.6% T783(normal) ghc/alloc 408142680.0 402812176.0 -1.3% Metric Decrease: T4801 T9961 T783 ManyAlternatives ManyConstructors Bump haddock submodule
* Make Logger independent of DynFlagsSylvain Henry2021-06-071-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce LogFlags as a independent subset of DynFlags used for logging. As a consequence in many places we don't have to pass both Logger and DynFlags anymore. The main reason for this refactoring is that I want to refactor the systools interfaces: for now many systools functions use DynFlags both to use the Logger and to fetch their parameters (e.g. ldInputs for the linker). I'm interested in refactoring the way they fetch their parameters (i.e. use dedicated XxxOpts data types instead of DynFlags) for #19877. But if I did this refactoring before refactoring the Logger, we would have duplicate parameters (e.g. ldInputs from DynFlags and linkerInputs from LinkerOpts). Hence this patch first. Some flags don't really belong to LogFlags because they are subsystem specific (e.g. most DumpFlags). For example -ddump-asm should better be passed in NCGConfig somehow. This patch doesn't fix this tight coupling: the dump flags are part of the UI but they are passed all the way down for example to infer the file name for the dumps. Because LogFlags are a subset of the DynFlags, we must update the former when the latter changes (not so often). As a consequence we now use accessors to read/write DynFlags in HscEnv instead of using `hsc_dflags` directly. In the process I've also made some subsystems less dependent on DynFlags: - CmmToAsm: by passing some missing flags via NCGConfig (see new fields in GHC.CmmToAsm.Config) - Core.Opt.*: - by passing -dinline-check value into UnfoldingOpts - by fixing some Core passes interfaces (e.g. CallArity, FloatIn) that took DynFlags argument for no good reason. - as a side-effect GHC.Core.Opt.Pipeline.doCorePass is much less convoluted.
* Compute Severity of diagnostics at birthAlfredo Di Napoli2021-04-011-23/+24
| | | | | | | | | | | | | | | | | | | | | This commit further expand on the design for #18516 by getting rid of the `defaultReasonSeverity` in favour of a function called `diagReasonSeverity` which correctly takes the `DynFlags` as input. The idea is to compute the `Severity` and the `DiagnosticReason` of each message "at birth", without doing any later re-classifications, which are potentially error prone, as the `DynFlags` might evolve during the course of the program. In preparation for a proper refactoring, now `pprWarning` from the Parser.Ppr module has been renamed to `mkParserWarn`, which now takes a `DynFlags` as input. We also get rid of the reclassification we were performing inside `printOrThrowWarnings`. Last but not least, this commit removes the need for reclassify inside GHC.Tc.Errors, and also simplifies the implementation of `maybeReportError`. Update Haddock submodule
* Add `MessageClass`, rework `Severity` and add `DiagnosticReason`.wip/adinapoli-message-class-new-designAlfredo Di Napoli2021-03-291-3/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Other than that: * Fix T16167,json,json2,T7478,T10637 tests to reflect the introduction of the `MessageClass` type * Remove `makeIntoWarning` * Remove `warningsToMessages` * Refactor GHC.Tc.Errors 1. Refactors GHC.Tc.Errors so that we use `DiagnosticReason` for "choices" (defer types errors, holes, etc); 2. We get rid of `reportWarning` and `reportError` in favour of a general `reportDiagnostic`. * Introduce `DiagnosticReason`, `Severity` is an enum: This big commit makes `Severity` a simple enumeration, and introduces the concept of `DiagnosticReason`, which classifies the /reason/ why we are emitting a particular diagnostic. It also adds a monomorphic `DiagnosticMessage` type which is used for generic messages. * The `Severity` is computed (for now) from the reason, statically. Later improvement will add a `diagReasonSeverity` function to compute the `Severity` taking `DynFlags` into account. * Rename `logWarnings` into `logDiagnostics` * Add note and expand description of the `mkHoleError` function
* Generate GHCi bytecode from STG instead of Core and support unboxedLuite Stegeman2021-03-201-2/+6
| | | | | | tuples and sums. fixes #1257
* Implement the UnliftedDatatypes extensionSebastian Graf2021-03-141-3/+5
| | | | | | | | | | | | | | | | | | GHC Proposal: 0265-unlifted-datatypes.rst Discussion: https://github.com/ghc-proposals/ghc-proposals/pull/265 Issues: https://gitlab.haskell.org/ghc/ghc/-/issues/19523 Implementation Details: Note [Implementation of UnliftedDatatypes] This patch introduces the `UnliftedDatatypes` extension. When this extension is enabled, GHC relaxes the restrictions around what result kinds are allowed in data declarations. This allows data types for which an unlifted or levity-polymorphic result kind is inferred. The most significant changes are in `GHC.Tc.TyCl`, where `Note [Implementation of UnliftedDatatypes]` describes the details of the implementation. Fixes #19523.
* Add option to give each usage of a data constructor its own info tableMatthew Pickering2021-03-031-3/+3
| | | | | | | | | | | | | The `-fdistinct-constructor-tables` flag will generate a fresh info table for the usage of any data constructor. This is useful for debugging as now by inspecting the info table, you can determine which usage of a constructor caused that allocation rather than the old situation where the info table always mapped to the definition site of the data constructor which is useless. In conjunction with `-hi` and `-finfo-table-map` this gives a more fine grained understanding of where constructor allocations arise from in a program.
* Refactor LoggerSylvain Henry2021-02-131-4/+6
| | | | | | | | | | | | | | | | | | | | | Before this patch, the only way to override GHC's default logging behavior was to set `log_action`, `dump_action` and `trace_action` fields in DynFlags. This patch introduces a new Logger abstraction and stores it in HscEnv instead. This is part of #17957 (avoid storing state in DynFlags). DynFlags are duplicated and updated per-module (because of OPTIONS_GHC pragma), so we shouldn't store global state in them. This patch also fixes a race in parallel "--make" mode which updated the `generatedDumps` IORef concurrently. Bump haddock submodule The increase in MultilayerModules is tracked in #19293. Metric Increase: MultiLayerModules
* Remove ErrDoc and MsgDocAlfredo Di Napoli2021-02-011-7/+7
| | | | | | | | | | | | | This commit boldly removes the ErrDoc and the MsgDoc from the codebase. The former was introduced with the only purpose of classifying errors according to their importance, but a similar result can be obtained just by having a simple [SDoc], and placing bullets after each of them. On top of that I have taken the perhaps controversial decision to also banish MsgDoc, as it was merely a type alias over an SDoc and as such it wasn't offering any extra type safety. Granted, it was perhaps making type signatures slightly more "focused", but at the expense of cognitive burden: if it's really just an SDoc, let's call it with its proper name.
* Remove StgLamLeif Metcalf2021-01-291-4/+0
| | | | | | | | | | | | | | | | StgLam is used exclusively in the work of CoreToStg, but there's nothing in the type of StgExpr that indicates this, so we're forced throughout the Stg.* codebase to handle cases like: case expr of ... StgLam lam -> panic "Unexpected StgLam" ... This patch removes the StgLam constructor from the base StgExpr so these cases no longer need to be handled. Instead, we use a new intermediate type in CoreToStg, PreStgRhs, to represent the RHS expression of a binding.
* Lint the compiler for extraneous LANGUAGE pragmasHécate2020-10-101-2/+2
|
* Introduce isBoxedTupleDataCon and use it to fix #18644Ryan Scott2020-09-041-2/+2
| | | | | | | | | | | | | | | | | | | | The code that converts promoted tuple data constructors to `IfaceType`s in `GHC.CoreToIface` was using `isTupleDataCon`, which conflates boxed and unboxed tuple data constructors. To avoid this, this patch introduces `isBoxedTupleDataCon`, which is like `isTupleDataCon` but only works for _boxed_ tuple data constructors. While I was in town, I was horribly confused by the fact that there were separate functions named `isUnboxedTupleCon` and `isUnboxedTupleTyCon` (similarly, `isUnboxedSumCon` and `isUnboxedSumTyCon`). It turns out that the former only works for data constructors, despite its very general name! I opted to rename `isUnboxedTupleCon` to `isUnboxedTupleDataCon` (similarly, I renamed `isUnboxedSumCon` to `isUnboxedSumDataCon`) to avoid this potential confusion, as well as to be more consistent with the naming convention I used for `isBoxedTupleDataCon`. Fixes #18644.
* DynFlags: remove use of sdocWithDynFlags from GHC.Stg.* (#17957)Sylvain Henry2020-07-211-37/+49
| | | | | | | | | * add StgPprOpts datatype * remove Outputable instances for types that need `StgPprOpts` to be pretty-printed and explicitly call type specific ppr functions * add default `panicStgPprOpts` for panic messages (when it's not convenient to thread StgPprOpts or DynFlags down to the ppr function call)
* Remove PprStyle param of logging actionsSylvain Henry2020-05-011-1/+1
| | | | Use `withPprStyle` instead to apply a specific style to a SDoc.
* Refactor PprDebug handlingSylvain Henry2020-05-011-1/+1
| | | | | | | | | | | If `-dppr-debug` is set, then PprUser and PprDump styles are silently replaced with PprDebug style. This was done in `mkUserStyle` and `mkDumpStyle` smart constructors. As a consequence they needed a DynFlags parameter. Now we keep the original PprUser and PprDump styles until they are used to create an `SDocContext`. I.e. the substitution is only performed in `initSDocContext`.
* Unit: split and rename modulesSylvain Henry2020-04-301-1/+1
| | | | | | | Introduce GHC.Unit.* hierarchy for everything concerning units, packages and modules. Update Haddock submodule
* Modules: Utils and Data (#13009)Sylvain Henry2020-04-261-5/+5
| | | | | | | Update Haddock submodule Metric Increase: haddock.compiler
* Modules: Types (#13009)Sylvain Henry2020-03-291-10/+10
| | | | | | | Update Haddock submodule Metric Increase: haddock.compiler
* Modules: Core (#13009)Sylvain Henry2020-03-161-2/+2
| | | | Update submodule: haddock
* Modules: Core (#13009)Sylvain Henry2020-02-261-2/+2
| | | | Update haddock submodule
* Modules: Driver (#13009)Sylvain Henry2020-02-211-1/+1
| | | | submodule updates: nofib, haddock
* Do CafInfo/SRT analysis in CmmÖmer Sinan Ağacan2020-01-311-19/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch removes all CafInfo predictions and various hacks to preserve predicted CafInfos from the compiler and assigns final CafInfos to interface Ids after code generation. SRT analysis is extended to support static data, and Cmm generator is modified to allow generating static_link fields after SRT analysis. This also fixes `-fcatch-bottoms`, which introduces error calls in case expressions in CorePrep, which runs *after* CoreTidy (which is where we decide on CafInfos) and turns previously non-CAFFY things into CAFFY. Fixes #17648 Fixes #9718 Evaluation ========== NoFib ----- Boot with: `make boot mode=fast` Run: `make mode=fast EXTRA_RUNTEST_OPTS="-cachegrind" NoFibRuns=1` -------------------------------------------------------------------------------- Program Size Allocs Instrs Reads Writes -------------------------------------------------------------------------------- CS -0.0% 0.0% -0.0% -0.0% -0.0% CSD -0.0% 0.0% -0.0% -0.0% -0.0% FS -0.0% 0.0% -0.0% -0.0% -0.0% S -0.0% 0.0% -0.0% -0.0% -0.0% VS -0.0% 0.0% -0.0% -0.0% -0.0% VSD -0.0% 0.0% -0.0% -0.0% -0.5% VSM -0.0% 0.0% -0.0% -0.0% -0.0% anna -0.1% 0.0% -0.0% -0.0% -0.0% ansi -0.0% 0.0% -0.0% -0.0% -0.0% atom -0.0% 0.0% -0.0% -0.0% -0.0% awards -0.0% 0.0% -0.0% -0.0% -0.0% banner -0.0% 0.0% -0.0% -0.0% -0.0% bernouilli -0.0% 0.0% -0.0% -0.0% -0.0% binary-trees -0.0% 0.0% -0.0% -0.0% -0.0% boyer -0.0% 0.0% -0.0% -0.0% -0.0% boyer2 -0.0% 0.0% -0.0% -0.0% -0.0% bspt -0.0% 0.0% -0.0% -0.0% -0.0% cacheprof -0.0% 0.0% -0.0% -0.0% -0.0% calendar -0.0% 0.0% -0.0% -0.0% -0.0% cichelli -0.0% 0.0% -0.0% -0.0% -0.0% circsim -0.0% 0.0% -0.0% -0.0% -0.0% clausify -0.0% 0.0% -0.0% -0.0% -0.0% comp_lab_zift -0.0% 0.0% -0.0% -0.0% -0.0% compress -0.0% 0.0% -0.0% -0.0% -0.0% compress2 -0.0% 0.0% -0.0% -0.0% -0.0% constraints -0.0% 0.0% -0.0% -0.0% -0.0% cryptarithm1 -0.0% 0.0% -0.0% -0.0% -0.0% cryptarithm2 -0.0% 0.0% -0.0% -0.0% -0.0% cse -0.0% 0.0% -0.0% -0.0% -0.0% digits-of-e1 -0.0% 0.0% -0.0% -0.0% -0.0% digits-of-e2 -0.0% 0.0% -0.0% -0.0% -0.0% dom-lt -0.0% 0.0% -0.0% -0.0% -0.0% eliza -0.0% 0.0% -0.0% -0.0% -0.0% event -0.0% 0.0% -0.0% -0.0% -0.0% exact-reals -0.0% 0.0% -0.0% -0.0% -0.0% exp3_8 -0.0% 0.0% -0.0% -0.0% -0.0% expert -0.0% 0.0% -0.0% -0.0% -0.0% fannkuch-redux -0.0% 0.0% -0.0% -0.0% -0.0% fasta -0.0% 0.0% -0.0% -0.0% -0.0% fem -0.0% 0.0% -0.0% -0.0% -0.0% fft -0.0% 0.0% -0.0% -0.0% -0.0% fft2 -0.0% 0.0% -0.0% -0.0% -0.0% fibheaps -0.0% 0.0% -0.0% -0.0% -0.0% fish -0.0% 0.0% -0.0% -0.0% -0.0% fluid -0.1% 0.0% -0.0% -0.0% -0.0% fulsom -0.0% 0.0% -0.0% -0.0% -0.0% gamteb -0.0% 0.0% -0.0% -0.0% -0.0% gcd -0.0% 0.0% -0.0% -0.0% -0.0% gen_regexps -0.0% 0.0% -0.0% -0.0% -0.0% genfft -0.0% 0.0% -0.0% -0.0% -0.0% gg -0.0% 0.0% -0.0% -0.0% -0.0% grep -0.0% 0.0% -0.0% -0.0% -0.0% hidden -0.0% 0.0% -0.0% -0.0% -0.0% hpg -0.1% 0.0% -0.0% -0.0% -0.0% ida -0.0% 0.0% -0.0% -0.0% -0.0% infer -0.0% 0.0% -0.0% -0.0% -0.0% integer -0.0% 0.0% -0.0% -0.0% -0.0% integrate -0.0% 0.0% -0.0% -0.0% -0.0% k-nucleotide -0.0% 0.0% -0.0% -0.0% -0.0% kahan -0.0% 0.0% -0.0% -0.0% -0.0% knights -0.0% 0.0% -0.0% -0.0% -0.0% lambda -0.0% 0.0% -0.0% -0.0% -0.0% last-piece -0.0% 0.0% -0.0% -0.0% -0.0% lcss -0.0% 0.0% -0.0% -0.0% -0.0% life -0.0% 0.0% -0.0% -0.0% -0.0% lift -0.0% 0.0% -0.0% -0.0% -0.0% linear -0.1% 0.0% -0.0% -0.0% -0.0% listcompr -0.0% 0.0% -0.0% -0.0% -0.0% listcopy -0.0% 0.0% -0.0% -0.0% -0.0% maillist -0.0% 0.0% -0.0% -0.0% -0.0% mandel -0.0% 0.0% -0.0% -0.0% -0.0% mandel2 -0.0% 0.0% -0.0% -0.0% -0.0% mate -0.0% 0.0% -0.0% -0.0% -0.0% minimax -0.0% 0.0% -0.0% -0.0% -0.0% mkhprog -0.0% 0.0% -0.0% -0.0% -0.0% multiplier -0.0% 0.0% -0.0% -0.0% -0.0% n-body -0.0% 0.0% -0.0% -0.0% -0.0% nucleic2 -0.0% 0.0% -0.0% -0.0% -0.0% para -0.0% 0.0% -0.0% -0.0% -0.0% paraffins -0.0% 0.0% -0.0% -0.0% -0.0% parser -0.1% 0.0% -0.0% -0.0% -0.0% parstof -0.1% 0.0% -0.0% -0.0% -0.0% pic -0.0% 0.0% -0.0% -0.0% -0.0% pidigits -0.0% 0.0% -0.0% -0.0% -0.0% power -0.0% 0.0% -0.0% -0.0% -0.0% pretty -0.0% 0.0% -0.3% -0.4% -0.4% primes -0.0% 0.0% -0.0% -0.0% -0.0% primetest -0.0% 0.0% -0.0% -0.0% -0.0% prolog -0.0% 0.0% -0.0% -0.0% -0.0% puzzle -0.0% 0.0% -0.0% -0.0% -0.0% queens -0.0% 0.0% -0.0% -0.0% -0.0% reptile -0.0% 0.0% -0.0% -0.0% -0.0% reverse-complem -0.0% 0.0% -0.0% -0.0% -0.0% rewrite -0.0% 0.0% -0.0% -0.0% -0.0% rfib -0.0% 0.0% -0.0% -0.0% -0.0% rsa -0.0% 0.0% -0.0% -0.0% -0.0% scc -0.0% 0.0% -0.3% -0.5% -0.4% sched -0.0% 0.0% -0.0% -0.0% -0.0% scs -0.0% 0.0% -0.0% -0.0% -0.0% simple -0.1% 0.0% -0.0% -0.0% -0.0% solid -0.0% 0.0% -0.0% -0.0% -0.0% sorting -0.0% 0.0% -0.0% -0.0% -0.0% spectral-norm -0.0% 0.0% -0.0% -0.0% -0.0% sphere -0.0% 0.0% -0.0% -0.0% -0.0% symalg -0.0% 0.0% -0.0% -0.0% -0.0% tak -0.0% 0.0% -0.0% -0.0% -0.0% transform -0.0% 0.0% -0.0% -0.0% -0.0% treejoin -0.0% 0.0% -0.0% -0.0% -0.0% typecheck -0.0% 0.0% -0.0% -0.0% -0.0% veritas -0.0% 0.0% -0.0% -0.0% -0.0% wang -0.0% 0.0% -0.0% -0.0% -0.0% wave4main -0.0% 0.0% -0.0% -0.0% -0.0% wheel-sieve1 -0.0% 0.0% -0.0% -0.0% -0.0% wheel-sieve2 -0.0% 0.0% -0.0% -0.0% -0.0% x2n1 -0.0% 0.0% -0.0% -0.0% -0.0% -------------------------------------------------------------------------------- Min -0.1% 0.0% -0.3% -0.5% -0.5% Max -0.0% 0.0% -0.0% -0.0% -0.0% Geometric Mean -0.0% -0.0% -0.0% -0.0% -0.0% -------------------------------------------------------------------------------- Program Size Allocs Instrs Reads Writes -------------------------------------------------------------------------------- circsim -0.1% 0.0% -0.0% -0.0% -0.0% constraints -0.0% 0.0% -0.0% -0.0% -0.0% fibheaps -0.0% 0.0% -0.0% -0.0% -0.0% gc_bench -0.0% 0.0% -0.0% -0.0% -0.0% hash -0.0% 0.0% -0.0% -0.0% -0.0% lcss -0.0% 0.0% -0.0% -0.0% -0.0% power -0.0% 0.0% -0.0% -0.0% -0.0% spellcheck -0.0% 0.0% -0.0% -0.0% -0.0% -------------------------------------------------------------------------------- Min -0.1% 0.0% -0.0% -0.0% -0.0% Max -0.0% 0.0% -0.0% -0.0% -0.0% Geometric Mean -0.0% +0.0% -0.0% -0.0% -0.0% Manual inspection of programs in testsuite/tests/programs --------------------------------------------------------- I built these programs with a bunch of dump flags and `-O` and compared STG, Cmm, and Asm dumps and file sizes. (Below the numbers in parenthesis show number of modules in the program) These programs have identical compiler (same .hi and .o sizes, STG, and Cmm and Asm dumps): - Queens (1), andre_monad (1), cholewo-eval (2), cvh_unboxing (3), andy_cherry (7), fun_insts (1), hs-boot (4), fast2haskell (2), jl_defaults (1), jq_readsPrec (1), jules_xref (1), jtod_circint (4), jules_xref2 (1), lennart_range (1), lex (1), life_space_leak (1), bargon-mangler-bug (7), record_upd (1), rittri (1), sanders_array (1), strict_anns (1), thurston-module-arith (2), okeefe_neural (1), joao-circular (6), 10queens (1) Programs with different compiler outputs: - jl_defaults (1): For some reason GHC HEAD marks a lot of top-level `[Int]` closures as CAFFY for no reason. With this patch we no longer make them CAFFY and generate less SRT entries. For some reason Main.o is slightly larger with this patch (1.3%) and the executable sizes are the same. (I'd expect both to be smaller) - launchbury (1): Same as jl_defaults: top-level `[Int]` closures marked as CAFFY for no reason. Similarly `Main.o` is 1.4% larger but the executable sizes are the same. - galois_raytrace (13): Differences are in the Parse module. There are a lot, but some of the changes are caused by the fact that for some reason (I think a bug) GHC HEAD marks the dictionary for `Functor Identity` as CAFFY. Parse.o is 0.4% larger, the executable size is the same. - north_array: We now generate less SRT entries because some of array primops used in this program like `NewArrayOp` get eliminated during Stg-to-Cmm and turn some CAFFY things into non-CAFFY. Main.o gets 24% larger (9224 bytes from 9000 bytes), executable sizes are the same. - seward-space-leak: Difference in this program is better shown by this smaller example: module Lib where data CDS = Case [CDS] [(Int, CDS)] | Call CDS CDS instance Eq CDS where Case sels1 rets1 == Case sels2 rets2 = sels1 == sels2 && rets1 == rets2 Call a1 b1 == Call a2 b2 = a1 == a2 && b1 == b2 _ == _ = False In this program GHC HEAD builds a new SRT for the recursive group of `(==)`, `(/=)` and the dictionary closure. Then `/=` points to `==` in its SRT field, and `==` uses the SRT object as its SRT. With this patch we use the closure for `/=` as the SRT and add `==` there. Then `/=` gets an empty SRT field and `==` points to `/=` in its SRT field. This change looks fine to me. Main.o gets 0.07% larger, executable sizes are identical. head.hackage ------------ head.hackage's CI script builds 428 packages from Hackage using this patch with no failures. Compiler performance -------------------- The compiler perf tests report that the compiler allocates slightly more (worst case observed so far is 4%). However most programs in the test suite are small, single file programs. To benchmark compiler performance on something more realistic I build Cabal (the library, 236 modules) with different optimisation levels. For the "max residency" row I run GHC with `+RTS -s -A100k -i0 -h` for more accurate numbers. Other rows are generated with just `-s`. (This is because `-i0` causes running GC much more frequently and as a result "bytes copied" gets inflated by more than 25x in some cases) * -O0 | | GHC HEAD | This MR | Diff | | --------------- | -------------- | -------------- | ------ | | Bytes allocated | 54,413,350,872 | 54,701,099,464 | +0.52% | | Bytes copied | 4,926,037,184 | 4,990,638,760 | +1.31% | | Max residency | 421,225,624 | 424,324,264 | +0.73% | * -O1 | | GHC HEAD | This MR | Diff | | --------------- | --------------- | --------------- | ------ | | Bytes allocated | 245,849,209,992 | 246,562,088,672 | +0.28% | | Bytes copied | 26,943,452,560 | 27,089,972,296 | +0.54% | | Max residency | 982,643,440 | 991,663,432 | +0.91% | * -O2 | | GHC HEAD | This MR | Diff | | --------------- | --------------- | --------------- | ------ | | Bytes allocated | 291,044,511,408 | 291,863,910,912 | +0.28% | | Bytes copied | 37,044,237,616 | 36,121,690,472 | -2.49% | | Max residency | 1,071,600,328 | 1,086,396,256 | +1.38% | Extra compiler allocations -------------------------- Runtime allocations of programs are as reported above (NoFib section). The compiler now allocates more than before. Main source of allocation in this patch compared to base commit is the new SRT algorithm (GHC.Cmm.Info.Build). Below is some of the extra work we do with this patch, numbers generated by profiled stage 2 compiler when building a pathological case (the test 'ManyConstructors') with '-O2': - We now sort the final STG for a module, which means traversing the entire program, generating free variable set for each top-level binding, doing SCC analysis, and re-ordering the program. In ManyConstructors this step allocates 97,889,952 bytes. - We now do SRT analysis on static data, which in a program like ManyConstructors causes analysing 10,000 bindings that we would previously just skip. This step allocates 70,898,352 bytes. - We now maintain an SRT map for the entire module as we compile Cmm groups: data ModuleSRTInfo = ModuleSRTInfo { ... , moduleSRTMap :: SRTMap } (SRTMap is just a strict Map from the 'containers' library) This map gets an entry for most bindings in a module (exceptions are THUNKs and CAFFY static functions). For ManyConstructors this map gets 50015 entries. - Once we're done with code generation we generate a NameSet from SRTMap for the non-CAFFY names in the current module. This set gets the same number of entries as the SRTMap. - Finally we update CafInfos in ModDetails for the non-CAFFY Ids, using the NameSet generated in the previous step. This usually does the least amount of allocation among the work listed here. Only place with this patch where we do less work in the CAF analysis in the tidying pass (CoreTidy). However that doesn't save us much, as the pass still needs to traverse the whole program and update IdInfos for other reasons. Only thing we don't here do is the `hasCafRefs` pass over the RHS of bindings, which is a stateless pass that returns a boolean value, so it doesn't allocate much. (Metric changes blow are all increased allocations) Metric changes -------------- Metric Increase: ManyAlternatives ManyConstructors T13035 T14683 T1969 T9961
* Module hierarchy (#13009): StgSylvain Henry2019-12-311-0/+396