summaryrefslogtreecommitdiff
path: root/testsuite/tests/regalloc
Commit message (Collapse)AuthorAgeFilesLines
* Multiple Home UnitsMatthew Pickering2021-12-281-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Multiple home units allows you to load different packages which may depend on each other into one GHC session. This will allow both GHCi and HLS to support multi component projects more naturally. Public Interface ~~~~~~~~~~~~~~~~ In order to specify multiple units, the -unit @⟨filename⟩ flag is given multiple times with a response file containing the arguments for each unit. The response file contains a newline separated list of arguments. ``` ghc -unit @unitLibCore -unit @unitLib ``` where the `unitLibCore` response file contains the normal arguments that cabal would pass to `--make` mode. ``` -this-unit-id lib-core-0.1.0.0 -i -isrc LibCore.Utils LibCore.Types ``` The response file for lib, can specify a dependency on lib-core, so then modules in lib can use modules from lib-core. ``` -this-unit-id lib-0.1.0.0 -package-id lib-core-0.1.0.0 -i -isrc Lib.Parse Lib.Render ``` Then when the compiler starts in --make mode it will compile both units lib and lib-core. There is also very basic support for multiple home units in GHCi, at the moment you can start a GHCi session with multiple units but only the :reload is supported. Most commands in GHCi assume a single home unit, and so it is additional work to work out how to modify the interface to support multiple loaded home units. Options used when working with Multiple Home Units There are a few extra flags which have been introduced specifically for working with multiple home units. The flags allow a home unit to pretend it’s more like an installed package, for example, specifying the package name, module visibility and reexported modules. -working-dir ⟨dir⟩ It is common to assume that a package is compiled in the directory where its cabal file resides. Thus, all paths used in the compiler are assumed to be relative to this directory. When there are multiple home units the compiler is often not operating in the standard directory and instead where the cabal.project file is located. In this case the -working-dir option can be passed which specifies the path from the current directory to the directory the unit assumes to be it’s root, normally the directory which contains the cabal file. When the flag is passed, any relative paths used by the compiler are offset by the working directory. Notably this includes -i and -I⟨dir⟩ flags. -this-package-name ⟨name⟩ This flag papers over the awkward interaction of the PackageImports and multiple home units. When using PackageImports you can specify the name of the package in an import to disambiguate between modules which appear in multiple packages with the same name. This flag allows a home unit to be given a package name so that you can also disambiguate between multiple home units which provide modules with the same name. -hidden-module ⟨module name⟩ This flag can be supplied multiple times in order to specify which modules in a home unit should not be visible outside of the unit it belongs to. The main use of this flag is to be able to recreate the difference between an exposed and hidden module for installed packages. -reexported-module ⟨module name⟩ This flag can be supplied multiple times in order to specify which modules are not defined in a unit but should be reexported. The effect is that other units will see this module as if it was defined in this unit. The use of this flag is to be able to replicate the reexported modules feature of packages with multiple home units. Offsetting Paths in Template Haskell splices ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ When using Template Haskell to embed files into your program, traditionally the paths have been interpreted relative to the directory where the .cabal file resides. This causes problems for multiple home units as we are compiling many different libraries at once which have .cabal files in different directories. For this purpose we have introduced a way to query the value of the -working-dir flag to the Template Haskell API. By using this function we can implement a makeRelativeToProject function which offsets a path which is relative to the original project root by the value of -working-dir. ``` import Language.Haskell.TH.Syntax ( makeRelativeToProject ) foo = $(makeRelativeToProject "./relative/path" >>= embedFile) ``` > If you write a relative path in a Template Haskell splice you should use the makeRelativeToProject function so that your library works correctly with multiple home units. A similar function already exists in the file-embed library. The function in template-haskell implements this function in a more robust manner by honouring the -working-dir flag rather than searching the file system. Closure Property for Home Units ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ For tools or libraries using the API there is one very important closure property which must be adhered to: > Any dependency which is not a home unit must not (transitively) depend on a home unit. For example, if you have three packages p, q and r, then if p depends on q which depends on r then it is illegal to load both p and r as home units but not q, because q is a dependency of the home unit p which depends on another home unit r. If you are using GHC by the command line then this property is checked, but if you are using the API then you need to check this property yourself. If you get it wrong you will probably get some very confusing errors about overlapping instances. Limitations of Multiple Home Units ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ There are a few limitations of the initial implementation which will be smoothed out on user demand. * Package thinning/renaming syntax is not supported * More complicated reexports/renaming are not yet supported. * It’s more common to run into existing linker bugs when loading a large number of packages in a session (for example #20674, #20689) * Backpack is not yet supported when using multiple home units. * Dependency chasing can be quite slow with a large number of modules and packages. * Loading wired-in packages as home units is currently not supported (this only really affects GHC developers attempting to load template-haskell). * Barely any normal GHCi features are supported, it would be good to support enough for ghcid to work correctly. Despite these limitations, the implementation works already for nearly all packages. It has been testing on large dependency closures, including the whole of head.hackage which is a total of 4784 modules from 452 packages. Internal Changes ~~~~~~~~~~~~~~~~ * The biggest change is that the HomePackageTable is replaced with the HomeUnitGraph. The HomeUnitGraph is a map from UnitId to HomeUnitEnv, which contains information specific to each home unit. * The HomeUnitEnv contains: - A unit state, each home unit can have different package db flags - A set of dynflags, each home unit can have different flags - A HomePackageTable * LinkNode: A new node type is added to the ModuleGraph, this is used to place the linking step into the build plan so linking can proceed in parralel with other packages being built. * New invariant: Dependencies of a ModuleGraphNode can be completely determined by looking at the value of the node. In order to achieve this, downsweep now performs a more complete job of downsweeping and then the dependenices are recorded forever in the node rather than being computed again from the ModSummary. * Some transitive module calculations are rewritten to use the ModuleGraph which is more efficient. * There is always an active home unit, which simplifies modifying a lot of the existing API code which is unit agnostic (for example, in the driver). The road may be bumpy for a little while after this change but the basics are well-tested. One small metric increase, which we accept and also submodule update to haddock which removes ExtendedModSummary. Closes #10827 ------------------------- Metric Increase: MultiLayerModules ------------------------- Co-authored-by: Fendor <power.walross@gmail.com>
* Remove `optLevel` from `DynFlags` (closes #20500)Gergo ERDI2021-12-091-1/+1
|
* driver: Cleanups related to ModLocationMatthew Pickering2021-10-191-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | ModLocation is the data type which tells you the locations of all the build products which can affect recompilation. It is now computed in one place and not modified through the pipeline. Important locations will now just consult ModLocation rather than construct the dynamic object path incorrectly. * Add paths for dynamic object and dynamic interface files to ModLocation. * Always use the paths from mod location when looking for where to find any interface or object file. * Always use the paths in a ModLocation when deciding where to write an interface and object file. * Remove `dynamicOutputFile` and `dynamicOutputHi` functions which *calculated* (incorrectly) the location of `dyn_o` and `dyn_hi` files. * Don't set `outputFile_` and so-on in `enableCodeGenWhen`, `-o` and hence `outputFile_` should not affect the location of object files in `--make` mode. It is now sufficient to just update the ModLocation with the temporary paths. * In `hscGenBackendPipeline` don't recompute the `ModLocation` to account for `-dynamic-too`, the paths are now accurate from the start of the run. * Rename `getLocation` to `mkOneShotModLocation`, as that's the only place it's used. Increase the locality of the definition by moving it close to the use-site. * Load the dynamic interface from ml_dyn_hi_file rather than attempting to reconstruct it in load_dynamic_too. * Add a variety of tests to check how -o -dyno etc interact with each other. Some other clean-ups * DeIOify mkHomeModLocation and friends, they are all pure functions. * Move FinderOpts into GHC.Driver.Config.Finder, next to initFinderOpts. * Be more precise about whether we mean outputFile or outputFile_: there were many places where outputFile was used but the result shouldn't have been affected by `-dyno` (for example the filename of the resulting executable). In these places dynamicNow would never be set but it's still more precise to not allow for this possibility. * Typo fixes suffices -> suffixes in the appropiate places.
* Dynflags: introduce DiagOptsSylvain Henry2021-07-011-1/+3
| | | | | | | | | | | | | | | | | | | | | | Use DiagOpts for diagnostic options instead of directly querying DynFlags (#17957). Surprising performance improvements on CI: T4801(normal) ghc/alloc 313236344.0 306515216.0 -2.1% GOOD T9961(normal) ghc/alloc 384502736.0 380584384.0 -1.0% GOOD ManyAlternatives(normal) ghc/alloc 797356128.0 786644928.0 -1.3% ManyConstructors(normal) ghc/alloc 4389732432.0 4317740880.0 -1.6% T783(normal) ghc/alloc 408142680.0 402812176.0 -1.3% Metric Decrease: T4801 T9961 T783 ManyAlternatives ManyConstructors Bump haddock submodule
* Make Logger independent of DynFlagsSylvain Henry2021-06-071-14/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce LogFlags as a independent subset of DynFlags used for logging. As a consequence in many places we don't have to pass both Logger and DynFlags anymore. The main reason for this refactoring is that I want to refactor the systools interfaces: for now many systools functions use DynFlags both to use the Logger and to fetch their parameters (e.g. ldInputs for the linker). I'm interested in refactoring the way they fetch their parameters (i.e. use dedicated XxxOpts data types instead of DynFlags) for #19877. But if I did this refactoring before refactoring the Logger, we would have duplicate parameters (e.g. ldInputs from DynFlags and linkerInputs from LinkerOpts). Hence this patch first. Some flags don't really belong to LogFlags because they are subsystem specific (e.g. most DumpFlags). For example -ddump-asm should better be passed in NCGConfig somehow. This patch doesn't fix this tight coupling: the dump flags are part of the UI but they are passed all the way down for example to infer the file name for the dumps. Because LogFlags are a subset of the DynFlags, we must update the former when the latter changes (not so often). As a consequence we now use accessors to read/write DynFlags in HscEnv instead of using `hsc_dflags` directly. In the process I've also made some subsystems less dependent on DynFlags: - CmmToAsm: by passing some missing flags via NCGConfig (see new fields in GHC.CmmToAsm.Config) - Core.Opt.*: - by passing -dinline-check value into UnfoldingOpts - by fixing some Core passes interfaces (e.g. CallArity, FloatIn) that took DynFlags argument for no good reason. - as a side-effect GHC.Core.Opt.Pipeline.doCorePass is much less convoluted.
* Support new parser types in GHCAlfredo Di Napoli2021-05-261-3/+1
| | | | | | | | | | | | | | This commit converts the lexers and all the parser machinery to use the new parser types and diagnostics infrastructure. Furthermore, it cleans up the way the parser code was emitting hints. As a result of this systematic approach, the test output of the `InfixAppPatErr` and `T984` tests have been changed. Previously they would emit a `SuggestMissingDo` hint, but this was not at all helpful in resolving the error, and it was even confusing by just looking at the original program that triggered the errors. Update haddock submodule
* Add GhcMessage and ancillary typesAlfredo Di Napoli2021-04-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit adds GhcMessage and ancillary (PsMessage, TcRnMessage, ..) types. These types will be expanded to represent more errors generated by different subsystems within GHC. Right now, they are underused, but more will come in the glorious future. See https://gitlab.haskell.org/ghc/ghc/-/wikis/Errors-as-(structured)-values for a design overview. Along the way, lots of other things had to happen: * Adds Semigroup and Monoid instance for Bag * Fixes #19746 by parsing OPTIONS_GHC pragmas into Located Strings. See GHC.Parser.Header.toArgs (moved from GHC.Utils.Misc, where it didn't belong anyway). * Addresses (but does not completely fix) #19709, now reporting desugarer warnings and errors appropriately for TH splices. Not done: reporting type-checker warnings for TH splices. * Some small refactoring around Safe Haskell inference, in order to keep separate classes of messages separate. * Some small refactoring around initDsTc, in order to keep separate classes of messages separate. * Separate out the generation of messages (that is, the construction of the text block) from the wrapping of messages (that is, assigning a SrcSpan). This is more modular than the previous design, which mixed the two. Close #19746. This was a collaborative effort by Alfredo di Napoli and Richard Eisenberg, with a key assist on #19746 by Iavor Diatchki. Metric Increase: MultiLayerModules
* Read constants header instead of global platformConstantsSylvain Henry2021-04-101-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | With this patch we switch from reading the globally installed platformConstants file to reading the DerivedConstants.h header file that is bundled in the RTS unit. When we build the RTS unit itself, we get it from its includes directories. The new parser is more efficient and strict than the Read instance for PlatformConstants and we get about 2.2MB less allocations in every cases. However it only really shows in tests that don't allocate much, hence the following metric decreases. Metric Decrease: Naperian T10421 T10547 T12150 T12234 T12425 T13035 T18304 T18923 T5837 T6048 T18140
* Compute Severity of diagnostics at birthAlfredo Di Napoli2021-04-011-2/+2
| | | | | | | | | | | | | | | | | | | | | This commit further expand on the design for #18516 by getting rid of the `defaultReasonSeverity` in favour of a function called `diagReasonSeverity` which correctly takes the `DynFlags` as input. The idea is to compute the `Severity` and the `DiagnosticReason` of each message "at birth", without doing any later re-classifications, which are potentially error prone, as the `DynFlags` might evolve during the course of the program. In preparation for a proper refactoring, now `pprWarning` from the Parser.Ppr module has been renamed to `mkParserWarn`, which now takes a `DynFlags` as input. We also get rid of the reclassification we were performing inside `printOrThrowWarnings`. Last but not least, this commit removes the need for reclassify inside GHC.Tc.Errors, and also simplifies the implementation of `maybeReportError`. Update Haddock submodule
* Add -finfo-table-map which maps info tables to source positionsMatthew Pickering2021-03-031-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This new flag embeds a lookup table from the address of an info table to information about that info table. The main interface for consulting the map is the `lookupIPE` C function > InfoProvEnt * lookupIPE(StgInfoTable *info) The `InfoProvEnt` has the following structure: > typedef struct InfoProv_{ > char * table_name; > char * closure_desc; > char * ty_desc; > char * label; > char * module; > char * srcloc; > } InfoProv; > > typedef struct InfoProvEnt_ { > StgInfoTable * info; > InfoProv prov; > struct InfoProvEnt_ *link; > } InfoProvEnt; The source positions are approximated in a similar way to the source positions for DWARF debugging information. They are only approximate but in our experience provide a good enough hint about where the problem might be. It is therefore recommended to use this flag in conjunction with `-g<n>` for more accurate locations. The lookup table is also emitted into the eventlog when it is available as it is intended to be used with the `-hi` profiling mode. Using this flag will significantly increase the size of the resulting object file but only by a factor of 2-3x in our experience.
* Refactor LoggerSylvain Henry2021-02-131-10/+12
| | | | | | | | | | | | | | | | | | | | | Before this patch, the only way to override GHC's default logging behavior was to set `log_action`, `dump_action` and `trace_action` fields in DynFlags. This patch introduces a new Logger abstraction and stores it in HscEnv instead. This is part of #17957 (avoid storing state in DynFlags). DynFlags are duplicated and updated per-module (because of OPTIONS_GHC pragma), so we shouldn't store global state in them. This patch also fixes a race in parallel "--make" mode which updated the `generatedDumps` IORef concurrently. Bump haddock submodule The increase in MultilayerModules is tracked in #19293. Metric Increase: MultiLayerModules
* Remove errShortString, cleanup error-related functionsAlfredo Di Napoli2021-01-091-0/+1
| | | | | | | | | | | This commit removes the errShortString field from the ErrMsg type, allowing us to cleanup a lot of dynflag-dependent error functions, and move them in a more specialised 'GHC.Driver.Errors' closer to the driver, where they are actually used. Metric Increase: T4801 T9961
* Move Unit related fields from DynFlags to HscEnvSylvain Henry2020-12-141-1/+2
| | | | | | | | | | | | | The unit database cache, the home unit and the unit state were stored in DynFlags while they ought to be stored in the compiler session state (HscEnv). This patch fixes this. It introduces a new UnitEnv type that should be used in the future to handle separate unit environments (especially host vs target units). Related to #17957 Bump haddock submodule
* Move this_module into NCGConfigBen Gamari2020-11-111-2/+2
| | | | | | In various places in the NCG we need the Module currently being compiled. Let's move this into the environment instead of chewing threw another register.
* Use ADTs for parser errors/warningsSylvain Henry2020-10-011-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Haskell and Cmm parsers/lexers now report errors and warnings using ADTs defined in GHC.Parser.Errors. They can be printed using functions in GHC.Parser.Errors.Ppr. Some of the errors provide hints with a separate ADT (e.g. to suggest to turn on some extension). For now, however, hints are not consistent across all messages. For example some errors contain the hints in the main message. I didn't want to change any message with this patch. I expect these changes to be discussed and implemented later. Surprisingly, this patch enhances performance. On CI (x86_64/deb9/hadrian, ghc/alloc): parsing001 -11.5% T13719 -2.7% MultiLayerModules -3.5% Naperian -3.1% Bump haddock submodule Metric Decrease: MultiLayerModules Naperian T13719 parsing001
* Put CFG weights into their own module (#17957)Sylvain Henry2020-08-211-2/+1
| | | | It avoids having to query DynFlags to get them
* DynFlags: refactor GHC.CmmToAsm (#17957, #10143)Sylvain Henry2020-08-181-1/+2
| | | | | | | | | | | | | | | | | This patch removes the use of `sdocWithDynFlags` from GHC.CmmToAsm.*.Ppr To do that I've had to make some refactoring: * X86' and PPC's `Instr` are no longer `Outputable` as they require a `Platform` argument * `Instruction` class now exposes `pprInstr :: Platform -> instr -> SDoc` * as a consequence, I've refactored some modules to avoid .hs-boot files * added (derived) functor instances for some datatypes parametric in the instruction type. It's useful for pretty-printing as we just have to map `pprInstr` before pretty-printing the container datatype.
* Unit: split and rename modulesSylvain Henry2020-04-301-1/+1
| | | | | | | Introduce GHC.Unit.* hierarchy for everything concerning units, packages and modules. Update Haddock submodule
* Refactoring unit management codeSylvain Henry2020-04-301-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Over the years the unit management code has been modified a lot to keep up with changes in Cabal (e.g. support for several library components in the same package), to integrate BackPack, etc. I found it very hard to understand as the terminology wasn't consistent, was referring to past concepts, etc. The terminology is now explained as clearly as I could in the Note "About Units" and the code is refactored to reflect it. ------------------- Many names were misleading: UnitId is not an Id but could be a virtual unit (an indefinite one instantiated on the fly), IndefUnitId constructor may contain a definite instantiated unit, etc. * Rename IndefUnitId into InstantiatedUnit * Rename IndefModule into InstantiatedModule * Rename UnitId type into Unit * Rename IndefiniteUnitId constructor into VirtUnit * Rename DefiniteUnitId constructor into RealUnit * Rename packageConfigId into mkUnit * Rename getPackageDetails into unsafeGetUnitInfo * Rename InstalledUnitId into UnitId Remove references to misleading ComponentId: a ComponentId is just an indefinite unit-id to be instantiated. * Rename ComponentId into IndefUnitId * Rename ComponentDetails into UnitPprInfo * Fix display of UnitPprInfo with empty version: this is now used for units dynamically generated by BackPack Generalize several types (Module, Unit, etc.) so that they can be used with different unit identifier types: UnitKey, UnitId, Unit, etc. * GenModule: Module, InstantiatedModule and InstalledModule are now instances of this type * Generalize DefUnitId, IndefUnitId, Unit, InstantiatedUnit, PackageDatabase Replace BackPack fake "hole" UnitId by a proper HoleUnit constructor. Add basic support for UnitKey. They should be used more in the future to avoid mixing them up with UnitId as we do now. Add many comments. Update Haddock submodule
* Modules: Utils and Data (#13009)Sylvain Henry2020-04-261-3/+3
| | | | | | | Update Haddock submodule Metric Increase: haddock.compiler
* CmmToAsm DynFlags refactoring (#17957)Sylvain Henry2020-04-211-2/+4
| | | | | | | | | | | | | * Remove `DynFlags` parameter from `isDynLinkName`: `isDynLinkName` used to test the global `ExternalDynamicRefs` flag. Now we test it outside of `isDynLinkName` * Add new fields into `NCGConfig`: current unit id, sse/bmi versions, externalDynamicRefs, etc. * Replace many uses of `DynFlags` by `NCGConfig` * Moved `BMI/SSE` datatypes into `GHC.Platform`
* Modules: Types (#13009)Sylvain Henry2020-03-291-4/+4
| | | | | | | Update Haddock submodule Metric Increase: haddock.compiler
* Refactor CmmToAsm (disentangle DynFlags)Sylvain Henry2020-03-151-1/+1
| | | | | | | | | | | | | | | | | | | | | This patch disentangles a bit more DynFlags from the native code generator (CmmToAsm). In more details: - add a new NCGConfig datatype in GHC.CmmToAsm.Config which contains the configuration of a native code generation session - explicitly pass NCGConfig/Platform arguments when necessary - as a consequence `sdocWithPlatform` is gone and there are only a few `sdocWithDynFlags` left - remove the use of `unsafeGlobalDynFlags` from GHC.CmmToAsm.CFG - remove `sdocDebugLevel` (now we pass the debug level via NCGConfig) There are still some places where DynFlags is used, especially because of pretty-printing (CLabel), because of Cmm helpers (such as `cmmExprType`) and because of `Outputable` instance for the instructions. These are left for future refactoring as this patch is already big.
* Modules: CmmToAsm (#13009)Sylvain Henry2020-02-241-4/+4
|
* Modules: Driver (#13009)Sylvain Henry2020-02-211-3/+3
| | | | submodule updates: nofib, haddock
* Do CafInfo/SRT analysis in CmmÖmer Sinan Ağacan2020-01-311-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch removes all CafInfo predictions and various hacks to preserve predicted CafInfos from the compiler and assigns final CafInfos to interface Ids after code generation. SRT analysis is extended to support static data, and Cmm generator is modified to allow generating static_link fields after SRT analysis. This also fixes `-fcatch-bottoms`, which introduces error calls in case expressions in CorePrep, which runs *after* CoreTidy (which is where we decide on CafInfos) and turns previously non-CAFFY things into CAFFY. Fixes #17648 Fixes #9718 Evaluation ========== NoFib ----- Boot with: `make boot mode=fast` Run: `make mode=fast EXTRA_RUNTEST_OPTS="-cachegrind" NoFibRuns=1` -------------------------------------------------------------------------------- Program Size Allocs Instrs Reads Writes -------------------------------------------------------------------------------- CS -0.0% 0.0% -0.0% -0.0% -0.0% CSD -0.0% 0.0% -0.0% -0.0% -0.0% FS -0.0% 0.0% -0.0% -0.0% -0.0% S -0.0% 0.0% -0.0% -0.0% -0.0% VS -0.0% 0.0% -0.0% -0.0% -0.0% VSD -0.0% 0.0% -0.0% -0.0% -0.5% VSM -0.0% 0.0% -0.0% -0.0% -0.0% anna -0.1% 0.0% -0.0% -0.0% -0.0% ansi -0.0% 0.0% -0.0% -0.0% -0.0% atom -0.0% 0.0% -0.0% -0.0% -0.0% awards -0.0% 0.0% -0.0% -0.0% -0.0% banner -0.0% 0.0% -0.0% -0.0% -0.0% bernouilli -0.0% 0.0% -0.0% -0.0% -0.0% binary-trees -0.0% 0.0% -0.0% -0.0% -0.0% boyer -0.0% 0.0% -0.0% -0.0% -0.0% boyer2 -0.0% 0.0% -0.0% -0.0% -0.0% bspt -0.0% 0.0% -0.0% -0.0% -0.0% cacheprof -0.0% 0.0% -0.0% -0.0% -0.0% calendar -0.0% 0.0% -0.0% -0.0% -0.0% cichelli -0.0% 0.0% -0.0% -0.0% -0.0% circsim -0.0% 0.0% -0.0% -0.0% -0.0% clausify -0.0% 0.0% -0.0% -0.0% -0.0% comp_lab_zift -0.0% 0.0% -0.0% -0.0% -0.0% compress -0.0% 0.0% -0.0% -0.0% -0.0% compress2 -0.0% 0.0% -0.0% -0.0% -0.0% constraints -0.0% 0.0% -0.0% -0.0% -0.0% cryptarithm1 -0.0% 0.0% -0.0% -0.0% -0.0% cryptarithm2 -0.0% 0.0% -0.0% -0.0% -0.0% cse -0.0% 0.0% -0.0% -0.0% -0.0% digits-of-e1 -0.0% 0.0% -0.0% -0.0% -0.0% digits-of-e2 -0.0% 0.0% -0.0% -0.0% -0.0% dom-lt -0.0% 0.0% -0.0% -0.0% -0.0% eliza -0.0% 0.0% -0.0% -0.0% -0.0% event -0.0% 0.0% -0.0% -0.0% -0.0% exact-reals -0.0% 0.0% -0.0% -0.0% -0.0% exp3_8 -0.0% 0.0% -0.0% -0.0% -0.0% expert -0.0% 0.0% -0.0% -0.0% -0.0% fannkuch-redux -0.0% 0.0% -0.0% -0.0% -0.0% fasta -0.0% 0.0% -0.0% -0.0% -0.0% fem -0.0% 0.0% -0.0% -0.0% -0.0% fft -0.0% 0.0% -0.0% -0.0% -0.0% fft2 -0.0% 0.0% -0.0% -0.0% -0.0% fibheaps -0.0% 0.0% -0.0% -0.0% -0.0% fish -0.0% 0.0% -0.0% -0.0% -0.0% fluid -0.1% 0.0% -0.0% -0.0% -0.0% fulsom -0.0% 0.0% -0.0% -0.0% -0.0% gamteb -0.0% 0.0% -0.0% -0.0% -0.0% gcd -0.0% 0.0% -0.0% -0.0% -0.0% gen_regexps -0.0% 0.0% -0.0% -0.0% -0.0% genfft -0.0% 0.0% -0.0% -0.0% -0.0% gg -0.0% 0.0% -0.0% -0.0% -0.0% grep -0.0% 0.0% -0.0% -0.0% -0.0% hidden -0.0% 0.0% -0.0% -0.0% -0.0% hpg -0.1% 0.0% -0.0% -0.0% -0.0% ida -0.0% 0.0% -0.0% -0.0% -0.0% infer -0.0% 0.0% -0.0% -0.0% -0.0% integer -0.0% 0.0% -0.0% -0.0% -0.0% integrate -0.0% 0.0% -0.0% -0.0% -0.0% k-nucleotide -0.0% 0.0% -0.0% -0.0% -0.0% kahan -0.0% 0.0% -0.0% -0.0% -0.0% knights -0.0% 0.0% -0.0% -0.0% -0.0% lambda -0.0% 0.0% -0.0% -0.0% -0.0% last-piece -0.0% 0.0% -0.0% -0.0% -0.0% lcss -0.0% 0.0% -0.0% -0.0% -0.0% life -0.0% 0.0% -0.0% -0.0% -0.0% lift -0.0% 0.0% -0.0% -0.0% -0.0% linear -0.1% 0.0% -0.0% -0.0% -0.0% listcompr -0.0% 0.0% -0.0% -0.0% -0.0% listcopy -0.0% 0.0% -0.0% -0.0% -0.0% maillist -0.0% 0.0% -0.0% -0.0% -0.0% mandel -0.0% 0.0% -0.0% -0.0% -0.0% mandel2 -0.0% 0.0% -0.0% -0.0% -0.0% mate -0.0% 0.0% -0.0% -0.0% -0.0% minimax -0.0% 0.0% -0.0% -0.0% -0.0% mkhprog -0.0% 0.0% -0.0% -0.0% -0.0% multiplier -0.0% 0.0% -0.0% -0.0% -0.0% n-body -0.0% 0.0% -0.0% -0.0% -0.0% nucleic2 -0.0% 0.0% -0.0% -0.0% -0.0% para -0.0% 0.0% -0.0% -0.0% -0.0% paraffins -0.0% 0.0% -0.0% -0.0% -0.0% parser -0.1% 0.0% -0.0% -0.0% -0.0% parstof -0.1% 0.0% -0.0% -0.0% -0.0% pic -0.0% 0.0% -0.0% -0.0% -0.0% pidigits -0.0% 0.0% -0.0% -0.0% -0.0% power -0.0% 0.0% -0.0% -0.0% -0.0% pretty -0.0% 0.0% -0.3% -0.4% -0.4% primes -0.0% 0.0% -0.0% -0.0% -0.0% primetest -0.0% 0.0% -0.0% -0.0% -0.0% prolog -0.0% 0.0% -0.0% -0.0% -0.0% puzzle -0.0% 0.0% -0.0% -0.0% -0.0% queens -0.0% 0.0% -0.0% -0.0% -0.0% reptile -0.0% 0.0% -0.0% -0.0% -0.0% reverse-complem -0.0% 0.0% -0.0% -0.0% -0.0% rewrite -0.0% 0.0% -0.0% -0.0% -0.0% rfib -0.0% 0.0% -0.0% -0.0% -0.0% rsa -0.0% 0.0% -0.0% -0.0% -0.0% scc -0.0% 0.0% -0.3% -0.5% -0.4% sched -0.0% 0.0% -0.0% -0.0% -0.0% scs -0.0% 0.0% -0.0% -0.0% -0.0% simple -0.1% 0.0% -0.0% -0.0% -0.0% solid -0.0% 0.0% -0.0% -0.0% -0.0% sorting -0.0% 0.0% -0.0% -0.0% -0.0% spectral-norm -0.0% 0.0% -0.0% -0.0% -0.0% sphere -0.0% 0.0% -0.0% -0.0% -0.0% symalg -0.0% 0.0% -0.0% -0.0% -0.0% tak -0.0% 0.0% -0.0% -0.0% -0.0% transform -0.0% 0.0% -0.0% -0.0% -0.0% treejoin -0.0% 0.0% -0.0% -0.0% -0.0% typecheck -0.0% 0.0% -0.0% -0.0% -0.0% veritas -0.0% 0.0% -0.0% -0.0% -0.0% wang -0.0% 0.0% -0.0% -0.0% -0.0% wave4main -0.0% 0.0% -0.0% -0.0% -0.0% wheel-sieve1 -0.0% 0.0% -0.0% -0.0% -0.0% wheel-sieve2 -0.0% 0.0% -0.0% -0.0% -0.0% x2n1 -0.0% 0.0% -0.0% -0.0% -0.0% -------------------------------------------------------------------------------- Min -0.1% 0.0% -0.3% -0.5% -0.5% Max -0.0% 0.0% -0.0% -0.0% -0.0% Geometric Mean -0.0% -0.0% -0.0% -0.0% -0.0% -------------------------------------------------------------------------------- Program Size Allocs Instrs Reads Writes -------------------------------------------------------------------------------- circsim -0.1% 0.0% -0.0% -0.0% -0.0% constraints -0.0% 0.0% -0.0% -0.0% -0.0% fibheaps -0.0% 0.0% -0.0% -0.0% -0.0% gc_bench -0.0% 0.0% -0.0% -0.0% -0.0% hash -0.0% 0.0% -0.0% -0.0% -0.0% lcss -0.0% 0.0% -0.0% -0.0% -0.0% power -0.0% 0.0% -0.0% -0.0% -0.0% spellcheck -0.0% 0.0% -0.0% -0.0% -0.0% -------------------------------------------------------------------------------- Min -0.1% 0.0% -0.0% -0.0% -0.0% Max -0.0% 0.0% -0.0% -0.0% -0.0% Geometric Mean -0.0% +0.0% -0.0% -0.0% -0.0% Manual inspection of programs in testsuite/tests/programs --------------------------------------------------------- I built these programs with a bunch of dump flags and `-O` and compared STG, Cmm, and Asm dumps and file sizes. (Below the numbers in parenthesis show number of modules in the program) These programs have identical compiler (same .hi and .o sizes, STG, and Cmm and Asm dumps): - Queens (1), andre_monad (1), cholewo-eval (2), cvh_unboxing (3), andy_cherry (7), fun_insts (1), hs-boot (4), fast2haskell (2), jl_defaults (1), jq_readsPrec (1), jules_xref (1), jtod_circint (4), jules_xref2 (1), lennart_range (1), lex (1), life_space_leak (1), bargon-mangler-bug (7), record_upd (1), rittri (1), sanders_array (1), strict_anns (1), thurston-module-arith (2), okeefe_neural (1), joao-circular (6), 10queens (1) Programs with different compiler outputs: - jl_defaults (1): For some reason GHC HEAD marks a lot of top-level `[Int]` closures as CAFFY for no reason. With this patch we no longer make them CAFFY and generate less SRT entries. For some reason Main.o is slightly larger with this patch (1.3%) and the executable sizes are the same. (I'd expect both to be smaller) - launchbury (1): Same as jl_defaults: top-level `[Int]` closures marked as CAFFY for no reason. Similarly `Main.o` is 1.4% larger but the executable sizes are the same. - galois_raytrace (13): Differences are in the Parse module. There are a lot, but some of the changes are caused by the fact that for some reason (I think a bug) GHC HEAD marks the dictionary for `Functor Identity` as CAFFY. Parse.o is 0.4% larger, the executable size is the same. - north_array: We now generate less SRT entries because some of array primops used in this program like `NewArrayOp` get eliminated during Stg-to-Cmm and turn some CAFFY things into non-CAFFY. Main.o gets 24% larger (9224 bytes from 9000 bytes), executable sizes are the same. - seward-space-leak: Difference in this program is better shown by this smaller example: module Lib where data CDS = Case [CDS] [(Int, CDS)] | Call CDS CDS instance Eq CDS where Case sels1 rets1 == Case sels2 rets2 = sels1 == sels2 && rets1 == rets2 Call a1 b1 == Call a2 b2 = a1 == a2 && b1 == b2 _ == _ = False In this program GHC HEAD builds a new SRT for the recursive group of `(==)`, `(/=)` and the dictionary closure. Then `/=` points to `==` in its SRT field, and `==` uses the SRT object as its SRT. With this patch we use the closure for `/=` as the SRT and add `==` there. Then `/=` gets an empty SRT field and `==` points to `/=` in its SRT field. This change looks fine to me. Main.o gets 0.07% larger, executable sizes are identical. head.hackage ------------ head.hackage's CI script builds 428 packages from Hackage using this patch with no failures. Compiler performance -------------------- The compiler perf tests report that the compiler allocates slightly more (worst case observed so far is 4%). However most programs in the test suite are small, single file programs. To benchmark compiler performance on something more realistic I build Cabal (the library, 236 modules) with different optimisation levels. For the "max residency" row I run GHC with `+RTS -s -A100k -i0 -h` for more accurate numbers. Other rows are generated with just `-s`. (This is because `-i0` causes running GC much more frequently and as a result "bytes copied" gets inflated by more than 25x in some cases) * -O0 | | GHC HEAD | This MR | Diff | | --------------- | -------------- | -------------- | ------ | | Bytes allocated | 54,413,350,872 | 54,701,099,464 | +0.52% | | Bytes copied | 4,926,037,184 | 4,990,638,760 | +1.31% | | Max residency | 421,225,624 | 424,324,264 | +0.73% | * -O1 | | GHC HEAD | This MR | Diff | | --------------- | --------------- | --------------- | ------ | | Bytes allocated | 245,849,209,992 | 246,562,088,672 | +0.28% | | Bytes copied | 26,943,452,560 | 27,089,972,296 | +0.54% | | Max residency | 982,643,440 | 991,663,432 | +0.91% | * -O2 | | GHC HEAD | This MR | Diff | | --------------- | --------------- | --------------- | ------ | | Bytes allocated | 291,044,511,408 | 291,863,910,912 | +0.28% | | Bytes copied | 37,044,237,616 | 36,121,690,472 | -2.49% | | Max residency | 1,071,600,328 | 1,086,396,256 | +1.38% | Extra compiler allocations -------------------------- Runtime allocations of programs are as reported above (NoFib section). The compiler now allocates more than before. Main source of allocation in this patch compared to base commit is the new SRT algorithm (GHC.Cmm.Info.Build). Below is some of the extra work we do with this patch, numbers generated by profiled stage 2 compiler when building a pathological case (the test 'ManyConstructors') with '-O2': - We now sort the final STG for a module, which means traversing the entire program, generating free variable set for each top-level binding, doing SCC analysis, and re-ordering the program. In ManyConstructors this step allocates 97,889,952 bytes. - We now do SRT analysis on static data, which in a program like ManyConstructors causes analysing 10,000 bindings that we would previously just skip. This step allocates 70,898,352 bytes. - We now maintain an SRT map for the entire module as we compile Cmm groups: data ModuleSRTInfo = ModuleSRTInfo { ... , moduleSRTMap :: SRTMap } (SRTMap is just a strict Map from the 'containers' library) This map gets an entry for most bindings in a module (exceptions are THUNKs and CAFFY static functions). For ManyConstructors this map gets 50015 entries. - Once we're done with code generation we generate a NameSet from SRTMap for the non-CAFFY names in the current module. This set gets the same number of entries as the SRTMap. - Finally we update CafInfos in ModDetails for the non-CAFFY Ids, using the NameSet generated in the previous step. This usually does the least amount of allocation among the work listed here. Only place with this patch where we do less work in the CAF analysis in the tidying pass (CoreTidy). However that doesn't save us much, as the pass still needs to traverse the whole program and update IdInfos for other reasons. Only thing we don't here do is the `hasCafRefs` pass over the RHS of bindings, which is a stateless pass that returns a boolean value, so it doesn't allocate much. (Metric changes blow are all increased allocations) Metric changes -------------- Metric Increase: ManyAlternatives ManyConstructors T13035 T14683 T1969 T9961
* Module hierarchy: Cmm (cf #13009)Sylvain Henry2020-01-251-6/+6
|
* testsuite: Skip regalloc_unit_tests unless have_ncgBen Gamari2019-10-261-0/+1
| | | | | This is a unit test for the native code generator's register allocator; naturally. the NCG is required.
* Module hierarchy: StgToCmm (#13009)Sylvain Henry2019-09-101-1/+1
| | | | | | Add StgToCmm module hierarchy. Platform modules that are used in several other places (NCG, LLVM codegen, Cmm transformations) are put into GHC.Platform.
* Support generating HIE filesAlec Theriault2018-12-111-1/+1
| | | | | | | | | | | | | | | | | | | | Adds a `-fenable-ide-info` flag which instructs GHC to generate `.hie` files (see the wiki page: https://ghc.haskell.org/trac/ghc/wiki/HIEFiles). This is a rebased version of Zubin Duggal's (@wz1000) GHC changes for his GSOC project, as posted here: https://gist.github.com/wz1000/5ed4ddd0d3e96d6bc75e095cef95363d. Test Plan: ./validate Reviewers: bgamari, gershomb, nomeata, alanz, sjakobi Reviewed By: alanz, sjakobi Subscribers: alanz, hvr, sjakobi, rwbarton, wz1000, carter Differential Revision: https://phabricator.haskell.org/D5239
* An overhaul of the SRT representationSimon Marlow2018-05-161-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Summary: - Previously we would hvae a single big table of pointers per module, with a set of bitmaps to reference entries within it. The new representation is identical to a static constructor, which is much simpler for the GC to traverse, and we get to remove the complicated bitmap-traversal code from the GC. - Rewrite all the code to generate SRTs in CmmBuildInfoTables, and document it much better (see Note [SRTs]). This has been something I've wanted to do since we moved to the new code generator, I finally had the opportunity to finish it while on a transatlantic flight recently :) There are a series of 4 diffs: 1. D4632 (this one), which does the bulk of the changes 2. D4633 which adds support for smaller `CmmLabelDiffOff` constants 3. D4634 which takes advantage of D4632 and D4633 to save a word in info tables that have an SRT on x86_64. This is where most of the binary size improvement comes from. 4. D4637 which makes a further optimisation to merge some SRTs with static FUN closures. This adds some complexity and the benefits are fairly modest, so it's not clear yet whether we should do this. Results (after (3), on x86_64) - GHC itself (staticaly linked) is 5.2% smaller - -1.7% binary sizes in nofib, -2.9% module sizes. Full nofib results: P176 - I measured the overhead of traversing all the static objects in a major GC in GHC itself by doing `replicateM_ 1000 performGC` as the first thing in `Main.main`. The new version was 5-10% faster, but the results did vary quite a bit. - I'm not sure if there's a compile-time difference, the results are too unreliable. Test Plan: validate Reviewers: bgamari, michalt, niteria, simonpj, erikd, osa1 Subscribers: thomie, carter Differential Revision: https://phabricator.haskell.org/D4632
* Generalize CmmUnwind and pass unwind information through NCGBen Gamari2017-02-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As discussed in D1532, Trac Trac #11337, and Trac Trac #11338, the stack unwinding information produced by GHC is currently quite approximate. Essentially we assume that register values do not change at all within a basic block. While this is somewhat true in normal Haskell code, blocks containing foreign calls often break this assumption. This results in unreliable call stacks, especially in the code containing foreign calls. This is worse than it sounds as unreliable unwinding information can at times result in segmentation faults. This patch set attempts to improve this situation by tracking unwinding information with finer granularity. By dispensing with the assumption of one unwinding table per block, we allow the compiler to accurately represent the areas surrounding foreign calls. Towards this end we generalize the representation of unwind information in the backend in three ways, * Multiple CmmUnwind nodes can occur per block * CmmUnwind nodes can now carry unwind information for multiple registers (while not strictly necessary; this makes emitting unwinding information a bit more convenient in the compiler) * The NCG backend is given an opportunity to modify the unwinding records since it may need to make adjustments due to, for instance, native calling convention requirements for foreign calls (see #11353). This sets the stage for resolving #11337 and #11338. Test Plan: Validate Reviewers: scpmw, simonmar, austin, erikd Subscribers: qnikst, thomie Differential Revision: https://phabricator.haskell.org/D2741
* regalloc_unit_tests: disable on UNREG targetsSergei Trofimovich2016-12-181-1/+2
| | | | | | | Test requires register allocator to be present (native code generator). Signed-off-by: Sergei Trofimovich <siarheit@google.com>
* Test for unnecessary register spillsThomas Jakway2016-11-155-0/+229
Reviewers: mainland, simonmar, michalt, bgamari, austin Reviewed By: bgamari Subscribers: simonpj, mpickering, thomie Differential Revision: https://phabricator.haskell.org/D2638 GHC Trac Issues: #12744, #12745