author | Simon Peyton Jones <simon.peytonjones@gmail.com> | 2023-02-21 10:51:34 +0000 |
---|---|---|
committer | Marge Bot <ben+marge-bot@smart-cactus.org> | 2023-02-28 18:54:59 -0500 |
commit | 7192ef91c855e1fae6997f75cfde76aafd0b4bcf (patch) | |
tree | aef67a692c95e4e11b50d855ba651784eb89c109 /compiler/GHC/Core.hs | |
parent | 239202a2b14714740e016d7bbcd4f351356fcb00 (diff) | |
Take more care with unlifted bindings in the specialiser
As #22998 showed, we were floating an unlifted binding to top
level, which breaks a Core invariant.
The fix is easy, albeit a little bit conservative. See
Note [Care with unlifted bindings] in GHC.Core.Opt.Specialise
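
To illustrate the invariant this commit protects, here is a minimal toy model in Haskell. It is only a sketch: the types `Binding`, `Site`, `Rec`, `Levity` and `Rhs`, and the functions `satisfiesInvariants` and `safeToFloatToTop`, are hypothetical names invented for this example and are not GHC's Core API, nor the actual logic of the specialiser fix. The model encodes the Core letrec invariant (with its string-literal exception) and the let-can-float invariant as described in the Notes touched by this diff, and shows why moving an arbitrary unlifted binding to top level is rejected.

```haskell
-- Toy model only: none of these names exist in GHC.
data Site   = TopLevel | Nested         deriving (Eq, Show)
data Rec    = Recursive | NonRecursive  deriving (Eq, Show)
data Levity = Lifted | Unlifted         deriving (Eq, Show)

-- We only distinguish primitive string literals from everything else.
data Rhs = StringLit String | OtherRhs  deriving (Eq, Show)

data Binding = Binding
  { bSite      :: Site
  , bRec       :: Rec
  , bLevity    :: Levity
  , bRhs       :: Rhs
  , bOkForSpec :: Bool  -- stands in for "ok-for-speculation"
  , bJoinPoint :: Bool  -- stands in for "the let is for a join point"
  } deriving (Eq, Show)

isStringLit :: Rhs -> Bool
isStringLit (StringLit _) = True
isStringLit OtherRhs      = False

-- Letrec invariant: top-level or recursive bindings must be lifted,
-- except that a top-level binding may bind a primitive string literal.
-- Let-can-float invariant: a nested, non-recursive binding may be
-- unlifted only if it is ok-for-speculation or a join point.
satisfiesInvariants :: Binding -> Bool
satisfiesInvariants b
  | bLevity b == Lifted  = True
  | bSite b == TopLevel  = isStringLit (bRhs b)
  | bRec b == Recursive  = False
  | otherwise            = bOkForSpec b || bJoinPoint b

-- Floating to top level is modelled as changing the binding site
-- and re-checking the invariants.
safeToFloatToTop :: Binding -> Bool
safeToFloatToTop b = satisfiesInvariants b { bSite = TopLevel }
```

In this model a nested, non-recursive unlifted binding that is ok-for-speculation satisfies the invariants where it stands, but `safeToFloatToTop` returns `False` for it unless its right-hand side is a string literal. That is the shape of the problem reported in #22998.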
Diffstat (limited to 'compiler/GHC/Core.hs')
-rw-r--r-- | compiler/GHC/Core.hs | 113 |
1 files changed, 62 insertions, 51 deletions
diff --git a/compiler/GHC/Core.hs b/compiler/GHC/Core.hs
index db332b421c..a92252a61c 100644
--- a/compiler/GHC/Core.hs
+++ b/compiler/GHC/Core.hs
@@ -366,68 +366,32 @@ a Coercion, (sym c).
 Note [Core letrec invariant]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-The right hand sides of all top-level and recursive @let@s
-/must/ be of lifted type (see "Type#type_classification" for
-the meaning of /lifted/ vs. /unlifted/).
+The Core letrec invariant:
-There is one exception to this rule, top-level @let@s are
-allowed to bind primitive string literals: see
-Note [Core top-level string literals].
+ The right hand sides of all
+ /top-level/ or /recursive/
+ bindings must be of lifted type
-Note [Core top-level string literals]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-As an exception to the usual rule that top-level binders must be lifted,
-we allow binding primitive string literals (of type Addr#) of type Addr# at the
-top level. This allows us to share string literals earlier in the pipeline and
-crucially allows other optimizations in the Core2Core pipeline to fire.
-Consider,
+ There is one exception to this rule, top-level @let@s are
+ allowed to bind primitive string literals: see
+ Note [Core top-level string literals].
- f n = let a::Addr# = "foo"#
- in \x -> blah
+See "Type#type_classification" in GHC.Core.Type
+for the meaning of "lifted" vs. "unlifted").
-In order to be able to inline `f`, we would like to float `a` to the top.
-Another option would be to inline `a`, but that would lead to duplicating string
-literals, which we want to avoid. See #8472.
-
-The solution is simply to allow top-level unlifted binders. We can't allow
-arbitrary unlifted expression at the top-level though, unlifted binders cannot
-be thunks, so we just allow string literals.
-
-We allow the top-level primitive string literals to be wrapped in Ticks
-in the same way they can be wrapped when nested in an expression.
-CoreToSTG currently discards Ticks around top-level primitive string literals.
-See #14779.
-
-Also see Note [Compilation plan for top-level string literals].
-
-Note [Compilation plan for top-level string literals]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Here is a summary on how top-level string literals are handled by various
-parts of the compilation pipeline.
-
-* In the source language, there is no way to bind a primitive string literal
- at the top level.
-
-* In Core, we have a special rule that permits top-level Addr# bindings. See
- Note [Core top-level string literals]. Core-to-core passes may introduce
- new top-level string literals.
-
-* In STG, top-level string literals are explicitly represented in the syntax
- tree.
-
-* A top-level string literal may end up exported from a module. In this case,
- in the object file, the content of the exported literal is given a label with
- the _bytes suffix.
+For the non-top-level, non-recursive case see Note [Core let-can-float invariant].
 Note [Core let-can-float invariant]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 The let-can-float invariant:
- The right hand side of a non-recursive 'Let'
- /may/ be of unlifted type, but only if
+ The right hand side of a /non-top-level/, /non-recursive/ binding
+ may be of unlifted type, but only if
 the expression is ok-for-speculation or the 'Let' is for a join point.
+ (For top-level or recursive lets see Note [Core letrec invariant].)
+
 This means that the let can be floated around without difficulty. For example, this is OK:
@@ -466,6 +430,53 @@ we need to allow lots of things in the arguments of a call.
 TL;DR: we relaxed the let/app invariant to become the let-can-float invariant.
+Note [Core top-level string literals]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+As an exception to the usual rule that top-level binders must be lifted,
+we allow binding primitive string literals (of type Addr#) of type Addr# at the
+top level. This allows us to share string literals earlier in the pipeline and
+crucially allows other optimizations in the Core2Core pipeline to fire.
+Consider,
+
+ f n = let a::Addr# = "foo"#
+ in \x -> blah
+
+In order to be able to inline `f`, we would like to float `a` to the top.
+Another option would be to inline `a`, but that would lead to duplicating string
+literals, which we want to avoid. See #8472.
+
+The solution is simply to allow top-level unlifted binders. We can't allow
+arbitrary unlifted expression at the top-level though, unlifted binders cannot
+be thunks, so we just allow string literals.
+
+We allow the top-level primitive string literals to be wrapped in Ticks
+in the same way they can be wrapped when nested in an expression.
+CoreToSTG currently discards Ticks around top-level primitive string literals.
+See #14779.
+
+Also see Note [Compilation plan for top-level string literals].
+
+Note [Compilation plan for top-level string literals]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Here is a summary on how top-level string literals are handled by various
+parts of the compilation pipeline.
+
+* In the source language, there is no way to bind a primitive string literal
+ at the top level.
+
+* In Core, we have a special rule that permits top-level Addr# bindings. See
+ Note [Core top-level string literals]. Core-to-core passes may introduce
+ new top-level string literals.
+
+ See GHC.Core.Utils.exprIsTopLevelBindable, and exprIsTickedString
+
+* In STG, top-level string literals are explicitly represented in the syntax
+ tree.
+
+* A top-level string literal may end up exported from a module. In this case,
+ in the object file, the content of the exported literal is given a label with
+ the _bytes suffix.
+
 Note [NON-BOTTOM-DICTS invariant]
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 It is a global invariant (not checkable by Lint) that