Diffstat (limited to 'compiler/GHC/Cmm/cmm-notes')
-rw-r--r-- | compiler/GHC/Cmm/cmm-notes | 184 |
1 files changed, 184 insertions, 0 deletions
diff --git a/compiler/GHC/Cmm/cmm-notes b/compiler/GHC/Cmm/cmm-notes
new file mode 100644
index 0000000000..d664a195b7
--- /dev/null
+++ b/compiler/GHC/Cmm/cmm-notes
@@ -0,0 +1,184 @@
+More notes (Aug 11)
+~~~~~~~~~~~~~~~~~~
+* CmmInfo.cmmToRawCmm expands info tables to their representations
+  (needed for .cmm files as well as the code generators)
+
+* Why is FCode a lazy monad? That makes it inefficient.
+  We want laziness to get code out one procedure at a time,
+  but not at the instruction level.
+  UPDATE (31/5/2016): FCode is strict since 09afcc9b.
+
+Things we did
+  * Remove CmmCvt.graphToZgraph (Conversion from old to new Cmm reps)
+  * Remove HscMain.optionallyConvertAndOrCPS (converted old Cmm to
+    new, ran pipeline, and converted back)
+  * Remove CmmDecl. Put its types in Cmm. Import Cmm into OldCmm
+    so it can get those types.
+
+
+More notes (June 11)
+~~~~~~~~~~~~~~~~~~~~
+
+* In CmmContFlowOpt.branchChainElim, can a single block be the
+  successor of two calls?
+
+* Check in ClosureInfo:
+     -- NB: Results here should line up with the results of SMRep.rtsClosureType
+
+More notes (May 11)
+~~~~~~~~~~~~~~~~~~~
+In CmmNode, consider splitting CmmCall into two: call and jump
+
+Notes on new codegen (Aug 10)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Things to do:
+ - Proc points pass all arguments on the stack, adding more code and
+   slowing down things a lot. We either need to fix this or even better
+   would be to get rid of proc points.
+
+ - Sort out Label, LabelMap, LabelSet versus BlockId, BlockEnv, BlockSet
+   dichotomy. Mostly this means global replace, but we also need to make
+   Label an instance of Outputable (probably in the Outputable module).
+
+   EZY: We should use Label, since that's the terminology Hoopl uses.
+
+ - AsmCodeGen has a generic Cmm optimiser; move this into new pipeline
+   EZY (2011-04-16): The mini-inliner has been generalized and ported,
+   but the constant folding and other optimizations need to still be
+   ported.
+
+ - AsmCodeGen has post-native-cg branch eliminator (shortCutBranches);
+   we ultimately want to share this with the Cmm branch eliminator.
+
+ - At the moment, references to global registers like Hp are "lowered"
+   late (in CgUtils.fixStgRegisters). We should do this early, in the
+   new native codegen, much in the way that we lower calling conventions.
+   Might need to be a bit sophisticated about aliasing.
+
+ - Move to new Cmm rep:
+     * Make native CG consume New Cmm;
+     * Convert Old Cmm->New Cmm to keep old path alive
+     * Produce New Cmm when reading in .cmm files
+
+ - Top-level SRT threading is a bit ugly
+
+ - See "CAFs" below; we want to totally refactor the way SRTs are calculated
+
+ - Garbage-collect https://gitlab.haskell.org/ghc/ghc/wikis/commentary/compiler/cps
+   moving good stuff into
+   https://gitlab.haskell.org/ghc/ghc/wikis/commentary/compiler/new-code-gen-pipeline
+
+ - Currently AsmCodeGen top level calls AsmCodeGen.cmmToCmm, which is a small
+   C-- optimiser. It has quite a lot of boilerplate folding code in AsmCodeGen
+   (cmmBlockConFold, cmmStmtConFold, cmmExprConFold), before calling out to
+   CmmOpt. ToDo: see what optimisations are being done; and do them before
+   AsmCodeGen.
+
+ - If we stick CAF and stack liveness info on a LastCall node (not LastRet/Jump)
+   then all CAF and stack liveness stuff can be completed before we split
+   into separate C procedures.
+
+   Short term:
+     compute and attach liveness into LastCall
+     right at end, split, cvt to old rep
+     [must split before cvt, because old rep is not expressive enough]
+
+   Longer term:
+     when old rep disappears,
+     move the whole splitting game into the C back end *only*
+         (guided by the procpoint set)
+
+----------------------------------------------------
+        Proc-points
+----------------------------------------------------
+
+Consider this program, which has a diamond control flow,
+with a call on one branch
+   fn(p,x) {
+        h()
+        if b then { ... f(x) ...; q=5; goto J }
+             else { ...; q=7; goto J }
+     J: ..p...q...
+   }
+then the join point J is a "proc-point". So, is 'p' passed to J
+as a parameter? Or, if 'p' was saved on the stack anyway, perhaps
+to keep it alive across the call to h(), maybe 'p' gets communicated
+to J that way. This is an awkward choice. (We think that we currently
+never pass variables to join points via arguments.)
+
+Furthermore, there is *no way* to pass q to J in a register (other
+than a parameter register).
+
+What we want is to do register allocation across the whole caboodle.
+Then we could drop all the code that deals with the above awkward
+decisions about spilling variables across proc-points.
+
+Note that J doesn't need an info table.
+
+What we really want is for each LastCall (not LastJump/Ret)
+to have an info table. Note that ProcPoints that are not successors
+of calls don't need an info table.
+
+Figuring out proc-points
+~~~~~~~~~~~~~~~~~~~~~~~~
+Proc-points are identified by
+GHC.Cmm.ProcPoint.minimalProcPointSet/extendPPSet. Although there isn't
+that much code, JD thinks that it could be done much more nicely using
+a dominator analysis, using the Dataflow Engine.
+
+----------------------------------------------------
+        CAFs
+----------------------------------------------------
+
+* The code for a procedure f may refer to either the *closure*
+  or the *entry point* of another top-level procedure g.
+  If f is live, then so is g. f's SRT must include g's closure.
+
+* The CLabel for the entry-point/closure reveals whether g is
+  a CAF (or refers to CAFs). See the IdLabel constructor of CLabel.
+
+* The CAF-ness of the original top-level definitions is figured out
+  (by GHC.Iface.Tidy) before we generate C--. This CafInfo is only set for
+  top-level Ids; nested bindings stay with MayHaveCafRefs.
+
+* Currently an SRT contains (only) pointers to (top-level) closures.
+
+* Consider this Core code
+        f = \x -> let g = \y -> ...x...y...h1...
+                  in ...h2...g...
+  and suppose that h1, h2 have IdInfo of MayHaveCafRefs.
+  Therefore, so will f. But g will not (since it's nested).
+
+  This generates C-- roughly like this:
+        f_closure: .word f_entry
+        f_entry() [info-tbl-for-f] { ...jump g_entry...jump h2... }
+        g_entry() [info-tbl-for-g] { ...jump h1... }
+
+  Note that there is no top-level closure for g (only an info table).
+  This fact (whether or not there is a top-level closure) is recorded
+  in the InfoTable attached to the CmmProc for f, g
+  INVARIANT:
+     Any out-of-Group reference to an IdLabel goes to
+     a Proc whose InfoTable says "I have a top-level closure".
+  Equivalently:
+     A CmmProc whose InfoTable says "I do not have a top-level
+     closure" is referred to only from its own Group.
+
+* So: info-tbl-for-f must have an SRT that keeps h1,h2 alive
+      info-tbl-for-g must have an SRT that keeps h1 (only) alive
+
+  But if we just look for the free CAF refs, we get:
+        f   h2 (only)
+        g   h1
+
+  So we need to do a transitive closure thing to flesh out
+  f's keep-alive refs to include h1.
+
+* The SRT info is the C_SRT field of Cmm.ClosureTypeInfo in a
+  CmmInfoTable attached to each CmmProc. CmmPipeline.toTops actually does
+  the attaching, right at the end of the pipeline. The C_SRT part
+  gives offsets within a single, shared table of closure pointers.
+
+* DECIDED: we can generate SRTs based on the final Cmm program
+  without knowledge of how it is generated.
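
The "transitive closure thing" in the CAFs notes can be illustrated with a small sketch. This is not GHC's SRT code: `Label`, `ProcRefs`, `reachable`, and `keepAlive` are hypothetical names invented for the illustration, and labels are plain strings rather than CLabels. It assumes each procedure records its own free CAF refs plus the nested procedures (those without a top-level closure) that it jumps to.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- Hypothetical stand-in for CLabel; for illustration only.
type Label = String

-- What we can read off one procedure's own code.
data ProcRefs = ProcRefs
  { directCafRefs :: Set.Set Label  -- free CAF refs, e.g. h2 for f, h1 for g
  , nestedCalls   :: Set.Set Label  -- nested procs it jumps to, e.g. g for f
  }

-- All procedures reachable from 'start' via nestedCalls (including 'start').
-- The 'seen' set makes this terminate on cyclic call graphs.
reachable :: Map.Map Label ProcRefs -> Label -> Set.Set Label
reachable procs start = go Set.empty [start]
  where
    go seen [] = seen
    go seen (l : rest)
      | l `Set.member` seen = go seen rest
      | otherwise =
          let next = maybe [] (Set.toList . nestedCalls) (Map.lookup l procs)
          in go (Set.insert l seen) (next ++ rest)

-- Keep-alive refs for a procedure: the union of the direct CAF refs of
-- every procedure reachable from it.
keepAlive :: Map.Map Label ProcRefs -> Label -> Set.Set Label
keepAlive procs l =
  Set.unions
    [ directCafRefs p
    | r <- Set.toList (reachable procs l)
    , Just p <- [Map.lookup r procs]
    ]

-- The f/g example from the notes.
example :: Map.Map Label ProcRefs
example = Map.fromList
  [ ("f", ProcRefs (Set.fromList ["h2"]) (Set.fromList ["g"]))
  , ("g", ProcRefs (Set.fromList ["h1"]) Set.empty)
  ]
```

With this model, `keepAlive example "f"` contains both h1 and h2, while `keepAlive example "g"` contains h1 only, matching the SRT contents the notes ask for.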