author | Alan Zimmerman <alan.zimm@gmail.com> | 2022-10-23 23:25:36 +0100 |
---|---|---|
committer | Alan Zimmerman <alan.zimm@gmail.com> | 2022-12-06 22:03:53 +0000 |
commit | 171ab320ee673668c937aa62f49a85ac8a26f2f7 (patch) | |
tree | f3911108adf4f8b3e1a9788658340d17493fdd7f | |
parent | a74225803dc4ec14e3aef96cfe5e17bdc5f0d2a7 (diff) | |
Starting on note re exact print annotations (wip/az/epa-note)
-rw-r--r-- | compiler/GHC/Parser/Annotation.hs | 146 |
1 files changed, 146 insertions, 0 deletions
diff --git a/compiler/GHC/Parser/Annotation.hs b/compiler/GHC/Parser/Annotation.hs
index 7b7fccc862..ef1e166236 100644
--- a/compiler/GHC/Parser/Annotation.hs
+++ b/compiler/GHC/Parser/Annotation.hs
@@ -184,6 +184,152 @@ https://gitlab.haskell.org/ghc/ghc/wikis/api-annotations
 
 -}
 
+-- DRAFT NEW VERSION
+{-
+Note [Exactprint annotations]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+In the syntax tree produced by the parser (of type `HsExpr GhcPs`, for
+expressions) the tree is decorated with "ExactPrint Annotations". The
+goal of these annotations is to be able to print the syntax tree
+/precisely/ as it was read in by the parser, including
+* Every keyword placed exactly where it was in the original input
+* Every parenthesis, and every comma
+* Every comment, including Haddock comments
+
+The extra information to achieve this is held in the TTG extension
+fields of each constructor. (See Note [Trees that Grow] in module
+Language.Haskell.Syntax.Extension). For example, the extension field
+of the `HsIf` constructor tells where the `if`, `then`, and `else`
+keywords are placed:
+
+    data HsExpr p = ...
+       | HsIf (XIf p) (LHsExpr p) (LHsExpr p) (LHsExpr p)
+
+    type instance XIf GhcPs = EpAnn AnnsIf
+
+Here the extension field is populated with `EpAnn AnnsIf`:
+* The /specifics/ of the keywords etc for `HsIf` are contained in the `AnnsIf` data type,
+* The /generic requirements/ for exact-print are in the `EpAnn` wrapper, which wraps
+  every exact-print annotation
+
+    data AnnsIf
+      = AnnsIf {
+          aiIf       :: EpaLocation,
+          aiThen     :: EpaLocation,
+          aiElse     :: EpaLocation,
+          aiThenSemi :: Maybe EpaLocation,
+          aiElseSemi :: Maybe EpaLocation
+          } deriving Data
+
+
+    data EpAnn ann
+      = EpAnn { entry    :: !Anchor
+              , anns     :: !ann
+              , comments :: !EpAnnComments
+              }
+
+The EpAnn is a type capturing the generic requirements for exact
+printing, as well as the placement of keywords specific to the
+constructor being annotated. In this case they are
+
+    data EpaLocation = EpaSpan !RealSrcSpan
+                     | EpaDelta !DeltaPos ![LEpaComment]
+
+To start with the generic requirements, the first is the Anchor.
+
+    data Anchor = Anchor { anchor :: RealSrcSpan
+                         , anchor_op :: AnchorOperation }
+
+    data AnchorOperation = UnchangedAnchor
+                         | MovedAnchor DeltaPos
+
+Printing involves going to the appropriate start location, and then
+outputting a representation of the item there. Given the AST is a
+tree, this is an inherently recursive operation.
+
+The anchor exists to identify the appropriate start location. It caters
+to the two scenarios for exact printing:
+  * to reproduce the original source that was parsed
+  * to be able to print a modified AST
+
+To achieve both of these use cases at the same time, the exact printing
+algorithm uses the Anchor as a reference to the "top left" corner of
+the AST fragment to be printed. In the unmodified case, this location
+is simply the `RealSrcSpan` in the `anchor` field, and the `anchor_op`
+is `UnchangedAnchor`. If the AST fragment to be printed has moved, its
+spacing in the printing of its containing AST fragment is captured in a
+`MovedAnchor` operation, with an appropriate `DeltaPos`.
+
+A similar role is played by `EpaLocation` for positioning the
+annotated keywords.
+
+But, the `anchor` RealSrcSpan is /also/ used to calculate the offset
+of the first item to be printed for HsIf (AZ note: in practice this
+offset is always zero, hence we could likely unify EpaLocation and
+Anchor).
+
+Located Items
+~~~~~~~~~~~~~
+
+Historically the exact print annotations were kept out of the AST
+itself, and linked in by a combination of SrcSpan and Constructor
+name. This has carried over to the current version by inserting an
+exact print annotation into every `Located` item in the `GhcPass` AST.
+Now that exact print annotations can be placed in the TTG extension
+points, we can potentially remove some of the ones in locations, if
+they are not needed for other reasons such as error reporting.
+
+The annotations are stored by making use of the `XRec` type family to
+make the association (See Note [XRec and SrcSpans in the AST] in
+Language.Haskell.Syntax.Extension).
+
+    type family XRec p a = r | r -> a
+
+    type family Anno a = b -- See Note [XRec and Anno in the AST] in GHC.Parser.Annotation
+
+    type instance XRec (GhcPass p) a = GenLocated (Anno a) a
+
+To go back to the `HsIf` example, it is a constructor for `HsExpr`.
+
+    type instance Anno (HsExpr (GhcPass p)) = SrcSpanAnnA
+
+This tells us that in an AST belonging to a GHC pass, the `Anno`
+instance is `SrcSpanAnnA`.
+
+i.e. XRec GhcPs HsExpr ~ GenLocated SrcSpanAnnA HsExpr
+
+This is captured in a shorthand form with a type synonym as
+
+    type LocatedA = GenLocated SrcSpanAnnA
+
+In turn, SrcSpanAnnA is a type synonym for a further cascade of types
+
+    type SrcSpanAnnA = SrcAnn AnnListItem
+
+    type SrcAnn ann = SrcSpanAnn' (EpAnn ann)
+
+    data SrcSpanAnn' a = SrcSpanAnn { ann :: !a, locA :: !SrcSpan }
+
+And at the end of the cascade we see
+
+    LocatedA HsExpr
+       ~ GenLocated (SrcSpanAnn { ann: EpAnn AnnListItem, locA: SrcSpan}) HsExpr
+
+So the SrcSpan used in the rest of GHC for error message locations is
+included, as well as an `EpAnn` giving the general exact printing
+requirements and a custom annotation type
+
+    data AnnListItem = AnnListItem {lann_trailing :: [TrailingAnn]}
+
+The `[TrailingAnn]` captures the locations of any mix of commas,
+semicolons or vbars that may follow the given expression in its
+surrounding context. This is not properly a part of the `HsExpr`,
+since it relates to how it is used in its parent structure.
+
+TO BE CONTINUED, input welcome
+-}
+
+
 -- --------------------------------------------------------------------
 
 -- | Exact print annotations exist so that tools can perform source to
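
The note's description of how an `Anchor` picks the start location for printing can be illustrated with a small standalone model. This is a sketch only, not GHC code: `Pos`, `startPos` and its layout-column parameter are hypothetical names, the two `DeltaPos` constructors and their meaning are assumptions made for the example, and the real `Anchor` carries a full `RealSrcSpan` where this model keeps just a start position.

```haskell
-- Toy model of anchor resolution. These are simplified stand-ins for the
-- types named in the note, NOT the definitions in GHC.Parser.Annotation.

-- | A source position as (line, column). Hypothetical helper type.
type Pos = (Int, Int)

-- | Assumed simplification of a relative position.
data DeltaPos
  = SameLine Int           -- ^ columns to advance on the current line
  | DifferentLine Int Int  -- ^ lines to advance, then columns past the layout column
  deriving (Eq, Show)

data AnchorOperation
  = UnchangedAnchor        -- ^ fragment is where the parser found it
  | MovedAnchor DeltaPos   -- ^ fragment was moved; place it relative to the printer
  deriving (Eq, Show)

-- | The note's Anchor holds a RealSrcSpan; only its start is modelled here.
data Anchor = Anchor
  { anchor    :: Pos
  , anchor_op :: AnchorOperation
  } deriving (Eq, Show)

-- | Start position for printing a fragment (hypothetical function).
-- Arguments: the enclosing layout column, the printer's current position,
-- and the fragment's anchor.
startPos :: Int -> Pos -> Anchor -> Pos
startPos layoutCol (curLine, curCol) a =
  case anchor_op a of
    -- Unmodified AST: reuse the original "top left" corner.
    UnchangedAnchor -> anchor a
    -- Modified AST: ignore the stale span and apply the delta instead.
    MovedAnchor (SameLine dc) -> (curLine, curCol + dc)
    MovedAnchor (DifferentLine dl dc) -> (curLine + dl, layoutCol + dc)
```

In this model, `startPos 1 (10, 20) (Anchor (3, 5) UnchangedAnchor)` evaluates to `(3, 5)`, while the same anchor with `MovedAnchor (DifferentLine 1 2)` evaluates to `(11, 3)`: the recorded span is ignored once a fragment has moved, which is the distinction between the two printing scenarios listed in the note.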