summaryrefslogtreecommitdiff
path: root/docs/core-spec/core-spec.mng
diff options
context:
space:
mode:
Diffstat (limited to 'docs/core-spec/core-spec.mng')
-rw-r--r--docs/core-spec/core-spec.mng112
1 files changed, 97 insertions, 15 deletions
diff --git a/docs/core-spec/core-spec.mng b/docs/core-spec/core-spec.mng
index d1d89055e7..952a172e6f 100644
--- a/docs/core-spec/core-spec.mng
+++ b/docs/core-spec/core-spec.mng
@@ -30,7 +30,7 @@ System FC, as implemented in GHC\footnote{This
document was originally prepared by Richard Eisenberg (\texttt{eir@cis.upenn.edu}),
but it should be maintained by anyone who edits the functions or data structures
mentioned in this file. Please feel free to contact Richard for more information.}\\
-\Large 23 October, 2015
+\Large 26 January, 2018
\end{center}
\section{Introduction}
@@ -89,6 +89,10 @@ variables:
\ottn
}
+\gram{
+\ottl
+}
+
We sometimes omit the type/kind annotation to a variable when it is obvious from context.
\subsection{Expressions}
@@ -112,13 +116,39 @@ is ok-for-speculation. See \verb|#let_app_invariant#| in \ghcfile{coreSyn/CoreSy
\item Types and coercions can only appear on the right-hand-side of an application.
\item The $[[t]]$ form of an expression must not then turn out to be a coercion.
In other words, the payload inside of a \texttt{Type} constructor must not turn out
-to be built with \texttt{CoercionTy}.
+to be built with \texttt{CoercionTy}.
+\item Join points (introduced by $[[join]]$ expressions) follow the invariants
+laid out in \verb|Note [Invariants on join points]| in
+\ghcfile{coreSyn/CoreSyn.lhs}:
+\begin{enumerate}
+ \item All occurences must be tail calls. This is enforced in our typing
+ rules using the label environment $[[D]]$.
+ \item Each join point has a \emph{join arity}. In this document, we write
+ each label as $[[p/I_t]]$ for the name $[[p]]$, the type $[[t]]$, and the
+ join arity $[[I]]$. The right-hand side of the binding must begin with at
+ least $[[I]]$ lambdas. This is enforced implicitly in
+ \ottdrulename{Tm\_JoinNonRec} and \ottdrulename{Tm\_JoinRec} by the use of
+ $[[split]]$ meta-function.
+ \item If the binding is recursive, then all other bindings in the recursive
+ group must be join points. We enforce this in our reformulation of the
+ grammar; in the actual AST, a $[[join]]$ is simply a $[[let]]$ where each
+ identifier is flagged as a join id, so this invariant requires that this
+ flag must be consistent across a recursive binding.
+ \item The binding's type must not be polymorphic in its return type. This
+ is expressed in \ottdrulename{Label\_Label}; see
+ Section~\ref{sec:name_consistency}.
+\end{enumerate}
+
\end{itemize}
Bindings for $[[let]]$ statements:
\gram{\ottbinding}
+Bindings for $[[join]]$ statements:
+
+\gram{\ottjbinding}
+
Case alternatives:
\gram{\ottalt}
@@ -139,9 +169,10 @@ A program is just a list of bindings:
\gram{\ottt}
-\ctor{ForAllTy}s are represented in two different ways, depending on whether
-the \ctor{ForAllTy} is anonymous (written $[[t1 -> t2]]$) or
-named (written $[[forall n . t]]$).
+\ctor{FunTy} is the special case for non-dependent function type. The
+\ctor{TyBinder} in \ghcfile{types/TyCoRep.lhs} distinguishes whether a binder is
+anonymous (\ctor{FunTy}) or named (\ctor{ForAllTy}). See
+\verb|Note [TyBinders]| in \ghcfile{types/TyCoRep.lhs}.
There are some invariants on types:
\begin{itemize}
@@ -152,7 +183,7 @@ $[[T]]$. It should be another application or a type variable.
does \emph{not} need to be saturated.
\item A saturated application of $[[(->) t1 t2]]$ should be represented as
$[[t1 -> t2]]$. This is a different point in the grammar, not just pretty-printing.
-The constructor for a saturated $[[(->)]]$ is \texttt{ForAllTy}.
+The constructor for a saturated $[[(->)]]$ is \texttt{FunTy}.
\item A type-level literal is represented in GHC with a different datatype than
a term-level literal, but we are ignoring this distinction here.
\item A coercion used as a type should appear only in the right-hand side of
@@ -164,7 +195,7 @@ are purely representational. The metatheory would remain the same if these forms
were removed in favor of $[[t1 t2]]$. Nevertheless, we keep all three forms in
this documentation to accurately reflect the implementation.
-The \texttt{Named} variant of a \texttt{Binder} (the first argument to a
+The \texttt{ArgFlag} field of a \texttt{TyVarBinder} (the first argument to a
\texttt{ForAllTy}) also tracks visibility of arguments. Visibility affects
only source Haskell, and is omitted from this presentation.
@@ -176,14 +207,14 @@ We use the notation $[[t1 k1~#k2 t2]]$ to stand for $[[(~#) k1 k2 t1 t2]]$.
Invariants on coercions:
\begin{itemize}
-\item $[[<t1 t2>_R]]$ is used; never $[[<t1>_R <t2>_Nom]]$.
-\item If $[[<T>_R]]$ is applied to some coercions, at least one of which is not
-reflexive, use $[[T_R </ gi // i />]]$, never $[[<T>_R g1 g2]] \ldots$.
+\item $[[<t1 t2>]]$ is used; never $[[<t1> <t2> ]]$.
+\item If $[[<T>]]$ is applied to some coercions, at least one of which is not
+reflexive, use $[[T_R </ gi // i />]]$, never $[[<T> g1 g2]] \ldots$.
\item The $[[T]]$ in $[[T_R </gi//i/>]]$ is never a type synonym, though it could
be a type function.
\item Every non-reflexive coercion coerces between two distinct types.
\item The name in a coercion must be a term-level name (\ctor{Id}).
-\item The contents of $[[<t>_R]]$ must not be a coercion. In other words,
+\item The contents of $[[<t>]]$ must not be a coercion. In other words,
the payload in a \texttt{Refl} must not be built with \texttt{CoercionTy}.
\end{itemize}
@@ -201,6 +232,20 @@ for more background.
\gram{\ottR}
+The \texttt{GRefl} constructor taks an $[[mg]]$. It wraps a kind coercion, which
+might be reflexive or any coercion:
+
+\gram{\ottmg}
+
+A nominal reflexive coercion is quite common, so we keep the special form
+\texttt{Refl}. Invariants on reflexive coercions:
+
+\begin{itemize}
+\item Always use $[[<t>]]$; never $[[<t> Nom MRefl]]$.
+\item All invariants on $[[<t>]]$ hold for $[[<t> R MRefl]]$.
+\item Use $[[<t> Rep MRefl]]$; never $[[sub <t>]]$.
+\end{itemize}
+
Is it a left projection or a right projection?
\gram{\ottLorR}
@@ -275,15 +320,21 @@ synonym for $[[TYPE 'Unlifted]]$.
\section{Contexts}
The functions in \ghcfile{coreSyn/CoreLint.lhs} use the \texttt{LintM} monad.
-This monad contains a context with a set of bound variables $[[G]]$. The
-formalism treats $[[G]]$ as an ordered list, but GHC uses a set as its
+This monad contains a context with a set of bound variables $[[G]]$ and a set
+of bound labels $[[D]]$. The formalism treats $[[G]]$ and $[[D]]$ as ordered
+lists, but GHC uses sets as its
representation.
\gram{
\ottG
}
-We assume the Barendregt variable convention that all new variables are
+\gram{
+\ottD
+}
+
+We assume the Barendregt variable convention that all new variables and labels
+are
fresh in the context. In the implementation, of course, some work is done
to guarantee this freshness. In particular, adding a new type variable to
the context sometimes requires creating a new, fresh variable name and then
@@ -313,12 +364,29 @@ Here is the definition of $[[vars_of]]$, taken from \coderef{coreSyn/CoreSyn.lhs
\end{array}
\]
+Here is the definition of $[[split]]$, which has no direct analogue in the
+source but simplifies the presentation here:
+
+\[
+\begin{array}{ll}
+[[split]]_0 [[t]] &= [[t]] \\
+[[split]]_n ([[s -> t]]) &= [[s]] \; ([[split]]_{n-1} [[t]]) \\
+[[split]]_n ([[forall alpha _ k . t]]) &= [[k]] \; ([[split]]_{n-1} [[t]])
+\end{array}
+\]
+
+The idea is simply to peel off the leading $i$ argument types (which may be
+kinds for type arguments) from a given type and return them in a sequence, with
+the return type (given $i$ arguments) as the final element of the sequence.
+
\subsection{Binding consistency}
\ottdefnlintXXbind{}
\ottdefnlintSingleBinding{}
+\ottdefnlintSingleBindingXXjoins{}
+
In the GHC source, this function contains a number of other checks, such
as for strictness and exportability. See the source code for further information.
@@ -376,11 +444,19 @@ See Section~\ref{sec:tyconroles} for more information about $[[tyConRolesX]]$, a
see Section~\ref{sec:axiom_rules} for more information about $[[coaxrProves]]$.
\subsection{Name consistency}
+\label{sec:name_consistency}
-There are two very similar checks for names, one declared as a local function:
+There are three very similar checks for names, two performed as part of
+\coderef{coreSyn/CoreLint.hs}{lintSingleBinding}:
\ottdefnlintSingleBindingXXlintBinder{}
+\ottdefnlintSingleBindingXXlintBinderXXjoin{}
+
+The point of the extra checks on $[[t']]$ is that a join point's type cannot be
+polymorphic in its return type; see \texttt{Note [The polymorphism rule of join
+points]} in \ghcfile{coreSyn/CoreSyn.hs}.
+
\ottdefnlintBinder{}
\subsection{Substitution consistency}
@@ -477,6 +553,12 @@ Also note that this semantics implements call-by-name, not call-by-need. So
while it describes the operational meaning of a term, it does not describe what
subexpressions are shared, and when.
+\subsection{Join points}
+Dealing with join points properly here would be cumbersome and pointless, since
+by design they work no differently than functions as far as FC is concerned.
+Reading $[[join]]$ as $[[let]]$ and $[[jump]]$ as application should tell you
+all need to know.
+
\subsection{Operational semantics rules}
\ottdefnstep{}