1 files changed, 97 insertions, 15 deletions
diff --git a/docs/core-spec/core-spec.mng b/docs/core-spec/core-spec.mng
index d1d89055e7..952a172e6f 100644
--- a/docs/core-spec/core-spec.mng
+++ b/docs/core-spec/core-spec.mng
@@ -30,7 +30,7 @@ System FC, as implemented in GHC\footnote{This
 document was originally prepared by Richard Eisenberg (\texttt{eir@cis.upenn.edu}),
 but it should be maintained by anyone who edits the functions or data structures
 mentioned in this file. Please feel free to contact Richard for more information.}\\
-\Large 23 October, 2015
+\Large 26 January, 2018
 \end{center}
 
 \section{Introduction}
@@ -89,6 +89,10 @@ variables:
 \ottn
 }
 
+\gram{
+\ottl
+}
+
 We sometimes omit the type/kind annotation to a variable when it is obvious from context.
 
 \subsection{Expressions}
@@ -112,13 +116,39 @@ is ok-for-speculation. See \verb|#let_app_invariant#| in \ghcfile{coreSyn/CoreSy
 \item Types and coercions can only appear on the right-hand-side of an application.
 \item The $[[t]]$ form of an expression must not then turn out to be a coercion.
 In other words, the payload inside of a \texttt{Type} constructor must not turn out
-to be built with \texttt{CoercionTy}. 
+to be built with \texttt{CoercionTy}.
+\item Join points (introduced by $[[join]]$ expressions) follow the invariants
+laid out in \verb|Note [Invariants on join points]| in
+\ghcfile{coreSyn/CoreSyn.lhs}:
+\begin{enumerate}
+    \item All occurences must be tail calls. This is enforced in our typing
+    rules using the label environment $[[D]]$.
+    \item Each join point has a \emph{join arity}. In this document, we write
+    each label as $[[p/I_t]]$ for the name $[[p]]$, the type $[[t]]$, and the
+    join arity $[[I]]$. The right-hand side of the binding must begin with at
+    least $[[I]]$ lambdas. This is enforced implicitly in
+    \ottdrulename{Tm\_JoinNonRec} and \ottdrulename{Tm\_JoinRec} by the use of
+    $[[split]]$ meta-function.
+    \item If the binding is recursive, then all other bindings in the recursive
+    group must be join points. We enforce this in our reformulation of the
+    grammar; in the actual AST, a $[[join]]$ is simply a $[[let]]$ where each
+    identifier is flagged as a join id, so this invariant requires that this
+    flag must be consistent across a recursive binding.
+    \item The binding's type must not be polymorphic in its return type. This
+    is expressed in \ottdrulename{Label\_Label}; see
+    Section~\ref{sec:name_consistency}.
+\end{enumerate}
+
 \end{itemize}
 
 Bindings for $[[let]]$ statements:
 
 \gram{\ottbinding}
 
+Bindings for $[[join]]$ statements:
+
+\gram{\ottjbinding}
+
 Case alternatives:
 
 \gram{\ottalt}
@@ -139,9 +169,10 @@ A program is just a list of bindings:
 
 \gram{\ottt}
 
-\ctor{ForAllTy}s are represented in two different ways, depending on whether
-the \ctor{ForAllTy} is anonymous (written $[[t1 -> t2]]$) or
-named (written $[[forall n . t]]$).
+\ctor{FunTy} is the special case for non-dependent function type. The
+\ctor{TyBinder} in \ghcfile{types/TyCoRep.lhs} distinguishes whether a binder is
+anonymous (\ctor{FunTy}) or named (\ctor{ForAllTy}). See
+\verb|Note [TyBinders]| in \ghcfile{types/TyCoRep.lhs}.
 
 There are some invariants on types:
 \begin{itemize}
@@ -152,7 +183,7 @@ $[[T]]$. It should be another application or a type variable.
 does \emph{not} need to be saturated.
 \item A saturated application of $[[(->) t1 t2]]$ should be represented as
 $[[t1 -> t2]]$. This is a different point in the grammar, not just pretty-printing.
-The constructor for a saturated $[[(->)]]$ is \texttt{ForAllTy}.
+The constructor for a saturated $[[(->)]]$ is \texttt{FunTy}.
 \item A type-level literal is represented in GHC with a different datatype than
 a term-level literal, but we are ignoring this distinction here.
 \item A coercion used as a type should appear only in the right-hand side of
@@ -164,7 +195,7 @@ are purely representational. The metatheory would remain the same if these forms
 were removed in favor of $[[t1 t2]]$. Nevertheless, we keep all three forms in
 this documentation to accurately reflect the implementation.
 
-The \texttt{Named} variant of a \texttt{Binder} (the first argument to a
+The \texttt{ArgFlag} field of a \texttt{TyVarBinder} (the first argument to a
 \texttt{ForAllTy}) also tracks visibility of arguments. Visibility affects
 only source Haskell, and is omitted from this presentation.
 
@@ -176,14 +207,14 @@ We use the notation $[[t1 k1~#k2 t2]]$ to stand for $[[(~#) k1 k2 t1 t2]]$.
 
 Invariants on coercions:
 \begin{itemize}
-\item $[[<t1 t2>_R]]$ is used; never $[[<t1>_R <t2>_Nom]]$.
-\item If $[[<T>_R]]$ is applied to some coercions, at least one of which is not
-reflexive, use $[[T_R </ gi // i />]]$, never $[[<T>_R g1 g2]] \ldots$.
+\item $[[<t1 t2>]]$ is used; never $[[<t1> <t2> ]]$.
+\item If $[[<T>]]$ is applied to some coercions, at least one of which is not
+reflexive, use $[[T_R </ gi // i />]]$, never $[[<T> g1 g2]] \ldots$.
 \item The $[[T]]$ in $[[T_R </gi//i/>]]$ is never a type synonym, though it could
 be a type function.
 \item Every non-reflexive coercion coerces between two distinct types.
 \item The name in a coercion must be a term-level name (\ctor{Id}).
-\item The contents of $[[<t>_R]]$ must not be a coercion. In other words,
+\item The contents of $[[<t>]]$ must not be a coercion. In other words,
 the payload in a \texttt{Refl} must not be built with \texttt{CoercionTy}.
 \end{itemize}
 
@@ -201,6 +232,20 @@ for more background.
 
 \gram{\ottR}
 
+The \texttt{GRefl} constructor taks an $[[mg]]$. It wraps a kind coercion, which
+might be reflexive or any coercion:
+
+\gram{\ottmg}
+
+A nominal reflexive coercion is quite common, so we keep the special form
+\texttt{Refl}. Invariants on reflexive coercions:
+
+\begin{itemize}
+\item Always use $[[<t>]]$; never $[[<t> Nom MRefl]]$.
+\item All invariants on $[[<t>]]$ hold for $[[<t> R MRefl]]$.
+\item Use $[[<t> Rep MRefl]]$; never $[[sub <t>]]$.
+\end{itemize}
+
 Is it a left projection or a right projection?
 
 \gram{\ottLorR}
@@ -275,15 +320,21 @@ synonym for $[[TYPE 'Unlifted]]$.
 \section{Contexts}
 
 The functions in \ghcfile{coreSyn/CoreLint.lhs} use the \texttt{LintM} monad.
-This monad contains a context with a set of bound variables $[[G]]$. The
-formalism treats $[[G]]$ as an ordered list, but GHC uses a set as its
+This monad contains a context with a set of bound variables $[[G]]$ and a set
+of bound labels $[[D]]$. The formalism treats $[[G]]$ and $[[D]]$ as ordered
+lists, but GHC uses sets as its
 representation.
 
 \gram{
 \ottG
 }
 
-We assume the Barendregt variable convention that all new variables are
+\gram{
+\ottD
+}
+
+We assume the Barendregt variable convention that all new variables and labels
+are
 fresh in the context. In the implementation, of course, some work is done
 to guarantee this freshness. In particular, adding a new type variable to
 the context sometimes requires creating a new, fresh variable name and then
@@ -313,12 +364,29 @@ Here is the definition of $[[vars_of]]$, taken from \coderef{coreSyn/CoreSyn.lhs
 \end{array}
 \]
 
+Here is the definition of $[[split]]$, which has no direct analogue in the
+source but simplifies the presentation here:
+
+\[
+\begin{array}{ll}
+[[split]]_0 [[t]] &= [[t]] \\
+[[split]]_n ([[s -> t]]) &= [[s]] \; ([[split]]_{n-1} [[t]]) \\
+[[split]]_n ([[forall alpha _ k . t]]) &= [[k]] \; ([[split]]_{n-1} [[t]])
+\end{array}
+\]
+
+The idea is simply to peel off the leading $i$ argument types (which may be
+kinds for type arguments) from a given type and return them in a sequence, with
+the return type (given $i$ arguments) as the final element of the sequence.
+
 \subsection{Binding consistency}
 
 \ottdefnlintXXbind{}
 
 \ottdefnlintSingleBinding{}
 
+\ottdefnlintSingleBindingXXjoins{}
+
 In the GHC source, this function contains a number of other checks, such
 as for strictness and exportability. See the source code for further information.
 
@@ -376,11 +444,19 @@ See Section~\ref{sec:tyconroles} for more information about $[[tyConRolesX]]$, a
 see Section~\ref{sec:axiom_rules} for more information about $[[coaxrProves]]$.
 
 \subsection{Name consistency}
+\label{sec:name_consistency}
 
-There are two very similar checks for names, one declared as a local function:
+There are three very similar checks for names, two performed as part of
+\coderef{coreSyn/CoreLint.hs}{lintSingleBinding}:
 
 \ottdefnlintSingleBindingXXlintBinder{}
 
+\ottdefnlintSingleBindingXXlintBinderXXjoin{}
+
+The point of the extra checks on $[[t']]$ is that a join point's type cannot be
+polymorphic in its return type; see \texttt{Note [The polymorphism rule of join
+points]} in \ghcfile{coreSyn/CoreSyn.hs}.
+
 \ottdefnlintBinder{}
 
 \subsection{Substitution consistency}
@@ -477,6 +553,12 @@ Also note that this semantics implements call-by-name, not call-by-need. So
 while it describes the operational meaning of a term, it does not describe what
 subexpressions are shared, and when.
 
+\subsection{Join points}
+Dealing with join points properly here would be cumbersome and pointless, since
+by design they work no differently than functions as far as FC is concerned.
+Reading $[[join]]$ as $[[let]]$ and $[[jump]]$ as application should tell you
+all need to know.
+
 \subsection{Operational semantics rules}
 
 \ottdefnstep{}