summaryrefslogtreecommitdiff
path: root/docs/backpack/algorithm.tex
diff options
context:
space:
mode:
Diffstat (limited to 'docs/backpack/algorithm.tex')
-rw-r--r--docs/backpack/algorithm.tex1392
1 files changed, 0 insertions, 1392 deletions
diff --git a/docs/backpack/algorithm.tex b/docs/backpack/algorithm.tex
deleted file mode 100644
index 1c5adbd91f..0000000000
--- a/docs/backpack/algorithm.tex
+++ /dev/null
@@ -1,1392 +0,0 @@
-% TODO
-% - definite typechecker/compiler implies it only looks at the
-% installed unit database, not TRUE
-
-
-\documentclass{article}
-
-\usepackage{pifont}
-\usepackage{graphicx} %[pdftex] OR [dvips]
-\usepackage{fullpage}
-\usepackage{wrapfig}
-\usepackage{float}
-\usepackage{titling}
-\usepackage{hyperref}
-\usepackage{tikz}
-\usepackage{color}
-\usepackage{footnote}
-\usepackage{float}
-\usepackage{algpseudocode}
-\usepackage{bigfoot}
-\usepackage{amssymb}
-\usepackage{amsmath}
-\usepackage{framed}
-
-% Alter some LaTeX defaults for better treatment of figures:
-% See p.105 of "TeX Unbound" for suggested values.
-% See pp. 199-200 of Lamport's "LaTeX" book for details.
-% General parameters, for ALL pages:
-\renewcommand{\topfraction}{0.9} % max fraction of floats at top
-\renewcommand{\bottomfraction}{0.8} % max fraction of floats at bottom
-% Parameters for TEXT pages (not float pages):
-\setcounter{topnumber}{2}
-\setcounter{bottomnumber}{2}
-\setcounter{totalnumber}{4} % 2 may work better
-\setcounter{dbltopnumber}{2} % for 2-column pages
-\renewcommand{\dbltopfraction}{0.9} % fit big float above 2-col. text
-\renewcommand{\textfraction}{0.07} % allow minimal text w. figs
-% Parameters for FLOAT pages (not text pages):
-\renewcommand{\floatpagefraction}{0.7} % require fuller float pages
-% N.B.: floatpagefraction MUST be less than topfraction !!
-\renewcommand{\dblfloatpagefraction}{0.7} % require fuller float pages
-
-% remember to use [htp] or [htpb] for placement
-
-\newcommand{\I}[1]{\ensuremath{\mathit{#1}}}
-
-\newcommand{\optionrule}{\noindent\rule{1.0\textwidth}{0.75pt}}
-
-\newenvironment{aside}
- {\begin{figure}\def\FrameCommand{\hspace{2em}}
- \MakeFramed{\advance\hsize-\width}\optionrule\small}
-{\par\vskip-\smallskipamount\optionrule\endMakeFramed\end{figure}}
-
-\setlength{\droptitle}{-6em}
-
-\newcommand{\Red}[1]{{\color{red} #1}}
-
-\title{The Backpack algorithm}
-
-\begin{document}
-
-\maketitle
-
-\noindent Backpack introduces the \emph{Backpack language}, which is
-used to define \emph{Backpack units} that provide some modules given
-implementations for some signatures. A Backpack unit can be
-immediately typechecked and elaborated into core; at some later point,
-each unit may then be compiled one or more times with different
-instantiations of its requirements.
-
-A unit is typechecked against the indefinite unit database
-(and the installed unit database, for units that have no
-requirements) in three steps:
-
-\begin{enumerate}
- \item The \textbf{dependency solver} computes the module and unit
- dependency structure of the declarations in a unit. Specifically,
- it produces (1) the set of \I{ModuleName}s which are required
- by the unit, (2) a directed acyclic graph labeled by \I{Module} of
- the modules and signatures of the unit, and (3) a directed (and acyclic, for now)
- graph labeled by \I{UnitId} of the includes of the unit.
-
- \item The \textbf{shaper} takes each module/signature in the DAG and
- computes its \I{Shape}, i.e., the list of \I{AvailInfo}s which are
- provided/required by each module/signature (respectively). This
- step is interleaved with the next step.
-
- \item The \textbf{type checker} takes each
- module/signature in the DAG (annotated with its shape) and type
- checks it. Cabal will save these type checking results
- in the indefinite unit database under the \I{ComponentId} associated
- with this unit.
-\end{enumerate}
-
-At some later point in time, a unit may be compiled against
-the installed unit database, if the user specifies a mapping which
-instantiates the requirements of a unit (a mapping from \I{ModuleName}s to \I{Module}s):
-
-\begin{enumerate}
- \item The \textbf{dependency solver} operates as before, but it also
- is responsible for computing the full set of \I{InstalledUnitId}s which
- must be compiled before this unit can be compiled, and compiling
- them.
-
- \item The \textbf{shaper} is not needed. (ToDo: except for recursive
- module loops.)
-
- \item The \textbf{type checker and compiler} takes each
- module/signature in the DAG, and typechecks and compiles the unit.
- Cabal will save the type checking results and object
- code to the installed unit database under the \I{InstalledUnitId}
- associated with this (instantiated) unit.
-\end{enumerate}
-
-\Red{ToDo: Rewrite this for added clarity}
-
-\newpage
-
-\section{Front-end syntax}
-
-\begin{figure}[htpb]
-$$
-\begin{array}{rcll}
-p,q,r && \mbox{Component names} \\
-m,n && \mbox{Module names} \\[1em]
-\multicolumn{3}{l}{\mbox{\bf Units}} \\
- \I{unit} & ::= & \verb|unit|\; p\; [\I{provreq}]\; \verb|where {| d_1 \verb|;| \ldots \verb|;| d_n \verb|}| \\[1em]
-\multicolumn{3}{l}{\mbox{\bf Declarations}} \\
- d & ::= & \verb|module|\; m \; [exports]\; \verb|where|\; \I{body} \\
- & | & \verb|signature|\; m \; [exports]\; \verb|where|\; \I{body} \\
- & | & \verb|include|\; p \; [provreq] \\[1em]
-\multicolumn{3}{l}{\mbox{\bf Provides/requires specification}} \\
-\I{provreq} & ::= & \verb|(| \, \I{rns} \, \verb|)| \;
- [ \verb|requires(|\, \I{rns} \, \verb|)| ] \\
-\I{rns} & ::= & \I{rn}_0 \verb|,| \, \ldots \verb|,| \, \I{rn}_n [\verb|,|] & \mbox{Renamings} \\
-\I{rn} & ::= & m\; \verb|as| \; n & \mbox{Renaming} \\[1em]
-\multicolumn{3}{l}{\mbox{\bf Haskell code}} \\
-\I{exports} & & \mbox{A Haskell module export list} \\
-\I{body} & & \mbox{A Haskell module body} \\
-\end{array}
-$$
-\caption{Syntax of Backpack} \label{fig:syntax}
-\end{figure}
-
-A (slightly simplified) syntax of Backpack is given in Figure~\ref{fig:syntax}.
-
-\newpage
-\section{Unit databases and the unit renamer}
-
-\begin{figure}[htpb]
-$$
-\begin{array}{rcll}
-\multicolumn{3}{l}{\mbox{\bf Identifiers}} \\
- \I{ComponentId} & ::= & \mbox{Opaque unique identifier} \\
- \I{UnitId} & ::= & \I{ComponentId} \verb|(|\, \I{HoleMap}\, \verb|)| \\
- & | & \verb|hole| \\
- \I{HoleMap} & ::= & \I{ModuleName}\; \verb|->|\; \I{Module}\; \verb|,|\, \ldots \\
- \I{Module} & ::= & \I{UnitId} \verb|:| \I{ModuleName} \\
- \I{InstalledUnitId} & ::= & \I{UnitId} \quad \mbox{(with no occurrences of \texttt{hole})} \\[1em]
-\multicolumn{3}{l}{\mbox{\bf Unit databases}} \\
- \I{ComponentDb} & ::= & \I{ComponentId} \; \verb|->| \; \I{ComponentRecord} \verb|,|\, \ldots \\
- \I{InstalledUnitDb} & ::= & \I{InstalledUnitId} \; \verb|->| \; \I{InstalledUnitRecord} \verb|,|\, \ldots \\
-\end{array}
-$$
-\caption{Unit identification} \label{fig:ids}
-\end{figure}
-
-\subsection{The component and unit database}
-
-To install a package so that it is available when other packages are compiled,
-we must record it in some sort of database (which the compiler will query later).
-The obvious design for such a database is for it to record \emph{installed packages},
-each of which has a collection of object files and interfaces from a compilation.
-However, with Backpack, we have to refine this picture in two ways:
-
-\begin{enumerate}
- \item A package is always a unit of distribution: something that has
- single authorship and is uploaded to Hackage. It would seriously hamper
- modular programming in the small, however, if you always had to create
- a new package to abstract over some requirements. So we say that a
- package can define multiple \emph{components}. The \emph{component database}
- records typechecked components.
- \item In Backpack, it may not be possible to \emph{compile} a component
- at install time: it may depend on some (as yet) unspecified holes.
- A component that is compiled to a specific instantiation is called a
- \emph{unit}. Thus, we maintain a second \emph{installed unit database}
- which records compiled units; the \emph{component database} contains
- typechecked-only components. (In practice, these two database are
- stored together, as components with no holes live both in the component
- database and the installed unit database.)
-\end{enumerate}
-%
-Thus, there are four closely related types of identifier to be aware of:
-
-\begin{description}
-\item[Component IDs] A component ID uniquely represents a component
-in a Cabal package, including the name and version of the containing package,
-the transitive dependencies of the component, and even the build information
-for the component. This ID is opaque to GHC and selected by Cabal
-(although GHC may take a component ID and suffix it with a unit name to
-derive a new component ID.) Component IDs identity entries in the
-\textbf{component database}, which contains the results of typechecking
-a component, but no actual object code. However, it does contain the
-elaborated source, so that it can be built into actual code when
-the requirement is filled.
-
-\item[Unit ID] A unit ID is a component ID augmented with a
-\I{HoleMap}, which says how the requirements of the component are
-instantiated. Every component ID induces a unit ID, where each
-requirement is filled with a fictitious unit \verb|hole|: when we
-typecheck a component for the component database, it is as if we are
-``compiling'' it instantiated with holes. Units instantiated with holes
-are never installed; they are strictly for type-checking (although we
-could generate code for them, which could be linked if the \verb|hole|
-symbols are rewritten to their true destinations).
-
-\item[Installed unit IDs] An installed unit ID is a unit ID which has
-no \verb|hole|s; it identifies a unit that can be compiled. The
-\textbf{installed unit database} caches the compilation results of these
-units, so that if a unit is compiled multiple times with the same
-instantiation, this code can be reused. (This database most closely
-resembles the existing installed package database with GHC today.)
-\end{description}
-
-\subsection{The unit renamer}
-
-The unit renamer is responsible for transforming component names in a
-Backpack file into \I{ComponentId}s, so that they can be uniquely
-identified in the component database.
-Given a base \I{ComponentId} of the library component we are compiling (\I{ThisComponentId})
-and a mapping from $p$ to \I{ComponentId} (\I{ComponentNameMap}), we rename
-as follows:
-
-\begin{itemize}
- \item Every unit declaration $\verb|unit|\; p$ is renamed to $\I{ThisComponentId}\, \verb|-|\, p$.
- \item Every unit include $\verb|include|\; p$ is renamed to $\I{ThisComponentId}\, \verb|-|\, p$ if $p$ was declared in the Backpack file; otherwise it is renamed according to \I{ComponentNameMap}.
-\end{itemize}
-
-The \I{ComponentNameMap} is entirely user specified, so there is a great deal
-of flexibility on how it can be created, but the convention we expect to
-be used by Cabal is that a component name $p$ corresponds to the same-named
-unit in a \emph{package} named $p$. Packages that don't use Backpack
-only have one component, the library component, which has the same name as package.
-
-\paragraph{Notational conventions}
-In the rest of this document, when it is unambiguous, we will use $p$, $q$ and $r$
-interchangeably with \I{ComponentId}, as after unit renaming, there
-are no occurrences of component names.
-
-\newpage
-\section{Dependency solver}
-
-\begin{figure}[htpb]
-$$
-\begin{array}{rcll}
- \tilde{d} & ::= & \verb|module|\; Module \; [exports]\; \verb|where|\; \I{body} \\
- & | & \verb|signature|\; \I{Module} \; [exports]\; \verb|where|\; \I{body} \\
- & | & \verb|merge|\; \I{Module} \\
- & | & \verb|include|\; \I{UnitId} \\
- \I{ComponentRecord}^{\mathsf{dep}} & ::= & \verb|provides:|\; m\; \verb|->|\; \I{Module}\verb|,|\, \ldots\\
- & & \verb|requires:|\; m\verb|,|\, \ldots
-\end{array}
-$$
-\caption{Resolved declarations} \label{fig:resolved}
-\end{figure}
-
-The dependency solver computes the unfilled requirements of a component, a
-dependency DAG on the modules/signatures in the component, and a dependency
-DAG on the includes in the component. We assume every referenced $p$ in the
-component must be recorded in the component database, such that we can
-look up $\I{ComponentRecord}^{\mathsf{dep}}$.
-
-\paragraph{Computing unfilled requirements} The unfilled requirements are $R-P$, where $R$ and $P$ are sets of module names computed from the declarations in the following manner:
-
-\begin{itemize}
- \item $\verb|include|\; p$: union the (domain of the) provisions with $P$ and the requirements with $R$.
- \item $\verb|module|\; m$: add $m$ to $P$.
- \item $\verb|signature|\; m$: add $m$ to $R$.
-\end{itemize}
-
-\paragraph{Declaration dependency graph}
-We define a graph where the nodes are as described in Figure~\ref{fig:resolved}:
-there is a node per
-for each module and signature, as well as an extra ``merge'' node for
-every unfilled requirement, which merges the interfaces of a local signature and
-any requirements brought in from includes.
-%
-Each node is identified by the tuple $\left(\I{Module}, \I{IsSource?}\right)$, where
-the \I{Module} of a declaration $m$ in component $p$ is \verb|p(H):m|, where $H$ maps
-each unfilled requirement $n$ to \verb|hole:n|, and \I{IsSource?} is true only for signatures.
-
-The edges of the directed graph signify a ``depends on'' relation, and are
-defined as follows:
-
-\begin{itemize}
- \item A module/signature $m$ depends on a module/signature merge $n$ if $m$ imports $n$.
- \item A module/signature $m$ depends on a signature $n$ if $m$ \verb|{-# SOURCE #-}| imports $n$.
- \item A module/signature merge $m$ depends on a local signature $m$ (if it exists).
-\end{itemize}
-%
-If the resulting graph has a cycle, this is an error.
-
-\paragraph{Include dependency graph} We define an dependency graph
-between includes, where an $\verb|include|\; p$ depends on
-$\verb|include|\; q$ if, for some module name $m$, $p$ requires $m$ and
-$q$ provides $m$. If there is a cyclic, then there is cross-component
-mutual recursion: for now, this is an error.
-
-Assuming an acyclic graph, we can compute the \I{UnitId} of each
-key as follows. Initialize $\Gamma$, a substitution from holes to \I{Module},
-to the identity substitution. For each $\verb|include|\; p$ in topological
-order, define its \I{UnitId} to be $p$ with the mapping $\Gamma$ with its
-domain restricted to the requirements of $p$. Then, for each provision
-$m\; \verb|->|\; \I{Module}$, update $\Gamma$ so that
-$\Gamma(m) = \operatorname{subst} (\Gamma, \I{Module})$
-(where $\operatorname{subst}$ recursively applies the substitution $\Gamma$ in \I{Module}).
-
-During compilation, the include dependency graph is useful for
-determining a compilation order for included units.
-
-\newpage
-\section{Requirement calculation}
-
-\Red{to write}
-
-\newpage
-\section{Shaping pass}
-
-\begin{figure}[htpb]
-$$
-\begin{array}{rcll}
-\I{Shape} & ::= & \verb|provides:|\; m \; \verb|->|\; \I{Module}\; \I{ModShape} \verb|;|\; \ldots \\
- & & \verb|requires:| \; m \; \verb|->|\; \textcolor{white}{\I{Module}}\; \I{ModShape} \verb|;|\; \ldots \\
-\I{ModShape} & ::= & \I{AvailInfo}_0 \verb|,|\, \ldots \verb|,|\, \I{AvailInfo}_n \\
-\I{AvailInfo} & ::= & \I{Name} & \mbox{Plain identifiers} \\
- & | & \I{Name} \, \verb|{| \, \I{Name}_0\verb|,| \, \ldots\verb|,| \, \I{Name}_n \, \verb|}| & \mbox{Type constructors} \\
-\I{Name} & ::= & \I{Module} \verb|.| \I{OccName} \\
-\I{OccName} & & \mbox{Unqualified name in a namespace} \\
-\I{ComponentRecord}^{\mathsf{shape}} & ::= & \I{Shape}
-\end{array}
-$$
-\caption{Shaping} \label{fig:shaping}
-\end{figure}
-
-The shaping pass computes the export \I{AvailInfo}s for each node in
-the dependency graph; collectively, these form the \I{Shape} of
-the unit described in Figure~\ref{fig:shaping}. Equivalently,
-the \I{Shape} of unit specifies what a unit requires and provides
-at the Haskell declaration level.
-
-An \I{AvailInfo} names a Haskell declaration that may be exported.
-It may be a plain identifier \I{Name}, or it may be a type constructor,
-in which case it has children \I{Name}s representing the names of the
-data constructors, record selectors, etc. This level of hierarchy
-makes it possible to use ellipses in an import list, e.g. \verb|TyCon(..)|,
-to selectively import just the logical children of a type constructor.
-Children names have the invariant that they have the same \I{Module} as the parent name.
-In a \I{ModShape}/export list,
-the \I{OccName}s of the plain identifier \I{AvailInfo}s and the \emph{children}
-of type constructor are unique
-(although the top-level \I{Name}s may not have unique \I{OccName}s).
-
-The compilation of every node is associated with a ``shape context'',
-which represents the modules which are transitively depended upon. Let the
-environment shape context is the merge of the shapes of all includes; to
-shape a node:
-
-\begin{enumerate}
- \item Merge the environment shape contexts with the shape contexts
- of all direct dependencies, resulting in the initial shape context.
- \item Rename the module/signature according to the initial shape context,
- getting a \I{ModShape}. Importantly, when renaming the signature \verb|M|,
- any declarations defined in the signature are assigned a \I{Name}s with the \I{Module} \verb|hole:M| (rather than a \I{Module} based on the current unit $p$).
- \item Merge this \I{ModShape} into the initial shape context
- (modules go in provisions while signatures go in provisions), the
- result defining the shape context of this node.
-\end{enumerate}
-
-We now elaborate on these steps in more detail.
-
-\subsection{Shapes of includes}
-Given an \verb|include p (X) requires (Y)|, we can look up the shape
-for $p$ from the indefinite package database. However, an include can
-also rename provisions and requires (where $X$, $Y$ are partial maps
-from module name to module name), which requires transforms the shape
-in the following way:
-
-\begin{itemize}
- \item For each original provision $m\; \verb|->|\; \ldots$, provide
- $X (m)\; \verb|->|\; \ldots$ if $X (m)$ is defined.
- \item For each original requirement $m\; \verb|->|\; \ldots$, require
- $Y (m)\; \verb|->|\; \ldots$ if $Y (m)$ is defined, and $m$ if it is not.
- (Non-mentioned requirements are always passed through).
- \item For each requirement renaming from \verb|M| to \verb|N| in $Y$, substitute
- all occurrences of \verb|hole:M| to \verb|hole:N| in the \I{ModShape}
- of all provisions and requirements.
-\end{itemize}
-
-\subsection{Shape merging}
-Before specifying the how to merge shapes algorithm, we must define some subprocedures
-for unifying and merging lower-level entities such as \I{AvailInfo}s and \I{Name}s,
-which produce \I{Name} substitution that are applied to shapes.
-
-\begin{description}
- \item[Unify two \textit{Name}s]
- (produces a \I{Name} and a \I{Name} substitution) \\
- Error if the names do not have matching \I{OccName}s. Error if neither name
- is a hole name. Otherwise, without loss of generality let $m$ be the hole name
- and $n$ the other name, return $n$ and the substitution of $m$ to $n$.
- \item[Merge two sets of \textit{Name}s]
- (produces a set of \I{Name}s and a \I{Name} substitution) \\
- Let two \I{Name}s be related if they have the same \I{OccName}.
- Union the two sets, unifying related names.
- \item[Unify two \textit{AvailInfo}s]
- (produces an \I{AvailInfo} and a \I{Name} substitution) \\
- If both \I{AvailInfo}s are simply
- a \I{Name}, unify the two \I{Name}s. If both \I{AvailInfo}s are
- $\I{Name}\, \verb|{|\, \I{Name}_0\verb|,|\, \ldots\verb|,|\, \I{Name}_n\, \verb|}|$,
- unify the top-level \I{Name}, apply the substitution to both \I{AvailInfo}s,
- and return the unified \I{Name} with the union of the child names of the
- substituted \I{AvailInfo}s.
- Otherwise, error.
- \item[Merge two sets of \textit{AvailInfo}s]
- (produces a set of \I{AvailInfo}s and a \I{Name} substitution) \\
- Let two \I{AvailInfo}s be related if they both are of the form
- \I{Name} and have matching \I{OccName}s, or if they both are of
- the form
- $\I{Name}\, \verb|{|\, \I{Name}_0\verb|,|\, \ldots\verb|,|\, \I{Name}_n\, \verb|}|$
- and there exists a child name in each which have matching \I{OccName}s.
- Union the two sets, unifying related \I{AvailInfo}s.
- \item[Apply a name substitution on an \textit{AvailInfo}] (produces an \I{AvailInfo}) \\
- Substitute the top-level \I{Name}, which induces a substitution from
- \I{Module} to $\I{Module}'$. Apply this module substitution to each child
- \I{Name} in the \I{AvailInfo}.
-
-\end{description}
-%
-Shape merging takes two units with inputs (requirements) and outputs
-(provisions) and ``wires'' them up so that outputs feed into inputs. To
-merge the shape of $p$ with the shape of $q$:
-
-\begin{enumerate}
- \item \emph{Fill every requirement of $q$ with provided modules from
- $p$.} For each requirement $M$ of $q$ that is provided by $p$,
- substitute each \I{Module} occurrence of \verb|hole:M| with the
- provided $p\verb|(|M\verb|)|$ (however, do \textbf{NOT} substitute the
- top-level \I{Module} in a \I{Name}s), merge the \I{AvailInfo}s and apply
- the resulting \I{Name} substitution, and
- remove the requirement from $q$. If the \I{AvailInfo}s of the
- provision are not a superset of the required \I{AvailInfo}s,
- error.
- \item If mutual recursion is supported, \emph{fill every requirement
- of $p$ with provided modules from $q$.}
- \item \emph{Merge leftover requirements.} For each requirement $M$
- of $q$ that is not provided by $p$ but required by $p$, and let
- the new requirement be the merge of
- \I{AvailInfo}s, applying the resulting \I{Name} substitution.
- \item \emph{Add provisions of $q$.} Union the provisions of $p$ and $q$, (lazily) erroring
- if there is a duplicate that doesn't have the same \I{Module}.
-\end{enumerate}
-
-\newpage
-\section{Indefinite type checker}
-
-\begin{figure}[htpb]
-$$
-\begin{array}{rcll}
-\I{ComponentRecord} & ::= & \I{ModIface}_0 \verb|;|\, \ldots\verb|;|\, \I{ModIface}_n \\[1em]
-\multicolumn{3}{l}{\mbox{\bf Module interface}} \\
-\I{ModIface} & ::= & \verb|module| \; \I{Module} \; \verb|(| \I{mi\_exports} \verb|)| \; \verb|where| \\
-& & \qquad \I{mi\_decls} \\
-& & \qquad \I{mi\_insts} \\
-& & \qquad \I{dep\_orphs} \\
-\I{mi\_exports} & ::= & \I{AvailInfo}_0 \verb|,|\, \ldots \verb|,|\, \I{AvailInfo}_n & \mbox{Export list} \\
-\I{mi\_decls} & ::= & \I{IfaceDecl}_0 \verb|;|\, \ldots \verb|;|\, \I{IfaceDecl}_n & \mbox{Defined declarations} \\
-\I{mi\_insts} & ::= & \I{IfaceClsInst}_0 \verb|;|\, \ldots \verb|;|\, \I{IfaceClsInst}_n & \mbox{Defined instances} \\
-\I{dep\_orphs} & ::= & \I{Module}_0 \verb|;|\, \ldots \verb|;|\, \I{Module}_n & \mbox{Transitive orphan dependencies} \\[1em]
-\multicolumn{3}{l}{\mbox{\bf Interface declarations}} \\
-\I{IfaceDecl} & ::= & \I{OccName} \; \verb|::| \; \I{IfaceId} \\
- & | & \verb|data| \; \I{OccName} \; \verb|=| \;\ \I{IfaceData} \\
- & | & \ldots \\
-\I{IfaceClsInst} & & \mbox{A type-class instance} \\
-\I{IfaceId} & & \mbox{Interface of top-level binder} \\
-\I{IfaceData} & & \mbox{Interface of type constructor} \\
-\end{array}
-$$
-\caption{Module interfaces in GHC} \label{fig:typecheck}
-\end{figure}
-
-\Red{This needs updating.}
-
-In general terms,
-type checking an indefinite unit (a unit with holes) involves
-calculating, for every module, a \I{ModIface} representing the
-type/interface of the module in question (which is serialized
-to disk). The general form of these
-interface files are described in Figure~\ref{fig:typecheck}; notably,
-the interfaces \I{IfaceId}, \I{IfaceData}, etc. contain \I{Name} references,
-which must be resolved by
-looking up a \I{ModIface} corresponding to the \I{Module} associated
-with the \I{Name}. (We will say more about this lookup process shortly.)
-For example, given:
-
-\begin{verbatim}
- unit p where
- signature H where
- data T
- module A(S, T) where
- import H
- data S = S T
-\end{verbatim}
-%
-the \I{PkgType} is:
-
-\begin{verbatim}
- module hole:H (hole:H.T) where
- data T -- abstract type constructor
- module THIS:A (THIS:A.S, hole:H.T) where
- data S = S hole:H.T
- -- where THIS = p(H -> hole:H)
-\end{verbatim}
-%
-However, while it is true that the \I{ModIface} is the final result
-of type checking, we actually are conflating two distinct concepts: the user-visible
-notion of a \I{ModuleName}, which, when imported, brings some \I{Name}s
-into scope (or could trigger a deprecation warning, or pull in some
-orphan instances\ldots), versus the actual declarations, which, while recorded
-in the \I{ModIface}, have an independent existence: even if a declaration
-is not visible for an import, we may internally refer to its \I{Name}, and
-need to look it up to find out type information. (A simple case when
-this can occur is if a module exports a function with type \verb|T -> T|,
-but doesn't export \verb|T|).
-
-\begin{figure}[htpb]
-$$
-\begin{array}{rcll}
-\I{ModDetails} & ::= & \langle\I{md\_types} \verb|;|\; \I{md\_insts}\rangle \\
-\I{md\_types} & ::= & \I{TyThing}_0 \verb|,|\, \ldots\verb|,|\, \I{TyThing}_n \\
-\I{md\_insts} & ::= & \I{ClsInst}_0 \verb|,|\, \ldots\verb|,|\, \I{ClsInst}_n \\[1em]
-\multicolumn{3}{l}{\mbox{\bf Type-checked declarations}} \\
-\I{TyThing} & & \mbox{Type-checked thing with a \I{Name}} \\
-\I{ClsInst} & & \mbox{Type-checked type class instance} \\
-\end{array}
-$$
-\caption{Semantic objects in GHC} \label{fig:typecheck-more}
-\end{figure}
-
-Thus, a \I{ModIface} can be type-checked into a \I{ModDetails}, described in
-Figure~\ref{fig:typecheck-more}. Notice that a \I{ModDetails} is just
-a bag of type-checkable entities which GHC knows about. We
-define the \emph{external package state (EPT)} to
-simply be the union of the \I{ModDetails}
-of all external modules.
-
-Type checking is a delicate balancing act between module
-interfaces and our semantic objects. A \I{ModIface} may get
-type-checked multiple times with different hole instantiations
-to provide multiple \I{ModDetails}.
-Furthermore complicating matters
-is that GHC does this resolution \emph{lazily}: a \I{ModIface}
-is only converted to a \I{ModDetails} when we are looking up
-the type of a \I{Name} that is described by the interface;
-thus, unlike usual theoretical treatments of type checking, we can't
-eagerly go ahead and perform substitutions on \I{ModIface}s when
-they get included.
-
-In a separate compiler like GHC, there are two primary functions we must provide:
-
-\paragraph{\textit{ModuleName} to \textit{ModIface}} Given a \I{ModuleName} which
-was explicitly imported by a user, we must produce a \I{ModIface}
-that, among other things, specifies what \I{Name}s are brought
-into scope. This is used by the renamer to resolve plain references
-to identifiers to real \I{Name}s. (By the way, if shaping produced
-renamed trees, it would not be necessary to do this step!)
-
-\paragraph{\textit{Module} to \textit{ModDetails}/EPT} Given a \I{Module} which may be
-a part of a \I{Name}, we must be able to type check it into
-a \I{ModDetails} (usually by reading and typechecking the \I{ModIface}
-associated with the \I{Module}, but this process is involved). This
-is used by the type checker to find out type information on things. \\
-
-There are two points in the type checker where these capabilities are exercised:
-
-\paragraph{Source-level imports} When a user explicitly imports a
-module, the \textit{ModuleName} is mapped to a \textit{ModIface}
-to find out what exports are brought into scope (\I{mi\_exports})
-and what orphan instances must be loaded (\I{dep\_orphs}). Additionally,
-the \textit{Module} is loaded to the EPT to bring instances from
-the module into scope.
-
-\paragraph{Internal name lookup} During type checking, we may have
-a \I{Name} for which we need type information (\I{TyThing}). If it's not already in the
-EPT, we type check and load
-into the EPT the \I{ModDetails} of the \I{Module} in the \I{Name},
-and then check the EPT again. (\verb|importDecl|)
-
-\subsection{\textit{ModuleName} to \textit{ModIface}}
-
-In all cases, the \I{mi\_exports} can be calculated directly from the
-shaping process, which specifies exactly for each \I{ModuleName} in scope
-what will be brought into scope.
-
-\paragraph{Modules} Modules are straightforward, as for any
-\I{Module} there is only one possibly \I{ModIface} associated
-with it (the \I{ModIface} for when we type-checked the (unique) \verb|module|
-declaration.)
-
-\paragraph{Signatures} For signatures, there may be multiple \I{ModIface}s
-associated with a \I{ModuleName} in scope, e.g. in this situation:
-
-\begin{verbatim}
- unit p where
- signature S where
- data A
- unit q where
- include p
- signature S where
- data B
- module M where
- import S
-\end{verbatim}
-%
-Each literal \verb|signature| has a \I{ModIface} associated with it; and
-the import of \verb|S| in \verb|M|, we want to see the \emph{merged}
-\I{ModIface}s. We can determine the \I{mi\_exports} from the shape,
-but we also need to pull in orphan instances for each signature, and
-produce a warning for each deprecated signature.
-
-\begin{aside}
-\textbf{Does hiding a signature hide its orphans.} Suppose that we have
-extended Backpack to allow hiding signatures from import.
-
-\begin{verbatim}
- unit p requires (H) where -- H is hidden from import
- module A where
- instance Eq (a -> b) where -- orphan
- signature H {-# DEPRECATED "Don't use me" #-} where
- import A
-
- unit q where
- include p
- signature H where
- data T
- module M where
- import H -- warn deprecated?
- instance Eq (a -> b) -- overlap?
-\end{verbatim}
-
-It is probably the most consistent to not pull in orphan instances
-and not give the deprecated warning: this corresponds to merging
-visible \I{ModIface}s, and ignoring invisible ones.
-\end{aside}
-
-\subsection{\textit{Module} to \textit{ModDetails}}
-
-\paragraph{Modules} For modules, we have a \I{Module} of
-the form $\I{p}\verb|(|m\; \verb|->|\; \I{Module}\verb|,|\, \ldots\verb|)|$,
-and we also have a unique \I{ModIface}, where each hole instantiation
-is $\verb|hole:|m$.
-
-To generate the \I{ModDetails} associated with the specific instantiation,
-we have to type-check the \I{ModIface} with the following adjustments:
-
-\begin{enumerate}
- \item Perform a \I{Module} substitution according to the instantiation
- of the \I{ModIface}'s \I{Module}. (NB: we \emph{do}
- substitute \verb|hole:A.x| to \verb|hole:B.x| if we instantiated
- \verb|A -> hole:B|, \emph{unlike} the disjoint
- substitutions applied by shaping.)
- \item Perform a \I{Name} substitution as follows: for any name
- with a unit key that is a $\verb|hole|$,
- substitute with the recorded \I{Name} in the requirements of the shape.
- Otherwise, look up the (unique) \I{ModIface} for the \I{Module},
- and substitute with the corresponding \I{Name} in the \I{mi\_exports}.
-\end{enumerate}
-
-\paragraph{Signatures} For signatures, we have a \I{Module} of the form
-$\verb|hole:|m$. Unlike modules, there are multiple \I{ModIface}s associated with a hole.
-We distinguish each separate \I{ModIface} by considering the full \I{UnitId}
-it was defined in, e.g. \verb|p(A -> hole:C, B -> q():B)|; call this
-the hole's \emph{defining unit key}; the set of \I{ModIface}s for a hole
-and their defining unit keys can easily be calculated during shaping.
-
-To generate the \I{ModDetails} associated with a hole, we type-check each
-\I{ModIface}, with the following adjustments:
-
-\begin{enumerate}
- \item Perform a \I{Module} substitution according to the instantiation
- of the defining unit key. (NB: This may rename the hole itself!)
- \item Perform a \I{Name} substitution as follows, in the same manner
- as would be done in the case of modules.
- \item When these \I{ModDetails} are merged into the EPT, some merging
- of duplicate types may occur; a type
- may be defined multiple times, in which case we check that each
- definition is compatible with the previous ones. A concrete
- type is always compatible with an abstract type.
-\end{enumerate}
-
-\paragraph{Invariants} When we perform \I{Name} substitutions, we must be
-sure that we can always find out the correct \I{Name} to substitute to.
-This isn't obviously true, consider:
-
-\begin{verbatim}
- unit p where
- signature S(foo) where
- data T
- foo :: T
- module M(bar) where
- import S
- bar = foo
- unit q where
- module A(T(..)) where
- data T = T
- foo = T
- module S(foo) where
- import A
- include p
- module A where
- import M
- ... bar ...
-\end{verbatim}
-%
-When we type check \verb|p|, we get the \I{ModIface}s:
-
-\begin{verbatim}
- module hole:S(hole:S.foo) where
- data T
- foo :: hole:S.T
- module THIS:M(THIS:M.bar) where
- bar :: hole:S.T
-\end{verbatim}
-%
-Now, when we type check \verb|A|, we pull on the \I{Name} \verb|p(S -> q():S):M.bar|,
-which means we have to type check the \I{ModIface} for \verb|p(S -> q():S):M|.
-The un-substituted type of \verb|bar| has a reference to \verb|hole:S.T|;
-this should be substituted to \verb|q():S.T|. But how do we discover this?
-We know that \verb|hole:S| was instantiated to \verb|q():S|, so we might try
-and look for \verb|q():S.T|. However, this \I{Name} does not exist because
-the \verb|module S| reexports the selector from \verb|A|! Nor can we consult
-the (unique) \I{ModIface} for the module, as it doesn't reexport the relevant
-type.
-
-The conclusion, then, is that a module written this way should be disallowed.
-Specifically, the correctness condition for a signature is this: \emph{Any \I{Name}
-mentioned in the \I{ModIface} of a signature must either be from an external module, or be
-exported by the signature}.
-
-\newpage
-\section{Installation}
-
-This section defines the syntax for the file-system format of the \I{ComponentDb}.
-Like entries in the installed unit database, an entry is a sequence of fields
-with values.
-
-Indefinite unit entries share some entries in common with entries in the installed
-unit database:
-
-\begin{description}
- \item[\texttt{component-id:}] \I{ComponentId} \newline
- The unique identifier of an installed package. This combined
- with \texttt{unit-name} uniquely identifies an entry in the
- installed unit database.
- \item[\texttt{exposed:}] \verb|True| or \verb|False| \newline
- Whether or not this unit is exposed, i.e. it is available for
- \verb|include| under its \verb|unit-name| (this is used to compute
- the default \I{ComponentNameMap} when GHC is called by itself).
- \item[\texttt{import-dirs:}] \I{FilePath} \newline
- Where interface files for this unit can be found. (NB: these
- interface files are templates, which contain references to holes
- which we can substitute.) (There's exactly one.)
- \item[\texttt{exposed-modules:}] \I{ModuleName} or \I{ModuleName} \texttt{from} \I{Module} $\ldots$ \newline
- The set of exposed modules from this unit, including reexports from
- other units.
- \item[\texttt{other-modules:}] \I{ModuleName} $\ldots$ \newline
- Non-exposed modules; there is an interface for each of these
- in the import-dirs. (Redundant, but useful for error reporting.)
-\end{description}
-%
-As well as all non-essential, Cabal-specific metadata; e.g. \texttt{name}, \texttt{version}, \ldots (\texttt{data-dir} and \texttt{haddock} probably)
-Here are new entries for indefinite units:
-
-\begin{description}
- \item[\texttt{requires:}] \I{ModuleName} \ldots \newline
- The set of module names which are requirements of this unit.
- (Installed units instead record \texttt{instantiated-with}, which
- specifies how each requirement was instantiated.) Every
- requirement has an interface in the import-dirs.
- \item[\texttt{source-dir:}] \I{FilePath} \newline
- The path to the original source of the package.
- \item[\texttt{setup-executable:}] \I{FilePath} \newline
- The path to the \texttt{Setup} executable as described by the Cabal
- specification which is capable of building and installing the package
- using \texttt{./Setup instantiate} (this is a new command which lets
- us program how the requirements of the indefinite unit should be filled),
- \texttt{./Setup build}, \texttt{./Setup copy} and
- \texttt{./Setup register}.
- \item[\texttt{package-config:}] \I{FilePath} \newline
- The path to the package configuration saved when the indefinite
- unit was installed. This should contain all of the relevant configuration
- information necessary to build a package, except how its requirements
- are instantiated.
-\end{description}
-%
-The string representation of \I{Module} is worth remarking upon. In
-this specification, \I{Module} is a recursive data structure. For
-installed packages, it is acceptable to flatten \I{Module} into a
-hash representing the \I{UnitId} and the \I{ModuleName}, as the \I{UnitId}
-is an \I{InstalledUnitId} which has an entry in the database. However,
-this is unacceptable for indefinite units, because we don't have an
-entry per \I{UnitId}. So, for \I{UnitId}s in the indefinite unit database,
-the full tree is written out, subject to this syntax:
-
-\begin{verbatim}
-Module ::= UnitId ":" ModuleName
-UnitId ::= InstalledPackageId
- [ "/" UnitName "(" HoleMap ")" ]
- | "hole"
-HoleMap ::= ModuleName "->" Module "," ...
-\end{verbatim}
-
-\section{Appendix: Shaping}
-
-This section clarifies some subtle aspects about shaping.
-
-\subsection{\textit{OccName} is implied by \textit{Name}}
-In Haskell, the following is not valid syntax:
-
-\begin{verbatim}
- import A (foobar as baz)
-\end{verbatim}
-In particular, a \I{Name} which is in scope will always have the same
-\I{OccName} (even if it may be qualified.) You might imagine relaxing
-this restriction so that declarations can be used under different \I{OccName}s;
-in such a world, we need a different definition of shape:
-
-\begin{verbatim}
- Shape ::=
- provided: ModuleName -> Module { OccName -> Name }
- required: ModuleName -> { OccName -> Name }
-\end{verbatim}
-Presently, however, such an \I{OccName} annotation would be redundant: it can be inferred from the \I{Name}.
-
-\subsection{Holes of a unit are a mapping, not a set.}
-
-Why can't the \I{UnitId} just record a
-set of \I{Module}s, e.g. $\I{UnitId}\;::= \; p \; \verb|{| \; \I{Module} \; \verb|}|$? Consider:
-
-\begin{verbatim}
- unit p (A) requires (H1, H2) where
- signature H1(T) where
- data T
- signature H2(T) where
- data T
- module A(A(..)) where
- import qualified H1
- import qualified H2
- data A = A H1.T H2.T
-
- unit q (A12, A21) where
- module I1(T) where
- data T = T Int
- module I2(T) where
- data T = T Bool
- include p (A as A12) requires (H1 as I1, H2 as I2)
- include p (A as A21) requires (H1 as I2, H2 as I1)
-\end{verbatim}
-With a mapping, the first instance of \verb|p| has key \verb|p(H1 -> q():I1, H2 -> q():I2)|
-while the second instance has key \verb|p(H1 -> q():I2, H2 -> q():I1)|; with
-a set, both would have the key \verb|p{q():I1, q():I2}| and be
-indistinguishable.
-
-\subsection{Signatures can require a specific entity.}
-With requirements like \verb|A -> { hole:A.T, hole:A.foo }|,
-why not specify it as \verb|A -> { T, foo }|,
-e.g., \verb|required: { ModuleName -> { OccName } }|? Consider:
-
-\begin{verbatim}
- unit p () requires (A, B) where
- signature A(T) where
- data T
- signature B(T) where
- import T
-\end{verbatim}
-The requirements of this unit specify that \verb|A.T| $=$ \verb|B.T|; this
-can be expressed with \I{Name}s as
-
-\begin{verbatim}
- A -> { hole:A.T }
- B -> { hole:A.T }
-\end{verbatim}
-But, without \I{Name}s, the sharing constraint is impossible: \verb|A -> { T }; B -> { T }|. (NB: \verb|A| and \verb|B| could be filled with different modules, they just have
-to both export the same \verb|T|.)
-
-\subsection{The \textit{Name} of a value is used to avoid
-ambiguous identifier errors.}
-We state that two types
-are equal when their \I{Name}s are the same; however,
-for values, it is less clear why we care. But consider this example:
-
-\begin{verbatim}
- unit p (A) requires (H1, H2) where
- signature H1(x) where
- x :: Int
- signature H2(x) where
- import H1(x)
- module A(y) where
- import H1
- import H2
- y = x
-\end{verbatim}
-The reference to \verb|x| in \verb|A| is unambiguous, because it is known
-that \verb|x| from \verb|H1| and \verb|x| from \verb|H2| are the same (have
-the same \I{Name}.) If they were not the same, it would be ambiguous and
-should cause an error. Knowing the \I{Name} of a value distinguishes
-between these two cases.
-
-\subsection{Holes are linear}
-Requirements do not record what \I{Module} represents
-the identity of a requirement, which means that it's not possible to assert
-that hole \verb|A| and hole \verb|B| should be implemented with the same module,
-as might occur with aliasing:
-
-\begin{verbatim}
- signature A where
- signature B where
- alias A = B
-\end{verbatim}
-%
-The benefit of this restriction is that when a requirement is filled,
-it is obvious that this is the only requirement that is filled: you won't
-magically cause some other requirements to be filled. The downside is
-it's not possible to write a unit which looks for an interface it is
-looking for in one of $n$ names, accepting any name as an acceptable linkage.
-If aliasing was allowed, we'd need a separate physical shaping context,
-to make sure multiple mentions of the same hole were consistent.
-
-\subsection{A unit does not ``provide'' its signatures}
-We enforce the invariant that
-a provision is always (syntactically) a \verb|module| and a requirement
-is always a \verb|signature|. This means that if you have a requirement
-and a provision of the same name, the requirement can \emph{always} be filled
-with the provision.
-
-The alternate design, where a unit both requires and provides
-its signatures, makes it unclear if a provision
-will actually fill a signature. Consider this example, where
-a signature is required and exposed:
-
-\begin{verbatim}
- unit a-sigs (A) requires (A) where -- ***
- signature A where
- data T
-
- unit a-user (B) requires (A) where
- signature A where
- data T
- x :: T
- module B where
- ...
-
- unit p where
- include a-sigs
- include a-user
-\end{verbatim}
-%
-When we consider merging in the shape of \verb|a-user|, does the
-\verb|A| provided by \verb|a-sigs| fill in the \verb|A| requirement
-in \verb|a-user|? It \emph{should not}, since \verb|a-sigs| does not
-actually provide enough declarations to satisfy \verb|a-user|'s
-requirement: the intended semantics \emph{merges} the requirements
-of \verb|a-sigs| and \verb|a-user|.
-
-What about this example?
-
-\begin{verbatim}
- unit a-sigs (M as A) requires (H as A) where
- signature H(T) where
- data T
- module M(T) where
- import H(T)
-\end{verbatim}
-%
-We rightly should error, since the provision is a module. And in this situation:
-
-\begin{verbatim}
- unit a-sigs (H as A) requires (H) where
- signature H(T) where
- data T
-\end{verbatim}
-%
-The requirements should be merged, but should the merged requirement
-be under the name \verb|H| or \verb|A|?
-
-It may still be possible to use the \verb|(A) requires (A)| syntax to
-indicate exposed signatures, but this would be a mere syntactic
-alternative to \verb|() requires (exposed A)|.
-
-\subsection{Signature visibility, and defaulting}
-The simplest formulation of requirements is to have them always be
-importable. One proposed enhancement, however, is to allow some
-requirements to be ``non-importable''; that is, they are not visible
-to people who include packages.
-
-One simple way of modeling this is to associate each required module
-with a flag indicating whether or not it is importable or not. Then, we
-might imagine that an explicit export list could be used to toggle
-whether or not a requirement is visible or not.
-
-However, when an export list is absent, we have to pick a default
-visibility for a signature. If we use the same behavior as with
-modules, a strange situation can occur:
-
-\begin{verbatim}
- unit p where -- S is visible
- signature S where
- x :: True
-
- unit q where -- use defaulting
- include p
- signature S where
- y :: True
- module M where
- import S
- z = x && y -- OK
-
- unit r where
- include q
- module N where
- import S
- z = y -- OK
- z = x -- ???
-\end{verbatim}
-%
-Absent the second signature declaration in \verb|q|, \verb|S.x| clearly
-should not be visible in \verb|N|. However, what ought to occur when this signature
-declaration is added? One interpretation is to say that only some
-(but not all) declarations are provided (\verb|S.x| remains invisible);
-another interpretation is that adding \verb|S| is enough to treat
-the signature as ``in-line'', and all declarations are now provided
-(\verb|S.x| is visible).
-
-The latter interpretation avoids having to keep track of providedness
-per declarations, and means that you can always express defaulting
-behavior by writing an explicit provides declaration on the unit.
-However, it has the odd behavior of making empty signatures semantically
-meaningful:
-
-\begin{verbatim}
-unit q where
- include p
- signature S where
-\end{verbatim}
-%
-% SPJ: This would be too complicated (if there's yet a third way)
-
-This is pretty complicated, so signature visibility is not currently
-planned to be implemented.
-
-\subsection{Tricky \textit{AvailInfo} merging scenarios}
-
-\paragraph{Merging when type constructors are not in scope}
-
-\begin{verbatim}
- signature A1(foo) where
- data A = A { foo :: Int, bar :: Bool }
-
- signature A2(bar) where
- data A = A { foo :: Int, bar :: Bool }
-\end{verbatim}
-%
-If we merge \verb|A1| and \verb|A2|, are we supposed to conclude that
-the types \verb|A1.A| and \verb|A2.A| (not in scope!) are the same?
-The answer is no! Consider these implementations:
-
-\begin{verbatim}
- module A1(A(..)) where
- data A = A { foo :: Int, bar :: Bool }
-
- module A2(A(..)) where
- data A = A { foo :: Int, bar :: Bool }
-
- module A(foo, bar) where
- import A1(foo)
- import A2(bar)
-\end{verbatim}
-
-Here, \verb|module A1| implements \verb|signature A1|, \verb|module A2| implements \verb|signature A2|,
-and \verb|module A| implements \verb|signature A1| and \verb|signature A2| individually
-and should certainly implement their merge. This is why we cannot simply
-merge type constructors based on the \I{OccName} of their top-level type;
-merging only occurs between in-scope identifiers.
-
-\paragraph{Does merging a selector merge the type constructor?}
-
-\begin{verbatim}
- signature A1(A(..)) where
- data A = A { foo :: Int, bar :: Bool }
-
- signature A2(A(..)) where
- data A = A { foo :: Int, bar :: Bool }
-
- signature A2(foo) where
- import A1(foo)
-\end{verbatim}
-%
-Does the last signature, which is written in the style of a sharing constraint on \verb|foo|,
-also cause \verb|bar| and the type and constructor \verb|A| to be unified?
-Because a merge of a child name results in a substitution on the parent name,
-the answer is yes.
-
-\paragraph{Incomplete data declarations}
-
-\begin{verbatim}
- signature A1(A(foo)) where
- data A = A { foo :: Int }
-
- signature A2(A(bar)) where
- data A = A { bar :: Bool }
-\end{verbatim}
-%
-Should \verb|A1| and \verb|A2| merge? If yes, this would imply
-that data definitions in signatures could only be \emph{partial}
-specifications of their true data types. This seems complicated,
-which suggests this should not be supported; however, in fact,
-this sort of definition, while disallowed during type checking,
-should be \emph{allowed} during shaping. The reason that the
-shape we abscribe to the signatures \verb|A1| and \verb|A2| are
-equivalent to the shapes for these which should merge:
-
-\begin{verbatim}
- signature A1(A(foo)) where
- data A = A { foo :: Int, bar :: Bool }
-
- signature A2(A(bar)) where
- data A = A { foo :: Int, bar :: Bool }
-\end{verbatim}
-
-\subsection{Subtyping record selectors as functions}
-
-\begin{verbatim}
- signature H(A, foo) where
- data A
- foo :: A -> Int
-
- module M(A, foo) where
- data A = A { foo :: Int, bar :: Bool }
-\end{verbatim}
-%
-Does \verb|M| successfully fill \verb|H|? If so, it means that anywhere
-a signature requests a function \verb|foo|, we can instead validly
-provide a record selector. This capability seems quite attractive,
-although in practice record selectors rarely seem to be abstracted this
-way: one reason is that \verb|M.foo| still \emph{is} a record selector,
-and can be used to modify a record. (Many library authors find this
-surprising!)
-
-Nor does this seem to be an insurmountable instance of the avoidance
-problem:
-as a workaround, \verb|H| can equivalently be written as:
-
-\begin{verbatim}
- signature H(foo) where
- data A = A { foo :: Int, bar :: Bool }
-\end{verbatim}
-%
-However, you might not like this, as the otherwise irrelevant \verb|bar| must be mentioned
-in the definition.
-
-In any case, actually implementing this `subtyping' is quite complicated, because we can no
-longer assume that every child name is associated with a parent name.
-The technical difficulty is that we now need to unify a plain identifier
-\I{AvailInfo} (from the signature) with a type constructor \I{AvailInfo}
-(from a module.) It is not clear what this should mean.
-Consider this situation:
-
-\begin{verbatim}
- unit p where
- signature H(A, foo, bar) where
- data A
- foo :: A -> Int
- bar :: A -> Bool
- module X(A, foo) where
- import H
- unit q where
- include p
- signature H(bar) where
- data A = A { foo :: Int, bar :: Bool }
- module Y where
- import X(A(..)) -- ???
-\end{verbatim}
-
-Should the wildcard import on \verb|X| be allowed?
-This question is equivalent to whether or not shaping discovers
-whether or not a function is a record selector and propagates this
-information elsewhere.
-If the wildcard is not allowed, here is another situation:
-
-\begin{verbatim}
- unit p where
- -- define without record selectors
- signature X1(A, foo) where
- data A
- foo :: A -> Int
- module M1(A, foo) where
- import X1
-
- unit q where
- -- define with record selectors (X1s unify)
- signature X1(A(..)) where
- data A = A { foo :: Int, bar :: Bool }
- signature X2(A(..)) where
- data A = A { foo :: Int, bar :: Bool }
-
- -- export some record selectors
- signature Y1(bar) where
- import X1
- signature Y2(bar) where
- import X2
-
- unit r where
- include p
- include q
-
- -- sharing constraint
- signature Y2(bar) where
- import Y1(bar)
-
- -- the payload
- module Test where
- import M1(foo)
- import X2(foo)
- ... foo ... -- conflict?
-\end{verbatim}
-
-Without the sharing constraint, the \verb|foo|s from \verb|M1| and \verb|X2|
-should conflict. With it, however, we should conclude that the \verb|foo|s
-are the same, even though the \verb|foo| from \verb|M1| is \emph{not}
-considered a child of \verb|A|, and even though in the sharing constraint
-we \emph{only} unified \verb|bar| (and its parent \verb|A|). To know that
-\verb|foo| from \verb|M1| should also be unified, we have to know a bit
-more about \verb|A| when the sharing constraint performs unification;
-however, the \I{AvailInfo} will only tell us about what is in-scope, which
-is \emph{not} enough information.
-
-\subsection{Some examples}
-
-\subsubsection{A simple example}
-
-In the following set of units:
-
-\begin{verbatim}
- unit p(M) requires (A) where
- signature A(T) where
- data T
- module M(T, S) where
- import A(T)
- data S = S T
-
- unit q where
- module A where
- data T = T
- include p
-\end{verbatim}
-
-When we \verb|include p|, we need to merge the partial shape
-of \verb|q| (with just provides \verb|A|) with the shape
-of \verb|p|. Here is each step of the merging process:
-
-\begin{verbatim}
- shape 1 shape 2
- --------------------------------------------------------------------------------
-(initial shapes)
-provides: A -> THIS:A { q():A.T } M -> p(A -> hole:A) { hole:A.T, p(A -> hole:A).S }
-requires: (nothing) A -> { hole:A.T }
-
-(after filling requirements)
-provides: A -> THIS:A { q():A.T } M -> p(A -> THIS:A) { q():A.T, p(A -> THIS:A).S }
-requires: (nothing) (nothing)
-
-(after adding provides)
-provides: A -> THIS:A { q():A.T }
- M -> p(A -> THIS:A) { q():A.T, p(A -> THIS:A).S }
-requires: (nothing)
-\end{verbatim}
-
-Notice that we substituted \verb|hole:A| with \verb|THIS:A|, but \verb|hole:A.T| with \verb|q():A.T|.
-
-\subsubsection{Requirements merging can affect provisions}
-
-When a merge results in a substitution, we substitute over both
-requirements and provisions:
-
-\begin{verbatim}
- signature H(T) where
- data T
- module A(T) where
- import H(T)
- module B(T) where
- data T = T
-
- -- provides: A -> THIS:A { hole:H.T }
- -- B -> THIS:B { THIS:B.T }
- -- requires: H -> { hole:H.T }
-
- signature H(T, f) where
- import B(T)
- f :: a -> a
-
- -- provides: A -> THIS:A { THIS:B.T } -- UPDATED
- -- B -> THIS:B { THIS:B.T }
- -- requires: H -> { THIS:B.T, hole:H.f } -- UPDATED
-\end{verbatim}
-
-\subsubsection{Sharing constraints}
-
-Suppose you have two signature which both independently define a type,
-and you would like to assert that these two types are the same. In the
-ML world, such a constraint is known as a sharing constraint. Sharing
-constraints can be encoded in Backpacks via clever use of reexports;
-they are also an instructive example for signature merging.
-
-\begin{verbatim}
- signature A(T) where
- data T
- signature B(T) where
- data T
-
- -- requires: A -> { hole:A.T }
- B -> { hole:B.T }
-
- -- the sharing constraint!
- signature A(T) where
- import B(T)
- -- (shape to merge)
- -- requires: A -> { hole:B.T }
-
- -- (after merge)
- -- requires: A -> { hole:A.T }
- -- B -> { hole:A.T }
-\end{verbatim}
-%
-\Red{I'm pretty sure any choice of \textit{Name} is OK, since the
-subsequent substitution will make it alpha-equivalent.}
-
-\subsection{Export declarations}
-
-If an explicit export declaration is given, the final shape is the
-computed shape, minus any provisions not mentioned in the list, with the
-appropriate renaming applied to provisions and requirements. (Requirements
-are implicitly passed through if they are not named.)
-If no explicit export declaration is given, the final shape is
-the computed shape, including only provisions which were defined
-in the declarations of the unit.
-
-
-\section{Cabal}
-
-
-% \I{InstalledUnitId} & ::= & \I{ComponentId} \verb|(| \, m \; \verb|->| \; \I{Module} \verb|,|\, \ldots\, \verb|)| & \mbox{Also known as \I{UnitId}} \\
-% \I{Module} & ::= & \I{InstalledUnitId} \verb|:| m \\
-
-\paragraph{Indefinite versus installed units}
-The purpose of an \I{ComponentId} is to uniquely identify the results of \textbf{typechecking}
-an indefinite unit; whereas an \I{InstalledUnitId} uniquely identifies
-the results of \textbf{compiling} a unit with all its holes
-filled. Thus, an \I{InstalledUnitId} also records a \emph{hole mapping}
-which specifies how each hole was filled. If an \I{InstalledUnitId}
-is only partially filled, we may refer to it as a \I{UnitId} (as these
-are never installed.)
-
-\paragraph{Units versus packages}
-
-Cabal packages are:
-
-\begin{itemize}
- \item The unit of distribution
- \item The unit that Hackage handles
- \item The unit of versioning
- \item The unit of ownership (who maintains it etc)
-\end{itemize}
-
-Backpack units are the building blocks of modular development;
-there may be multiple units per a Cabal package. While in theory
-Cabal could do sophisticated things with multiple units in a
-package, we expect Cabal
-to pick a distinguished unit (with the same unit name $p$ as
-the package) which serves as the publically visible unit.
-
-
-%\newpage
-
-\end{document} % chktex 16