\documentclass{article} \usepackage{color} \usepackage{fullpage} \usepackage{hyperref} \newcommand{\Red}[1]{{\color{red} #1}} \title{The Backpack Manual} \begin{document} \maketitle Backpack is a new module system for Haskell, intended to enable ``separate modular development'', where application-writers depend on libraries by programming against abstract interfaces instead of specific implementations. Our goal is to reduce software coupling, and let the type system and compiler assist developers when they are developing large software systems. Backpack was originally described in a POPL'14 paper \url{http://plv.mpi-sws.org/backpack/}, but this document is intended to describe the syntax of a language you might actually be able to \emph{write}, i.e., from the perspective of a software developer. \paragraph{Examples of Backpack ``in the large''} In the standard practice of large-scale software development today, users organize code into libraries. Here is a very simple example of some Haskell code structured in this way: \begin{verbatim} unit p where module A where x = True y = False unit q where include p module B where import A b = x \end{verbatim} \verb|p| is a reusable unit (or package/library if you like) which provides a single module \verb|A|. \verb|q| depends on \verb|p| by directly including it, bringing all of the modules of \verb|p| into scope for import. There is a downside to writing code this way: without editing your source code, you can't actually swap out \verb|p| with another version (maybe \verb|p-2.0|, or perhaps even a version of \verb|p| written by someone else). Traditionally, this is overcome by relying on an extralinguistic mechanism, the \emph{package manager}, which indirects \verb|include p| so that it can refer to some other piece of code. Backpack offers a different, in-language way of expressing dependency without committing to an implementation: \begin{verbatim} unit q where require p module B where import A b = x \end{verbatim} Or, equivalently: \begin{verbatim} unit p-interface where signature A where x :: Bool y :: Bool unit q where include p-interface module B where import A b = x \end{verbatim} With the \verb|require| keyword, Backpack automatically computes \verb|p-interface|, which expresses the abstract interface of \verb|p|. \verb|q| utilizes a subset of this interface (unused requirements don't ``count''), resulting in this final version of \verb|q|: \begin{verbatim} unit q where signature A where x :: Bool module B where import A b = x \end{verbatim} The technical innovation of Backpack is that the indefinite version of \verb|q|, which does not depend on a specific implementation of \verb|p|, still composes naturally and in the \emph{same} way that the definite version of \verb|q|. That is to say, you can take a hierarchy of libraries that \verb|include| one another (today's situation) and replace each \verb|include| with a \verb|require|. The result is a reusable set of components with explicitly defined requirements. If the inferred requirements are not general enough, a signature can be written explicitly. Finally, the package manager serves a new role: it can be used to select a combination of components which fulfill all the requirements. Unlike dependency resolution with version numbers, with interfaces the package manager can do this \emph{completely} precisely. \paragraph{Examples of Backpack ``in the small''} Although Backpack was originally designed for library-scale development, it can also be used for small-scale modular development, similar to ML modules. Here is a simple example of writing an associated list implementation that is parametrized over the key: \begin{verbatim} unit assoc-map where signature H where data T eq :: T -> T -> Bool module Assoc where import H mylookup :: T -> [(T, a)] -> Maybe a mylookup x xs = fmap snd (find (eq x) xs) unit main where module AbsInt where type T = Int eq x y = abs x == abs y include assoc-map (Assoc as AbsIntAssoc) requires (T as AbsInt) module Main where import AbsIntAssoc import Prelude main = print (mylookup -1 [(1,"yes"),(-2,"no")]) \end{verbatim} For example, in Haskell there are many different kinds of ``string''-like data types: \verb|String| (a linked list of characters), \verb|ByteString| (an unstructured raw bytestring) and \verb|Text| (a UTF-8 encoded bytestring). Many libraries, e.g., parsers, need to work on any string-like input which a user might want to give them. A more full example of a Backpack version of some real library code can be found in Appendix~\ref{sec:tagstream-example}. \section{The Backpack file} \begin{verbatim} packages ::= "{" package_0 ";" ... ";" package_n "}" \end{verbatim} A Backpack file consists of a list of named packages. All packages in a Backpack file live in the global namespace. A package defines a collection of modules, exporting some of these modules so that other modules can make use of them via \emph{include}. You can compile a definite package \verb|p| in a Backpack file \verb|foo.bkp| with \verb|ghc --backpack foo.bkp p|; you can type-check an indefinite package by adding \verb|-fno-code|. \Red{ToDo: How do you import an external Backpack file?} \Red{ToDo: Facility for private packages.} \begin{verbatim} package ::= "package" pkgname [pkgexports] "where" pkgbody pkgname ::= /* package name, e.g. containers (no version!) */ pkgbody ::= "{" pkgdecl_0 ";" ... ";" pkgdecl_n "}" \end{verbatim} A package begins with the keyword \verb|package|, its name, and an optional export specification (e.g., a list of modules to be exposed). The header is followed a list of declarations which define modules, signatures and include other packages. \section{Declarations} \begin{verbatim} pkgdecl ::= "module" modid [exports] "where" body | "signature" modid [exports] "where" body | "include" pkgname ["as" pkgname] [inclspec] \end{verbatim} A package is made up of package declarations, which either introduce a new module implementation, introduces a new module signature, or includes a package from the package environment. The declaration of modules and signatures is exactly as it is in Haskell98, so we don't reprise the grammar here. \Red{ToDo: Clarify whether order matters. Both are valid designs, but I think order-less is more user-friendly.} We generally don't expect users to place their module source code in package files; thus we provide the following forms to refer to \verb|hs| and \verb|hsig| files living on the file-system: \begin{verbatim} pkgdecl ::= "module" modid_0 "=" path_0 "," ... "," modid_n "=" path_n | "signature" modid_0 "=" path_0 "," ... "," modid_n "=" path_n | "module" modid_0 "," ... "," modid_n | "signature" modid_0 "," ... "," modid_n \end{verbatim} Thus, \verb|module A = "A.hs"| defines the body of \verb|A| based on the contents of \verb|A.hs| in the package's source directory. When the assignment is omitted, we implicitly refer to the file path created by replacing periods with directory separators and adding an appropriate file extension (thus, we can also write \verb|module A|). \begin{verbatim} pkgdecl ::= "source" path \end{verbatim} The \verb|source| keyword is another construct which allows us to define modules by simply scanning the path in question. For example, if \verb|src| contains two files, \verb|A.hs| and \verb|B.hsig|, then ``\verb|source "src"|'' is equivalent to ``\verb|module A = "src/A.hs"; signature B = "src/B.hsig"|''. \Red{ToDo: Allow defining package-wide module imports, which propagate to all inline modules and signatures.} \Red{ToDo: Allow defining anonymous modules with bare type/expression declarations.} \section{Signatures} A signature, denoted with \verb|signature| in a Backpack file and a file with the \verb|hsig| extension on the file system, represents a (type) signature for a Haskell module. It can contain type signatures, data declarations, type classes, type class instances and reexports, but it cannot contain any value definitions.\footnote{Signatures are the backbone of the Backpack module system. A signature can be used to type-check client code which uses a module (without the module implementation), or to verify that an implementation upholds some signature (without a client implementation.)} Signatures are essentially \texttt{hs-boot} modules which do not support mutual recursion but have no runtime efficiency cost. Here is an example of a module signature representing an abstract map type: \begin{verbatim} module Map where type role Map nominal representational data Map k v instance Functor (Map k) empty :: Map k a \end{verbatim} \section{Includes and exports} \begin{verbatim} pkgdecl ::= "include" pkgname ["as" pkgname] [inclspec] inclspec ::= "(" renaming_0 "," ... "," renaming_n [","] ")" [ "requires" "(" renaming_0 "," ... "," renaming_n [","] ")" ] renaming ::= modid [ "as" modid ] | "package" pkgname \end{verbatim} % Add me later: % | "hiding" "(" modid_0 "," ... "," modid_n [","] ")" An include brings the modules and signatures of a package into scope. If these modules/signatures have the same names as other modules/signatures in scope, \emph{mix-in linking} occurs. In particular: \begin{itemize} \item Module + module = error (unless they really are the same!) \item Module + signature = the signature is filled in, and is no longer part of the requirements of the package. \item Signature + signature = the signatures are merged together. \end{itemize} An include is associated with an optional \verb|inclspec|, which can be to thin the provided modules and rename the provided and required modules of an include. In its simplest mode of use, an \verb|inclspec| is a list of modules to be brought into scope, e.g. \verb|include p (A, B)|. Requirements cannot be hidden, but they can be renamed to provide an implementation (or even to just reexport the requirement under another name.) If a requirement is not mentioned in an explicit requirements list, it is implicitly included (thus, \verb|requires (Hole)| has only a purely documentary effect). It is not valid to rename a provision to a requirement, or a requirement to a provision. \begin{verbatim} pkgexports ::= inclspec \end{verbatim} An export, symmetrically, specifies what modules a package will bring into scope if it is included without any \verb|inclspec|. Any module which is omitted from an explicit export list is not exposed (however, like before, requirements cannot be hidden.) When an explicit export list is omitted, you can calculate the provides and requires of a package as follows: \begin{itemize} \item A package provides any non-included modules and signatures. (It only provides an included module/signature if it is explicitly reexported.) \item A package requires any transitively reachable signatures or hole signatures which are not filled in with an implementation. \end{itemize} \Red{ToDo: Properly describe ``hole signatures'' in the declarations section} \subsection{Requirements} The fact that requirements are \emph{implicitly} propagated from package to package can result in some spooky ``action at a distance''. However, this implicit behavior is one of the key ingredients to making mix-in modular development scale: you don't want to have to explicitly link everything up, as you might have to do in a traditional ML module system. You cannot, however, import a requirement, unless it is also provided, which helps increase encapsulation. If a package provides a module, it can be imported: \begin{verbatim} package p (A) requires (A) where signature A where x :: Bool package q (B) requires (A) where include p module B where import A -- OK \end{verbatim} If it does not, it cannot be imported: \Red{Alternately, the import is OK but doesn't result in any identifiers being brought into scope.} \begin{verbatim} package p () requires (A) where -- yes, this is kind of pointless signature A where x :: Bool package q (B) requires (A) where include p module B where import A -- ERROR! \end{verbatim} This means that it is always safe for a package to remove requirements or weaken holes; clients will always continue to compile. Of course, if there is a different signature for the hole in scope, the import is not an error; however, no declarations from \verb|p| are in scope: \begin{verbatim} package p () requires (A) where signature A where x :: Bool package q (B) requires (A) where include p signature A where y :: Bool module B where import A x' = x -- ERROR! y' = y -- OK \end{verbatim} To summarize, requirements are part of the interface of a package; however, they provide no identifiers as far as imports are concerned. \Red{There is some subtle interaction with requirements and shaping; see Shaping by example for more details.} \subsection{Package includes/exports} A package export is easy enough to explain by analogy of module exports in Haskell: a \verb|package p| in an export list explicitly reexports the identifiers from that package; whereas even a default, wildcard export list would only export locally defined identifiers/modules. For example, this module exports the modules of both \verb|base| and \verb|array|. \begin{verbatim} package combined(package base, package array) where include base include array \end{verbatim} However, in Backpack, a package may be included multiple times, making such declarations ambiguous. Thus, a package can be included as a local package name to disambiguate: \begin{verbatim} package p(package q1) where -- equivalent to B1 include impls (A1, A2) include q as q1 (hole A as A1, B as B1) include q as q2 (hole A as A2, B as B2) \end{verbatim} A package include, e.g. \verb|include a (package p)| is only valid if \verb|a| exports the package \verb|p| explicitly.\footnote{It's probably possible to use anonymous packages to allow easily dividing a package into subpackages, but this is silly and you can always just put it in an actual package.} \section{(Transparent) signature ascription} \begin{verbatim} inclspec ::= ... | "::" pkgexp pkgexp ::= pkgname | "package" [exports] "where" pkgbody \end{verbatim} Signature ascription subsumes thinning: it narrows the exports of modules in a package to those specified by a signature package. This package \verb|pkgexp| is specified with either a reference to a named package or an \emph{anonymous package} (in prior work, these have been referred to as \emph{units}, although here the distinction is not necessary as our system is \emph{purely applicative}). Ascription also imposes a \emph{requirement} on the package being abscribed. Suppose you have \verb|p :: psig|, then: \begin{itemize} \item Everything provided \verb|psig| must also be provided by \verb|p|. \item Everything required by \verb|p| must also be required by \verb|psig|. \end{itemize} \Red{Alternately, the second requirement is not necessary, and you calculate the new requirements by taking the requirements of \texttt{psig}, removing the provides of \texttt{p}, and then adding the requirements of \texttt{p}. This makes it possible to ascribe includes for \emph{adapter} packages, which provide some modules given a different set of requirements.} Semantically, ascription replaces the module with a signature, type-checks the package against the signature, and then \emph{post facto} links the signature against the implementation. An ascribed include can be replaced with the signature it is ascribed with, resulting in a package which still typechecks but has more holes. \Red{You have to link at the VERY END, because if you link immediately after processing the module with the ascribed include, the module identities will leak. Of course, if we're compiling we just link eagerly. But now this means that if you have a definite package which uses ascription, even assuming all packages in the environment type-check, you must still type-check this package twice, once indefinitely and then with the actual linking relationship.} For example, ascription in the export specification thins out all private identifiers from the package: \begin{verbatim} package psig where signature A where public :: Bool package p :: psig where module A.Internal where not_exported = 0 module A where public = True private = False \end{verbatim} and, symmetrically, ascription in an include hides identifiers: \begin{verbatim} package psig where signature A where public :: Bool package p where module A where public = True private = False package q where include p :: psig module B where import A ... public ... -- OK ... private ... -- ERROR \end{verbatim} \Red{Observation: thinning is subsumed by transparent signature ascription, but NOT renaming. Thus RENAMING does not commute across signature ascription; you must do it either before or after. Syntax for this is tricky.} \paragraph{Syntactic sugar for anonymous packages} \begin{verbatim} pkgexp ::= pkgbody | path \end{verbatim} It may be useful to provide two forms of sugar for specifying anonymous packages: \verb|pkgbody| is equivalent to \verb|package where pkgbody|; and \verb|"path"| is equivalent to \verb|package where source "path"|. \appendix \section{Full grammar} \begin{verbatim} packages ::= "{" package_0 ";" ... ";" package_n "}" package ::= "package" pkgname [pkgexports] "where" pkgbody pkgname ::= /* package name, e.g. containers (no version!) */ pkgbody ::= "{" pkgdecl_0 ";" ... ";" pkgdecl_n "}" pkgdecl ::= "module" modid [exports] "where" body | "signature" modid [exports] "where" body | "include" pkgname ["as" pkgname] [inclspec] | "module" modid_0 "=" path_0 "," ... "," modid_n "=" path_n | "signature" modid_0 "=" path_0 "," ... "," modid_n "=" path_n | "module" modid_0 "," ... "," modid_n | "signature" modid_0 "," ... "," modid_n | "source" path inclspec ::= "(" renaming_0 "," ... "," renaming_n [","] ")" [ "requires" "(" renaming_0 "," ... "," renaming_n [","] ")" ] | "::" pkgexp pkgexports ::= inclspec renaming ::= modid [ "as" modid ] | "package" pkgname pkgexp ::= pkgname | "package" [exports] "where" pkgbody | pkgbody | path \end{verbatim} \newpage \label{sec:tagstream-example} \section{\texttt{tagstream-conduit} example} When someone recently asked on Haskell Reddit what the ``precise problem Backpack addresses'' was, Chris Doner offered a nice example from the \verb|tagstream-conduit|. The existing implementation, at \url{http://hackage.haskell.org/package/tagstream-conduit-0.5.5.1/docs/Text-HTML-TagStream-Entities.html}, uses a data type to define a ``module'' which is then used to implement two modules in the library, a variant for \verb|ByteString| and a variant for \verb|Text|. Here is how this would be done in Backpack: \Red{This still contains some syntax which I haven't fully explained.} \begin{verbatim} -- | This is an ordinary module which defines some types -- which are exported by the package. module Text.HTML.TagStream.Types where data Token' s = Text s | ... -- | This provides a 'Decoder' implementation for 'ByteString's. -- We define 'Str' to be 'ByteString' and implement a few -- functions manually, as well as reexport existing functions -- from some libraries. We don't plan to publically export -- these from the package. module Decoder.ByteString ( module Decoder.ByteString, Builder, break, drop, uncons ) where import Data.ByteString import Data.ByteString.Builder type Str = ByteString toStr = toByteString fromStr = fromByteString decodeEntity = ... -- | This provides a 'Decoder' implementation for 'Text', much -- the same as 'Decoder.ByteString'. module Decoder.Text ( module Decoder.Text, Builder, break, drop, uncons ) where import Data.Text import Data.Text.Lazy.Builder type Str = Text toStr = toText fromStr = fromText decodeEntity = ... -- | This defines the "functor"; the module implementing entity -- decoding, 'Entities', is parametrized by an abstract interface -- 'Decoder' which describes two types, 'Str' and 'Builder', as -- well as operations on them which are avaiable for the implementation. unit decoder where signature Decoder where data Str data Builder toStr :: Builder -> Str break :: (Char -> Bool) -> Str -> (Str, Str) fromStr :: Str -> Builder drop :: Int -> Str -> Str decodeEntity :: Str -> Maybe Str uncons :: Str -> Maybe (Char, Str) module Entities where import Text.HTML.TagStream.Types import Data.Conduit import Decoder decodeEntities :: Monad m => Conduit (Token' Str) m (Token' Str) decodeEntities = ... -- | Finally, these two lines instantiate 'Entities' twice; -- once with 'Decoder.ByteString', and once as 'Decoder.Text'. include decoder (Entities as Text.HTML.TagStream.ByteString) requires (Decoder as Decoder.ByteString) include decoder (Entities as Text.HTML.TagStream.Text) requires (Decoder as Decoder.Text) \end{verbatim} Without having the source-code inline, the out-of-line Backpack file would look something like this: \begin{verbatim} module Text.HTML.TagStream.Types -- Text/HTML/TagStream/Types.hs module Decoder.ByteString -- Decoder/ByteString.hs module Decoder.Text -- Decoder/Text.hs unit decoder where signature Decoder -- decoder/Decoder.hsig module Entities -- decoder/Entities.hs include decoder (Entities as Text.HTML.TagStream.ByteString) requires (Decoder as Decoder.ByteString) include decoder (Entities as Text.HTML.TagStream.Text) requires (Decoder as Decoder.Text) \end{verbatim} \end{document}