summaryrefslogtreecommitdiff
path: root/docs
diff options
context:
space:
mode:
Diffstat (limited to 'docs')
-rw-r--r--docs/users_guide/external_core.xml1804
-rw-r--r--docs/users_guide/flags.xml2
-rw-r--r--docs/users_guide/glasgow_exts.xml63
-rw-r--r--docs/users_guide/ug-book.xml.in1
-rw-r--r--docs/users_guide/ug-ent.xml.in1
-rw-r--r--docs/users_guide/using.xml38
6 files changed, 1 insertions, 1908 deletions
diff --git a/docs/users_guide/external_core.xml b/docs/users_guide/external_core.xml
deleted file mode 100644
index e4354410ef..0000000000
--- a/docs/users_guide/external_core.xml
+++ /dev/null
@@ -1,1804 +0,0 @@
-<?xml version="1.0" encoding="utf-8"?>
-
-<!--
-This document is a semi-automatic conversion of docs/ext-core/core.tex to DocBook using
-
-1. `htlatex` to convert LaTeX to HTML
-2. `pandoc` to convert HTML to DocBook
-3. extensive manual work by James H. Fisher (jameshfisher@gmail.com)
--->
-
-<!--
-TODO:
-
-* Replace "java" programlisting with "ghccore"
-("ghccore" is not recognized by highlighters,
-causing some generators to fail).
-
-* Complete bibliography entries with journal titles;
-I am unsure of the proper DocBook elements.
-
-* Integrate this file with the rest of the Users' Guide.
--->
-
-
-<chapter id="an-external-representation-for-the-ghc-core-language-for-ghc-6.10">
- <title>An External Representation for the GHC Core Language (For GHC 6.10)</title>
-
- <para>Andrew Tolmach, Tim Chevalier ({apt,tjc}@cs.pdx.edu) and The GHC Team</para>
-
- <para>This chapter provides a precise definition for the GHC Core
- language, so that it can be used to communicate between GHC and new
- stand-alone compilation tools such as back-ends or
- optimizers.<footnote>
- <para>This is a draft document, which attempts
- to describe GHC’s current behavior as precisely as possible. Working
- notes scattered throughout indicate areas where further work is
- needed. Constructive comments are very welcome, both on the
- presentation, and on ways in which GHC could be improved in order to
- simplify the Core story.</para>
-
- <para>Support for generating external Core (post-optimization) was
- originally introduced in GHC 5.02. The definition of external Core in
- this document reflects the version of external Core generated by the
- HEAD (unstable) branch of GHC as of May 3, 2008 (version 6.9), using
- the compiler flag <code>-fext-core</code>. We expect that GHC 6.10 will be
- consistent with this definition.</para>
- </footnote>
- The definition includes a formal grammar and an informal semantics.
- An executable typechecker and interpreter (in Haskell), which
- formally embody the static and dynamic semantics, are available
- separately.</para>
-
- <section id="introduction">
- <title>Introduction</title>
-
- <para>The Glasgow Haskell Compiler (GHC) uses an intermediate language,
- called <quote>Core,</quote> as its internal program representation within the
- compiler’s simplification phase. Core resembles a subset of
- Haskell, but with explicit type annotations in the style of the
- polymorphic lambda calculus (F<subscript>ω</subscript>).</para>
-
- <para>GHC’s front-end translates full Haskell 98 (plus some extensions)
- into Core. The GHC optimizer then repeatedly transforms Core
- programs while preserving their meaning. A <quote>Core Lint</quote> pass in GHC
- typechecks Core in between transformation passes (at least when
- the user enables linting by setting a compiler flag), verifying
- that transformations preserve type-correctness. Finally, GHC’s
- back-end translates Core into STG-machine code <citation>stg-machine</citation> and then into C
- or native code.</para>
-
- <para>Two existing papers discuss the original rationale for the design
- and use of Core <citation>ghc-inliner,comp-by-trans-scp</citation>, although the (two different) idealized
- versions of Core described therein differ in significant ways from
- the actual Core language in current GHC. In particular, with the
- advent of GHC support for generalized algebraic datatypes (GADTs)
- <citation>gadts</citation> Core was extended beyond its previous
- F<subscript>ω</subscript>-style incarnation to support type
- equality constraints and safe coercions, and is now based on a
- system known as F<subscript>C</subscript> <citation>system-fc</citation>.</para>
-
- <para>Researchers interested in writing just <emphasis>part</emphasis> of a Haskell compiler,
- such as a new back-end or a new optimizer pass, might like to use
- GHC to provide the other parts of the compiler. For example, they
- might like to use GHC’s front-end to parse, desugar, and
- type-check source Haskell, then feeding the resulting code to
- their own back-end tool. As another example, they might like to
- use Core as the target language for a front-end compiler of their
- own design, feeding externally synthesized Core into GHC in order
- to take advantage of GHC’s optimizer, code generator, and run-time
- system. Without external Core, there are two ways for compiler
- writers to do this: they can link their code into the GHC
- executable, which is an arduous process, or they can use the GHC
- API <citation>ghc-api</citation> to do the same task more cleanly. Both ways require new
- code to be written in Haskell.</para>
-
- <para>We present a precisely specified external format for Core files.
- The external format is text-based and human-readable, to promote
- interoperability and ease of use. We hope this format will make it
- easier for external developers to use GHC in a modular way.</para>
-
- <para>It has long been true that GHC prints an ad-hoc textual
- representation of Core if you set certain compiler flags. But this
- representation is intended to be read by people who are debugging
- the compiler, not by other programs. Making Core into a
- machine-readable, bi-directional communication format requires:
-
- <orderedlist>
- <listitem>
- precisely specifying the external format of Core;
- </listitem>
- <listitem>
- modifying GHC to generate external Core files
- (post-simplification; as always, users can control the exact
- transformations GHC does with command-line flags);
- </listitem>
- <listitem>
- modifying GHC to accept external Core files in place of
- Haskell source files (users will also be able to control what
- GHC does to those files with command-line flags).
- </listitem>
- </orderedlist>
- </para>
-
- <para>The first two facilities will let developers couple GHC’s
- front-end (parser, type-checker, desugarer), and optionally its
- optimizer, with new back-end tools. The last facility will let
- developers write new Core-to-Core transformations as an external
- tool and integrate them into GHC. It will also allow new
- front-ends to generate Core that can be fed into GHC’s optimizer
- or back-end.</para>
-
- <para>However, because there are many (undocumented) idiosyncracies in
- the way GHC produces Core from source Haskell, it will be hard for
- an external tool to produce Core that can be integrated with
- GHC-produced Core (e.g., for the Prelude), and we don’t aim to
- support this. Indeed, for the time being, we aim to support only
- the first two facilities and not the third: we define and
- implement Core as an external format that GHC can use to
- communicate with external back-end tools, and defer the larger
- task of extending GHC to support reading this external format back
- in.</para>
-
- <para>This document addresses the first requirement, a formal Core
- definition, by proposing a formal grammar for an
- <link linkend="external-grammar-of-core">external representation of Core</link>,
- and an <link linkend="informal-semantics">informal semantics</link>.</para>
-
- <para>GHC supports many type system extensions; the External Core
- printer built into GHC only supports some of them. However,
- External Core should be capable of representing any Haskell 98
- program, and may be able to represent programs that require
- certain type system extensions as well. If a program uses
- unsupported features, GHC may fail to compile it to Core when the
- -fext-core flag is set, or GHC may successfully compile it to
- Core, but the external tools will not be able to typecheck or
- interpret it.</para>
-
- <para>Formal static and dynamic semantics in the form of an executable
- typechecker and interpreter are available separately in the GHC
- source tree
- <footnote><ulink url="http://git.haskell.org/ghc.git/tree">http://git.haskell.org/ghc.git</ulink></footnote>
- under <code>utils/ext-core</code>.</para>
-
- </section>
- <section id="external-grammar-of-core">
- <title>External Grammar of Core</title>
-
- <para>In designing the external grammar, we have tried to strike a
- balance among a number of competing goals, including easy
- parseability by machines, easy readability by humans, and adequate
- structural simplicity to allow straightforward presentations of
- the semantics. Thus, we had to make some compromises.
- Specifically:</para>
-
- <itemizedlist>
- <listitem>In order to avoid explosion of parentheses, we support
- standard precedences and short-cuts for expressions, types,
- and kinds. Thus we had to introduce multiple non-terminals for
- each of these syntactic categories, and as a result, the
- concrete grammar is longer and more complex than the
- underlying abstract syntax.</listitem>
-
- <listitem>On the other hand, we have kept the grammar simpler by
- avoiding special syntax for tuple types and terms. Tuples
- (both boxed and unboxed) are treated as ordinary constructors.</listitem>
-
- <listitem>All type abstractions and applications are given in full, even
- though some of them (e.g., for tuples) could be reconstructed;
- this means a parser for Core does not have to reconstruct
- types.<footnote>
- These choices are certainly debatable. In
- particular, keeping type applications on tuples and case arms
- considerably increases the size of Core files and makes them less
- human-readable, though it allows a Core parser to be simpler.
- </footnote></listitem>
-
- <listitem>The syntax of identifiers is heavily restricted (to just
- alphanumerics and underscores); this again makes Core easier
- to parse but harder to read.</listitem>
- </itemizedlist>
-
- <para>We use the following notational conventions for syntax:
-
- <informaltable frame="none">
- <tgroup cols='2' align='left' colsep="0" rowsep="0">
- <tbody>
- <row>
- <entry>[ pat ]</entry>
- <entry>optional</entry>
- </row>
-
- <row>
- <entry>{ pat }</entry>
- <entry>zero or more repetitions</entry>
- </row>
-
- <row>
- <entry>
- { pat }<superscript>+</superscript>
- </entry>
- <entry>one or more repetitions</entry>
- </row>
-
- <row>
- <entry>
- pat<subscript>1</subscript> ∣ pat<subscript>2</subscript>
- </entry>
- <entry>choice</entry>
- </row>
-
- <row>
- <entry>
- <code>fibonacci</code>
- </entry>
- <entry>terminal syntax in typewriter font</entry>
- </row>
- </tbody>
- </tgroup>
- </informaltable>
- </para>
-
- <informaltable frame="none" colsep="0" rowsep="0">
- <tgroup cols='5'>
- <colspec colname="cat" align="left" colwidth="3*" />
- <colspec colname="lhs" align="right" colwidth="2*" />
- <colspec align="center" colwidth="*" />
- <colspec colname="rhs" align="left" colwidth="10*" />
- <colspec colname="name" align="right" colwidth="6*" />
- <tbody>
- <row rowsep="1">
- <entry>Module</entry>
- <entry>module</entry>
- <entry>→</entry>
- <entry>
- <code>%module</code> mident { tdef ; }{ vdefg ; }
- </entry>
- <entry></entry>
- </row>
-
- <row>
- <entry morerows="1" valign="top">Type defn.</entry>
- <entry morerows="1" valign="top">tdef</entry>
- <entry>→</entry>
- <entry>
- <code>%data</code> qtycon { tbind } <code>= {</code> [ cdef {<code>;</code> cdef } ] <code>}</code>
- </entry>
- <entry>algebraic type</entry>
- </row>
- <row rowsep="1">
- <entry>∣</entry>
- <entry>
- <code>%newtype</code> qtycon qtycon { tbind } <code>=</code> ty
- </entry>
- <entry>newtype</entry>
- </row>
-
- <row rowsep="1">
- <entry>Constr. defn.</entry>
- <entry>cdef</entry>
- <entry>→</entry>
- <entry>
- qdcon { <code>@</code> tbind }{ aty }<superscript>+</superscript>
- </entry>
- </row>
-
- <row>
- <entry morerows="2" valign="top">Value defn.</entry>
- <entry morerows="1" valign="top">vdefg</entry>
- <entry>→</entry>
- <entry><code>%rec {</code> vdef { <code>;</code> vdef } <code>}</code></entry>
- <entry>recursive</entry>
- </row>
-
- <row>
- <entry>∣</entry>
- <entry>vdef</entry>
- <entry>non-recursive</entry>
- </row>
-
- <row rowsep="1">
- <entry>vdef</entry>
- <entry>→</entry>
- <entry>qvar <code>::</code> ty <code>=</code> exp</entry>
- <entry></entry>
- </row>
-
- <row>
- <entry morerows="3" valign="top">Atomic expr.</entry>
- <entry morerows="3" valign="top">aexp</entry>
- <entry>→</entry>
- <entry>qvar</entry>
- <entry>variable</entry>
- </row>
-
- <row>
- <entry>∣</entry>
- <entry>qdcon</entry>
- <entry>data constructor</entry>
- </row>
-
- <row>
- <entry>∣</entry>
- <entry>lit</entry>
- <entry>literal</entry>
- </row>
-
- <row rowsep="1">
- <entry>∣</entry>
- <entry><code>(</code> exp <code>)</code></entry>
- <entry>nested expr.</entry>
- </row>
-
- <row>
- <entry morerows="9" valign="top">Expression</entry>
- <entry morerows="9" valign="top">exp</entry>
- <entry>→</entry>
- <entry>aexp</entry>
- <entry>atomic expresion</entry>
- </row>
-
- <row>
- <entry>∣</entry>
- <entry>aexp { arg }<superscript>+</superscript></entry>
- <entry>application</entry>
- </row>
-
- <row>
- <entry>∣</entry>
- <entry><code>\</code> { binder }<superscript>+</superscript> &arw; exp</entry>
- <entry>abstraction</entry>
- </row>
-
- <row>
- <entry>∣</entry>
- <entry><code>%let</code> vdefg <code>%in</code> exp</entry>
- <entry>local definition</entry>
- </row>
-
- <row>
- <entry>∣</entry>
- <entry><code>%case (</code> aty <code>)</code> exp <code>%of</code> vbind <code>{</code> alt { <code>;</code> alt } <code>}</code></entry>
- <entry>case expression</entry>
- </row>
-
- <row>
- <entry>∣</entry>
- <entry><code>%cast</code> exp aty</entry>
- <entry>type coercion</entry>
- </row>
-
- <row>
- <entry>∣</entry>
- <entry><code>%note</code> &quot; { char } &quot; exp</entry>
- <entry>expression note</entry>
- </row>
-
- <row>
- <entry>∣</entry>
- <entry><code>%external ccall &quot;</code> { char } <code>&quot;</code> aty</entry>
- <entry>external reference</entry>
- </row>
-
- <row>
- <entry>∣</entry>
- <entry><code>%dynexternal ccall</code> aty</entry>
- <entry>external reference (dynamic)</entry>
- </row>
-
- <row rowsep="1">
- <entry>∣</entry>
- <entry><code>%label &quot;</code> { char } <code>&quot;</code></entry>
- <entry>external label</entry>
- </row>
-
- <row>
- <entry morerows="1" valign="top">Argument</entry>
- <entry morerows="1" valign="top">arg</entry>
- <entry>→</entry>
- <entry><code>@</code> aty</entry>
- <entry>type argument</entry>
- </row>
-
- <row rowsep="1">
- <entry>∣</entry>
- <entry>aexp</entry>
- <entry>value argument</entry>
- </row>
-
- <row>
- <entry morerows="2" valign="top">Case alt.</entry>
- <entry morerows="2" valign="top">alt</entry>
- <entry>→</entry>
- <entry>qdcon { <code>@</code> tbind }{ vbind } <code>&arw;</code> exp</entry>
- <entry>constructor alternative</entry>
- </row>
-
- <row>
- <entry>∣</entry>
- <entry>lit <code>&arw;</code> exp</entry>
- <entry>literal alternative</entry>
- </row>
-
- <row rowsep="1">
- <entry>∣</entry>
- <entry><code>%_ &arw;</code> exp</entry>
- <entry>default alternative</entry>
- </row>
-
- <row>
- <entry morerows="1" valign="top">Binder</entry>
- <entry morerows="1" valign="top">binder</entry>
- <entry>→</entry>
- <entry><code>@</code> tbind</entry>
- <entry>type binder</entry>
- </row>
-
- <row rowsep="1">
- <entry>∣</entry>
- <entry>vbind</entry>
- <entry>value binder</entry>
- </row>
-
- <row>
- <entry morerows="1" valign="top">Type binder</entry>
- <entry morerows="1" valign="top">tbind</entry>
- <entry>→</entry>
- <entry>tyvar</entry>
- <entry>implicitly of kind *</entry>
- </row>
-
- <row rowsep="1">
- <entry>∣</entry>
- <entry><code>(</code> tyvar <code>::</code> kind <code>)</code></entry>
- <entry>explicitly kinded</entry>
- </row>
-
- <row rowsep="1">
- <entry>Value binder</entry>
- <entry>vbind</entry>
- <entry>→</entry>
- <entry><code>(</code> var <code>::</code> ty <code>)</code></entry>
- <entry></entry>
- </row>
-
- <row>
- <entry morerows="3" valign="top">Literal</entry>
- <entry morerows="3" valign="top">lit</entry>
- <entry>→</entry>
- <entry><code>(</code> [<code>-</code>] { digit }<superscript>+</superscript> <code>::</code> ty <code>)</code></entry>
- <entry>integer</entry>
- </row>
-
- <row>
- <entry>∣</entry>
- <entry><code>(</code> [<code>-</code>] { digit }<superscript>+</superscript> <code>%</code> { digit }<superscript>+</superscript> <code>::</code> ty <code>)</code></entry>
- <entry>rational</entry>
- </row>
-
- <row>
- <entry>∣</entry>
- <entry><code>( '</code> char <code>' ::</code> ty <code>)</code></entry>
- <entry>character</entry>
- </row>
-
- <row rowsep="1">
- <entry>∣</entry>
- <entry><code>( &quot;</code> { char } <code>&quot; ::</code> ty <code>)</code></entry>
- <entry>string</entry>
- </row>
-
- <row>
- <entry morerows="2" valign="top">Character</entry>
- <entry morerows="1" valign="top">char</entry>
- <entry>→</entry>
- <entry namest="rhs" nameend="name"><emphasis>any ASCII character in range 0x20-0x7E except 0x22,0x27,0x5c</emphasis></entry>
- </row>
-
- <row>
- <entry>∣</entry>
- <entry><code>\x</code> hex hex</entry>
- <entry>ASCII code escape sequence</entry>
- </row>
-
- <row rowsep="1">
- <entry>hex</entry>
- <entry>→</entry>
- <entry>0∣…∣9 ∣a ∣…∣f</entry>
- <entry></entry>
- </row>
-
- <row>
- <entry morerows="2" valign="top">Atomic type</entry>
- <entry morerows="2" valign="top">aty</entry>
- <entry>→</entry>
- <entry>tyvar</entry>
- <entry>type variable</entry>
- </row>
-
- <row>
- <entry>∣</entry>
- <entry>qtycon</entry>
- <entry>type constructor</entry>
- </row>
-
- <row rowsep="1">
- <entry>∣</entry>
- <entry><code>(</code> ty <code>)</code></entry>
- <entry>nested type</entry>
- </row>
-
- <row>
- <entry morerows="7" valign="top">Basic type</entry>
- <entry morerows="7" valign="top">bty</entry>
- <entry>→</entry>
- <entry>aty</entry>
- <entry>atomic type</entry>
- </row>
-
- <row>
- <entry>∣</entry>
- <entry>bty aty</entry>
- <entry>type application</entry>
- </row>
-
- <row>
- <entry>∣</entry>
- <entry><code>%trans</code> aty aty</entry>
- <entry>transitive coercion</entry>
- </row>
-
- <row>
- <entry>∣</entry>
- <entry><code>%sym</code> aty</entry>
- <entry>symmetric coercion</entry>
- </row>
-
- <row>
- <entry>∣</entry>
- <entry><code>%unsafe</code> aty aty</entry>
- <entry>unsafe coercion</entry>
- </row>
-
- <row>
- <entry>∣</entry>
- <entry><code>%left</code> aty</entry>
- <entry>left coercion</entry>
- </row>
-
- <row>
- <entry>∣</entry>
- <entry><code>%right</code> aty</entry>
- <entry>right coercion</entry>
- </row>
-
- <row rowsep="1">
- <entry>∣</entry>
- <entry><code>%inst</code> aty aty</entry>
- <entry>instantiation coercion</entry>
- </row>
-
- <row>
- <entry morerows="2" valign="top">Type</entry>
- <entry morerows="2" valign="top">ty</entry>
- <entry>→</entry>
- <entry>bty</entry>
- <entry>basic type</entry>
- </row>
-
- <row>
- <entry>∣</entry>
- <entry><code>%forall</code> { tbind }<superscript>+</superscript> <code>.</code> ty</entry>
- <entry>type abstraction</entry>
- </row>
-
- <row rowsep="1">
- <entry>∣</entry>
- <entry>bty <code>&arw;</code> ty</entry>
- <entry>arrow type construction</entry>
- </row>
-
- <row>
- <entry morerows="4" valign="top">Atomic kind</entry>
- <entry morerows="4" valign="top">akind</entry>
- <entry>→</entry>
- <entry><code>*</code></entry>
- <entry>lifted kind</entry>
- </row>
-
- <row>
- <entry>∣</entry>
- <entry><code>#</code></entry>
- <entry>unlifted kind</entry>
- </row>
-
- <row>
- <entry>∣</entry>
- <entry><code>?</code></entry>
- <entry>open kind</entry>
- </row>
-
- <row>
- <entry>∣</entry>
- <entry>bty <code>:=:</code> bty</entry>
- <entry>equality kind</entry>
- </row>
-
- <row rowsep="1">
- <entry>∣</entry>
- <entry><code>(</code> kind <code>)</code></entry>
- <entry>nested kind</entry>
- </row>
-
- <row>
- <entry morerows="1" valign="top">Kind</entry>
- <entry morerows="1" valign="top">kind</entry>
- <entry>→</entry>
- <entry>akind</entry>
- <entry>atomic kind</entry>
- </row>
-
- <row rowsep="1">
- <entry>∣</entry>
- <entry>akind <code>&arw;</code> kind</entry>
- <entry>arrow kind</entry>
- </row>
-
- <row>
- <entry morerows="7" valign="top">Identifier</entry>
- <entry>mident</entry>
- <entry>→</entry>
- <entry>pname <code>:</code> uname</entry>
- <entry>module</entry>
- </row>
-
- <row>
- <entry>tycon</entry>
- <entry>→</entry>
- <entry>uname</entry>
- <entry>type constr.</entry>
- </row>
-
- <row>
- <entry>qtycon</entry>
- <entry>→</entry>
- <entry>mident <code>.</code> tycon</entry>
- <entry>qualified type constr.</entry>
- </row>
-
- <row>
- <entry>tyvar</entry>
- <entry>→</entry>
- <entry>lname</entry>
- <entry>type variable</entry>
- </row>
-
- <row>
- <entry>dcon</entry>
- <entry>→</entry>
- <entry>uname</entry>
- <entry>data constr.</entry>
- </row>
-
- <row>
- <entry>qdcon</entry>
- <entry>→</entry>
- <entry>mident <code>.</code> dcon</entry>
- <entry>qualified data constr.</entry>
- </row>
-
- <row>
- <entry>var</entry>
- <entry>→</entry>
- <entry>lname</entry>
- <entry>variable</entry>
- </row>
-
- <row rowsep="1">
- <entry>qvar</entry>
- <entry>→</entry>
- <entry>[ mident <code>.</code> ] var</entry>
- <entry>optionally qualified variable</entry>
- </row>
-
- <row>
- <entry morerows="6" valign="top">Name</entry>
- <entry>lname</entry>
- <entry>→</entry>
- <entry>lower { namechar }</entry>
- <entry></entry>
- </row>
-
- <row>
- <entry>uname</entry>
- <entry>→</entry>
- <entry>upper { namechar }</entry>
- <entry></entry>
- </row>
-
- <row>
- <entry>pname</entry>
- <entry>→</entry>
- <entry>{ namechar }<superscript>+</superscript></entry>
- <entry></entry>
- </row>
-
- <row>
- <entry>namechar</entry>
- <entry>→</entry>
- <entry>lower ∣ upper ∣ digit</entry>
- <entry></entry>
- </row>
-
- <row>
- <entry>lower</entry>
- <entry>→</entry>
- <entry><code>a</code> ∣ <code>b</code> ∣ … ∣ <code>z</code> ∣ <code>_</code></entry>
- <entry></entry>
- </row>
-
- <row>
- <entry>upper</entry>
- <entry>→</entry>
- <entry><code>A</code> ∣ <code>B</code> ∣ … ∣ <code>Z</code></entry>
- <entry></entry>
- </row>
-
- <row>
- <entry>digit</entry>
- <entry>→</entry>
- <entry><code>0</code> ∣ <code>1</code> ∣ … ∣ <code>9</code></entry>
- <entry></entry>
- </row>
- </tbody>
- </tgroup>
- </informaltable>
- </section>
-
- <section id="informal-semantics">
- <title>Informal Semantics</title>
-
- <para>At the term level, Core resembles a explicitly-typed polymorphic
- lambda calculus (F<subscript>ω</subscript>), with the addition of
- local <code>let</code> bindings, algebraic type definitions, constructors, and
- <code>case</code> expressions, and primitive types, literals and operators. Its
- type system is richer than that of System F, supporting explicit
- type equality coercions and type functions.<citation>system-fc</citation></para>
-
- <para>In this section we concentrate on the less obvious points about
- Core.</para>
-
- <section id="program-organization-and-modules">
- <title>Program Organization and Modules</title>
-
- <para>Core programs are organized into <emphasis>modules</emphasis>, corresponding directly
- to source-level Haskell modules. Each module has a identifying
- name <emphasis>mident</emphasis>. A module identifier consists of a <emphasis>package name</emphasis>
- followed by a module name, which may be hierarchical: for
- example, <code>base:GHC.Base</code> is the module identifier for GHC’s Base
- module. Its name is <code>Base</code>, and it lives in the GHC hierarchy
- within the <code>base</code> package. Section 5.8 of the GHC users’ guide
- explains package names <citation>ghc-user-guide</citation>. In particular, note that a Core
- program may contain multiple modules with the same (possibly
- hierarchical) module name that differ in their package names. In
- some of the code examples that follow, we will omit package
- names and possibly full hierarchical module names from
- identifiers for brevity, but be aware that they are always
- required.<footnote>
- A possible improvement to the Core syntax
- would be to add explicit import lists to Core modules, which could be
- used to specify abbrevations for long qualified names. This would make
- the code more human-readable.
- </footnote></para>
-
- <para>Each module may contain the following kinds of top-level
- declarations:
-
- <itemizedlist>
- <listitem>
- Algebraic data type declarations, each defining a type
- constructor and one or more data constructors;
- </listitem>
- <listitem>
- Newtype declarations, corresponding to Haskell <code>newtype</code>
- declarations, each defining a type constructor and a
- coercion name; and
- </listitem>
- <listitem>
- Value declarations, defining the types and values of
- top-level variables.
- </listitem>
- </itemizedlist>
- </para>
-
- <para>No type constructor, data constructor, or top-level value may be
- declared more than once within a given module. All the type
- declarations are (potentially) mutually recursive. Value
- declarations must be in dependency order, with explicit grouping
- of potentially mutually recursive declarations.</para>
-
- <para>Identifiers defined in top-level declarations may be <emphasis>external</emphasis> or
- <emphasis>internal</emphasis>. External identifiers can be referenced from any other
- module in the program, using conventional dot notation (e.g.,
- <code>base:GHC.Base.Bool</code>, <code>base:GHC.Base.True</code>). Internal identifiers
- are visible only within the defining module. All type and data
- constructors are external, and are always defined and referenced
- using fully qualified names (with dots).</para>
-
- <para>A top-level value is external if it is defined and referenced
- using a fully qualified name with a dot (e.g., <code>main:MyModule.foo = ...</code>);
- otherwise, it is internal (e.g., <code>bar = ...</code>). Note that
- Core’s notion of an external identifier does not necessarily
- coincide with that of <quote>exported</quote> identifier in a Haskell source
- module. An identifier can be an external identifier in Core, but
- not be exported by the original Haskell source
- module.<footnote>
- Two examples of such identifiers are: data
- constructors, and values that potentially appear in an unfolding. For an
- example of the latter, consider <code>Main.foo = ... Main.bar ...</code>, where
- <code>Main.foo</code> is inlineable. Since <code>bar</code> appears in <code>foo</code>’s unfolding, it is
- defined and referenced with an external name, even if <code>bar</code> was not
- exported by the original source module.
- </footnote>
- However, if an identifier was exported by the Haskell source
- module, it will appear as an external name in Core.</para>
-
- <para>Core modules have no explicit import or export lists. Modules
- may be mutually recursive. Note that because of the latter fact,
- GHC currently prints out the top-level bindings for every module
- as a single recursive group, in order to avoid keeping track of
- dependencies between top-level values within a module. An
- external Core tool could reconstruct dependencies later, of
- course.</para>
-
- <para>There is also an implicitly-defined module <code>ghc-prim:GHC.Prim</code>,
- which exports the <quote>built-in</quote> types and values that must be
- provided by any implementation of Core (including GHC). Details
- of this module are in the <link linkend="primitive-module">Primitive Module section</link>.</para>
-
- <para>A Core <emphasis>program</emphasis> is a collection of distinctly-named modules that
- includes a module called main:Main having an exported value
- called <code>main:ZCMain.main</code> of type <code>base:GHC.IOBase.IO a</code> (for some
- type <code>a</code>). (Note that the strangely named wrapper for <code>main</code> is the
- one exception to the rule that qualified names defined within a
- module <code>m</code> must have module name <code>m</code>.)</para>
-
- <para>Many Core programs will contain library modules, such as
- <code>base:GHC.Base</code>, which implement parts of the Haskell standard
- library. In principle, these modules are ordinary Haskell
- modules, with no special status. In practice, the requirement on
- the type of <code>main:Main.main</code> implies that every program will
- contain a large subset of the standard library modules.</para>
-
- </section>
- <section id="namespaces">
- <title>Namespaces</title>
-
- <para>There are five distinct namespaces:
- <orderedlist>
- <listitem>module identifiers (<code>mident</code>),</listitem>
- <listitem>type constructors (<code>tycon</code>),</listitem>
- <listitem>type variables (<code>tyvar</code>),</listitem>
- <listitem>data constructors (<code>dcon</code>),</listitem>
- <listitem>term variables (<code>var</code>).</listitem>
- </orderedlist>
- </para>
-
- <para>Spaces (1), (2+3), and (4+5) can be distinguished from each
- other by context. To distinguish (2) from (3) and (4) from (5),
- we require that data and type constructors begin with an
- upper-case character, and that term and type variables begin
- with a lower-case character.</para>
-
- <para>Primitive types and operators are not syntactically
- distinguished.</para>
-
- <para>Primitive <emphasis>coercion</emphasis> operators, of which there are six, <emphasis>are</emphasis>
- syntactically distinguished in the grammar. This is because
- these coercions must be fully applied, and because
- distinguishing their applications in the syntax makes
- typechecking easier.</para>
-
- <para>A given variable (type or term) may have multiple definitions
- within a module. However, definitions of term variables never
- <quote>shadow</quote> one another: the scope of the definition of a given
- variable never contains a redefinition of the same variable.
- Type variables may be shadowed. Thus, if a term variable has
- multiple definitions within a module, all those definitions must
- be local (let-bound). The only exception to this rule is that
- (necessarily closed) types labelling <code>%external</code> expressions may
- contain <code>tyvar</code> bindings that shadow outer bindings.</para>
-
- <para>Core generated by GHC makes heavy use of encoded names, in which
- the characters <code>Z</code> and <code>z</code> are used to introduce escape sequences
- for non-alphabetic characters such as dollar sign <code>$</code> (<code>zd</code>), hash <code>#</code>
- (<code>zh</code>), plus <code>+</code> (<code>zp</code>), etc. This is the same encoding used in <code>.hi</code>
- files and in the back-end of GHC itself, except that we
- sometimes change an initial <code>z</code> to <code>Z</code>, or vice-versa, in order to
- maintain case distinctions.</para>
-
- <para>Finally, note that hierarchical module names are z-encoded in
- Core: for example, <code>base:GHC.Base.foo</code> is rendered as
- <code>base:GHCziBase.foo</code>. A parser may reconstruct the module
- hierarchy, or regard <code>GHCziBase</code> as a flat name.</para>
-
- </section>
- <section id="types-and-kinds">
- <title>Types and Kinds</title>
-
- <para>In Core, all type abstractions and applications are explicit.
- This make it easy to typecheck any (closed) fragment of Core
- code. An full executable typechecker is available separately.</para>
-
- <section id="types">
- <title>Types</title>
-
- <para>Types are described by type expressions, which are built from
- named type constructors and type variables using type
- application and universal quantification. Each type
- constructor has a fixed arity ≥ 0. Because it is so widely
- used, there is special infix syntax for the fully-applied
- function type constructor (<code>&arw;</code>). (The prefix identifier for
- this constructor is <code>ghc-prim:GHC.Prim.ZLzmzgZR</code>; this should
- only appear in unapplied or partially applied form.)</para>
-
- <para>There are also a number of other primitive type constructors
- (e.g., <code>Intzh</code>) that are predefined in the <code>GHC.Prim</code> module, but
- have no special syntax. <code>%data</code> and <code>%newtype</code> declarations
- introduce additional type constructors, as described below.
- Type constructors are distinguished solely by name.</para>
-
- </section>
- <section id="coercions">
- <title>Coercions</title>
-
- <para>A type may also be built using one of the primitive coercion
- operators, as described in <link linkend="namespaces">the Namespaces section</link>. For details on the
- meanings of these operators, see the System FC paper <citation>system-fc</citation>. Also
- see <link linkend="newtypes">the Newtypes section</link> for
- examples of how GHC uses coercions in Core code.</para>
-
- </section>
- <section id="kinds">
- <title>Kinds</title>
- <para>As described in the Haskell definition, it is necessary to
- distinguish well-formed type-expressions by classifying them
- into different <emphasis>kinds</emphasis> <citation>haskell98, p. 41</citation><!-- TODO -->. In particular, Core
- explicitly records the kind of every bound type variable.</para>
-
- <para>In addition, Core’s kind system includes equality kinds, as in
- System FC <citation>system-fc</citation>. An application of a built-in coercion, or of a
- user-defined coercion as introduced by a <code>newtype</code> declaration,
- has an equality kind.</para>
-
- </section>
- <section id="lifted-and-unlifted-types">
- <title>Lifted and Unlifted Types</title>
-
- <para>Semantically, a type is <emphasis>lifted</emphasis> if and only if it has bottom as
- an element. We need to distinguish them because operationally,
- terms with lifted types may be represented by closures; terms
- with unlifted types must not be represented by closures, which
- implies that any unboxed value is necessarily unlifted. We
- distinguish between lifted and unlifted types by ascribing
- them different kinds.</para>
-
- <para>Currently, all the primitive types are unlifted (including a
- few boxed primitive types such as <code>ByteArrayzh</code>). Peyton-Jones
- and Launchbury <citation>pj:unboxed</citation> described the ideas behind unboxed and
- unlifted types.</para>
-
- </section>
- <section id="type-constructors-base-kinds-and-higher-kinds">
- <title>Type Constructors; Base Kinds and Higher Kinds</title>
-
- <para>Every type constructor has a kind, depending on its arity and
- whether it or its arguments are lifted.</para>
-
- <para>Term variables can only be assigned types that have base
- kinds: the base kinds are <code>*</code>, <code>#</code>, and <code>?</code>. The three base kinds
- distinguish the liftedness of the types they classify: <code>*</code>
- represents lifted types; <code>#</code> represents unlifted types; and <code>?</code> is
- the <quote>open</quote> kind, representing a type that may be either lifted
- or unlifted. Of these, only <code>*</code> ever appears in Core type
- declarations generated from user code; the other two are
- needed to describe certain types in primitive (or otherwise
- specially-generated) code (which, after optimization, could
- potentially appear anywhere).</para>
-
- <para>In particular, no top-level identifier (except in
- <code>ghc-prim:GHC.Prim</code>) has a type of kind <code>#</code> or <code>?</code>.</para>
-
- <para>Nullary type constructors have base kinds: for example, the
- type <code>Int</code> has kind <code>*</code>, and <code>Int#</code> has kind <code>#</code>.</para>
-
- <para>Non-nullary type constructors have higher kinds: kinds that
- have the form
- k<subscript>1</subscript><code>&arw;</code>k<subscript>2</subscript>, where
- k<subscript>1</subscript> and k<subscript>2</subscript> are
- kinds. For example, the function type constructor <code>&arw;</code> has
- kind <code>* &arw; (* &arw; *)</code>. Since Haskell allows abstracting
- over type constructors, type variables may have higher kinds;
- however, much more commonly they have kind <code>*</code>, so that is the
- default if a type binder omits a kind.</para>
-
- </section>
-
- <section id="type-synonyms-and-type-equivalence">
- <title>Type Synonyms and Type Equivalence</title>
-
- <para>There is no mechanism for defining type synonyms
- (corresponding to Haskell <code>type</code> declarations).</para>
-
- <para>Type equivalence is just syntactic equivalence on type
- expressions (of base kinds) modulo:</para>
-
- <itemizedlist>
- <listitem>alpha-renaming of variables bound in <code>%forall</code> types;</listitem>
- <listitem>the identity a <code>&arw;</code> b ≡ <code>ghc-prim:GHC.Prim.ZLzmzgZR</code> a b</listitem>
- </itemizedlist>
-
- </section>
- </section>
- <section id="algebraic-data-types">
- <title>Algebraic data types</title>
-
- <para>Each data declaration introduces a new type constructor and a
- set of one or more data constructors, normally corresponding
- directly to a source Haskell <code>data</code> declaration. For example, the
- source declaration
-
- <programlisting language="haskell">
-data Bintree a =
- Fork (Bintree a) (Bintree a)
- | Leaf a
- </programlisting>
-
- might induce the following Core declaration
-
- <programlisting language="java">
-%data Bintree a = {
- Fork (Bintree a) (Bintree a);
- Leaf a)}
- </programlisting>
-
- which introduces the unary type constructor Bintree of kind
- <code>*&arw;*</code> and two data constructors with types
-
- <programlisting language="java">
-Fork :: %forall a . Bintree a &arw; Bintree a &arw; Bintree a
-Leaf :: %forall a . a &arw; Bintree a
- </programlisting>
-
- We define the <emphasis>arity</emphasis> of each data constructor to be the number of
- value arguments it takes; e.g. <code>Fork</code> has arity 2 and <code>Leaf</code> has
- arity 1.</para>
-
- <para>For a less conventional example illustrating the possibility of
- higher-order kinds, the Haskell source declaration
-
- <programlisting language="haskell">
-data A f a = MkA (f a)
- </programlisting>
-
- might induce the Core declaration
-
- <programlisting language="java">
-%data A (f::*&arw;*) a = { MkA (f a) }
- </programlisting>
-
- which introduces the constructor
-
- <programlisting language="java">
-MkA :: %forall (f::*&arw;*) a . (f a) &arw; (A f) a
- </programlisting></para>
-
- <para>GHC (like some other Haskell implementations) supports an
- extension to Haskell98 for existential types such as
-
- <programlisting language="haskell">
-data T = forall a . MkT a (a &arw; Bool)
- </programlisting>
-
- This is represented by the Core declaration
-
- <programlisting language="java">
-%data T = {MkT @a a (a &arw; Bool)}
- </programlisting>
-
- which introduces the nullary type constructor T and the data
- constructor
-
- <programlisting language="java">
-MkT :: %forall a . a &arw; (a &arw; Bool) &arw; T
- </programlisting>
-
- In general, existentially quantified variables appear as extra
- universally quantified variables in the data contructor types. An
- example of how to construct and deconstruct values of type <code>T</code> is
- shown in <link linkend="expression-forms">the Expression Forms section</link>.</para>
-
- </section>
- <section id="newtypes">
- <title>Newtypes</title>
-
- <para>Each Core <code>%newtype</code> declaration introduces a new type constructor
- and an associated representation type, corresponding to a source
- Haskell <code>newtype</code> declaration. However, unlike in source Haskell,
- a <code>%newtype</code> declaration does not introduce any data constructors.</para>
-
- <para>Each <code>%newtype</code> declaration also introduces a new coercion
- (syntactically, just another type constructor) that implies an
- axiom equating the type constructor, applied to any type
- variables bound by the <code>%newtype</code>, to the representation type.</para>
-
- <para>For example, the Haskell fragment
-
- <programlisting language="haskell">
-newtype U = MkU Bool
-u = MkU True
-v = case u of
- MkU b &arw; not b
- </programlisting>
-
- might induce the Core fragment
-
- <programlisting language="java">
-%newtype U ZCCoU = Bool;
-u :: U = %cast (True)
- ((%sym ZCCoU));
-v :: Bool = not (%cast (u) ZCCoU);
- </programlisting></para>
-
- <para>The <code>newtype</code> declaration implies that the types <code>U</code> and <code>Bool</code> have
- equivalent representations, and the coercion axiom <code>ZCCoU</code>
- provides evidence that <code>U</code> is equivalent to <code>Bool</code>. Notice that in
- the body of <code>u</code>, the boolean value <code>True</code> is cast to type <code>U</code> using
- the primitive symmetry rule applied to <code>ZCCoU</code>: that is, using a
- coercion of kind <code>Bool :=: U</code>. And in the body of <code>v</code>, <code>u</code> is cast
- back to type <code>Bool</code> using the axiom <code>ZCCoU</code>.</para>
-
- <para>Notice that the <code>case</code> in the Haskell source code above translates
- to a <code>cast</code> in the corresponding Core code. That is because
- operationally, a <code>case</code> on a value whose type is declared by a
- <code>newtype</code> declaration is a no-op. Unlike a <code>case</code> on any other
- value, such a <code>case</code> does no evaluation: its only function is to
- coerce its scrutinee’s type.</para>
-
- <para>Also notice that unlike in a previous draft version of External
- Core, there is no need to handle recursive newtypes specially.</para>
-
- </section>
-
- <section id="expression-forms">
- <title>Expression Forms</title>
-
- <para>Variables and data constructors are straightforward.</para>
-
- <para>Literal (<emphasis role="variable">lit</emphasis>) expressions consist of a literal value, in one of
- four different formats, and a (primitive) type annotation. Only
- certain combinations of format and type are permitted;
- see <link linkend="primitive-module">the Primitive Module section</link>.
- The character and string formats can describe only 8-bit ASCII characters.</para>
-
- <para>Moreover, because the operational semantics for Core interprets
- strings as C-style null-terminated strings, strings should not
- contain embedded nulls.</para>
-
- <para>In Core, value applications, type applications, value
- abstractions, and type abstractions are all explicit. To tell
- them apart, type arguments in applications and formal type
- arguments in abstractions are preceded by an <code>@ symbol</code>. (In
- abstractions, the <code>@</code> plays essentially the same role as the more
- usual Λ symbol.) For example, the Haskell source declaration
-
- <programlisting language="haskell">
-f x = Leaf (Leaf x)
- </programlisting>
-
- might induce the Core declaration
-
- <programlisting language="java">
-f :: %forall a . a &arw; BinTree (BinTree a) =
- \ @a (x::a) &arw; Leaf @(Bintree a) (Leaf @a x)
- </programlisting></para>
-
- <para>Value applications may be of user-defined functions, data
- constructors, or primitives. None of these sorts of applications
- are necessarily saturated.</para>
-
- <para>Note that the arguments of type applications are not always of
- kind <code>*</code>. For example, given our previous definition of type <code>A</code>:
-
- <programlisting language="haskell">
-data A f a = MkA (f a)
- </programlisting>
-
- the source code
-
- <programlisting language="haskell">
-MkA (Leaf True)
- </programlisting>
-
- becomes
-
- <programlisting language="java">
-(MkA @Bintree @Bool) (Leaf @Bool True)
- </programlisting></para>
-
- <para>Local bindings, of a single variable or of a set of mutually
- recursive variables, are represented by <code>%let</code> expressions in the
- usual way.</para>
-
- <para>By far the most complicated expression form is <code>%case</code>. <code>%case</code>
- expressions are permitted over values of any type, although they
- will normally be algebraic or primitive types (with literal
- values). Evaluating a <code>%case</code> forces the evaluation of the
- expression being tested (the <quote>scrutinee</quote>). The value of the
- scrutinee is bound to the variable following the <code>%of</code> keyword,
- which is in scope in all alternatives; this is useful when the
- scrutinee is a non-atomic expression (see next example). The
- scrutinee is preceded by the type of the entire <code>%case</code>
- expression: that is, the result type that all of the <code>%case</code>
- alternatives have (this is intended to make type reconstruction
- easier in the presence of type equality coercions).</para>
-
- <para>In an algebraic <code>%case</code>, all the case alternatives must be labeled
- with distinct data constructors from the algebraic type,
- followed by any existential type variable bindings (see below),
- and typed term variable bindings corresponding to the data
- constructor’s arguments. The number of variables must match the
- data constructor’s arity.</para>
-
- <para>For example, the following Haskell source expression
-
- <programlisting language="haskell">
-case g x of
- Fork l r &arw; Fork r l
- t@(Leaf v) &arw; Fork t t
- </programlisting>
-
- might induce the Core expression
-
- <programlisting language="java">
-%case ((Bintree a)) g x %of (t::Bintree a)
- Fork (l::Bintree a) (r::Bintree a) &arw;
- Fork @a r l
- Leaf (v::a) &arw;
- Fork @a t t
- </programlisting></para>
-
- <para>When performing a <code>%case</code> over a value of an
- existentially-quantified algebraic type, the alternative must
- include extra local type bindings for the
- existentially-quantified variables. For example, given
-
- <programlisting language="haskell">
-data T = forall a . MkT a (a &arw; Bool)
- </programlisting>
-
- the source
-
- <programlisting language="haskell">
-case x of
- MkT w g &arw; g w
- </programlisting>
-
- becomes
-
- <programlisting language="java">
-%case x %of (x’::T)
- MkT @b (w::b) (g::b&arw;Bool) &arw; g w
- </programlisting></para>
-
- <para>In a <code>%case</code> over literal alternatives, all the case alternatives
- must be distinct literals of the same primitive type.</para>
-
- <para>The list of alternatives may begin with a default alternative
- labeled with an underscore (<code>%_</code>), whose right-hand side will be
- evaluated if none of the other alternatives match. The default
- is optional except for in a case over a primitive type, or when
- there are no other alternatives. If the case is over neither an
- algebraic type nor a primitive type, then the list of
- alternatives must contain a default alternative and nothing
- else. For algebraic cases, the set of alternatives need not be
- exhaustive, even if no default is given; if alternatives are
- missing, this implies that GHC has deduced that they cannot
- occur.</para>
-
- <para><code>%cast</code> is used to manipulate newtypes, as described in
- <link linkend="newtypes">the Newtype section</link>. The <code>%cast</code> expression
- takes an expression and a coercion: syntactically, the coercion
- is an arbitrary type, but it must have an equality kind. In an
- expression <code>(cast e co)</code>, if <code>e :: T</code> and <code>co</code> has kind <code>T :=: U</code>, then
- the overall expression has type <code>U</code> <citation>ghc-fc-commentary</citation>. Here, <code>co</code> must be a
- coercion whose left-hand side is <code>T</code>.</para>
-
- <para>Note that unlike the <code>%coerce</code> expression that existed in previous
- versions of Core, this means that <code>%cast</code> is (almost) type-safe:
- the coercion argument provides evidence that can be verified by
- a typechecker. There are still unsafe <code>%cast</code>s, corresponding to
- the unsafe <code>%coerce</code> construct that existed in old versions of
- Core, because there is a primitive unsafe coercion type that can
- be used to cast arbitrary types to each other. GHC uses this for
- such purposes as coercing the return type of a function (such as
- error) which is guaranteed to never return:
-
- <programlisting language="haskell">
-case (error &quot;&quot;) of
- True &arw; 1
- False &arw; 2
- </programlisting>
-
- becomes:
-
- <programlisting language="java">
-%cast (error @ Bool (ZMZN @ Char))
-(%unsafe Bool Integer);
- </programlisting>
-
- <code>%cast</code> has no operational meaning and is only used in
- typechecking.</para>
-
- <para>A <code>%note</code> expression carries arbitrary internal information that
- GHC finds interesting. The information is encoded as a string.
- Expression notes currently generated by GHC include the inlining
- pragma (<code>InlineMe</code>) and cost-center labels for profiling.</para>
-
- <para>A <code>%external</code> expression denotes an external identifier, which has
- the indicated type (always expressed in terms of Haskell
- primitive types). External Core supports two kinds of external
- calls: <code>%external</code> and <code>%dynexternal</code>. Only the former is supported
- by the current set of stand-alone Core tools. In addition, there
- is a <code>%label</code> construct which GHC may generate but which the Core
- tools do not support.</para>
-
- <para>The present syntax for externals is sufficient for describing C
- functions and labels. Interfacing to other languages may require
- additional information or a different interpretation of the name
- string.</para>
-
- </section>
-
- <section id="expression-evaluation">
- <title>Expression Evaluation</title>
- <para>The dynamic semantics of Core are defined on the type-erasure of
- the program: for example, we ignore all type abstractions and
- applications. The denotational semantics of the resulting
- type-free program are just the conventional ones for a
- call-by-name language, in which expressions are only evaluated
- on demand. But Core is intended to be a call-by-<emphasis>need</emphasis> language,
- in which expressions are only evaluated once. To express the
- sharing behavior of call-by-need, we give an operational model
- in the style of Launchbury <citation>launchbury93natural</citation>.</para>
-
- <para>This section describes the model informally; a more formal
- semantics is separately available as an executable interpreter.</para>
-
- <para>To simplify the semantics, we consider only <quote>well-behaved</quote> Core
- programs in which constructor and primitive applications are
- fully saturated, and in which non-trivial expresssions of
- unlifted kind (<code>#</code>) appear only as scrutinees in <code>%case</code>
- expressions. Any program can easily be put into this form; a
- separately available preprocessor illustrates how. In the
- remainder of this section, we use <quote>Core</quote> to mean <quote>well-behaved</quote>
- Core.</para>
-
- <para>Evaluating a Core expression means reducing it to <emphasis>weak-head normal form (WHNF)</emphasis>,
- i.e., a primitive value, lambda abstraction,
- or fully-applied data constructor. Evaluating a program means
- evaluating the expression <code>main:ZCMain.main</code>.</para>
-
- <para>To make sure that expression evaluation is shared, we make use
- of a <emphasis>heap</emphasis>, which contains <emphasis>heap entries</emphasis>. A heap entry can be:
-
- <itemizedlist>
- <listitem>
- A <emphasis>thunk</emphasis>, representing an unevaluated expression, also known
- as a suspension.
- </listitem>
- <listitem>
- A <emphasis>WHNF</emphasis>, representing an evaluated expression. The result of
- evaluating a thunk is a WHNF. A WHNF is always a closure
- (corresponding to a lambda abstraction in the source
- program) or a data constructor application: computations
- over primitive types are never suspended.
- </listitem>
- </itemizedlist></para>
-
- <para><emphasis>Heap pointers</emphasis> point to heap entries: at different times, the
- same heap pointer can point to either a thunk or a WHNF, because
- the run-time system overwrites thunks with WHNFs as computation
- proceeds.</para>
-
- <para>The suspended computation that a thunk represents might
- represent evaluating one of three different kinds of expression.
- The run-time system allocates a different kind of thunk
- depending on what kind of expression it is:
-
- <itemizedlist>
- <listitem>
- A thunk for a value definition has a group of suspended
- defining expressions, along with a list of bindings between
- defined names and heap pointers to those suspensions. (A
- value definition may be a recursive group of definitions or
- a single non-recursive definition, and it may be top-level
- (global) or <code>let</code>-bound (local)).
- </listitem>
- <listitem>
- A thunk for a function application (where the function is
- user-defined) has a suspended actual argument expression,
- and a binding between the formal argument and a heap pointer
- to that suspension.
- </listitem>
- <listitem>
- A thunk for a constructor application has a suspended actual
- argument expression; the entire constructed value has a heap
- pointer to that suspension embedded in it.
- </listitem>
- </itemizedlist></para>
-
- <para>As computation proceeds, copies of the heap pointer for a given
- thunk propagate through the executing program. When another
- computation demands the result of that thunk, the thunk is
- <emphasis>forced</emphasis>: the run-time system computes the thunk’s result,
- yielding a WHNF, and overwrites the heap entry for the thunk
- with the WHNF. Now, all copies of the heap pointer point to the
- new heap entry: a WHNF. Forcing occurs only in the context of
-
- <itemizedlist>
- <listitem>evaluating the operator expression of an application;</listitem>
- <listitem>evaluating the scrutinee of a <code>case</code> expression; or</listitem>
- <listitem>evaluating an argument to a primitive or external function application</listitem>
- </itemizedlist>
- </para>
-
- <para>When no pointers to a heap entry (whether it is a thunk or WHNF)
- remain, the garbage collector can reclaim the space it uses. We
- assume this happens implicitly.</para>
-
- <para>With the exception of functions, arrays, and mutable variables,
- we intend that values of all primitive types should be held
- <emphasis>unboxed</emphasis>: they should not be heap-allocated. This does not
- violate call-by-need semantics: all primitive types are
- <emphasis>unlifted</emphasis>, which means that values of those types must be
- evaluated strictly. Unboxed tuple types are not heap-allocated
- either.</para>
-
- <para>Certain primitives and <code>%external</code> functions cause side-effects to
- state threads or to the real world. Where the ordering of these
- side-effects matters, Core already forces this order with data
- dependencies on the pseudo-values representing the threads.</para>
-
- <para>An implementation must specially support the <code>raisezh</code> and
- <code>handlezh</code> primitives: for example, by using a handler stack.
- Again, real-world threading guarantees that they will execute in
- the correct order.</para>
-
- </section>
- </section>
- <section id="primitive-module">
- <title>Primitive Module</title>
-
- <para>The semantics of External Core rely on the contents and informal
- semantics of the primitive module <code>ghc-prim:GHC.Prim</code>. Nearly all
- the primitives are required in order to cover GHC’s implementation
- of the Haskell98 standard prelude; the only operators that can be
- completely omitted are those supporting the byte-code interpreter,
- parallelism, and foreign objects. Some of the concurrency
- primitives are needed, but can be given degenerate implementations
- if it desired to target a purely sequential backend (see Section
- <link linkend="non-concurrent-back-end">the Non-concurrent Back End section</link>).</para>
-
- <para>In addition to these primitives, a large number of C library
- functions are required to implement the full standard Prelude,
- particularly to handle I/O and arithmetic on less usual types.</para>
-
- <para>For a full listing of the names and types of the primitive
- operators, see the GHC library documentation <citation>ghcprim</citation>.</para>
-
- <section id="non-concurrent-back-end">
- <title>Non-concurrent Back End</title>
-
- <para>The Haskell98 standard prelude doesn’t include any concurrency
- support, but GHC’s implementation of it relies on the existence
- of some concurrency primitives. However, it never actually forks
- multiple threads. Hence, the concurrency primitives can be given
- degenerate implementations that will work in a non-concurrent
- setting, as follows:</para>
-
- <itemizedlist>
- <listitem>
- <code>ThreadIdzh</code> can be represented by a singleton type, whose
- (unique) value is returned by <code>myThreadIdzh</code>.
- </listitem>
- <listitem>
- <code>forkzh</code> can just die with an <quote>unimplemented</quote> message.
- </listitem>
- <listitem>
- <code>killThreadzh</code> and <code>yieldzh</code> can also just die <quote>unimplemented</quote>
- since in a one-thread world, the only thread a thread can
- kill is itself, and if a thread yields the program hangs.
- </listitem>
- <listitem>
- <code>MVarzh a</code> can be represented by <code>MutVarzh (Maybe a)</code>; where a
- concurrent implementation would block, the sequential
- implementation can just die with a suitable message (since
- no other thread exists to unblock it).
- </listitem>
- <listitem>
- <code>waitReadzh</code> and <code>waitWritezh</code> can be implemented using a <code>select</code>
- with no timeout.
- </listitem>
- </itemizedlist>
- </section>
-
- <section id="literals">
- <title>Literals</title>
-
- <para>Only the following combination of literal forms and types are
- permitted:</para>
-
- <informaltable frame="none" colsep="0" rowsep="0">
- <tgroup cols='3'>
- <colspec colname="literal" align="left" colwidth="*" />
- <colspec colname="type" align="left" colwidth="*" />
- <colspec colname="description" align="left" colwidth="4*" />
- <thead>
- <row>
- <entry>Literal form</entry>
- <entry>Type</entry>
- <entry>Description</entry>
- </row>
- </thead>
- <tbody>
- <row>
- <entry morerows="3" valign="top">integer</entry>
- <entry><code>Intzh</code></entry>
- <entry>Int</entry>
- </row>
- <row>
- <entry><code>Wordzh</code></entry>
- <entry>Word</entry>
- </row>
- <row>
- <entry><code>Addrzh</code></entry>
- <entry>Address</entry>
- </row>
- <row>
- <entry><code>Charzh</code></entry>
- <entry>Unicode character code</entry>
- </row>
-
- <row>
- <entry morerows="1" valign="top">rational</entry>
- <entry><code>Floatzh</code></entry>
- <entry>Float</entry>
- </row>
- <row>
- <entry><code>Doublezh</code></entry>
- <entry>Double</entry>
- </row>
-
- <row>
- <entry>character</entry>
- <entry><code>Charzh</code></entry>
- <entry>Unicode character specified by ASCII character</entry>
- </row>
-
- <row>
- <entry>string</entry>
- <entry><code>Addrzh</code></entry>
- <entry>Address of specified C-format string</entry>
- </row>
- </tbody>
- </tgroup>
- </informaltable>
- </section>
- </section>
-
-
- <bibliolist>
- <!-- This bibliography was semi-automatically converted by JabRef from core.bib. -->
-
- <title>References</title>
-
- <biblioentry>
- <abbrev>ghc-user-guide</abbrev>
- <authorgroup>
- <author><surname>The GHC Team</surname></author>
- </authorgroup>
- <citetitle pubwork="article">The Glorious Glasgow Haskell Compilation System User's Guide, Version 6.8.2</citetitle>
- <pubdate>2008</pubdate>
- <bibliomisc><ulink url="http://www.haskell.org/ghc/docs/latest/html/users_guide/index.html">http://www.haskell.org/ghc/docs/latest/html/users_guide/index.html</ulink></bibliomisc>
- </biblioentry>
-
- <biblioentry>
- <abbrev>ghc-fc-commentary</abbrev>
- <authorgroup>
- <author><surname>GHC Wiki</surname></author>
- </authorgroup>
- <citetitle pubwork="article">System FC: equality constraints and coercions</citetitle>
- <pubdate>2006</pubdate>
- <bibliomisc><ulink url="http://ghc.haskell.org/trac/ghc/wiki/Commentary/Compiler/FC">http://ghc.haskell.org/trac/ghc/wiki/Commentary/Compiler/FC</ulink></bibliomisc>
- </biblioentry>
-
- <biblioentry>
- <abbrev>ghc-api</abbrev>
- <authorgroup>
- <author><surname>Haskell Wiki</surname></author>
- </authorgroup>
- <citetitle pubwork="article">Using GHC as a library</citetitle>
- <pubdate>2007</pubdate>
- <bibliomisc><ulink url="http://haskell.org/haskellwiki/GHC/As_a_library">http://haskell.org/haskellwiki/GHC/As_a_library</ulink></bibliomisc>
- </biblioentry>
-
- <biblioentry>
- <abbrev>haskell98</abbrev>
- <authorgroup>
- <editor><firstname>Simon</firstname><surname>Peyton-Jones</surname></editor>
- </authorgroup>
- <citetitle pubwork="article">Haskell 98 Language and Libraries: The Revised Report</citetitle>
- <publisher>
- <publishername>Cambridge University Press</publishername>
- <address>
- <city>Cambridge></city>
- <state>UK</state>
- </address>
- </publisher>
- <pubdate>2003</pubdate>
- </biblioentry>
-
- <biblioentry>
- <abbrev>system-fc</abbrev>
- <authorgroup>
- <author><firstname>Martin</firstname><surname>Sulzmann</surname></author>
- <author><firstname>Manuel M.T.</firstname><surname>Chakravarty</surname></author>
- <author><firstname>Simon</firstname><surname>Peyton-Jones</surname></author>
- <author><firstname>Kevin</firstname><surname>Donnelly</surname></author>
- </authorgroup>
- <citetitle pubwork="article">System F with type equality coercions</citetitle>
- <publisher>
- <publishername>ACM</publishername>
- <address>
- <city>New York</city>
- <state>NY</state>
- <country>USA</country>
- </address>
- </publisher>
- <artpagenums>53-66</artpagenums>
- <pubdate>2007</pubdate>
- <bibliomisc><ulink url="http://portal.acm.org/citation.cfm?id=1190324">http://portal.acm.org/citation.cfm?id=1190324</ulink></bibliomisc>
- <!-- booktitle = {{TLDI '07: Proceedings of the 2007 ACM SIGPLAN International Workshop on Types in Language Design and Implementation}}, -->
- </biblioentry>
-
- <biblioentry>
- <abbrev>gadts</abbrev>
- <authorgroup>
- <author><firstname>Simon</firstname><surname>Peyton-Jones</surname></author>
- <author><firstname>Dimitrios</firstname><surname>Vytiniotis</surname></author>
- <author><firstname>Stephanie</firstname><surname>Weirich</surname></author>
- <author><firstname>Geoffrey</firstname><surname>Washburn</surname></author>
- </authorgroup>
- <citetitle pubwork="article">Simple unification-based type inference for GADTs</citetitle>
- <publisher>
- <publishername>ACM</publishername>
- <address>
- <city>New York</city>
- <state>NY</state>
- <country>USA</country>
- </address>
- </publisher>
- <artpagenums>50-61</artpagenums>
- <pubdate>2006</pubdate>
- <bibliomisc><ulink url="http://research.microsoft.com/Users/simonpj/papers/gadt/index.htm">http://research.microsoft.com/Users/simonpj/papers/gadt/index.htm</ulink></bibliomisc>
- </biblioentry>
-
- <biblioentry>
- <abbrev>Launchbury94</abbrev>
- <authorgroup>
- <author><firstname>John</firstname><surname>Launchbury</surname></author>
- <author><firstname>Simon L.</firstname><surname>Peyton-Jones</surname></author>
- </authorgroup>
- <citetitle pubwork="article">Lazy Functional State Threads</citetitle>
- <artpagenums>24-35</artpagenums>
- <pubdate>1994</pubdate>
- <bibliomisc><ulink url="http://citeseer.ist.psu.edu/article/launchbury93lazy.html">http://citeseer.ist.psu.edu/article/launchbury93lazy.html</ulink></bibliomisc>
- <!-- booktitle = "{SIGPLAN} {Conference} on {Programming Language Design and Implementation}", -->
- </biblioentry>
-
- <biblioentry>
- <abbrev>pj:unboxed</abbrev>
- <authorgroup>
- <author><firstname>Simon L.</firstname><surname>Peyton-Jones</surname></author>
- <author><firstname>John</firstname><surname>Launchbury</surname></author>
- <editor><firstname>J.</firstname><surname>Hughes</surname></editor>
- </authorgroup>
- <citetitle pubwork="article">Unboxed Values as First Class Citizens in a Non-strict Functional Language</citetitle>
- <publisher>
- <publishername>Springer-Verlag LNCS523</publishername>
- <address>
- <city>Cambridge</city>
- <state>Massachussetts</state>
- <country>USA</country>
- </address>
- </publisher>
- <artpagenums>636-666</artpagenums>
- <pubdate>1991, August 26-28</pubdate>
- <bibliomisc><ulink url="http://citeseer.ist.psu.edu/jones91unboxed.html">http://citeseer.ist.psu.edu/jones91unboxed.html</ulink></bibliomisc>
- <!-- booktitle = "Proceedings of the Conference on Functional Programming and Computer Architecture", -->
- </biblioentry>
-
- <biblioentry>
- <abbrev>ghc-inliner</abbrev>
- <authorgroup>
- <author><firstname>Simon</firstname><surname>Peyton-Jones</surname></author>
- <author><firstname>Simon</firstname><surname>Marlow</surname></author>
- </authorgroup>
- <citetitle pubwork="article">Secrets of the Glasgow Haskell Compiler inliner</citetitle>
- <pubdate>1999</pubdate>
- <address>
- <city>Paris</city>
- <country>France</country>
- </address>
- <bibliomisc><ulink url="http://research.microsoft.com/Users/simonpj/Papers/inlining/inline.pdf">http://research.microsoft.com/Users/simonpj/Papers/inlining/inline.pdf</ulink></bibliomisc>
- <!-- booktitle = "Workshop on Implementing Declarative Languages", -->
- </biblioentry>
-
- <biblioentry>
- <abbrev>comp-by-trans-scp</abbrev>
- <authorgroup>
- <author><firstname>Simon L.</firstname><surname>Peyton-Jones</surname></author>
- <author><firstname>A. L. M.</firstname><surname>Santos</surname></author>
- </authorgroup>
- <citetitle pubwork="article">A transformation-based optimiser for Haskell</citetitle>
- <citetitle pubwork="journal">Science of Computer Programming</citetitle>
- <volumenum>32</volumenum>
- <issuenum>1-3</issuenum>
- <artpagenums>3-47</artpagenums>
- <pubdate>1998</pubdate>
- <bibliomisc><ulink url="http://citeseer.ist.psu.edu/peytonjones98transformationbased.html">http://citeseer.ist.psu.edu/peytonjones98transformationbased.html</ulink></bibliomisc>
- </biblioentry>
-
- <biblioentry>
- <abbrev>stg-machine</abbrev>
- <authorgroup>
- <author><firstname>Simon L.</firstname><surname>Peyton-Jones</surname></author>
- </authorgroup>
- <citetitle pubwork="article">Implementing Lazy Functional Languages on Stock Hardware: The Spineless Tagless G-Machine</citetitle>
- <citetitle pubwork="journal">Journal of Functional Programming</citetitle>
- <volumenum>2</volumenum>
- <issuenum>2</issuenum>
- <artpagenums>127-202</artpagenums>
- <pubdate>1992</pubdate>
- <bibliomisc><ulink url="http://citeseer.ist.psu.edu/peytonjones92implementing.html">http://citeseer.ist.psu.edu/peytonjones92implementing.html</ulink></bibliomisc>
- </biblioentry>
-
- <biblioentry>
- <abbrev>launchbury93natural</abbrev>
- <authorgroup>
- <author><firstname>John</firstname><surname>Launchbury</surname></author>
- </authorgroup>
- <citetitle pubwork="article">A Natural Semantics for Lazy Evaluation</citetitle>
- <artpagenums>144-154</artpagenums>
- <address>
- <city>Charleston</city>
- <state>South Carolina</state>
- </address>
- <pubdate>1993</pubdate>
- <bibliomisc><ulink url="http://citeseer.ist.psu.edu/launchbury93natural.html">http://citeseer.ist.psu.edu/launchbury93natural.html</ulink></bibliomisc>
- <!-- booktitle = "Conference Record of the Twentieth Annual {ACM} {SIGPLAN}-{SIGACT} Symposium on Principles of Programming Languages", -->
- </biblioentry>
-
- <biblioentry>
- <abbrev>ghcprim</abbrev>
- <authorgroup>
- <author><surname>The GHC Team</surname></author>
- </authorgroup>
- <citetitle pubwork="article">Library documentation: GHC.Prim</citetitle>
- <pubdate>2008</pubdate>
- <bibliomisc><ulink url="http://www.haskell.org/ghc/docs/latest/html/libraries/base/GHC-Prim.html">http://www.haskell.org/ghc/docs/latest/html/libraries/base/GHC-Prim.html</ulink></bibliomisc>
- </biblioentry>
- </bibliolist>
-
-</chapter>
diff --git a/docs/users_guide/flags.xml b/docs/users_guide/flags.xml
index 593bf4b1ef..dc7644a5d6 100644
--- a/docs/users_guide/flags.xml
+++ b/docs/users_guide/flags.xml
@@ -2624,7 +2624,7 @@
<tbody>
<row>
<entry><option>-fext-core</option></entry>
- <entry>Generate <filename>.hcr</filename> external Core files</entry>
+ <entry><emphasis>(deprecated)</emphasis> Generate <filename>.hcr</filename> external Core files</entry>
<entry>dynamic</entry>
<entry>-</entry>
</row>
diff --git a/docs/users_guide/glasgow_exts.xml b/docs/users_guide/glasgow_exts.xml
index acc796371a..dc381a4daa 100644
--- a/docs/users_guide/glasgow_exts.xml
+++ b/docs/users_guide/glasgow_exts.xml
@@ -11257,69 +11257,6 @@ program even if fusion doesn't happen. More rules in <filename>GHC/List.lhs</fi
</sect2>
-<sect2 id="core-pragma">
- <title>CORE pragma</title>
-
- <indexterm><primary>CORE pragma</primary></indexterm>
- <indexterm><primary>pragma, CORE</primary></indexterm>
- <indexterm><primary>core, annotation</primary></indexterm>
-
-<para>
- The external core format supports <quote>Note</quote> annotations;
- the <literal>CORE</literal> pragma gives a way to specify what these
- should be in your Haskell source code. Syntactically, core
- annotations are attached to expressions and take a Haskell string
- literal as an argument. The following function definition shows an
- example:
-
-<programlisting>
-f x = ({-# CORE "foo" #-} show) ({-# CORE "bar" #-} x)
-</programlisting>
-
- Semantically, this is equivalent to:
-
-<programlisting>
-g x = show x
-</programlisting>
-</para>
-
-<para>
- However, when external core is generated (via
- <option>-fext-core</option>), there will be Notes attached to the
- expressions <function>show</function> and <varname>x</varname>.
- The core function declaration for <function>f</function> is:
-</para>
-
-<programlisting>
- f :: %forall a . GHCziShow.ZCTShow a ->
- a -> GHCziBase.ZMZN GHCziBase.Char =
- \ @ a (zddShow::GHCziShow.ZCTShow a) (eta::a) ->
- (%note "foo"
- %case zddShow %of (tpl::GHCziShow.ZCTShow a)
- {GHCziShow.ZCDShow
- (tpl1::GHCziBase.Int ->
- a ->
- GHCziBase.ZMZN GHCziBase.Char -> GHCziBase.ZMZN GHCziBase.Cha
-r)
- (tpl2::a -> GHCziBase.ZMZN GHCziBase.Char)
- (tpl3::GHCziBase.ZMZN a ->
- GHCziBase.ZMZN GHCziBase.Char -> GHCziBase.ZMZN GHCziBase.Cha
-r) ->
- tpl2})
- (%note "bar"
- eta);
-</programlisting>
-
-<para>
- Here, we can see that the function <function>show</function> (which
- has been expanded out to a case expression over the Show dictionary)
- has a <literal>%note</literal> attached to it, as does the
- expression <varname>eta</varname> (which used to be called
- <varname>x</varname>).
-</para>
-
-</sect2>
-
</sect1>
<sect1 id="special-ids">
diff --git a/docs/users_guide/ug-book.xml.in b/docs/users_guide/ug-book.xml.in
index dc5d4f7c35..b87563ac3b 100644
--- a/docs/users_guide/ug-book.xml.in
+++ b/docs/users_guide/ug-book.xml.in
@@ -17,7 +17,6 @@
&lang-features;
&ffi-chap;
&extending-ghc;
-&external-core;
&wrong;
&utils;
&win32-dll;
diff --git a/docs/users_guide/ug-ent.xml.in b/docs/users_guide/ug-ent.xml.in
index ce87089f24..a4ed036c3a 100644
--- a/docs/users_guide/ug-ent.xml.in
+++ b/docs/users_guide/ug-ent.xml.in
@@ -12,7 +12,6 @@
<!ENTITY sooner SYSTEM "sooner.xml" >
<!ENTITY lang-features SYSTEM "lang.xml" >
<!ENTITY glasgowexts SYSTEM "glasgow_exts.xml" >
-<!ENTITY external-core SYSTEM "external_core.xml" >
<!ENTITY packages SYSTEM "packages.xml" >
<!ENTITY parallel SYSTEM "parallel.xml" >
<!ENTITY safehaskell SYSTEM "safe_haskell.xml" >
diff --git a/docs/users_guide/using.xml b/docs/users_guide/using.xml
index 8d8211eb5a..1c3e20c041 100644
--- a/docs/users_guide/using.xml
+++ b/docs/users_guide/using.xml
@@ -2967,44 +2967,6 @@ data D = D !C
</sect1>
&runtime;
-
-<sect1 id="ext-core">
- <title>Generating and compiling External Core Files</title>
-
- <indexterm><primary>intermediate code generation</primary></indexterm>
-
- <para>GHC can dump its optimized intermediate code (said to be in &ldquo;Core&rdquo; format)
- to a file as a side-effect of compilation. Non-GHC back-end tools can read and process Core files; these files have the suffix
- <filename>.hcr</filename>. The Core format is described in <ulink url="../../core.pdf">
- <citetitle>An External Representation for the GHC Core Language</citetitle></ulink>,
- and sample tools
- for manipulating Core files (in Haskell) are available in the
- <ulink url="http://hackage.haskell.org/package/extcore">extcore package on Hackage</ulink>. Note that the format of <literal>.hcr</literal>
- files is <emphasis>different</emphasis> from the Core output format that GHC generates
- for debugging purposes (<xref linkend="options-debugging"/>), though the two formats appear somewhat similar.</para>
-
- <para>The Core format natively supports notes which you can add to
- your source code using the <literal>CORE</literal> pragma (see <xref
- linkend="pragmas"/>).</para>
-
- <variablelist>
-
- <varlistentry>
- <term>
- <option>-fext-core</option>
- <indexterm><primary><option>-fext-core</option></primary></indexterm>
- </term>
- <listitem>
- <para>Generate <literal>.hcr</literal> files.</para>
- </listitem>
- </varlistentry>
-
- </variablelist>
-
-<para>Currently (as of version 6.8.2), GHC does not have the ability to read in External Core files as source. If you would like GHC to have this ability, please <ulink url="http://ghc.haskell.org/trac/ghc/wiki/MailingListsAndIRC">make your wishes known to the GHC Team</ulink>.</para>
-
-</sect1>
-
&debug;
&flags;