Diffstat (limited to 'compiler/ndpFlatten/TODO')
-rw-r--r-- | compiler/ndpFlatten/TODO | 202 |
1 file changed, 202 insertions, 0 deletions
diff --git a/compiler/ndpFlatten/TODO b/compiler/ndpFlatten/TODO
new file mode 100644
index 0000000000..e596609205
--- /dev/null
+++ b/compiler/ndpFlatten/TODO
@@ -0,0 +1,202 @@

TODO List for Flattening Support in GHC                            -*-text-*-
=======================================

Middle-End Related
~~~~~~~~~~~~~~~~~~

Flattening Transformation
~~~~~~~~~~~~~~~~~~~~~~~~~

* Complete and test the transformation

* Complete the analysis

* Type transformation: the ideal solution would probably be to add some
  generic machinery, so that all the rules for handling the type and value
  transformations can be defined in a library (the PrelPArr for WayNDP).


Library Related
~~~~~~~~~~~~~~~

* The problem with re-exporting PrelPArr from the Prelude is that it would
  also be visible when -fparr is not given.  There should be a mechanism to
  implicitly import more than one module (like PERVASIVE modules in M3).

* We need a PrelPArr-like library for when flattening is used, too.  In
  fact, we need some library routines that are on the level of merely
  vectorised code (eg, for the dummy default vectors), and then all the
  `PArrays' stuff implementing fast unboxed arrays and fusion.

* Enum is a problem.  Ideally, we would like `enumFromToP' and
  `enumFromThenToP' to be members of `Enum'.  On the other hand, we really
  do not want to change `Enum'.  The solution for the moment is to define

    enumFromTo x y       = mapP toEnum [:fromEnum x .. fromEnum y:]
    enumFromThenTo x y z = mapP toEnum [:fromEnum x, fromEnum y .. fromEnum z:]

  like the Haskell Report does for the list versions.  This is hopefully
  efficient enough, as array fusion should fold the two traversals into
  one.  [DONE]
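  As a quick sanity check, the same trick can be run on ordinary lists, with
  `map' and the usual [ .. ] enumeration standing in for `mapP' and
  [: .. :].  The sketch below only illustrates the toEnum/fromEnum idea and
  is not the PrelPArr code itself (the array versions additionally need
  -fparr and PrelPArr); the `L'-suffixed names are made up to avoid clashing
  with the Prelude functions.

    -- List analogue of the definitions above.
    enumFromToL :: Enum a => a -> a -> [a]
    enumFromToL x y = map toEnum [fromEnum x .. fromEnum y]

    enumFromThenToL :: Enum a => a -> a -> a -> [a]
    enumFromThenToL x y z = map toEnum [fromEnum x, fromEnum y .. fromEnum z]

    main :: IO ()
    main = do
      print (enumFromToL 'a' 'e')              -- "abcde"
      print (enumFromThenToL 0 2 10 :: [Int])  -- [0,2,4,6,8,10]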
DOCU that should go into the Commentary
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The type constructor [::]
-------------------------

The array type constructor [::] is quite similar to [] (the list
constructor) in that GHC has to know about it (in TysWiredIn); however,
there are some differences:

* [::] is an abstract type, whereas [] is not

* if flattening is switched on, all occurrences of the type are actually
  removed by appropriate program transformations.

The module PrelPArr actually implements nested parallel arrays.  [::] is
eliminated only if, in addition to array support, flattening is activated;
flattening is just one option rather than the only method of implementing
those arrays.

  Flags: -fparr      -- syntactic support for parallel arrays (via `PrelPArr')
                        * Dynamic hsc option; can be reversed with -fno-parr
         -fflatten   -- flattening transformation
                        * Static hsc option
         -ndp        -- this is a way option, which implies -fparr and
                        -fflatten (way options are handled by the driver and
                        are not directly seen by hsc)
         -ddump-vect -- dump Core after vectorisation
                        * Dynamic hsc option

* PrelPArr implements array variants of the Prelude list functions plus
  some extra functions.  (Some list functions, eg those generating infinite
  lists, have been left out.)

* prelude/PrelNames has been extended with all the names from PrelPArr that
  need to be known inside the compiler.

* The variable GhcSupportsPArr, which can be set in build.mk, decides
  whether `PrelPArr' is to be compiled or not.  (We probably need to
  suppress compiling PrelPArr in WayNDP, or rather replace it with a
  different PrelPArr.)

* Say something about `TysWiredIn.parrTyCon' as soon as we know how it
  actually works...

Parser and AST Notes:
- Parser and AST are quite straightforward.  Essentially, the list cases
  are duplicated with a name containing `PArr' or `parr' and modified to
  fit the slightly different semantics (ie, finite length, strict).
- The value and pattern `[::]' is an empty explicit parallel array (ie,
  something of the form `ExplicitPArr ty []' in the AST).  This is in
  contrast to lists, which use the nil constructor instead.  In the case of
  parallel arrays, using a constructor would be rather awkward, as it is
  not a constructor-based type.
- Array patterns have the general form `[:p1, p2, ..., pn:]', where n >= 0.
  Thus, two array patterns overlap iff they have the same length.
- The type constructor for parallel arrays is internally represented as a
  `TyCon.AlgTyCon' with a wired-in definition in `TysWiredIn'.

Desugarer Notes:
- Desugaring of patterns involving parallel arrays:
  * In Match.tidy1, we use fake array constructors; ie, any pattern
    `[:p1, ..., pn:]' is replaced by the expression `MkPArr<n> p1 ... pn',
    where `MkPArr<n>' is the n-ary array constructor.  These constructors
    are fake, because they are never used to actually represent array
    values; in fact, they are removed again before pattern compilation is
    finished.  However, the use of these fake constructors means that we
    need not modify large parts of the machinery of the pattern matching
    compiler, as array patterns are handled like any other constructor
    pattern.
  * Check.simplify_pat introduces the same fake constructors as Match.tidy1;
    they are removed again by Check.make_con.
  * In DsUtils.mkCoAlgCaseMatchResult, we catch the case of array patterns
    and generate code as the following example illustrates, where the LHS
    is the code that would be produced if array constructors really
    existed (an executable list-based model of this scheme is sketched at
    the end of these notes):

      case v of pa {
        MkPArr1 x1       -> e1
        MkPArr3 x2 x3 x4 -> e2
        DFT              -> e3
      }

    =>

      case lengthP v of
        I# i# ->
          case i# of l {
            1   -> let x1 = v!:0 in e1
            3   -> let x2 = v!:0; x3 = v!:1; x4 = v!:2 in e2
            DFT -> e3
          }

- The desugaring of array comprehensions is in `DsListComp', but follows
  rules that are different from those for translating list comprehensions.
  Denotationally, it boils down to the same thing, but the operational
  requirements for an efficient implementation of array comprehensions are
  rather different.

    [:e | qss:] = <<[:e | qss:]>> () [:():]

    <<[:e' | :]>> pa ea           = mapP (\pa -> e') ea
    <<[:e' | b, qs:]>> pa ea      = <<[:e' | qs:]>> pa (filterP (\pa -> b) ea)
    <<[:e' | p <- e, qs:]>> pa ea =
      let ef = filterP (\x -> case x of {p -> True; _ -> False}) e
      in
      <<[:e' | qs:]>> (pa, p) (crossP ea ef)
    <<[:e' | let ds, qs:]>> pa ea =
      <<[:e' | qs:]>> (pa, (x_1, ..., x_n))
                      (mapP (\v@pa -> (v, let ds in (x_1, ..., x_n))) ea)
      where
        {x_1, ..., x_n} = DV (ds)       -- Defined Variables
    <<[:e' | qs | qss:]>> pa ea =
      <<[:e' | qss:]>> (pa, (x_1, ..., x_n))
                       (zipP ea <<[:(x_1, ..., x_n) | qs:]>>)
      where
        {x_1, ..., x_n} = DV (qs)       -- Defined Variables

  Moreover, we have

    crossP :: [:a:] -> [:b:] -> [:(a, b):]
    crossP a1 a2 = let
                     len1 = lengthP a1
                     len2 = lengthP a2
                     x1   = concatP $ mapP (replicateP len2) a1
                     x2   = concatP $ replicateP len1 a2
                   in
                     zipP x1 x2

  For a more efficient implementation of `crossP', see `PrelPArr'.

  Optimisations:
  - In the `p <- e' rule, if `pa = ()', drop it and simplify `crossP ea ef'
    to `ef'.
  - We assume that fusion will optimise sequences of array processing
    combinators.
  - Do we want to have the following function?

      mapFilterP :: (a -> Maybe b) -> [:a:] -> [:b:]

    Even with fusion, `(mapP (\p -> e) . filterP (\p -> b))' may still
    result in redundant pattern matching operations.  (Let's wait with this
    until we have seen what the Simplifier does to the generated code.)
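  To make these notes concrete, here is a small executable model of three
  of the schemes above: the length-dispatch translation of array patterns,
  the comprehension translation via `crossP', and the proposed
  `mapFilterP'.  It is written against ordinary lists (length, (!!), map,
  concat, replicate and zip standing in for lengthP, (!:), mapP, concatP,
  replicateP and zipP); the `L'-suffixed names are made up for the sketch,
  and none of this is the desugarer's actual output or the PrelPArr
  implementation.

    -- 1. Length dispatch: what would be written with the array patterns
    --    [:x1:] and [:x2, x3, x4:] becomes a case on the length plus
    --    explicit indexing.
    f :: [Int] -> String
    f v = case length v of
            1 -> let x1 = v !! 0
                 in "one element: " ++ show x1
            3 -> let x2 = v !! 0
                     x3 = v !! 1
                     x4 = v !! 2
                 in "three elements: " ++ show (x2 + x3 + x4)
            _ -> "default case"

    -- 2. The comprehension translation: crossL models crossP, and the
    --    generator rule is applied twice to [: x + y | x <- xs, y <- ys :].
    --    (For plain variable patterns the filterP step is the identity and
    --    is omitted here.)
    crossL :: [a] -> [b] -> [(a, b)]
    crossL a1 a2 = let
                     len1 = length a1
                     len2 = length a2
                     x1   = concat (map (replicate len2) a1)
                     x2   = concat (replicate len1 a2)
                   in
                     zip x1 x2

    comprehension :: [Int] -> [Int] -> [Int]
    comprehension xs ys =
      map (\(((), x), y) -> x + y) (crossL (crossL [()] xs) ys)

    -- 3. The proposed mapFilterP, modelled on lists (this is essentially
    --    Data.Maybe.mapMaybe).
    mapFilterL :: (a -> Maybe b) -> [a] -> [b]
    mapFilterL g = foldr (\x acc -> maybe acc (: acc) (g x)) []

    main :: IO ()
    main = do
      mapM_ (putStrLn . f) [[7], [1, 2, 3], [1, 2]]
      print (comprehension [1, 2] [10, 20])            -- [11,21,12,22]
      print (comprehension [1, 2] [10, 20]
             == [x + y | x <- [1, 2], y <- [10, 20]])  -- True
      print (mapFilterL (\x -> if even x then Just (x * x) else Nothing)
                        [1 .. 10])                     -- [4,16,36,64,100]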
Flattening Notes:
* The story about getting access to all the names, like "fst" etc, that we
  need when generating code during flattening is quite involved.  To have a
  reasonable chance to get at the stuff, we need to put flattening in
  between the desugarer and the simplifier, as an extra pass in
  HscMain.hscMain.  After that point, the persistent compiler state is
  zapped (for heap space reduction reasons, I guess) and nothing remains of
  the imported interfaces in one-shot mode.

  Moreover, to get the Ids that we need into the type environment, we need
  to force the renamer to include them.  This is done in
  RnEnv.getImplicitModuleFVs, which computes all implicitly imported names.
  We let it add the names from FlattenInfo.namesNeededForFlattening.

  Given all these arrangements, FlattenMonad can obtain the needed Ids from
  the persistent compiler state without much further hassle.

  [It might be worthwhile to document in the non-flattening part of the
  Commentary that the persistent compiler state is zapped after desugaring
  and how the free variables determined by the renamer imply which names
  are imported.]
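  As a toy model (emphatically not GHC's actual FlattenMonad, Id type, or
  persistent compiler state), the arrangement boils down to this: the names
  the pass needs are collected into an environment before the pass runs,
  and the pass itself lives in a small reader-style monad from which they
  can simply be looked up.

    import qualified Data.Map as M

    -- Stand-ins, made up for this sketch, for GHC's Name/Id and for the
    -- slice of the persistent compiler state that the flattener uses.
    type Name        = String
    type NeededNames = M.Map String Name

    -- Every action in the pass can consult the table of needed names that
    -- was set up before the pass started.
    newtype Flatten a = Flatten { runFlatten :: NeededNames -> a }

    instance Functor Flatten where
      fmap f (Flatten g) = Flatten (f . g)

    instance Applicative Flatten where
      pure x                  = Flatten (const x)
      Flatten f <*> Flatten g = Flatten (\env -> f env (g env))

    instance Monad Flatten where
      Flatten g >>= k = Flatten (\env -> runFlatten (k (g env)) env)

    -- Look up one of the names the pass relies on; it is a hard error if
    -- the earlier phases failed to make the name available.
    neededName :: String -> Flatten Name
    neededName s = Flatten
      (M.findWithDefault (error ("flattening: name not available: " ++ s)) s)

    main :: IO ()
    main = do
      let env = M.fromList [("fst", "Data.Tuple.fst"), ("mapP", "PrelPArr.mapP")]
      putStrLn (runFlatten (neededName "mapP") env)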