ghc/docs/state_interface/state-interface.verb


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
1001
1002
1003
1004
1005
1006
1007
1008
1009
1010
1011
1012
1013
1014
1015
1016
1017
1018
1019
1020
1021
1022
1023
1024
1025
1026
1027
1028
1029
1030
1031
1032
1033
1034
1035
1036
1037
1038
1039
1040
1041
1042
1043
1044
1045
1046
1047
1048
1049
1050
1051
1052
1053
1054

\documentstyle[a4wide,grasp]{article}
\renewcommand{\textfraction}{0.1}
\renewcommand{\floatpagefraction}{0.9}
\renewcommand{\dblfloatpagefraction}{0.9}

\sloppy
\renewcommand{\today}{July 1996}

\begin{document}

\title{GHC prelude: types and operations}
\author{Simon L Peyton Jones \and John Launchbury \and Will Partain}

\maketitle
\tableofcontents

This ``state interface document'' corresponds to Glasgow Haskell
version~2.01.

\section{Really primitive stuff}

This section defines all the types which are primitive in Glasgow Haskell, and the
operations provided for them.

A primitive type is one which cannot be defined in Haskell, and which is 
therefore built into the language and compiler.
Primitive types are always unboxed; that is, a value of primitive type cannot be 
bottom.

Primitive values are often represented by a simple bit-pattern, such as @Int#@, 
@Float#@, @Double#@.  But this is not necessarily the case: a primitive value 
might be represented by a pointer to a heap-allocated object.  Examples include 
@Array#@, the type of primitive arrays.  You might think this odd: doesn't being 
heap-allocated mean that it has a box?  No, it does not.  A primitive array is 
heap-allocated because it is too big a value to fit in a register, and would be 
too expensive to copy around; in a sense, it is accidental that it is represented 
by a pointer.  If a pointer represents a primitive value, then it really does 
point to that value: no unevaluated thunks, no indirections...nothing can be at 
the other end of the pointer than the primitive value.

This section also describes a few non-primitive types, which are needed 
to express the result types of some primitive operations.

\subsection{Character and numeric types}

There are the following obvious primitive types:
@
type Char#
type Int#	-- see also Word# and Addr#, later
type Float#
type Double#
@
If you want to know their exact equivalents in C, see
@ghc/includes/StgTypes.lh@ in the GHC source.

Literals for these types may be written as follows:
@
1#		an Int#
1.2#		a Float#
1.34##		a Double#
'a'#		a Char#; for weird characters, use '\o<octal>'#
"a"#		an Addr# (a `char *')
@

\subsubsection{Comparison operations}
@
{gt,ge,eq,ne,lt,le}Char# :: Char# -> Char# -> Bool
    -- ditto for Int#, Word#, Float#, Double#, and Addr#
@

\subsubsection{Unboxed-character operations}
@
ord# :: Char# -> Int#
chr# :: Int# -> Char#
@

\subsubsection{Unboxed-@Int@ operations}
@
{plus,minus,times,quot,div,rem}Int# :: Int# -> Int# -> Int#
negateInt# :: Int# -> Int#
@
NB: No error/overflow checking!

\subsubsection{Unboxed-@Double@ and @Float@ operations}
@
{plus,minus,times,divide}Double# :: Double# -> Double# -> Double#
negateDouble# :: Double# -> Double#

float2Int#	:: Double# -> Int#   -- just a cast, no checking!
int2Double#	:: Int# -> Double#

expDouble#	:: Double# -> Double#
logDouble#	:: Double# -> Double#
sqrtDouble#	:: Double# -> Double#
sinDouble#	:: Double# -> Double#
cosDouble#	:: Double# -> Double#
tanDouble#	:: Double# -> Double#
asinDouble#	:: Double# -> Double#
acosDouble#	:: Double# -> Double#
atanDouble#	:: Double# -> Double#
sinhDouble#	:: Double# -> Double#
coshDouble#	:: Double# -> Double#
tanhDouble#	:: Double# -> Double#
powerDouble#	:: Double# -> Double# -> Double#
@
There's an exactly-matching set of unboxed-@Float@ ops; replace
@Double#@ with @Float#@ in the list above.  There are two
coercion functions for @Float#@/@Double#@:
@
float2Double#	:: Float# -> Double#
double2Float#	:: Double# -> Float#
@
The primitive versions of @encodeDouble@/@decodeDouble@:
@
encodeDouble#	:: Int# -> Int# -> ByteArray#	-- Integer mantissa
		-> Int#				-- Int exponent
		-> Double#

decodeDouble#	:: Double#
		-> GHCbase.ReturnIntAndGMP
@
(And the same for @Float#@s.)

\subsection{Operations on/for @Integers@ (interface to GMP)}
\label{sect:horrid-Integer-pairing-types}

We implement @Integers@ (arbitrary-precision integers) using the GNU
multiple-precision (GMP) package.

NB: some of this might change if we upgrade to using GMP~2.x.

The data type for @Integer@ must mirror that for @MP_INT@ in @gmp.h@
(see @gmp.info@).  It comes out as:
@
data Integer = J# Int# Int# ByteArray#
@
So, @Integer@ is really just a ``pairing'' type for a particular
collection of primitive types.

The operations in the GMP return other combinations of
GMP-plus-something, so we need ``pairing'' types for those, too:
@
data Return2GMPs     = Return2GMPs Int# Int# ByteArray# Int# Int# ByteArray#
data ReturnIntAndGMP = ReturnIntAndGMP Int# Int# Int# ByteArray#

-- ????? something to return a string of bytes (in the heap?)
@
The primitive ops to support @Integers@ use the ``pieces'' of the
representation, and are as follows:
@
negateInteger#	:: Int# -> Int# -> ByteArray# -> Integer

{plus,minus,times}Integer# :: Int# -> Int# -> ByteArray#
			   -> Int# -> Int# -> ByteArray#
			   -> Integer

cmpInteger# :: Int# -> Int# -> ByteArray#
	    -> Int# -> Int# -> ByteArray#
	    -> Int# -- -1 for <; 0 for ==; +1 for >

divModInteger#, quotRemInteger#
	:: Int# -> Int# -> ByteArray#
	-> Int# -> Int# -> ByteArray#
	-> GHCbase.Return2GMPs

integer2Int# :: Int# -> Int# -> ByteArray#
	     -> Int# 

int2Integer#  :: Int#  -> Integer -- NB: no error-checking on these two!
word2Integer# :: Word# -> Integer

addr2Integer# :: Addr# -> Integer
	-- the Addr# is taken to be a `char *' string
	-- to be converted into an Integer
@


\subsection{Words and addresses}

A @Word#@ is used for bit-twiddling operations.  It is the same size as
an @Int#@, but has no sign nor any arithmetic operations.
@
type Word#	-- Same size/etc as Int# but *unsigned*
type Addr#	-- A pointer from outside the "Haskell world" (from C, probably);
		-- described under "arrays"
@
@Word#@s and @Addr#@s have the usual comparison operations.
Other unboxed-@Word@ ops (bit-twiddling and coercions):
@
and#, or# :: Word# -> Word# -> Word#

not# :: Word# -> Word#

shiftL#, shiftRA#, shiftRL# :: Word# -> Int# -> Word#
	-- shift left, right arithmetic, right logical

iShiftL#, iShiftRA#, iShiftRL# :: Int# -> Int# -> Int#
	-- same shift ops, but on Int#s

int2Word#	:: Int#  -> Word# -- just a cast, really
word2Int#	:: Word# -> Int#
@

Unboxed-@Addr@ ops (C casts, really):
@
int2Addr#	:: Int#  -> Addr#
addr2Int#	:: Addr# -> Int#
@
Operations for indexing off of C pointers (@Addr#@s) to snatch values
are listed under ``arrays''.

\subsection{Arrays}

The type @Array# elt@ is the type of primitive,
unboxed arrays of values of type @elt@.  
@
type Array# elt
@
@Array#@ is more primitive than a Haskell
array --- indeed, Haskell arrays are implemented using @Array#@ ---
in that an @Array#@ is indexed only by @Int#@s, starting at zero.  It is also
more primitive by virtue of being unboxed.  That doesn't mean that it isn't
a heap-allocated object --- of course, it is.  Rather, being unboxed means
that it is represented by a pointer to the array itself, and not to a thunk
which will evaluate to the array (or to bottom).
The components of an @Array#@ are themselves boxed.

The type @ByteArray#@ is similar to @Array#@, except that it contains
just a string of (non-pointer) bytes.
@
type ByteArray#
@
Arrays of these types are useful when a Haskell program wishes to
construct a value to pass to a C procedure.  It is also possible to
use them to build (say) arrays of unboxed characters for internal use
in a Haskell program.  Given these uses, @ByteArray#@ is deliberately
a bit vague about the type of its components.  Operations are provided
to extract values of type @Char#@, @Int#@, @Float#@, @Double#@, and
@Addr#@ from arbitrary offsets within a @ByteArray#@.  (For type @Foo#@,
the $i$th offset gets you the $i$th @Foo#@, not the @Foo#@ at byte-position $i$.  Mumble.)
(If you want a @Word#@, grab an @Int#@, then coerce it.)

Lastly, we have static byte-arrays, of type @Addr#@ [mentioned
previously].  (Remember the duality between arrays and pointers in C.)
Arrays of this types are represented by a pointer to an array in the
world outside Haskell, so this pointer is not followed by the garbage
collector.  In other respects they are just like @ByteArray#@.  They
are only needed in order to pass values from C to Haskell.

\subsubsection{Reading and writing.}

Primitive arrays are linear, and indexed starting at zero.

The size and indices of a @ByteArray#@, @Addr#@, and
@MutableByteArray#@ are all in bytes.  It's up to the program to
calculate the correct byte offset from the start of the array.  This
allows a @ByteArray#@ to contain a mixture of values of different
type, which is often needed when preparing data for and unpicking
results from C.  (Umm... not true of indices... WDP 95/09)

{\em Should we provide some @sizeOfDouble#@ constants?}

Out-of-range errors on indexing should be caught by the code which
uses the primitive operation; the primitive operations themselves do
{\em not} check for out-of-range indexes. The intention is that the
primitive ops compile to one machine instruction or thereabouts.

We use the terms ``reading'' and ``writing'' to refer to accessing {\em mutable} 
arrays (see Section~\ref{sect:mutable}), and ``indexing'' 
to refer to reading a value from an {\em immutable} 
array.

If you want to read/write a @Word#@, read an @Int#@ and coerce.

Immutable byte arrays are straightforward to index (all indices in bytes):
@
indexCharArray#   :: ByteArray# -> Int# -> Char#
indexIntArray#    :: ByteArray# -> Int# -> Int#
indexAddrArray#   :: ByteArray# -> Int# -> Addr#
indexFloatArray#  :: ByteArray# -> Int# -> Float#
indexDoubleArray# :: ByteArray# -> Int# -> Double#

indexCharOffAddr#   :: Addr# -> Int# -> Char#
indexIntOffAddr#    :: Addr# -> Int# -> Int#
indexFloatOffAddr#  :: Addr# -> Int# -> Float#
indexDoubleOffAddr# :: Addr# -> Int# -> Double#
indexAddrOffAddr#   :: Addr# -> Int# -> Addr#	-- Get an Addr# from an Addr# offset
@
The last of these, @indexAddrOffAddr#@, extracts an @Addr#@ using an offset
from another @Addr#@, thereby providing the ability to follow a chain of
C pointers.

Something a bit more interesting goes on when indexing arrays of boxed
objects, because the result is simply the boxed object. So presumably
it should be entered --- we never usually return an unevaluated
object!  This is a pain: primitive ops aren't supposed to do
complicated things like enter objects.  The current solution is to
return a lifted value, but I don't like it!
@
indexArray#       :: Array# elt -> Int# -> GHCbase.Lift elt  -- Yuk!
@

\subsubsection{The state type}

The primitive type @State#@ represents the state of a state transformer.
It is parameterised on the desired type of state, which serves to keep
states from distinct threads distinct from one another.  But the {\em only}
effect of this parameterisation is in the type system: all values of type
@State#@ are represented in the same way.  Indeed, they are all 
represented by nothing at all!  The code generator ``knows'' to generate no 
code, and allocate no registers etc, for primitive states.
@
type State# s
@

The type @GHCbuiltins.RealWorld@ is truly opaque: there are no values defined
of this type, and no operations over it.  It is ``primitive'' in that
sense---but it is {\em not unboxed!} Its only role in life is to be the type
which distinguishes the @PrimIO@ state transformer (see
Section~\ref{sect:io-spec}).
@
data RealWorld
@

\subsubsection{States}

A single, primitive, value of type @State# RealWorld@ is provided.
@
realWorld# :: State# GHCbuiltins.RealWorld
@
(Note: in the compiler, not a @PrimOp@; just a mucho magic @Id@.)

\subsection{State pairing types}
\label{sect:horrid-pairing-types}

This subsection defines some types which, while they aren't quite primitive 
because we can define them in Haskell, are very nearly so.  They define 
constructors which pair a primitive state with a value of each primitive type.
They are required to express the result type of the primitive operations in the 
state monad.
@
data StateAndPtr#    s elt = StateAndPtr#    (State# s) elt 

data StateAndChar#   s     = StateAndChar#   (State# s) Char# 
data StateAndInt#    s     = StateAndInt#    (State# s) Int# 
data StateAndWord#   s     = StateAndWord#   (State# s) Word#
data StateAndFloat#  s     = StateAndFloat#  (State# s) Float# 
data StateAndDouble# s     = StateAndDouble# (State# s) Double#  
data StateAndAddr#   s     = StateAndAddr#   (State# s) Addr#

data StateAndStablePtr# s a = StateAndStablePtr#  (State# s) (StablePtr# a)
data StateAndForeignObj# s  = StateAndForeignObj# (State# s) ForeignObj#
data StateAndSynchVar#  s a = StateAndSynchVar#  (State# s) (SynchVar# a)

data StateAndArray#            s elt = StateAndArray#        (State# s) (Array# elt) 
data StateAndMutableArray#     s elt = StateAndMutableArray# (State# s) (MutableArray# s elt)  
data StateAndByteArray#        s = StateAndByteArray#        (State# s) ByteArray# 
data StateAndMutableByteArray# s = StateAndMutableByteArray# (State# s) (MutableByteArray# s)
@


\subsection{Mutable arrays}
\label{sect:mutable}

Corresponding to @Array#@ and @ByteArray#@,
we have the types of mutable versions of each.  
In each case, the representation is a pointer
to a suitable block of (mutable) heap-allocated storage.
@
type MutableArray# s elt
type MutableByteArray# s
@
\subsubsection{Allocation.}

Mutable arrays can be allocated.
Only pointer-arrays are initialised; arrays of non-pointers are filled
in by ``user code'' rather than by the array-allocation primitive.
Reason: only the pointer case has to worry about GC striking with a
partly-initialised array.
@
newArray#       :: Int# -> elt -> State# s -> StateAndMutableArray# s elt 

newCharArray#   :: Int# -> State# s -> StateAndMutableByteArray# s 
newIntArray#    :: Int# -> State# s -> StateAndMutableByteArray# s 
newAddrArray#   :: Int# -> State# s -> StateAndMutableByteArray# s 
newFloatArray#  :: Int# -> State# s -> StateAndMutableByteArray# s 
newDoubleArray# :: Int# -> State# s -> StateAndMutableByteArray# s 
@
The size of a @ByteArray#@ is given in bytes.

\subsubsection{Reading and writing}

%OLD: Remember, offsets in a @MutableByteArray#@ are in bytes.
@
readArray#       :: MutableArray# s elt -> Int# -> State# s -> StateAndPtr#    s elt
readCharArray#   :: MutableByteArray# s -> Int# -> State# s -> StateAndChar#   s
readIntArray#    :: MutableByteArray# s -> Int# -> State# s -> StateAndInt#    s
readAddrArray#	 :: MutableByteArray# s -> Int# -> State# s -> StateAndAddr#   s 
readFloatArray#  :: MutableByteArray# s -> Int# -> State# s -> StateAndFloat#  s 
readDoubleArray# :: MutableByteArray# s -> Int# -> State# s -> StateAndDouble# s 

writeArray#       :: MutableArray# s elt -> Int# -> elt     -> State# s -> State# s 
writeCharArray#   :: MutableByteArray# s -> Int# -> Char#   -> State# s -> State# s 
writeIntArray#    :: MutableByteArray# s -> Int# -> Int#    -> State# s -> State# s 
writeAddrArray#   :: MutableByteArray# s -> Int# -> Addr#   -> State# s -> State# s 
writeFloatArray#  :: MutableByteArray# s -> Int# -> Float#  -> State# s -> State# s 
writeDoubleArray# :: MutableByteArray# s -> Int# -> Double# -> State# s -> State# s 
@

\subsubsection{Equality.}

One can take ``equality'' of mutable arrays.  What is compared is the
{\em name} or reference to the mutable array, not its contents.
@
sameMutableArray#     :: MutableArray# s elt -> MutableArray# s elt -> Bool
sameMutableByteArray# :: MutableByteArray# s -> MutableByteArray# s -> Bool
@

\subsubsection{Freezing mutable arrays}

Only unsafe-freeze has a primitive.  (Safe freeze is done directly in Haskell 
by copying the array and then using @unsafeFreeze@.) 
@
unsafeFreezeArray#     :: MutableArray# s elt -> State# s -> StateAndArray#     s elt
unsafeFreezeByteArray# :: MutableByteArray# s -> State# s -> StateAndByteArray# s
@

\subsubsection{Stable pointers}

{\em Andy's comment.} {\bf Errors:} The following is not strictly true: the current
implementation is not as polymorphic as claimed.  The reason for this
is that the C programmer will have to use a different entry-routine
for each type of stable pointer.  At present, we only supply a very
limited number (1) of these routines.  It might be possible to
increase the range of these routines by providing general purpose
entry points to apply stable pointers to (stable pointers to)
arguments and to enter (stable pointers to) boxed primitive values.
{\em End of Andy's comment.}

A stable pointer is a name for a Haskell object which can be passed to the 
external world.  It is ``stable'' in the sense that the name does not change when 
the Haskell garbage collector runs --- in contrast to the address of the object 
which may well change.

The stable pointer type is parameterised by the type of the thing which is named.
@
type StablePtr# a
@
A stable pointer is represented by an index into the (static) 
@StablePointerTable@.  The Haskell garbage collector treats the 
@StablePointerTable@ as a source of roots for GC.

The @makeStablePointer@ function converts a value into a stable pointer.
It is part of the @PrimIO@ monad, because we want to be sure we don't
allocate one twice by accident, and then only free one of the copies.
@
makeStablePointer#  :: a -> State# RealWorld -> StateAndStablePtr# RealWorld a
freeStablePointer#  :: StablePtr# a -> State# RealWorld -> State# RealWorld
deRefStablePointer# :: StablePtr# a -> State# RealWorld -> StateAndPtr RealWorld a
@
There is also a C procedure @FreeStablePtr@ which frees a stable pointer.

%
% Rewritten and updated for MallocPtr++ -- 4/96 SOF
%
\subsubsection{Foreign objects}

A @ForeignObj@ is a reference to an object outside the Haskell
world (i.e., from the C world, or a reference to an object on another
machine completely.), where the Haskell world has been told ``Let me
know when you're finished with this ...''.

The @ForeignObj@ type is just a special @Addr#@ ({\em not} parameterised).
@
type ForeignObj#
@

A typical use of @ForeignObj@ is in constructing Haskell bindings
to external libraries. A good example is that of writing a binding to
an image-processing library (which was actually the main motivation
for implementing @ForeignObj@'s precursor, @MallocPtr@). The
images manipulated are not stored in the Haskell heap, either because
the library insist on allocating them internally or we (sensibly)
decide to spare the GC from having to heave heavy images around.

@
data Image = Image ForeignObj#

instance CCallable Image
@

The @ForeignObj#@ type is then used to refer to the externally
allocated image, and to acheive some type safety, the Haskell binding
defines the @Image@ data type. So, a value of type @ForeignObj#@ is
used to ``box'' up an external reference into a Haskell heap object
that we can then indirectly reference:

@
createImage :: (Int,Int) -> PrimIO Image
@

So far, this looks just like an @Addr#@ type, but @ForeignObj#@
offers a bit more, namely that we can specify a {\em finalisation
routine} to invoke when the @ForeignObj#@ is discarded by the
GC. The garbage collector invokes the finalisation routine associated
with the @ForeignObj#@, saying `` Thanks, I'm through with this
now..'' For the image-processing library, the finalisation routine could for
the images free up memory allocated for them. The finalisation routine has
currently to be written in C (the finalisation routine can in turn call on
@FreeStablePtr@ to deallocate a stable pointer.).

Associating a finalisation routine with an external object is done by 
@makeForeignObj#@:

@
makeForeignObj# :: Addr# -- foreign reference
                -> Addr# -- pointer to finalisation routine
		-> StateAndForeignObj# RealWorld ForeignObj#
@

(Implementation: a linked list of all @ForeignObj#@s is maintained to allow the
 garbage collector to detect when a @ForeignObj#@ becomes garbage.)

Like @Array@, @ForeignObj#@s are represented by heap objects.

{\bf ToDo:} Decide whether @FreeCHeapPointer@ is allowed to call on a
stable pointer. (I sincerely hope not since we will still be in the
GC at this point.)

\subsubsection{Synchronizing variables (I-vars, M-vars)}

ToDo ToDo ToDo

@
type SynchVar# s elt	-- primitive

newSynchVar#:: State# s -> StateAndSynchVar# s elt

takeMVar#   :: SynchVar# s elt -> State# s -> StateAndPtr# s elt
putMVar#    :: SynchVar# s elt -> State# s -> State# s

readIVar#   :: SynchVar# s elt -> State# s -> StateAndPtr# s elt
writeIVar#  :: SynchVar# s elt -> State# s -> State# s
@

\subsubsection{Controlling the garbage collector}

The C function {\tt PerformGC\/}, allows the C world to force Haskell
to do a garbage collection.  It can only be called while Haskell
is performing a C Call.

Note that this function can be used to define a Haskell IO operation
with the same effect:
@
>	performGCIO :: PrimIO ()
>	performGCIO = _ccall_gc_ PerformGC
@

{\bf ToDo:} Is there any need for abnormal/normal termination to force
a GC too?  Is there any need for a function that provides finer
control over GC: argument = amount of space required; result = amount
of space recovered.

\subsection{@spark#@ primitive operation (for parallel execution)}

{\em ToDo: say something}  It's used in the unfolding for @par@.

\subsection{The @errorIO#@ primitive operation}

The @errorIO#@ primitive takes an argument much like @PrimIO@.  It aborts execution of
the current program, and continues instead by performing the given @PrimIO@-like value
on the current state of the world.
@
errorIO# :: (State RealWorld -> ((), State RealWorld)) -> a
@

\subsection{C Calls}

{\bf ToDo:} current implementation has state variable as second
argument not last argument.

The @ccall#@ primitive can't be given an ordinary type, because it has
a variable number of arguments.  The nearest we can get is:
@
ccall# :: CRoutine -> a1# -> ... -> an# -> State# RealWorld -> StateAndR# RealWorld
@
where the type variables @a1#@\ldots@an#@ and @r#@ can be instantiated by any
primitive type, and @StateAndR#@ is the appropriate pairing type from 
Section~\ref{sect:horrid-pairing-types}.  The @CRoutine@ 
isn't a proper Haskell type at all; it just reminds us that @ccall#@ needs to 
know what C routine to call.

This notation is really short for a massive family of @ccall#@ primitives, one 
for each combination of types.  (Of course, the compiler simply remembers the 
types involved, and generates appropriate code when it finally spits out the C.)

Unlike all the other primitive operators, @ccall#@ is not bound to an in-scope 
identifier.  The only way it is possible to generate a @ccall#@ is via the 
@_ccall_@ construct.

All this applies equally to @casm#@:
@
casm#  :: CAsmString -> a1# -> ... -> an# -> State# RealWorld -> StateAndR# RealWorld
@

%------------------------------------------------------------
\section{Library stuff built with the Really Primitive Stuff}

\subsection{The state transformer monad}

\subsubsection{Types}

A state transformer is a function from a state to a pair of a result and a new 
state.  
@
newtype ST s a = ST (State s -> (a, State s))
@
The @ST@ type is {\em abstract}, so that the programmer cannot see its 
representation.  If he could, he could write bad things like:
@
bad :: ST s a
bad = ST $ \ s -> ...(f s)...(g s)...
@
Here, @s@ is duplicated, which would be bad news.

A state is represented by a primitive state value, of type @State# s@, 
wrapped up in a @State@ constructor.  The reason for boxing it in this
way is so that we can be strict or lazy in the state.  (Remember, all 
primitive types are unboxed, and hence can't be bottom; but types built
with @data@ are all boxed.)
@
data State s = S# (State# s)
@

\subsubsection{The state transformer combinators}

Now for the combinators, all of which live inside the @ST@
abstraction.  Notice that @returnST@ and @thenST@ are lazy in the
state.
@
returnST :: a -> ST s a
returnST a s = (a, s)

thenST :: ST s a -> (a -> ST s b) -> ST s b
thenST m k s = let (r,new_s) = m s
               in 
               k r new_s

fixST :: (a -> ST s a) -> ST s a
fixST k s = let ans = k r s
                (r,new_s) = ans
            in
            ans
@
The interesting one is, of course, @runST@.  We can't infer its type!
(It has a funny name because it must be wired into the compiler.)
@
-- runST :: forall a. (forall s. ST s a) -> a
runST m = case m (S# realWorld#) of
           (r,_) -> r
@

\subsubsection{Other useful combinators}

There are various other standard combinators, all defined in terms the
fundamental combinators above. The @seqST@ combinator is like
@thenST@, except that it discards the result of the first state
transformer:
@
seqST :: ST s a -> ST s b -> ST s b
seqST m1 m2 = m1 `thenST` (\_ -> m2)
@

We also have {\em strict} (... in the state...) variants of the
then/return combinators (same types as their pals):
@
returnStrictlyST a s@(S# _) = (a, s)

thenStrictlyST m k s@(S# _)
  = case (m s) of { (r, new_s@(S# _)) ->
    k r new_s }

seqStrictlyST m k = ... ditto, for seqST ...
@

The combinator @listST@ takes a list of state transformers, and
composes them in sequence, returning a list of their results:
@
listST :: [ST s a] -> ST s [a]
listST []     = returnST []
listST (m:ms) = m		`thenST` \ r ->
		listST ms	`thenST` \ rs ->
		returnST (r:rs)
@
The @mapST@ combinator ``lifts'' a function from a value to state
transformers to one which works over a list of values:
@
mapST :: (a -> ST s b) -> [a] -> ST s [b]
mapST f ms = listST (map f ms)
@
The @mapAndUnzipST@ combinator is similar to @mapST@, except that here the
function returns a pair:
@
mapAndUnzipST :: (a -> ST s (b,c)) -> [a] -> ST s ([b],[c])
mapAndUnzipST f (m:ms)
  = f m			`thenST` \ ( r1,  r2) ->
    mapAndUnzipST f ms	`thenST` \ (rs1, rs2) ->
    returnST (r1:rs1, r2:rs2)
@

\subsubsection{The @PrimIO@ monad}
\label{sect:io-spec}

The @PrimIO@ type is defined in as a state transformer which manipulates the 
@RealWorld@.
@
type PrimIO a = ST RealWorld a      -- Transparent
@
The @PrimIO@ type is an ordinary type synonym, transparent to the programmer.

The type @RealWorld@ and value @realWorld#@ do not need to be hidden (although 
there is no particular point in exposing them).  Even having a value of type 
@realWorld#@ does not compromise safety, since the type @ST@ is hidden. 

It is type-correct to use @returnST@ in an I/O context, but it is a
bit more efficient to use @returnPrimIO@.  The latter is strict in the
state, which propagates backwards to all the earlier combinators
(provided they are unfolded).  Why is it safe for @returnPrimIO@ to be
strict in the state?  Because every context in which an I/O state
transformer is used will certainly evaluate the resulting state; it is
the state of the real world!
@
returnPrimIO :: a -> PrimIO a
returnPrimIO a s@(S# _) -> (a, s)
@
We provide strict versions of the other combinators too.
@
thenPrimIO m k s = case m s of
		     (r,s) -> k r s
@
@fixPrimIO@ has to be lazy, though!
@
fixPrimIO  = fixST
@
The other combinators are just the same as before, but use the strict
@thenPrimIO@ and @returnPrimIO@ for efficiency.
@
foldrPrimIO f z []     = z
foldrPrimIO f z (m:ms) = foldrPrimIO f z ms `thenPrimIO` \ ms' ->
			 f m ms'

listPrimIO ms = foldrPrimIO (\ a xs -> a `thenPrimIO` \ x -> returnPrimIO (x : xs))
		(returnPrimIO []) ms

mapPrimIO f ms = listPrimIO (map f ms)

mapAndUnzipPrimIO f (m:ms)
  = f m			    `thenPrimIO` \ ( r1,  r2) ->
    mapAndUnzipPrimIO f ms  `thenPrimIO` \ (rs1, rs2) ->
    returnPrimIO (r1:rs1, r2:rs2)
@

\subsection{Arrays}

\subsubsection{Types}

@
data Array      ix elt = Array     (ix,ix) (Array# elt)
data ByteArray ix      = ByteArray (ix,ix) ByteArray#

data MutableArray     s ix elt = MutableArray     (ix,ix) (MutableArray# s elt)
data MutableByteArray s ix     = MutableByteArray (ix,ix) (MutableByteArray# s)
@

\subsubsection{Operations on immutable arrays}

Ordinary array indexing is straightforward.
@
(!) :: Ix ix => Array ix elt -> ix -> elt
@
QUESTIONs: should @ByteArray@s be indexed by Ints or ix?  With byte offsets
or sized ones? (sized ones [WDP])
@
indexCharArray   :: Ix ix => ByteArray ix -> ix -> Char
indexIntArray    :: Ix ix => ByteArray ix -> ix -> Int
indexAddrArray   :: Ix ix => ByteArray ix -> ix -> Addr
indexFloatArray  :: Ix ix => ByteArray ix -> ix -> Float
indexDoubleArray :: Ix ix => ByteArray ix -> ix -> Double
@
@Addr@s are indexed straightforwardly by @Int@s.  Unlike the primitive
operations, though, the offsets assume that the array consists entirely of the
type of value being indexed, and so there's an implicit multiplication by
the size of that value.  To access @Addr@s with mixed values requires
you to do a DIY job using the primitives.
@
indexAddrChar :: Addr -> Int -> Char
...etc...
indexStaticCharArray   :: Addr -> Int -> Char
indexStaticIntArray    :: Addr -> Int -> Int
indexStaticFloatArray  :: Addr -> Int -> Float
indexStaticDoubleArray :: Addr -> Int -> Double
indexStaticArray       :: Addr -> Int -> Addr
@

\subsubsection{Operations on mutable arrays}
@
newArray     :: Ix ix => (ix,ix) -> elt -> ST s (MutableArray s ix elt)
newCharArray :: Ix ix => (ix,ix) -> ST s (MutableByteArray s ix) 
...
@

@
readArray   :: Ix ix => MutableArray s ix elt -> ix -> ST s elt 
readCharArray   :: Ix ix => MutableByteArray s ix -> ix -> ST s Char 
...
@

@
writeArray  :: Ix ix => MutableArray s ix elt -> ix -> elt -> ST s () 
writeCharArray  :: Ix ix => MutableByteArray s ix -> ix -> Char -> ST s () 
...
@

@
freezeArray :: Ix ix => MutableArray s ix elt -> ST s (Array ix elt)
freezeCharArray :: Ix ix => MutableByteArray s ix -> ST s (ByteArray ix)
...
@

We have no need on one-function-per-type for unsafe freezing:
@
unsafeFreezeArray :: Ix ix => MutableArray s ix elt -> ST s (Array ix elt)  
unsafeFreezeByteArray :: Ix ix => MutableByteArray s ix -> ST s (ByteArray ix)
@

Sometimes we want to snaffle the bounds of one of these beasts:
@
boundsOfArray     :: Ix ix => MutableArray s ix elt -> (ix, ix)  
boundsOfByteArray :: Ix ix => MutableByteArray s ix -> (ix, ix)
@

Lastly, ``equality'':
@
sameMutableArray     :: MutableArray s ix elt -> MutableArray s ix elt -> Bool
sameMutableByteArray :: MutableByteArray s ix -> MutableByteArray s ix -> Bool
@


\subsection{Variables}

\subsubsection{Types}

Mutable variables are (for now anyway) implemented as arrays.  The @MutableVar@ type
is opaque, so we can change the implementation later if we want.
@
type MutableVar s a = MutableArray s Int a
@

\subsubsection{Operations}
@
newVar   :: a -> ST s (MutableVar s a)
readVar  :: MutableVar s a -> ST s a
writeVar :: MutableVar s a -> a -> ST s ()
sameVar  :: MutableVar s a -> MutableVar s a -> Bool
@

\subsection{Stable pointers}

Nothing exciting here, just simple boxing up.
@
data StablePtr a = StablePtr (StablePtr# a)

makeStablePointer :: a -> StablePtr a
freeStablePointer :: StablePtr a -> PrimIO ()
@

\subsection{Foreign objects}

Again, just boxing up.
@
data ForeignObj = ForeignObj ForeignObj#

makeForeignObj :: Addr -> Addr -> PrimIO ForeignObj
@

\subsection{C calls}

Everything in this section goes for @_casm_@ too.

{\em ToDo: mention @_ccall_gc_@ and @_casm_gc_@...}

The @_ccall_@ construct has the following form:
$$@_ccall_@~croutine~a_1~\ldots~a_n$$
This whole construct has type $@PrimIO@~res$.
The rules are these:
\begin{itemize}
\item
The first ``argument'', $croutine$, must be the literal name of a C procedure.
It cannot be a Haskell expression which evaluates to a string, etc; it must be 
simply the name of the procedure.
\item
The arguments $a_1, \ldots,a_n$ must be of {\em C-callable} type.
\item
The whole construct has type $@PrimIO@~ty$, where $ty$ is a {\em C-returnable} type.
\end{itemize}
A {\em boxed-primitive} type is both C-callable and C-returnable.
A boxed primitive type is anything declared by:
@
data T = C# t
@
where @t@ is a primitive type.  Note that
programmer-defined boxed-primitive types are perfectly OK:
@
data Widget = W# Int#
data Screen = S# CHeapPtr#
@

There are other types that can be passed to C (C-callable).  This
table summarises (including the standard boxed-primitive types):
@
Boxed	    	  Type of transferd   Corresp.     Which is
Type	    	  Prim. component     C type       *probably*...
------	    	  ---------------     ------       -------------
Char	    	  Char#		      StgChar	    unsigned char
Int 	    	  Int#		      StgInt	    long int
Word	    	  Word#		      StgWord	    unsigned long int
Addr	    	  Addr#		      StgAddr	    char *
Float	    	  Float#	      StgFloat      float
Double		  Double#	      StgDouble     double
		  		       
Array		  Array#	      StgArray      StgPtr
ByteArray	  ByteArray#	      StgByteArray  StgPtr
MutableArray	  MutableArray#	      StgArray      StgPtr
MutableByteArray  MutableByteArray#   StgByteArray  StgPtr
		  		      
State		  State#	      nothing!
		  		      
StablePtr	  StablePtr#	      StgStablePtr  StgPtr
ForeignObj	  ForeignObj#	      StgForeignObj StgPtr
@

All of the above are {\em C-returnable} except:
@
	Array, ByteArray, MutableArray, MutableByteArray
@

{\bf ToDo:} I'm pretty wary of @Array@ and @MutableArray@ being in
this list, and not too happy about @State@ [WDP].

{\bf ToDo:} Can code generator pass all the primitive types?  Should this be
extended to include {\tt Bool\/} (or any enumeration type?)

The type checker must be able to figure out just which of the C-callable/returnable
types is being used.  If it can't, you have to add type signatures. For example,
@
f x = _ccall_ foo x
@
is not good enough, because the compiler can't work out what type @x@ is, nor 
what type the @_ccall_@ returns.  You have to write, say:
@
f :: Int -> PrimIO Float
f x = _ccall_ foo x
@

\subsubsection{Implementation}

The desugarer unwraps the @_ccall_@ construct by inserting the necessary 
evaluations etc to unbox the arguments.  For example, the body of the definition 
of @f@ above would become:
@
        (\ s -> case x of { I# x# -> 
                case s of { S# s# ->
                case ccall# [Int#,Float#] x# s# of { StateAndFloat# f# new_s# ->
                (F# f#, S# new_s#)
                }}})
@
Notice that the state, too, is unboxed.

The code generator must deal specially with primitive objects which
are stored on the heap.

... details omitted ...

%
%More importantly, it must construct a C Heap Pointer heap-object after
%a @_ccall_@ which returns a @MallocPtr#@.
%

%--------------------------------------------------------
\section{Non-primitive stuff that must be wired into GHC}

@
data Char    = C# Char#
data Int     = I# Int#
data Word    = W# Word#
data Addr    = A# Addr#

data Float   = F# Float#
data Double  = D# Double#
data Integer = J# Int# Int# ByteArray#

-- and the other boxed-primitive types:
    Array, ByteArray, MutableArray, MutableByteArray,
    StablePtr, ForeignObj

data Bool     = False | True
data Ordering = LT | EQ | GT  -- used in derived comparisons

data List a = [] | a : (List a)
-- tuples...

data Lift a = Lift a    -- used Yukkily as described elsewhere

type String  = [Char]    -- convenience, only
@

%------------------------------------------------------------
\section{Programmer interface(s)}

\subsection{The bog-standard interface}

If you rely on the implicit @import Prelude@ that GHC normally does
for you, and if you don't use any weird flags (notably
@-fglasgow-exts@), and if you don't import one of the fairly-magic
@PreludeGla*@ interfaces, then GHC should work {\em exactly} as the
Haskell report says, and the full user namespaces should be available
to you.

\subsection{If you mess about with @import Prelude@...}

Innocent hiding, e.g.,
@
import Prelude hiding ( fromIntegral )
@
should work just fine.

There are some things you can do that will make GHC crash, e.g.,
hiding a standard class:
@
import Prelude hiding ( Eq(..) )
@
Don't do that.

\subsection{Turning on Glasgow extensions with @-fglasgow-exts@}

If you turn on @-fglasgow-exts@, then all the primitive types and
operations described herein are available.

It is possible that some name conflicts between your code and the
wired-in things might spring to life (though we doubt it...).
Change your names :-)

\end{document}