1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
1001
1002
1003
1004
1005
1006
1007
1008
1009
1010
1011
1012
1013
1014
1015
1016
1017
1018
1019
1020
1021
1022
1023
1024
1025
1026
1027
1028
1029
1030
1031
1032
1033
1034
1035
1036
1037
1038
1039
1040
1041
1042
1043
1044
1045
1046
1047
1048
1049
1050
1051
1052
1053
1054
1055
1056
1057
1058
1059
1060
1061
1062
1063
1064
1065
1066
1067
1068
1069
1070
1071
1072
1073
1074
1075
1076
1077
1078
1079
1080
1081
1082
1083
1084
1085
1086
1087
1088
1089
1090
1091
1092
1093
1094
1095
1096
1097
1098
1099
1100
1101
1102
1103
1104
1105
1106
1107
1108
1109
1110
1111
1112
1113
1114
1115
1116
1117
1118
1119
1120
1121
1122
1123
1124
1125
1126
1127
1128
1129
1130
1131
1132
1133
1134
1135
1136
1137
1138
1139
1140
1141
1142
1143
1144
1145
1146
1147
1148
1149
1150
1151
1152
1153
1154
1155
1156
1157
1158
1159
1160
1161
1162
1163
1164
1165
1166
1167
1168
1169
1170
1171
1172
1173
1174
1175
1176
1177
1178
1179
1180
1181
1182
1183
1184
1185
1186
1187
1188
1189
1190
1191
1192
1193
1194
1195
1196
1197
1198
1199
1200
1201
1202
1203
1204
1205
1206
1207
1208
1209
1210
1211
1212
1213
1214
1215
1216
1217
1218
1219
1220
1221
1222
1223
1224
1225
1226
1227
1228
1229
1230
1231
1232
1233
1234
1235
1236
1237
1238
1239
1240
1241
1242
1243
1244
1245
1246
1247
1248
1249
1250
1251
1252
1253
1254
1255
1256
1257
1258
1259
1260
1261
1262
1263
1264
1265
1266
1267
1268
1269
1270
1271
1272
1273
1274
1275
1276
1277
1278
1279
1280
1281
1282
1283
1284
1285
1286
1287
1288
1289
1290
1291
1292
1293
1294
1295
1296
1297
1298
1299
1300
1301
1302
1303
1304
1305
1306
1307
1308
1309
1310
1311
1312
1313
1314
1315
1316
1317
1318
1319
1320
1321
1322
1323
1324
1325
1326
1327
1328
1329
1330
1331
1332
1333
1334
1335
1336
1337
1338
1339
1340
1341
1342
1343
1344
1345
1346
1347
1348
1349
1350
1351
1352
1353
1354
1355
1356
1357
1358
1359
1360
1361
1362
1363
1364
1365
1366
1367
1368
1369
1370
1371
1372
1373
1374
1375
1376
1377
1378
1379
1380
1381
1382
1383
1384
1385
1386
1387
1388
1389
1390
1391
1392
1393
1394
1395
1396
1397
1398
1399
1400
1401
1402
1403
1404
1405
1406
1407
1408
1409
1410
1411
1412
1413
1414
1415
1416
1417
1418
1419
1420
1421
1422
1423
1424
1425
1426
1427
1428
1429
1430
1431
1432
1433
1434
1435
1436
1437
1438
1439
|
{-
(c) The University of Glasgow 2011
-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE LambdaCase #-}
-- | The deriving code for the Functor, Foldable, and Traversable classes
module GHC.Tc.Deriv.Functor
( FFoldType(..)
, functorLikeTraverse
, deepSubtypesContaining
, foldDataConArgs
, gen_Functor_binds
, gen_Foldable_binds
, gen_Traversable_binds
)
where
import GHC.Prelude
import GHC.Data.Bag
import GHC.Core.DataCon
import GHC.Data.FastString
import GHC.Hs
import GHC.Utils.Panic
import GHC.Builtin.Names
import GHC.Types.Name.Reader
import GHC.Types.SrcLoc
import GHC.Utils.Monad.State.Strict
import GHC.Tc.Deriv.Generate
import GHC.Tc.Utils.TcType
import GHC.Core.TyCon
import GHC.Core.TyCo.Rep
import GHC.Core.Type
import GHC.Utils.Misc
import GHC.Types.Var
import GHC.Types.Var.Set
import GHC.Types.Id.Make (coerceId)
import GHC.Builtin.Types (true_RDR, false_RDR)
import Data.Maybe (catMaybes, isJust)
{-
************************************************************************
* *
Functor instances
see http://www.mail-archive.com/haskell-prime@haskell.org/msg02116.html
* *
************************************************************************
For the data type:
data T a = T1 Int a | T2 (T a)
We generate the instance:
instance Functor T where
fmap f (T1 b1 a) = T1 b1 (f a)
fmap f (T2 ta) = T2 (fmap f ta)
Notice that we don't simply apply 'fmap' to the constructor arguments.
Rather
- Do nothing to an argument whose type doesn't mention 'a'
- Apply 'f' to an argument of type 'a'
- Apply 'fmap f' to other arguments
That's why we have to recurse deeply into the constructor argument types,
rather than just one level, as we typically do.
What about types with more than one type parameter? In general, we only
derive Functor for the last position:
data S a b = S1 [b] | S2 (a, T a b)
instance Functor (S a) where
fmap f (S1 bs) = S1 (fmap f bs)
fmap f (S2 (p,q)) = S2 (a, fmap f q)
However, we have special cases for
- tuples
- functions
More formally, we write the derivation of fmap code over type variable
'a for type 'b as ($fmap 'a 'b x). In this general notation the derived
instance for T is:
instance Functor T where
fmap f (T1 x1 x2) = T1 ($(fmap 'a 'b1) x1) ($(fmap 'a 'a) x2)
fmap f (T2 x1) = T2 ($(fmap 'a '(T a)) x1)
$(fmap 'a 'b x) = x -- when b does not contain a
$(fmap 'a 'a x) = f x
$(fmap 'a '(b1,b2) x) = case x of (x1,x2) -> ($(fmap 'a 'b1 x1), $(fmap 'a 'b2 x2))
$(fmap 'a '(T b1 a) x) = fmap f x -- when a only occurs directly as the last argument of T
$(fmap 'a '(T b1 b2) x) = fmap (\y. $(fmap 'a 'b2 y)) x -- when a only occurs in the last parameter, b2
$(fmap 'a '(tb -> tc) x) = \(y:tb[b/a]) -> $(fmap 'a' 'tc' (x $(cofmap 'a 'tb y)))
For functions, the type parameter 'a can occur in a contravariant position,
which means we need to derive a function like:
cofmap :: (a -> b) -> (f b -> f a)
This is pretty much the same as $fmap, only without the $(cofmap 'a 'a x) and
$(cofmap 'a '(T b1 a) x) cases:
$(cofmap 'a 'b x) = x -- when b does not contain a
$(cofmap 'a 'a x) = error "type variable in contravariant position"
$(cofmap 'a '(b1,b2) x) = case x of (x1,x2) -> ($(cofmap 'a 'b1) x1, $(cofmap 'a 'b2) x2)
$(cofmap 'a '(T b1 a) x) = error "type variable in contravariant position" -- when a only occurs directly as the last argument of T
$(cofmap 'a '(T b1 b2) x) = fmap (\y. $(cofmap 'a 'b2 y)) x -- when a only occurs in the last parameter, b2
$(cofmap 'a '(tb -> tc) x) = \(y:tb[b/a]) -> $(cofmap 'a' 'tc' (x $(fmap 'a 'tb y)))
Note that the code produced by $(fmap _ _ _) is always a higher order function,
with type `(a -> b) -> (g a -> g b)` for some g.
Note that there are two distinct cases in $fmap (and $cofmap) that match on an
application of some type constructor T (where T is not a tuple type
constructor):
$(fmap 'a '(T b1 a) x) = fmap f x -- when a only occurs directly as the last argument of T
$(fmap 'a '(T b1 b2) x) = fmap (\y. $(fmap 'a 'b2 y)) x -- when a only occurs in the last parameter, b2
While the latter case technically subsumes the former case, it is important to
give special treatment to the former case to avoid unnecessary eta expansion.
See Note [Avoid unnecessary eta expansion in derived fmap implementations].
We also generate code for (<$) in addition to fmap—see Note [Deriving <$] for
an explanation of why this is important. Just like $fmap/$cofmap above, there
is a similar algorithm for generating `p <$ x` (for some constant `p`):
$(replace 'a 'b x) = x -- when b does not contain a
$(replace 'a 'a x) = p
$(replace 'a '(b1,b2) x) = case x of (x1,x2) -> ($(replace 'a 'b1 x1), $(replace 'a 'b2 x2))
$(replace 'a '(T b1 a) x) = p <$ x -- when a only occurs directly as the last argument of T
$(replace 'a '(T b1 b2) x) = fmap (\y. $(replace 'a 'b2 y)) x -- when a only occurs in the last parameter, b2
$(replace 'a '(tb -> tc) x) = \(y:tb[b/a]) -> $(replace 'a' 'tc' (x $(coreplace 'a 'tb y)))
$(coreplace 'a 'b x) = x -- when b does not contain a
$(coreplace 'a 'a x) = error "type variable in contravariant position"
$(coreplace 'a '(b1,b2) x) = case x of (x1,x2) -> ($(coreplace 'a 'b1 x1), $(coreplace 'a 'b2 x2))
$(coreplace 'a '(T b1 a) x) = error "type variable in contravariant position" -- when a only occurs directly as the last argument of T
$(coreplace 'a '(T b1 b2) x) = fmap (\y. $(coreplace 'a 'b2 y)) x -- when a only occurs in the last parameter, b2
$(coreplace 'a '(tb -> tc) x) = \(y:tb[b/a]) -> $(coreplace 'a' 'tc' (x $(replace 'a 'tb y)))
-}
gen_Functor_binds :: SrcSpan -> TyCon -> [Type] -> (LHsBinds GhcPs, BagDerivStuff)
-- When the argument is phantom, we can use fmap _ = coerce
-- See Note [Phantom types with Functor, Foldable, and Traversable]
gen_Functor_binds loc tycon _
| Phantom <- last (tyConRoles tycon)
= (unitBag fmap_bind, emptyBag)
where
fmap_name = L (noAnnSrcSpan loc) fmap_RDR
fmap_bind = mkRdrFunBind fmap_name fmap_eqns
fmap_eqns = [mkSimpleMatch fmap_match_ctxt
[nlWildPat]
coerce_Expr]
fmap_match_ctxt = mkPrefixFunRhs fmap_name
gen_Functor_binds loc tycon tycon_args
= (listToBag [fmap_bind, replace_bind], emptyBag)
where
data_cons = getPossibleDataCons tycon tycon_args
fmap_name = L (noAnnSrcSpan loc) fmap_RDR
-- See Note [EmptyDataDecls with Functor, Foldable, and Traversable]
fmap_bind = mkRdrFunBindEC 2 id fmap_name fmap_eqns
fmap_match_ctxt = mkPrefixFunRhs fmap_name
fmap_eqn con = flip evalState bs_RDRs $
match_for_con fmap_match_ctxt [f_Pat] con parts
where
parts = foldDataConArgs ft_fmap con
fmap_eqns = map fmap_eqn data_cons
ft_fmap :: FFoldType (LHsExpr GhcPs -> State [RdrName] (LHsExpr GhcPs))
ft_fmap = FT { ft_triv = \x -> pure x
-- fmap f x = x
, ft_var = \x -> pure $ nlHsApp f_Expr x
-- fmap f x = f x
, ft_fun = \g h x -> mkSimpleLam $ \b -> do
gg <- g b
h $ nlHsApp x gg
-- fmap f x = \b -> h (x (g b))
, ft_tup = mkSimpleTupleCase (match_for_con CaseAlt)
-- fmap f x = case x of (a1,a2,..) -> (g1 a1,g2 a2,..)
, ft_ty_app = \_ arg_ty g x ->
-- If the argument type is a bare occurrence of the
-- data type's last type variable, then we can generate
-- more efficient code.
-- See Note [Avoid unnecessary eta expansion in derived fmap implementations]
if tcIsTyVarTy arg_ty
then pure $ nlHsApps fmap_RDR [f_Expr,x]
else do gg <- mkSimpleLam g
pure $ nlHsApps fmap_RDR [gg,x]
-- fmap f x = fmap g x
, ft_forall = \_ g x -> g x
, ft_bad_app = panic "in other argument in ft_fmap"
, ft_co_var = panic "contravariant in ft_fmap" }
-- See Note [Deriving <$]
replace_name = L (noAnnSrcSpan loc) replace_RDR
-- See Note [EmptyDataDecls with Functor, Foldable, and Traversable]
replace_bind = mkRdrFunBindEC 2 id replace_name replace_eqns
replace_match_ctxt = mkPrefixFunRhs replace_name
replace_eqn con = flip evalState bs_RDRs $
match_for_con replace_match_ctxt [z_Pat] con parts
where
parts = foldDataConArgs ft_replace con
replace_eqns = map replace_eqn data_cons
ft_replace :: FFoldType (LHsExpr GhcPs -> State [RdrName] (LHsExpr GhcPs))
ft_replace = FT { ft_triv = \x -> pure x
-- p <$ x = x
, ft_var = \_ -> pure z_Expr
-- p <$ _ = p
, ft_fun = \g h x -> mkSimpleLam $ \b -> do
gg <- g b
h $ nlHsApp x gg
-- p <$ x = \b -> h (x (g b))
, ft_tup = mkSimpleTupleCase (match_for_con CaseAlt)
-- p <$ x = case x of (a1,a2,..) -> (g1 a1,g2 a2,..)
, ft_ty_app = \_ arg_ty g x ->
-- If the argument type is a bare occurrence of the
-- data type's last type variable, then we can generate
-- more efficient code.
-- See [Deriving <$]
if tcIsTyVarTy arg_ty
then pure $ nlHsApps replace_RDR [z_Expr,x]
else do gg <- mkSimpleLam g
pure $ nlHsApps fmap_RDR [gg,x]
-- p <$ x = fmap (p <$) x
, ft_forall = \_ g x -> g x
, ft_bad_app = panic "in other argument in ft_replace"
, ft_co_var = panic "contravariant in ft_replace" }
-- Con a1 a2 ... -> Con (f1 a1) (f2 a2) ...
match_for_con :: Monad m
=> HsMatchContext GhcPs
-> [LPat GhcPs] -> DataCon
-> [LHsExpr GhcPs -> m (LHsExpr GhcPs)]
-> m (LMatch GhcPs (LHsExpr GhcPs))
match_for_con ctxt = mkSimpleConMatch ctxt $
\con_name xsM -> do xs <- sequence xsM
pure $ nlHsApps con_name xs -- Con x1 x2 ..
{-
Note [Avoid unnecessary eta expansion in derived fmap implementations]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
For the sake of simplicity, the algorithm that derived implementations of
fmap used to have a single case that dealt with applications of some type
constructor T (where T is not a tuple type constructor):
$(fmap 'a '(T b1 b2) x) = fmap (\y. $(fmap 'a 'b2 y)) x -- when a only occurs in the last parameter, b2
This generated less than optimal code in certain situations, however. Consider
this example:
data List a = Nil | Cons a (List a) deriving Functor
This would generate the following Functor instance:
instance Functor List where
fmap f Nil = Nil
fmap f (Cons x xs) = Cons (f x) (fmap (\y -> f y) xs)
The code `fmap (\y -> f y) xs` is peculiar, since it eta expands an application
of `f`. What's worse, this eta expansion actually degrades performance! To see
why, we can trace an invocation of fmap on a small List:
fmap id $ Cons 0 $ Cons 0 $ Cons 0 $ Cons 0 Nil
Cons (id 0) $ fmap (\y -> id y)
$ Cons 0 $ Cons 0 $ Cons 0 Nil
Cons (id 0) $ Cons ((\y -> id y) 0)
$ fmap (\y' -> (\y -> id y) y')
$ Cons 0 $ Cons 0 Nil
Cons (id 0) $ Cons ((\y -> id y) 0)
$ Cons ((\y' -> (\y -> id y) y') 0)
$ fmap (\y'' -> (\y' -> (\y -> id y) y') y'')
$ Cons 0 Nil
Cons (id 0) $ Cons ((\y -> id y) 0)
$ Cons ((\y' -> (\y -> id y) y') 0)
$ Cons ((\y'' -> (\y' -> (\y -> id y) y') y'') 0)
$ fmap (\y''' -> (\y'' -> (\y' -> (\y -> id y) y') y'') y''')
$ Nil
Cons (id 0) $ Cons ((\y -> id y) 0)
$ Cons ((\y' -> (\y -> id y) y') 0)
$ Cons ((\y'' -> (\y' -> (\y -> id y) y') y'') 0)
$ Nil
Notice how the number of lambdas—and hence, the number of closures—one
needs to evaluate grows very quickly. In general, a List with N cons cells will
require (1 + 2 + ... (N-1)) beta reductions, which takes O(N^2) time! This is
what caused the performance issues observed in #7436.
But hold on a second: shouldn't GHC's optimizer be able to eta reduce
`\y -> f y` to `f` and avoid these beta reductions? Unfortunately, this is not
the case. In general, eta reduction can change the semantics of a program. For
instance, (\x -> ⊥) `seq` () converges, but ⊥ `seq` () diverges. It just so
happens that the fmap implementation above would have the same semantics
regardless of whether or not `\y -> f y` or `f` is used, but GHC's optimizer is
not yet smart enough to realize this (see #17881).
To avoid this quadratic blowup, we add a special case to $fmap that applies
`fmap f` directly:
$(fmap 'a '(T b1 a) x) = fmap f x -- when a only occurs directly as the last argument of T
$(fmap 'a '(T b1 b2) x) = fmap (\y. $(fmap 'a 'b2 y)) x -- when a only occurs in the last parameter, b2
With this modified algorithm, the derived Functor List instance becomes:
instance Functor List where
fmap f Nil = Nil
fmap f (Cons x xs) = Cons (f x) (fmap f xs)
No lambdas in sight, just the way we like it.
This special case does not prevent all sources quadratic closure buildup,
however. In this example:
data PolyList a = PLNil | PLCons a (PolyList (PolyList a))
deriving Functor
We would derive the following code:
instance Functor PolyList where
fmap f PLNil = PLNil
fmap f (PLCons x xs) = PLCons (f x) (fmap (\y -> fmap f y) xs)
The use of `fmap (\y -> fmap f y) xs` builds up closures in much the same way
as `fmap (\y -> f y) xs`. The difference here is that even if we eta reduced
to `fmap (fmap f) xs`, GHC would /still/ build up a closure, since we are
recursively invoking fmap with a different argument (fmap f). Since we end up
paying the price of building a closure either way, we do not extend the special
case in $fmap any further, since it wouldn't buy us anything.
The ft_ty_app field of FFoldType distinguishes between these two $fmap cases by
inspecting the argument type. If the argument type is a bare type variable,
then we can conclude the type variable /must/ be the same as the data type's
last type parameter. We know that this must be the case since there is an
invariant that the argument type in ft_ty_app will always contain the last
type parameter somewhere (see Note [FFoldType and functorLikeTraverse]), so
if the argument type is a bare variable, then that must be exactly the last
type parameter.
Note that the ft_ty_app case of ft_replace (which derives implementations of
(<$)) also inspects the argument type to generate more efficient code.
See Note [Deriving <$].
Note [Deriving <$]
~~~~~~~~~~~~~~~~~~
We derive the definition of <$. Allowing this to take the default definition
can lead to memory leaks: mapping over a structure with a constant function can
fill the result structure with trivial thunks that retain the values from the
original structure. The simplifier seems to handle this all right for simple
types, but not for recursive ones. Consider
data Tree a = Bin !(Tree a) a !(Tree a) | Tip deriving Functor
-- fmap _ Tip = Tip
-- fmap f (Bin l v r) = Bin (fmap f l) (f v) (fmap f r)
Using the default definition of <$, we get (<$) x = fmap (\_ -> x) and that
simplifies no further. Why is that? `fmap` is defined recursively, so GHC
cannot inline it. The static argument transformation would turn the definition
into a non-recursive one
-- fmap f = go where
-- go Tip = Tip
-- go (Bin l v r) = Bin (go l) (f v) (go r)
which GHC could inline, producing an efficient definion of `<$`. But there are
several problems. First, GHC does not perform the static argument transformation
by default, even with -O2. Second, even when it does perform the static argument
transformation, it does so only when there are at least two static arguments,
which is not the case for fmap. Finally, when the type in question is
non-regular, such as
data Nesty a = Z a | S (Nesty a) (Nest (a, a))
the function argument is no longer (entirely) static, so the static argument
transformation will do nothing for us.
Applying the default definition of `<$` will produce a tree full of thunks that
look like ((\_ -> x) x0), which represents unnecessary thunk allocation and
also retention of the previous value, potentially leaking memory. Instead, we
derive <$ separately. Two aspects are different from fmap: the case of the
sought type variable (ft_var) and the case of a type application (ft_ty_app).
The interesting one is ft_ty_app. We have to distinguish two cases: the
"immediate" case where the type argument *is* the sought type variable, and
the "nested" case where the type argument *contains* the sought type variable.
The immediate case:
Suppose we have
data Imm a = Imm (F ... a)
Then we want to define
x <$ Imm q = Imm (x <$ q)
The nested case:
Suppose we have
data Nes a = Nes (F ... (G a))
Then we want to define
x <$ Nes q = Nes (fmap (x <$) q)
We inspect the argument type in ft_ty_app
(see Note [FFoldType and functorLikeTraverse]) to distinguish between these
two cases. If the argument type is a bare type variable, then we know that it
must be the same variable as the data type's last type parameter.
This is very similar to a trick that derived fmap implementations
use in their own ft_ty_app case.
See Note [Avoid unnecessary eta expansion in derived fmap implementations],
which explains why checking if the argument type is a bare variable is
the right thing to do.
We could, but do not, give tuples special treatment to improve efficiency
in some cases. Suppose we have
data Nest a = Z a | S (Nest (a,a))
The optimal definition would be
x <$ Z _ = Z x
x <$ S t = S ((x, x) <$ t)
which produces a result with maximal internal sharing. The reason we do not
attempt to treat this case specially is that we have no way to give
user-provided tuple-like types similar treatment. If the user changed the
definition to
data Pair a = Pair a a
data Nest a = Z a | S (Nest (Pair a))
they would experience a surprising degradation in performance. -}
{-
Utility functions related to Functor deriving.
Since several things use the same pattern of traversal, this is abstracted into functorLikeTraverse.
This function works like a fold: it makes a value of type 'a' in a bottom up way.
-}
-- Generic traversal for Functor deriving
-- See Note [FFoldType and functorLikeTraverse]
data FFoldType a -- Describes how to fold over a Type in a functor like way
= FT { ft_triv :: a
-- ^ Does not contain variable
, ft_var :: a
-- ^ The variable itself
, ft_co_var :: a
-- ^ The variable itself, contravariantly
, ft_fun :: a -> a -> a
-- ^ Function type
, ft_tup :: TyCon -> [a] -> a
-- ^ Tuple type. The @[a]@ is the result of folding over the
-- arguments of the tuple.
, ft_ty_app :: Type -> Type -> a -> a
-- ^ Type app, variable only in last argument. The two 'Type's are
-- the function and argument parts of @fun_ty arg_ty@,
-- respectively.
, ft_bad_app :: a
-- ^ Type app, variable other than in last argument
, ft_forall :: TcTyVar -> a -> a
-- ^ Forall type
}
functorLikeTraverse :: forall a.
TyVar -- ^ Variable to look for
-> FFoldType a -- ^ How to fold
-> Type -- ^ Type to process
-> a
functorLikeTraverse var (FT { ft_triv = caseTrivial, ft_var = caseVar
, ft_co_var = caseCoVar, ft_fun = caseFun
, ft_tup = caseTuple, ft_ty_app = caseTyApp
, ft_bad_app = caseWrongArg, ft_forall = caseForAll })
ty
= fst (go False ty)
where
go :: Bool -- Covariant or contravariant context
-> Type
-> (a, Bool) -- (result of type a, does type contain var)
go co ty | Just ty' <- tcView ty = go co ty'
go co (TyVarTy v) | v == var = (if co then caseCoVar else caseVar,True)
go co (FunTy { ft_arg = x, ft_res = y, ft_af = af })
| InvisArg <- af = go co y
| xc || yc = (caseFun xr yr,True)
where (xr,xc) = go (not co) x
(yr,yc) = go co y
go co (AppTy x y) | xc = (caseWrongArg, True)
| yc = (caseTyApp x y yr, True)
where (_, xc) = go co x
(yr,yc) = go co y
go co ty@(TyConApp con args)
| not (or xcs) = (caseTrivial, False) -- Variable does not occur
-- At this point we know that xrs, xcs is not empty,
-- and at least one xr is True
| isTupleTyCon con = (caseTuple con xrs, True)
| or (init xcs) = (caseWrongArg, True) -- T (..var..) ty
| Just (fun_ty, arg_ty) <- splitAppTy_maybe ty -- T (..no var..) ty
= (caseTyApp fun_ty arg_ty (last xrs), True)
| otherwise = (caseWrongArg, True) -- Non-decomposable (eg type function)
where
-- When folding over an unboxed tuple, we must explicitly drop the
-- runtime rep arguments, or else GHC will generate twice as many
-- variables in a unboxed tuple pattern match and expression as it
-- actually needs. See #12399
(xrs,xcs) = unzip (map (go co) (dropRuntimeRepArgs args))
go co (ForAllTy (Bndr v vis) x)
| isVisibleArgFlag vis = panic "unexpected visible binder"
| v /= var && xc = (caseForAll v xr,True)
where (xr,xc) = go co x
go _ _ = (caseTrivial,False)
-- Return all syntactic subterms of ty that contain var somewhere
-- These are the things that should appear in instance constraints
deepSubtypesContaining :: TyVar -> Type -> [TcType]
deepSubtypesContaining tv
= functorLikeTraverse tv
(FT { ft_triv = []
, ft_var = []
, ft_fun = (++)
, ft_tup = \_ xs -> concat xs
, ft_ty_app = \t _ ts -> t:ts
, ft_bad_app = panic "in other argument in deepSubtypesContaining"
, ft_co_var = panic "contravariant in deepSubtypesContaining"
, ft_forall = \v xs -> filterOut ((v `elemVarSet`) . tyCoVarsOfType) xs })
foldDataConArgs :: FFoldType a -> DataCon -> [a]
-- Fold over the arguments of the datacon
foldDataConArgs ft con
= map foldArg (map scaledThing $ dataConOrigArgTys con)
where
foldArg
= case getTyVar_maybe (last (tyConAppArgs (dataConOrigResTy con))) of
Just tv -> functorLikeTraverse tv ft
Nothing -> const (ft_triv ft)
-- If we are deriving Foldable for a GADT, there is a chance that the last
-- type variable in the data type isn't actually a type variable at all.
-- (for example, this can happen if the last type variable is refined to
-- be a concrete type such as Int). If the last type variable is refined
-- to be a specific type, then getTyVar_maybe will return Nothing.
-- See Note [DeriveFoldable with ExistentialQuantification]
--
-- The kind checks have ensured the last type parameter is of kind *.
-- Make a HsLam using a fresh variable from a State monad
mkSimpleLam :: (LHsExpr GhcPs -> State [RdrName] (LHsExpr GhcPs))
-> State [RdrName] (LHsExpr GhcPs)
-- (mkSimpleLam fn) returns (\x. fn(x))
mkSimpleLam lam =
get >>= \case
n:names -> do
put names
body <- lam (nlHsVar n)
return (mkHsLam [nlVarPat n] body)
_ -> panic "mkSimpleLam"
mkSimpleLam2 :: (LHsExpr GhcPs -> LHsExpr GhcPs
-> State [RdrName] (LHsExpr GhcPs))
-> State [RdrName] (LHsExpr GhcPs)
mkSimpleLam2 lam =
get >>= \case
n1:n2:names -> do
put names
body <- lam (nlHsVar n1) (nlHsVar n2)
return (mkHsLam [nlVarPat n1,nlVarPat n2] body)
_ -> panic "mkSimpleLam2"
-- "Con a1 a2 a3 -> fold [x1 a1, x2 a2, x3 a3]"
--
-- @mkSimpleConMatch fold extra_pats con insides@ produces a match clause in
-- which the LHS pattern-matches on @extra_pats@, followed by a match on the
-- constructor @con@ and its arguments. The RHS folds (with @fold@) over @con@
-- and its arguments, applying an expression (from @insides@) to each of the
-- respective arguments of @con@.
mkSimpleConMatch :: Monad m => HsMatchContext GhcPs
-> (RdrName -> [a] -> m (LHsExpr GhcPs))
-> [LPat GhcPs]
-> DataCon
-> [LHsExpr GhcPs -> a]
-> m (LMatch GhcPs (LHsExpr GhcPs))
mkSimpleConMatch ctxt fold extra_pats con insides = do
let con_name = getRdrName con
let vars_needed = takeList insides as_RDRs
let bare_pat = nlConVarPat con_name vars_needed
let pat = if null vars_needed
then bare_pat
else nlParPat bare_pat
rhs <- fold con_name
(zipWith (\i v -> i $ nlHsVar v) insides vars_needed)
return $ mkMatch ctxt (extra_pats ++ [pat]) rhs emptyLocalBinds
-- "Con a1 a2 a3 -> fmap (\b2 -> Con a1 b2 a3) (traverse f a2)"
--
-- @mkSimpleConMatch2 fold extra_pats con insides@ behaves very similarly to
-- 'mkSimpleConMatch', with two key differences:
--
-- 1. @insides@ is a @[Maybe (LHsExpr RdrName)]@ instead of a
-- @[LHsExpr RdrName]@. This is because it filters out the expressions
-- corresponding to arguments whose types do not mention the last type
-- variable in a derived 'Foldable' or 'Traversable' instance (i.e., the
-- 'Nothing' elements of @insides@).
--
-- 2. @fold@ takes an expression as its first argument instead of a
-- constructor name. This is because it uses a specialized
-- constructor function expression that only takes as many parameters as
-- there are argument types that mention the last type variable.
--
-- See Note [Generated code for DeriveFoldable and DeriveTraversable]
mkSimpleConMatch2 :: Monad m
=> HsMatchContext GhcPs
-> (LHsExpr GhcPs -> [LHsExpr GhcPs]
-> m (LHsExpr GhcPs))
-> [LPat GhcPs]
-> DataCon
-> [Maybe (LHsExpr GhcPs)]
-> m (LMatch GhcPs (LHsExpr GhcPs))
mkSimpleConMatch2 ctxt fold extra_pats con insides = do
let con_name = getRdrName con
vars_needed = takeList insides as_RDRs
pat = nlConVarPat con_name vars_needed
-- Make sure to zip BEFORE invoking catMaybes. We want the variable
-- indices in each expression to match up with the argument indices
-- in con_expr (defined below).
exps = catMaybes $ zipWith (\i v -> (`nlHsApp` nlHsVar v) <$> i)
insides vars_needed
-- An element of argTysTyVarInfo is True if the constructor argument
-- with the same index has a type which mentions the last type
-- variable.
argTysTyVarInfo = map isJust insides
(asWithTyVar, asWithoutTyVar) = partitionByList argTysTyVarInfo as_Vars
con_expr
| null asWithTyVar = nlHsApps con_name asWithoutTyVar
| otherwise =
let bs = filterByList argTysTyVarInfo bs_RDRs
vars = filterByLists argTysTyVarInfo bs_Vars as_Vars
in mkHsLam (map nlVarPat bs) (nlHsApps con_name vars)
rhs <- fold con_expr exps
return $ mkMatch ctxt (extra_pats ++ [pat]) rhs emptyLocalBinds
-- "case x of (a1,a2,a3) -> fold [x1 a1, x2 a2, x3 a3]"
mkSimpleTupleCase :: Monad m => ([LPat GhcPs] -> DataCon -> [a]
-> m (LMatch GhcPs (LHsExpr GhcPs)))
-> TyCon -> [a] -> LHsExpr GhcPs -> m (LHsExpr GhcPs)
mkSimpleTupleCase match_for_con tc insides x
= do { let data_con = tyConSingleDataCon tc
; match <- match_for_con [] data_con insides
; return $ nlHsCase x [match] }
{-
************************************************************************
* *
Foldable instances
see http://www.mail-archive.com/haskell-prime@haskell.org/msg02116.html
* *
************************************************************************
Deriving Foldable instances works the same way as Functor instances,
only Foldable instances are not possible for function types at all.
Given (data T a = T a a (T a) deriving Foldable), we get:
instance Foldable T where
foldr f z (T x1 x2 x3) =
$(foldr 'a 'a) x1 ( $(foldr 'a 'a) x2 ( $(foldr 'a '(T a)) x3 z ) )
-XDeriveFoldable is different from -XDeriveFunctor in that it filters out
arguments to the constructor that would produce useless code in a Foldable
instance. For example, the following datatype:
data Foo a = Foo Int a Int deriving Foldable
would have the following generated Foldable instance:
instance Foldable Foo where
foldr f z (Foo x1 x2 x3) = $(foldr 'a 'a) x2
since neither of the two Int arguments are folded over.
The cases are:
$(foldr 'a 'a) = f
$(foldr 'a '(b1,b2)) = \x z -> case x of (x1,x2) -> $(foldr 'a 'b1) x1 ( $(foldr 'a 'b2) x2 z )
$(foldr 'a '(T b1 b2)) = \x z -> foldr $(foldr 'a 'b2) z x -- when a only occurs in the last parameter, b2
Note that the arguments to the real foldr function are the wrong way around,
since (f :: a -> b -> b), while (foldr f :: b -> t a -> b).
One can envision a case for types that don't contain the last type variable:
$(foldr 'a 'b) = \x z -> z -- when b does not contain a
But this case will never materialize, since the aforementioned filtering
removes all such types from consideration.
See Note [Generated code for DeriveFoldable and DeriveTraversable].
Foldable instances differ from Functor and Traversable instances in that
Foldable instances can be derived for data types in which the last type
variable is existentially quantified. In particular, if the last type variable
is refined to a more specific type in a GADT:
data GADT a where
G :: a ~ Int => a -> G Int
then the deriving machinery does not attempt to check that the type a contains
Int, since it is not syntactically equal to a type variable. That is, the
derived Foldable instance for GADT is:
instance Foldable GADT where
foldr _ z (GADT _) = z
See Note [DeriveFoldable with ExistentialQuantification].
Note [Deriving null]
~~~~~~~~~~~~~~~~~~~~
In some cases, deriving the definition of 'null' can produce much better
results than the default definition. For example, with
data SnocList a = Nil | Snoc (SnocList a) a
the default definition of 'null' would walk the entire spine of a
nonempty snoc-list before concluding that it is not null. But looking at
the Snoc constructor, we can immediately see that it contains an 'a', and
so 'null' can return False immediately if it matches on Snoc. When we
derive 'null', we keep track of things that cannot be null. The interesting
case is type application. Given
data Wrap a = Wrap (Foo (Bar a))
we use
null (Wrap fba) = all null fba
but if we see
data Wrap a = Wrap (Foo a)
we can just use
null (Wrap fa) = null fa
Indeed, we allow this to happen even for tuples:
data Wrap a = Wrap (Foo (a, Int))
produces
null (Wrap fa) = null fa
As explained in Note [Deriving <$], giving tuples special performance treatment
could surprise users if they switch to other types, but Ryan Scott seems to
think it's okay to do it for now.
-}
gen_Foldable_binds :: SrcSpan -> TyCon -> [Type] -> (LHsBinds GhcPs, BagDerivStuff)
-- When the parameter is phantom, we can use foldMap _ _ = mempty
-- See Note [Phantom types with Functor, Foldable, and Traversable]
gen_Foldable_binds loc tycon _
| Phantom <- last (tyConRoles tycon)
= (unitBag foldMap_bind, emptyBag)
where
foldMap_name = L (noAnnSrcSpan loc) foldMap_RDR
foldMap_bind = mkRdrFunBind foldMap_name foldMap_eqns
foldMap_eqns = [mkSimpleMatch foldMap_match_ctxt
[nlWildPat, nlWildPat]
mempty_Expr]
foldMap_match_ctxt = mkPrefixFunRhs foldMap_name
gen_Foldable_binds loc tycon tycon_args
| null data_cons -- There's no real point producing anything but
-- foldMap for a type with no constructors.
= (unitBag foldMap_bind, emptyBag)
| otherwise
= (listToBag [foldr_bind, foldMap_bind, null_bind], emptyBag)
where
data_cons = getPossibleDataCons tycon tycon_args
foldr_bind = mkRdrFunBind (L (noAnnSrcSpan loc) foldable_foldr_RDR) eqns
eqns = map foldr_eqn data_cons
foldr_eqn con
= evalState (match_foldr z_Expr [f_Pat,z_Pat] con =<< parts) bs_RDRs
where
parts = sequence $ foldDataConArgs ft_foldr con
foldMap_name = L (noAnnSrcSpan loc) foldMap_RDR
-- See Note [EmptyDataDecls with Functor, Foldable, and Traversable]
foldMap_bind = mkRdrFunBindEC 2 (const mempty_Expr)
foldMap_name foldMap_eqns
foldMap_eqns = map foldMap_eqn data_cons
foldMap_eqn con
= evalState (match_foldMap [f_Pat] con =<< parts) bs_RDRs
where
parts = sequence $ foldDataConArgs ft_foldMap con
-- Given a list of NullM results, produce Nothing if any of
-- them is NotNull, and otherwise produce a list of Maybes
-- with Justs representing unknowns and Nothings representing
-- things that are definitely null.
convert :: [NullM a] -> Maybe [Maybe a]
convert = traverse go where
go IsNull = Just Nothing
go NotNull = Nothing
go (NullM a) = Just (Just a)
null_name = L (noAnnSrcSpan loc) null_RDR
null_match_ctxt = mkPrefixFunRhs null_name
null_bind = mkRdrFunBind null_name null_eqns
null_eqns = map null_eqn data_cons
null_eqn con
= flip evalState bs_RDRs $ do
parts <- sequence $ foldDataConArgs ft_null con
case convert parts of
Nothing -> return $
mkMatch null_match_ctxt [nlParPat (nlWildConPat con)]
false_Expr emptyLocalBinds
Just cp -> match_null [] con cp
-- Yields 'Just' an expression if we're folding over a type that mentions
-- the last type parameter of the datatype. Otherwise, yields 'Nothing'.
-- See Note [FFoldType and functorLikeTraverse]
ft_foldr :: FFoldType (State [RdrName] (Maybe (LHsExpr GhcPs)))
ft_foldr
= FT { ft_triv = return Nothing
-- foldr f = \x z -> z
, ft_var = return $ Just f_Expr
-- foldr f = f
, ft_tup = \t g -> do
gg <- sequence g
lam <- mkSimpleLam2 $ \x z ->
mkSimpleTupleCase (match_foldr z) t gg x
return (Just lam)
-- foldr f = (\x z -> case x of ...)
, ft_ty_app = \_ _ g -> do
gg <- g
mapM (\gg' -> mkSimpleLam2 $ \x z -> return $
nlHsApps foldable_foldr_RDR [gg',z,x]) gg
-- foldr f = (\x z -> foldr g z x)
, ft_forall = \_ g -> g
, ft_co_var = panic "contravariant in ft_foldr"
, ft_fun = panic "function in ft_foldr"
, ft_bad_app = panic "in other argument in ft_foldr" }
match_foldr :: Monad m
=> LHsExpr GhcPs
-> [LPat GhcPs]
-> DataCon
-> [Maybe (LHsExpr GhcPs)]
-> m (LMatch GhcPs (LHsExpr GhcPs))
match_foldr z = mkSimpleConMatch2 LambdaExpr $ \_ xs -> return (mkFoldr xs)
where
-- g1 v1 (g2 v2 (.. z))
mkFoldr :: [LHsExpr GhcPs] -> LHsExpr GhcPs
mkFoldr = foldr nlHsApp z
-- See Note [FFoldType and functorLikeTraverse]
ft_foldMap :: FFoldType (State [RdrName] (Maybe (LHsExpr GhcPs)))
ft_foldMap
= FT { ft_triv = return Nothing
-- foldMap f = \x -> mempty
, ft_var = return (Just f_Expr)
-- foldMap f = f
, ft_tup = \t g -> do
gg <- sequence g
lam <- mkSimpleLam $ mkSimpleTupleCase match_foldMap t gg
return (Just lam)
-- foldMap f = \x -> case x of (..,)
, ft_ty_app = \_ _ g -> fmap (nlHsApp foldMap_Expr) <$> g
-- foldMap f = foldMap g
, ft_forall = \_ g -> g
, ft_co_var = panic "contravariant in ft_foldMap"
, ft_fun = panic "function in ft_foldMap"
, ft_bad_app = panic "in other argument in ft_foldMap" }
match_foldMap :: Monad m
=> [LPat GhcPs]
-> DataCon
-> [Maybe (LHsExpr GhcPs)]
-> m (LMatch GhcPs (LHsExpr GhcPs))
match_foldMap = mkSimpleConMatch2 CaseAlt $ \_ xs -> return (mkFoldMap xs)
where
-- mappend v1 (mappend v2 ..)
mkFoldMap :: [LHsExpr GhcPs] -> LHsExpr GhcPs
mkFoldMap [] = mempty_Expr
mkFoldMap xs = foldr1 (\x y -> nlHsApps mappend_RDR [x,y]) xs
-- See Note [FFoldType and functorLikeTraverse]
-- Yields NullM an expression if we're folding over an expression
-- that may or may not be null. Yields IsNull if it's certainly
-- null, and yields NotNull if it's certainly not null.
-- See Note [Deriving null]
ft_null :: FFoldType (State [RdrName] (NullM (LHsExpr GhcPs)))
ft_null
= FT { ft_triv = return IsNull
-- null = \_ -> True
, ft_var = return NotNull
-- null = \_ -> False
, ft_tup = \t g -> do
gg <- sequence g
case convert gg of
Nothing -> pure NotNull
Just ggg ->
NullM <$> (mkSimpleLam $ mkSimpleTupleCase match_null t ggg)
-- null = \x -> case x of (..,)
, ft_ty_app = \_ _ g -> flip fmap g $ \nestedResult ->
case nestedResult of
-- If e definitely contains the parameter,
-- then we can test if (G e) contains it by
-- simply checking if (G e) is null
NotNull -> NullM null_Expr
-- This case is unreachable--it will actually be
-- caught by ft_triv
IsNull -> IsNull
-- The general case uses (all null),
-- (all (all null)), etc.
NullM nestedTest -> NullM $
nlHsApp all_Expr nestedTest
-- null fa = null fa, or null fa = all null fa, or null fa = True
, ft_forall = \_ g -> g
, ft_co_var = panic "contravariant in ft_null"
, ft_fun = panic "function in ft_null"
, ft_bad_app = panic "in other argument in ft_null" }
match_null :: Monad m
=> [LPat GhcPs]
-> DataCon
-> [Maybe (LHsExpr GhcPs)]
-> m (LMatch GhcPs (LHsExpr GhcPs))
match_null = mkSimpleConMatch2 CaseAlt $ \_ xs -> return (mkNull xs)
where
-- v1 && v2 && ..
mkNull :: [LHsExpr GhcPs] -> LHsExpr GhcPs
mkNull [] = true_Expr
mkNull xs = foldr1 (\x y -> nlHsApps and_RDR [x,y]) xs
data NullM a =
IsNull -- Definitely null
| NotNull -- Definitely not null
| NullM a -- Unknown
{-
************************************************************************
* *
Traversable instances
see http://www.mail-archive.com/haskell-prime@haskell.org/msg02116.html
* *
************************************************************************
Again, Traversable is much like Functor and Foldable.
The cases are:
$(traverse 'a 'a) = f
$(traverse 'a '(b1,b2)) = \x -> case x of (x1,x2) ->
liftA2 (,) ($(traverse 'a 'b1) x1) ($(traverse 'a 'b2) x2)
$(traverse 'a '(T b1 b2)) = traverse $(traverse 'a 'b2) -- when a only occurs in the last parameter, b2
Like -XDeriveFoldable, -XDeriveTraversable filters out arguments whose types
do not mention the last type parameter. Therefore, the following datatype:
data Foo a = Foo Int a Int
would have the following derived Traversable instance:
instance Traversable Foo where
traverse f (Foo x1 x2 x3) =
fmap (\b2 -> Foo x1 b2 x3) ( $(traverse 'a 'a) x2 )
since the two Int arguments do not produce any effects in a traversal.
One can envision a case for types that do not mention the last type parameter:
$(traverse 'a 'b) = pure -- when b does not contain a
But this case will never materialize, since the aforementioned filtering
removes all such types from consideration.
See Note [Generated code for DeriveFoldable and DeriveTraversable].
-}
gen_Traversable_binds :: SrcSpan -> TyCon -> [Type] -> (LHsBinds GhcPs, BagDerivStuff)
-- When the argument is phantom, we can use traverse = pure . coerce
-- See Note [Phantom types with Functor, Foldable, and Traversable]
gen_Traversable_binds loc tycon _
| Phantom <- last (tyConRoles tycon)
= (unitBag traverse_bind, emptyBag)
where
traverse_name = L (noAnnSrcSpan loc) traverse_RDR
traverse_bind = mkRdrFunBind traverse_name traverse_eqns
traverse_eqns =
[mkSimpleMatch traverse_match_ctxt
[nlWildPat, z_Pat]
(nlHsApps pure_RDR [nlHsApp coerce_Expr z_Expr])]
traverse_match_ctxt = mkPrefixFunRhs traverse_name
gen_Traversable_binds loc tycon tycon_args
= (unitBag traverse_bind, emptyBag)
where
data_cons = getPossibleDataCons tycon tycon_args
traverse_name = L (noAnnSrcSpan loc) traverse_RDR
-- See Note [EmptyDataDecls with Functor, Foldable, and Traversable]
traverse_bind = mkRdrFunBindEC 2 (nlHsApp pure_Expr)
traverse_name traverse_eqns
traverse_eqns = map traverse_eqn data_cons
traverse_eqn con
= evalState (match_for_con [f_Pat] con =<< parts) bs_RDRs
where
parts = sequence $ foldDataConArgs ft_trav con
-- Yields 'Just' an expression if we're folding over a type that mentions
-- the last type parameter of the datatype. Otherwise, yields 'Nothing'.
-- See Note [FFoldType and functorLikeTraverse]
ft_trav :: FFoldType (State [RdrName] (Maybe (LHsExpr GhcPs)))
ft_trav
= FT { ft_triv = return Nothing
-- traverse f = pure x
, ft_var = return (Just f_Expr)
-- traverse f = f x
, ft_tup = \t gs -> do
gg <- sequence gs
lam <- mkSimpleLam $ mkSimpleTupleCase match_for_con t gg
return (Just lam)
-- traverse f = \x -> case x of (a1,a2,..) ->
-- liftA2 (,,) (g1 a1) (g2 a2) <*> ..
, ft_ty_app = \_ _ g -> fmap (nlHsApp traverse_Expr) <$> g
-- traverse f = traverse g
, ft_forall = \_ g -> g
, ft_co_var = panic "contravariant in ft_trav"
, ft_fun = panic "function in ft_trav"
, ft_bad_app = panic "in other argument in ft_trav" }
-- Con a1 a2 ... -> liftA2 (\b1 b2 ... -> Con b1 b2 ...) (g1 a1)
-- (g2 a2) <*> ...
match_for_con :: Monad m
=> [LPat GhcPs]
-> DataCon
-> [Maybe (LHsExpr GhcPs)]
-> m (LMatch GhcPs (LHsExpr GhcPs))
match_for_con = mkSimpleConMatch2 CaseAlt $
\con xs -> return (mkApCon con xs)
where
-- liftA2 (\b1 b2 ... -> Con b1 b2 ...) x1 x2 <*> ..
mkApCon :: LHsExpr GhcPs -> [LHsExpr GhcPs] -> LHsExpr GhcPs
mkApCon con [] = nlHsApps pure_RDR [con]
mkApCon con [x] = nlHsApps fmap_RDR [con,x]
mkApCon con (x1:x2:xs) =
foldl' appAp (nlHsApps liftA2_RDR [con,x1,x2]) xs
where appAp x y = nlHsApps ap_RDR [x,y]
-----------------------------------------------------------------------
f_Expr, z_Expr, mempty_Expr, foldMap_Expr,
traverse_Expr, coerce_Expr, pure_Expr, true_Expr, false_Expr,
all_Expr, null_Expr :: LHsExpr GhcPs
f_Expr = nlHsVar f_RDR
z_Expr = nlHsVar z_RDR
mempty_Expr = nlHsVar mempty_RDR
foldMap_Expr = nlHsVar foldMap_RDR
traverse_Expr = nlHsVar traverse_RDR
coerce_Expr = nlHsVar (getRdrName coerceId)
pure_Expr = nlHsVar pure_RDR
true_Expr = nlHsVar true_RDR
false_Expr = nlHsVar false_RDR
all_Expr = nlHsVar all_RDR
null_Expr = nlHsVar null_RDR
f_RDR, z_RDR :: RdrName
f_RDR = mkVarUnqual (fsLit "f")
z_RDR = mkVarUnqual (fsLit "z")
as_RDRs, bs_RDRs :: [RdrName]
as_RDRs = [ mkVarUnqual (mkFastString ("a"++show i)) | i <- [(1::Int) .. ] ]
bs_RDRs = [ mkVarUnqual (mkFastString ("b"++show i)) | i <- [(1::Int) .. ] ]
as_Vars, bs_Vars :: [LHsExpr GhcPs]
as_Vars = map nlHsVar as_RDRs
bs_Vars = map nlHsVar bs_RDRs
f_Pat, z_Pat :: LPat GhcPs
f_Pat = nlVarPat f_RDR
z_Pat = nlVarPat z_RDR
{-
Note [DeriveFoldable with ExistentialQuantification]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Functor and Traversable instances can only be derived for data types whose
last type parameter is truly universally polymorphic. For example:
data T a b where
T1 :: b -> T a b -- YES, b is unconstrained
T2 :: Ord b => b -> T a b -- NO, b is constrained by (Ord b)
T3 :: b ~ Int => b -> T a b -- NO, b is constrained by (b ~ Int)
T4 :: Int -> T a Int -- NO, this is just like T3
T5 :: Ord a => a -> b -> T a b -- YES, b is unconstrained, even
-- though a is existential
T6 :: Int -> T Int b -- YES, b is unconstrained
For Foldable instances, however, we can completely lift the constraint that
the last type parameter be truly universally polymorphic. This means that T
(as defined above) can have a derived Foldable instance:
instance Foldable (T a) where
foldr f z (T1 b) = f b z
foldr f z (T2 b) = f b z
foldr f z (T3 b) = f b z
foldr f z (T4 b) = z
foldr f z (T5 a b) = f b z
foldr f z (T6 a) = z
foldMap f (T1 b) = f b
foldMap f (T2 b) = f b
foldMap f (T3 b) = f b
foldMap f (T4 b) = mempty
foldMap f (T5 a b) = f b
foldMap f (T6 a) = mempty
In a Foldable instance, it is safe to fold over an occurrence of the last type
parameter that is not truly universally polymorphic. However, there is a bit
of subtlety in determining what is actually an occurrence of a type parameter.
T3 and T4, as defined above, provide one example:
data T a b where
...
T3 :: b ~ Int => b -> T a b
T4 :: Int -> T a Int
...
instance Foldable (T a) where
...
foldr f z (T3 b) = f b z
foldr f z (T4 b) = z
...
foldMap f (T3 b) = f b
foldMap f (T4 b) = mempty
...
Notice that the argument of T3 is folded over, whereas the argument of T4 is
not. This is because we only fold over constructor arguments that
syntactically mention the universally quantified type parameter of that
particular data constructor. See foldDataConArgs for how this is implemented.
As another example, consider the following data type. The argument of each
constructor has the same type as the last type parameter:
data E a where
E1 :: (a ~ Int) => a -> E a
E2 :: Int -> E Int
E3 :: (a ~ Int) => a -> E Int
E4 :: (a ~ Int) => Int -> E a
Only E1's argument is an occurrence of a universally quantified type variable
that is syntactically equivalent to the last type parameter, so only E1's
argument will be folded over in a derived Foldable instance.
See #10447 for the original discussion on this feature. Also see
https://gitlab.haskell.org/ghc/ghc/wikis/commentary/compiler/derive-functor
for a more in-depth explanation.
Note [FFoldType and functorLikeTraverse]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Deriving Functor, Foldable, and Traversable all require generating expressions
which perform an operation on each argument of a data constructor depending
on the argument's type. In particular, a generated operation can be different
depending on whether the type mentions the last type variable of the datatype
(e.g., if you have data T a = MkT a Int, then a generated foldr expression would
fold over the first argument of MkT, but not the second).
This pattern is abstracted with the FFoldType datatype, which provides hooks
for the user to specify how a constructor argument should be folded when it
has a type with a particular "shape". The shapes are as follows (assume that
a is the last type variable in a given datatype):
* ft_triv: The type does not mention the last type variable at all.
Examples: Int, b
* ft_var: The type is syntactically equal to the last type variable.
Moreover, the type appears in a covariant position (see
the Deriving Functor instances section of the user's guide
for an in-depth explanation of covariance vs. contravariance).
Example: a (covariantly)
* ft_co_var: The type is syntactically equal to the last type variable.
Moreover, the type appears in a contravariant position.
Example: a (contravariantly)
* ft_fun: A function type which mentions the last type variable in
the argument position, result position or both.
Examples: a -> Int, Int -> a, Maybe a -> [a]
* ft_tup: A tuple type which mentions the last type variable in at least
one of its fields. The TyCon argument of ft_tup represents the
particular tuple's type constructor.
Examples: (a, Int), (Maybe a, [a], Either a Int), (# Int, a #)
* ft_ty_app: A type is being applied to the last type parameter, where the
applied type does not mention the last type parameter (if it
did, it would fall under ft_bad_app) and the argument type
mentions the last type parameter (if it did not, it would fall
under ft_triv). The first two Type arguments to
ft_ty_app represent the applied type and argument type,
respectively.
Currently, only DeriveFunctor makes use of the argument type.
It inspects the argument type so that it can generate more
efficient implementations of fmap
(see Note [Avoid unnecessary eta expansion in derived fmap implementations])
and (<$) (see Note [Deriving <$]) in certain cases.
Note that functions, tuples, and foralls are distinct cases
and take precedence over ft_ty_app. (For example, (Int -> a) would
fall under (ft_fun Int a), not (ft_ty_app ((->) Int) a).
Examples: Maybe a, Either b a
* ft_bad_app: A type application uses the last type parameter in a position
other than the last argument. This case is singled out because
Functor, Foldable, and Traversable instances cannot be derived
for datatypes containing arguments with such types.
Examples: Either a Int, Const a b
* ft_forall: A forall'd type mentions the last type parameter on its right-
hand side (and is not quantified on the left-hand side). This
case is present mostly for plumbing purposes.
Example: forall b. Either b a
If FFoldType describes a strategy for folding subcomponents of a Type, then
functorLikeTraverse is the function that applies that strategy to the entirety
of a Type, returning the final folded-up result.
foldDataConArgs applies functorLikeTraverse to every argument type of a
constructor, returning a list of the fold results. This makes foldDataConArgs
a natural way to generate the subexpressions in a generated fmap, foldr,
foldMap, or traverse definition (the subexpressions must then be combined in
a method-specific fashion to form the final generated expression).
Deriving Generic1 also does validity checking by looking for the last type
variable in certain positions of a constructor's argument types, so it also
uses foldDataConArgs. See Note [degenerate use of FFoldType] in GHC.Tc.Deriv.Generics.
Note [Generated code for DeriveFoldable and DeriveTraversable]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
We adapt the algorithms for -XDeriveFoldable and -XDeriveTraversable based on
that of -XDeriveFunctor. However, there an important difference between deriving
the former two typeclasses and the latter one, which is best illustrated by the
following scenario:
data WithInt a = WithInt a Int# deriving (Functor, Foldable, Traversable)
The generated code for the Functor instance is straightforward:
instance Functor WithInt where
fmap f (WithInt a i) = WithInt (f a) i
But if we use too similar of a strategy for deriving the Foldable and
Traversable instances, we end up with this code:
instance Foldable WithInt where
foldMap f (WithInt a i) = f a <> mempty
instance Traversable WithInt where
traverse f (WithInt a i) = fmap WithInt (f a) <*> pure i
This is unsatisfying for two reasons:
1. The Traversable instance doesn't typecheck! Int# is of kind #, but pure
expects an argument whose type is of kind *. This effectively prevents
Traversable from being derived for any datatype with an unlifted argument
type (#11174).
2. The generated code contains superfluous expressions. By the Monoid laws,
we can reduce (f a <> mempty) to (f a), and by the Applicative laws, we can
reduce (fmap WithInt (f a) <*> pure i) to (fmap (\b -> WithInt b i) (f a)).
We can fix both of these issues by incorporating a slight twist to the usual
algorithm that we use for -XDeriveFunctor. The differences can be summarized
as follows:
1. In the generated expression, we only fold over arguments whose types
mention the last type parameter. Any other argument types will simply
produce useless 'mempty's or 'pure's, so they can be safely ignored.
2. In the case of -XDeriveTraversable, instead of applying ConName,
we apply (\b_i ... b_k -> ConName a_1 ... a_n), where
* ConName has n arguments
* {b_i, ..., b_k} is a subset of {a_1, ..., a_n} whose indices correspond
to the arguments whose types mention the last type parameter. As a
consequence, taking the difference of {a_1, ..., a_n} and
{b_i, ..., b_k} yields the all the argument values of ConName whose types
do not mention the last type parameter. Note that [i, ..., k] is a
strictly increasing—but not necessarily consecutive—integer sequence.
For example, the datatype
data Foo a = Foo Int a Int a
would generate the following Traversable instance:
instance Traversable Foo where
traverse f (Foo a1 a2 a3 a4) =
fmap (\b2 b4 -> Foo a1 b2 a3 b4) (f a2) <*> f a4
Technically, this approach would also work for -XDeriveFunctor as well, but we
decide not to do so because:
1. There's not much benefit to generating, e.g., ((\b -> WithInt b i) (f a))
instead of (WithInt (f a) i).
2. There would be certain datatypes for which the above strategy would
generate Functor code that would fail to typecheck. For example:
data Bar f a = Bar (forall f. Functor f => f a) deriving Functor
With the conventional algorithm, it would generate something like:
fmap f (Bar a) = Bar (fmap f a)
which typechecks. But with the strategy mentioned above, it would generate:
fmap f (Bar a) = (\b -> Bar b) (fmap f a)
which does not typecheck, since GHC cannot unify the rank-2 type variables
in the types of b and (fmap f a).
Note [Phantom types with Functor, Foldable, and Traversable]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Given a type F :: * -> * whose type argument has a phantom role, we can always
produce lawful Functor and Traversable instances using
fmap _ = coerce
traverse _ = pure . coerce
Indeed, these are equivalent to any *strictly lawful* instances one could
write, except that this definition of 'traverse' may be lazier. That is, if
instances obey the laws under true equality (rather than up to some equivalence
relation), then they will be essentially equivalent to these. These definitions
are incredibly cheap, so we want to use them even if it means ignoring some
non-strictly-lawful instance in an embedded type.
Foldable has far fewer laws to work with, which leaves us unwelcome
freedom in implementing it. At a minimum, we would like to ensure that
a derived foldMap is always at least as good as foldMapDefault with a
derived traverse. To accomplish that, we must define
foldMap _ _ = mempty
in these cases.
This may have different strictness properties from a standard derivation.
Consider
data NotAList a = Nil | Cons (NotAList a) deriving Foldable
The usual deriving mechanism would produce
foldMap _ Nil = mempty
foldMap f (Cons x) = foldMap f x
which is strict in the entire spine of the NotAList.
Final point: why do we even care about such types? Users will rarely if ever
map, fold, or traverse over such things themselves, but other derived
instances may:
data Hasn'tAList a = NotHere a (NotAList a) deriving Foldable
Note [EmptyDataDecls with Functor, Foldable, and Traversable]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
There are some slightly tricky decisions to make about how to handle
Functor, Foldable, and Traversable instances for types with no constructors.
For fmap, the two basic options are
fmap _ _ = error "Sorry, no constructors"
or
fmap _ z = case z of
In most cases, the latter is more helpful: if the thunk passed to fmap
throws an exception, we're generally going to be much more interested in
that exception than in the fact that there aren't any constructors.
In order to match the semantics for phantoms (see note above), we need to
be a bit careful about 'traverse'. The obvious definition would be
traverse _ z = case z of
but this is stricter than the one for phantoms. We instead use
traverse _ z = pure $ case z of
For foldMap, the obvious choices are
foldMap _ _ = mempty
or
foldMap _ z = case z of
We choose the first one to be consistent with what foldMapDefault does for
a derived Traversable instance.
-}
|