summaryrefslogtreecommitdiff
path: root/ml/comb.txt
blob: 4abb68c7e3affb2c69a3df997040f1e152f1b578 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
Functional Programming For Dynamic Programmers - Part 4
=======================================================

:author: Michele Simionato
:date: December 2007

This is the fourth of a series of articles on functional programming in
statically typed languages. It is intended for dynamic programmers, i.e. 
for programmers with a background in dynamically typed languages, such as Perl, 
Python, Ruby, or languages in the Lisp family. The approch is eminently practical
and example-based; the main goal is to see if we can stole some good idea from
statically typed languages. In order to be concrete, I will consider languages
in the ML family, because they are pretty nice and much easier to understand
that Haskell.

Academic languages vs enterprise languages
-----------------------------------------------------------

There is a widespread misconception that academic languages are clean but
unpractical, whereas enterprise languages are practical but dirty. I never
bought that argument, since I have seen examples of the opposite, and also
because I always believed that a language can be both clean *and* practical.

For instance in my opinion both Python and Ruby are reasonably clean and
reasonably practical and this is part of the reason why they are having so
much success lately. I believe the reason why they are so clean is that from
the beginning they managed to stay away from optimization, the root of all evil.
On the other hand, I am also convinced that it possible to write a language
which is both practical, clean and fast. Unfortunately, such a language does
not exist yet. SML is fast and somewhat clean but it is definitely not practical. 
Here by practical I mean enterprise-ready.

There is an irriducible aporia_ between enterprise programmers and
academics: people working on enterprise are facing every day problems
that are already solved; to them, the importanting thing is to avoid
reinventing the wheel. Given two languages, one that provides a solution
to a problem they have, and one providing the tools to build the solution
themselves, they prefer the first one. This is natural since in an enterprise 
environment one has deadlines and it is essential to be as productive
as possible; of course the best way to be productive is to *not write* code,
and to re-use code written by others. 
On the other hand, in academia one has to do with new problems, or
with new techniques to solve old problems: the proof of concept is
more important than the practical implementation. Also, academics
are (rightly) interested in writing papers, not in writing code. For all these 
reasons it is clear that you cannot face an
academic language such as SML with an enterprise mindset.
For instance, if you come from an enterprise environment you will be used to
expect ready availabity of a nearly infinite number
of libraries doing everything you may need and more. This is certainly
the situation for all enterprise-level languages such as C, C++, Java, C#,
Perl, Python and now even Ruby, which is younger but rapidly spreading.

As I said many time, I assume that my readers are dynamic programmers,
well acquainted with scripting languages; I feel the urge warn them that the
situation in SML is very different than in Perl, Python or Ruby [#]_ . Here
are a few differences.

1. 
    Scripting languages have a dominant (sometimes unique)
    implementation; ML has dozens of implementations, each with
    useful features which are not in the standard, so that
    different implementations provides incompatible extensions.

2.  
    Scripting languages have a BDFL_ (Benevolent Dictator For Life,
    i.e.  Guido van Rossum for Python, Larry Wall for Perl, Yukihiro
    Matsumoto for Ruby) whereas SML is a language designed by
    committee. That means that a scripting language can evolve much
    *much* faster than a language designed by committee. Also,
    languages designed by committee are compromises whereas
    single-person languages are much more consistent.

3. 
    SML has a very small community: the support you can have in newsgroup
    and mailing lists is simply not comparable to the support you can get in
    the scripting languages mailing list and newsgroups.

    if Andreas Rossberg won the first price
    at the lottery and decided to leave for a tropical paradise, nobody would
    provide support for Alice ML on the mailing list

4.
    SML is a deserted language, even compared to non-mainstream
    languages as OCaml, Haskell, Scheme, Common Lisp, Smalltalk, Tcl,
    ...; it is perhaps the least used language I have ever seen and
    this is a pity.

5.
    Generally speaking, if you want bindings for external libraries
    you have to write it yourself; there are no standard bindings for
    GUI toolkits, database drivers, scientific programming libraries.

6. 
    Scripting languages have traditionally a strong support for Web programming
    which is next to absent in SML.


All the issues I have just listed are just accidental, not
structural: in principle one could just solve them by writing code; in principle
a DHH_ could take an SML implementation and make it the next Ruby on Rails.
However that has not happened yet, so if you want to work with SML right now you
will be forced to reinvent the wheel. In this article I will reinvent a few wheels,
just to be able to give some meaninful example to the enterprise programmer
in the next articles.

.. _aporia: http://en.wikipedia.org/wiki/Aporia
.. _BDFL: http://en.wikipedia.org/wiki/BDFL
.. _DHH: http://en.wikipedia.org/wiki/DHH

.. [#] Notice that I put Perl, Python and Ruby in the mainstream
        languages, since even if the number of dynamic programmers is
        inferior the number of say C/C++, Java or .NET programmers,
        in practice scripting programmers have available nearly everything the
        other languages have available, and much better languages.

String interpolation the practical way
--------------------------------------------------------

Every language has some builtin mechanism to do string interpolation: C
has ``sprintf``, Python has the ``%`` operator, Lisp has ``format``, but
SML has none. Therefore I will provide here a very simple interpolation library
for SML, so that I will be able to provide more interesting examples later.
The library provides a single higher order function 
``format: string -> string list -> string`` converting a template
string into a 'string list -> string' function replacing the template 
with the provided list of arguments. For instance

 ``format "The square of $ is $\n" ["3", "9"]``

returns

     ``"The square of 3 is 9"``

For sake of simplicity, we will not implement any mechanism to escape
dollar signs, so if the format strings contains dollar signs for other
reasons (for instance there is money involved) the library will not
work.

The library is heavily based on list processing.
Lists are the most common data structures in 
functional languages (they formed the core of Lisp at its beginning) . 
You should look at the `basis library`_
to see what you can do with lists. Here I will just say that SML lists are 
linked lists in the same sense of Lisp or Scheme (with the difference
that they are immutable) and not as in Python (Python lists are actually arrays,
Python is abusing the name). Just as in Scheme, where

  ``(list 1 2)``

is a shortcut for

 ``(cons 1 (cons 2 '()))``

in ML

  ``[1, 2]``

is a shortcut for

 ``1::2::[]``

and ``::`` is the *cons* operator (short for constructor). 
If you don't know Scheme, a SML lists should be thought of as of a
nested Python list, i.e.

   ``[1, 2] (SML) => [1, [2, []]] (Python)``

.. _basis library: http://www.standardml.org/Basis/list.html 
 
The core of the algorithm used in the interpolation library is a
recursive function which takes a list of N+1 terms ``t_1 ... t_{N+1}``
and a list of N arguments ``a_1 .. a_N`` and returns the string
obtained by interspersing the terms with the arguments,

  ``t_1 ^ a_1 ^ ... _tN ^ a_N ^ t_{N+1}``

For instance, we will be interspersing ``["The square of ", " is ", "\n"]`` with
 ``["3", "9"]``.

Here is the code::

 $ cat format.aml

.. include:: format.aml 
    :literal: 

A few comments are in order.

1.
 We used the ability to define exceptions with a value:
 ``exception ArityError of string`` means that ``ArityError`` accepts
 a string argument (the error message).

2.
 We used the standard library ``String.fields`` utility, which splits a string
 according to a delimiter; in our case  ``String.fields (fn c => c = #"$") templ``
 splits the template into a list a strings using the character ``$`` as delimiter.
 In SML characters are specified with a sharp notation, so the character ``$``
 is written ``#"$"``.

3.
 The  builtin ``concat`` concatenates a list of strings whereas ``rev`` reverts
 a list; you are advised to read the documentation or any book on SML for more.

String interpolation the SML way
-------------------------------------------------

Functional languages have a reputation for abstractness 
but in this series of articles I have focused solely on very concrete 
earth-bound aspects. To correct the balance, in this section I will discuss
a few programming techniques which are quite common in 
functional languages, but unheard of in "regular" languages, and that
require a mathematically-inclined mind to be understood. I am doing so 
because these tricks are actually used a lot in functional languages, and 
I don't want to give a false impression by ignoring them 
and just discussing the easy things. 

For instance, you should not believe that functional
programming is just about functions; there is an higher order
approach to functional programming, in which the primary objects
are not functions, but operators operating on functions, the so
called *combinators*. For people with a background in Physics, I will
say this is exactly analogous to the switch from the
Schroedinger picture, in which the emphasis is on the wave functions,
to the Heisenberg picture, in which the emphasis is on the quantum
operators.

Having warned my readers that this section is not for the faint of heart
(but you may safely skip it if you don't feel adventurous)
I will start from the concatenation operator

::

 ^ : string * string -> string

which satifies the properties::
 
     a ^ (b ^ c)  = (a ^ b) ^ c = a  ^ b ^ c
     a ^ "" = "" ^ a = a

Mathematically speaking, the set of strings in SML 
is a monoid_ with respect to concatenation and the empty
string "" is the identity element.

.. _monoid: http://en.wikipedia.org/wiki/Monoid
.. _group: http://en.wikipedia.org/wiki/Group_%28mathematics%29
.. _group representations: http://en.wikipedia.org/wiki/Group_representation

If you have a background in Mathematics or in Physics you will be familiar
with the theory of `group representations`_; in particular groups (and monoids
too) can be represented with operators operating on function spaces.
In the case at hand, I can define a lift transformation converting plain
strings into operators by preserving the composition law::

 -  fun L s f s' = f (s' ^ s);
 val L : string -> (string -> 'a) -> string -> 'a = _fn

(this is the same as ``fun L s = fn f => fn s' => f (s' ^ s)``).
In other words, ``L s`` is an operator (combinator) taking a function and
returning a function, with the property

  ``L (s1 ^ s2) = (L s1) o (L s2)``

where ``o`` denotes the composition of operators. Just as ``L`` is an upgrade
operation, promoving a plain simple string to the celestial world of operators
in function spaces, I can define a downgrade operation ``l``, demoving
celestial operators back to earthly strings::

 - fun l O = O Fn. id "";
 val l : (('a -> 'a) -> string -> 'b) -> 'b = _fn

``l`` takes the operator, applies it to the identity function and
returns a function which is then applied to the empty string, finally
bringing back at home a plain string; ``l`` is the inverse of ``L``,
i.e. ``l o L`` is the identity in the space of strings where ``L o l``
is the identity in the space of operators. You can try yourself at the
prompt that

::

 - l(L"a" o L"b");
 val it : string = "ab"

so we succeded in making simple things hard and abstract. But not happy with
that, we are going to make things harder and more abstract, by defining
another combinator taking a function and returning an higher order function::

  - fun str f s s' = f (s ^ s')    
  val str : (string -> 'a) -> string -> string -> 'a = _fn
  
(this is the same as ``fun str f = fn s => fn s' => f ( s ^ s')``).
This combinator is so abstract than even when lift back to the mortal world
it is more than a mere string, it is actually the identity function on strings::

 - l str;
 val it : string -> string = _fn
 - (l str) "hello";
 val it : string = "hello"

The ``str`` combinator can be composed in the celestial word: when lift
back to the mortal world, it returns a higher order version of the concatenation
operator::

 - l (str o str);
 val it : string -> string -> string = _fn

 - l (str o str) "hello" " world";
 val it : string = "hello world"

We can also compose ``str`` with other combinators and we can write things like::

 - l( L"a" o str o L"b" o str o L"c") "?" ":";
 val it : string = "a?b:c"

In other words, we have reinvented string interpolation the difficult way.
Still, there is at least an advantage of this approach with respect to the
approach we used in the previous paragraph: combinator-based string interpolation
happens at compile type and *mistakes are discovered by the compiler*, not
at runtime. Moreover, the approach can be easily extended to manage
different types: for instance, if we have to do with numbers, we
can define the combinators::

  -  fun int f s n = f (s ^ (Int.toString n))    
  val str : (string -> 'a) -> string -> string -> 'a = _fn

  - fun real f s n = f (s ^ (Real.toString n))    
 val real : (string -> 'a) -> string -> real -> 'a = _fn

and use them as follows::

  - print (l(int o L" / " o int o L" = " o real o L"\n") 5 2 2.5);
  5 / 2 = 2.5
  val it : unit = ()

If you make a mistake like using an int instead of a real, or if you forget an
argument, you will get a compile time type error.

----

*Give a man a fish and he will eat for a day. Teach a man to fish and he will eat 
for the rest of his life.*  -- Chinese proverb