author     michele.simionato <devnull@localhost>   2009-04-18 01:59:19 +0000
committer  michele.simionato <devnull@localhost>   2009-04-18 01:59:19 +0000
commit     46ecf8fc125713c89d8d611b0a63927f536d9222 (patch)
tree       778b4be3bad57c2acf0741ce81c73b639643a598
parent     97e3ff370e1a12d76fae45784dfcbe993f0c50ef (diff)
download   micheles-46ecf8fc125713c89d8d611b0a63927f536d9222.tar.gz
Switched the chapters about macros with the chapters about the module system
-rw-r--r--  artima/scheme/macro1.ss                                            238
-rw-r--r--  artima/scheme/macro2.ss                                            318
-rw-r--r--  artima/scheme/macro3.ss                                            288
-rw-r--r--  artima/scheme/macro4.ss (renamed from artima/scheme/scheme22.ss)     0
-rw-r--r--  artima/scheme/macro5.ss (renamed from artima/scheme/scheme23.ss)     0
-rw-r--r--  artima/scheme/macro6.ss (renamed from artima/scheme/scheme24.ss)     0
-rw-r--r--  artima/scheme/module-system.ss                                     282
-rw-r--r--  artima/scheme/scheme19.ss                                          504
-rw-r--r--  artima/scheme/scheme20.ss                                          631
-rw-r--r--  artima/scheme/scheme21.ss                                          406
-rw-r--r--  artima/scheme/scheme25.ss                                          345
-rw-r--r--  artima/scheme/scheme26.ss                                          210
12 files changed, 1611 insertions, 1611 deletions
diff --git a/artima/scheme/macro1.ss b/artima/scheme/macro1.ss
new file mode 100644
index 0000000..253e817
--- /dev/null
+++ b/artima/scheme/macro1.ss
@@ -0,0 +1,238 @@
#|Recursive macros
=====================================================================

After a short introduction about the relevance of macros for
programming language design, I show a few common patterns of Scheme
macrology: recursive macros, accumulators, and the usage of literals
to incorporate helpers in macros.

Should everybody be designing her own programming language?
------------------------------------------------------------------

Macros are the main reason why I first became interested in Scheme. At
the time - more or less six years ago - there was a bunch of lispers
trolling in comp.lang.python and arguing for the addition of macros
to Python. Of course most Pythonistas opposed the idea, but at the
time I had no idea of the advantages/disadvantages of macros; I felt
quite ignorant and powerless to argue. I never liked to feel ignorant,
so I decided to learn macros, especially Scheme macros, since they are
the state of the art.

Nowadays, knowing what I am talking about, I can say that the addition
of macros to Python would be a bad idea. Actually, I have already
stated in episode 12_ an even stronger opinion, i.e. that macros do
more harm than good in any enterprise-oriented language (but notice
that I am *not* implying that every enterprise should adopt
only enterprise-oriented languages).

My opinion against macros in (most) enterprise programming does not
mean that macros are worthless: indeed, I think they are extremely
useful and important in another domain, i.e. in the domain of design
and research about programming languages. As a matter of fact, *Scheme
macros enable every programmer to write her own programming language*.
I think this is valuable and a nice thing to have.
Everybody who has got an opinion
about language design, or about how objects should work, or
questions like "what would a language look like if it had feature X?",
can resolve those doubts by implementing the feature with macros.

Perhaps not everybody should design their own programming language,
and certainly not everybody should *distribute* their own personal
language; however, I think lots of people would benefit from trying
to think about how to design a language and from making some experiments.
The easiest thing is to start from a Domain Specific Language (DSL),
which does not need to be a full-grown programming language; for
instance, in the Python world it seems that everybody is implementing
templating languages to generate web pages. In my opinion this is a good
thing *per se*; the problem is that everybody is distributing their own
language, so that there is a bit of anarchy, but this is not such a
serious problem after all.

Even when it comes to full-grown programming languages, nowadays we see
an explosion of new languages coming out, especially for the Java and
the CLR platforms, since it is relatively easy to implement a new
language on those platforms. However, it still takes a lot of work.

On the other hand, writing a custom language embedded in Scheme by
means of macros is by far easier, and Scheme makes an ideal platform
for implementing languages and experimenting with new ideas.

There is a `quote of Ian Bicking`_ about Web frameworks which struck me:

*Sometimes Python is accused of having too many web frameworks. And
it's true, there are a lot. That said, I think writing a framework is
a useful exercise. It doesn't let you skip over too much without
understanding it. It removes the magic.
So even if you go on to use
another existing framework (which I'd probably advise you to do),
you'll be able to understand it better if you've written something
like it on your own.*

You can then replace the words "web framework" with "programming
language" here and the quote still makes sense. You should read my
*Adventures* in this spirit: the goal of this series is to give you
the technical competence to write your own language by means of
macros. Even if you are not going to design your own language,
macros will help you to understand how languages work.

I personally am interested only in the technical competence:
*I do not want to write a new language*. There are already lots of
languages out there, and writing a real language is a lot of grunt
work, because it means writing debugging tools, good error messages,
wondering about portability, interacting with a user community,
et cetera et cetera. Not everybody is a good language designer and a
good BDFL, for sure; however, everybody can have opinions about
language design, and some experiments with macrology can help to put
such opinions to the test.

.. _quote of Ian Bicking: http://pythonpaste.org/webob/do-it-yourself.html
.. _12: http://www.artima.com/weblogs/viewpost.jsp?thread=240836

Recursive macros with accumulators
----------------------------------------------------------

The goal of learning macros well enough to implement a programming
language is an ambitious one; it is not something we can attain in an
episode of the Adventures, nor in six. However, one episode is enough
to explain at least one useful technique which is commonly used in
Scheme macrology and which is good to know in order to reach, in time,
our final goal. The technique we will discuss in this episode is the
accumulator trick, which is analogous to the accumulator trick we
first discussed in episode 6_ when talking about tail call
optimization.
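For Pythonista readers, the run-time version of the accumulator trick can be sketched in a few lines of Python (my own illustration, not code from episode 6):

```python
def factorial(n, acc=1):
    # The accumulator `acc` carries the partial product through the
    # recursion, so every recursive call is a tail call. CPython does
    # not optimize tail calls, but the pattern is the same one used
    # to write loops in Scheme.
    if n <= 1:
        return acc
    return factorial(n - 1, acc * n)

print(factorial(5))  # → 120
```

The same idea, moved from run time to expansion time, is what the macro below exploits: the accumulator is a pattern variable collecting already-processed arguments.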
In Scheme it is common
to introduce an auxiliary variable to store a value which is passed
in a loop: the same trick can be used in macros, at compile time
instead of at run time.

In order to give an example of usage of the accumulator trick, let me
define a conditional macro ``cond-`` which works like ``cond``, but
with fewer parentheses:

.. code-block:: scheme

   (cond-
     cond-1? return-1
     cond-2? return-2
     ...
     else return-default)

We want the code above to expand to:

.. code-block:: scheme

   (cond
     (cond-1? return-1)
     (cond-2? return-2)
     ...
     (else return-default))


Here is the solution, which makes use of an accumulator and of an
auxiliary macro:

$$COND-

The code should be clear. The auxiliary (private) macro ``cond-aux``
is recursive: it works by collecting the arguments ``x1, x2, ..., xn``
in the accumulator ``(acc ...)``. If the number of arguments is even,
at some point we end up having collected all the arguments in the
accumulator, which is then expanded into a standard conditional; if
the number of arguments is odd, at some point we end up having
collected all the arguments except one, and a ``"Mismatched pairs"``
exception is raised. The user-visible macro ``cond-`` just calls
``cond-aux`` by setting the initial value of the accumulator to ``()``.
The entire expansion and error checking is performed at compile time.
Here is an example of usage::

   > (let ((n 1))
       (cond- (= n 1) ; missing a clause
              (= n 2) 'two
              (= n 3) 'three
              else 'unknown))
   Unhandled exception:
    Condition components:
      1. &who: cond-
      2. &message: "Mismatched pairs"
      3. &syntax:
          form: (((= n 1) (= n 2)) ('two (= n 3)) ('three else) 'unknown)
          subform: 'unknown

A trick to avoid auxiliary macros
----------------------------------------------------------------

I have nothing against auxiliary macros, however sometimes you may
want to keep all the code in a single macro.
This is useful when
debugging a macro, since an auxiliary macro is usually not exported:
you may not have access to it without changing the source code of the
module defining it and without recompiling it. On the other hand, you
have full access to an exported macro, including the features of the
would-be auxiliary macro. The trick is to introduce a literal to
define the helper macro inside the main macro. Here is how it would
work in our example:

$$COND2

If you do not want to use a literal identifier, you can use a literal
string instead:

$$COND3

These kinds of tricks are quite common in Scheme macros; the best
reference you can find detailing this technique and others is the
`Syntax-Rules Primer for the Merely Eccentric`_, by Joe Marshall.
The title is a play on the essay
`An Advanced Syntax-Rules Primer for the Mildly Insane`_ by
Al Petrofsky.

.. image:: mad-scientist.jpg

Marshall's essay is quite nontrivial, and it is intended for expert
Scheme programmers. On the other hand, it is child's play compared to
Petrofsky's essay, which is intended for foolish Scheme wizards ;)

.. _An Advanced Syntax-Rules Primer for the Mildly Insane: http://groups.google.com/group/comp.lang.scheme/browse_frm/thread/86c338837de3a020/eb6cc6e11775b619?#eb6cc6e11775b619
.. _6: http://www.artima.com/weblogs/viewpost.jsp?thread=240198

.. _Syntax-Rules Primer for the Merely Eccentric: http://www.xs4all.nl/~hipster/lib/scheme/gauche/define-syntax-primer.txt

|#

(import (rnrs) (sweet-macros))

;;COND-
(def-syntax cond-aux
  (syntax-match ()
    (sub (cond-aux (acc ...))
         #'(cond acc ...))
    (sub (cond-aux (acc ...) x1)
         #'(syntax-violation 'cond- "Mismatched pairs" '(acc ... x1) 'x1))
    (sub (cond-aux (acc ...) x1 x2 x3 ...)
         #'(cond-aux (acc ... (x1 x2)) x3 ...))
    ))

(def-syntax (cond- x1 x2 ...)
  (cond-aux () x1 x2 ...))
;;END


;;COND3
 (define-syntax cond-
   (syntax-match ()
     (sub (cond- "aux" (acc ...))
          (cond acc ...))
     (sub (cond- "aux" (acc ...) x)
          (syntax-violation 'cond- "Mismatched pairs" '(acc ... x) 'x))
     (sub (cond- "aux" (acc ...) x1 x2 x3 ...)
          (cond- "aux" (acc ... (x1 x2)) x3 ...))
     (sub (cond- x1 x2 ...)
          (cond- "aux" () x1 x2 ...))))
;;END

;;COND2
 (define-syntax cond-
   (syntax-match (aux)
     (sub (cond- aux (acc ...))
          (cond acc ...))
     (sub (cond- aux (acc ...) x1)
          (syntax-violation 'cond- "Mismatched pairs" '(acc ... x1) 'x1))
     (sub (cond- aux (acc ...) x1 x2 x3 ...)
          (cond- aux (acc ... (x1 x2)) x3 ...))
     (sub (cond- x1 x2 ...)
          (cond- aux () x1 x2 ...))))
;;END

diff --git a/artima/scheme/macro2.ss b/artima/scheme/macro2.ss
new file mode 100644
index 0000000..9821f37
--- /dev/null
+++ b/artima/scheme/macro2.ss
@@ -0,0 +1,318 @@
#|
Can a language be both easy and powerful?
-----------------------------------------------------------------------

When it comes to designing programming languages, ease of use and
power seem to go in opposite directions. There are plenty of examples
where something went wrong, i.e. simple languages which are good only
for teaching and not for professional use, and professional languages
which are too tricky to use for the casual programmer. There are also
examples of languages which are both weak in power *and* difficult to
use (insert your chosen language here).

Nevertheless, I think it is perfectly possible to design a language
which is both easy to use and powerful. For instance, Python is a good
example of such a language (others will prefer Ruby, or Scala, or
anything else they like).

There are various reasons why Python can be both easy to use and
powerful, the most important ones being the following, in my opinion:

1. it is a one-man language (i.e. it is not a compromise language made
   by a committee);

2. it is a language made from scratch, with no preoccupations of
   backward compatibility;

3. between (premature) optimization and ease of use Python always
   chooses the latter;

4. it provides special syntax/libraries for common operations.

Scheme does not share any of these characteristics, and as a
consequence it is definitely not an easy language. It is just a
powerful language.

However, it is powerful enough that you can make it easy to use, but
that requires a lot of work on the part of the programmer, who must
implement point 4 by himself, whereas nowadays we are all spoiled and
we expect the language implementors to do this kind of work for us.

.. image:: bikeshed.jpg
   :class: right
   :width: 400

I think the explanation for the current situation in Scheme is more
historical and social than technical. On one side, a lot of people in
the Scheme world want Scheme to stay the way it is, i.e. a language
for language experimentation and research more than a language for
enterprise work. On the other side, the fact that there are so many
implementations of Scheme makes it difficult or impossible to specify
too much: this is the reason why there are no standard debugging tools
for Scheme, but only implementation-specific ones.

Finally, there is the infamous `bikeshed effect`_ to take into account.
The bikeshed effect is typical of any project designed by a committee:
when it comes to proposing advanced functionalities that very few
can understand, it is easy to get approval from the larger community.
However, when it comes to simple functionality of common usage,
everybody has got a different opinion and it is practically impossible
to get anything approved at all.

To avoid that, the standard does not provide directly usable
instruments: instead, it provides general instruments which are
intended as building blocks on top of which everybody can write the
usable abstractions he/she prefers.
Most people nowadays
prefer to have ready-made solutions, because they have deadlines,
projects to complete and no time nor interest in writing things
that should be provided by language designers, so that Scheme is
little used in the enterprise world.

There are other options, however, if you are interested in a Scheme
for use in the enterprise world. You can just use a Scheme
implementation running on the .NET or the Java platform, or a
Scheme-like language such as Clojure_. Clojure runs on the Java
Virtual Machine, it is half Lisp and half Scheme, it has a strong
functional flavour in it, it has interesting things to say about
concurrency_, it is a one-man language (Rich Hickey is the BDFL) and
it provides access to all the Java libraries. Moreover, it provides a
whole set of `syntax conveniences`_ that would never enter the Scheme
standard.

Professionally, I have never interacted with the Java platform (and
even there I would probably choose Jython over Clojure for reasons of
familiarity), so I have not checked out Clojure and I have no idea
about it except what you can infer after reading its web site. If
amongst my readers there is somebody with experience in Clojure,
please feel free to add a comment to this post.

I personally use Scheme because I am interested in macrology, and no
language in existence can beat Scheme in this respect.

.. _Clojure: http://clojure.org/
.. _syntax conveniences: http://clojure.org/special_forms
.. _concurrency: http://clojure.org/concurrent_programming
.. _bikeshed effect: http://en.wikipedia.org/wiki/Bikeshed

Second order macros
-------------------------------------------------------------

There is no upper limit to the level of sophistication you can reach
with macros: in particular, it is possible to define higher order
macros, i.e. macros taking other macros as arguments or macros
expanding to other macros.
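As a run-time analogy for Pythonista readers (my own hedged sketch, not code from the series): a higher-order *function* can collect a flat argument list into pairs before calling another function, which is exactly the job the ``collecting-pairs`` macro discussed in this episode performs at expansion time:

```python
def collecting_pairs(func, *args):
    # Group a flat argument list (x1, x2, x3, x4, ...) into pairs
    # ((x1, x2), (x3, x4), ...) and pass them to `func`; an odd
    # number of arguments is a "Mismatched pairs" error, just as
    # in the macro version.
    if len(args) % 2:
        raise TypeError("Mismatched pairs: %r" % (args[-1],))
    pairs = list(zip(args[::2], args[1::2]))
    return func(pairs)

def cond(pairs):
    # A cond-like consumer: return the result of the first pair
    # whose condition is true.
    for test, result in pairs:
        if test:
            return result

print(collecting_pairs(cond, False, 'one', True, 'two'))  # → two
```

The macro does the same grouping, but on syntax at compile time rather than on values at run time, so the error is reported before the program ever runs.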
Higher order macros allow an extremely elegant programming style; on
the other hand, they are exposed to the risk of making the code
incomprehensible and very hard to debug. In this episode we will give
a couple of examples of second order macros taking other macros as
arguments.

Our first example is a generalization of the accumulator trick we
used last week to define the ``cond-`` macro. We will define a
``collecting-pairs`` macro, which takes as input another macro and a
sequence of arguments, and calls the input macro with the arguments
grouped in pairs. Here is the code:

$$COLLECTING-PAIRS

``collecting-pairs`` can be used with many syntactic expressions like
``cond``, ``case``, ``syntax-rules``, et cetera. Here is an example
with the case_ expression::

   > (collecting-pairs (case 1)
         (1) 'one
         (2) 'two
         (3) 'three
         else 'unknown)
   one

Macros generating macros
----------------------------------------------------

In this paragraph I will give an example of a second order macro
expanding to a regular (first order) macro. Here it is:

$$DEF-VECTOR-TYPE

``def-vector-type`` is a macro which defines a macro used to
manage classes of vectors; for instance

$$BOOK

defines a ``Book`` macro which is able to manage two-dimensional vectors
with fields ``title`` and ``author``. The expansion of ``Book`` is the
following:

.. code-block:: scheme

   (def-syntax Book
     (syntax-match (new ref set! title author)
       (sub (ctx <name>) #''Book)
       (sub (ctx <fields>) #'(list 'title 'author))
       (sub (ctx from-list ls) #'(list->vector ls))
       (sub (ctx new arg ...) #'(vector arg ...))
       (sub (ctx v ref title) #'(vector-ref v 0))
       (sub (ctx v ref author) #'(vector-ref v 1))
       (sub (ctx v set! title x) #'(vector-set! v 0 x))
       (sub (ctx v set! author x) #'(vector-set! v 1 x))))

From this expansion it is clear how ``Book`` works. For instance,

.. code-block:: scheme

   > (define b (Book new "Title" "Author"))

defines a vector of two strings:

.. code-block:: scheme

   > b
   #("Title" "Author")

``(Book b ref title)`` retrieves the ``title`` field whereas
``(Book b ref author)`` retrieves the ``author`` field:

.. code-block:: scheme

   > (Book b ref title)
   "Title"
   > (Book b ref author)
   "Author"

``(Book b set! title new-title)`` and ``(Book b set! author new-author)``
allow you to change the ``title`` and ``author`` fields.
It is also possible to convert a list into a ``Book`` vector::

   > (Book from-list '("t" "a"))
   #("t" "a")

Finally, the ``Book`` macro provides introspection features:

.. code-block:: scheme

   > (Book <name>)
   Book
   > (Book <fields>)
   (title author)

The secret of the ellipsis
-----------------------------------------------------------------

.. _case: http://www.r6rs.org/final/html/r6rs/r6rs-Z-H-14.html#node_idx_384
.. _Arc: http://www.paulgraham.com/arcll1.html

A two-level syntax
-------------------------------------------------------------

Parens-haters may want to use ``collecting-pairs`` and the colon macro
to avoid parentheses. They may even go further, and rant that the
basic Scheme syntax should require fewer parentheses, since for
most programmers it is easier to write code with fewer parentheses.
However, the Scheme philosophy favors automatic code generation
over manual writing. For instance, when writing macros, it is much
easier to use a conditional with more parentheses like ``cond`` than a
conditional with fewer parentheses like ``cond-``. The parentheses
allow you to group expressions into groups that can be repeated via
the ellipsis symbol; in practice, you can write things like
``(cond (cnd? do-this ...) ...)`` which cannot be written
with ``cond-``.

On the other hand, different languages adopt different philosophies;
for instance Paul Graham's Arc_ uses fewer parentheses.
This is
possible since it does not provide a macro system based on pattern
matching (which is a big *minus* in my opinion). Is it possible to
have both: a syntax with few parentheses for writing code manually,
and a syntax with many parentheses for writing macros? The answer is
yes: the price to pay is to double the constructs of the language and
to use a Python-like approach.

Python is a perfect example of a language with a two-level syntax: a
simple syntax, limited but able to cover the most common cases, and a
fully fledged syntax, giving all the power which is needed, which
however should be used only rarely. The best designed programming
language I know is Python: while not perfect, it takes full advantage
of the two-level syntax idea. For instance

==================== =================================
Simplified syntax    Full syntax
==================== =================================
obj.attr             getattr(obj, 'attr')
x + y                x.__add__(y)
c = C()              c = C.__new__(C); c.__init__()
==================== =================================

In the case of the conditional syntax, in principle we could have
a fully parenthesised ``__cond__`` syntax for usage in macros and a
``cond`` syntax with fewer parens for manual usage. That, in theory:
in practice Scheme only provides the low level syntax, leaving to
the final user the freedom (and the burden) of implementing his
own preferred high level syntax.

|#

(import (rnrs) (sweet-macros) (for (aps lang) run expand)
        (aps easy-test) (for (aps list-utils) run expand) (aps compat))

;;DEF-VECTOR-TYPE
(def-syntax (def-vector-type name (field-name checker?) ...)
  (with-syntax (((i ...) (range (length #'(field-name ...)))))
    #'(begin
        (define (check-all vec)
          (vector-map
           (lambda (check? field arg)
             (if (check? arg) arg (error 'name "TypeError" field arg)))
           (vector checker? ...) (vector 'field-name ...) vec))
        (def-syntax name
          (syntax-match (check <name> fields new ref set! field-name ...)
            (sub (ctx check vec) #'(check-all vec))
            (sub (ctx <name>) #''name)
            (sub (ctx fields) #'(list 'field-name ...))
            (sub (ctx from-list ls) #'(check-all (list->vector ls)))
            (sub (ctx new arg (... ...)) #'(ctx from-list (list arg (... ...))))
            (sub (ctx v ref field-name) #'(vector-ref v i)) ...
            (sub (ctx v set! field-name x) #'(vector-set! v i x)) ...
            ))))
  (distinct? free-identifier=? #'(field-name ...)))
;;END

;;BOOK
(def-vector-type Book (title string?) (author string?))
;;END

(display (Book <name>)) (newline)

(pretty-print (syntax-expand
               (def-vector-type Book (title string?) (author string?))))

;;COLLECTING-PAIRS
(def-syntax collecting-pairs
  (syntax-match ()
    (sub (collecting-pairs (name arg ...) x1 x2 ...)
         #'(collecting-pairs "helper" (name arg ...) () x1 x2 ...))
    (sub (collecting-pairs "helper" (name arg ...) (acc ...))
         #'(name arg ... acc ...))
    (sub (collecting-pairs "helper" (name arg ...) (acc ...) x)
         #'(syntax-violation 'name "Mismatched pairs" '(name arg ... acc ... x) 'x))
    (sub (collecting-pairs "helper" (name arg ...) (acc ...) x1 x2 x3 ...)
         #'(collecting-pairs "helper" (name arg ...) (acc ... (x1 x2)) x3 ...))
    ))
;;END

;;TEST-COLON
(run
 (test "ok"
       (: let* x 1 y x (+ x y))
       2)
 ; (test "err"
 ;       (catch-error (: let* x 1 y x z (+ x y)))
 ;       "Odd number of arguments")

 (test "nv1"
       (let ()
         (define b (Book new "T" "A"))
         (Book b ref title))
       "T")
 )
;;END

diff --git a/artima/scheme/macro3.ss b/artima/scheme/macro3.ss
new file mode 100644
index 0000000..e0fe3e7
--- /dev/null
+++ b/artima/scheme/macro3.ss
@@ -0,0 +1,288 @@
#|Syntax objects
===================================================================

In the last dozen episodes I have defined plenty of macros, but I have
not really explained what macros are and how they work.
This episode
will close the gap, explaining the true meaning of macros by
introducing the concepts of *syntax object* and of *transformer* over
syntax objects.

Syntax objects
------------------------------------------------------------------

Scheme macros are built over the concept of *syntax object*.
The concept is peculiar to Scheme and has no counterpart in other
languages (including Common Lisp), therefore it is worth spending
some time on it.

A *syntax object* is a kind of enhanced *s*-expression: it contains
the source code as a list of symbols and primitive values, plus
additional information, such as the name of the file containing the
source code, the line numbers, a set of marks to distinguish
identifiers according to their lexical context, and more.

It is possible to convert a name or a literal value into a
syntax object with the syntax quoting operation, i.e. the funny
``#'`` symbol you have seen in all the macros I have defined until now::

   > #'x ; convert a name into an identifier
   #<syntax x>
   > #''x ; convert a literal symbol
   #<syntax 'x>
   > #'1 ; convert a literal number
   #<syntax 1>
   > #'"s" ; convert a literal string
   #<syntax "s">
   > #''(1 "a" 'b) ; convert a literal data structure
   #<syntax '(1 "a" 'b)>

Here I am running all my examples under Ikarus; your Scheme
system may have a slightly different output representation for syntax
objects.

In general ``#'`` - also spelled ``(syntax )`` - can be "applied"
to any expression::

   > (define syntax-expr #'(display "hello"))
   > syntax-expr
   #<syntax (display "hello")>

It is possible to extract the *s*-expression underlying the
syntax object with the ``syntax->datum`` primitive::

   > (equal?
      (syntax->datum syntax-expr) '(display "hello"))
   #t

Different syntax objects can be equivalent: for instance
the improper list of syntax objects ``(cons #'display (cons #'"hello" #'()))``
is equivalent to the syntax object ``#'(display "hello")`` in
the sense that both correspond to the same datum::

   > (equal? (syntax->datum (cons #'display (cons #'"hello" #'())))
             (syntax->datum #'(display "hello")))
   #t

The ``(syntax )`` macro is analogous to the ``(quote )`` macro;
moreover, there is a ``quasisyntax`` macro denoted with ``#``` which
is analogous to the ``quasiquote`` macro (`````) and, in analogy to
the operations ``,`` and ``,@`` on regular lists, there are two
operations, ``unsyntax`` ``#,`` (*sharp comma*) and ``unsyntax-splicing``
``#,@`` (*sharp comma splice*), on lists (including improper lists) of
syntax objects.

Here is an example using sharp-comma::

   > (let ((user "michele")) #`(display #,user))
   (#<syntax display> "michele" . #<syntax ()>)

and here is an example using sharp-comma-splice::

   > (define users (list #'"michele" #'"mario"))
   > #`(display (list #,@users))
   (#<syntax display>
    (#<syntax list> #<syntax "michele"> #<syntax "mario">) . #<syntax ()>)

Notice that the output is an improper list. This is somewhat consistent
with the behavior of usual quoting: for usual quoting ``'(a b c)``
is a shortcut for ``(cons* 'a 'b 'c '())``, which is a proper list,
and for syntax-quoting ``#'(a b c)`` is equivalent to
``(cons* #'a #'b #'c #'())``, which is an improper list.
The ``cons*`` operator here is an R6RS shortcut for nested conses:
``(cons* w x y z)`` is the same as ``(cons w (cons x (cons y z)))``.

However, the result of a quasisyntax interpolation is very much
*implementation-dependent*: Ikarus returns an improper list, but other
implementations return different results; for instance Ypsilon
returns a proper list of syntax objects whereas PLT Scheme returns
an atomic syntax object.
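For Pythonistas, the closest analogy I can offer (a hedged comparison of my own, not from the original text) is the standard ``ast`` module: ``ast.parse`` turns source code into a tree whose nodes carry extra information such as line and column, somewhat like syntax objects, and ``ast.unparse`` (Python 3.9+) recovers plain code from the enriched representation, playing a role vaguely similar to ``syntax->datum``:

```python
import ast

# Parse an expression into an "enriched" representation: the tree
# nodes carry position metadata in addition to the code structure.
tree = ast.parse('display("hello")', mode="eval")
node = tree.body  # a Call node

print(node.func.id, node.lineno, node.col_offset)  # → display 1 0
# Going back from the tree to plain source, roughly syntax->datum:
print(ast.unparse(tree))  # → display('hello')
```

As with the Scheme outputs above, the exact shape of the tree is an implementation detail of the representation, not part of the meaning of the code.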
The lesson is that you cannot
rely on properties of the internal representation of syntax objects:
what matters is the code they correspond to, i.e. the result of
``syntax->datum``.

It is possible to promote a datum to a syntax object with the
``datum->syntax`` procedure, but in order
to do so you need to provide a lexical context, which can be specified
by using an identifier::

   > (datum->syntax #'dummy-context '(display "hello"))
   #<syntax (display "hello")>

(the meaning of the lexical context in ``datum->syntax`` is tricky and
I will come back to it in future episodes).

What ``syntax-match`` really is
--------------------------------------------------------------

``syntax-match`` is a general utility to perform pattern matching
on syntax objects; it takes a syntax object in input and returns
another syntax object in output, depending on the patterns, skeletons
and guards used::

   > (define transformer
       (syntax-match ()
         (sub (name . args) #'name))); return the name as a syntax object

   > (transformer #'(a 1 2 3))
   #<syntax a>

For convenience, ``syntax-match`` also accepts a second syntax
``(syntax-match x (lit ...) clause ...)`` to match syntax expressions
directly, which is more convenient than using
``((syntax-match (lit ...) clause ...) x)``.
Here is a simple example of usage::

   > (syntax-match #'(a 1 2 3) ()
       (sub (name . args) #'args)); return the args as a syntax object
   #<syntax (1 2 3)>

Here is an example using ``quasisyntax`` and ``unsyntax-splicing``::

   > (syntax-match #'(a 1 2 3) ()
       (sub (name . args) #`(name #,@#'args)))
   (#<syntax a> #<syntax 1> #<syntax 2> #<syntax 3>)

As you see, it is easy to write hieroglyphs if you use ``quasisyntax``
and ``unsyntax-splicing``. You can avoid that by means of the
``with-syntax`` form introduced in episode XX::

   > (syntax-match #'(a 1 2 3) ()
       (sub (name . args) (: with-syntax (a ...)
           #'args #'(name a ...))))
   (#<syntax a> #<syntax 1> #<syntax 2> #<syntax 3>)


The pattern variables introduced by ``with-syntax``
are automatically expanded inside the syntax template, without
resorting to the quasisyntax notation (i.e. there is no need for
``#``` ``#,`` ``#,@``).

Matching generic syntax lists
--------------------------------------------------------------

The previous paragraphs about syntax objects were a little abstract and
probably of unclear utility (but what would you expect from
an advanced macro tutorial? ;). Here I will be more
concrete and I will provide an example where
``syntax-match`` is used as a list matcher inside a bigger macro.
The final goal is to provide
a nicer syntax for association lists (an association list is just
a non-empty list of non-empty lists). The macro accepts a variable
number of arguments; every argument is either of the form
``(name value)`` or a single identifier: in this latter case it must
be magically converted into the form ``(name value)``, where ``value``
is the value of the identifier, assuming it is bound in the current
scope; otherwise an ``"unbound identifier"`` error is raised at run
time. If you try to pass an argument which is not of the expected
form, a compile time syntax error must be raised.
Concretely, the macro works as follows:

$$TEST-ALIST

``(alist a (b (* 2 a)))`` would raise an error ``unbound identifier a``.
Here is the implementation:

$$ALIST

The expression ``#'(arg ...)`` expands to a list of syntax
objects which are then transformed by the ``syntax-match`` transformer,
which converts identifiers of the form ``n`` into couples of the form
``(n n)``, whereas it leaves couples ``(n v)`` unchanged, while
checking that ``n`` is an identifier.

Macros as list transformers
---------------------------------------------------------------------

Macros are in one-to-one correspondence with list transformers, i.e.
every
macro is associated with a transformer which converts a list of syntax
objects (the arguments of the macro) into another list of syntax
objects (the expansion of the macro). Scheme itself takes care of
converting the input code into a list of syntax objects (if you wish,
internally there is a ``datum->syntax`` conversion) and the output
syntax list into code (an internal ``syntax->datum`` conversion).
The sharp-quote notation in macros is just an abbreviation for the
underlying list: for instance a macro describing function composition

::

   (def-syntax (app f g)
     #'(f g))

can be written equivalently also as

::

   (def-syntax (app f g)
     (list #'f #'g))

or

::

   (def-syntax (app f g)
     (cons* #'f #'g #'()))

The sharp-quoted syntax is more readable, but it hides the underlying
list representation, which in some cases is pretty useful. This is why
``syntax-match`` macros are much more powerful than ``syntax-rules``
macros.

``sweet-macros`` provides a convenient feature:
it is possible to extract the associated
transformer for each macro defined via ``def-syntax``. For instance,
here is the transformer associated with the ``define-a`` macro:

.. code-block:: scheme

   > (define tr (define-a <transformer>))
   > (tr (list #'dummy #'1))
   (#<syntax define> #<syntax a> 1)

Notice that the name of the macro (in this case ``define-a``) is
ignored by the transformer, i.e. it is a dummy identifier.

|#
(import (rnrs) (sweet-macros) (aps easy-test) (aps compat)
        (for (aps list-utils) expand) (for (aps record-syntax) expand run))

;;ALIST
(def-syntax (alist arg ...)
  (with-syntax ((
    ((name value) ...)
    (map (syntax-match ()
           (sub n #'(n n) (identifier? #'n))
           (sub (n v) #'(n v) (identifier? #'n)))
         #'(arg ...)) ))
    #'(let* ((name value) ...)
+ (list (list 'name name) ...)))) +;;END + +(def-syntax book (record-syntax title author)) +(pretty-print (syntax-expand (record-syntax title author))) (newline) + +(define b (vector "T" "A")) +(display (list (book b title) (book b author))) ;; seems an Ypsilon bug +;since this works +;(def-syntax book +; (syntax-match (title author) +; (sub (ctx v title) (syntax (vector-ref v 0))) +; (sub (ctx v author) (syntax (vector-ref v 1))))) + +(display (syntax-expand (alist (a 1) (b (* 2 a))))) + +(run + + ;;TEST-ALIST + (test "simple" + (alist (a 1) (b (* 2 a))) + '((a 1) (b 2))) + + ;;END + ;(test "with-error" + ; (catch-error (alist2 (a 1) (2 3))) + ; "invalid syntax") + +) + + + diff --git a/artima/scheme/scheme22.ss b/artima/scheme/macro4.ss index 66114a2..66114a2 100644 --- a/artima/scheme/scheme22.ss +++ b/artima/scheme/macro4.ss diff --git a/artima/scheme/scheme23.ss b/artima/scheme/macro5.ss index 0a0da02..0a0da02 100644 --- a/artima/scheme/scheme23.ss +++ b/artima/scheme/macro5.ss diff --git a/artima/scheme/scheme24.ss b/artima/scheme/macro6.ss index 7ddfa33..7ddfa33 100644 --- a/artima/scheme/scheme24.ss +++ b/artima/scheme/macro6.ss diff --git a/artima/scheme/module-system.ss b/artima/scheme/module-system.ss deleted file mode 100644 index 687f0c9..0000000 --- a/artima/scheme/module-system.ss +++ /dev/null @@ -1,282 +0,0 @@ -#| -The R6RS module system -========================================================= - -Preamble --------------------------------------------------------- - -For nearly 30 years Scheme lived without a standard module system. -The consequence of this omission was the proliferation of dozens -of incompatible module systems and never-ending debates. -The situation changed with the R6RS report: nowadays Scheme *has* -an official module system, finally. -Unfortunately the official module system is *not* used -by all Scheme implementations, and it is possible that some implementation -will *never* support it.
Thus, the module system certainly has political -issues; this is unfortunate, but there -is nothing we can do about it. -On the other hand, the R6RS module system -also has a few technical issues. We can do something about those, by -explaining the subtle points and by documenting the most common pitfalls. -It will take me six full episodes to -explain the module system and its trickiness, especially for macro -writers who want to write portable code. - -.. image:: Jigsaw.png - -Compiling Scheme modules vs compiling Python modules -------------------------------------------------------------- - -Since the title of this series is *The Adventures of a Pythonista in -Schemeland* let me begin my excursion about the R6RS module -system by contrasting it with the Python module system. - -How do Python modules work? All Pythonistas know the answer, but let -me spell it out loud for the benefit of other readers, and allow me to -give a simplified description of the importing mechanism which however is -not far from the truth. - -When you run a script ``script.py`` depending on some library ``lib.py``, -the Python interpreter searches for a bytecode-compiled -file ``lib.pyc`` consistent with the source file ``lib.py``; if it finds it, -it imports it, otherwise it compiles the source file *on-the-fly*, -generates a ``lib.pyc`` file and imports it. -A byte-compiled file is consistent with the source file if it has been -generated *after* the source file; if you modify the source file, -the ``lib.pyc`` file becomes outdated: the Python interpreter is -smart enough to recognize the issue and to seamlessly recompile ``lib.pyc``. - -In Scheme the compilation process is very much *implementation-dependent*. -Here I will focus on the Ikarus mechanism, which is the most Pythonic one. -Ikarus has two modes of operation; by default it just compiles -everything from scratch, without using any intermediate file. -This is possible since the Ikarus compiler is very fast.
However, -this mechanism does not scale; if you have very large libraries, -it does not make sense to recompile everything every time you write a -little script. -Therefore Ikarus (in the latest development version) added a mechanism -similar to the Python one; if you have a file ``script.ss`` which -depends on a library ``lib.sls`` and run the command - -:: - - $ ikarus --compile-dependencies script.ss - Serializing "./lib.sls.ikarus-fasl" ... - -the compiler will automatically (re)generate a precompiled file -``lib.sls.ikarus-fasl`` from the source file ``lib.sls`` as needed, by -looking at the time stamps. Exactly the same as in Python. The only -difference is that Python compiles to bytecode, whereas Ikarus compiles -to native code and therefore Ikarus programs are usually much faster -than Python programs. Notice that whereas in theory Ikarus should -always be much faster than Python, in practice this is not guaranteed: a -lot of Python programs are actually calling underlying C libraries, so -that Python can be pretty fast in some cases (for instance in -numeric computations using numpy). - -Modules are not first class objects ------------------------------------------------------------- - -There is a major difference between Python modules and Scheme modules: -Python modules are first class runtime objects which can be passed and -returned from functions, as well as modified and introspected freely; -Scheme modules instead are compile time entities which are not first -class objects, cannot be modified and cannot be introspected. - -Python modules are so flexible because they are basically -dictionaries. It would not be difficult to implement a Python-like -module system in Scheme, by making use of hash-tables, the equivalent -of Python dictionaries. However, the standard module system does not -follow this route, because Scheme modules may contain macros which are -not first class objects, therefore they cannot be first class objects -themselves.
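The claim that Python modules are basically dictionaries can be checked directly from the Python side. Here is a minimal sketch (the module name ``mylib`` is made up; ``types.ModuleType`` builds a module object by hand, much as the import machinery does):

```python
import types

# Build a module object on the fly, as the import machinery would.
mylib = types.ModuleType("mylib")
mylib.x = 1                      # add a name dynamically
setattr(mylib, "y", 2)           # same thing, via the generic attribute API

# Attribute access on a module is essentially a dictionary lookup:
assert mylib.x == mylib.__dict__["x"] == 1

# The module is freely introspectable: list the names it defines.
print(sorted(n for n in vars(mylib) if not n.startswith("__")))  # ['x', 'y']
```

None of this is possible with an R6RS library, which is exactly the asymmetry discussed in the surrounding paragraphs.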
- -A remark is in order: if you have a Scheme library ``lib.sls`` which -defines a variable ``x``, and you import it with a prefix ``lib.``, -you can access the variable with the Python-like syntax -``lib.x``. However, ``lib.x`` in Scheme means something completely -different from ``lib.x`` in Python: ``lib.x`` in Scheme is just a name -with a prefix, whereas ``lib.x`` in Python means -"take the attribute ``x`` of the object ``lib``" -and that involves a function call. -In other words, Python must perform a hash table lookup every time you -use the syntax ``lib.x``, whereas Scheme does not need to do so. - -Another difference is that it is possible to add -names dynamically to a Python module whereas it is impossible to do so -for a Scheme module. It is also impossible to get the list of names -exported by a module: the only way is to look at the export list -in the source code. It is also impossible to export all the names -from a module automatically: one has to list them all explicitly. - -In general Scheme is not too strong at introspection, which is -really disturbing to me since it is an issue that -could be easily solved. For instance, my ``sweet-macros`` library -provides introspection features, so that you can ask at runtime, for -instance from the REPL, what are the patterns and the literals -accepted by a macro, its source code and its associated transformer, -even if the macro is a -purely compile time entity. Therefore, it would be perfectly possible -to give an introspection API to every imported module. For instance, -every module could automagically define a variable - defined both -at runtime and compile time - containing the full -list of exported names and there could be some builtin syntax -to query the list. - -But introspection has been completely neglected by the current -standard.
One wonders how Schemers cope with large libraries/frameworks -like the ones we use every day in the enterprise world, which export -thousands and thousands of names in hundreds and hundreds of modules. -Let's hope for something better in the future. - -Compiling is not the same as executing ----------------------------------------------------------------- - -There are also similarities between a Python compiler and a Scheme compiler. -For instance, they are both very permissive, in the sense that they flag -very few errors at compile time. Consider for instance the following -Python module:: - - $ cat lib.py - x = 1/0 - -The module contains an obvious error that, in principle, should be -visible to the (bytecode) compiler. However, the compiler only checks -that the module contains syntactically correct Python code, it *does -not evaluate it*, and generates a ``lib.pyc`` file without -complaining:: - - $ python -m py_compile lib.py # generates lib.pyc without errors - -The error will be flagged at runtime, only when you import the module:: - - $ python -c"import lib" - Traceback (most recent call last): - File "<string>", line 1, in <module> - File "lib.py", line 1, in <module> - x = 1/0 - ZeroDivisionError: integer division or modulo by zero - -Scheme uses a very similar model, but importing a module has a different -meaning. Consider for instance the library - -:: - - $ cat lib.sls - (library (lib) - (export x) - (import (rnrs)) - (define x (/ 1 0))) - -which compiles correctly and the script - -:: - - $ cat script.ss - (import (rnrs) (lib)) - -You can compile the library and run the script without seeing any error:: - - $ ikarus --compile-dependencies script.ss - Serializing "./lib.sls.ikarus-fasl" ...
- $ ikarus --r6rs-script script.ss - -The difference with Python is the following: in Python, importing a module -(which is done at runtime) means both *compiling* and *evaluating* it; -in Scheme importing a module (which is done at compile time) means -*compiling* and *visiting* it, i.e. taking notes of the names exported -by the module and of all its dependencies; however, the module is not -evaluated, unless it is used. In particular, only when you try to -access the ``x`` variable will you get the division error at runtime: - -:: - - $ cat script.ss - (import (rnrs) (prefix (lib) lib:)) - (begin - (display "running ...\n") - (display lib:x)) - $ ikarus --r6rs-script script.ss - Unhandled exception: - Condition components: - 1. &assertion - 2. &who: / - 3. &message: "division by 0" - 4. &irritants: () - -I have imported the names in ``lib`` with a prefix, -to stay closer to the Python style, but usually (and unfortunately) in -the Scheme world people do not use prefixes; by default all -exported names are imported, just as is the case for Python -when the (discouraged) style ``from lib import *`` is used. - -Why is there so little checking at compile time? ------------------------------------------------------------------------ - -I asked myself why Scheme compilers (but also the Python compiler) -are so stupid that they cannot recognize obvious errors like the -zero division error just discussed. I could not find an answer -so I asked on the Ikarus mailing list. It turns out the compilers -are not stupid at all: they can recognize the zero division error, -but they cannot signal it since it is forbidden by the Scheme -specifications. For instance, Llewellyn Pritchard (Leppie), the implementor of -IronScheme wrote: - -.. epigraph:: - - In IronScheme, if I can detect there is an issue at compile - time, I simply defer the computation to the runtime, or could even - just convert it into a closure that will return an error.
This is only - one of the things that make Scheme quite hard to implement on a statically - typed runtime such as the CLR, as it forces me to box values at method - boundries and plenty type checking at runtime. - -whereas Abdul Aziz Ghoulum wrote: - -.. epigraph:: - - - Actually, Ikarus does some type checking, and it does - detect the division by 0. It however cannot do anything - about it in this case since Scheme requires that the - exception be raised when the division operation is - performed at run time. - -Aziz went further and explained to me the rationale for -the current specification. The reason is that we want -expressions like - -.. code-block:: scheme - - (define x (/ 1 0)) - -and - -.. code-block:: scheme - - (define thunk (lambda () (/ 1 0))) - -to be compilable. The second expression is really the same as -the first one, only nested one level more. Even if the thunk -will raise an error when called, the thunk itself should -be a valid compilable procedure. It is useful -to have functions that can raise predictable errors, especially -when writing test cases, so a compiler should not reject them. -In general, you can think of a module as a giant thunk; using a -module calls the thunk (the process is called *module instantiation*) -and possibly raises errors at runtime, but the module per se must be -compilable even if it contains errors which are detectable at compile -time. - -This evaluation strategy also keeps the compiler simple: we know that the -compiler will just expand the macros, but will not perform any evaluation. -Finally, this semantics enables `cross compilation`_: macros will be expanded -independently of the architecture, whereas the -runtime structures will be compiled and linked differently depending on the -architecture of the target processor. - -.. image:: compiler-crosscompiler.jpg - -.. _cross compilation: http://chicken.wiki.br/cross-compilation -..
_cross compilation: http://en.wikipedia.org/wiki/Cross_compilation -|# diff --git a/artima/scheme/scheme19.ss b/artima/scheme/scheme19.ss index 253e817..687f0c9 100644 --- a/artima/scheme/scheme19.ss +++ b/artima/scheme/scheme19.ss @@ -1,238 +1,282 @@ -#|Recursive macros -===================================================================== - -After a short introduction about the relevance of macros for -programming language design, I show a few common patterns of Scheme -macrology: recursive macros, accumulators, and the usage of literals -to incorporate helpers in macros. - -Should everybody be designing her own programming language? ------------------------------------------------------------------ - -Macros are the main reason why I first became interested in Scheme. At -the time - more or less six years ago - there was a bunch of lispers -trolling in comp.lang.python, and arguing for the addition of macros -to Python. Of course most Pythonistas opposed the idea, but at the -time I had no idea of the advantages/disadvantages of macros; I felt -quite ignorant and powerless to argue. I never liked to feel ignorant, -so I decided to learn macros, especially Scheme macros, since they are -the state of the art. - -Nowadays I can say that the addition of macros to Python would be -a bad idea knowing what I am talking about. Actually, I have already -stated in episode 12_ an even stronger opinion, i.e. that macros -are more bad than good for any enterprise-oriented language (but notice -that I am *not* implying that every enterprise should adopt -only enterprise-oriented languages). - -My opinion against macros in (most) enterprise programming -does not mean that macros are worthless, and indeed -I think they are extremely useful and important in another domain, -i.e. in the domain of design and research about programming -languages. As a matter of fact, *Scheme macros enable every programmer -to write her own programming language*.
I think this is a valuable and nice -to-have thing. Everybody who has got an opinion -about language design, or about how an object should work, or -questions like "what would a language look like if it had feature X?", -can solve his doubts by implementing the feature with macros. - -Perhaps not everybody should design their own programming language, -and certainly not everybody should *distribute* their own personal -language; however, I think lots of people will benefit from trying -to think about how to design a language and making some experiments. -The easiest thing is to start from a Domain Specific Language (DSL), -which does not need to be a full-grown programming language; for -instance in the Python world it seems that everybody is implementing templating -languages to generate web pages. In my opinion, this is a good thing *per se*; -the problem is that everybody is distributing their own language, so that -there is a bit of anarchy, but this is not such a serious problem after all. - -Even for what concerns full-grown programming languages, we see nowadays -an explosion of new languages coming out, especially for the Java and -the CLR platforms, since it is relatively easy to implement a new -language on those platforms. However, it still takes a lot of work. - -On the other hand, writing a custom language embedded in Scheme by -means of macros is by far much easier, and Scheme makes an ideal platform -for implementing languages and experimenting with new ideas. - -There is a `quote of Ian Bicking`_ about Web frameworks which struck me: - -*Sometimes Python is accused of having too many web frameworks. And -it's true, there are a lot. That said, I think writing a framework is -a useful exercise. It doesn't let you skip over too much without -understanding it. It removes the magic.
So even if you go on to use -another existing framework (which I'd probably advise you do), you'll be able to understand it better if you've written something like it -on your own.* - -You can then replace the words "web framework" with "programming -language" here and the quote still makes sense. You should read my -*Adventures* in this spirit: the goal of this series is to give you -the technical competence to write your own language by means of -macros. Even if you are not going to design your own language, -macros will help you to understand how languages work. - -I personally am interested only in the -technical competence, *I do not want to write a new language*. -There are already lots of languages -out there, and writing a real language is a lot of grunt work, because -it means writing debugging tools, good error messages, wondering about -portability, interacting with a user community, et cetera et cetera. -Not everybody is a good language designer and a good BDFL, for sure; -however everybody can have opinions about language design, and some -experiments with macrology can help to put such opinions to the test. - -.. _quote of Ian Bicking: http://pythonpaste.org/webob/do-it-yourself.html -.. _12: http://www.artima.com/weblogs/viewpost.jsp?thread=240836 - -Recursive macros with accumulators ---------------------------------------------------------- - -The goal of learning macros well enough to implement a programming language -is an ambitious one; it is not something we can attain in an episode of the -Adventures, nor in six. However, one episode is enough to explain at least -one useful technique which is commonly used in Scheme macrology and which -is good to know in order to reach our final goal, in time. -The technique we will discuss in this episode is the accumulator trick, -which is analogous to the accumulator trick we first discussed in episode -6_ when talking about tail call optimization.
In Scheme it is common -to introduce an auxiliary variable to store a value which is passed -in a loop: the same trick can be used in macros, at compile time instead -of at run time. - -In order to give an example of usage of the accumulator trick, let me -define a conditional macro ``cond-`` which works like ``cond``, but -with fewer parentheses: +#| +The R6RS module system +========================================================= + +Preamble +--------------------------------------------------------- + +For nearly 30 years Scheme lived without a standard module system. +The consequence of this omission was the proliferation of dozens +of incompatible module systems and never-ending debates. +The situation changed with the R6RS report: nowadays Scheme *has* +an official module system, finally. +Unfortunately the official module system is *not* used +by all Scheme implementations, and it is possible that some implementation +will *never* support it. Thus, the module system certainly has political +issues; this is unfortunate, but there +is nothing we can do about it. +On the other hand, the R6RS module system +also has a few technical issues. We can do something about those, by +explaining the subtle points and by documenting the most common pitfalls. +It will take me six full episodes to +explain the module system and its trickiness, especially for macro +writers who want to write portable code. + +.. image:: Jigsaw.png + +Compiling Scheme modules vs compiling Python modules +-------------------------------------------------------------- + +Since the title of this series is *The Adventures of a Pythonista in +Schemeland* let me begin my excursion about the R6RS module +system by contrasting it with the Python module system. + +How do Python modules work? All Pythonistas know the answer, but let +me spell it out loud for the benefit of other readers, and allow me to +give a simplified description of the importing mechanism which however is +not far from the truth.
+ +When you run a script ``script.py`` depending on some library ``lib.py``, +the Python interpreter searches for a bytecode-compiled +file ``lib.pyc`` consistent with the source file ``lib.py``; if it finds it, +it imports it, otherwise it compiles the source file *on-the-fly*, +generates a ``lib.pyc`` file and imports it. +A byte-compiled file is consistent with the source file if it has been +generated *after* the source file; if you modify the source file, +the ``lib.pyc`` file becomes outdated: the Python interpreter is +smart enough to recognize the issue and to seamlessly recompile ``lib.pyc``. + +In Scheme the compilation process is very much *implementation-dependent*. +Here I will focus on the Ikarus mechanism, which is the most Pythonic one. +Ikarus has two modes of operation; by default it just compiles +everything from scratch, without using any intermediate file. +This is possible since the Ikarus compiler is very fast. However, +this mechanism does not scale; if you have very large libraries, +it does not make sense to recompile everything every time you write a +little script. +Therefore Ikarus (in the latest development version) added a mechanism +similar to the Python one; if you have a file ``script.ss`` which +depends on a library ``lib.sls`` and run the command + +:: + + $ ikarus --compile-dependencies script.ss + Serializing "./lib.sls.ikarus-fasl" ... + +the compiler will automatically (re)generate a precompiled file +``lib.sls.ikarus-fasl`` from the source file ``lib.sls`` as needed, by +looking at the time stamps. Exactly the same as in Python. The only +difference is that Python compiles to bytecode, whereas Ikarus compiles +to native code and therefore Ikarus programs are usually much faster +than Python programs.
Notice that whereas in theory Ikarus should +always be much faster than Python, in practice this is not guaranteed: a +lot of Python programs are actually calling underlying C libraries, so +that Python can be pretty fast in some cases (for instance in +numeric computations using numpy). + +Modules are not first class objects +------------------------------------------------------------- + +There is a major difference between Python modules and Scheme modules: +Python modules are first class runtime objects which can be passed and +returned from functions, as well as modified and introspected freely; +Scheme modules instead are compile time entities which are not first +class objects, cannot be modified and cannot be introspected. + +Python modules are so flexible because they are basically +dictionaries. It would not be difficult to implement a Python-like +module system in Scheme, by making use of hash-tables, the equivalent +of Python dictionaries. However, the standard module system does not +follow this route, because Scheme modules may contain macros which are +not first class objects, therefore they cannot be first class objects +themselves. + +A remark is in order: if you have a Scheme library ``lib.sls`` which +defines a variable ``x``, and you import it with a prefix ``lib.``, +you can access the variable with the Python-like syntax +``lib.x``. However, ``lib.x`` in Scheme means something completely +different from ``lib.x`` in Python: ``lib.x`` in Scheme is just a name +with a prefix, whereas ``lib.x`` in Python means +"take the attribute ``x`` of the object ``lib``" +and that involves a function call. +In other words, Python must perform a hash table lookup every time you +use the syntax ``lib.x``, whereas Scheme does not need to do so. + +Another difference is that it is possible to add +names dynamically to a Python module whereas it is impossible to do so +for a Scheme module.
It is also impossible to get the list of names +exported by a module: the only way is to look at the export list +in the source code. It is also impossible to export all the names +from a module automatically: one has to list them all explicitly. + +In general Scheme is not too strong at introspection, which is +really disturbing to me since it is an issue that +could be easily solved. For instance, my ``sweet-macros`` library +provides introspection features, so that you can ask at runtime, for +instance from the REPL, what are the patterns and the literals +accepted by a macro, its source code and its associated transformer, +even if the macro is a +purely compile time entity. Therefore, it would be perfectly possible +to give an introspection API to every imported module. For instance, +every module could automagically define a variable - defined both +at runtime and compile time - containing the full +list of exported names and there could be some builtin syntax +to query the list. + +But introspection has been completely neglected by the current +standard. One wonders how Schemers cope with large libraries/frameworks +like the ones we use every day in the enterprise world, which export +thousands and thousands of names in hundreds and hundreds of modules. +Let's hope for something better in the future. + +Compiling is not the same as executing +----------------------------------------------------------------- + +There are also similarities between a Python compiler and a Scheme compiler. +For instance, they are both very permissive, in the sense that they flag +very few errors at compile time. Consider for instance the following +Python module:: + + $ cat lib.py + x = 1/0 + +The module contains an obvious error that, in principle, should be +visible to the (bytecode) compiler.
However, the compiler only checks +that the module contains syntactically correct Python code, it *does +not evaluate it*, and generates a ``lib.pyc`` file without +complaining:: + + $ python -m py_compile lib.py # generates lib.pyc without errors + +The error will be flagged at runtime, only when you import the module:: + + $ python -c"import lib" + Traceback (most recent call last): + File "<string>", line 1, in <module> + File "lib.py", line 1, in <module> + x = 1/0 + ZeroDivisionError: integer division or modulo by zero + +Scheme uses a very similar model, but importing a module has a different +meaning. Consider for instance the library + +:: + + $ cat lib.sls + (library (lib) + (export x) + (import (rnrs)) + (define x (/ 1 0))) + +which compiles correctly and the script + +:: + + $ cat script.ss + (import (rnrs) (lib)) + +You can compile the library and run the script without seeing any error:: + + $ ikarus --compile-dependencies script.ss + Serializing "./lib.sls.ikarus-fasl" ... + $ ikarus --r6rs-script script.ss + +The difference with Python is the following: in Python, importing a module +(which is done at runtime) means both *compiling* and *evaluating* it; +in Scheme importing a module (which is done at compile time) means +*compiling* and *visiting* it, i.e. taking notes of the names exported +by the module and of all its dependencies; however, the module is not +evaluated, unless it is used. In particular, only when you try to +access the ``x`` variable will you get the division error at runtime: + +:: + + $ cat script.ss + (import (rnrs) (prefix (lib) lib:)) + (begin + (display "running ...\n") + (display lib:x)) + $ ikarus --r6rs-script script.ss + Unhandled exception: + Condition components: + 1. &assertion + 2. &who: / + 3. &message: "division by 0" + 4.
&irritants: () + +I have imported the names in ``lib`` with a prefix, +to stay closer to the Python style, but usually (and unfortunately) in +the Scheme world people do not use prefixes; by default all +exported names are imported, just as is the case for Python +when the (discouraged) style ``from lib import *`` is used. + +Why is there so little checking at compile time? +------------------------------------------------------------------------ + +I asked myself why Scheme compilers (but also the Python compiler) +are so stupid that they cannot recognize obvious errors like the +zero division error just discussed. I could not find an answer +so I asked on the Ikarus mailing list. It turns out the compilers +are not stupid at all: they can recognize the zero division error, +but they cannot signal it since it is forbidden by the Scheme +specifications. For instance, Llewellyn Pritchard (Leppie), the implementor of +IronScheme wrote: + +.. epigraph:: + + In IronScheme, if I can detect there is an issue at compile + time, I simply defer the computation to the runtime, or could even + just convert it into a closure that will return an error. This is only + one of the things that make Scheme quite hard to implement on a statically + typed runtime such as the CLR, as it forces me to box values at method + boundries and plenty type checking at runtime. + +whereas Abdul Aziz Ghoulum wrote: + +.. epigraph:: + + + Actually, Ikarus does some type checking, and it does + detect the division by 0. It however cannot do anything + about it in this case since Scheme requires that the + exception be raised when the division operation is + performed at run time. + +Aziz went further and explained to me the rationale for +the current specification. The reason is that we want +expressions like .. code-block:: scheme - (cond- - cond-1? return-1 - cond-2? return-2 - ... - else return-default) + (define x (/ 1 0)) -We want the code above to expand to: +and ..
code-block:: scheme - (cond - (cond-1? return-1) - (cond-2? return-2) - ... - (else return-default)) - - -Here is the solution, which makes use of an accumulator and of an auxiliary -macro: - -$$COND- - -The code should be clear. The auxiliary (private) macro ``cond-aux`` -is recursive: it works by collecting the arguments ``x1, x2, ..., xn`` -in the accumulator ``(acc ...)``. If the number of arguments is even, -at some point we end up having collected all the arguments in the -accumulator, which is then expanded into a standard conditional; if -the number of arguments is even, at some point we end up having -collected all the arguments except one, and a ``"Mismatched pairs"`` -exception is raised. The user-visible macro ``cond-`` just calls -``cond-aux`` by setting the initial value of the accumulator to ``()``. -The entire expansion and error checking is made at compile time. -Here is an example of usage:: - - > (let ((n 1)) - (cond- (= n 1) ; missing a clause - (= n 2) 'two - (= n 3) 'three - else 'unknown)) - Unhandled exception: - Condition components: - 1. &who: cond- - 2. &message: "Mismatched pairs" - 3. &syntax: - form: (((= n 1) (= n 2)) ('two (= n 3)) ('three else) 'unknown) - subform: 'unknown - -A trick to avoid auxiliary macros ----------------------------------------------------------------- - -I have nothing against auxiliary macros, however sometimes you may -want to keep all the code in a single macro. This is useful if you are -debugging a macro since an auxiliary macro is usually not exported and -you may not have access to it without changing the source code of the -module defining it and without recompiling it; on the other hand, you -have full access to an exported macro including the features of the -would be auxiliary macro. The trick is to introduce a literal to -defined the helper macro inside the main macro. 
Here is how it would -work in our example: - -$$COND2 - -If you do not want to use a literal identifier, you can use a literal string -instead: - -$$COND3 - -This kind of trick is quite common in Scheme macros; the best reference -you can find detailing these techniques and others is the `Syntax-Rules Primer -for the Merely Eccentric`_, by Joe Marshall. The title is a play on the essay -`An Advanced Syntax-Rules Primer for the Mildly Insane`_ by -Al Petrofsky. - -.. image:: mad-scientist.jpg - -Marshall's essay is quite nontrivial, and it is intended for expert -Scheme programmers. On the other hand, it is child's play compared to -Petrofsky's essay, which is intended for foolish Scheme wizards ;) - -.. _An Advanced Syntax-Rules Primer for the Mildly Insane: http://groups.google.com/group/comp.lang.scheme/browse_frm/thread/86c338837de3a020/eb6cc6e11775b619?#eb6cc6e11775b619 -.. _6: http://www.artima.com/weblogs/viewpost.jsp?thread=240198 - -.. _Syntax-Rules Primer for the Merely Eccentric: http://www.xs4all.nl/~hipster/lib/scheme/gauche/define-syntax-primer.txt - + (define thunk (lambda () (/ 1 0))) + +to be compilable. The second expression is really the same as +the first one, only nested one level more. Even if the thunk +will raise an error when called, the thunk itself should +be a valid compilable procedure. It is useful +to have functions that can raise predictable errors, especially +when writing test cases, so a compiler should not reject them. +In general, you can think of a module as a giant thunk; using a +module calls the thunk (the process is called *module instantiation*) +and possibly raises errors at runtime, but the module per se must be +compilable even if it contains errors which are detectable at compile +time. + +This evaluation strategy also keeps the compiler simple: we know that the +compiler will just expand the macros, but will not perform any evaluation.
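To make the thunk argument concrete, here is a small sketch of mine (not from the original text, but consistent with it, assuming any R6RS-conformant implementation): both definitions below compile, and the errors only show up when the thunks are invoked.

```scheme
(import (rnrs))

;; both right hand sides are left unevaluated by the compiler, so this
;; code compiles even though it contains errors detectable at compile time
(define bad-div (lambda () (/ 1 0)))     ; division by zero, deferred
(define bad-car (lambda () (car '())))   ; car of the empty list, deferred

;; only *calling* the thunks raises the errors, at runtime:
;; (bad-div)  ; raises an exception when invoked
;; (bad-car)  ; raises an exception when invoked
```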
+Finally, this semantics enables `cross compilation`_: macros will be expanded +independently of the architecture, whereas the +runtime structures will be compiled and linked differently depending on the +architecture of the target processor. + +.. image:: compiler-crosscompiler.jpg + +.. _cross compilation: http://chicken.wiki.br/cross-compilation +.. _cross compilation: http://en.wikipedia.org/wiki/Cross_compilation |# - -(import (rnrs) (sweet-macros)) - -;;COND- -(def-syntax cond-aux - (syntax-match () - (sub (cond-aux (acc ...)) - #'(cond acc ...)) - (sub (cond-aux (acc ...) x1) - #'(syntax-violation 'cond- "Mismatched pairs" '(acc ... x1) 'x1)) - (sub (cond-aux (acc ...) x1 x2 x3 ...) - #'(cond-aux (acc ... (x1 x2)) x3 ...)) - )) - -(def-syntax (cond- x1 x2 ...) - (cond-aux () x1 x2 ...)) -;;END - - -;;COND3 - (define-syntax cond- - (syntax-match () - (sub (cond- "aux" (acc ...)) - (cond acc ...)) - (sub (cond- "aux" (acc ...) x) - (syntax-violation 'cond- "Mismatched pairs" '(acc ... x) 'x)) - (sub (cond- "aux" (acc ...) x1 x2 x3 ...) - (cond- "aux" (acc ... (x1 x2)) x3 ...)) - (sub (cond- x1 x2 ...) - (cond- "aux" () x1 x2 ...)))) -;;END - -;;COND2 - (define-syntax cond- - (syntax-match (aux) - (sub (cond- aux (acc ...)) - (cond acc ...)) - (sub (cond- aux (acc ...) x1) - (syntax-violation 'cond- "Mismatched pairs" '(acc ... x1) 'x1)) - (sub (cond- aux (acc ...) x1 x2 x3 ...) - (cond- aux (acc ... (x1 x2)) x3 ...)) - (sub (cond- x1 x2 ...) - (cond- aux () x1 x2 ...)))) -;;END diff --git a/artima/scheme/scheme20.ss b/artima/scheme/scheme20.ss index 9821f37..04db2f7 100644 --- a/artima/scheme/scheme20.ss +++ b/artima/scheme/scheme20.ss @@ -1,318 +1,345 @@ -#| -Can a language be both easy and powerful? ------------------------------------------------------------------------ - -When it comes to designing programming languages, ease of use and -power seem to go in opposite directions. There are plenty of examples -where something went wrong, i.e.
simple languages which however are -good only for teaching and not for professional use, and -professional languages which are however too tricky to use -for the casual programmer. We have also examples of languages which -are both weak in power *and* difficult to use (insert your chosen -language here). - -Nevertheless, I think it is perfectly possible to design a language -which is both easy to use and powerful. For instance, Python is a good -example of such a language (others will prefer Ruby, or Scala, or -anything else they like). - -There are various reasons why Python can be both easy to use and powerful, -the most important ones being the following, in my opinion: - -1. it is a one-man language (i.e. it is not a compromise language made by a - committee); - -2. it is a language made from scratch, with no preoccupations of backward - compatibility; - -3. between (premature) optimization and ease of use Python always chooses - the latter; - -4. it provides special syntax/libraries for common operations. - -Scheme does not share any of these characteristics, and as a consequence it -is definitely not an easy language. It is just a powerful language. - -However, it is powerful enough that you can make it easy to use, but -that requires a lot of work on the part of the programmer, who must -implement point 4 by himself, whereas -nowadays we are all spoiled and we expect the language implementors to -do this kind of work for us. - -.. image:: bikeshed.jpg - :class: right - :width: 400 - -I think the explanation for the current situation in Scheme is more historical -and social than technical. On one side, a lot of people in the Scheme -world want Scheme to stay the way it is, i.e. a language for language -experimentation and research more than a language for enterprise -work.
On the other side, the fact that there are so many -implementations of Scheme makes it difficult/impossible to specify too -much: this is the reason why there are no standard debugging tools for -Scheme, but only implementation-specific ones. - -Finally, there is the infamous `bikeshed effect`_ to take into account. -The bikeshed effect is typical of any project designed by a committee: -when it comes to proposing advanced functionalities that very few -can understand, it is easy to get approval from the larger community. -However, when it comes to simple functionality of common usage, everybody -has got a different opinion and it is practically impossible to get -anything approved at all. - -To avoid that, the standard does not provide -directly usable instruments: instead, it provides general instruments -which are intended as building blocks on top of which everybody can -write the usable abstractions he/she prefers. Most people nowadays -prefer to have ready-made solutions, because they have deadlines, -projects to complete and no time nor interest in writing things -that should be provided by language designers, so Scheme is little -used in the enterprise world. - -There are other options, however, if you are interested in a Scheme -for usage in the enterprise world. You can just use a Scheme -implementation running on the .NET or the Java platform, or a Scheme-like -language such as Clojure_. Clojure runs on the Java Virtual Machine, -it is half Lisp and half Scheme, it has a strong functional flavour in -it, it has interesting things to say about concurrency_, -it is a one-man language (Rich Hickey is the BDFL) and provides -access to all the Java libraries. Moreover it provides a whole set -of `syntax conveniences`_ that would never enter the Scheme standard.
- -Professionally I have never -interacted with the Java platform (and even there I would probably -choose Jython over Clojure for reasons of familiarity) so I have not -checked out Clojure and I have no idea about it except what you can -infer after reading its web site. If amongst my readers -there is somebody with experience in Clojure, please feel free to add -a comment to this post. - -Personally, I am using Scheme since I am interested in macrology and no -language in existence can beat Scheme in this respect. - -.. _Clojure: http://clojure.org/ -.. _syntax conveniences: http://clojure.org/special_forms -.. _concurrency: http://clojure.org/concurrent_programming -.. _bikeshed effect: http://en.wikipedia.org/wiki/Bikeshed - -Second order macros ------------------------------------------------------------- - -There is no upper limit to the level of sophistication you can reach -with macros: in particular it is possible to define higher order -macros, i.e. macros taking other macros as arguments or macros -expanding to other macros. Higher order macros allow an extremely -elegant programming style; on the other hand, they are exposed to the -risk of making the code incomprehensible and very hard to debug. -In this episode we will give a couple of examples of second order -macros taking other macros as arguments. - -Our first example is a generalization of the accumulator trick we -used last week to define the ``cond-`` macro. We will define a -``collecting-pairs`` macro, which takes as input another macro and a -sequence of arguments, and calls the input macro with the arguments -grouped in pairs. -Here is the code: - -$$COLLECTING-PAIRS - -``collecting-pairs`` can be used with many syntactic expressions like -``cond``, ``case``, ``syntax-rules``, et cetera.
Here is an example -with the case_ expression:: - - > (collecting-pairs (case 1) - (1) 'one - (2) 'two - (3) 'three - else 'unknown) - one - -Macros generating macros ----------------------------------------------------- - -In this paragraph I will give an example of a second order macro -expanding to a regular (first order) macro. Here it is: - -$$DEF-VECTOR-TYPE - -``def-vector-type`` is a macro which defines a macro which is used to -manage classes of vectors; for instance - -$$BOOK - -defines a ``Book`` macro which is able to manage two-dimensional vectors -with fields ``title`` and ``author``. The expansion of ``Book`` is the -following: +#|The evaluation strategy of Scheme programs +================================================ + +The Scheme module system is complex, because of the +complications caused by macros and because of the need for +separate compilation and cross compilation. +Fortunately, however, the complication +is hidden, and the module system works well enough for many +simple cases. The proof is that we introduced the R6RS module +system in episode 5_, and for 20 episodes we could go on safely +by just using the basic import/export syntax. However, once +nontrivial macros enter the game, things start to become +interesting. + +.. _5: http://www.artima.com/weblogs/viewpost.jsp?thread=239699 +.. _R6RS document: http://www.r6rs.org/final/html/r6rs-lib/r6rs-lib-Z-H-13.html#node_idx_1142 + +Interpreter semantics vs compiler semantics +------------------------------------------------------------------ + +One of the trickiest things about Scheme, coming from Python, is its +distinction between *interpreter semantics* and *compiler semantics*. + +To understand the issue, let me first point out that having +interpreter semantics or compiler semantics has nothing to do with +being an interpreted or compiled language: both Scheme interpreters +and Scheme compilers exhibit both semantics.
For instance, Ikarus, +which is a native code compiler, provides interpreter semantics at +the REPL, whereas Ypsilon, which is an interpreter, provides +compiler semantics in scripts, when the R6RS compatibility flag is set. + +In general, the same program in the same implementation can be run both +with interpreter semantics (when typed at the REPL) and with compiler +semantics (when used as a library), but the way the program behaves is +different, depending on the semantics used. + +There is no such distinction in Python, which has only +interpreter semantics. In Python, everything happens at runtime, +including bytecode compilation (it is true that technically bytecode +compilation is cached, but conceptually you may very well think that +every module is recompiled at runtime, when you import it - which is +actually what happens if the module has changed in the +meantime). Since Python has only interpreter semantics there is no +substantial difference between typing commands at the REPL and writing +a script. + +Things are quite different in Scheme. The interpreter semantics is +*not specified* by the R6RS standard and it is completely +implementation-dependent. It is also compatible with the standard to +not provide interpreter semantics at all, and to not provide a REPL: +for instance PLT Scheme does not provide a REPL for R6RS programs. On +the other hand, the compiler semantics is specified by the R6RS +standard and is used in scripts and libraries. + +The two semantics are quite different. When a +program is read in interpreter semantics, everything happens at +runtime: it is possible to define a function and, immediately after, a +macro using that function. Each expression entered is +compiled (possibly to native code, as in Ikarus) and executed +immediately. Each new definition augments the namespace of known +names at runtime, both for first class objects and macros. Macros +are also expanded at runtime.
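For instance, here is a sketch of a REPL session (my own example, assuming an Ikarus-like REPL; the prompt and output representation may differ) in which a helper function entered at the prompt is used, in the very next expression, inside a macro transformer - something that works only thanks to interpreter semantics:

```scheme
> (define (double x) (* 2 x))  ; helper, entered first at the REPL
> (define-syntax six           ; macro calling the helper at expansion time
    (lambda (x)
      (syntax-case x ()
        ((_) (with-syntax ((n (double 3))) #'n)))))
> (six)
6
```

The same two definitions placed in an R6RS library would not compile portably, since ``double`` would not be available at expansion time.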
+ +When a program is read in compiler semantics instead, all the definitions +and the expressions are read, the macros are expanded and the program compiled, +*before* execution. Whereas an interpreter looks at a program one expression +at a time, a compiler looks at it as a whole: in particular, the order +of evaluation of expressions in a compiled program is unspecified, +unless you specify it by using a ``begin`` form. + +Let me notice that in my opinion having +an unspecified evaluation order is a clear case of premature +optimization and a serious mistake, but unfortunately this is the +way it is. The rationale is that in some specific circumstances +some compiler could take advantage of the unspecified evaluation order +to optimize the computation of some expression and run a few percent +faster, but this is certainly *not* worth the complication. + +Anyway, since the interpreter semantics is not specified by the R6RS +and thus very much implementation-dependent, I will focus on the +compiler semantics of Scheme programs. Such semantics is quite +tricky, especially when macros enter the game. + +Macros and helper functions +--------------------------------------------------------------------- + +You can see the problem of compiler semantics once you start using macros +which depend on auxiliary functions. For instance, consider this +simple macro + +$$ASSERT-DISTINCT + +which raises a compile-time exception (syntax-violation) if it is +invoked with duplicate arguments. A typical use case for such a macro +is defining specialized lambda forms. The +macro relies on the builtin function +``free-identifier=?``, which returns true when two identifiers +are equal and false otherwise (this is a simplified explanation; +let me refer you to the `R6RS document`_ for the gory details), and +on the helper function ``distinct?`` defined as follows: + +$$list-utils:DISTINCT?
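The ``$$list-utils:DISTINCT?`` placeholder is filled in by the author's build process; as a rough sketch (my own reconstruction, not necessarily the actual ``(aps list-utils)`` source), a ``distinct?`` with the behavior described here can be written as:

```scheme
;; #t if no two elements of lst are equal according to the eq? predicate;
;; memp is the predicate-based member procedure from (rnrs lists)
(define (distinct? eq? lst)
  (or (null? lst)
      (and (not (memp (lambda (y) (eq? (car lst) y)) (cdr lst)))
           (distinct? eq? (cdr lst)))))
```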
+ +Here are a couple of test cases for ``distinct?``: + +$$TEST-DISTINCT + +The problem with the evaluation semantics is that it is natural, when +writing the code, to define first the function and then the macro, and +to try things at the REPL. Here everything works; in some +Scheme implementations, like Ypsilon, this will also work as a +script, unless the strict R6RS-compatibility flag is set. +However, in R6RS-conforming implementations, if you cut and paste +from the REPL and convert it into a script, you will run into +an error! + +The problem is due to the fact that in the compiler semantics macro +definitions and function definitions happen at *different times*. In +particular, macro definitions are taken into consideration *before* +function definitions, independently of their relative position in +the source code. Therefore our example fails to compile since the +``assert-distinct`` macro makes use of the ``distinct?`` function +which is *not yet defined* at the time the macro is considered, +i.e. at expansion time. Actually, functions are not evaluated +at expansion time, since functions are first class values and the right +hand side of any definition is left unevaluated by the compiler. +As we saw in the previous episode, both ``(define x (/ 1 0))`` and +``(define (f) (/ 1 0))`` (i.e. ``(define f (lambda () (/ 1 0)))``) are +compiled correctly but not evaluated until runtime; therefore +both ``x`` and ``f`` cannot be used inside a macro. + +*The only portable way to make +available at expand time a function defined at runtime is to +define the function in a different module and to import it at +expand time*. + +Phase separation +-------------------------------------------------------------- + +I have put ``distinct?`` in the ``(aps list-utils)`` +module, so that you can import it.
This is enough to solve the +problem for Ikarus, which has no concept of *phase separation*, but it is not +enough for PLT Scheme or Larceny, which have full *phase separation*. +In other words, in Ikarus (but also Ypsilon, IronScheme and Mosh) +the following script + +$$assert-distinct: + +is correct, but in PLT Scheme and Larceny it raises an error:: + + $ plt-r6rs assert-distinct.ss + assert-distinct.ss:5:3: compile: unbound variable in module + (transformer environment) in: distinct? + +.. image:: salvador-dali-clock.jpg + +The problem is that PLT Scheme has *strong phase separation*: by default +names defined in external modules are imported *only* at runtime, *not* +at compile time. In some sense this is absurd, since +names defined in external pre-compiled modules +are of course known at compile time +(this is why Ikarus has no trouble importing them at compile time); +nevertheless PLT Scheme and Larceny force you to specify at +which phase the functions must be imported. Notice that personally I +do not like the PLT and Larceny semantics since it makes things more +complicated, and that I prefer the Ikarus semantics: +nevertheless, if you want to write +portable code, you must use the PLT/Larceny semantics, which is the +one blessed by the R6RS document. + +If you want to import a few auxiliary functions +at expansion time (the time when macros are processed; often +incorrectly used as a synonym for compilation time) you must +use the ``(for expand)`` form: + +``(import (for (only (aps list-utils) distinct?) expand))`` + +With this import form, the script is portable in all R6RS implementations. + +Discussion +------------------------------------------------- + +Is phase separation a good thing? +In my opinion, from the programmer's point of view, the simplest thing +is complete lack of phase separation, and interpreter semantics, in +which everything happens at runtime.
+If you look at it with honesty, in the end the compiler semantics is +nothing else than a *performance hack*: by separating compilation time +from runtime you can perform some computation only once, at compilation time, +and gain performance. Moreover, in this way you can make cross compilation +easier. Therefore the compiler semantics has practical advantages and +I am willing to cope with it, even if conceptually I still prefer the +straightforwardness of interpreter semantics. +Moreover, there are (non-portable) tricks to define helper functions +at expand time without the need to move them into a separate module, therefore +compiler semantics is not so unbearable. + +The thing I really dislike is full phase separation. But a full discussion +of the issues related to phase separation will require a whole episode. +See you next week! + +Therefore, if you have a compiled version of Scheme, +it makes sense to separate compilation time from runtime, and to +expand macros *before* compiling the helper functions (in the absence of +phase separation, macros are still expanded before running any runtime +code, but *after* recognizing the helper functions). +Notice that Scheme has a concept of *macro expansion time* which is +valid even for interpreted implementations when there is no compilation +time. The `expansion process`_ of Scheme source code is specified in +the R6RS. + +There is still the question whether strong phase separation is a good thing, +or whether weak phase separation (as in Ikarus) is enough. For the programmer +weak phase separation is easier, since he does not need to specify +the phase in which he wants to import names. Strong phase separation +has been introduced so that at compile time you can use a language which is +completely different from the language you use at runtime. In particular +you could decide to use in macros a subset of the full R6RS language.
+ +Suppose for instance you are a teacher, and you want to force your +students to write their macros using only a functional subset of Scheme. +You could then import at compile time all R6RS procedures except the +nonfunctional ones (like ``set!``), while still importing the whole +R6RS at runtime. You could even do the opposite, and remove ``set!`` +from the runtime, while allowing it at compile time. + +Therefore strong phase separation is strictly more powerful than weak +phase separation, since it gives you more control. In Ikarus, when +you import a name in your module, the name is imported in all phases, +and there is nothing you can do about it. For instance this program +in Ikarus (but also IronScheme, Ypsilon, MoshScheme) .. code-block:: scheme + (import (rnrs) (for (only (aps list-utils) distinct?) expand)) + (display distinct?) + +runs, contrary to what one would expect, because it is impossible +to import the name ``distinct?`` at expand time and not at runtime. +In PLT Scheme and Larceny, instead, the program will not run, as you +would expect. + +You may think that the R6RS document is schizophrenic, since it +accepts both implementations with phase separation and without +phase separation. The previous program is *conforming* R6RS code, but +behaves *differently* in R6RS-compliant implementations! + +Using the semantics without phase separation results in +non-portable code. Here a bold decision was required to ensure +portability: to declare the PLT semantics as the only acceptable one, +or to declare the Dybvig-Ghuloum semantics as the only acceptable one. + +De facto, the R6RS document is the result +of a compromise between the partisans of phase separation +and absence of phase separation. + + +On the other hand, strong phase separation makes everything more complicated: +it is somewhat akin to the introduction of multiple namespaces, because +the same name can be imported in a given phase and not in another, +and that can lead to confusion.
To contain the confusion, the R6RS +document states that *the same name cannot be used in different phases +with different meanings in the same module*. +For instance, if the identifier ``x`` is bound to the +value ``v`` at compilation time and ``x`` is also defined +at runtime, ``x`` must be bound to ``v`` at runtime too. However, it +is possible to have ``x`` bound at runtime and not at compile time, or +vice versa. This is a compromise, since PLT Scheme in non R6RS-compliant mode +can use different bindings for the same name at different phases. + +There are people in the Scheme community thinking that strong phase +separation is a mistake, and that weak phase separation is the right thing +to do. On the other side, people (especially from the PLT community, where +all this originated) sing the virtues of strong phase separation and say +all good things about it. Personally, I have not seen a compelling +use case for strong phase separation yet. +On the other hand, I am well known for preferring simplicity over +(unneeded) power. + +.. _expansion process: http://www.r6rs.org/final/html/r6rs/r6rs-Z-H-13.html#node_chap_10 - (def-syntax Book - (syntax-match (new ref set! title author) - (sub (ctx <name>) #''Book) - (sub (ctx <fields>) #'(list 'title 'author)) - (sub (ctx from-list ls) #'(list->vector ls)) - (sub (ctx new arg ...) #'(vector arg ...)) - (sub (ctx v ref title) #'(vector-ref v 0)) - (sub (ctx v ref author) #'(vector-ref v 1)) - (sub (ctx v set! title x) #'(vector-set! v 0 x)) - (sub (ctx v set! author x) #'(vector-set! v 1 x)))) - -From this expansion it is clear how ``Book`` works. For instance, - -.. code-block:: scheme - - > (define b (Book new "Title" "Author")) - -defines a vector of two strings: - -.. code-block:: scheme - - > b - #("Title" "Author") - -``(Book b ref title)`` retrieves the ``title`` field whereas -``(Book b ref author)`` retrieves the ``author`` field: - -..
code-block:: scheme - - > (Book b ref title) - "Title" - > (Book b ref author) - "Author" - -``(Book b set! title new-title)`` and ``(Book b set! author new-author)`` -allow you to change the ``title`` and ``author`` fields. -It is also possible to convert a list into a ``Book`` vector: - - > (Book from-list '("t" "a")) - #("t" "a") +|# -Finally, the ``Book`` macro provides introspection features: +(import (rnrs) (sweet-macros) (for (aps list-utils) expand) + (for (aps lang) expand run) (aps compat) (aps easy-test)) -.. code-block:: scheme +;;ASSERT-DISTINCT +(def-syntax (assert-distinct arg ...) + #'(#f) + (distinct? free-identifier=? #'(arg ...)) + (syntax-violation 'assert-distinct "Duplicate name" #'(arg ...))) +;;END - > (Book <name>) - Book - > (Book <fields>) - (title author) - -The secret of the ellipsis ------------------------------------------------------------------ - -.. _case: http://www.r6rs.org/final/html/r6rs/r6rs-Z-H-14.html#node_idx_384 -.. _Arc: http://www.paulgraham.com/arcll1.html - -A two-level syntax -------------------------------------------------------------- - -Parens-haters may want to use ``collecting-pairs`` and the colon macro -to avoid parentheses. They may even go further, and rant that the -basic Scheme syntax should require fewer parentheses, since for -most programmers it is easier to write code with fewer parentheses. -However, the Scheme philosophy favors automatic code generation -over manual writing. For instance, when writing macros, it is much easier -to use a conditional with more parentheses like ``cond`` than a -conditional with fewer parentheses like ``cond-``. The parentheses -allow you to group expressions in groups that can be repeated via -the ellipsis symbol; in practice, you can write things like -``(cond (cnd? do-this ...) ...)`` which cannot be written -with ``cond-``. - -On the other hand, different languages adopt different philosophies; -for instance Paul Graham's Arc_ uses fewer parentheses. This is -
This is -possible since it does not provide a macro system based on -pattern matching (which is a big *minus* in my opinion). Is it possible -to have both? A syntax with few parenthesis for writing code manually -and a syntax with many parenthesis for writing macros. The answer is yes: -the price to pay is to double the constructs of the language and to -use a Python-like approach. - -Python is a perfect example of language with a two-level syntax: a -simple syntax, limited but able to cover the most common case, and a -fully fledged syntax, giving all the power which is needed, which -however should be used only rarely. The best designed programming -language I know is Python. While not perfect, Python takes full -advantage of the two-level syntax idea. For instance - -==================== ================================= -Simplified syntax Full syntax -==================== ================================= -obj.attr getattr(obj, 'attr') -x + y x.__add__(y) -c = C() c = C.__new__(C); c.__init__() -==================== ================================= - -In the case of the conditional syntax, in principle we could have -a fully parenthesised ``__cond__`` syntax for usage in macros and -``cond`` syntax with less parens for manual usage. That, in theory: -in practice Scheme only provides the low level syntax, leaving to -the final user the freedom (and the burden) of implementing his -own preferred high level syntax. +;(assert-distinct a b a) -|# +;;DEF-BOOK +(def-syntax (def-book name title author) + (: with-syntax + name-title (identifier-append #'name "-title") + name-author (identifier-append #'name "-author") + #'(begin + (define name (vector title author)) + (define name-title (vector-ref name 0)) + (define name-author (vector-ref name 1))))) -(import (rnrs) (sweet-macros) (for (aps lang) run expand) - (aps easy-test) (for (aps list-utils) run expand) (aps compat)) - -;;DEF-VECTOR-TYPE -(def-syntax (def-vector-type name (field-name checker?) ...) 
- (with-syntax (((i ...) (range (length #'(field-name ...))))) - #'(begin - (define (check-all vec) - (vector-map - (lambda (check? field arg) - (if (check? arg) arg (error 'name "TypeError" field arg))) - (vector checker? ...) (vector 'field-name ...) vec)) - (def-syntax name - (syntax-match (check <name> fields new ref set! field-name ...) - (sub (ctx check vec) #'(check-all vec)) - (sub (ctx <name>) #''name) - (sub (ctx fields) #'(list 'field-name ...)) - (sub (ctx from-list ls) #'(check-all (list->vector ls))) - (sub (ctx new arg (... ...)) #'(ctx from-list (list arg (... ...)))) - (sub (ctx v ref field-name) #'(vector-ref v i)) ... - (sub (ctx v set! field-name x) #'(vector-set! v i x)) ... - )))) - (distinct? free-identifier=? #'(field-name ...))) ;;END +(pretty-print (syntax-expand (def-book bible "The Bible" "God"))) -;;BOOK -(def-vector-type Book (title string?) (author string?)) -;;END -(display (Book <name>)) (newline) - -(pretty-print (syntax-expand - (def-vector-type Book (title string?) (author string?)))) - -;;COLLECTING-PAIRS -(def-syntax collecting-pairs - (syntax-match () - (sub (collecting-pairs (name arg ...) x1 x2 ...) - #'(collecting-pairs "helper" (name arg ...) () x1 x2 ...)) - (sub (collecting-pairs "helper" (name arg ...) (acc ...)) - #'(name arg ... acc ...)) - (sub (collecting-pairs "helper" (name arg ...) (acc ...) x) - #'(syntax-violation 'name "Mismatched pairs" '(name arg ... acc ... x) 'x)) - (sub (collecting-pairs "helper" (name arg ...) (acc ...) x1 x2 x3 ...) - #'(collecting-pairs "helper" (name arg ...) (acc ... (x1 x2)) x3 ...)) - )) + ;;TEST-DEF-BOOK + (test "def-book" + (let () + (def-book bible "The Bible" "God") + (list bible-title bible-author)) + (list "The Bible" "God")) + ;;END + + +;;ALIST2 + (def-syntax (alist2 arg ...) + (: with-syntax ((name value) ...) (normalize #'(arg ...)) + (if (for-all identifier? #'(name ...)) + #'(let* ((name value) ...) 
+ (list (list 'name name) ...)) + (syntax-violation 'alist "Found non identifier" #'(name ...) + (remp identifier? #'(name ...)))))) ;;END -;;TEST-COLON (run - (test "ok" - (: let* x 1 y x (+ x y)) - 2) -; (test "err" -; (catch-error (: let* x 1 y x z (+ x y))) -; "Odd number of arguments") - - (test "nv1" - (let () - (define b (Book new "T" "A")) - (Book b ref title)) - "T") + + ;;TEST-DISTINCT + (test "distinct" + (distinct? eq? '(a b c)) + #t) + + (test "not-distinct" + (distinct? eq? '(a b a)) + #f) + ;;END + + (let ((a 1)) + (test "mixed" + (alist2 a (b (* 2 a))) + '((a 1) (b 2)))) ) -;;END diff --git a/artima/scheme/scheme21.ss b/artima/scheme/scheme21.ss index e0fe3e7..513ce49 100644 --- a/artima/scheme/scheme21.ss +++ b/artima/scheme/scheme21.ss @@ -1,288 +1,210 @@ -#|Syntax objects +#|More on phase separation =================================================================== -In the last dozen episodes I have defined plenty of macros, but I have -not really explained what macros are and how they work. This episode -will close the gap, and will explain the true meaning of macros by -introducing the concepts of *syntax object* and of *transformer* over -syntax objects. +In this episode I will discuss in detail the trickiness +associated with the concept of phase separation. -Syntax objects ------------------------------------------------------------------- +More examples of macros depending on helper functions +----------------------------------------------------------------- -Scheme macros are built on the concept of *syntax object*. -The concept is peculiar to Scheme and has no counterpart in other -languages (including Common Lisp), therefore it is worth spending some time -on it.
- -A *syntax-object* is a kind of enhanced *s*-expression: it contains -the source code as a list of symbols and primitive values, plus -additional information, such as -the name of the file containing the source code, the line numbers, -a set of marks to distinguish identifiers according to their -lexical context, and more. - -It is possible to convert a name or a literal value into a -syntax object with the syntax quoting operation, i.e. the funny -``#'`` symbol you have seen in all the macros I have defined until now:: - - > #'x ; convert a name into an identifier - #<syntax x> - > #''x ; convert a literal symbol - #<syntax 'x> - > #'1 ; convert a literal number - #<syntax 1> - > #'"s" ; convert a literal string - #<syntax "s"> - > #''(1 "a" 'b) ; convert a literal data structure - #<syntax '(1 "a" 'b)> - -Here I am running all my examples under Ikarus; your Scheme -system may have a slightly different output representation for syntax -objects. - -In general ``#'`` - also spelled ``(syntax )`` - can be "applied" -to any expression:: - - > (define syntax-expr #'(display "hello")) - > syntax-expr - #<syntax (display "hello")> - -It is possible to extract the *s*-expression underlying the -syntax object with the ``syntax->datum`` primitive:: - - > (equal? (syntax->datum syntax-expr) '(display "hello")) - #t - -Different syntax-objects can be equivalent: for instance -the improper list of syntax objects ``(cons #'display (cons #'"hello" #'()))`` -is equivalent to the syntax object ``#'(display "hello")`` in -the sense that both correspond to the same datum:: - - > (equal?
(syntax->datum (cons #'display (cons #'"hello" #'()))) - (syntax->datum #'(display "hello"))) - #t - -The ``(syntax )`` macro is analogous to the ``(quote )`` macro; -moreover, there is a ``quasisyntax`` macro denoted with ``#``` which -is analogous to the ``quasiquote`` macro (`````) and, in analogy to -the operations ``,`` and ``,@`` on regular lists, there are two -operations ``unsyntax`` ``#,`` (*sharp comma*) and ``unsyntax-splicing`` -``#,@`` (*sharp comma splice*) on lists (including improper lists) of -syntax objects. - -Here is an example using sharp-comma:: - - > (let ((user "michele")) #`(display #,user)) - (#<syntax display> "michele" . #<syntax ()>) - -and here is an example using sharp-comma-splice:: - - > (define users (list #'"michele" #'"mario")) - > #`(display (list #,@users)) - (#<syntax display> - (#<syntax list> #<syntax "michele"> #<syntax "mario">) . #<syntax ()>) - -Notice that the output is an improper list. This is somewhat consistent -with the behavior of usual quoting: for usual quoting ``'(a b c)`` -is a shortcut for ``(cons* 'a 'b 'c '())``, which is a proper list, -and for syntax-quoting ``#'(a b c)`` is equivalent to -``(cons* #'a #'b #'c #'())``, which is an improper list. -The ``cons*`` operator here is an R6RS shortcut for nested conses: -``(cons* w x y z)`` is the same as ``(cons w (cons x (cons y z)))``. - -However, the result of a quasiquote interpolation is very much -*implementation-dependent*: Ikarus returns an improper list, but other -implementations return different results; for instance Ypsilon -returns a proper list of syntax objects whereas PLT Scheme returns -an atomic syntax object. The lesson is that you cannot -rely on properties of the inner representation of syntax objects: -what matters is the code they correspond to, i.e. the result of -``syntax->datum``.
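Since the series often uses Python as a reference point, the ``cons*`` shortcut just described can be modelled there too; the following is a hypothetical sketch (not code from the original series) using 2-tuples as cons cells:

```python
# Hypothetical sketch: cons cells modelled as Python 2-tuples, to
# illustrate the cons* shortcut described above.
def cons(a, d):
    return (a, d)

def cons_star(*args):
    # (cons* w x y z) is the same as (cons w (cons x (cons y z)))
    *init, last = args
    result = last
    for item in reversed(init):
        result = cons(item, result)
    return result

# A proper list ends in the empty list, modelled here as ():
proper = cons_star('a', 'b', 'c', ())    # ('a', ('b', ('c', ())))
# An improper list ends in something else:
improper = cons_star('w', 'x', 'y', 'z') # ('w', ('x', ('y', 'z')))
```

As in Scheme, the result is a proper list only when the last argument is the empty list.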
- -It is possible to promote a datum to a syntax object with the -``datum->syntax`` procedure, but in order -to do so you need to provide a lexical context, which can be specified -by using an identifier:: - - > (datum->syntax #'dummy-context '(display "hello")) - #<syntax (display "hello")> - -(the meaning of the lexical context in ``datum->syntax`` is tricky and -I will go back to that in future episodes). - -What ``syntax-match`` really is -------------------------------------------------------------- +In the previous episode I have shown an example of a macro +(``assert-distinct``) depending on a helper function (``distinct?``) +appearing in a guarded pattern. This is not the only example of +macros depending on an external function; actually, external +functions appear more commonly in conjunction with the R6RS standard +form ``with-syntax``, which +allows one to introduce auxiliary pattern variables +(a better name would have been ``let-pattern-vars``). +Here is a simple example of usage of ``with-syntax`` in conjunction +with the ``range`` function we introduced in episode 5_, to define +a compile-time indexer macro: -``syntax-match`` is a general utility to perform pattern matching -on syntax objects; it takes a syntax object in input and returns -another syntax object in output, depending on the patterns, skeletons and guards -used:: +$$INDEXER-SYNTAX - > (define transformer - (syntax-match () - (sub (name . args) #'name))); return the name as a syntax object +Since both ``range`` and ``distinct?`` are defined in the list-utils +module and used at compile-time, everything works if you use +the import form ``(for (aps list-utils) expand)``.
- > (transformer #'(a 1 2 3)) - #<syntax a> +You can understand how the macro works by expanding it; for instance +``(indexer-syntax a b c)`` expands into a macro transformer that +associates an index from 0 to 2 to each literal identifier from +``a`` to ``c``: -For convenience, ``syntax-match`` also accepts a second syntax -``(syntax-match x (lit ...) clause ...)`` to match syntax expressions -directly; this is more convenient than using -``((syntax-match (lit ...) clause ...) x)``. -Here is a simple example of usage:: +.. code-block:: scheme - > (syntax-match #'(a 1 2 3) () - (sub (name . args) #'args)); return the args as a syntax object - #<syntax (1 2 3)> + > (syntax-expand (indexer-syntax a b c)) + (syntax-match (a b c) + (sub (ctx a) #'0) + (sub (ctx b) #'1) + (sub (ctx c) #'2)) -Here is an example using ``quasisyntax`` and ``unsyntax-splicing``:: +The ``with-syntax`` form introduces the list of pattern variables ``(i ...)`` +corresponding to the list ``(0 1 2)`` generated by the ``range`` function +from the number of arguments, i.e. the length of the ``(name ...)`` list. +The guarded pattern also checks that the names are distinct. +Thus, the following test passes: - > (syntax-match #'(a 1 2 3) () - (sub (name . args) #`(name #,@#'args))) - (#<syntax a> #<syntax 1> #<syntax 2> #<syntax 3>) +$$TEST-INDEXER-SYNTAX -As you see, it is easy to write hieroglyphs if you use ``quasisyntax`` -and ``unsyntax-splicing``. You can avoid that by means of the ``with-syntax`` -form introduced in episode XX:: +The basic feature of the indexer is that ``i`` is a macro and therefore literal +identifiers are turned into integers at compile time, *without runtime +penalty*. On the other hand, if you want to turn symbols known only +at runtime into integers, the previous approach does not work, and +you can define a runtime indexer as follows: - > (syntax-match #'(a 1 2 3) () - (sub (name . args) (: with-syntax (a ...
#'args #'(name a ...)))) - (#<syntax a> #<syntax 1> #<syntax 2> #<syntax 3>) - +$$INDEXER -The pattern variables introduced by ``with-syntax`` -are automatically expanded inside the syntax template, without -resorting to the quasisyntax notation (i.e. there is no need for -``#``` ``#,`` ``#,@``). +Here the indexer is a simple function; the acceptable symbols +are specified (and checked) at expand time, but the dispatch +is performed at runtime. Here is a test: -Matching generic syntax lists ------------------------------------------------------------- +$$TEST-INDEXER -The previous paragraphs about syntax objects were a little abstract and -probably of unclear utility (but what would you expect from -an advanced macro tutorial? ;). Here I will be more -concrete and I will provide an example where -``syntax-match`` is used as a list matcher inside a bigger macro. -The final goal is to provide -a nicer syntax for association lists (an association list is just -a non-empty list of non-empty lists). The macro accepts a variable -number of arguments; every argument is of the form ``(name value)`` or -it is a single identifier: in the latter case it is -magically converted -into the form ``(name value)`` where ``value`` is the value of the -identifier, assuming it is bound in the current scope; otherwise -a run time error ``"unbound identifier"`` is raised. If you try to -pass an argument which is not of the expected form, a compile time -syntax error must be raised. -Concretely, the macro works as follows: +You may enjoy yourself with performance benchmarks comparing the macro +solution with the function solution; moreover you +can contrast this solution with the builtin way of R6RS Scheme +to build indexers as sets. -$$TEST-ALIST +The problem with auxiliary macros +------------------------------------------------------------------ -``(alist a (b (* 2 a)))`` would raise an error ``unbound identifier a``.
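As an aside for Python readers, the runtime ``indexer`` just described can be modelled in a language without phases, where the name-to-index table is simply built when the constructor runs; this is a hypothetical sketch, not code from the series:

```python
# Hypothetical Python model of the runtime indexer: the names are fixed
# and checked for duplicates up front, the dispatch happens at runtime.
def indexer(*names):
    if len(set(names)) != len(names):
        raise ValueError("Duplicate name")
    table = {name: i for i, name in enumerate(names)}
    return lambda name: table[name]

i = indexer('a', 'b', 'c')
print([i('a'), i('b'), i('c')])  # prints [0, 1, 2]
```

Since everything happens at runtime, Python has no counterpart of the compile-time ``indexer-syntax`` variant: there is no phase at which the dispatch could be resolved earlier.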
-Here is the implementation: +We said a few times that auxiliary functions are not available to macros +defined in the same module, but actually in general +there is the *same* problem for any identifier which is used in the +right hand side of a macro definition, *including auxiliary macros*. +For instance, we may regard the ``indexer-syntax`` macro +as an auxiliary macro, to be used in the right hand side of a +``def-syntax`` form. -$$ALIST +In systems with strong phase separation, like +PLT Scheme and Larceny, auxiliary macros +are not special, and they behave as auxiliary functions: you +must put them into a separate module and you must import them +with ``(for (only (module) helper-macro) expand)`` before using them. +This is why the following script -The expression ``#'(arg ...)`` expands to a list of syntax -objects which are then transformed by the ``syntax-match`` transformer, -which converts identifiers of the form ``n`` into couples of the form -``(n n)``, whereas it leaves couples ``(n v)`` unchanged, while also -checking that ``n`` is an identifier. +$$indexer-syntax: -Macros as list transformers -------------------------------------------------------------------- +fails in PLT Scheme: -Macros are in one-to-one correspondence with list transformers, i.e. every -macro is associated with a transformer which converts a list of syntax objects -(the arguments of the macro) into another list of syntax objects (the expansion -of the macro). Scheme itself takes care of converting the input code -into a list of syntax objects (if you wish, internally there is a -``datum->syntax`` conversion) and the output syntax list into code -(an internal ``syntax->datum`` conversion).
-The sharp-quote notation in macros is just an abbreviation for the underlying -list: for instance a macro describing function composition -:: + $ plt-r6rs indexer-syntax.ss + indexer-syntax.ss:9:0: def-syntax: bad syntax in: (def-syntax (indexer-syntax a b c)) - (def-syntax (app f g) - #'(f g)) + === context === + /usr/lib/plt/collects/rnrs/base-6.ss:492:6 -can equivalently be written as -:: +The problem is that in PLT and Larceny, the second ``def-syntax`` does +not see the binding for the ``indexer-syntax`` macro. - (def-syntax (app f g) - (list #'f #'g)) +This is a precise design choice: systems with strong phase +separation make life harder for programmers, +by forcing them to put auxiliary functions/macros/objects +in auxiliary modules, to keep absolute control on how the +names enter the different phases and to make it possible +to use different languages at different phases. -or +I have yet to see a convincing example of why keeping +different languages at different phases is worth +the annoyance. -:: +On the other hand, in systems with weak phase separation, +like Ikarus/IronScheme/Mosh/Ypsilon, +*there is no need to put auxiliary +macros in an external module.* The reason is that all macro +definitions are read at the same time, and the compiler knows +about the helper macros, so it can use them. The consequence +is that as long as the compiler reads macro definitions, it expands +the compile-time namespace of recognized names which are available +to successive syntax definitions. - (def-syntax (app f g) - (cons* #'f #'g #'())) +In Ikarus the script ``indexer-syntax.ss`` is +perfectly valid: the first syntax +definition adds a binding for ``indexer-syntax`` to the macro +namespace, so that it is seen by the second syntax definition. -The sharp-quoted syntax is more readable, but it hides the underlying list -representation which in some cases is pretty useful.
This is why -``syntax-match`` macros are much more powerful than ``syntax-rules`` -macros. -``sweet-macros`` provide a convenient feature: -it is possible to extract the associated -transformer for each macro defined via ``def-syntax``. For instance, -here is the transformer associated with the ``define-a`` macro: +Systems with strong phase separation instead are effectively using +different namespaces for each phase. -.. code-block:: scheme - > (define tr (define-a <transformer>)) - > (tr (list #'dummy #'1)) - (#<syntax define> #<syntax a> 1) +Implementing a first class module system +----------------------------------------- + +This is the same as implementing a Pythonic interface over hash tables +or association lists. -Notice that the name of the macro (in this case ``define-a``) is ignored -by the transformer, i.e. it is a dummy identifier. +Working around phase separation +-------------------------------------------------------------- +I have always hated being forced to put my helper functions in an +auxiliary module, because I often use auxiliary functions which +are intended to be used only once inside a given macro, thus +it makes sense to put those auxiliary functions *in the same +module* as the macros they are used in. +In principle you could solve the problem by defining all the +functions *inside* the macro, but I hate this, both for dogmatic +reasons (it is a Pythonista dogma that *flat is better than nested*) +and for pragmatic reasons, i.e. I want to be able to debug my +helper functions, and this is impossible if they are hidden inside +the macro. They must be available at the top level. Moreover, you +never know: a functionality which was intended for use in a specific +macro may turn out to be useful for another macro after all, and it is much +more convenient if the functionality is already encapsulated in +a nice exportable top level function.
+ +I am not the only one to believe that it should be possible to define +helper functions in the same module as the macro, and indeed +many Scheme implementations provide a way to do so via a +``define-for-syntax`` form which allows one to define functions +at *expand time*, so that they are available for usage in macros. + +If your Scheme does not provide ``define-for-syntax``, which is not +part of the R6RS specification, you can still work around phase +separation with some clever hack. For instance, you could +use the following macro: + +$$lang:literal-replace |# -(import (rnrs) (sweet-macros) (aps easy-test) (aps compat) - (for (aps list-utils) expand) (for (aps record-syntax) expand run)) - -;;ALIST -(def-syntax (alist arg ...) - (with-syntax (( - ((name value) ...) - (map (syntax-match () - (sub n #'(n n) (identifier? #'n)) - (sub (n v) #'(n v) (identifier? #'n))) - #'(arg ...)) )) - #'(let* ((name value) ...) - (list (list 'name name) ...)))) + +(import (rnrs) (sweet-macros) (for (aps list-utils) expand run) + (for (aps lang) expand) (aps compat) (aps easy-test)) + +;;INDEXER-SYNTAX +(def-syntax (indexer-syntax name ...) + (with-syntax (((i ...) (range (length #'(name ...))))) + #'(syntax-match (name ...) (sub (ctx name) #'i) ...)) + (distinct? free-identifier=? #'(name ...)) + (syntax-violation 'indexer-syntax "Duplicate name" #'(name ...))) ;;END -(def-syntax book (record-syntax title author)) -(pretty-print (syntax-expand (record-syntax title author))) (newline) -(define b (vector "T" "A")) -(display (list (book b title) (book b author))) ;; seems an Ypsilon bug -;since this works -;(def-syntax book -; (syntax-match (title author) -; (sub (ctx v title) (syntax (vector-ref v 0))) -; (sub (ctx v author) (syntax (vector-ref v 1))))) +;;INDEXER +(def-syntax (indexer name ...) + (with-syntax (((i ...) (range (length #'(name ...))))) + #'(lambda (x) (case x ((name) i) ...))) + (distinct? free-identifier=?
#'(name ...)) + (syntax-violation 'indexer "Duplicate name" #'(name ...))) +;;END -(display (syntax-expand (alist (a 1) (b (* 2 a))))) +(display (syntax-expand (indexer-syntax a b c))) (newline) (run - ;;TEST-ALIST - (test "simple" - (alist (a 1) (b (* 2 a))) - '((a 1) (b 2))) - - ;;END - ;(test "with-error" - ; (catch-error (alist2 (a 1) (2 3))) - ; "invalid syntax") +;;TEST-INDEXER-SYNTAX +(test "indexer-syntax" + (let () + (def-syntax i (indexer-syntax a b c)) + (list (i a) (i b) (i c))) + '(0 1 2)) +;;END +;;TEST-INDEXER +(test "indexer" + (let () + (define i (indexer a b c)) + (list (i 'a) (i 'b) (i 'c))) + '(0 1 2)) +;;END ) - - - diff --git a/artima/scheme/scheme25.ss b/artima/scheme/scheme25.ss deleted file mode 100644 index 04db2f7..0000000 --- a/artima/scheme/scheme25.ss +++ /dev/null @@ -1,345 +0,0 @@ -#|The evaluation strategy of Scheme programs -================================================ - -The Scheme module system is complex, because of the -complications caused by macros and because of the desire for -separate compilation and cross compilation. -However, fortunately, the complication -is hidden, and the module system works well enough for many -simple cases. The proof is that we introduced the R6RS module -system in episode 5_, and for 20 episodes we could go on safely -by just using the basic import/export syntax. However, once -nontrivial macros enter the game, things start to become -interesting. - -.. _5: http://www.artima.com/weblogs/viewpost.jsp?thread=239699 -.. _R6RS document: http://www.r6rs.org/final/html/r6rs-lib/r6rs-lib-Z-H-13.html#node_idx_1142 - -Interpreter semantics vs compiler semantics ------------------------------------------------------------------- - -One of the trickiest things about Scheme, coming from Python, is its -distinction between *interpreter semantics* and *compiler semantics*.
- -To understand the issue, let me first point out that having -interpreter semantics or compiler semantics has nothing to do with -being an interpreted or compiled language: both Scheme interpreters -and Scheme compilers exhibit both semantics. For instance, Ikarus, -which is a native code compiler, provides interpreter semantics at -the REPL, whereas Ypsilon, which is an interpreter, provides -compiler semantics in scripts, when the R6RS compatibility flag is set. - -In general the same program in the same implementation can be run both -with interpreter semantics (when typed at the REPL) and with compiler -semantics (when used as a library), but the way the program behaves is -different, depending on the semantics used. - -There is no such distinction in Python, which has only -interpreter semantics. In Python, everything happens at runtime, -including bytecode compilation (it is true that technically bytecode -compilation is cached, but conceptually you may very well think that -every module is recompiled at runtime, when you import it - which is -actually what happens if the module has changed in the -meanwhile). Since Python has only interpreter semantics there is no -substantial difference between typing commands at the REPL and writing -a script. - -Things are quite different in Scheme. The interpreter semantics is -*not specified* by the R6RS standard and it is completely -implementation-dependent. It is also compatible with the standard to -not provide interpreter semantics at all, and to not provide a REPL: -for instance PLT Scheme does not provide a REPL for R6RS programs. On -the other hand, the compiler semantics is specified by the R6RS -standard and is used in scripts and libraries. - -The two semantics are quite different. When a -program is read in interpreter semantics, everything happens at -runtime: it is possible to define a function and, immediately after, a -macro using that function.
Each expression entered is -compiled (possibly to native code as in Ikarus) and executed -immediately. Each new definition augments the namespace of known -names at runtime, both for first class objects and macros. Macros -are also expanded at runtime. - -When a program is read in compiler semantics instead, all the definitions -and the expressions are read, the macros are expanded and the program compiled, -*before* execution. Whereas an interpreter looks at a program one expression -at a time, a compiler looks at it as a whole: in particular, the order -of evaluation of expressions in a compiled program is unspecified, -unless you specify it by using a ``begin`` form. - -Let me notice that in my opinion having -an unspecified evaluation order is a clear case of premature -optimization and a serious mistake, but unfortunately this is the -way it is. The rationale is that in some specific circumstances -some compiler could take advantage of the unspecified evaluation order -to optimize the computation of some expression and run a few percent -faster, but this is certainly *not* worth the complication. - -Anyway, since the interpreter semantics is not specified by the R6RS -and thus very much implementation-dependent, I will focus on the -compiler semantics of Scheme programs. Such semantics is quite -tricky, especially when macros enter the game. - -Macros and helper functions --------------------------------------------------------------------- - -You can see the problem of compiler semantics once you start using macros -which depend on auxiliary functions. For instance, consider this -simple macro - -$$ASSERT-DISTINCT - -which raises a compile-time exception (syntax-violation) if it is -invoked with duplicate arguments. A typical use case for such a macro -is defining specialized lambda forms.
The -macro relies on the builtin function -``free-identifier=?``, which returns true when two identifiers -are equal and false otherwise (this is a simplified explanation; -let me refer you to the `R6RS document`_ for the gory details), and -on the helper function ``distinct?``, defined as follows: - -$$list-utils:DISTINCT? - -Here are a couple of test cases for ``distinct?``: - -$$TEST-DISTINCT - -The problem with the evaluation semantics is that it is natural, when -writing the code, to define first the function and then the macro, and -to try things at the REPL. Here everything works; in some -Scheme implementations, like Ypsilon, this will also work as a -script, unless the strict R6RS-compatibility flag is set. -However, in R6RS-conforming implementations, if you cut and paste -from the REPL and convert it into a script, you will run into -an error! - -The problem is due to the fact that in the compiler semantics macro -definitions and function definitions happen at *different times*. In -particular, macro definitions are taken into consideration *before* -function definitions, independently of their relative position in -the source code. Therefore our example fails to compile since the -``assert-distinct`` macro makes use of the ``distinct?`` function -which is *not yet defined* at the time the macro is considered, -i.e. at expansion time. Actually, functions are not evaluated -at expansion time, since functions are first class values and the right -hand side of any definition is left unevaluated by the compiler. -As we saw in the previous episode, both ``(define x (/ 1 0))`` and -``(define (f) (/ 1 0))`` (i.e. ``(define f (lambda () (/ 1 0)))``) are -compiled correctly but not evaluated until runtime, therefore -both ``x`` and ``f`` cannot be used inside a macro.
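For comparison with Python, which the series often uses as a reference point, the ``distinct?`` helper can be sketched as follows (a hypothetical translation, not code from the series, taking the equality predicate as a parameter like the Scheme original):

```python
# Hypothetical sketch of the distinct? helper: eq is the comparison
# predicate, items is the sequence to be checked for duplicates.
def distinct(eq, items):
    items = list(items)
    return all(not eq(x, y)
               for n, x in enumerate(items)
               for y in items[n + 1:])

print(distinct(lambda a, b: a == b, ['a', 'b', 'c']))  # prints True
print(distinct(lambda a, b: a == b, ['a', 'b', 'a']))  # prints False
```

Of course in Python there is no expansion time, so the question of the phase at which the helper is available does not even arise.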
- -*The only portable way to make -available at expand time a function defined at runtime is to -define the function in a different module and to import it at -expand time* - -Phase separation ------------------------------------------------------------- - -I have put ``distinct?`` in the ``(aps list-utils)`` -module, so that you can import it. This is enough to solve the -problem for Ikarus, which has no concept of *phase separation*, but it is not -enough for PLT Scheme or Larceny, which have full *phase separation*. -In other words, in Ikarus (but also Ypsilon, IronScheme and Mosh) -the following script - -$$assert-distinct: - -is correct, but in PLT Scheme and Larceny it raises an error:: - - $ plt-r6rs assert-distinct.ss - assert-distinct.ss:5:3: compile: unbound variable in module - (transformer environment) in: distinct? - -.. image:: salvador-dali-clock.jpg - -The problem is that PLT Scheme has *strong phase separation*: by default -names defined in external modules are imported *only* at runtime, *not* -at compile time. In some sense this is absurd, since -names defined in external pre-compiled modules -are of course known at compile time -(this is why Ikarus has no trouble importing them at compile time); -nevertheless PLT Scheme and Larceny force you to specify at -which phase the functions must be imported. Notice that personally I -do not like the PLT and Larceny semantics, since it makes things more -complicated, and that I prefer the Ikarus semantics: -nevertheless, if you want to write -portable code, you must use the PLT/Larceny semantics, which is the -one blessed by the R6RS document. - -If you want to import a few auxiliary functions -at expansion time (the time when macros are processed; often -incorrectly used as a synonym for compilation time) you must -use the ``(for expand)`` form: - -``(import (for (only (aps list-utils) distinct?) expand))`` - -With this import form, the script is portable in all R6RS implementations.
- -Discussion ------------------------------------------------- - -Is phase separation a good thing? -In my opinion, from the programmer's point of view, the simplest thing -is the complete lack of phase separation, and interpreter semantics, in -which everything happens at runtime. -If you look at it honestly, in the end the compiler semantics is -nothing else than a *performance hack*: by separating compilation time -from runtime you can perform some computation only once at compilation time -and gain performance. Moreover, in this way you can make cross compilation -easier. Therefore the compiler semantics has practical advantages and -I am willing to cope with it, even if conceptually I still prefer the -straightforwardness of interpreter semantics. -Moreover, there are (non-portable) tricks to define helper functions -at expand time without the need to move them into a separate module, therefore -compiler semantics is not so unbearable. - -The thing I really dislike is full phase separation. But a full discussion -of the issues related to phase separation will require a whole episode. -See you next week! - -Therefore, if you have a compiled version of Scheme, -it makes sense to separate compilation time from runtime, and to -expand macros *before* compiling the helper functions (in the absence of -phase separation, macros are still expanded before running any runtime -code, but *after* recognizing the helper functions). -Notice that Scheme has a concept of *macro expansion time* which is -valid even for interpreted implementations when there is no compilation -time. The `expansion process`_ of Scheme source code is specified in -the R6RS. - -There is still the question whether strong phase separation is a good thing, -or whether weak phase separation (as in Ikarus) is enough. For the programmer -weak phase separation is easier, since he does not need to specify -the phase in which he wants to import names.
Strong phase separation -has been introduced so that at compile time you can use a language which is -completely different from the language you use at runtime. In particular -you could decide to use in macros a subset of the full R6RS language. - -Suppose for instance you are a teacher, and you want to force your -students to write their macros using only a functional subset of Scheme. -You could then import at compile time all R6RS procedures except the -nonfunctional ones (like ``set!``), while importing the whole of R6RS -at runtime. You could even perform the opposite, and remove ``set!`` -from the runtime, while allowing it at compile time. - -Therefore strong phase separation is strictly more powerful than weak -phase separation, since it gives you more control. In Ikarus, when -you import a name in your module, the name is imported in all phases, -and there is nothing you can do about it. For instance this program -in Ikarus (but also IronScheme, Ypsilon, MoshScheme) - -.. code-block:: scheme - (import (rnrs) (for (only (aps list-utils) distinct?) expand)) - (display distinct?) - -runs, contrary to what one would expect, because it is impossible -to import the name ``distinct?`` at expand time and not at runtime. -In PLT Scheme and Larceny instead the program will not run, as you -would expect. - -You may think the R6RS document to be schizophrenic, since it -accepts both implementations with phase separation and without -phase separation. The previous program is *conforming* R6RS code, but -behaves *differently* in R6RS-compliant implementations! -Using the semantics without phase separation results in -non-portable code. Here a bold decision was required to ensure -portability: to declare the PLT semantics as the only acceptable one, -or to declare the Dybvig-Ghuloum semantics as the only acceptable one. - -De facto, the R6RS document is the result -of a compromise between the partisans of phase separation -and absence of phase separation.
- - -On the other hand strong phase separation makes everything more complicated: -it is somewhat akin to the introduction of multiple namespaces, because -the same name can be imported in a given phase and not in another, -and that can lead to confusion. To contain the confusion, the R6RS -document states that *the same name cannot be used in different phases -with different meaning in the same module*. -For instance, if the identifier ``x`` is bound to the -value ``v`` at compilation time and ``x`` is defined also -at runtime, ``x`` must be bound to ``v`` also at runtime. However, it -is possible to have ``x`` bound at runtime and not at compile time, or -vice versa. This is a compromise, since PLT Scheme in non R6RS-compliant mode -can use different bindings for the same name at different phases. - -There are people in the Scheme community who think that strong phase -separation is a mistake, and that weak phase separation is the right thing -to do. On the other side people (especially from the PLT community, where -all this originated) sing the virtues of strong phase separation and say -all good things about it. Personally, I have not seen a compelling -use case for strong phase separation yet. -On the other hand, I am well known for preferring simplicity over -(unneeded) power. - -.. _expansion process: http://www.r6rs.org/final/html/r6rs/r6rs-Z-H-13.html#node_chap_10 - -|# - -(import (rnrs) (sweet-macros) (for (aps list-utils) expand) - (for (aps lang) expand run) (aps compat) (aps easy-test)) - -;;ASSERT-DISTINCT -(def-syntax (assert-distinct arg ...) - #'(#f) - (distinct? free-identifier=?
#'(arg ...)) - (syntax-violation 'assert-distinct "Duplicate name" #'(arg ...))) -;;END - -;(assert-distinct a b a) - -;;DEF-BOOK -(def-syntax (def-book name title author) - (: with-syntax - name-title (identifier-append #'name "-title") - name-author (identifier-append #'name "-author") - #'(begin - (define name (vector title author)) - (define name-title (vector-ref name 0)) - (define name-author (vector-ref name 1))))) - -;;END -(pretty-print (syntax-expand (def-book bible "The Bible" "God"))) - - - ;;TEST-DEF-BOOK - (test "def-book" - (let () - (def-book bible "The Bible" "God") - (list bible-title bible-author)) - (list "The Bible" "God")) - ;;END - - -;;ALIST2 - (def-syntax (alist2 arg ...) - (: with-syntax ((name value) ...) (normalize #'(arg ...)) - (if (for-all identifier? #'(name ...)) - #'(let* ((name value) ...) - (list (list 'name name) ...)) - (syntax-violation 'alist "Found non identifier" #'(name ...) - (remp identifier? #'(name ...)))))) -;;END - -(run - - ;;TEST-DISTINCT - (test "distinct" - (distinct? eq? '(a b c)) - #t) - - (test "not-distinct" - (distinct? eq? '(a b a)) - #f) - ;;END - - (let ((a 1)) - (test "mixed" - (alist2 a (b (* 2 a))) - '((a 1) (b 2)))) - ) - diff --git a/artima/scheme/scheme26.ss b/artima/scheme/scheme26.ss deleted file mode 100644 index 513ce49..0000000 --- a/artima/scheme/scheme26.ss +++ /dev/null @@ -1,210 +0,0 @@ -#|More on phase separation -=================================================================== - -In this episode I will discuss in detail the trickiness -associated with the concept of phase separation. - -More examples of macros depending on helper functions ------------------------------------------------------------------ - -In the previous episode I have shown an example of a macro -(``assert-distinct``) depending by a helper function (``distinct?``) -appearing in a guarded pattern. 
This is not the only example of -macros depending on an external function; actually, external -functions appear more commonly in conjunction with the R6RS standard -form ``with-syntax``, which -allows one to introduce auxiliary pattern variables -(a better name would have been ``let-pattern-vars``). -Here is a simple example of usage of ``with-syntax`` in conjunction -with the ``range`` function we introduced in episode 5_, to define -a compile-time indexer macro: - -$$INDEXER-SYNTAX - -Since both ``range`` and ``distinct?`` are defined in the list-utils -module and used at compile time, everything works if you use -the import form ``(for (aps list-utils) expand)``. - -You can understand how the macro works by expanding it; for instance -``(indexer-syntax a b c)`` expands into a macro transformer that -associates an index from 0 to 2 to each literal identifier from -``a`` to ``c``: - -.. code-block:: scheme - - > (syntax-expand (indexer-syntax a b c)) - (syntax-match (a b c) - (sub (ctx a) #'0) - (sub (ctx b) #'1) - (sub (ctx c) #'2)) - -The ``with-syntax`` form introduces the list of pattern variables ``(i ...)`` -corresponding to the list ``(0 1 2)`` generated by the ``range`` function -by looking at the number of arguments in the ``(name ...)`` list. -The guarded pattern also checks that the names are distinct. -Thus, the following test passes: - -$$TEST-INDEXER-SYNTAX - -The basic feature of the indexer is that ``i`` is a macro and therefore literal -identifiers are turned into integers at compile time, *without runtime -penalty*. On the other hand, if you want to turn symbols known only -at runtime into integers, the previous approach does not work, and -you can define a runtime indexer as follows: - -$$INDEXER - -Here the indexer is a simple function; the acceptable symbols -are specified (and checked) at expand time, but the dispatch -is performed at runtime. 
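To make the contrast concrete, here is a rough sketch (hypothetical standalone code, not part of the original script) of what the two flavours of indexer boil down to, based on the expansions shown above:

.. code-block:: scheme

   ;; compile-time flavour: (def-syntax i (indexer-syntax a b c))
   ;; makes i a macro, so (i b) is replaced by the literal 1
   ;; during expansion, with no dispatch at all at runtime.

   ;; runtime flavour: (define i (indexer a b c)) expands into an
   ;; ordinary closure dispatching on symbols when it is called:
   (define i
     (lambda (x)
       (case x ((a) 0) ((b) 1) ((c) 2))))

   (display (i 'b)) ; displays 1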
Here is a test: - -$$TEST-INDEXER - -You may amuse yourself with performance benchmarks comparing the macro -solution with the function solution; moreover, you -can contrast this solution with the built-in way of -building indexers as sets in R6RS Scheme. - -The problem with auxiliary macros ------------------------------------------------------------------- - -We have said a few times that auxiliary functions are not available to macros -defined in the same module; actually, in general -the *same* problem exists for any identifier which is used in the -right-hand side of a macro definition, *including auxiliary macros*. -For instance, we may regard the ``indexer-syntax`` macro -as an auxiliary macro, to be used in the right-hand side of a -``def-syntax`` form. - -In systems with strong phase separation, like -PLT Scheme and Larceny, auxiliary macros -are not special, and they behave as auxiliary functions: you -must put them into a separate module and you must import them -with ``(for (only (module) helper-macro) expand)`` before using them. -This is why the following script - -$$indexer-syntax: - -fails in PLT Scheme: - - - $ plt-r6rs indexer-syntax.ss - indexer-syntax.ss:9:0: def-syntax: bad syntax in: (def-syntax (indexer-syntax a b c)) - - === context === - /usr/lib/plt/collects/rnrs/base-6.ss:492:6 - - -The problem is that in PLT and Larceny, the second ``def-syntax`` does -not see the binding for the ``indexer-syntax`` macro. - -This is a precise design choice: systems with strong phase -separation make life harder for programmers, -by forcing them to put auxiliary functions/macros/objects -in auxiliary modules, to keep absolute control over how -names enter the different phases and to make it possible -to use different languages at different phases. - -I have yet to see a convincing example of why keeping -different languages at different phases is worth -the annoyance. 
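In code terms, the separate-module discipline required by strongly phase-separated systems would look roughly like the following sketch (the library name ``(my-helpers)`` is hypothetical):

.. code-block:: scheme

   ;; hypothetical file my-helpers.sls: the helper macro lives alone
   (library (my-helpers)
     (export indexer-syntax)
     (import (rnrs) (sweet-macros) (for (aps list-utils) expand))
     (def-syntax (indexer-syntax name ...)
       (with-syntax (((i ...) (range (length #'(name ...)))))
         #'(syntax-match (name ...) (sub (ctx name) #'i) ...))))

   ;; client script: import the helper for the expand phase only
   ;; (import (rnrs) (sweet-macros)
   ;;         (for (only (my-helpers) indexer-syntax) expand))
   ;; (def-syntax i (indexer-syntax a b c))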
- -On the other hand, in systems with weak phase separation, -like Ikarus/IronScheme/Mosh/Ypsilon, -*there is no need to put auxiliary -macros in an external module.* The reason is that all macro -definitions are read at the same time, and the compiler knows -about the helper macros, so it can use them. The consequence -is that, as long as the compiler reads macro definitions, it extends -the compile-time namespace of recognized names which are available -to successive syntax definitions. - -In Ikarus the script ``indexer-syntax.ss`` is -perfectly valid: the first syntax -definition adds a binding for ``indexer-syntax`` to the macro -namespace, so that it is seen by the second syntax definition. - - -Systems with strong phase separation instead are effectively using -different namespaces for each phase. - - -Implementing a first class module system ------------------------------------------ - -This is the same as implementing a Pythonic interface over hash tables -or association lists. - -Working around phase separation --------------------------------------------------------------- - -I have always hated being forced to put my helper functions in an -auxiliary module, because I often use auxiliary functions which -are intended to be used only once inside a given macro, thus -it makes sense to put those auxiliary functions *in the same -module* as the macro they are used in. -In principle you could solve the problem by defining all the -functions *inside* the macro, but I hate this, both for dogmatic -reasons (it is a Pythonista dogma that *flat is better than nested*) -and for pragmatic reasons: I want to be able to debug my -helper functions, and this is impossible if they are hidden inside -the macro. They must be available at the top-level. 
Moreover, you -never know: a functionality which was intended for use in a specific -macro may turn out to be useful for another macro after all, and it is much -more convenient if the functionality is already encapsulated in -a nice exportable top-level function. - -I am not the only one to believe that it should be possible to define -helper functions in the same module as the macro, and actually -many Scheme implementations provide a way to do so via a -``define-for-syntax`` form which allows one to define functions -at *expand time*, so that they are available for use in macros. - -If your Scheme does not provide ``define-for-syntax``, which is not -part of the R6RS specification, you can still work around phase -separation with some clever hack. For instance, you could -use the following macro: - -$$lang:literal-replace -|# - -(import (rnrs) (sweet-macros) (for (aps list-utils) expand run) - (for (aps lang) expand) (aps compat) (aps easy-test)) - -;;INDEXER-SYNTAX -(def-syntax (indexer-syntax name ...) - (with-syntax (((i ...) (range (length #'(name ...))))) - #'(syntax-match (name ...) (sub (ctx name) #'i) ...)) - (distinct? free-identifier=? #'(name ...)) - (syntax-violation 'assert-distinct "Duplicate name" #'(name ...))) -;;END - - -;;INDEXER -(def-syntax (indexer name ...) - (with-syntax (((i ...) (range (length #'(name ...))))) - #'(lambda (x) (case x ((name) i) ...))) - (distinct? free-identifier=? #'(name ...)) - (syntax-violation 'assert-distinct "Duplicate name" #'(name ...))) -;;END - -(display (syntax-expand (indexer-syntax a b c))) (newline) - -(run - -;;TEST-INDEXER-SYNTAX -(test "indexer-syntax" - (let () - (def-syntax i (indexer-syntax a b c)) - (list (i a) (i b) (i c))) - '(0 1 2)) -;;END - -;;TEST-INDEXER -(test "indexer" - (let () - (define i (indexer a b c)) - (list (i 'a) (i 'b) (i 'c))) - '(0 1 2)) -;;END -)
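As a final sketch of the ``define-for-syntax`` idea mentioned above (the form is not part of the R6RS, and its exact name and semantics vary across implementations), the helper could live in the same module as the macro using it:

.. code-block:: scheme

   ;; hypothetical, non-R6RS: the helper is defined at expand time,
   ;; in the same module as the macro that uses it
   (define-for-syntax (range n) ; same role as the list-utils helper
     (let loop ((i 0) (acc '()))
       (if (= i n) (reverse acc) (loop (+ i 1) (cons i acc)))))

   (def-syntax (indexer name ...)
     (with-syntax (((i ...) (range (length #'(name ...)))))
       #'(lambda (x) (case x ((name) i) ...))))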