summaryrefslogtreecommitdiff
path: root/artima
diff options
context:
space:
mode:
authormichele.simionato <devnull@localhost>2009-06-29 15:26:59 +0000
committermichele.simionato <devnull@localhost>2009-06-29 15:26:59 +0000
commitd8e3dcdb814db0c6c11e85cf6961a6691c3d8d00 (patch)
treeddc8860a7cf908f51fcbc0789fde2c3ead8b2ab1 /artima
parent3e172339013209f767e655a44d601766d5bfd9b5 (diff)
downloadmicheles-d8e3dcdb814db0c6c11e85cf6961a6691c3d8d00.tar.gz
Lot of work on scheme30
Diffstat (limited to 'artima')
-rw-r--r--artima/scheme/scheme30.ss506
1 files changed, 299 insertions, 207 deletions
diff --git a/artima/scheme/scheme30.ss b/artima/scheme/scheme30.ss
index 7bae815..231c8d9 100644
--- a/artima/scheme/scheme30.ss
+++ b/artima/scheme/scheme30.ss
@@ -1,264 +1,356 @@
-#|Runtime pattern matching and object oriented APIs
-=======================================================
+#|Comparing identifiers
+=================================================
-As a final example of macros power, this episode will implement a syntax
-for full runtime pattern matching of lists .
+Equality of identifiers is one of the trickiest things in R6RS
+Scheme, since the meaning of identifier equality in a lexically
+scoped language with hygienic macros is non-obvious.
-.. _13: http://www.artima.com/weblogs/viewpost.jsp?thread=248953
-.. _15: http://www.artima.com/weblogs/viewpost.jsp?thread=249681
-.. _association lists:
+Identifier equality is a compile-time
+concept which has nothing to do with the run-time concept of equality
+between variables: two variables are equal if their underlying values
+are equal, but the value to which a variable is bound is only known
+at run-time, therefore identifier equality cannot rely on it.
-list-match
-----------------------------------------------------------------
+Identifiers are not variables: they are syntax objects with an
+underlying symbol and an underlying lexical context, which are known
+statically at compile time. It is possible to know if an
+identifier is bound or not at compile-time, but the value the
+identifier will be bound to at run-time is unknown.
-In episode 15_ I have discussed a very limited form of pattern
-matching, i.e. list destructuring. There is more to
-pattern matching than that.
+The Scheme standard define two different equality procedures
+for identifiers (``bound-identifier=?`` and ``free-identifier=?``,
+with absolutely misleading names); to those, I will add a third one,
+``symbol-identifier=?``.
-A general list matcher should be able to
-destructure a list, whatever complicate it may be, and to perform
-*different* actions for different structures. In this sense, pattern
-matching is a kind of if statement on steroids.
+This episode will be spent discussing the subtilities of identifier
+equality.
-The Scheme standard does not provide a default list matcher, but there
-are tons of libraries providing such a feature, and in the appendix
-we will provide an implementation of ``list-match`` ourselves, so
-let us assume we have list matcher accepting the following patterns:
+.. image:: Love_equality.png
-.. code-block:: scheme
+``symbol-identifier=?``
+-----------------------------------------------
- > (list-match <patterns>)
- ((list-match lst (sub pattern action guard ...) ...))
+The simplest concept of identifier equality is expressed by
+the following ``symbol-identifier=?`` comparison function
+(for convenience, I have added the ``symbol-identifier=?``
+precedure in the ``(aps lang)`` library):
-Given an object (which usually is a list), the matcher compares
-it with various patterns, and perform different actions depending
-on the match, which can be further restricted with a guard.
-Here are a few trivial tests:
+$$lang:SYMBOL-IDENTIFIER=?
-$$MATCH-TESTS
+Two identifiers are ``symbol-identifier=?`` if they are equal as
+symbols.
-As usual, the understore ``_`` means "match but do not bind.
-Here are two further tests, that makes use of the following helper function:
+``symbol-identifier=?`` can
+be used to manage duplicate names in macros defining multiple
+identifiers at once, or in macros defining name->value tables, such as
+the ``static-map`` I discussed in episode 22. Moreover,
+``symbol-identifier=?`` can be used to reject reserved
+identifiers (you may need such
+functionality if are buildin a mini-language on top of Scheme
+and you want to reject a few identifiers as language keywords), as
+in the following macro:
-$$MATCH-ODD
+$$CHECK-RESERVED
-``replace-odd`` makes use of guarded patterns to replace the odd
-numbers in a list of numbers with the string "odd":
+``(check-reserved id)`` will raise a syntax-violation if ``id``
+is one of the keyword ``reserved1`` or ``reserved2``.
-$$MATCH-ODD1
+``bound-identifier=?``
+-----------------------------------------------
-``replace-odd`` only recognizes numbers and lists: it chokes on strings
-or any other kind of object, by raising a ``pattern failure`` error:
+``symbol-identifier=?`` is simple and easy to understand, but it cannot
+be used in all situations. Consider for instance the very first macro
+I wrote, in episode 9_:
-$$MATCH-ODD2
+.. code-block:: scheme
-This example is simple, but it does not make justice to the power of
-pattern matching. The following example, however, will do (or at least
-I hope so): it explains how to implement object oriented APIs in terms
-of pattern matching functions.
+ (def-syntax (multi-define (name1 name2 ...) (value1 value2 ...))
+ #'(begin (define name1 value1) (define name2 value2) ...))
-Pattern matching vs message passing
--------------------------------------------------------
+It is quite common to write macros defining multiple bindings, such
+as ``multi-define``. There is also the common situation of
+macros which expand themselves to macros defining
+multiple bindings. When code is generated automatically, errors are
+more than likely and it makes sense to manage explicitly
+the situation when the list of names to be defined contains some
+duplicate. ``multi-define`` as written does not perform any check, so that it
+relies on the standard behavior of R6RS scheme, raising an error.
+Unfortunately, the standard behavior only applies to programs and scripts,
+whereas the REPL is quite free to behaves differently and indeed it does:
-A typical use case for
-full pattern matching is message passing: in functional languages you
-can send lists as messages to an object, and the object will respond
-differently depending on the structure of the list. This is the
-standard mechanism to send messages in Erlang, but can be done in any
-functional language with pattern matching.
+.. code-block:: scheme
-I you think about it for a minute, you will recognize that pattern
-matching makes the creation of object oriented APIs rather trivial.
-Let consider for instance Scheme `association lists`_, which by default
-have a functional API, and suppose we want to provide an object oriented
-interface on top on it, say a Python-like API. The task can be easily
-performed via pattern matching, as follows:
+ > (multi-define (a a) (1 2))
+ a
+ 2
-$$ALIST
+(in the REPL latter definitions override previous definitions). Being
+unhappy with that, we can introduce a ``bound-identifier=?`` check
+and raise a custom exception:
-Here we used the R6RS procedures ``(assq k a)`` to extract the slot
-corresponding to the key ``k`` from the association list ``a``, as
-well as the procedure ``set-cdr!`` described in episode 13_ to modify
-the ``cdr`` of a slot (imported from ``(rnrs mutable-pairs)``).
-Notice that I am using the term *slot* here to refer to
-the inner lists that compose an association list.
+$$MULTI-DEFINE
-Here are a few tests:
+Two identifiers are equal according to ``bound-identifier=?`` only if
+they have the same name *and* the same marks. The name is
+misleading since the arguments of ``bound-identifier=?`` are not
+required to be bound identifiers; a better name would be
+``strict-identifier=?``.
-$$ALIST-TESTS
+You can ``check`` that ``multi-define`` correctly reject duplicated
+identifiers:
-Appendix: implementation of ``list-match``
--------------------------------------------------------------
+.. code-block:: scheme
-The implementation below is not my own, I have adapted
-(i.e. shamelessly copied) it from Phil Bewig, which in turns copied it
-from Jos Koot, which maybe is the original author, or maybe he copied
-it from some other source. But attribution is not important, the
-important thing is the code:
+ > (multi-define (a a) (1 2))
+ Unhandled exception
+ Condition components:
+ 1. &who: multi-define
+ 2. &message: "Found duplicated identifier in"
+ 3. &syntax:
+ form: (a a)
+ subform: #f
-$$aps/list-match:
+You may also believe that using ``symbol-identifier=?`` would work too.
+However this is not the case, in general. Consider
+for instance the following macro expanding to ``multi-define``:
-The implementation above makes use of a few tricks which are well known
-to experienced Schemers, but may look unfamiliar to my readers.
+$$MULTI-DEFINE2
-First of all, we leverage on the fact that ``(and boolean expr)`` returns
-``expr`` if ``boolean`` is true and ``#f`` otherwise, i.e. it is equivalent
-to ``(if boolean expr #f)``. Then, we use the ``=>`` form of the conditional
-expression.
-As explained in the `cond specification`, a ``cond`` clause may
-have the form ``(value => func)``; when the value is not false,
-then the function is applied to it and ``cond`` returns the result,
-otherwise ``cond`` looks at the next clause. For instance
+The macro introduces a dummy identifier ``id2``.
+Had we defined ``multi-define`` in terms of ``symbol-identifier=?``,
+calling ``multi-define2`` with argument ``id`` equal to ``id2``
+would have generated a spurious name clash. Fortunately, since we defined
+``multi-define`` in terms of ``bound-identifier=?``, nothing bad happens:
.. code-block:: scheme
- > (cond (#f => any-function) ('(1) => car))
+ > (multi-define2 id2 1)
+ id2
1
-Notice that the underscore is a special identifier in macros
-(just as the ellipsis identifier ``...``` is special) and it
-cannot be used as a literal, this is why the trick
-``(free-identifier=? #'_ underscore)`` is used.
+``bound-identifier=?`` works in this case because
+the identifier ``id2`` introduced by the macro has
+different marks from the identifier ``id2`` coming as an argument.
+
+``free-identifier=?``
+------------------------------------
+
+``bound-identifier=?`` is not good for every circumstance. Consider
+for instance the following variation of ``multi-define``, featuring
+a literal keyword ``as``:
+
+$$MULTI-DEF
+
+This work, but the error messages could stand some improvement.
+For instance, if an user misspells the infix identifier ``as``,
+she gets a generic ``"invalid syntax"`` error, whereas we
+would like to provide a customized error message showing the misspelled
+literal identifier. Using ``bound-identifier=?`` we could try to
+solve the problem as follows:
+
+$$MULTI-DEF-BAD
+
+Unfortunately this solution does not work at all, since it raises
+an error even when the ``as`` identifiers are spelled correctly:
-I have exported the internal macro ``_match``, to make it
-possible to understand how the implementation works. ``_match`` is
-able to match a single pattern. For instance we have
+.. code-block:: scheme
+
+ > (multi-def-bad (x as y) (1 as 2))
+ Unhandled exception
+ Condition components:
+ 1. &who: multi-def-bad
+ 2. &message: "Offending infix syntax (expected `as')"
+ 3. &syntax:
+ form: (as as)
+ subform: #f
+
+The reason is that ``as`` is not ``bound-identifier=?``
+to ``#'as``. We need a less strict comparison predicate such as
+``symbol-identifier=?`` or ``free-identifier=?``.
+
+``free-identifier=?`` is
+the most complicated equality predicate. As
+usual, the name is misleading since the arguments of
+``free-identifier=?`` are not required to be free identifiers.
+A better name would be ``lax-identifier=?``.
+Two identifiers are ``free-identifier=?`` if
+
+1. they are both bound and they share the same name, or they shared
+ the same name before renaming at import time;
+2. they are both unbound and they share the same name.
+
+In implementations with full phase separation, the identifiers
+must also be both bound/unbound in the same phase.
+In all other cases the two identifiers are not ``free-identifier=?``.
+Here is an example:
+
+.. code-block:: scheme
+
+ > (import (only (aps list-utils) range))
+ > (import (rename (aps list-utils) (range r)))
+ > (free-identifier=? #'r #'range)
+ #t
+
+Notice that both ``symbol-identifier=?`` and ``bound-identifier=?``
+would fail to recognize the identity of ``range`` and ``r`` in this
+case.
+
+.. _9: http://www.artima.com/weblogs/viewpost.jsp?thread=240804
+
+Literal identifiers and auxiliary syntax
+-----------------------------------------------------
+
+Macros with literal identifiers may be surprising, because internally
+the literal identifiers are compared by using ``free-identifier=?``.
+Consider for instance the macro ``multi-def`` defined in the previous
+paragraph. This works:
.. code-block:: scheme
- > (_match '(1 2) (x y) (list (+ x y)))
- (3)
+ > (let ()
+ (multi-def (x as 1) (y as 2))
+ (list x y))
+ (1 2)
-Using ``syntax-expand`` we can see the inner working of ``_match``:
+But this does not work:
.. code-block:: scheme
- > (syntax-expand(_match '(1 2) (x y) (list (+ x y)) #t))
- (let ((ob '(1 2)))
- (and (pair? ob)
- (let ((kar-obj ob) (kdr-obj (cdr ob)))
- (_match kar-obj x
- (_match kdr-obj (y) (list (+ x y)) #t)))))
+ > (let ((as 2))
+ (multi-def (x as 1) (y as 2))
+ (list x y))
+ Unhandled exception
+ Condition components:
+ 1. &message: "invalid syntax"
+ 2. &syntax:
+ form: (multi-def (x as 1) (y as 2))
+ subform: #f
+ 3. &trace: #<syntax (multi-def (x as 1) (y as 2))>
+
+The reason is that in the second example ``as`` is bound, and therefore
+it is not ``free-identifier=?`` to the literal identifier ``#'as``.
+
+The recommended "solution" is to define at top-level and to
+export some dummy definition for the literal identifiers you
+are going to use in your macro. I think this is fundamentally
+broken: literal identifiers should be a concept internal to
+the macro and they should not be exported. The mistake is
+that the R6RS requires the literal identifiers to be matched
+via ``free-identifier=?``, whereas they should be matched
+with ``symbol-identifier=?``. I never understood why the editors
+decided to use ``free-identifier=?``, perhaps because it makes it
+possible to rename the identifiers used as literal identifiers,
+a feature that looks more like a misfeature to me.
+
+Anyway, the R6RS document defines a set of special macros, such as
+``_``, ``...``, ``else`` and ``=>``, which lives in the global
+namespace and are available to all R6RS programs. Such macros are used
+as auxiliary syntax in various special forms, like ``cond`` and
+``syntax-case``; for this reason they are usually called auxiliary
+keywords. The existence of such global variables makes it impossible
+to redefine them at top-level in scripts (but it can be done at the
+REPL); however they can be redefined locally, thus breaking the macros
+using the auxiliary syntax:
+.. code-block:: scheme
-.. _cond specification: http://www.r6rs.org/final/html/r6rs/r6rs-Z-H-14.html#node_idx_376
+ > (let ((else #f))
+ (cond (else 'something)))
+ > ; does not return something
|#
-(import (rnrs) (sweet-macros) (aps list-utils) (aps list-match)
- (rnrs mutable-pairs) (aps easy-test) (aps compat))
-
-;;ALIST
-;; a function performing pattern matching which provides a Pythonic
-;; dictionary interface over an association list
-(define (alist . a)
- (lambda args
- (list-match args
- (sub () a); return the underlying alist
- (sub ('->keys); return the keys
- (map car a))
- (sub ('->values); return the values
- (map cdr a))
- (sub (k); emulate Python .__getitem__
- (let ((slot (assq k a)))
- (if slot
- (cdr slot)
- (error 'alist (format "Missing key ~a" k)))))
- (sub (k 'or val); emulate Python .get
- (let ((slot (assq k a)))
- (if slot
- (cdr slot)
- val)))
- (sub (k '! val); emulate Python .__setitem__
- (let ((slot (assq k a)))
- (if slot
- ;; modify existing slot
- (begin (set-cdr! slot val) a)
- ;; else append new slot
- (let ((new-a (append a (list (cons k val)))))
- (set! a new-a) a))))
- )))
+(import (rnrs) (sweet-macros) (aps compat)
+ (for (aps cut) expand)
+ (for (aps lang) expand)
+ (for (aps list-utils) expand)
+ (aps easy-test))
+
+;;STATIC-MAP
+(def-syntax (static-map (name value) ...)
+ #'(syntax-match (<names> name ...)
+ (sub (ctx <names>) #''(name ...))
+ (sub (ctx name) #'value)
+ ...)
+ (distinct? symbol-identifier=? #'(name ...))
+ (syntax-violation 'static-map "Found duplicated identifier in"
+ #'(name ...)))
;;END
-(define a (alist '(x . 1) '(y . 2)))
+;;MULTI-DEFINE
+(def-syntax (multi-define (name1 name2 ...) (value1 value2 ...))
+ #'(begin (define name1 value1) (define name2 value2) ...)
+ (distinct? bound-identifier=? #'(name1 name2 ...))
+ (syntax-violation 'multi-define "Found duplicated identifier in"
+ #'(name1 name2 ...)))
+;;END
-;;MATCH-ODD
-(define (replace-odd obj)
- (list-match obj
- (sub x "odd" (and (integer? x) (odd? x)))
- (sub x x (integer? x))
- (sub x (map replace-odd x) (list? x))
- ))
+;;MULTI-DEFINE2
+(def-syntax (multi-define2 id value)
+ #'(multi-define (id id2) (value 'dummy)))
;;END
-(run
+(def-syntax (compare-ids id1 id2)
+ (let ((res '()))
+ (when (symbol-identifier=? #'id1 #'id2)
+ (set! res (cons 'raw= res)))
+ (when (bound-identifier=? #'id1 #'id2)
+ (set! res (cons 'bound= res)))
+ (when (free-identifier=? #'id1 #'id2)
+ (set! res (cons 'free= res)))
+ (datum->syntax #'compare-ids `',res)))
+
+(let ((a 1))
+ (define id-a #'a)
+ (let ((a a))
+ (free-identifier=? #'a id-a)))
+
+;;MULTI-DEF-BAD
+(def-syntax (multi-def-bad (name as_ value) ...)
+ #'(begin (define name value) ...)
+ (for-all (lambda (id) (bound-identifier=? id #'as)) #'(as_ ...))
+ (syntax-violation 'multi-def-bad "Offending infix syntax (expected `as')"
+ (remp (lambda (id) (bound-identifier=? id #'as)) #'(as_ ...))))
+;;END
- (test "patterns"
- (list-match <patterns>)
- '((list-match lst (sub pattern template guard ...) ...)))
+;;MULTI-DEF
+ (define-syntax multi-def
+ (syntax-rules (as)
+ ((multi-def (name as value) ...)
+ (begin (define name value) ...))))
+;;END
- ;;MATCH-TESTS
- (test "match1"
- (list-match '(1) (sub (x) x))
- 1)
-
- (test "match2"
- (list-match '(1 2) (sub (_ y) y))
- 2)
-
- (test "match3"
- (list-match '(1 (2 3)) (sub (_ (_ z)) z))
- 3)
- ;;END
-
- ;;MATCH-ODD1
- (test "match-odd1"
- (replace-odd '(1 (2 3)))
- '("odd" (2 "odd")))
- ;;END
-
- ;;MATCH-ODD2
- (test "match-odd2"
- (catch-error (replace-odd "string"))
- "pattern failure")
- ;;END
-
- ;;ALIST-TESTS
- (test "alist1"
- (a 'x)
- 1)
-
- (test "alist2"
- (a 'y)
- 2)
+;;FREE-DISTINCT
+(def-syntax (free-distinct name ...)
+ (distinct? free-identifier=? #'(name ...)))
+;;END
- (test "alist3"
- (catch-error (a 'z))
- "Missing key z")
-
- (test "alist4"
- (a 'z 'or 3)
- 3)
-
- (test "alist5"
- (a 'z '! 3)
- '((x . 1) (y . 2) (z . 3)))
+;;BOUND-DISTINCT
+(def-syntax (bound-distinct name ...)
+ (distinct? bound-identifier=? #'(name ...)))
+;;END
+
+;;SYMBOL-DISTINCT
+(def-syntax (symbol-distinct name ...)
+ (distinct? symbol-identifier=? #'(name ...)))
+;;END
+
+;;CHECK-RESERVED
+(def-syntax (check-reserved id)
+ (syntax-violation 'check-reserved "Reserved identifier" #'id)
+ (exists (cut symbol-identifier=? #'id <>) (list #'reserved1 #'reserved2))
+ #f)
+;;END
+
+(def-syntax (distinct-x a)
+ #'(symbol-distinct x a))
+
+(run
+ (test "symbol1" (symbol-distinct a b) #t)
+ (test "symbol2" (symbol-distinct a a) #f)
- (test "alist6"
- (a 'y '! 4)
- '((x . 1) (y . 4) (z . 3)))
-
- (test "alist7"
- (a '->keys)
- '(x y z))
-
- (test "alist8"
- (a '->values)
- '(1 4 3))
- ;;END
+ (test "bound1" (bound-distinct a b) #t)
+ (test "bound2" (bound-distinct a a) #f)
+ (test "free1" (free-distinct a b) #t)
+ (test "free2" (free-distinct a a) #f)
)
-
-; (test "match4"
-; (list-match '(1 (2 (x 4) 5) 6)
-; (sub x (and (integer? x) (odd? x)) "odd")
-; (sub x (and (integer? x) (odd? x)) "odd")) 1)