lang/elisp/README


                                                    -*- outline -*-

This directory holds the Scheme side of a translator for Emacs Lisp.

* Usage

To load up the base Elisp environment:

    (use-modules (lang elisp base))

Then you can switch into this module

    (define-module (lang elisp base))

and start typing away in Elisp, or evaluate an individual Elisp
expression from Scheme:

    (eval EXP (resolve-module '(lang elisp base)))

A more convenient, higher-level interface is provided by (lang elisp
interface):

    (use-modules (lang elisp interface))

With this interface, you can evaluate an Elisp expression

    (eval-elisp EXP)

load an Elisp file with no effect on the Scheme world

    (load-elisp-file "/home/neil/Guile/cvs/guile-core/lang/elisp/example.el")

load an Elisp file, automatically importing top level definitions into
Scheme

    (use-elisp-file "/home/neil/Guile/cvs/guile-core/lang/elisp/example.el")

export Scheme objects to Elisp

    (export-to-elisp + - * my-func 'my-var)

and try to bootstrap a complete Emacs environment:

    (load-emacs)

* Status

Please note that this is work in progress; the translator is
incomplete and not yet widely tested.

** Trying to load a complete Emacs environment

To try this, type `(use-modules (lang elisp interface))' and then
`(load-emacs)'.  The following output shows how far I get when I try
this.

guile> (use-modules (lang elisp interface))
guile> (load-emacs)
Calling loadup.el to clothe the bare Emacs...
Loading /usr/share/emacs/20.7/lisp/loadup.el...
Using load-path ("/usr/share/emacs/20.7/lisp/" "/usr/share/emacs/20.7/lisp/emacs-lisp/")
Loading /usr/share/emacs/20.7/lisp/byte-run.el...
Loading /usr/share/emacs/20.7/lisp/byte-run.el...done
Loading /usr/share/emacs/20.7/lisp/subr.el...
Loading /usr/share/emacs/20.7/lisp/subr.el...done
Loading /usr/share/emacs/20.7/lisp/version.el...
Loading /usr/share/emacs/20.7/lisp/version.el...done
Loading /usr/share/emacs/20.7/lisp/map-ynp.el...
Loading /usr/share/emacs/20.7/lisp/map-ynp.el...done
Loading /usr/share/emacs/20.7/lisp/widget.el...
Loading /usr/share/emacs/20.7/lisp/emacs-lisp/cl.el...
Loading /usr/share/emacs/20.7/lisp/emacs-lisp/cl.el...done
Loading /usr/share/emacs/20.7/lisp/widget.el...done
Loading /usr/share/emacs/20.7/lisp/custom.el...
Loading /usr/share/emacs/20.7/lisp/custom.el...done
Loading /usr/share/emacs/20.7/lisp/cus-start.el...
Note, built-in variable `abbrev-all-caps' not bound
  ... [many other variable not bound messages] ...
Loading /usr/share/emacs/20.7/lisp/cus-start.el...done
Loading /usr/share/emacs/20.7/lisp/international/mule.el...
<unnamed port>: In procedure make-char-table in expression (@fop make-char-table (# #)):
<unnamed port>: Symbol's function definition is void
ABORT: (misc-error)

Type "(backtrace)" to get more information or "(debug)" to enter the debugger.
guile> 

That's 3279 lines ("wc -l") of Elisp code already, which isn't bad!

I think that progress beyond this point basically means implementing
multilingual and multibyte strings properly for Guile, which is a
_lot_ of work and requires, IMO, a very clear plan for Guile's role
with respect to Emacs.

* Design

When thinking about how to implement an Elisp translator for Guile, it
is important to realize that the great power of Emacs does not arise
from Elisp (seen as a language in syntactic terms) alone, but from the
combination of this language with the collection of primitives
provided by the Emacs C source code.  Therefore, to be of practical
use, an Elisp translator needs to be more than just a transformer that
translates sexps to Scheme expressions.

The finished translator should consist of several parts...

** Syntax transformation

Although syntax transformation isn't all we need, we do still need it!

This part is implemented by the (lang elisp transform) module; it is
close to complete and seems to work pretty reliably.

Note that transformed expressions use the `@fop' and `@bind' macros
provided by...

** C support for transformed expressions

For performance and historical reasons (and perhaps necessity - I
haven't thought about it enough yet), some of the transformation
support is written in C.

*** @fop

The `@fop' macro is used to dispatch Elisp applications.  Its first
argument is a symbol, and this symbol's function slot is examined to
find a procedure or macro to apply to the remaining arguments.  `@fop'
also handles aliasing (`defalias'): in this case the function slot
contains another symbol.

Once `@fop' has found the appropriate procedure or macro to apply, it
returns an application expression in which that procedure or macro
replaces the `@fop' and the original symbol.  Hence no Elisp-specific
evaluator support is required to perform the application.
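
To make the dispatch concrete, here is a minimal sketch in plain
Scheme.  It is not the C implementation: the hash table
`function-slots' and the helpers `set-function-slot!',
`resolve-function' and `expand-fop' are names invented for this
illustration, and the form is headed by `fop' rather than `@fop' only
to keep the sketch portable.  It assumes a function slot holds either
a procedure or another symbol (an alias); macros are left out.

```scheme
;; Hypothetical stand-in for symbol function slots: a hash table keyed
;; by symbol.  (Assumption: the real translator keeps the slot on the
;; symbol itself.)
(define function-slots (make-hash-table))

(define (set-function-slot! sym def)
  (hashq-set! function-slots sym def))

;; Follow alias chains (a slot that holds another symbol) until a
;; procedure is found.
(define (resolve-function sym)
  (let ((def (hashq-ref function-slots sym #f)))
    (cond ((procedure? def) def)
          ((symbol? def) (resolve-function def))
          (else (error "Symbol's function definition is void:" sym)))))

;; Rewrite (fop SYM ARG ...) into an ordinary application
;; (PROC ARG ...), so that no special evaluator support is needed.
(define (expand-fop form)
  (cons (resolve-function (cadr form)) (cddr form)))

(set-function-slot! 'plus +)
(set-function-slot! 'add 'plus)     ; alias: the slot holds a symbol

(define expanded (expand-fop '(fop add 1 2 3)))
(define result (apply (car expanded) (cdr expanded)))
```

After expansion the head of the form is the procedure itself, so
applying it is an ordinary Scheme application.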

*** @bind

Currently, Elisp variables are the same as Scheme variables, so
variable references are effectively untransformed.

The `@bind' macro does Elisp-style dynamic variable binding.
Basically, it locates the named top level variables, `set!'s them to
new values, evaluates its body, and then uses `set!' again to restore
the original values.
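
The save/set/restore sequence can be sketched as a Scheme macro.  This
is only an illustration: the macro name `with-elisp-binding' is
invented, it binds a single variable, and the use of `dynamic-wind'
(so that the old value also comes back on a non-local exit) is an
assumption of the sketch, not something this file says about `@bind'.

```scheme
;; Hypothetical sketch of Elisp-style dynamic binding for one variable.
(define-syntax with-elisp-binding
  (syntax-rules ()
    ((_ (var val) body ...)
     (let ((saved var)                      ; remember the old value
           (new val))                       ; evaluate the new value once
       (dynamic-wind
         (lambda () (set! var new))         ; install the new value
         (lambda () body ...)               ; evaluate the body
         (lambda () (set! var saved)))))))  ; restore the old value

;; `fill-column' is just an example top level variable here.
(define fill-column 70)
(define (current-fill-column) fill-column)

(define inside
  (with-elisp-binding (fill-column 40)
    (current-fill-column)))                 ; the callee sees 40

(define outside (current-fill-column))      ; back to 70
```

The point of the example is that `current-fill-column', which is
defined outside the binding form, still sees the temporary value while
the body runs.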

Because of the body evaluation, `@bind' requires evaluator support.
In fact, the `@bind' macro code does little more than replace itself
with the memoized SCM_IM_BIND.  Most of the work is done by the
evaluator when it hits SCM_IM_BIND.

One theoretical problem with `@bind' is that any local Scheme variable
in the same scope and with the same name as an Elisp variable will
shadow the Elisp variable.  But in practice it's difficult to set up
such a situation; an exception is the translator code itself, so there
we mangle the relevant Scheme variable names a bit to avoid the
problem.
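
The shadowing itself is ordinary Scheme scoping and can be reproduced
without the translator.  In this sketch `fill-column' merely plays the
part of an Elisp variable, and the renamed local `%%fill-column' is a
made-up example of mangling; the actual names used in the translator
code are not shown here.

```scheme
;; Top level variable playing the part of an Elisp variable.
(define fill-column 70)

;; A Scheme local with the same name hides the top level variable.
(define (shadowed)
  (let ((fill-column 'scheme-local))
    fill-column))

;; Renaming the local (hypothetical mangling) makes the top level
;; variable visible again.
(define (mangled)
  (let ((%%fill-column 'scheme-local))
    fill-column))
```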

Other possible problems with this approach are that it might not be
possible to implement buffer local variables properly, and that
`@bind' might become too inefficient when we implement full support
for undefining Scheme variables.  So we might in future have to
transform Elisp variable references after all.

*** Truth value stuff

Lots of stuff to do with providing the special self-evaluating `nil'
and `t' symbols, and macros that convert between Scheme and Elisp
truth values, and so on.

I'm hoping that most of this will go away, but I need to show that
it's feasible first.

** Emacs editing primitives

Buffers, keymaps, text properties, windows, frames etc. etc.

Basically, everything that is implemented as a primitive in the Emacs
C code needs to be implemented either in Scheme or in C for Guile.

The Scheme files in the primitives subdirectory implement some of
these primitives in Scheme.  Not because that is the right decision,
but because this is a proof of concept and it's quicker to write badly
performing code in Scheme.

Ultimately, most of these primitive definitions should really come
from the Emacs C code itself, translated or preprocessed in a way that
makes it compile with Guile.  I think this is pretty close to the work
that Ken Raeburn has been doing on the Emacs codebase.

** Reading and printing support

Elisp is close enough to Scheme that it's convenient to co-opt the
existing Guile reader rather than to write a new one from scratch, but
there are a few syntactic differences that will require adding Elisp
support to the reader.

- Character syntax is `?a' rather than `#\a'.  (Not done.  More
  precisely, `?a' in Elisp isn't character syntax but an alternative
  integer syntax.  Note that we could support most of the `?a' syntax
  simply by doing

      (define ?a (char->integer #\a))
      (define ?b (char->integer #\b))

  and so on.)

- `nil' and `t' should be read (I think) as #f and #t.  (Done.)

- Vector syntax is `[1 2 3]' rather than `#(1 2 3)'.  (Not done.)

Correspondingly, when printing, #f and '() should be written as
`nil'.  (Not done.)
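
As a rough idea of what the printing side could look like, here is a
sketch of a printer written in Scheme.  The names `elisp->string' and
`items->string' are invented.  Writing #f and '() as `nil' follows the
text above; writing #t as `t' and vectors as `[...]' are assumptions
made by symmetry with the reader items.

```scheme
;; Hypothetical Elisp-style printer returning a string.
(define (elisp->string obj)
  (cond ((or (not obj) (null? obj)) "nil")      ; #f and '() print as nil
        ((eq? obj #t) "t")                      ; assumption: #t prints as t
        ((vector? obj)                          ; assumption: [..] vectors
         (string-append "[" (items->string (vector->list obj)) "]"))
        ((pair? obj)
         (string-append "(" (items->string obj) ")"))
        (else (call-with-output-string
                (lambda (port) (write obj port))))))

;; Render the elements of a (possibly improper) list, separated by
;; spaces.  A tail of #f is treated like the empty list.
(define (items->string lst)
  (let loop ((rest lst) (acc ""))
    (cond ((or (null? rest) (not rest)) acc)
          ((pair? rest)
           (loop (cdr rest)
                 (string-append acc
                                (if (string=? acc "") "" " ")
                                (elisp->string (car rest)))))
          (else (string-append acc " . " (elisp->string rest))))))
```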

** The Elisp evaluation module (lang elisp base)

Fundamentally, Guile's module system can't be used to package Elisp
code in the same way it is used for Scheme code, because Elisp
function definitions are stored as symbol properties (in the symbol's
"function slot") and so are global.  On the other hand, it is useful
(necessary?) to associate some particular module with Elisp evaluation
because

- Elisp variables are currently implemented as Scheme variables and so
  need to live in some module

- a syntax transformer is a property of a module.

Therefore we have the (lang elisp base) module, which acts as the
repository for all Elisp variables and the site of all Elisp
evaluation.

The initial environment provided by this module is intended to be a
non-Emacs-dependent subset of Elisp.  To get the idea, imagine someone
who wants to write an extension function for, say, Gnucash, and simply
prefers to write in Elisp rather than in Scheme.  He/she therefore
doesn't need buffers, keymaps and so on, just the basic language syntax
and core data functions like +, *, concat, length etc., plus specific
functions made available by Gnucash.

(lang elisp base) achieves this by

- importing Scheme definitions for some Emacs primitives from the
  files in the primitives subdirectory

- then switching into Elisp syntax.

After this point, `(eval XXX (resolve-module '(lang elisp base)))'
will evaluate XXX as an Elisp expression in the (lang elisp base)
module.  (`eval-elisp' in (lang elisp interface) is a more convenient
wrapper for this.)

** Full Emacs environment

The difference between the initial (lang elisp base) environment and a
fully loaded Emacs equivalent is

- more primitives: buffers, char-tables and many others

- the bootstrap Elisp code that an undumped Emacs loads during
  installation by calling `(load "loadup.el")'.

We don't have all the missing primitives, but we can already get
through some of loadup.el.  The Elisp function `load-emacs' (defined
in (lang elisp base)) initiates the loading of loadup.el; (lang elisp
interface) exports `load-emacs' to Scheme.

`load-emacs' loads so much Elisp code that it's an excellent way to
test the translator.  In current practice, it runs for a while and
then fails when it gets to an undefined primitive or a bug in the
translator.  Eventually, it should go all the way.  (And then we can
worry about adding unexec support to Guile!)  For the output that
currently results from calling `(load-emacs)', see above in the Status
section.

* nil, #f and '()

For Jim Blandy's notes on this, see the reference at the bottom of
this file.  Currently I'm investigating a different approach, which is
better IMO than Jim's proposal because it avoids requiring multiple
false values in the Scheme world.

According to my approach...

- `nil' and `t' are read (when in Elisp mode) as #f and #t.

- `(if x ...)', `(while x ...)' etc. are translated to something
  like `(if (and x (not (null? x))) ...)'.

- Functions which interpret an argument as a list --
  `cons', `setcdr', `memq', etc. -- either convert #f to '(), or
  handle the #f case specially.

- `eq' treats #f and '() as the same.

- Optionally, functions which produce '() values -- i.e. the reader
  and `cdr' -- could convert those immediately to #f.  This shouldn't
  affect the validity of any Elisp code, but it alters the balance of
  #f and '() values swimming around in that code and so affects what
  happens if two such values are returned to the Scheme world and then
  compared.  However, since you can never completely solve this
  problem (unless you are prepared to convert arbitrarily deep
  structures on entry to the Elisp world, which would kill performance),
  I'm inclined not to try to solve it at all.
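
The first four points can be sketched in plain Scheme as follows.  All
of the names (`elisp-true?', `elisp-eq', `nil->list', `elisp-memq' and
`elisp-if') are invented for this illustration and are not the
translator's own definitions.

```scheme
;; A value counts as true in Elisp terms unless it is #f or '().
(define (elisp-true? x)
  (and x (not (null? x))))

;; `eq' that treats #f and '() as the same object.
(define (elisp-eq a b)
  (or (eq? a b)
      (and (not (elisp-true? a)) (not (elisp-true? b)))))

;; Convert #f to '() where an argument is interpreted as a list.
(define (nil->list x)
  (if (not x) '() x))

;; `memq' that accepts #f wherever a list (or list tail) is expected.
(define (elisp-memq elt lst)
  (let loop ((rest (nil->list lst)))
    (cond ((null? rest) #f)
          ((elisp-eq elt (car rest)) rest)
          (else (loop (nil->list (cdr rest)))))))

;; Conditional whose test is translated as described in the text.
;; With no else forms the result is #f, i.e. nil.
(define-syntax elisp-if
  (syntax-rules ()
    ((_ test then otherwise ...)
     (if (elisp-true? test) then (begin #f otherwise ...)))))
```

For example, `(elisp-if '() 'yes 'no)' takes the else branch, although
'() is a true value in Scheme.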

* Resources

** Ken Raeburn's Guile Emacs page

http://www.mit.edu/~raeburn/guilemacs/

** Keisuke Nishida's Gemacs project

http://gemacs.sourceforge.net

** Jim Blandy's nil/#f/() notes

http://sanpietro.red-bean.com/guile/guile/old/3114.html

** Mikael Djurfeldt's notes on translation

See file guile-cvs/devel/translation/langtools.text in Guile CVS.