summaryrefslogtreecommitdiff
path: root/doc/ref/api-memory.texi
blob: ff202e00ef458f92309472138aa4370f4e6e9074 (plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
@c -*-texinfo-*-
@c This is part of the GNU Guile Reference Manual.
@c Copyright (C)  1996, 1997, 2000, 2001, 2002, 2003, 2004, 2009, 2010, 2012
@c   Free Software Foundation, Inc.
@c See the file guile.texi for copying conditions.

@node Memory Management
@section Memory Management and Garbage Collection

Guile uses a @emph{garbage collector} to manage most of its objects.
While the garbage collector is designed to be mostly invisible, you 
sometimes need to interact with it explicitly.

See @ref{Garbage Collection} for a general discussion of how garbage
collection relates to using Guile from C.

@menu
* Garbage Collection Functions::
* Memory Blocks::
* Weak References::
* Guardians::
@end menu


@node Garbage Collection Functions
@subsection Function related to Garbage Collection

@deffn {Scheme Procedure} gc
@deffnx {C Function} scm_gc ()
Scans all of SCM objects and reclaims for further use those that are
no longer accessible.  You normally don't need to call this function
explicitly.  It is called automatically when appropriate.
@end deffn

@deftypefn {C Function} SCM scm_gc_protect_object (SCM @var{obj})
Protects @var{obj} from being freed by the garbage collector, when it
otherwise might be.  When you are done with the object, call
@code{scm_gc_unprotect_object} on the object. Calls to
@code{scm_gc_protect}/@code{scm_gc_unprotect_object} can be nested, and
the object remains protected until it has been unprotected as many times
as it was protected. It is an error to unprotect an object more times
than it has been protected. Returns the SCM object it was passed.

Note that storing @var{obj} in a C global variable has the same
effect@footnote{In Guile up to version 1.8, C global variables were not
scanned by the garbage collector; hence, @code{scm_gc_protect_object}
was the only way in C to prevent a Scheme object from being freed.}.
@end deftypefn

@deftypefn {C Function} SCM scm_gc_unprotect_object (SCM @var{obj})

Unprotects an object from the garbage collector which was protected by
@code{scm_gc_unprotect_object}. Returns the SCM object it was passed.
@end deftypefn

@deftypefn {C Function} SCM scm_permanent_object (SCM @var{obj})

Similar to @code{scm_gc_protect_object} in that it causes the
collector to always mark the object, except that it should not be
nested (only call @code{scm_permanent_object} on an object once), and
it has no corresponding unpermanent function. Once an object is
declared permanent, it will never be freed. Returns the SCM object it
was passed.
@end deftypefn

@c  NOTE: The varargs scm_remember_upto_here is deliberately not
@c  documented, because we don't think it can be implemented as a nice
@c  inline compiler directive or asm block.  New _3, _4 or whatever
@c  forms could certainly be added though, if needed.

@deftypefn {C Macro} void scm_remember_upto_here_1 (SCM obj)
@deftypefnx {C Macro} void scm_remember_upto_here_2 (SCM obj1, SCM obj2)
Create a reference to the given object or objects, so they're certain
to be present on the stack or in a register and hence will not be
freed by the garbage collector before this point.

Note that these functions can only be applied to ordinary C local
variables (ie.@: ``automatics'').  Objects held in global or static
variables or some malloced block or the like cannot be protected with
this mechanism.
@end deftypefn

@deffn {Scheme Procedure} gc-stats
@deffnx {C Function} scm_gc_stats ()
Return an association list of statistics about Guile's current
use of storage.
@end deffn

@deffn {Scheme Procedure} gc-live-object-stats
@deffnx {C Function} scm_gc_live_object_stats ()
Return an alist of statistics of the current live objects. 
@end deffn

@deftypefun void scm_gc_mark (SCM @var{x})
Mark the object @var{x}, and recurse on any objects @var{x} refers to.
If @var{x}'s mark bit is already set, return immediately.  This function
must only be called during the mark-phase of garbage collection,
typically from a smob @emph{mark} function.
@end deftypefun


@node Memory Blocks
@subsection Memory Blocks

@cindex automatically-managed memory
@cindex GC-managed memory
@cindex conservative garbage collection

In C programs, dynamic management of memory blocks is normally done
with the functions malloc, realloc, and free.  Guile has additional
functions for dynamic memory allocation that are integrated into the
garbage collector and the error reporting system.

Memory blocks that are associated with Scheme objects (for example a
smob) should be allocated with @code{scm_gc_malloc} or
@code{scm_gc_malloc_pointerless}.  These two functions will either
return a valid pointer or signal an error.  Memory blocks allocated this
way can be freed with @code{scm_gc_free}; however, this is not strictly
needed: memory allocated with @code{scm_gc_malloc} or
@code{scm_gc_malloc_pointerless} is automatically reclaimed when the
garbage collector no longer sees any live reference to it@footnote{In
Guile up to version 1.8, memory allocated with @code{scm_gc_malloc}
@emph{had} to be freed with @code{scm_gc_free}.}.

Memory allocated with @code{scm_gc_malloc} is scanned for live pointers.
This means that if @code{scm_gc_malloc}-allocated memory contains a
pointer to some other part of the memory, the garbage collector notices
it and prevents it from being reclaimed@footnote{In Guile up to 1.8,
memory allocated with @code{scm_gc_malloc} was @emph{not} scanned.
Consequently, the GC had to be told explicitly about pointers to live
objects contained in the memory block, e.g., @i{via} SMOB mark functions
(@pxref{Smobs, @code{scm_set_smob_mark}})}.  Conversely, memory
allocated with @code{scm_gc_malloc_pointerless} is assumed to be
``pointer-less'' and is not scanned.

For memory that is not associated with a Scheme object, you can use
@code{scm_malloc} instead of @code{malloc}.  Like
@code{scm_gc_malloc}, it will either return a valid pointer or signal
an error.  However, it will not assume that the new memory block can
be freed by a garbage collection.  The memory must be explicitly freed
with @code{free}.

There is also @code{scm_gc_realloc} and @code{scm_realloc}, to be used
in place of @code{realloc} when appropriate, and @code{scm_gc_calloc}
and @code{scm_calloc}, to be used in place of @code{calloc} when
appropriate.

The function @code{scm_dynwind_free} can be useful when memory should be
freed with libc's @code{free} when leaving a dynwind context,
@xref{Dynamic Wind}.

@deftypefn {C Function} {void *} scm_malloc (size_t @var{size})
@deftypefnx {C Function} {void *} scm_calloc (size_t @var{size})
Allocate @var{size} bytes of memory and return a pointer to it.  When
@var{size} is 0, return @code{NULL}.  When not enough memory is
available, signal an error.  This function runs the GC to free up some
memory when it deems it appropriate.

The memory is allocated by the libc @code{malloc} function and can be
freed with @code{free}.  There is no @code{scm_free} function to go
with @code{scm_malloc} to make it easier to pass memory back and forth
between different modules.  

The function @code{scm_calloc} is similar to @code{scm_malloc}, but
initializes the block of memory to zero as well.

These functions will (indirectly) call
@code{scm_gc_register_allocation}.
@end deftypefn

@deftypefn {C Function} {void *} scm_realloc (void *@var{mem}, size_t @var{new_size})
Change the size of the memory block at @var{mem} to @var{new_size} and
return its new location.  When @var{new_size} is 0, this is the same
as calling @code{free} on @var{mem} and @code{NULL} is returned.  When
@var{mem} is @code{NULL}, this function behaves like @code{scm_malloc}
and allocates a new block of size @var{new_size}.

When not enough memory is available, signal an error.  This function
runs the GC to free up some memory when it deems it appropriate.

This function will call @code{scm_gc_register_allocation}.
@end deftypefn




@deftypefn {C Function} {void *} scm_gc_malloc (size_t @var{size}, const char *@var{what})
@deftypefnx {C Function} {void *} scm_gc_malloc_pointerless (size_t @var{size}, const char *@var{what})
@deftypefnx {C Function} {void *} scm_gc_realloc (void *@var{mem}, size_t @var{old_size}, size_t @var{new_size}, const char *@var{what});
@deftypefnx {C Function} {void *} scm_gc_calloc (size_t @var{size}, const char *@var{what})
Allocate @var{size} bytes of automatically-managed memory.  The memory
is automatically freed when no longer referenced from any live memory
block.

Memory allocated with @code{scm_gc_malloc} or @code{scm_gc_calloc} is
scanned for pointers.  Memory allocated by
@code{scm_gc_malloc_pointerless} is not scanned.

The @code{scm_gc_realloc} call preserves the ``pointerlessness'' of the
memory area pointed to by @var{mem}.  Note that you need to pass the old
size of a reallocated memory block as well.  See below for a motivation.
@end deftypefn


@deftypefn {C Function} void scm_gc_free (void *@var{mem}, size_t @var{size}, const char *@var{what})
Explicitly free the memory block pointed to by @var{mem}, which was
previously allocated by one of the above @code{scm_gc} functions.

Note that you need to explicitly pass the @var{size} parameter.  This
is done since it should normally be easy to provide this parameter
(for memory that is associated with GC controlled objects) and help keep
the memory management overhead very low.  However, in Guile 2.x,
@var{size} is always ignored.
@end deftypefn


@deftypefn {C Function} void scm_gc_register_allocation (size_t @var{size})
Informs the garbage collector that @var{size} bytes have been allocated,
which the collector would otherwise not have known about.

In general, Scheme will decide to collect garbage only after some amount
of memory has been allocated.  Calling this function will make the
Scheme garbage collector know about more allocation, and thus run more
often (as appropriate).

It is especially important to call this function when large unmanaged
allocations, like images, may be freed by small Scheme allocations, like
SMOBs.
@end deftypefn


@deftypefn {C Function} void scm_dynwind_free (void *mem)
Equivalent to @code{scm_dynwind_unwind_handler (free, @var{mem},
SCM_F_WIND_EXPLICITLY)}.  That is, the memory block at @var{mem} will be
freed (using @code{free} from the C library) when the current dynwind is
left.
@end deftypefn

@deffn {Scheme Procedure} malloc-stats
Return an alist ((@var{what} . @var{n}) ...) describing number
of malloced objects.
@var{what} is the second argument to @code{scm_gc_malloc},
@var{n} is the number of objects of that type currently
allocated.

This function is only available if the @code{GUILE_DEBUG_MALLOC}
preprocessor macro was defined when Guile was compiled.
@end deffn


@subsubsection Upgrading from scm_must_malloc et al.

Version 1.6 of Guile and earlier did not have the functions from the
previous section.  In their place, it had the functions
@code{scm_must_malloc}, @code{scm_must_realloc} and
@code{scm_must_free}.  This section explains why we want you to stop
using them, and how to do this.

@findex scm_must_malloc
@findex scm_must_realloc
@findex scm_must_calloc
@findex scm_must_free
The functions @code{scm_must_malloc} and @code{scm_must_realloc}
behaved like @code{scm_gc_malloc} and @code{scm_gc_realloc} do now,
respectively.  They would inform the GC about the newly allocated
memory via the internal equivalent of
@code{scm_gc_register_allocation}.  However,
@code{scm_must_free} did not unregister the memory it was about to
free.  The usual way to unregister memory was to return its size from
a smob free function.

This disconnectedness of the actual freeing of memory and reporting
this to the GC proved to be bad in practice.  It was easy to make
mistakes and report the wrong size because allocating and freeing was
not done with symmetric code, and because it is cumbersome to compute
the total size of nested data structures that were freed with multiple
calls to @code{scm_must_free}.  Additionally, there was no equivalent
to @code{scm_malloc}, and it was tempting to just use
@code{scm_must_malloc} and never to tell the GC that the memory has
been freed.

The effect was that the internal statistics kept by the GC drifted out
of sync with reality and could even overflow in long running programs.
When this happened, the result was a dramatic increase in (senseless)
GC activity which would effectively stop the program dead.

@findex scm_done_malloc
@findex scm_done_free
The functions @code{scm_done_malloc} and @code{scm_done_free} were
introduced to help restore balance to the force, but existing bugs did
not magically disappear, of course.

Therefore we decided to force everybody to review their code by
deprecating the existing functions and introducing new ones in their
place that are hopefully easier to use correctly.

For every use of @code{scm_must_malloc} you need to decide whether to
use @code{scm_malloc} or @code{scm_gc_malloc} in its place.  When the
memory block is not part of a smob or some other Scheme object whose
lifetime is ultimately managed by the garbage collector, use
@code{scm_malloc} and @code{free}.  When it is part of a smob, use
@code{scm_gc_malloc} and change the smob free function to use
@code{scm_gc_free} instead of @code{scm_must_free} or @code{free} and
make it return zero.

The important thing is to always pair @code{scm_malloc} with
@code{free}; and to always pair @code{scm_gc_malloc} with
@code{scm_gc_free}.

The same reasoning applies to @code{scm_must_realloc} and
@code{scm_realloc} versus @code{scm_gc_realloc}.


@node Weak References
@subsection Weak References

[FIXME: This chapter is based on Mikael Djurfeldt's answer to a
question by Michael Livshin. Any mistakes are not theirs, of course. ]

Weak references let you attach bookkeeping information to data so that
the additional information automatically disappears when the original
data is no longer in use and gets garbage collected. In a weak key hash,
the hash entry for that key disappears as soon as the key is no longer
referenced from anywhere else. For weak value hashes, the same happens
as soon as the value is no longer in use. Entries in a doubly weak hash
disappear when either the key or the value are not used anywhere else
anymore.

Object properties offer the same kind of functionality as weak key
hashes in many situations. (@pxref{Object Properties})

Here's an example (a little bit strained perhaps, but one of the
examples is actually used in Guile):

Assume that you're implementing a debugging system where you want to
associate information about filename and position of source code
expressions with the expressions themselves.

Hashtables can be used for that, but if you use ordinary hash tables
it will be impossible for the scheme interpreter to "forget" old
source when, for example, a file is reloaded.

To implement the mapping from source code expressions to positional
information it is necessary to use weak-key tables since we don't want
the expressions to be remembered just because they are in our table.

To implement a mapping from source file line numbers to source code
expressions you would use a weak-value table.

To implement a mapping from source code expressions to the procedures
they constitute a doubly-weak table has to be used.

@menu
* Weak hash tables::
* Weak vectors::
@end menu


@node Weak hash tables
@subsubsection Weak hash tables

@deffn {Scheme Procedure} make-weak-key-hash-table size
@deffnx {Scheme Procedure} make-weak-value-hash-table size
@deffnx {Scheme Procedure} make-doubly-weak-hash-table size
@deffnx {C Function} scm_make_weak_key_hash_table (size)
@deffnx {C Function} scm_make_weak_value_hash_table (size)
@deffnx {C Function} scm_make_doubly_weak_hash_table (size)
Return a weak hash table with @var{size} buckets. As with any
hash table, choosing a good size for the table requires some
caution.

You can modify weak hash tables in exactly the same way you
would modify regular hash tables. (@pxref{Hash Tables})
@end deffn

@deffn {Scheme Procedure} weak-key-hash-table? obj
@deffnx {Scheme Procedure} weak-value-hash-table? obj
@deffnx {Scheme Procedure} doubly-weak-hash-table? obj
@deffnx {C Function} scm_weak_key_hash_table_p (obj)
@deffnx {C Function} scm_weak_value_hash_table_p (obj)
@deffnx {C Function} scm_doubly_weak_hash_table_p (obj)
Return @code{#t} if @var{obj} is the specified weak hash
table. Note that a doubly weak hash table is neither a weak key
nor a weak value hash table.
@end deffn

@node Weak vectors
@subsubsection Weak vectors

Weak vectors are mainly useful in Guile's implementation of weak hash
tables.

@deffn {Scheme Procedure} make-weak-vector size [fill]
@deffnx {C Function} scm_make_weak_vector (size, fill)
Return a weak vector with @var{size} elements. If the optional
argument @var{fill} is given, all entries in the vector will be
set to @var{fill}. The default value for @var{fill} is the
empty list.
@end deffn

@deffn {Scheme Procedure} weak-vector elem @dots{}
@deffnx {Scheme Procedure} list->weak-vector l
@deffnx {C Function} scm_weak_vector (l)
Construct a weak vector from a list: @code{weak-vector} uses
the list of its arguments while @code{list->weak-vector} uses
its only argument @var{l} (a list) to construct a weak vector
the same way @code{list->vector} would.
@end deffn

@deffn {Scheme Procedure} weak-vector? obj
@deffnx {C Function} scm_weak_vector_p (obj)
Return @code{#t} if @var{obj} is a weak vector. Note that all
weak hashes are also weak vectors.
@end deffn


@node Guardians
@subsection Guardians

Guardians provide a way to be notified about objects that would
otherwise be collected as garbage.  Guarding them prevents the objects
from being collected and cleanup actions can be performed on them, for
example.

See R. Kent Dybvig, Carl Bruggeman, and David Eby (1993) "Guardians in
a Generation-Based Garbage Collector".  ACM SIGPLAN Conference on
Programming Language Design and Implementation, June 1993.

@deffn {Scheme Procedure} make-guardian
@deffnx {C Function} scm_make_guardian ()
Create a new guardian.  A guardian protects a set of objects from
garbage collection, allowing a program to apply cleanup or other
actions.

@code{make-guardian} returns a procedure representing the guardian.
Calling the guardian procedure with an argument adds the argument to
the guardian's set of protected objects.  Calling the guardian
procedure without an argument returns one of the protected objects
which are ready for garbage collection, or @code{#f} if no such object
is available.  Objects which are returned in this way are removed from
the guardian.

You can put a single object into a guardian more than once and you can
put a single object into more than one guardian.  The object will then
be returned multiple times by the guardian procedures.

An object is eligible to be returned from a guardian when it is no
longer referenced from outside any guardian.

There is no guarantee about the order in which objects are returned
from a guardian.  If you want to impose an order on finalization
actions, for example, you can do that by keeping objects alive in some
global data structure until they are no longer needed for finalizing
other objects.

Being an element in a weak vector, a key in a hash table with weak
keys, or a value in a hash table with weak values does not prevent an
object from being returned by a guardian.  But as long as an object
can be returned from a guardian it will not be removed from such a
weak vector or hash table.  In other words, a weak link does not
prevent an object from being considered collectable, but being inside
a guardian prevents a weak link from being broken.

A key in a weak key hash table can be thought of as having a strong
reference to its associated value as long as the key is accessible.
Consequently, when the key is only accessible from within a guardian,
the reference from the key to the value is also considered to be
coming from within a guardian.  Thus, if there is no other reference
to the value, it is eligible to be returned from a guardian.
@end deffn


@c Local Variables:
@c TeX-master: "guile.texi"
@c End: