@c -*-texinfo-*-
@c This is part of the GNU Emacs Lisp Reference Manual.
@c Copyright (C) 1999, 2001-2016 Free Software Foundation, Inc.
@c See the file elisp.texi for copying conditions.
@node Hash Tables
@chapter Hash Tables
@cindex hash tables
@cindex lookup tables

  A hash table is a very fast kind of lookup table, somewhat like an
alist (@pxref{Association Lists}) in that it maps keys to
corresponding values.  It differs from an alist in these ways:

@itemize @bullet
@item
Lookup in a hash table is extremely fast for large tables---in fact, the
time required is essentially @emph{independent} of how many elements are
stored in the table.  For smaller tables (a few tens of elements)
alists may still be faster because hash tables have a more-or-less
constant overhead.

@item
The correspondences in a hash table are in no particular order.

@item
There is no way to share structure between two hash tables,
the way two alists can share a common tail.
@end itemize

  Emacs Lisp provides a general-purpose hash table data type, along
with a series of functions for operating on hash tables.  Hash tables
have a
special printed representation, which consists of @samp{#s} followed
by a list specifying the hash table properties and contents.
@xref{Creating Hash}.
(Hash notation, the initial @samp{#} character used in the printed
representations of objects with no read representation, has nothing to
do with hash tables.  @xref{Printed Representation}.)

  Obarrays are also a kind of hash table, but they are a different type
of object and are used only for recording interned symbols
(@pxref{Creating Symbols}).

@menu
* Creating Hash::       Functions to create hash tables.
* Hash Access::         Reading and writing the hash table contents.
* Defining Hash::       Defining new comparison methods.
* Other Hash::          Miscellaneous.
@end menu

@node Creating Hash
@section Creating Hash Tables
@cindex creating hash tables

  The principal function for creating a hash table is
@code{make-hash-table}.

@defun make-hash-table &rest keyword-args
This function creates a new hash table according to the specified
arguments.  The arguments should consist of alternating keywords
(particular symbols recognized specially) and values corresponding to
them.

Several keywords make sense in @code{make-hash-table}, but the only two
that you really need to know about are @code{:test} and @code{:weakness}.

@table @code
@item :test @var{test}
This specifies the method of key lookup for this hash table.  The
default is @code{eql}; @code{eq} and @code{equal} are other
alternatives:

@table @code
@item eql
Keys which are numbers are the same if they are @code{equal}, that
is, if they are equal in value and either both are integers or both
are floating point; otherwise, two distinct objects are never
the same.

@item eq
Any two distinct Lisp objects are different as keys.

@item equal
Two Lisp objects are the same, as keys, if they are equal
according to @code{equal}.
@end table

You can use @code{define-hash-table-test} (@pxref{Defining Hash}) to
define additional possibilities for @var{test}.

@item :weakness @var{weak}
The weakness of a hash table specifies whether the presence of a key or
value in the hash table preserves it from garbage collection.

The value, @var{weak}, must be one of @code{nil}, @code{key},
@code{value}, @code{key-or-value}, @code{key-and-value}, or @code{t},
which is an alias for @code{key-and-value}.  If @var{weak} is
@code{key},
then the hash table does not prevent its keys from being collected as
garbage (if they are not referenced anywhere else); if a particular key
does get collected, the corresponding association is removed from the
hash table.

If @var{weak} is @code{value}, then the hash table does not prevent
values from being collected as garbage (if they are not referenced
anywhere else); if a particular value does get collected, the
corresponding association is removed from the hash table.

If @var{weak} is @code{key-and-value} or @code{t}, both the key and
the value must be live in order to preserve the association.  Thus,
the hash table does not protect either keys or values from garbage
collection; if either one is collected as garbage, that removes the
association.

If @var{weak} is @code{key-or-value}, either the key or
the value can preserve the association.  Thus, associations are
removed from the hash table when both their key and value would be
collected as garbage (if not for references from weak hash tables).

The default for @var{weak} is @code{nil}, so that all keys and values
referenced in the hash table are preserved from garbage collection.

@item :size @var{size}
This specifies a hint for how many associations you plan to store in the
hash table.  If you know the approximate number, you can make things a
little more efficient by specifying it this way.  If you specify too
small a size, the hash table will grow automatically when necessary, but
doing that takes some extra time.

The default size is 65.

@item :rehash-size @var{rehash-size}
When you add an association to a hash table and the table is full,
it grows automatically.  This value specifies how to make the hash table
larger, at that time.

If @var{rehash-size} is an integer, it should be positive, and the hash
table grows by adding that much to the nominal size.  If
@var{rehash-size} is floating point, it should be greater than 1, and
the hash table grows by multiplying the old size by that number.

The default value is 1.5.

@item :rehash-threshold @var{threshold}
This specifies the criterion for when the hash table is full (so
it should be made larger).  The value, @var{threshold}, should be a
positive floating-point number, no greater than 1.  The hash table is
full whenever the actual number of entries exceeds this fraction
of the nominal size.  The default for @var{threshold} is 0.8.
@end table
@end defun
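
  As a brief illustration of how the @code{:test} keyword affects
lookup, the following sketch stores a freshly constructed list as a key,
first under the default @code{eql} test and then under @code{equal}.
The variable name and the stored values are illustrative only.

@example
;; With the default eql test, a second, distinct list with the
;; same contents is a different key.
(let ((table (make-hash-table)))
  (puthash (list 1 2) 'found table)
  (gethash (list 1 2) table))
     @result{} nil

;; With the equal test, keys are compared by their contents.
(let ((table (make-hash-table :test 'equal)))
  (puthash (list 1 2) 'found table)
  (gethash (list 1 2) table))
     @result{} found
@end example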

You can also create a new hash table using the printed representation
for hash tables.  The Lisp reader can read this printed
representation, provided each element in the specified hash table has
a valid read syntax (@pxref{Printed Representation}).  For instance,
the following specifies a new hash table containing the keys
@code{key1} and @code{key2} (both symbols) associated with @code{val1}
(a symbol) and @code{300} (a number) respectively.

@example
#s(hash-table size 30 data (key1 val1 key2 300))
@end example

@noindent
The printed representation for a hash table consists of @samp{#s}
followed by a list beginning with @samp{hash-table}.  The rest of the
list should consist of zero or more property-value pairs specifying
the hash table's properties and initial contents.  The properties and
values are read literally.  Valid property names are @code{size},
@code{test}, @code{weakness}, @code{rehash-size},
@code{rehash-threshold}, and @code{data}.  The @code{data} property
should be a list of key-value pairs for the initial contents; the
other properties have the same meanings as the matching
@code{make-hash-table} keywords (@code{:size}, @code{:test}, etc.),
described above.

Note that you cannot specify a hash table whose initial contents
include objects that have no read syntax, such as buffers and frames.
Such objects may be added to the hash table after it is created.
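
  For instance, the following form, a minimal sketch that reuses the
table shown above, lets the Lisp reader construct the table and then
looks up one of its keys:

@example
(gethash 'key2 #s(hash-table size 30 data (key1 val1 key2 300)))
     @result{} 300
@end example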

@node Hash Access
@section Hash Table Access
@cindex accessing hash tables
@cindex hash table access

  This section describes the functions for accessing and storing
associations in a hash table.  In general, any Lisp object can be used
as a hash key, unless the comparison method imposes limits.  Any Lisp
object can also be used as the value.

@defun gethash key table &optional default
This function looks up @var{key} in @var{table}, and returns its
associated @var{value}---or @var{default}, if @var{key} has no
association in @var{table}.
@end defun

@defun puthash key value table
This function enters an association for @var{key} in @var{table}, with
value @var{value}.  If @var{key} already has an association in
@var{table}, @var{value} replaces the old associated value.
@end defun
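
  The following short sketch, with illustrative keys and values, shows
@code{puthash} adding and then replacing an association, and
@code{gethash} falling back on its @var{default} argument:

@example
(let ((table (make-hash-table :test 'equal)))
  (puthash "two" 2 table)
  (puthash "two" 22 table)            ; replaces the old value
  (list (gethash "two" table)
        (gethash "three" table)       ; no association
        (gethash "three" table 'none)))
     @result{} (22 nil none)
@end example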

@defun remhash key table
This function removes the association for @var{key} from @var{table}, if
there is one.  If @var{key} has no association, @code{remhash} does
nothing.

@b{Common Lisp note:} In Common Lisp, @code{remhash} returns
non-@code{nil} if it actually removed an association and @code{nil}
otherwise.  In Emacs Lisp, @code{remhash} always returns @code{nil}.
@end defun

@defun clrhash table
This function removes all the associations from hash table @var{table},
so that it becomes empty.  This is also called @dfn{clearing} the hash
table.

@b{Common Lisp note:} In Common Lisp, @code{clrhash} returns the empty
@var{table}.  In Emacs Lisp, it returns @code{nil}.
@end defun
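
  As a small sketch of the two removal functions, using made-up keys,
the following form removes one association with @code{remhash} and then
empties the table with @code{clrhash}.  It reports the number of
entries after each step with @code{hash-table-count}.

@example
(let ((table (make-hash-table)))
  (puthash 'a 1 table)
  (puthash 'b 2 table)
  (remhash 'a table)
  (remhash 'no-such-key table)        ; does nothing
  (let ((after-remhash (hash-table-count table)))
    (clrhash table)
    (list after-remhash (hash-table-count table))))
     @result{} (1 0)
@end example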

@defun maphash function table
@anchor{Definition of maphash}
This function calls @var{function} once for each of the associations in
@var{table}.  The function @var{function} should accept two
arguments---a @var{key} listed in @var{table}, and its associated
@var{value}.  @code{maphash} returns @code{nil}.
@end defun
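
  For example, this sketch uses @code{maphash} to add up all the values
in a table.  The key is ignored, so the result does not depend on the
order in which the associations are visited.  The names @code{table}
and @code{total} are illustrative only.

@example
(let ((table (make-hash-table))
      (total 0))
  (puthash 'a 1 table)
  (puthash 'b 2 table)
  (maphash (lambda (_key value)
             (setq total (+ total value)))
           table)
  total)
     @result{} 3
@end example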

@node Defining Hash
@section Defining Hash Comparisons
@cindex hash code
@cindex define hash comparisons

  You can define new methods of key lookup by means of
@code{define-hash-table-test}.  In order to use this feature, you need
to understand how hash tables work, and what a @dfn{hash code} means.

  You can think of a hash table conceptually as a large array of many
slots, each capable of holding one association.  To look up a key,
@code{gethash} first computes an integer, the hash code, from the key.
It reduces this integer modulo the length of the array, to produce an
index in the array.  Then it looks in that slot, and if necessary in
other nearby slots, to see if it has found the key being sought.

  Thus, to define a new method of key lookup, you need to specify both a
function to compute the hash code from a key, and a function to compare
two keys directly.
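
  As a purely conceptual sketch of that description (the actual
implementation differs in its details), the slot index for a key could
be derived from its hash code as follows.  The slot count of 65 is an
arbitrary assumption made for illustration.

@example
(let ((nslots 65)                        ; hypothetical array length
      (code (sxhash-equal "some key")))  ; hash code of the key
  ;; With a positive divisor, mod returns a value from 0 up to
  ;; nslots - 1, so the result is a valid array index.
  (mod code nslots))
@end example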

@defun define-hash-table-test name test-fn hash-fn
This function defines a new hash table test, named @var{name}.

After defining @var{name} in this way, you can use it as the @var{test}
argument in @code{make-hash-table}.  When you do that, the hash table
will use @var{test-fn} to compare key values, and @var{hash-fn} to compute
a hash code from a key value.

The function @var{test-fn} should accept two arguments, two keys, and
return non-@code{nil} if they are considered the same.

The function @var{hash-fn} should accept one argument, a key, and return
an integer that is the hash code of that key.  For good results, the
function should use the whole range of integers for hash codes,
including negative integers.

The specified functions are stored in the property list of @var{name}
under the property @code{hash-table-test}; the property value's form is
@code{(@var{test-fn} @var{hash-fn})}.
@end defun

@defun sxhash-equal obj
This function returns a hash code for Lisp object @var{obj}.
This is an integer which reflects the contents of @var{obj}
and the other Lisp objects it points to.

If two objects @var{obj1} and @var{obj2} are @code{equal}, then
@code{(sxhash-equal @var{obj1})} and @code{(sxhash-equal @var{obj2})}
are the same integer.

If the two objects are not @code{equal}, the values returned by
@code{sxhash-equal} are usually different, but not always; once in a
rare while, by luck, you will encounter two distinct-looking objects
that give the same result from @code{sxhash-equal}.

@b{Common Lisp note:} In Common Lisp a similar function is called
@code{sxhash}.  Emacs provides this name as a compatibility alias for
@code{sxhash-equal}.
@end defun
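
  For instance, two separately constructed lists with the same contents
are @code{equal}, so their hash codes agree.  The list contents here
are arbitrary.

@example
(= (sxhash-equal (list 1 "two" 'three))
   (sxhash-equal (list 1 "two" 'three)))
     @result{} t
@end example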

@defun sxhash-eq obj
This function returns a hash code for Lisp object @var{obj}.  Its
result reflects the identity of @var{obj}, but not its contents.

If two objects @var{obj1} and @var{obj2} are @code{eq}, then
@code{(sxhash-eq @var{obj1})} and @code{(sxhash-eq @var{obj2})} are the
same integer.
@end defun

@defun sxhash-eql obj
This function returns a hash code for Lisp object @var{obj} suitable
for @code{eql} comparison.  That is, it reflects the identity of
@var{obj}, except when the object is a floating-point number, in which
case the hash code is generated from its value.

If two objects @var{obj1} and @var{obj2} are @code{eql}, then
@code{(sxhash-eql @var{obj1})} and @code{(sxhash-eql @var{obj2})} are
the same integer.
@end defun
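
  The guarantees of @code{sxhash-eq} and @code{sxhash-eql} can be
checked with a short sketch like the following.  The objects used are
arbitrary.

@example
;; The same object always has the same sxhash-eq code.
(let ((obj (list 1 2)))
  (= (sxhash-eq obj) (sxhash-eq obj)))
     @result{} t

;; Floats with the same value are eql, so their codes agree.
(= (sxhash-eql 1.5) (sxhash-eql 1.5))
     @result{} t
@end example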

  This example creates a hash table whose keys are strings that are
compared case-insensitively.

@example
(defun case-fold-string= (a b)
  (eq t (compare-strings a nil nil b nil nil t)))
(defun case-fold-string-hash (a)
  (sxhash-equal (upcase a)))

(define-hash-table-test 'case-fold
  'case-fold-string= 'case-fold-string-hash)

(make-hash-table :test 'case-fold)
@end example
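
  Assuming the definitions above have been evaluated, a table created
with the @code{case-fold} test treats strings that differ only in
letter case as the same key.  The key and value in this sketch are
illustrative only.

@example
(let ((table (make-hash-table :test 'case-fold)))
  (puthash "Emacs" 'editor table)
  (gethash "EMACS" table))
     @result{} editor
@end example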

  Here is how you could define a hash table test equivalent to the
predefined test value @code{equal}.  The keys can be any Lisp object,
and equal-looking objects are considered the same key.

@example
(define-hash-table-test 'contents-hash 'equal 'sxhash-equal)

(make-hash-table :test 'contents-hash)
@end example

@node Other Hash
@section Other Hash Table Functions

  Here are some other functions for working with hash tables.

@defun hash-table-p table
This returns non-@code{nil} if @var{table} is a hash table object.
@end defun

@defun copy-hash-table table
This function creates and returns a copy of @var{table}.  Only the table
itself is copied---the keys and values are shared.
@end defun
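
  The following sketch, with illustrative names, shows what sharing
means here.  Adding an association to the copy leaves the original
table unchanged, yet a value stored in both tables is the very same
object.

@example
(let ((orig (make-hash-table))
      copy)
  (puthash 'k (list 1 2) orig)
  (setq copy (copy-hash-table orig))
  (puthash 'extra t copy)             ; affects only the copy
  (list (hash-table-count orig)
        (hash-table-count copy)
        (eq (gethash 'k orig) (gethash 'k copy))))
     @result{} (1 2 t)
@end example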

@defun hash-table-count table
This function returns the actual number of entries in @var{table}.
@end defun

@defun hash-table-test table
This returns the @var{test} value that was given when @var{table} was
created, to specify how to hash and compare keys.  See
@code{make-hash-table} (@pxref{Creating Hash}).
@end defun

@defun hash-table-weakness table
This function returns the @var{weak} value that was specified for hash
table @var{table}.
@end defun

@defun hash-table-rehash-size table
This returns the rehash size of @var{table}.
@end defun

@defun hash-table-rehash-threshold table
This returns the rehash threshold of @var{table}.
@end defun

@defun hash-table-size table
This returns the current nominal size of @var{table}.
@end defun
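
  As a brief sketch of some of these functions, the following form
creates a table with explicit @code{:test} and @code{:weakness} values
and then reads those properties back.  The key and value are
illustrative only.

@example
(let ((table (make-hash-table :test 'eq :weakness 'key)))
  (puthash 'a 1 table)
  (list (hash-table-count table)
        (hash-table-test table)
        (hash-table-weakness table)))
     @result{} (1 eq key)
@end example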