author | Richard M. Stallman <rms@gnu.org> | 1994-03-21 06:42:21 +0000 |
---|---|---|
committer | Richard M. Stallman <rms@gnu.org> | 1994-03-21 06:42:21 +0000 |
commit | 5b35991872231fc8d9ac00d65972bca83edd874d | |
tree | e0e79e0e23a81dacd66850bcf93adbc6ed4cfc53 /lispref | |
parent | 9aab521af638669367337d8e7826e4185c83fd83 | |
Initial revision
Diffstat (limited to 'lispref')
-rw-r--r-- | lispref/objects.texi | 1468 |
1 files changed, 1468 insertions, 0 deletions
diff --git a/lispref/objects.texi b/lispref/objects.texi new file mode 100644 index 00000000000..ab2fe39a10c --- /dev/null +++ b/lispref/objects.texi @@ -0,0 +1,1468 @@ +@c -*-texinfo-*- +@c This is part of the GNU Emacs Lisp Reference Manual. +@c Copyright (C) 1990, 1991, 1992, 1993, 1994 Free Software Foundation, Inc. +@c See the file elisp.texi for copying conditions. +@setfilename ../info/objects +@node Types of Lisp Object, Numbers, Introduction, Top +@chapter Lisp Data Types +@cindex object +@cindex Lisp object +@cindex type +@cindex data type + + A Lisp @dfn{object} is a piece of data used and manipulated by Lisp +programs. For our purposes, a @dfn{type} or @dfn{data type} is a set of +possible objects. + + Every object belongs to at least one type. Objects of the same type +have similar structures and may usually be used in the same contexts. +Types can overlap, and objects can belong to two or more types. +Consequently, we can ask whether an object belongs to a particular type, +but not for ``the'' type of an object. + +@cindex primitive type + A few fundamental object types are built into Emacs. These, from +which all other types are constructed, are called @dfn{primitive +types}. Each object belongs to one and only one primitive type. These +types include @dfn{integer}, @dfn{float}, @dfn{cons}, @dfn{symbol}, +@dfn{string}, @dfn{vector}, @dfn{subr}, @dfn{byte-code function}, and +several special types, such as @dfn{buffer}, that are related to +editing. (@xref{Editing Types}.) + + Each primitive type has a corresponding Lisp function that checks +whether an object is a member of that type. + + Note that Lisp is unlike many other languages in that Lisp objects are +@dfn{self-typing}: the primitive type of the object is implicit in the +object itself. For example, if an object is a vector, nothing can treat +it as a number; Lisp knows it is a vector, not a number. 
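+
+  As a small illustration of self-typing, a type predicate can be
+applied to any object, whatever its type (@pxref{Type Predicates}).
+This is only a sketch; it assumes an interpreter in which
+@code{vectorp}, @code{numberp} and @code{integerp} are defined, as they
+are in GNU Emacs:
+
+@example
+(vectorp [1 2 3])       ; @r{The object is a vector,}
+     @result{} t
+(numberp [1 2 3])       ; @r{so it is not a number.}
+     @result{} nil
+(integerp 1)            ; @r{An integer is recognized as such.}
+     @result{} t
+@end example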
+ + In most languages, the programmer must declare the data type of each +variable, and the type is known by the compiler but not represented in +the data. Such type declarations do not exist in Emacs Lisp. A Lisp +variable can have any type of value, and remembers the type of any value +you store in it. + + This chapter describes the purpose, printed representation, and read +syntax of each of the standard types in GNU Emacs Lisp. Details on how +to use these types can be found in later chapters. + +@menu +* Printed Representation:: How Lisp objects are represented as text. +* Comments:: Comments and their formatting conventions. +* Programming Types:: Types found in all Lisp systems. +* Editing Types:: Types specific to Emacs. +* Type Predicates:: Tests related to types. +* Equality Predicates:: Tests of equality between any two objects. +@end menu + +@node Printed Representation +@comment node-name, next, previous, up +@section Printed Representation and Read Syntax +@cindex printed representation +@cindex read syntax + + The @dfn{printed representation} of an object is the format of the +output generated by the Lisp printer (the function @code{prin1}) for +that object. The @dfn{read syntax} of an object is the format of the +input accepted by the Lisp reader (the function @code{read}) for that +object. Most objects have more than one possible read syntax. Some +types of object have no read syntax; except for these cases, the printed +representation of an object is also a read syntax for it. + + In other languages, an expression is text; it has no other form. In +Lisp, an expression is primarily a Lisp object and only secondarily the +text that is the object's read syntax. Often there is no need to +emphasize this distinction, but you must keep it in the back of your +mind, or you will occasionally be very confused. + +@cindex hash notation + Every type has a printed representation. 
Some types have no read +syntax, since it may not make sense to enter objects of these types +directly in a Lisp program. For example, the buffer type does not have +a read syntax. Objects of these types are printed in @dfn{hash +notation}: the characters @samp{#<} followed by a descriptive string +(typically the type name followed by the name of the object), and closed +with a matching @samp{>}. Hash notation cannot be read at all, so the +Lisp reader signals the error @code{invalid-read-syntax} whenever it +encounters @samp{#<}. +@kindex invalid-read-syntax + +@example +(current-buffer) + @result{} #<buffer objects.texi> +@end example + + When you evaluate an expression interactively, the Lisp interpreter +first reads the textual representation of it, producing a Lisp object, +and then evaluates that object (@pxref{Evaluation}). However, +evaluation and reading are separate activities. Reading returns the +Lisp object represented by the text that is read; the object may or may +not be evaluated later. @xref{Input Functions}, for a description of +@code{read}, the basic function for reading objects. + +@node Comments +@comment node-name, next, previous, up +@section Comments +@cindex comments +@cindex @samp{;} in comment + + A @dfn{comment} is text that is written in a program only for the sake +of humans that read the program, and that has no effect on the meaning +of the program. In Lisp, a semicolon (@samp{;}) starts a comment if it +is not within a string or character constant. The comment continues to +the end of line. The Lisp reader discards comments; they do not become +part of the Lisp objects which represent the program within the Lisp +system. + + @xref{Comment Tips}, for conventions for formatting comments. + +@node Programming Types +@section Programming Types +@cindex programming types + + There are two general categories of types in Emacs Lisp: those having +to do with Lisp programming, and those having to do with editing. 
The +former exist in many Lisp implementations, in one form or another. The +latter are unique to Emacs Lisp. + +@menu +* Integer Type:: Numbers without fractional parts. +* Floating Point Type:: Numbers with fractional parts and with a large range. +* Character Type:: The representation of letters, numbers and + control characters. +* Sequence Type:: Both lists and arrays are classified as sequences. +* List Type:: Lists gave Lisp its name (not to mention reputation). +* Array Type:: Arrays include strings and vectors. +* String Type:: An (efficient) array of characters. +* Vector Type:: One-dimensional arrays. +* Symbol Type:: A multi-use object that refers to a function, + variable, property list, or itself. +* Lisp Function Type:: A piece of executable code you can call from elsewhere. +* Lisp Macro Type:: A method of expanding an expression into another + expression, more fundamental but less pretty. +* Primitive Function Type:: A function written in C, callable from Lisp. +* Byte-Code Type:: A function written in Lisp, then compiled. +* Autoload Type:: A type used for automatically loading seldom-used + functions. +@end menu + +@node Integer Type +@subsection Integer Type + + Integers were the only kind of number in Emacs version 18. The range +of values for integers is @minus{}8388608 to 8388607 (24 bits; i.e., +@ifinfo +-2**23 +@end ifinfo +@tex +$-2^{23}$ +@end tex +to +@ifinfo +2**23 - 1) +@end ifinfo +@tex +$2^{23}-1$) +@end tex +on most machines, but is 25 or 26 bits on some systems. It is important +to note that the Emacs Lisp arithmetic functions do not check for +overflow. Thus @code{(1+ 8388607)} is @minus{}8388608 on 24-bit +implementations.@refill + + The read syntax for numbers is a sequence of (base ten) digits with an +optional sign at the beginning and an optional period at the end. The +printed representation produced by the Lisp interpreter never has a +leading @samp{+} or a final @samp{.}. 
+ +@example +@group +-1 ; @r{The integer -1.} +1 ; @r{The integer 1.} +1. ; @r{Also the integer 1.} ++1 ; @r{Also the integer 1.} +16777217 ; @r{Also the integer 1!} + ; @r{ (on a 24-bit or 25-bit implementation)} +@end group +@end example + + @xref{Numbers}, for more information. + +@node Floating Point Type +@subsection Floating Point Type + + Emacs version 19 supports floating point numbers (though there is a +compilation option to disable them). The precise range of floating +point numbers is machine-specific. + + The printed representation for floating point numbers requires either +a decimal point (with at least one digit following), an exponent, or +both. For example, @samp{1500.0}, @samp{15e2}, @samp{15.0e2}, +@samp{1.5e3}, and @samp{.15e4} are five ways of writing a floating point +number whose value is 1500. They are all equivalent. + + @xref{Numbers}, for more information. + +@node Character Type +@subsection Character Type +@cindex @sc{ASCII} character codes + + A @dfn{character} in Emacs Lisp is nothing more than an integer. In +other words, characters are represented by their character codes. For +example, the character @kbd{A} is represented as the @w{integer 65}. + + Individual characters are not often used in programs. It is far more +common to work with @emph{strings}, which are sequences composed of +characters. @xref{String Type}. + + Characters in strings, buffers, and files are currently limited to the +range of 0 to 255---eight bits. If you store a larger integer into a +string, buffer or file, it is truncated to that range. Characters that +represent keyboard input have a much wider range. + +@cindex read syntax for characters +@cindex printed representation for characters +@cindex syntax for characters + Since characters are really integers, the printed representation of a +character is a decimal number. This is also a possible read syntax for +a character, but writing characters that way in Lisp programs is a very +bad idea. 
You should @emph{always} use the special read syntax formats +that Emacs Lisp provides for characters. These syntax formats start +with a question mark. + + The usual read syntax for alphanumeric characters is a question mark +followed by the character; thus, @samp{?A} for the character +@kbd{A}, @samp{?B} for the character @kbd{B}, and @samp{?a} for the +character @kbd{a}. + + For example: + +@example +?Q @result{} 81 ?q @result{} 113 +@end example + + You can use the same syntax for punctuation characters, but it is +often a good idea to add a @samp{\} to prevent Lisp mode from getting +confused. For example, @samp{?\ } is the way to write the space +character. If the character is @samp{\}, you @emph{must} use a second +@samp{\} to quote it: @samp{?\\}. + +@cindex whitespace +@cindex bell character +@cindex @samp{\a} +@cindex backspace +@cindex @samp{\b} +@cindex tab +@cindex @samp{\t} +@cindex vertical tab +@cindex @samp{\v} +@cindex formfeed +@cindex @samp{\f} +@cindex newline +@cindex @samp{\n} +@cindex return +@cindex @samp{\r} +@cindex escape +@cindex @samp{\e} + You can express the characters Control-g, backspace, tab, newline, +vertical tab, formfeed, return, and escape as @samp{?\a}, @samp{?\b}, +@samp{?\t}, @samp{?\n}, @samp{?\v}, @samp{?\f}, @samp{?\r}, @samp{?\e}, +respectively. Those values are 7, 8, 9, 10, 11, 12, 13, and 27 in +decimal. 
Thus, + +@example +?\a @result{} 7 ; @r{@kbd{C-g}} +?\b @result{} 8 ; @r{backspace, @key{BS}, @kbd{C-h}} +?\t @result{} 9 ; @r{tab, @key{TAB}, @kbd{C-i}} +?\n @result{} 10 ; @r{newline, @key{LFD}, @kbd{C-j}} +?\v @result{} 11 ; @r{vertical tab, @kbd{C-k}} +?\f @result{} 12 ; @r{formfeed character, @kbd{C-l}} +?\r @result{} 13 ; @r{carriage return, @key{RET}, @kbd{C-m}} +?\e @result{} 27 ; @r{escape character, @key{ESC}, @kbd{C-[}} +?\\ @result{} 92 ; @r{backslash character, @kbd{\}} +@end example + +@cindex escape sequence + These sequences which start with backslash are also known as +@dfn{escape sequences}, because backslash plays the role of an escape +character; this usage has nothing to do with the character @key{ESC}. + +@cindex control characters + Control characters may be represented using yet another read syntax. +This consists of a question mark followed by a backslash, caret, and the +corresponding non-control character, in either upper or lower case. For +example, both @samp{?\^I} and @samp{?\^i} are valid read syntax for the +character @kbd{C-i}, the character whose value is 9. + + Instead of the @samp{^}, you can use @samp{C-}; thus, @samp{?\C-i} is +equivalent to @samp{?\^I} and to @samp{?\^i}: + +@example +?\^I @result{} 9 ?\C-I @result{} 9 +@end example + + For use in strings and buffers, you are limited to the control +characters that exist in @sc{ASCII}, but for keyboard input purposes, +you can turn any character into a control character with @samp{C-}. The +character codes for these non-@sc{ASCII} control characters include the +2**22 bit as well as the code for the corresponding non-control +character. Ordinary terminals have no way of generating non-@sc{ASCII} +control characters, but you can generate them straightforwardly using an +X terminal. + + You can think of the @key{DEL} character as @kbd{Control-?}: + +@example +?\^? @result{} 127 ?\C-? 
@result{} 127 +@end example + + For representing control characters to be found in files or strings, +we recommend the @samp{^} syntax; for control characters in keyboard +input, we prefer the @samp{C-} syntax. This does not affect the meaning +of the program, but may guide the understanding of people who read it. + +@cindex meta characters + A @dfn{meta character} is a character typed with the @key{META} +modifier key. The integer that represents such a character has the +2**23 bit set (which on most machines makes it a negative number). We +use high bits for this and other modifiers to make possible a wide range +of basic character codes. + + In a string, the 2**7 bit indicates a meta character, so the meta +characters that can fit in a string have codes in the range from 128 to +255, and are the meta versions of the ordinary @sc{ASCII} characters. +(In Emacs versions 18 and older, this convention was used for characters +outside of strings as well.) + + The read syntax for meta characters uses @samp{\M-}. For example, +@samp{?\M-A} stands for @kbd{M-A}. You can use @samp{\M-} together with +octal codes, @samp{\C-}, or any other syntax for a character. Thus, you +can write @kbd{M-A} as @samp{?\M-A}, or as @samp{?\M-\101}. Likewise, +you can write @kbd{C-M-b} as @samp{?\M-\C-b}, @samp{?\C-\M-b}, or +@samp{?\M-\002}. + + The case of an ordinary letter is indicated by its character code as +part of @sc{ASCII}, but @sc{ASCII} has no way to represent whether a +control character is upper case or lower case. Emacs uses the 2**21 bit +to indicate that the shift key was used for typing a control character. +This distinction is possible only when you use X terminals or other +special terminals; ordinary terminals do not indicate the distinction to +the computer in any way. 
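+
+  The equivalences among these read syntaxes can be checked by
+comparing character codes with @code{=}.  The following is a sketch
+only; it deliberately shows no numeric codes for meta characters,
+since those depend on how the modifier bits are assigned:
+
+@example
+(= ?\^I ?\C-i)          ; @r{Two syntaxes for @kbd{C-i}.}
+     @result{} t
+(= ?\C-i 9)             ; @r{@kbd{C-i} is the character whose value is 9.}
+     @result{} t
+(= ?\M-A ?\M-\101)      ; @r{@samp{\M-} combined with an octal code.}
+     @result{} t
+(= ?\M-\C-b ?\C-\M-b)   ; @r{The modifiers may be written in either order.}
+     @result{} t
+@end example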
+ +@cindex hyper characters +@cindex super characters +@cindex alt characters + The X Window System defines three other modifier bits that can be set +in a character: @dfn{hyper}, @dfn{super} and @dfn{alt}. The syntaxes +for these bits are @samp{\H-}, @samp{\s-} and @samp{\A-}. Thus, +@samp{?\H-\M-\A-x} represents @kbd{Alt-Hyper-Meta-x}. Numerically, the +bit values are 2**18 for alt, 2**19 for super and 2**20 for hyper. + +@cindex @samp{?} in character constant +@cindex question mark in character constant +@cindex @samp{\} in character constant +@cindex backslash in character constant +@cindex octal character code + Finally, the most general read syntax consists of a question mark +followed by a backslash and the character code in octal (up to three +octal digits); thus, @samp{?\101} for the character @kbd{A}, +@samp{?\001} for the character @kbd{C-a}, and @code{?\002} for the +character @kbd{C-b}. Although this syntax can represent any @sc{ASCII} +character, it is preferred only when the precise octal value is more +important than the @sc{ASCII} representation. + +@example +@group +?\012 @result{} 10 ?\n @result{} 10 ?\C-j @result{} 10 +?\101 @result{} 65 ?A @result{} 65 +@end group +@end example + + A backslash is allowed, and harmless, preceding any character without +a special escape meaning; thus, @samp{?\+} is equivalent to @samp{?+}. +There is no reason to add a backslash before most characters. However, +you should add a backslash before any of the characters +@samp{()\|;'`"#.,} to avoid confusing the Emacs commands for editing +Lisp code. Also add a backslash before whitespace characters such as +space, tab, newline and formfeed. However, it is cleaner to use one of +the easily readable escape sequences, such as @samp{\t}, instead of an +actual whitespace character such as a tab. + +@node Sequence Type +@subsection Sequence Types + + A @dfn{sequence} is a Lisp object that represents an ordered set of +elements. 
There are two kinds of sequence in Emacs Lisp, lists and +arrays. Thus, an object of type list or of type array is also +considered a sequence. + + Arrays are further subdivided into strings and vectors. Vectors can +hold elements of any type, but string elements must be characters in the +range from 0 to 255. However, the characters in a string can have text +properties; vectors do not support text properties even when their +elements happen to be characters. + + Lists, strings and vectors are different, but they have important +similarities. For example, all have a length @var{l}, and all have +elements which can be indexed from zero to @var{l} minus one. Also, +several functions, called sequence functions, accept any kind of +sequence. For example, the function @code{elt} can be used to extract +an element of a sequence, given its index. @xref{Sequences Arrays +Vectors}. + + It is impossible to read the same sequence twice, since sequences are +always created anew upon reading. If you read the read syntax for a +sequence twice, you get two sequences with equal contents. There is one +exception: the empty list @code{()} always stands for the same object, +@code{nil}. + +@node List Type +@subsection List Type +@cindex address field of register +@cindex decrement field of register + + A @dfn{list} is a series of cons cells, linked together. A @dfn{cons +cell} is an object comprising two pointers named the @sc{car} and the +@sc{cdr}. Each of them can point to any Lisp object, but when the cons +cell is part of a list, the @sc{cdr} points either to another cons cell +or to the empty list. @xref{Lists}, for functions that work on lists. + + The names @sc{car} and @sc{cdr} have only historical meaning now. 
The +original Lisp implementation ran on an @w{IBM 704} computer which +divided words into two parts, called the ``address'' part and the +``decrement''; @sc{car} was an instruction to extract the contents of +the address part of a register, and @sc{cdr} an instruction to extract +the contents of the decrement. By contrast, ``cons cells'' are named +for the function @code{cons} that creates them, which in turn is named +for its purpose, the construction of cells. + +@cindex atom + Because cons cells are so central to Lisp, we also have a word for +``an object which is not a cons cell''. These objects are called +@dfn{atoms}. + +@cindex parenthesis + The read syntax and printed representation for lists are identical, and +consist of a left parenthesis, an arbitrary number of elements, and a +right parenthesis. + + Upon reading, each object inside the parentheses becomes an element +of the list. That is, a cons cell is made for each element. The +@sc{car} of the cons cell points to the element, and its @sc{cdr} points +to the next cons cell which holds the next element in the list. The +@sc{cdr} of the last cons cell is set to point to @code{nil}. + +@cindex box diagrams, for lists +@cindex diagrams, boxed, for lists + A list can be illustrated by a diagram in which the cons cells are +shown as pairs of boxes. (The Lisp reader cannot read such an +illustration; unlike the textual notation, which can be understood by both +humans and computers, the box illustrations can only be understood by +humans.) The following represents the three-element list @code{(rose +violet buttercup)}: + +@example +@group + ___ ___ ___ ___ ___ ___ + |___|___|--> |___|___|--> |___|___|--> nil + | | | + | | | + --> rose --> violet --> buttercup +@end group +@end example + + In this diagram, each box represents a slot that can refer to any Lisp +object. Each pair of boxes represents a cons cell. Each arrow is a +reference to a Lisp object, either an atom or another cons cell. 
+ + In this example, the first box, the @sc{car} of the first cons cell, +refers to or ``contains'' @code{rose} (a symbol). The second box, the +@sc{cdr} of the first cons cell, refers to the next pair of boxes, the +second cons cell. The @sc{car} of the second cons cell refers to +@code{violet} and the @sc{cdr} refers to the third cons cell. The +@sc{cdr} of the third (and last) cons cell refers to @code{nil}. + +Here is another diagram of the same list, @code{(rose violet +buttercup)}, sketched in a different manner: + +@smallexample +@group + --------------- ---------------- ------------------- +| car | cdr | | car | cdr | | car | cdr | +| rose | o-------->| violet | o-------->| buttercup | nil | +| | | | | | | | | + --------------- ---------------- ------------------- +@end group +@end smallexample + +@cindex @samp{(@dots{})} in lists +@cindex @code{nil} in lists +@cindex empty list + A list with no elements in it is the @dfn{empty list}; it is identical +to the symbol @code{nil}. In other words, @code{nil} is both a symbol +and a list. + + Here are examples of lists written in Lisp syntax: + +@example +(A 2 "A") ; @r{A list of three elements.} +() ; @r{A list of no elements (the empty list).} +nil ; @r{A list of no elements (the empty list).} +("A ()") ; @r{A list of one element: the string @code{"A ()"}.} +(A ()) ; @r{A list of two elements: @code{A} and the empty list.} +(A nil) ; @r{Equivalent to the previous.} +((A B C)) ; @r{A list of one element} + ; @r{(which is a list of three elements).} +@end example + + Here is the list @code{(A ())}, or equivalently @code{(A nil)}, +depicted with boxes and arrows: + +@example +@group + ___ ___ ___ ___ + |___|___|--> |___|___|--> nil + | | + | | + --> A --> nil +@end group +@end example + +@menu +* Dotted Pair Notation:: An alternative syntax for lists. +* Association List Type:: A specially constructed list. 
+@end menu + +@node Dotted Pair Notation +@comment node-name, next, previous, up +@subsubsection Dotted Pair Notation +@cindex dotted pair notation +@cindex @samp{.} in lists + + @dfn{Dotted pair notation} is an alternative syntax for cons cells +that represents the @sc{car} and @sc{cdr} explicitly. In this syntax, +@code{(@var{a} .@: @var{b})} stands for a cons cell whose @sc{car} is +the object @var{a}, and whose @sc{cdr} is the object @var{b}. Dotted +pair notation is therefore more general than list syntax. In the dotted +pair notation, the list @samp{(1 2 3)} is written as @samp{(1 . (2 . (3 +. nil)))}. For @code{nil}-terminated lists, the two notations produce +the same result, but list notation is usually clearer and more +convenient when it is applicable. When printing a list, the dotted pair +notation is only used if the @sc{cdr} of a cell is not a list. + + Here's how box notation can illustrate dotted pairs. This example +shows the pair @code{(rose . violet)}: + +@example +@group + ___ ___ + |___|___|--> violet + | + | + --> rose +@end group +@end example + + Dotted pair notation can be combined with list notation to represent a +chain of cons cells with a non-@code{nil} final @sc{cdr}. For example, +@code{(rose violet . buttercup)} is equivalent to @code{(rose . (violet +. buttercup))}. The object looks like this: + +@example +@group + ___ ___ ___ ___ + |___|___|--> |___|___|--> buttercup + | | + | | + --> rose --> violet +@end group +@end example + + These diagrams make it evident why @w{@code{(rose .@: violet .@: +buttercup)}} is invalid syntax; it would require a cons cell that has +three parts rather than two. + + The list @code{(rose violet)} is equivalent to @code{(rose . (violet))} +and looks like this: + +@example +@group + ___ ___ ___ ___ + |___|___|--> |___|___|--> nil + | | + | | + --> rose --> violet +@end group +@end example + + Similarly, the three-element list @code{(rose violet buttercup)} +is equivalent to @code{(rose . (violet . 
(buttercup)))}. +@ifinfo +It looks like this: + +@example +@group + ___ ___ ___ ___ ___ ___ + |___|___|--> |___|___|--> |___|___|--> nil + | | | + | | | + --> rose --> violet --> buttercup +@end group +@end example +@end ifinfo + +@node Association List Type +@comment node-name, next, previous, up +@subsubsection Association List Type + + An @dfn{association list} or @dfn{alist} is a specially-constructed +list whose elements are cons cells. In each element, the @sc{car} is +considered a @dfn{key}, and the @sc{cdr} is considered an +@dfn{associated value}. (In some cases, the associated value is stored +in the @sc{car} of the @sc{cdr}.) Association lists are often used as +stacks, since it is easy to add or remove associations at the front of +the list. + + For example, + +@example +(setq alist-of-colors + '((rose . red) (lily . white) (buttercup . yellow))) +@end example + +@noindent +sets the variable @code{alist-of-colors} to an alist of three elements. In the +first element, @code{rose} is the key and @code{red} is the value. + + @xref{Association Lists}, for a further explanation of alists and for +functions that work on alists. + +@node Array Type +@subsection Array Type + + An @dfn{array} is composed of an arbitrary number of slots for +referring to other Lisp objects, arranged in a contiguous block of +memory. Accessing any element of an array takes the same amount of +time. In contrast, accessing an element of a list requires time +proportional to the position of the element in the list. (Elements at +the end of a list take longer to access than elements at the beginning +of a list.) + + Emacs defines two types of array, strings and vectors. A string is an +array of characters and a vector is an array of arbitrary objects. Both +are one-dimensional. (Most other programming languages support +multidimensional arrays, but they are not essential; you can get the +same effect with an array of arrays.) 
Each type of array has its own +read syntax; see @ref{String Type}, and @ref{Vector Type}. + + An array may have any length up to the largest integer; but once +created, it has a fixed size. The first element of an array has index +zero, the second element has index 1, and so on. This is called +@dfn{zero-origin} indexing. For example, an array of four elements has +indices 0, 1, 2, @w{and 3}. + + The array type is contained in the sequence type and contains both the +string type and the vector type. + +@node String Type +@subsection String Type + + A @dfn{string} is an array of characters. Strings are used for many +purposes in Emacs, as can be expected in a text editor; for example, as +the names of Lisp symbols, as messages for the user, and to represent +text extracted from buffers. Strings in Lisp are constants: evaluation +of a string returns the same string. + +@cindex @samp{"} in strings +@cindex double-quote in strings +@cindex @samp{\} in strings +@cindex backslash in strings + The read syntax for strings is a double-quote, an arbitrary number of +characters, and another double-quote, @code{"like this"}. The Lisp +reader accepts the same formats for reading the characters of a string +as it does for reading single characters (without the question mark that +begins a character literal). You can enter a nonprinting character such +as tab, @kbd{C-a} or @kbd{M-C-A} using the convenient escape sequences, +like this: @code{"\t, \C-a, \M-\C-a"}. You can include a double-quote +in a string by preceding it with a backslash; thus, @code{"\""} is a +string containing just a single double-quote character. +(@xref{Character Type}, for a description of the read syntax for +characters.) + + If you use the @samp{\M-} syntax to indicate a meta character in a +string constant, this sets the 2**7 bit of the character in the string. +This is not the same representation that the meta modifier has in a +character on its own (not inside a string). @xref{Character Type}. 
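+
+  To illustrate the escape sequences in string constants described
+above, here is a brief sketch.  It assumes only the standard sequence
+functions @code{length} and @code{aref}; each escape sequence
+contributes a single character to the string:
+
+@example
+(length "\"")           ; @r{One character: the double-quote.}
+     @result{} 1
+(aref "\t" 0)           ; @r{The tab character, code 9.}
+     @result{} 9
+(aref "A" 0)            ; @r{String elements are character codes.}
+     @result{} 65
+@end example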
+ + Strings cannot hold characters that have the hyper, super or alt +modifiers; they can hold @sc{ASCII} control characters, but no others. +They do not distinguish case in @sc{ASCII} control characters. + + In contrast with the C programming language, Emacs Lisp allows +newlines in string literals. But an escaped newline---one that is +preceded by @samp{\}---does not become part of the string; i.e., the +Lisp reader ignores an escaped newline in a string literal. +@cindex newline in strings + +@example +"It is useful to include newlines +in documentation strings, +but the newline is \ +ignored if escaped." + @result{} "It is useful to include newlines +in documentation strings, +but the newline is ignored if escaped." +@end example + + The printed representation of a string consists of a double-quote, the +characters it contains, and another double-quote. However, any +backslash or double-quote characters in the string are preceded with a +backslash like this: @code{"this \" is an embedded quote"}. + + A string can hold properties of the text it contains, in addition to +the characters themselves. This enables programs that copy text between +strings and buffers to preserve the properties with no special effort. +@xref{Text Properties}. Strings with text properties have a special +read and print syntax: + +@example +#("@var{characters}" @var{property-data}...) +@end example + +@noindent +where @var{property-data} consists of zero or more elements, in groups +of three as follows: + +@example +@var{beg} @var{end} @var{plist} +@end example + +@noindent +The elements @var{beg} and @var{end} are integers, and together specify +a range of indices in the string; @var{plist} is the property list for +that range. + + @xref{Strings and Characters}, for functions that work on strings. + +@node Vector Type +@subsection Vector Type + + A @dfn{vector} is a one-dimensional array of elements of any type. It +takes a constant amount of time to access any element of a vector. 
(In +a list, the access time of an element is proportional to the distance of +the element from the beginning of the list.) + + The printed representation of a vector consists of a left square +bracket, the elements, and a right square bracket. This is also the +read syntax. Like numbers and strings, vectors are considered constants +for evaluation. + +@example +[1 "two" (three)] ; @r{A vector of three elements.} + @result{} [1 "two" (three)] +@end example + + @xref{Vectors}, for functions that work with vectors. + +@node Symbol Type +@subsection Symbol Type + + A @dfn{symbol} in GNU Emacs Lisp is an object with a name. The symbol +name serves as the printed representation of the symbol. In ordinary +use, the name is unique---no two symbols have the same name. + + A symbol can serve as a variable, as a function name, or to hold a +property list. Or it may serve only to be distinct from all other Lisp +objects, so that its presence in a data structure may be recognized +reliably. In a given context, usually only one of these uses is +intended. But you can use one symbol in all of these ways, +independently. + +@cindex @samp{\} in symbols +@cindex backslash in symbols + A symbol name can contain any characters whatever. Most symbol names +are written with letters, digits, and the punctuation characters +@samp{-+=*/}. Such names require no special punctuation; the characters +of the name suffice as long as the name does not look like a number. +(If it does, write a @samp{\} at the beginning of the name to force +interpretation as a symbol.) The characters @samp{_~!@@$%^&:<>@{@}} are +less often used but also require no special punctuation. Any other +characters may be included in a symbol's name by escaping them with a +backslash. In contrast to its use in strings, however, a backslash in +the name of a symbol quotes the single character that follows the +backslash, without conversion. 
For example, in a string, @samp{\t} +represents a tab character; in the name of a symbol, however, @samp{\t} +merely quotes the letter @kbd{t}. To have a symbol with a tab character +in its name, you must actually use a tab (preceded with a backslash). +But it's rare to do such a thing. + +@cindex CL note---case of letters +@quotation +@b{Common Lisp note:} in Common Lisp, lower case letters are always +``folded'' to upper case, unless they are explicitly escaped. This is +in contrast to Emacs Lisp, in which upper case and lower case letters +are distinct. +@end quotation + + Here are several examples of symbol names. Note that the @samp{+} in +the fifth example is escaped to prevent it from being read as a number. +This is not necessary in the last example because the rest of the name +makes it invalid as a number. + +@example +@group +foo ; @r{A symbol named @samp{foo}.} +FOO ; @r{A symbol named @samp{FOO}, different from @samp{foo}.} +char-to-string ; @r{A symbol named @samp{char-to-string}.} +@end group +@group +1+ ; @r{A symbol named @samp{1+}} + ; @r{(not @samp{+1}, which is an integer).} +@end group +@group +\+1 ; @r{A symbol named @samp{+1}} + ; @r{(not a very readable name).} +@end group +@group +\(*\ 1\ 2\) ; @r{A symbol named @samp{(* 1 2)} (a worse name).} +@c the @'s in this next line use up three characters, hence the +@c apparent misalignment of the comment. ++-*/_~!@@$%^&=:<>@{@} ; @r{A symbol named @samp{+-*/_~!@@$%^&=:<>@{@}}.} + ; @r{These characters need not be escaped.} +@end group +@end example + +@node Lisp Function Type +@subsection Lisp Function Type + + Just as functions in other programming languages are executable, +@dfn{Lisp function} objects are pieces of executable code. However, +functions in Lisp are primarily Lisp objects, and only secondarily the +text which represents them. These Lisp objects are lambda expressions: +lists whose first element is the symbol @code{lambda} (@pxref{Lambda +Expressions}). 
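+
+  As a small illustration (an editorial sketch rather than part of the
+original text), quoting a lambda expression yields an ordinary list whose
+first element is the symbol @code{lambda}, and the primitive
+@code{funcall} can call that list as a function:
+
+@example
+@group
+(car '(lambda (x) (* x 2)))
+     @result{} lambda
+(funcall '(lambda (x) (* x 2)) 21)
+     @result{} 42
+@end group
+@end example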
+ + In most programming languages, it is impossible to have a function +without a name. In Lisp, a function has no intrinsic name. A lambda +expression is also called an @dfn{anonymous function} (@pxref{Anonymous +Functions}). A named function in Lisp is actually a symbol with a valid +function in its function cell (@pxref{Defining Functions}). + + Most of the time, functions are called when their names are written in +Lisp expressions in Lisp programs. However, you can construct or obtain +a function object at run time and then call it with the primitive +functions @code{funcall} and @code{apply}. @xref{Calling Functions}. + +@node Lisp Macro Type +@subsection Lisp Macro Type + + A @dfn{Lisp macro} is a user-defined construct that extends the Lisp +language. It is represented as an object much like a function, but with +different parameter-passing semantics. A Lisp macro has the form of a +list whose first element is the symbol @code{macro} and whose @sc{cdr} +is a Lisp function object, including the @code{lambda} symbol. + + Lisp macro objects are usually defined with the built-in +@code{defmacro} special form, but any list that begins with @code{macro} is +a macro as far as Emacs is concerned. @xref{Macros}, for an explanation +of how to write a macro. + +@node Primitive Function Type +@subsection Primitive Function Type +@cindex special forms + + A @dfn{primitive function} is a function callable from Lisp but +written in the C programming language. Primitive functions are also +called @dfn{subrs} or @dfn{built-in functions}. (The word ``subr'' is +derived from ``subroutine''.) Most primitive functions evaluate all +their arguments when they are called. A primitive function that does +not evaluate all its arguments is called a @dfn{special form} +(@pxref{Special Forms}).@refill + + It does not matter to the caller of a function whether the function is +primitive. 
However, this does matter if you try to substitute a +function written in Lisp for a primitive of the same name. The reason +is that the primitive function may be called directly from C code. +Calls to the redefined function from Lisp will use the new definition, +but calls from C code may still use the built-in definition. + + The term @dfn{function} refers to all Emacs functions, whether written +in Lisp or C. @xref{Lisp Function Type}, for information about the +functions written in Lisp.@refill + + Primitive functions have no read syntax and print in hash notation +with the name of the subroutine. + +@example +@group +(symbol-function 'car) ; @r{Access the function cell} + ; @r{of the symbol.} + @result{} #<subr car> +(subrp (symbol-function 'car)) ; @r{Is this a primitive function?} + @result{} t ; @r{Yes.} +@end group +@end example + +@node Byte-Code Type +@subsection Byte-Code Function Type + +The byte compiler produces @dfn{byte-code function objects}. +Internally, a byte-code function object is much like a vector; however, +the evaluator handles this data type specially when it appears as a +function to be called. @xref{Byte Compilation}, for information about +the byte compiler. + +The printed representation for a byte-code function object is like that +for a vector, with an additional @samp{#} before the opening @samp{[}. + +@node Autoload Type +@subsection Autoload Type + + An @dfn{autoload object} is a list whose first element is the symbol +@code{autoload}. It is stored as the function definition of a symbol as +a placeholder for the real definition; it says that the real definition +is found in a file of Lisp code that should be loaded when necessary. +The autoload object contains the name of the file, plus some other +information about the real definition. + + After the file has been loaded, the symbol should have a new function +definition that is not an autoload object. The new definition is then +called as if it had been there to begin with. 
From the user's point of +view, the function call works as expected, using the function definition +in the loaded file. + + An autoload object is usually created with the function +@code{autoload}, which stores the object in the function cell of a +symbol. @xref{Autoload}, for more details. + +@node Editing Types +@section Editing Types +@cindex editing types + + The types in the previous section are common to many Lisp dialects. +Emacs Lisp provides several additional data types for purposes connected +with editing. + +@menu +* Buffer Type:: The basic object of editing. +* Marker Type:: A position in a buffer. +* Window Type:: Buffers are displayed in windows. +* Frame Type:: Windows subdivide frames. +* Window Configuration Type:: Recording the way a frame is subdivided. +* Process Type:: A process running on the underlying OS. +* Stream Type:: Receive or send characters. +* Keymap Type:: What function a keystroke invokes. +* Syntax Table Type:: What a character means. +* Display Table Type:: How display tables are represented. +* Overlay Type:: How an overlay is represented. +@end menu + +@node Buffer Type +@subsection Buffer Type + + A @dfn{buffer} is an object that holds text that can be edited +(@pxref{Buffers}). Most buffers hold the contents of a disk file +(@pxref{Files}) so they can be edited, but some are used for other +purposes. Most buffers are also meant to be seen by the user, and +therefore displayed, at some time, in a window (@pxref{Windows}). But a +buffer need not be displayed in any window. + + The contents of a buffer are much like a string, but buffers are not +used like strings in Emacs Lisp, and the available operations are +different. For example, insertion of text into a buffer is very +efficient, whereas ``inserting'' text into a string requires +concatenating substrings, and the result is an entirely new string +object. + + Each buffer has a designated position called @dfn{point} +(@pxref{Positions}). 
At any time, one buffer is the @dfn{current +buffer}. Most editing commands act on the contents of the current +buffer in the neighborhood of point. Many other functions manipulate or +test the characters in the current buffer; a whole chapter in this +manual is devoted to describing these functions (@pxref{Text}). + + Several other data structures are associated with each buffer: + +@itemize @bullet +@item +a local syntax table (@pxref{Syntax Tables}); + +@item +a local keymap (@pxref{Keymaps}); and, + +@item +a local variable binding list (@pxref{Buffer-Local Variables}). +@end itemize + +@noindent +The local keymap and variable list contain entries which individually +override global bindings or values. These are used to customize the +behavior of programs in different buffers, without actually changing the +programs. + + Buffers have no read syntax. They print in hash notation with the +buffer name. + +@example +@group +(current-buffer) + @result{} #<buffer objects.texi> +@end group +@end example + +@node Marker Type +@subsection Marker Type + + A @dfn{marker} denotes a position in a specific buffer. Markers +therefore have two components: one for the buffer, and one for the +position. Changes in the buffer's text automatically relocate the +position value as necessary to ensure that the marker always points +between the same two characters in the buffer. + + Markers have no read syntax. They print in hash notation, giving the +current character position and the name of the buffer. + +@example +@group +(point-marker) + @result{} #<marker at 10779 in objects.texi> +@end group +@end example + +@xref{Markers}, for information on how to test, create, copy, and move +markers. + +@node Window Type +@subsection Window Type + + A @dfn{window} describes the portion of the terminal screen that Emacs +uses to display a buffer. Every window has one associated buffer, whose +contents appear in the window. 
By contrast, a given buffer may appear +in one window, no window, or several windows. + + Though many windows may exist simultaneously, at any time one window +is designated the @dfn{selected window}. This is the window where the +cursor is (usually) displayed when Emacs is ready for a command. The +selected window usually displays the current buffer, but this is not +necessarily the case. + + Windows are grouped on the screen into frames; each window belongs to +one and only one frame. @xref{Frame Type}. + + Windows have no read syntax. They print in hash notation, giving the +window number and the name of the buffer being displayed. The window +numbers exist to identify windows uniquely, since the buffer displayed +in any given window can change frequently. + +@example +@group +(selected-window) + @result{} #<window 1 on objects.texi> +@end group +@end example + + @xref{Windows}, for a description of the functions that work on windows. + +@node Frame Type +@subsection Frame Type + + A @dfn{frame} is a rectangle on the screen that contains one or more +Emacs windows. A frame initially contains a single main window (plus +perhaps a minibuffer window) which you can subdivide vertically or +horizontally into smaller windows. + + Frames have no read syntax. They print in hash notation, giving the +frame's title, plus its address in core (useful to identify the frame +uniquely). + +@example +@group +(selected-frame) + @result{} #<frame xemacs@@mole.gnu.ai.mit.edu 0xdac80> +@end group +@end example + + @xref{Frames}, for a description of the functions that work on frames. + +@node Window Configuration Type +@subsection Window Configuration Type +@cindex screen layout + + A @dfn{window configuration} stores information about the positions, +sizes, and contents of the windows in a frame, so you can recreate the +same arrangement of windows later. + + Window configurations do not have a read syntax. They print as +@samp{#<window-configuration>}. 
@xref{Window Configurations}, for a +description of several functions related to window configurations. + +@node Process Type +@subsection Process Type + + The word @dfn{process} usually means a running program. Emacs itself +runs in a process of this sort. However, in Emacs Lisp, a process is a +Lisp object that designates a subprocess created by the Emacs process. +Programs such as shells, GDB, ftp, and compilers, running in +subprocesses of Emacs, extend the capabilities of Emacs. + + An Emacs subprocess takes textual input from Emacs and returns textual +output to Emacs for further manipulation. Emacs can also send signals +to the subprocess. + + Process objects have no read syntax. They print in hash notation, +giving the name of the process: + +@example +@group +(process-list) + @result{} (#<process shell>) +@end group +@end example + +@xref{Processes}, for information about functions that create, delete, +return information about, send input or signals to, and receive output +from processes. + +@node Stream Type +@subsection Stream Type + + A @dfn{stream} is an object that can be used as a source or sink for +characters---either to supply characters for input or to accept them as +output. Many different types can be used this way: markers, buffers, +strings, and functions. Most often, input streams (character sources) +obtain characters from the keyboard, a buffer, or a file, and output +streams (character sinks) send characters to a buffer, such as a +@file{*Help*} buffer, or to the echo area. + + The object @code{nil}, in addition to its other meanings, may be used +as a stream. It stands for the value of the variable +@code{standard-input} or @code{standard-output}. Also, the object +@code{t} as a stream specifies input using the minibuffer +(@pxref{Minibuffers}) or output in the echo area (@pxref{The Echo +Area}). + + Streams have no special printed representation or read syntax, and +print as whatever primitive type they are. 
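+
+  As a brief sketch of these conventions (an editorial illustration; the
+symbol @code{foo} and the sample text are arbitrary), a string can act as
+an input stream for @code{read}, and @code{t} can act as an output stream
+for @code{prin1}:
+
+@example
+@group
+(read "(a b) c")         ; @r{Read one object from a string.}
+     @result{} (a b)
+(prin1 'foo t)           ; @r{Display @samp{foo} in the echo area.}
+     @result{} foo
+@end group
+@end example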
+ + @xref{Streams}, for a description of various functions related to +streams, including various parsing and printing functions. + +@node Keymap Type +@subsection Keymap Type + + A @dfn{keymap} maps keys typed by the user to commands. This mapping +controls how the user's command input is executed. A keymap is actually +a list whose @sc{car} is the symbol @code{keymap}. + + @xref{Keymaps}, for information about creating keymaps, handling prefix +keys, local as well as global keymaps, and changing key bindings. + +@node Syntax Table Type +@subsection Syntax Table Type + + A @dfn{syntax table} is a vector of 256 integers. Each element of the +vector defines how one character is interpreted when it appears in a +buffer. For example, in C mode (@pxref{Major Modes}), the @samp{+} +character is punctuation, but in Lisp mode it is a valid character in a +symbol. These modes specify different interpretations by changing the +syntax table entry for @samp{+}, at index 43 in the syntax table. + + Syntax tables are only used for scanning text in buffers, not for +reading Lisp expressions. The table the Lisp interpreter uses to read +expressions is built into the Emacs source code and cannot be changed; +thus, to change the list delimiters to be @samp{@{} and @samp{@}} +instead of @samp{(} and @samp{)} would be impossible. + + @xref{Syntax Tables}, for details about syntax classes and how to make +and modify syntax tables. + +@node Display Table Type +@subsection Display Table Type + + A @dfn{display table} specifies how to display each character code. +Each buffer and each window can have its own display table. A display +table is actually a vector of length 261. @xref{Display Tables}. + +@node Overlay Type +@subsection Overlay Type + + An @dfn{overlay} specifies temporary alteration of the display +appearance of a part of a buffer. It contains markers delimiting a +range of the buffer, plus a property list (a list whose elements are +alternating property names and values). 
Overlays are used to present +parts of the buffer temporarily in a different display style. + + @xref{Overlays}, for how to create and use overlays. They have no +read syntax, and print in hash notation, giving the buffer name and +range of positions. + +@node Type Predicates +@section Type Predicates +@cindex predicates +@cindex type checking +@kindex wrong-type-argument + + The Emacs Lisp interpreter itself does not perform type checking on +the actual arguments passed to functions when they are called. It could +not do so, since function arguments in Lisp do not have declared data +types, as they do in other programming languages. It is therefore up to +the individual function to test whether each actual argument belongs to +a type that the function can use. + + All built-in functions do check the types of their actual arguments +when appropriate, and signal a @code{wrong-type-argument} error if an +argument is of the wrong type. For example, here is what happens if you +pass an argument to @code{+} which it cannot handle: + +@example +@group +(+ 2 'a) + @error{} Wrong type argument: integer-or-marker-p, a +@end group +@end example + +@cindex type predicates +@cindex testing types + Lisp provides functions, called @dfn{type predicates}, to test whether +an object is a member of a given type. (Following a convention of long +standing, the names of most Emacs Lisp predicates end in @samp{p}.) + +Here is a table of predefined type predicates, in alphabetical order, +with references to further information. + +@table @code +@item atom +@xref{List-related Predicates, atom}. + +@item arrayp +@xref{Array Functions, arrayp}. + +@item bufferp +@xref{Buffer Basics, bufferp}. + +@item byte-code-function-p +@xref{Byte-Code Type, byte-code-function-p}. + +@item case-table-p +@xref{Case Table, case-table-p}. + +@item char-or-string-p +@xref{Predicates for Strings, char-or-string-p}. + +@item commandp +@xref{Interactive Call, commandp}. 
+ +@item consp +@xref{List-related Predicates, consp}. + +@item floatp +@xref{Predicates on Numbers, floatp}. + +@item frame-live-p +@xref{Deleting Frames, frame-live-p}. + +@item framep +@xref{Frames, framep}. + +@item integer-or-marker-p +@xref{Predicates on Markers, integer-or-marker-p}. + +@item integerp +@xref{Predicates on Numbers, integerp}. + +@item keymapp +@xref{Creating Keymaps, keymapp}. + +@item listp +@xref{List-related Predicates, listp}. + +@item markerp +@xref{Predicates on Markers, markerp}. + +@item natnump +@xref{Predicates on Numbers, natnump}. + +@item nlistp +@xref{List-related Predicates, nlistp}. + +@item numberp +@xref{Predicates on Numbers, numberp}. + +@item number-or-marker-p +@xref{Predicates on Markers, number-or-marker-p}. + +@item overlayp +@xref{Overlays, overlayp}. + +@item processp +@xref{Processes, processp}. + +@item sequencep +@xref{Sequence Functions, sequencep}. + +@item stringp +@xref{Predicates for Strings, stringp}. + +@item subrp +@xref{Function Cells, subrp}. + +@item symbolp +@xref{Symbols, symbolp}. + +@item syntax-table-p +@xref{Syntax Tables, syntax-table-p}. + +@item user-variable-p +@xref{Defining Variables, user-variable-p}. + +@item vectorp +@xref{Vectors, vectorp}. + +@item window-configuration-p +@xref{Window Configurations, window-configuration-p}. + +@item window-live-p +@xref{Deleting Windows, window-live-p}. + +@item windowp +@xref{Basic Windows, windowp}. +@end table + +@node Equality Predicates +@section Equality Predicates +@cindex equality + + Here we describe two functions that test for equality between any two +objects. Other functions test equality between objects of specific +types, e.g., strings. See the appropriate chapter describing the data +type for these predicates. + +@defun eq object1 object2 +This function returns @code{t} if @var{object1} and @var{object2} are +the same object, @code{nil} otherwise. 
The ``same object'' means that a +change in one will be reflected by the same change in the other. + +@code{eq} returns @code{t} if @var{object1} and @var{object2} are +integers with the same value. Also, since symbol names are normally +unique, if the arguments are symbols with the same name, they are +@code{eq}. For other types (e.g., lists, vectors, strings), two +arguments with the same contents or elements are not necessarily +@code{eq} to each other: they are @code{eq} only if they are the same +object. + +(The @code{make-symbol} function returns an uninterned symbol that is +not interned in the standard @code{obarray}. When uninterned symbols +are in use, symbol names are no longer unique. Distinct symbols with +the same name are not @code{eq}. @xref{Creating Symbols}.) + +@example +@group +(eq 'foo 'foo) + @result{} t +@end group + +@group +(eq 456 456) + @result{} t +@end group + +@group +(eq "asdf" "asdf") + @result{} nil +@end group + +@group +(eq '(1 (2 (3))) '(1 (2 (3)))) + @result{} nil +@end group + +@group +(setq foo '(1 (2 (3)))) + @result{} (1 (2 (3))) +(eq foo foo) + @result{} t +(eq foo '(1 (2 (3)))) + @result{} nil +@end group + +@group +(eq [(1 2) 3] [(1 2) 3]) + @result{} nil +@end group + +@group +(eq (point-marker) (point-marker)) + @result{} nil +@end group +@end example + +@end defun + +@defun equal object1 object2 +This function returns @code{t} if @var{object1} and @var{object2} have +equal components, @code{nil} otherwise. Whereas @code{eq} tests if its +arguments are the same object, @code{equal} looks inside nonidentical +arguments to see if their elements are the same. So, if two objects are +@code{eq}, they are @code{equal}, but the converse is not always true. 
+ +@example +@group +(equal 'foo 'foo) + @result{} t +@end group + +@group +(equal 456 456) + @result{} t +@end group + +@group +(equal "asdf" "asdf") + @result{} t +@end group +@group +(eq "asdf" "asdf") + @result{} nil +@end group + +@group +(equal '(1 (2 (3))) '(1 (2 (3)))) + @result{} t +@end group +@group +(eq '(1 (2 (3))) '(1 (2 (3)))) + @result{} nil +@end group + +@group +(equal [(1 2) 3] [(1 2) 3]) + @result{} t +@end group +@group +(eq [(1 2) 3] [(1 2) 3]) + @result{} nil +@end group + +@group +(equal (point-marker) (point-marker)) + @result{} t +@end group + +@group +(eq (point-marker) (point-marker)) + @result{} nil +@end group +@end example + +Comparison of strings uses @code{string=}, and is case-sensitive. + +@example +@group +(equal "asdf" "ASDF") + @result{} nil +@end group +@end example +@end defun + + The test for equality is implemented recursively, and circular lists may +therefore cause infinite recursion (leading to an error). |