summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorAndy Wingo <wingo@pobox.com>2011-05-09 00:13:04 +0200
committerAndy Wingo <wingo@pobox.com>2011-05-09 00:13:04 +0200
commite690a3cbf290a39ede4b3a390e940702cddf6da8 (patch)
tree7ce19a861ece2c18fa4d91842b5a4894b26698ea
parent059a588fedf377ffd32cc1f1fee7ed829b263890 (diff)
parent8f6a4b248b12db2c56ab29e909ba1441aa8f512d (diff)
downloadguile-e690a3cbf290a39ede4b3a390e940702cddf6da8.tar.gz
Merge remote-tracking branch 'origin/stable-2.0'
-rw-r--r--benchmark-suite/benchmarks/srfi-1.bm19
-rw-r--r--doc/ref/api-compound.texi9
-rw-r--r--doc/ref/api-io.texi689
-rw-r--r--doc/ref/r6rs.texi17
-rw-r--r--libguile/foreign.c49
-rw-r--r--libguile/hash.c3
-rw-r--r--libguile/inline.h34
-rw-r--r--libguile/ports.c244
-rw-r--r--libguile/print.c84
-rw-r--r--libguile/r6rs-ports.c7
-rw-r--r--libguile/read.c2
-rw-r--r--module/ice-9/vlist.scm21
-rw-r--r--module/language/glil/compile-assembly.scm740
-rw-r--r--module/rnrs/io/ports.scm10
-rw-r--r--module/rnrs/io/simple.scm86
-rw-r--r--module/srfi/srfi-1.scm2
-rw-r--r--test-suite/tests/foreign.test28
-rw-r--r--test-suite/tests/hash.test7
-rw-r--r--test-suite/tests/ports.test23
-rw-r--r--test-suite/tests/vlist.test7
20 files changed, 1724 insertions, 357 deletions
diff --git a/benchmark-suite/benchmarks/srfi-1.bm b/benchmark-suite/benchmarks/srfi-1.bm
index 835608d41..67f79ca24 100644
--- a/benchmark-suite/benchmarks/srfi-1.bm
+++ b/benchmark-suite/benchmarks/srfi-1.bm
@@ -1,7 +1,7 @@
;;; -*- mode: scheme; coding: utf-8; -*-
;;; SRFI-1.
;;;
-;;; Copyright 2010 Free Software Foundation, Inc.
+;;; Copyright 2010, 2011 Free Software Foundation, Inc.
;;;
;;; This program is free software; you can redistribute it and/or
;;; modify it under the terms of the GNU Lesser General Public License
@@ -45,3 +45,20 @@
(benchmark "small" 2000000
(drop-while (lambda (n) #t) %small-list)))
+
+(with-benchmark-prefix "map"
+
+ (benchmark "big" 30
+ (map (lambda (x) x) %big-list))
+
+ (benchmark "small" 2000000
+ (map (lambda (x) x) %small-list)))
+
+(with-benchmark-prefix "for-each"
+
+ (benchmark "big" 30
+ (for-each (lambda (x) #f) %big-list))
+
+ (benchmark "small" 2000000
+ (for-each (lambda (x) #f) %small-list)))
+
diff --git a/doc/ref/api-compound.texi b/doc/ref/api-compound.texi
index 27ba437ad..da8813baf 100644
--- a/doc/ref/api-compound.texi
+++ b/doc/ref/api-compound.texi
@@ -1,7 +1,7 @@
@c -*-texinfo-*-
@c This is part of the GNU Guile Reference Manual.
-@c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2009, 2010
-@c Free Software Foundation, Inc.
+@c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2005, 2006,
+@c 2007, 2009, 2010, 2011 Free Software Foundation, Inc.
@c See the file guile.texi for copying conditions.
@node Compound Data Types
@@ -3294,8 +3294,9 @@ Again the choice of @var{hash-proc} must be consistent with previous calls to
@end deffn
@deffn {Scheme Procedure} vhash-fold proc vhash
-Fold over the key/pair elements of @var{vhash}. For each pair call @var{proc}
-as @code{(@var{proc} key value result)}.
+@deffnx {Scheme Procedure} vhash-fold-right proc vhash
+Fold over the key/value elements of @var{vhash} in the given direction.
+For each pair call @var{proc} as @code{(@var{proc} key value result)}.
@end deffn
@deffn {Scheme Procedure} vhash-fold* proc init key vhash [equal? [hash]]
diff --git a/doc/ref/api-io.texi b/doc/ref/api-io.texi
index e7e91edae..09fdc8331 100644
--- a/doc/ref/api-io.texi
+++ b/doc/ref/api-io.texi
@@ -1152,22 +1152,364 @@ The I/O port API of the @uref{http://www.r6rs.org/, Revised Report^6 on
the Algorithmic Language Scheme (R6RS)} is provided by the @code{(rnrs
io ports)} module. It provides features, such as binary I/O and Unicode
string I/O, that complement or refine Guile's historical port API
-presented above (@pxref{Input and Output}).
+presented above (@pxref{Input and Output}). Note that R6RS ports are not
+disjoint from Guile's native ports, so Guile-specific procedures will
+work on ports created using the R6RS API, and vice versa.
+
+The text in this section is taken from the R6RS standard libraries
+document, with only minor adaptions for inclusion in this manual. The
+Guile developers offer their thanks to the R6RS editors for having
+provided the report's text under permissive conditions making this
+possible.
@c FIXME: Update description when implemented.
@emph{Note}: The implementation of this R6RS API is not complete yet.
@menu
+* R6RS File Names:: File names.
+* R6RS File Options:: Options for opening files.
+* R6RS Buffer Modes:: Influencing buffering behavior.
+* R6RS Transcoders:: Influencing port encoding.
* R6RS End-of-File:: The end-of-file object.
* R6RS Port Manipulation:: Manipulating R6RS ports.
+* R6RS Input Ports:: Input Ports.
* R6RS Binary Input:: Binary input.
+* R6RS Textual Input:: Textual input.
+* R6RS Output Ports:: Output Ports.
* R6RS Binary Output:: Binary output.
+* R6RS Textual Output:: Textual output.
@end menu
A subset of the @code{(rnrs io ports)} module is provided by the
@code{(ice-9 binary-ports)} module. It contains binary input/output
procedures and does not rely on R6RS support.
+@node R6RS File Names
+@subsubsection File Names
+
+Some of the procedures described in this chapter accept a file name as an
+argument. Valid values for such a file name include strings that name a file
+using the native notation of filesystem paths on an implementation's
+underlying operating system, and may include implementation-dependent
+values as well.
+
+A @var{filename} parameter name means that the
+corresponding argument must be a file name.
+
+@node R6RS File Options
+@subsubsection File Options
+@cindex file options
+
+When opening a file, the various procedures in this library accept a
+@code{file-options} object that encapsulates flags to specify how the
+file is to be opened. A @code{file-options} object is an enum-set
+(@pxref{rnrs enums}) over the symbols constituting valid file options.
+
+A @var{file-options} parameter name means that the corresponding
+argument must be a file-options object.
+
+@deffn {Scheme Syntax} file-options @var{file-options-symbol} ...
+
+Each @var{file-options-symbol} must be a symbol.
+
+The @code{file-options} syntax returns a file-options object that
+encapsulates the specified options.
+
+When supplied to an operation that opens a file for output, the
+file-options object returned by @code{(file-options)} specifies that the
+file is created if it does not exist and an exception with condition
+type @code{&i/o-file-already-exists} is raised if it does exist. The
+following standard options can be included to modify the default
+behavior.
+
+@table @code
+@item no-create
+ If the file does not already exist, it is not created;
+ instead, an exception with condition type @code{&i/o-file-does-not-exist}
+ is raised.
+ If the file already exists, the exception with condition type
+ @code{&i/o-file-already-exists} is not raised
+ and the file is truncated to zero length.
+@item no-fail
+ If the file already exists, the exception with condition type
+ @code{&i/o-file-already-exists} is not raised,
+ even if @code{no-create} is not included,
+ and the file is truncated to zero length.
+@item no-truncate
+ If the file already exists and the exception with condition type
+ @code{&i/o-file-already-exists} has been inhibited by inclusion of
+ @code{no-create} or @code{no-fail}, the file is not truncated, but
+ the port's current position is still set to the beginning of the
+ file.
+@end table
+
+These options have no effect when a file is opened only for input.
+Symbols other than those listed above may be used as
+@var{file-options-symbol}s; they have implementation-specific meaning,
+if any.
+
+@quotation Note
+ Only the name of @var{file-options-symbol} is significant.
+@end quotation
+@end deffn
+
+@node R6RS Buffer Modes
+@subsubsection Buffer Modes
+
+Each port has an associated buffer mode. For an output port, the
+buffer mode defines when an output operation flushes the buffer
+associated with the output port. For an input port, the buffer mode
+defines how much data will be read to satisfy read operations. The
+possible buffer modes are the symbols @code{none} for no buffering,
+@code{line} for flushing upon line endings and reading up to line
+endings, or other implementation-dependent behavior,
+and @code{block} for arbitrary buffering. This section uses
+the parameter name @var{buffer-mode} for arguments that must be
+buffer-mode symbols.
+
+If two ports are connected to the same mutable source, both ports
+are unbuffered, and reading a byte or character from that shared
+source via one of the two ports would change the bytes or characters
+seen via the other port, a lookahead operation on one port will
+render the peeked byte or character inaccessible via the other port,
+while a subsequent read operation on the peeked port will see the
+peeked byte or character even though the port is otherwise unbuffered.
+
+In other words, the semantics of buffering is defined in terms of side
+effects on shared mutable sources, and a lookahead operation has the
+same side effect on the shared source as a read operation.
+
+@deffn {Scheme Syntax} buffer-mode @var{buffer-mode-symbol}
+
+@var{buffer-mode-symbol} must be a symbol whose name is one of
+@code{none}, @code{line}, and @code{block}. The result is the
+corresponding symbol, and specifies the associated buffer mode.
+
+@quotation Note
+ Only the name of @var{buffer-mode-symbol} is significant.
+@end quotation
+@end deffn
+
+@deffn {Scheme Procedure} buffer-mode? obj
+Returns @code{#t} if the argument is a valid buffer-mode symbol, and
+returns @code{#f} otherwise.
+@end deffn
+
+@node R6RS Transcoders
+@subsubsection Transcoders
+@cindex codec
+@cindex end-of-line style
+@cindex transcoder
+@cindex binary port
+@cindex textual port
+
+Several different Unicode encoding schemes describe standard ways to
+encode characters and strings as byte sequences and to decode those
+sequences. Within this document, a @dfn{codec} is an immutable Scheme
+object that represents a Unicode or similar encoding scheme.
+
+An @dfn{end-of-line style} is a symbol that, if it is not @code{none},
+describes how a textual port transcodes representations of line endings.
+
+A @dfn{transcoder} is an immutable Scheme object that combines a codec
+with an end-of-line style and a method for handling decoding errors.
+Each transcoder represents some specific bidirectional (but not
+necessarily lossless), possibly stateful translation between byte
+sequences and Unicode characters and strings. Every transcoder can
+operate in the input direction (bytes to characters) or in the output
+direction (characters to bytes). A @var{transcoder} parameter name
+means that the corresponding argument must be a transcoder.
+
+A @dfn{binary port} is a port that supports binary I/O, does not have an
+associated transcoder and does not support textual I/O. A @dfn{textual
+port} is a port that supports textual I/O, and does not support binary
+I/O. A textual port may or may not have an associated transcoder.
+
+@deffn {Scheme Procedure} latin-1-codec
+@deffnx {Scheme Procedure} utf-8-codec
+@deffnx {Scheme Procedure} utf-16-codec
+
+These are predefined codecs for the ISO 8859-1, UTF-8, and UTF-16
+encoding schemes.
+
+A call to any of these procedures returns a value that is equal in the
+sense of @code{eqv?} to the result of any other call to the same
+procedure.
+@end deffn
+
+@deffn {Scheme Syntax} eol-style @var{eol-style-symbol}
+
+@var{eol-style-symbol} should be a symbol whose name is one of
+@code{lf}, @code{cr}, @code{crlf}, @code{nel}, @code{crnel}, @code{ls},
+and @code{none}.
+
+The form evaluates to the corresponding symbol. If the name of
+@var{eol-style-symbol} is not one of these symbols, the effect and
+result are implementation-dependent; in particular, the result may be an
+eol-style symbol acceptable as an @var{eol-style} argument to
+@code{make-transcoder}. Otherwise, an exception is raised.
+
+All eol-style symbols except @code{none} describe a specific
+line-ending encoding:
+
+@table @code
+@item lf
+linefeed
+@item cr
+carriage return
+@item crlf
+carriage return, linefeed
+@item nel
+next line
+@item crnel
+carriage return, next line
+@item ls
+line separator
+@end table
+
+For a textual port with a transcoder, and whose transcoder has an
+eol-style symbol @code{none}, no conversion occurs. For a textual input
+port, any eol-style symbol other than @code{none} means that all of the
+above line-ending encodings are recognized and are translated into a
+single linefeed. For a textual output port, @code{none} and @code{lf}
+are equivalent. Linefeed characters are encoded according to the
+specified eol-style symbol, and all other characters that participate in
+possible line endings are encoded as is.
+
+@quotation Note
+ Only the name of @var{eol-style-symbol} is significant.
+@end quotation
+@end deffn
+
+@deffn {Scheme Procedure} native-eol-style
+Returns the default end-of-line style of the underlying platform, e.g.,
+@code{lf} on Unix and @code{crlf} on Windows.
+@end deffn
+
+@deffn {Condition Type} &i/o-decoding
+@deffnx {Scheme Procedure} make-i/o-decoding-error port
+@deffnx {Scheme Procedure} i/o-decoding-error? obj
+
+This condition type could be defined by
+
+@lisp
+(define-condition-type &i/o-decoding &i/o-port
+ make-i/o-decoding-error i/o-decoding-error?)
+@end lisp
+
+An exception with this type is raised when one of the operations for
+textual input from a port encounters a sequence of bytes that cannot be
+translated into a character or string by the input direction of the
+port's transcoder.
+
+When such an exception is raised, the port's position is past the
+invalid encoding.
+@end deffn
+
+@deffn {Condition Type} &i/o-encoding
+@deffnx {Scheme Procedure} make-i/o-encoding-error port char
+@deffnx {Scheme Procedure} i/o-encoding-error? obj
+@deffnx {Scheme Procedure} i/o-encoding-error-char condition
+
+This condition type could be defined by
+
+@lisp
+(define-condition-type &i/o-encoding &i/o-port
+ make-i/o-encoding-error i/o-encoding-error?
+ (char i/o-encoding-error-char))
+@end lisp
+
+An exception with this type is raised when one of the operations for
+textual output to a port encounters a character that cannot be
+translated into bytes by the output direction of the port's transcoder.
+@var{Char} is the character that could not be encoded.
+@end deffn
+
+@deffn {Scheme Syntax} error-handling-mode @var{error-handling-mode-symbol}
+
+@var{error-handling-mode-symbol} should be a symbol whose name is one of
+@code{ignore}, @code{raise}, and @code{replace}. The form evaluates to
+the corresponding symbol. If @var{error-handling-mode-symbol} is not
+one of these identifiers, effect and result are
+implementation-dependent: The result may be an error-handling-mode
+symbol acceptable as a @var{handling-mode} argument to
+@code{make-transcoder}. If it is not acceptable as a
+@var{handling-mode} argument to @code{make-transcoder}, an exception is
+raised.
+
+@quotation Note
+ Only the name of @var{error-handling-style-symbol} is significant.
+@end quotation
+
+The error-handling mode of a transcoder specifies the behavior
+of textual I/O operations in the presence of encoding or decoding
+errors.
+
+If a textual input operation encounters an invalid or incomplete
+character encoding, and the error-handling mode is @code{ignore}, an
+appropriate number of bytes of the invalid encoding are ignored and
+decoding continues with the following bytes.
+
+If the error-handling mode is @code{replace}, the replacement
+character U+FFFD is injected into the data stream, an appropriate
+number of bytes are ignored, and decoding
+continues with the following bytes.
+
+If the error-handling mode is @code{raise}, an exception with condition
+type @code{&i/o-decoding} is raised.
+
+If a textual output operation encounters a character it cannot encode,
+and the error-handling mode is @code{ignore}, the character is ignored
+and encoding continues with the next character. If the error-handling
+mode is @code{replace}, a codec-specific replacement character is
+emitted by the transcoder, and encoding continues with the next
+character. The replacement character is U+FFFD for transcoders whose
+codec is one of the Unicode encodings, but is the @code{?} character
+for the Latin-1 encoding. If the error-handling mode is @code{raise},
+an exception with condition type @code{&i/o-encoding} is raised.
+@end deffn
+
+@deffn {Scheme Procedure} make-transcoder codec
+@deffnx {Scheme Procedure} make-transcoder codec eol-style
+@deffnx {Scheme Procedure} make-transcoder codec eol-style handling-mode
+
+@var{codec} must be a codec; @var{eol-style}, if present, an eol-style
+symbol; and @var{handling-mode}, if present, an error-handling-mode
+symbol.
+
+@var{eol-style} may be omitted, in which case it defaults to the native
+end-of-line style of the underlying platform. @var{Handling-mode} may
+be omitted, in which case it defaults to @code{replace}. The result is
+a transcoder with the behavior specified by its arguments.
+@end deffn
+
+@deffn {Scheme procedure} native-transcoder
+Returns an implementation-dependent transcoder that represents a
+possibly locale-dependent ``native'' transcoding.
+@end deffn
+
+@deffn {Scheme Procedure} transcoder-codec transcoder
+@deffnx {Scheme Procedure} transcoder-eol-style transcoder
+@deffnx {Scheme Procedure} transcoder-error-handling-mode transcoder
+
+These are accessors for transcoder objects; when applied to a
+transcoder returned by @code{make-transcoder}, they return the
+@var{codec}, @var{eol-style}, and @var{handling-mode} arguments,
+respectively.
+@end deffn
+
+@deffn {Scheme Procedure} bytevector->string bytevector transcoder
+
+Returns the string that results from transcoding the
+@var{bytevector} according to the input direction of the transcoder.
+@end deffn
+
+@deffn {Scheme Procedure} string->bytevector string transcoder
+
+Returns the bytevector that results from transcoding the
+@var{string} according to the output direction of the transcoder.
+@end deffn
+
@node R6RS End-of-File
@subsubsection The End-of-File Object
@@ -1200,6 +1542,65 @@ Return the end-of-file (EOF) object.
The procedures listed below operate on any kind of R6RS I/O port.
+@deffn {Scheme Procedure} port? obj
+Returns @code{#t} if the argument is a port, and returns @code{#f}
+otherwise.
+@end deffn
+
+@deffn {Scheme Procedure} port-transcoder port
+Returns the transcoder associated with @var{port} if @var{port} is
+textual and has an associated transcoder, and returns @code{#f} if
+@var{port} is binary or does not have an associated transcoder.
+@end deffn
+
+@deffn {Scheme Procedure} binary-port? port
+Return @code{#t} if @var{port} is a @dfn{binary port}, suitable for
+binary data input/output.
+
+Note that internally Guile does not differentiate between binary and
+textual ports, unlike the R6RS. Thus, this procedure returns true when
+@var{port} does not have an associated encoding---i.e., when
+@code{(port-encoding @var{port})} is @code{#f} (@pxref{Ports,
+port-encoding}). This is the case for ports returned by R6RS procedures
+such as @code{open-bytevector-input-port} and
+@code{make-custom-binary-output-port}.
+
+However, Guile currently does not prevent use of textual I/O procedures
+such as @code{display} or @code{read-char} with binary ports. Doing so
+``upgrades'' the port from binary to textual, under the ISO-8859-1
+encoding. Likewise, Guile does not prevent use of
+@code{set-port-encoding!} on a binary port, which also turns it into a
+``textual'' port.
+@end deffn
+
+@deffn {Scheme Procedure} textual-port? port
+Always return @var{#t}, as all ports can be used for textual I/O in
+Guile.
+@end deffn
+
+@deffn {Scheme Procedure} transcoded-port obj
+The @code{transcoded-port} procedure
+returns a new textual port with the specified @var{transcoder}.
+Otherwise the new textual port's state is largely the same as
+that of @var{binary-port}.
+If @var{binary-port} is an input port, the new textual
+port will be an input port and
+will transcode the bytes that have not yet been read from
+@var{binary-port}.
+If @var{binary-port} is an output port, the new textual
+port will be an output port and
+will transcode output characters into bytes that are
+written to the byte sink represented by @var{binary-port}.
+
+As a side effect, however, @code{transcoded-port}
+closes @var{binary-port} in
+a special way that allows the new textual port to continue to
+use the byte source or sink represented by @var{binary-port},
+even though @var{binary-port} itself is closed and cannot
+be used by the input and output operations described in this
+chapter.
+@end deffn
+
@deffn {Scheme Procedure} port-position port
If @var{port} supports it (see below), return the offset (an integer)
indicating where the next octet will be read from/written to in
@@ -1233,31 +1634,67 @@ Call @var{proc}, passing it @var{port} and closing @var{port} upon exit
of @var{proc}. Return the return values of @var{proc}.
@end deffn
-@deffn {Scheme Procedure} binary-port? port
-Return @code{#t} if @var{port} is a @dfn{binary port}, suitable for
-binary data input/output.
+@node R6RS Input Ports
+@subsubsection Input Ports
-Note that internally Guile does not differentiate between binary and
-textual ports, unlike the R6RS. Thus, this procedure returns true when
-@var{port} does not have an associated encoding---i.e., when
-@code{(port-encoding @var{port})} is @code{#f} (@pxref{Ports,
-port-encoding}). This is the case for ports returned by R6RS procedures
-such as @code{open-bytevector-input-port} and
-@code{make-custom-binary-output-port}.
+@deffn {Scheme Procedure} input-port? obj@
+Returns @code{#t} if the argument is an input port (or a combined input
+and output port), and returns @code{#f} otherwise.
+@end deffn
-However, Guile currently does not prevent use of textual I/O procedures
-such as @code{display} or @code{read-char} with binary ports. Doing so
-``upgrades'' the port from binary to textual, under the ISO-8859-1
-encoding. Likewise, Guile does not prevent use of
-@code{set-port-encoding!} on a binary port, which also turns it into a
-``textual'' port.
+@deffn {Scheme Procedure} port-eof? port
+Returns @code{#t}
+if the @code{lookahead-u8} procedure (if @var{input-port} is a binary port)
+or the @code{lookahead-char} procedure (if @var{input-port} is a textual port)
+would return
+the end-of-file object, and @code{#f} otherwise.
+The operation may block indefinitely if no data is available
+but the port cannot be determined to be at end of file.
@end deffn
-@deffn {Scheme Procedure} textual-port? port
-Always return @var{#t}, as all ports can be used for textual I/O in
-Guile.
+@deffn {Scheme Procedure} open-file-input-port filename
+@deffnx {Scheme Procedure} open-file-input-port filename file-options
+@deffnx {Scheme Procedure} open-file-input-port filename file-options buffer-mode
+@deffnx {Scheme Procedure} open-file-input-port filename file-options buffer-mode maybe-transcoder
+@var{Maybe-transcoder} must be either a transcoder or @code{#f}.
+
+The @code{open-file-input-port} procedure returns an
+input port for the named file. The @var{file-options} and
+@var{maybe-transcoder} arguments are optional.
+
+The @var{file-options} argument, which may determine
+various aspects of the returned port (@pxref{R6RS File Options}),
+defaults to the value of @code{(file-options)}.
+
+The @var{buffer-mode} argument, if supplied,
+must be one of the symbols that name a buffer mode.
+The @var{buffer-mode} argument defaults to @code{block}.
+
+If @var{maybe-transcoder} is a transcoder, it becomes the transcoder associated
+with the returned port.
+
+If @var{maybe-transcoder} is @code{#f} or absent,
+the port will be a binary port and will support the
+@code{port-position} and @code{set-port-position!} operations.
+Otherwise the port will be a textual port, and whether it supports
+the @code{port-position} and @code{set-port-position!} operations
+is implementation-dependent (and possibly transcoder-dependent).
@end deffn
+@deffn {Scheme Procedure} standard-input-port
+Returns a fresh binary input port connected to standard input. Whether
+the port supports the @code{port-position} and @code{set-port-position!}
+operations is implementation-dependent.
+@end deffn
+
+@deffn {Scheme Procedure} current-input-port
+This returns a default textual port for input. Normally, this default
+port is associated with standard input, but can be dynamically
+re-assigned using the @code{with-input-from-file} procedure from the
+@code{io simple (6)} library (@pxref{rnrs io simple}). The port may or
+may not have an associated transcoder; if it does, the transcoder is
+implementation-dependent.
+@end deffn
@node R6RS Binary Input
@subsubsection Binary Input
@@ -1374,6 +1811,173 @@ reached. Return either a new bytevector containing the data read or the
end-of-file object (if no data were available).
@end deffn
+@node R6RS Textual Input
+@subsubsection Textual Input
+
+@deffn {Scheme Procedure} get-char port
+Reads from @var{textual-input-port}, blocking as necessary, until a
+complete character is available from @var{textual-input-port},
+or until an end of file is reached.
+
+If a complete character is available before the next end of file,
+@code{get-char} returns that character and updates the input port to
+point past the character. If an end of file is reached before any
+character is read, @code{get-char} returns the end-of-file object.
+@end deffn
+
+@deffn {Scheme Procedure} lookahead-char port
+The @code{lookahead-char} procedure is like @code{get-char}, but it does
+not update @var{textual-input-port} to point past the character.
+@end deffn
+
+@deffn {Scheme Procedure} get-string-n port count
+
+@var{Count} must be an exact, non-negative integer object, representing
+the number of characters to be read.
+
+The @code{get-string-n} procedure reads from @var{textual-input-port},
+blocking as necessary, until @var{count} characters are available, or
+until an end of file is reached.
+
+If @var{count} characters are available before end of file,
+@code{get-string-n} returns a string consisting of those @var{count}
+characters. If fewer characters are available before an end of file, but
+one or more characters can be read, @code{get-string-n} returns a string
+containing those characters. In either case, the input port is updated
+to point just past the characters read. If no characters can be read
+before an end of file, the end-of-file object is returned.
+@end deffn
+
+@deffn {Scheme Procedure} get-string-n! port string start count
+
+@var{Start} and @var{count} must be exact, non-negative integer objects,
+with @var{count} representing the number of characters to be read.
+@var{String} must be a string with at least $@var{start} + @var{count}$
+characters.
+
+The @code{get-string-n!} procedure reads from @var{textual-input-port}
+in the same manner as @code{get-string-n}. If @var{count} characters
+are available before an end of file, they are written into @var{string}
+starting at index @var{start}, and @var{count} is returned. If fewer
+characters are available before an end of file, but one or more can be
+read, those characters are written into @var{string} starting at index
+@var{start} and the number of characters actually read is returned as an
+exact integer object. If no characters can be read before an end of
+file, the end-of-file object is returned.
+@end deffn
+
+@deffn {Scheme Procedure} get-string-all port count
+Reads from @var{textual-input-port} until an end of file, decoding
+characters in the same manner as @code{get-string-n} and
+@code{get-string-n!}.
+
+If characters are available before the end of file, a string containing
+all the characters decoded from that data are returned. If no character
+precedes the end of file, the end-of-file object is returned.
+@end deffn
+
+@deffn {Scheme Procedure} get-line port
+Reads from @var{textual-input-port} up to and including the linefeed
+character or end of file, decoding characters in the same manner as
+@code{get-string-n} and @code{get-string-n!}.
+
+If a linefeed character is read, a string containing all of the text up
+to (but not including) the linefeed character is returned, and the port
+is updated to point just past the linefeed character. If an end of file
+is encountered before any linefeed character is read, but some
+characters have been read and decoded as characters, a string containing
+those characters is returned. If an end of file is encountered before
+any characters are read, the end-of-file object is returned.
+
+@quotation Note
+ The end-of-line style, if not @code{none}, will cause all line endings
+ to be read as linefeed characters. @xref{R6RS Transcoders}.
+@end quotation
+@end deffn
+
+@deffn {Scheme Procedure} get-datum port count
+Reads an external representation from @var{textual-input-port} and returns the
+datum it represents. The @code{get-datum} procedure returns the next
+datum that can be parsed from the given @var{textual-input-port}, updating
+@var{textual-input-port} to point exactly past the end of the external
+representation of the object.
+
+Any @emph{interlexeme space} (comment or whitespace, @pxref{Scheme
+Syntax}) in the input is first skipped. If an end of file occurs after
+the interlexeme space, the end-of-file object (@pxref{R6RS End-of-File})
+is returned.
+
+If a character inconsistent with an external representation is
+encountered in the input, an exception with condition types
+@code{&lexical} and @code{&i/o-read} is raised. Also, if the end of
+file is encountered after the beginning of an external representation,
+but the external representation is incomplete and therefore cannot be
+parsed, an exception with condition types @code{&lexical} and
+@code{&i/o-read} is raised.
+@end deffn
+
+@node R6RS Output Ports
+@subsubsection Output Ports
+
+@deffn {Scheme Procedure} output-port? obj
+Returns @code{#t} if the argument is an output port (or a
+combined input and output port), @code{#f} otherwise.
+@end deffn
+
+@deffn {Scheme Procedure} flush-output-port port
+Flushes any buffered output from the buffer of @var{output-port} to the
+underlying file, device, or object. The @code{flush-output-port}
+procedure returns an unspecified values.
+@end deffn
+
+@deffn {Scheme Procedure} open-file-output-port filename
+@deffnx {Scheme Procedure} open-file-output-port filename file-options
+@deffnx {Scheme Procedure} open-file-output-port filename file-options buffer-mode
+@deffnx {Scheme Procedure} open-file-output-port filename file-options buffer-mode maybe-transcoder
+
+@var{maybe-transcoder} must be either a transcoder or @code{#f}.
+
+The @code{open-file-output-port} procedure returns an output port for the named file.
+
+The @var{file-options} argument, which may determine various aspects of
+the returned port (@pxref{R6RS File Options}), defaults to the value of
+@code{(file-options)}.
+
+The @var{buffer-mode} argument, if supplied,
+must be one of the symbols that name a buffer mode.
+The @var{buffer-mode} argument defaults to @code{block}.
+
+If @var{maybe-transcoder} is a transcoder, it becomes the transcoder
+associated with the port.
+
+If @var{maybe-transcoder} is @code{#f} or absent,
+the port will be a binary port and will support the
+@code{port-position} and @code{set-port-position!} operations.
+Otherwise the port will be a textual port, and whether it supports
+the @code{port-position} and @code{set-port-position!} operations
+is implementation-dependent (and possibly transcoder-dependent).
+@end deffn
+
+@deffn {Scheme Procedure} standard-output-port
+@deffnx {Scheme Procedure} standard-error-port
+Returns a fresh binary output port connected to the standard output or
+standard error respectively. Whether the port supports the
+@code{port-position} and @code{set-port-position!} operations is
+implementation-dependent.
+@end deffn
+
+@deffn {Scheme Procedure} current-output-port
+@deffnx {Scheme Procedure} current-error-port
+These return default textual ports for regular output and error output.
+Normally, these default ports are associated with standard output, and
+standard error, respectively. The return value of
+@code{current-output-port} can be dynamically re-assigned using the
+@code{with-output-to-file} procedure from the @code{io simple (6)}
+library (@pxref{rnrs io simple}). A port returned by one of these
+procedures may or may not have an associated transcoder; if it does, the
+transcoder is implementation-dependent.
+@end deffn
+
@node R6RS Binary Output
@subsubsection Binary Output
@@ -1432,6 +2036,50 @@ Write the contents of @var{bv} to @var{port}, optionally starting at
index @var{start} and limiting to @var{count} octets.
@end deffn
+@node R6RS Textual Output
+@subsubsection Textual Output
+
+@deffn {Scheme Procedure} put-char port char
+Writes @var{char} to the port. The @code{put-char} procedure returns
+@end deffn
+
+@deffn {Scheme Procedure} put-string port string
+@deffnx {Scheme Procedure} put-string port string start
+@deffnx {Scheme Procedure} put-string port string start count
+
+@var{start} and @var{count} must be non-negative exact integer objects.
+@var{string} must have a length of at least @math{@var{start} +
+@var{count}}. @var{start} defaults to 0. @var{count} defaults to
+@math{@code{(string-length @var{string})} - @var{start}}$. The
+@code{put-string} procedure writes the @var{count} characters of
+@var{string} starting at index @var{start} to the port. The
+@code{put-string} procedure returns an unspecified value.
+@end deffn
+
+@deffn {Scheme Procedure} put-datum port datum
+@var{datum} should be a datum value. The @code{put-datum} procedure
+writes an external representation of @var{datum} to
+@var{textual-output-port}. The specific external representation is
+implementation-dependent. However, whenever possible, an implementation
+should produce a representation for which @code{get-datum}, when reading
+the representation, will return an object equal (in the sense of
+@code{equal?}) to @var{datum}.
+
+@quotation Note
+ Not all datums may allow producing an external representation for which
+ @code{get-datum} will produce an object that is equal to the
+ original. Specifically, NaNs contained in @var{datum} may make
+ this impossible.
+@end quotation
+
+@quotation Note
+ The @code{put-datum} procedure merely writes the external
+ representation, but no trailing delimiter. If @code{put-datum} is
+ used to write several subsequent external representations to an
+ output port, care should be taken to delimit them properly so they can
+ be read back in by subsequent calls to @code{get-datum}.
+@end quotation
+@end deffn
@node I/O Extensions
@subsection Using and Extending Ports in C
@@ -1690,7 +2338,6 @@ Set using
@end table
-
@c Local Variables:
@c TeX-master: "guile.texi"
@c End:
diff --git a/doc/ref/r6rs.texi b/doc/ref/r6rs.texi
index 2fe8d7b76..d054bd39e 100644
--- a/doc/ref/r6rs.texi
+++ b/doc/ref/r6rs.texi
@@ -1428,8 +1428,21 @@ functionality is documented in its own section of the manual;
The @code{(rnrs io simple (6))} library provides convenience functions
for performing textual I/O on ports. This library also exports all of
-the condition types and associated procedures described in
-(@pxref{I/O Conditions}).
+the condition types and associated procedures described in (@pxref{I/O
+Conditions}). In the context of this section, when stating that a
+procedure behaves ``identically'' to the corresponding procedure in
+Guile's core library, this is modulo the behavior wrt. conditions: such
+procedures raise the appropriate R6RS conditions in case of error, but
+otherwise behave identically.
+
+@c FIXME: remove the following note when proper condition behavior has
+@c been verified.
+
+@quotation Note
+There are still known issues regarding condition-correctness; some
+errors may still be thrown as native Guile exceptions instead of the
+appropriate R6RS conditions.
+@end quotation
@deffn {Scheme Procedure} eof-object
@deffnx {Scheme Procedure} eof-object? obj
diff --git a/libguile/foreign.c b/libguile/foreign.c
index 8081c5e3d..e82a8c581 100644
--- a/libguile/foreign.c
+++ b/libguile/foreign.c
@@ -965,9 +965,12 @@ unpack (const ffi_type *type, void *loc, SCM x)
}
#undef FUNC_NAME
-/* Return a Scheme representation of the foreign value at LOC of type TYPE. */
+/* Return a Scheme representation of the foreign value at LOC of type
+ TYPE. When RETURN_VALUE_P is true, LOC is assumed to point to a
+ return value buffer; otherwise LOC is assumed to point to an
+ argument buffer. */
static SCM
-pack (const ffi_type * type, const void *loc)
+pack (const ffi_type * type, const void *loc, int return_value_p)
{
switch (type->type)
{
@@ -977,22 +980,48 @@ pack (const ffi_type * type, const void *loc)
return scm_from_double (*(float *) loc);
case FFI_TYPE_DOUBLE:
return scm_from_double (*(double *) loc);
+
+ /* For integer return values smaller than `int', libffi stores the
+ result in an `ffi_arg'-long buffer, of which only the
+ significant bits must be kept---hence the pair of casts below.
+ See <http://thread.gmane.org/gmane.comp.lib.ffi.general/406>
+ for details. */
+
case FFI_TYPE_UINT8:
- return scm_from_uint8 (*(scm_t_uint8 *) loc);
+ if (return_value_p)
+ return scm_from_uint8 ((scm_t_uint8) *(ffi_arg *) loc);
+ else
+ return scm_from_uint8 (* (scm_t_uint8 *) loc);
case FFI_TYPE_SINT8:
- return scm_from_int8 (*(scm_t_int8 *) loc);
+ if (return_value_p)
+ return scm_from_int8 ((scm_t_int8) *(ffi_arg *) loc);
+ else
+ return scm_from_int8 (* (scm_t_int8 *) loc);
case FFI_TYPE_UINT16:
- return scm_from_uint16 (*(scm_t_uint16 *) loc);
+ if (return_value_p)
+ return scm_from_uint16 ((scm_t_uint16) *(ffi_arg *) loc);
+ else
+ return scm_from_uint16 (* (scm_t_uint16 *) loc);
case FFI_TYPE_SINT16:
- return scm_from_int16 (*(scm_t_int16 *) loc);
+ if (return_value_p)
+ return scm_from_int16 ((scm_t_int16) *(ffi_arg *) loc);
+ else
+ return scm_from_int16 (* (scm_t_int16 *) loc);
case FFI_TYPE_UINT32:
- return scm_from_uint32 (*(scm_t_uint32 *) loc);
+ if (return_value_p)
+ return scm_from_uint32 ((scm_t_uint32) *(ffi_arg *) loc);
+ else
+ return scm_from_uint32 (* (scm_t_uint32 *) loc);
case FFI_TYPE_SINT32:
- return scm_from_int32 (*(scm_t_int32 *) loc);
+ if (return_value_p)
+ return scm_from_int32 ((scm_t_int32) *(ffi_arg *) loc);
+ else
+ return scm_from_int32 (* (scm_t_int32 *) loc);
case FFI_TYPE_UINT64:
return scm_from_uint64 (*(scm_t_uint64 *) loc);
case FFI_TYPE_SINT64:
return scm_from_int64 (*(scm_t_int64 *) loc);
+
case FFI_TYPE_STRUCT:
{
void *mem = scm_gc_malloc_pointerless (type->size, "foreign");
@@ -1060,7 +1089,7 @@ scm_i_foreign_call (SCM foreign, const SCM *argv)
/* off we go! */
ffi_call (cif, func, rvalue, args);
- return pack (cif->rtype, rvalue);
+ return pack (cif->rtype, rvalue, 1);
}
@@ -1082,7 +1111,7 @@ invoke_closure (ffi_cif *cif, void *ret, void **args, void *data)
/* Pack ARGS to SCM values, setting ARGV pointers. */
for (i = 0; i < cif->nargs; i++)
- argv[i] = pack (cif->arg_types[i], args[i]);
+ argv[i] = pack (cif->arg_types[i], args[i], 0);
result = scm_call_n (proc, argv, cif->nargs);
diff --git a/libguile/hash.c b/libguile/hash.c
index 0dcd1c29e..8448c7ce9 100644
--- a/libguile/hash.c
+++ b/libguile/hash.c
@@ -26,6 +26,7 @@
#include <wchar.h>
#endif
+#include <math.h>
#include <unistr.h>
#include "libguile/_scm.h"
@@ -192,7 +193,7 @@ scm_hasher(SCM obj, unsigned long n, size_t d)
case scm_tc16_real:
{
double r = SCM_REAL_VALUE (obj);
- if (floor (r) == r)
+ if (floor (r) == r && !isinf (r) && !isnan (r))
{
obj = scm_inexact_to_exact (obj);
return scm_to_ulong (scm_modulo (obj, scm_from_ulong (n)));
diff --git a/libguile/inline.h b/libguile/inline.h
index 1eae2e40f..51a4db0ab 100644
--- a/libguile/inline.h
+++ b/libguile/inline.h
@@ -3,7 +3,8 @@
#ifndef SCM_INLINE_H
#define SCM_INLINE_H
-/* Copyright (C) 2001, 2002, 2003, 2004, 2006, 2008, 2009, 2010 Free Software Foundation, Inc.
+/* Copyright (C) 2001, 2002, 2003, 2004, 2006, 2008, 2009, 2010,
+ * 2011 Free Software Foundation, Inc.
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public License
@@ -98,6 +99,7 @@ SCM_API int scm_is_pair (SCM x);
SCM_API int scm_is_string (SCM x);
SCM_API int scm_get_byte_or_eof (SCM port);
+SCM_API int scm_peek_byte_or_eof (SCM port);
SCM_API void scm_putc (char c, SCM port);
SCM_API void scm_puts (const char *str_data, SCM port);
@@ -362,7 +364,7 @@ scm_get_byte_or_eof (SCM port)
if (pt->read_pos >= pt->read_end)
{
- if (scm_fill_input (port) == EOF)
+ if (SCM_UNLIKELY (scm_fill_input (port) == EOF))
return EOF;
}
@@ -371,6 +373,34 @@ scm_get_byte_or_eof (SCM port)
return c;
}
+/* Like `scm_get_byte_or_eof' but does not change PORT's `read_pos'. */
+#ifndef SCM_INLINE_C_INCLUDING_INLINE_H
+SCM_C_EXTERN_INLINE
+#endif
+int
+scm_peek_byte_or_eof (SCM port)
+{
+ int c;
+ scm_t_port *pt = SCM_PTAB_ENTRY (port);
+
+ if (pt->rw_active == SCM_PORT_WRITE)
+ /* may be marginally faster than calling scm_flush. */
+ scm_ptobs[SCM_PTOBNUM (port)].flush (port);
+
+ if (pt->rw_random)
+ pt->rw_active = SCM_PORT_READ;
+
+ if (pt->read_pos >= pt->read_end)
+ {
+ if (SCM_UNLIKELY (scm_fill_input (port) == EOF))
+ return EOF;
+ }
+
+ c = *pt->read_pos;
+
+ return c;
+}
+
#ifndef SCM_INLINE_C_INCLUDING_INLINE_H
SCM_C_EXTERN_INLINE
#endif
diff --git a/libguile/ports.c b/libguile/ports.c
index b5ad95ec7..926149bf9 100644
--- a/libguile/ports.c
+++ b/libguile/ports.c
@@ -1057,6 +1057,7 @@ update_port_lf (scm_t_wchar c, SCM port)
switch (c)
{
case '\a':
+ case EOF:
break;
case '\b':
SCM_DECCOL (port);
@@ -1115,23 +1116,154 @@ utf8_to_codepoint (const scm_t_uint8 *utf8_buf, size_t size)
return codepoint;
}
-/* Read a codepoint from PORT and return it in *CODEPOINT. Fill BUF
- with the byte representation of the codepoint in PORT's encoding, and
- set *LEN to the length in bytes of that representation. Return 0 on
- success and an errno value on error. */
+/* Read a UTF-8 sequence from PORT. On success, return 0 and set
+ *CODEPOINT to the codepoint that was read, fill BUF with its UTF-8
+ representation, and set *LEN to the length in bytes. Return
+ `EILSEQ' on error. */
static int
-get_codepoint (SCM port, scm_t_wchar *codepoint,
- char buf[SCM_MBCHAR_BUF_SIZE], size_t *len)
+get_utf8_codepoint (SCM port, scm_t_wchar *codepoint,
+ scm_t_uint8 buf[SCM_MBCHAR_BUF_SIZE], size_t *len)
{
+#define ASSERT_NOT_EOF(b) \
+ if (SCM_UNLIKELY ((b) == EOF)) \
+ goto invalid_seq
+#define CONSUME_PEEKED_BYTE() \
+ pt->read_pos++
+
+ int byte;
+ scm_t_port *pt;
+
+ *len = 0;
+ pt = SCM_PTAB_ENTRY (port);
+
+ byte = scm_get_byte_or_eof (port);
+ if (byte == EOF)
+ {
+ *codepoint = EOF;
+ return 0;
+ }
+
+ buf[0] = (scm_t_uint8) byte;
+ *len = 1;
+
+ if (buf[0] <= 0x7f)
+ /* 1-byte form. */
+ *codepoint = buf[0];
+ else if (buf[0] >= 0xc2 && buf[0] <= 0xdf)
+ {
+ /* 2-byte form. */
+ byte = scm_peek_byte_or_eof (port);
+ ASSERT_NOT_EOF (byte);
+
+ if (SCM_UNLIKELY ((byte & 0xc0) != 0x80))
+ goto invalid_seq;
+
+ CONSUME_PEEKED_BYTE ();
+ buf[1] = (scm_t_uint8) byte;
+ *len = 2;
+
+ *codepoint = ((scm_t_wchar) buf[0] & 0x1f) << 6UL
+ | (buf[1] & 0x3f);
+ }
+ else if ((buf[0] & 0xf0) == 0xe0)
+ {
+ /* 3-byte form. */
+ byte = scm_peek_byte_or_eof (port);
+ ASSERT_NOT_EOF (byte);
+
+ if (SCM_UNLIKELY ((byte & 0xc0) != 0x80
+ || (buf[0] == 0xe0 && byte < 0xa0)
+ || (buf[0] == 0xed && byte > 0x9f)))
+ goto invalid_seq;
+
+ CONSUME_PEEKED_BYTE ();
+ buf[1] = (scm_t_uint8) byte;
+ *len = 2;
+
+ byte = scm_peek_byte_or_eof (port);
+ ASSERT_NOT_EOF (byte);
+
+ if (SCM_UNLIKELY ((byte & 0xc0) != 0x80))
+ goto invalid_seq;
+
+ CONSUME_PEEKED_BYTE ();
+ buf[2] = (scm_t_uint8) byte;
+ *len = 3;
+
+ *codepoint = ((scm_t_wchar) buf[0] & 0x0f) << 12UL
+ | ((scm_t_wchar) buf[1] & 0x3f) << 6UL
+ | (buf[2] & 0x3f);
+ }
+ else if (buf[0] >= 0xf0 && buf[0] <= 0xf4)
+ {
+ /* 4-byte form. */
+ byte = scm_peek_byte_or_eof (port);
+ ASSERT_NOT_EOF (byte);
+
+ if (SCM_UNLIKELY (((byte & 0xc0) != 0x80)
+ || (buf[0] == 0xf0 && byte < 0x90)
+ || (buf[0] == 0xf4 && byte > 0x8f)))
+ goto invalid_seq;
+
+ CONSUME_PEEKED_BYTE ();
+ buf[1] = (scm_t_uint8) byte;
+ *len = 2;
+
+ byte = scm_peek_byte_or_eof (port);
+ ASSERT_NOT_EOF (byte);
+
+ if (SCM_UNLIKELY ((byte & 0xc0) != 0x80))
+ goto invalid_seq;
+
+ CONSUME_PEEKED_BYTE ();
+ buf[2] = (scm_t_uint8) byte;
+ *len = 3;
+
+ byte = scm_peek_byte_or_eof (port);
+ ASSERT_NOT_EOF (byte);
+
+ if (SCM_UNLIKELY ((byte & 0xc0) != 0x80))
+ goto invalid_seq;
+
+ CONSUME_PEEKED_BYTE ();
+ buf[3] = (scm_t_uint8) byte;
+ *len = 4;
+
+ *codepoint = ((scm_t_wchar) buf[0] & 0x07) << 18UL
+ | ((scm_t_wchar) buf[1] & 0x3f) << 12UL
+ | ((scm_t_wchar) buf[2] & 0x3f) << 6UL
+ | (buf[3] & 0x3f);
+ }
+ else
+ goto invalid_seq;
+
+ return 0;
+
+ invalid_seq:
+ /* Here we could choose the consume the faulty byte when it's not a
+ valid starting byte, but it's not a requirement. What Section 3.9
+ of Unicode 6.0.0 mandates, though, is to not consume a byte that
+ would otherwise be a valid starting byte. */
+
+ return EILSEQ;
+
+#undef CONSUME_PEEKED_BYTE
+#undef ASSERT_NOT_EOF
+}
+
+/* Likewise, read a byte sequence from PORT, passing it through its
+ input conversion descriptor. */
+static int
+get_iconv_codepoint (SCM port, scm_t_wchar *codepoint,
+ char buf[SCM_MBCHAR_BUF_SIZE], size_t *len)
+{
+ scm_t_port *pt;
int err, byte_read;
size_t bytes_consumed, output_size;
char *output;
scm_t_uint8 utf8_buf[SCM_MBCHAR_BUF_SIZE];
- scm_t_port *pt = SCM_PTAB_ENTRY (port);
- if (SCM_UNLIKELY (pt->input_cd == (iconv_t) -1))
- /* Initialize the conversion descriptors. */
- scm_i_set_port_encoding_x (port, pt->encoding);
+ pt = SCM_PTAB_ENTRY (port);
for (output_size = 0, output = (char *) utf8_buf,
bytes_consumed = 0, err = 0;
@@ -1177,31 +1309,46 @@ get_codepoint (SCM port, scm_t_wchar *codepoint,
if (SCM_UNLIKELY (output_size == 0))
/* An unterminated sequence. */
err = EILSEQ;
-
- if (SCM_UNLIKELY (err != 0))
+ else if (SCM_LIKELY (err == 0))
{
- /* Reset the `iconv' state. */
- iconv (pt->input_cd, NULL, NULL, NULL, NULL);
+ /* Convert the UTF8_BUF sequence to a Unicode code point. */
+ *codepoint = utf8_to_codepoint (utf8_buf, output_size);
+ *len = bytes_consumed;
+ }
- if (pt->ilseq_handler == SCM_ICONVEH_QUESTION_MARK)
- {
- *codepoint = '?';
- err = 0;
- }
+ return err;
+}
- /* Fail when the strategy is SCM_ICONVEH_ERROR or
- SCM_ICONVEH_ESCAPE_SEQUENCE (the latter doesn't make sense for
- input encoding errors.) */
- }
+/* Read a codepoint from PORT and return it in *CODEPOINT. Fill BUF
+ with the byte representation of the codepoint in PORT's encoding, and
+ set *LEN to the length in bytes of that representation. Return 0 on
+ success and an errno value on error. */
+static int
+get_codepoint (SCM port, scm_t_wchar *codepoint,
+ char buf[SCM_MBCHAR_BUF_SIZE], size_t *len)
+{
+ int err;
+ scm_t_port *pt = SCM_PTAB_ENTRY (port);
+
+ if (pt->input_cd == (iconv_t) -1)
+ /* Initialize the conversion descriptors, if needed. */
+ scm_i_set_port_encoding_x (port, pt->encoding);
+
+ /* FIXME: In 2.1, add a flag to determine whether a port is UTF-8. */
+ if (pt->input_cd == (iconv_t) -1)
+ err = get_utf8_codepoint (port, codepoint, (scm_t_uint8 *) buf, len);
else
+ err = get_iconv_codepoint (port, codepoint, buf, len);
+
+ if (SCM_LIKELY (err == 0))
+ update_port_lf (*codepoint, port);
+ else if (pt->ilseq_handler == SCM_ICONVEH_QUESTION_MARK)
{
- /* Convert the UTF8_BUF sequence to a Unicode code point. */
- *codepoint = utf8_to_codepoint (utf8_buf, output_size);
+ *codepoint = '?';
+ err = 0;
update_port_lf (*codepoint, port);
}
- *len = bytes_consumed;
-
return err;
}
@@ -2031,28 +2178,35 @@ scm_i_set_port_encoding_x (SCM port, const char *encoding)
if (encoding == NULL)
encoding = "ISO-8859-1";
- pt->encoding = scm_gc_strdup (encoding, "port");
+ if (pt->encoding != encoding)
+ pt->encoding = scm_gc_strdup (encoding, "port");
- if (SCM_CELL_WORD_0 (port) & SCM_RDNG)
+ /* If ENCODING is UTF-8, then no conversion descriptor is opened
+ because we do I/O ourselves. This saves 100+ KiB for each
+ descriptor. */
+ if (strcmp (encoding, "UTF-8"))
{
- /* Open an input iconv conversion descriptor, from ENCODING
- to UTF-8. We choose UTF-8, not UTF-32, because iconv
- implementations can typically convert from anything to
- UTF-8, but not to UTF-32 (see
- <http://lists.gnu.org/archive/html/bug-libunistring/2010-09/msg00007.html>). */
- new_input_cd = iconv_open ("UTF-8", encoding);
- if (new_input_cd == (iconv_t) -1)
- goto invalid_encoding;
- }
+ if (SCM_CELL_WORD_0 (port) & SCM_RDNG)
+ {
+ /* Open an input iconv conversion descriptor, from ENCODING
+ to UTF-8. We choose UTF-8, not UTF-32, because iconv
+ implementations can typically convert from anything to
+ UTF-8, but not to UTF-32 (see
+ <http://lists.gnu.org/archive/html/bug-libunistring/2010-09/msg00007.html>). */
+ new_input_cd = iconv_open ("UTF-8", encoding);
+ if (new_input_cd == (iconv_t) -1)
+ goto invalid_encoding;
+ }
- if (SCM_CELL_WORD_0 (port) & SCM_WRTNG)
- {
- new_output_cd = iconv_open (encoding, "UTF-8");
- if (new_output_cd == (iconv_t) -1)
+ if (SCM_CELL_WORD_0 (port) & SCM_WRTNG)
{
- if (new_input_cd != (iconv_t) -1)
- iconv_close (new_input_cd);
- goto invalid_encoding;
+ new_output_cd = iconv_open (encoding, "UTF-8");
+ if (new_output_cd == (iconv_t) -1)
+ {
+ if (new_input_cd != (iconv_t) -1)
+ iconv_close (new_input_cd);
+ goto invalid_encoding;
+ }
}
}
diff --git a/libguile/print.c b/libguile/print.c
index 139956624..453c8a9f0 100644
--- a/libguile/print.c
+++ b/libguile/print.c
@@ -821,31 +821,57 @@ codepoint_to_utf8 (scm_t_wchar ch, scm_t_uint8 utf8[4])
return len;
}
-/* Display the LEN codepoints in STR to PORT according to STRATEGY;
- return the number of codepoints successfully displayed. If NARROW_P,
- then STR is interpreted as a sequence of `char', denoting a Latin-1
- string; otherwise it's interpreted as a sequence of
- `scm_t_wchar'. */
-static size_t
-display_string (const void *str, int narrow_p,
- size_t len, SCM port,
- scm_t_string_failed_conversion_handler strategy)
-
-{
#define STR_REF(s, x) \
(narrow_p \
? (scm_t_wchar) ((unsigned char *) (s))[x] \
: ((scm_t_wchar *) (s))[x])
+/* Write STR to PORT as UTF-8. STR is a LEN-codepoint string; it is
+ narrow if NARROW_P is true, wide otherwise. Return LEN. */
+static size_t
+display_string_as_utf8 (const void *str, int narrow_p, size_t len,
+ SCM port)
+{
+ size_t printed = 0;
+
+ while (len > printed)
+ {
+ size_t utf8_len, i;
+ char *input, utf8_buf[256];
+
+ /* Convert STR to UTF-8. */
+ for (i = printed, utf8_len = 0, input = utf8_buf;
+ i < len && utf8_len + 4 < sizeof (utf8_buf);
+ i++)
+ {
+ utf8_len += codepoint_to_utf8 (STR_REF (str, i),
+ (scm_t_uint8 *) input);
+ input = utf8_buf + utf8_len;
+ }
+
+ /* INPUT was successfully converted, entirely; print the
+ result. */
+ scm_lfwrite (utf8_buf, utf8_len, port);
+ printed += i - printed;
+ }
+
+ assert (printed == len);
+
+ return len;
+}
+
+/* Convert STR through PORT's output conversion descriptor and write the
+ output to PORT. Return the number of codepoints written. */
+static size_t
+display_string_using_iconv (const void *str, int narrow_p, size_t len,
+ SCM port,
+ scm_t_string_failed_conversion_handler strategy)
+{
size_t printed;
scm_t_port *pt;
pt = SCM_PTAB_ENTRY (port);
- if (SCM_UNLIKELY (pt->output_cd == (iconv_t) -1))
- /* Initialize the conversion descriptors. */
- scm_i_set_port_encoding_x (port, pt->encoding);
-
printed = 0;
while (len > printed)
@@ -928,7 +954,35 @@ display_string (const void *str, int narrow_p,
}
return printed;
+}
+
#undef STR_REF
+
+/* Display the LEN codepoints in STR to PORT according to STRATEGY;
+ return the number of codepoints successfully displayed. If NARROW_P,
+ then STR is interpreted as a sequence of `char', denoting a Latin-1
+ string; otherwise it's interpreted as a sequence of
+ `scm_t_wchar'. */
+static size_t
+display_string (const void *str, int narrow_p,
+ size_t len, SCM port,
+ scm_t_string_failed_conversion_handler strategy)
+
+{
+ scm_t_port *pt;
+
+ pt = SCM_PTAB_ENTRY (port);
+
+ if (pt->output_cd == (iconv_t) -1)
+ /* Initialize the conversion descriptors, if needed. */
+ scm_i_set_port_encoding_x (port, pt->encoding);
+
+ /* FIXME: In 2.1, add a flag to determine whether a port is UTF-8. */
+ if (pt->output_cd == (iconv_t) -1)
+ return display_string_as_utf8 (str, narrow_p, len, port);
+ else
+ return display_string_using_iconv (str, narrow_p, len,
+ port, strategy);
}
/* Attempt to display CH to PORT according to STRATEGY. Return non-zero
diff --git a/libguile/r6rs-ports.c b/libguile/r6rs-ports.c
index 16890505f..015e0b526 100644
--- a/libguile/r6rs-ports.c
+++ b/libguile/r6rs-ports.c
@@ -460,14 +460,11 @@ SCM_DEFINE (scm_lookahead_u8, "lookahead-u8", 1, 0, 0,
SCM_VALIDATE_BINARY_INPUT_PORT (1, port);
- u8 = scm_get_byte_or_eof (port);
+ u8 = scm_peek_byte_or_eof (port);
if (u8 == EOF)
result = SCM_EOF_VAL;
else
- {
- scm_unget_byte (u8, port);
- result = SCM_I_MAKINUM ((scm_t_uint8) u8);
- }
+ result = SCM_I_MAKINUM ((scm_t_uint8) u8);
return result;
}
diff --git a/libguile/read.c b/libguile/read.c
index b36c27c81..676ccf753 100644
--- a/libguile/read.c
+++ b/libguile/read.c
@@ -1135,7 +1135,7 @@ scm_read_scsh_block_comment (scm_t_wchar chr, SCM port)
return SCM_UNSPECIFIED;
}
-static inline SCM
+static SCM
scm_read_shebang (scm_t_wchar chr, SCM port)
{
int c = 0;
diff --git a/module/ice-9/vlist.scm b/module/ice-9/vlist.scm
index 34c7c00c1..d5e28d540 100644
--- a/module/ice-9/vlist.scm
+++ b/module/ice-9/vlist.scm
@@ -33,7 +33,7 @@
vhash? vhash-cons vhash-consq vhash-consv
vhash-assoc vhash-assq vhash-assv
vhash-delete vhash-delq vhash-delv
- vhash-fold
+ vhash-fold vhash-fold-right
vhash-fold* vhash-foldq* vhash-foldv*
alist->vhash))
@@ -245,7 +245,14 @@ tail."
(define (vlist-fold-right proc init vlist)
"Fold over @var{vlist}, calling @var{proc} for each element, starting from
the last element."
- (vlist-fold proc init (vlist-reverse vlist)))
+ (define len (vlist-length vlist))
+
+ (let loop ((index (1- len))
+ (result init))
+ (if (< index 0)
+ result
+ (loop (1- index)
+ (proc (vlist-ref vlist index) result)))))
(define (vlist-reverse vlist)
"Return a new @var{vlist} whose content are those of @var{vlist} in reverse
@@ -553,6 +560,16 @@ with @var{equal?}."
seed
vhash))
+(define (vhash-fold-right proc seed vhash)
+ "Fold over the key/pair elements of @var{vhash}, starting from the 0th
+element. For each pair call @var{proc} as @code{(@var{proc} key value
+result)}."
+ (vlist-fold-right (lambda (key+value result)
+ (proc (car key+value) (cdr key+value)
+ result))
+ seed
+ vhash))
+
(define* (alist->vhash alist #:optional (hash hash))
"Return the vhash corresponding to @var{alist}, an association list."
(fold-right (lambda (pair result)
diff --git a/module/language/glil/compile-assembly.scm b/module/language/glil/compile-assembly.scm
index 76c19b468..a0818227d 100644
--- a/module/language/glil/compile-assembly.scm
+++ b/module/language/glil/compile-assembly.scm
@@ -1,6 +1,6 @@
;;; Guile VM assembler
-;; Copyright (C) 2001, 2009, 2010 Free Software Foundation, Inc.
+;; Copyright (C) 2001, 2009, 2010, 2011 Free Software Foundation, Inc.
;;;; This library is free software; you can redistribute it and/or
;;;; modify it under the terms of the GNU Lesser General Public
@@ -26,10 +26,36 @@
#:use-module (system vm instruction)
#:use-module ((system vm program) #:select (make-binding))
#:use-module (ice-9 receive)
+ #:use-module (ice-9 vlist)
#:use-module ((srfi srfi-1) #:select (fold))
#:use-module (rnrs bytevectors)
#:export (compile-assembly))
+;; Traversal helpers
+;;
+(define (vhash-fold-right2 proc vhash s0 s1)
+ (let lp ((i (vlist-length vhash)) (s0 s0) (s1 s1))
+ (if (zero? i)
+ (values s0 s1)
+ (receive (s0 s1) (let ((pair (vlist-ref vhash (1- i))))
+ (proc (car pair) (cdr pair) s0 s1))
+ (lp (1- i) s0 s1)))))
+
+(define (fold2 proc ls s0 s1)
+ (let lp ((ls ls) (s0 s0) (s1 s1))
+ (if (null? ls)
+ (values s0 s1)
+ (receive (s0 s1) (proc (car ls) s0 s1)
+ (lp (cdr ls) s0 s1)))))
+
+(define (vector-fold2 proc vect s0 s1)
+ (let ((len (vector-length vect)))
+ (let lp ((i 0) (s0 s0) (s1 s1))
+ (if (< i len)
+ (receive (s0 s1) (proc (vector-ref vect i) s0 s1)
+ (lp (1+ i) s0 s1))
+ (values s0 s1)))))
+
;; Variable cache cells go in the object table, and serialize as their
;; keys. The reason we wrap the keys in these records is so they don't
;; compare as `equal?' to other objects in the object table.
@@ -38,13 +64,6 @@
(define-record <variable-cache-cell> key)
-;; Subprograms can be loaded into an object table as well. We need a
-;; disjoint type here too. (Subprograms have their own object tables --
-;; though probably we should just make one table per compilation unit.)
-
-(define-record <subprogram> table prog)
-
-
(define (limn-sources sources)
(let lp ((in sources) (out '()) (filename #f))
(if (null? in)
@@ -68,16 +87,132 @@
(else
(lp (cdr in) out filename)))))))
+
+;; Avoid going through the compiler so as to avoid adding to the
+;; constant store.
(define (make-meta bindings sources arities tail)
- ;; sounds silly, but the only case in which we have no arities is when
- ;; compiling a meta procedure.
- (if (and (null? bindings) (null? sources) (null? arities) (null? tail))
- #f
- (compile-assembly
- (make-glil-program '()
- (list
- (make-glil-const `(,bindings ,sources ,arities ,@tail))
- (make-glil-call 'return 1))))))
+ (let ((body `(,@(dump-object `(,bindings ,sources ,arities ,@tail) 0)
+ (return))))
+ `(load-program ()
+ ,(addr+ 0 body)
+ #f
+ ,@body)))
+
+;; If this is true, the object doesn't need to go in a constant table.
+;;
+(define (immediate? x)
+ (object->assembly x))
+
+;; Note: in all of these procedures that build up constant tables, the
+;; first (zeroth) index is reserved. At runtime it is replaced with the
+;; procedure's module. Hence all of this 1+ length business.
+
+;; Build up a vhash of constant -> index, allowing us to build up a
+;; constant table for a whole compilation unit.
+;;
+(define (build-constant-store x)
+ (define (add-to-store store x)
+ (define (add-to-end store x)
+ (vhash-cons x (1+ (vlist-length store)) store))
+ (cond
+ ((vhash-assoc x store)
+ ;; Already in the store.
+ store)
+ ((immediate? x)
+ ;; Immediates don't need to go in the constant table.
+ store)
+ ((or (number? x)
+ (string? x)
+ (symbol? x)
+ (keyword? x))
+ ;; Atoms.
+ (add-to-end store x))
+ ((variable-cache-cell? x)
+ ;; Variable cache cells (see below).
+ (add-to-end (add-to-store store (variable-cache-cell-key x))
+ x))
+ ((list? x)
+ ;; Add the elements to the store, then the list itself. We could
+ ;; try hashing the cdrs as well, but that seems a bit overkill, and
+ ;; this way we do compress the bytecode a bit by allowing the use of
+ ;; the `list' opcode.
+ (let ((store (fold (lambda (x store)
+ (add-to-store store x))
+ store
+ x)))
+ (add-to-end store x)))
+ ((pair? x)
+ ;; Non-lists get caching on both fields.
+ (let ((store (add-to-store (add-to-store store (car x))
+ (cdr x))))
+ (add-to-end store x)))
+ ((and (vector? x)
+ (equal? (array-shape x) (list (list 0 (1- (vector-length x))))))
+ ;; Likewise, add the elements to the store, then the vector itself.
+ ;; Important for the vectors produced by the psyntax expansion
+ ;; process.
+ (let ((store (fold (lambda (x store)
+ (add-to-store store x))
+ store
+ (vector->list x))))
+ (add-to-end store x)))
+ ((array? x)
+ ;; Naive assumption that if folks are using arrays, that perhaps
+ ;; there's not much more duplication.
+ (add-to-end store x))
+ (else
+ (error "build-constant-store: unrecognized object" x))))
+
+ (let walk ((x x) (store vlist-null))
+ (record-case x
+ ((<glil-program> meta body)
+ (fold walk store body))
+ ((<glil-const> obj)
+ (add-to-store store obj))
+ ((<glil-kw-prelude> kw)
+ (add-to-store store kw))
+ ((<glil-toplevel> op name)
+ ;; We don't add toplevel variable cache cells to the global
+ ;; constant table, because they are sensitive to changes in
+ ;; modules as the toplevel expressions are evaluated. So we just
+ ;; add the name.
+ (add-to-store store name))
+ ((<glil-module> op mod name public?)
+ ;; However, it is fine add module variable cache cells to the
+ ;; global table, as their bindings are not dependent on the
+ ;; current module.
+ (add-to-store store
+ (make-variable-cache-cell (list mod name public?))))
+ (else store))))
+
+;; Analyze one <glil-program> to determine its object table. Produces a
+;; vhash of constant to index.
+;;
+(define (build-object-table x)
+ (define (add store x)
+ (if (vhash-assoc x store)
+ store
+ (vhash-cons x (1+ (vlist-length store)) store)))
+ (record-case x
+ ((<glil-program> meta body)
+ (fold (lambda (x table)
+ (record-case x
+ ((<glil-program> meta body)
+ ;; Add the GLIL itself to the table.
+ (add table x))
+ ((<glil-const> obj)
+ (if (immediate? obj)
+ table
+ (add table obj)))
+ ((<glil-kw-prelude> kw)
+ (add table kw))
+ ((<glil-toplevel> op name)
+ (add table (make-variable-cache-cell name)))
+ ((<glil-module> op mod name public?)
+ (add table (make-variable-cache-cell (list mod name public?))))
+ (else table)))
+ vlist-null
+ body))))
;; A functional stack of names of live variables.
(define (make-open-binding name boxed? index)
@@ -115,21 +250,6 @@
(lambda (x y) (< (car x) (car y)))))
(close-all-bindings (close-binding bindings end) end)))
-;; A functional object table.
-(define *module* 1)
-(define (assoc-ref-or-acons alist x make-y)
- (cond ((assoc-ref alist x)
- => (lambda (y) (values y alist)))
- (else
- (let ((y (make-y x alist)))
- (values y (acons x y alist))))))
-(define (object-index-and-alist x alist)
- (assoc-ref-or-acons alist x
- (lambda (x alist)
- (+ (length alist) *module*))))
-(define (make-object-table objects)
- (and (not (null? objects))
- (list->vector (cons #f objects))))
;; A functional arities thingamajiggy.
;; arities := ((ip nreq [[nopt] [[rest] [kw]]]]) ...)
@@ -152,82 +272,151 @@
(open-arity start nreq nopt rest kw (close-arity end arities)))
(define (compile-assembly glil)
- (receive (code . _)
- (glil->assembly glil #t '(()) '() '() #f '() -1)
- (car code)))
+ (let* ((all-constants (build-constant-store glil))
+ (prog (compile-program glil all-constants))
+ (len (byte-length prog)))
+ ;; The top objcode thunk. We're going to wrap this thunk in
+ ;; a thunk -- yo dawgs -- with the goal being to lift all
+ ;; constants up to the top level. The store forms a DAG, so
+ ;; we can actually build up later elements in terms of
+ ;; earlier ones.
+ ;;
+ (cond
+ ((vlist-null? all-constants)
+ ;; No constants: just emit the inner thunk.
+ prog)
+ (else
+ ;; We have an object store, so write it out, attach it
+ ;; to the inner thunk, and tail call.
+ (receive (tablecode addr) (dump-constants all-constants)
+ (let ((prog (align-program prog addr)))
+ ;; Outer thunk.
+ `(load-program ()
+ ,(+ (addr+ addr prog)
+ 2 ; for (tail-call 0)
+ )
+ #f
+ ;; Load the table, build the inner
+ ;; thunk, then tail call.
+ ,@tablecode
+ ,@prog
+ (tail-call 0))))))))
-(define (glil->assembly glil toplevel? bindings
- source-alist label-alist object-alist arities addr)
+(define (compile-program glil constants)
+ (record-case glil
+ ((<glil-program> meta body)
+ (let lp ((body body) (code '()) (bindings '(())) (source-alist '())
+ (label-alist '()) (arities '()) (addr 0))
+ (cond
+ ((null? body)
+ (let ((code (fold append '() code))
+ (bindings (close-all-bindings bindings addr))
+ (sources (limn-sources (reverse! source-alist)))
+ (labels (reverse label-alist))
+ (arities (reverse (close-arity addr arities)))
+ (len addr))
+ (let* ((meta (make-meta bindings sources arities meta))
+ (meta-pad (if meta (modulo (- 8 (modulo len 8)) 8) 0)))
+ `(load-program ,labels
+ ,(+ len meta-pad)
+ ,meta
+ ,@code
+ ,@(if meta
+ (make-list meta-pad '(nop))
+ '())))))
+ (else
+ (receive (subcode bindings source-alist label-alist arities)
+ (glil->assembly (car body) bindings
+ source-alist label-alist
+ constants arities addr)
+ (lp (cdr body) (cons subcode code)
+ bindings source-alist label-alist arities
+ (addr+ addr subcode)))))))))
+
+(define (compile-objtable constants table addr)
+ (define (load-constant idx)
+ (if (< idx 256)
+ (values `((object-ref ,idx))
+ 2)
+ (values `((long-object-ref
+ ,(quotient idx 256) ,(modulo idx 256)))
+ 3)))
+ (cond
+ ((vlist-null? table)
+ ;; Empty table; just return #f.
+ (values '((make-false))
+ (1+ addr)))
+ (else
+ (call-with-values
+ (lambda ()
+ (vhash-fold-right2
+ (lambda (obj idx codes addr)
+ (cond
+ ((vhash-assoc obj constants)
+ => (lambda (pair)
+ (receive (load len) (load-constant (cdr pair))
+ (values (cons load codes)
+ (+ addr len)))))
+ ((variable-cache-cell? obj)
+ (cond
+ ((vhash-assoc (variable-cache-cell-key obj) constants)
+ => (lambda (pair)
+ (receive (load len) (load-constant (cdr pair))
+ (values (cons load codes)
+ (+ addr len)))))
+ (else (error "vcache cell key not in table" obj))))
+ ((glil-program? obj)
+ ;; Programs are not cached in the global constants
+ ;; table because when a program is loaded, its module
+ ;; is bound, and we want to do that only after any
+ ;; preceding effectful statements.
+ (let* ((table (build-object-table obj))
+ (prog (compile-program obj table)))
+ (receive (tablecode addr)
+ (compile-objtable constants table addr)
+ (let ((prog (align-program prog addr)))
+ (values (cons `(,@tablecode ,@prog)
+ codes)
+ (addr+ addr prog))))))
+ (else
+ (error "unrecognized constant" obj))))
+ table
+ '(((make-false))) (1+ addr)))
+ (lambda (elts addr)
+ (let ((len (1+ (vlist-length table))))
+ (values
+ (fold append
+ `((vector ,(quotient len 256) ,(modulo len 256)))
+ elts)
+ (+ addr 3))))))))
+
+(define (glil->assembly glil bindings source-alist label-alist
+ constants arities addr)
(define (emit-code x)
- (values x bindings source-alist label-alist object-alist arities))
- (define (emit-code/object x object-alist)
- (values x bindings source-alist label-alist object-alist arities))
+ (values x bindings source-alist label-alist arities))
+ (define (emit-object-ref i)
+ (values (if (< i 256)
+ `((object-ref ,i))
+ `((long-object-ref ,(quotient i 256) ,(modulo i 256))))
+ bindings source-alist label-alist arities))
(define (emit-code/arity x nreq nopt rest kw)
- (values x bindings source-alist label-alist object-alist
+ (values x bindings source-alist label-alist
(begin-arity addr (addr+ addr x) nreq nopt rest kw arities)))
(record-case glil
((<glil-program> meta body)
- (define (process-body)
- (let lp ((body body) (code '()) (bindings '(())) (source-alist '())
- (label-alist '()) (object-alist (if toplevel? #f '()))
- (arities '()) (addr 0))
- (cond
- ((null? body)
- (values (reverse code)
- (close-all-bindings bindings addr)
- (limn-sources (reverse! source-alist))
- (reverse label-alist)
- (and object-alist (map car (reverse object-alist)))
- (reverse (close-arity addr arities))
- addr))
- (else
- (receive (subcode bindings source-alist label-alist object-alist
- arities)
- (glil->assembly (car body) #f bindings
- source-alist label-alist object-alist
- arities addr)
- (lp (cdr body) (append (reverse subcode) code)
- bindings source-alist label-alist object-alist arities
- (addr+ addr subcode)))))))
-
- (receive (code bindings sources labels objects arities len)
- (process-body)
- (let* ((meta (make-meta bindings sources arities meta))
- (meta-pad (if meta (modulo (- 8 (modulo len 8)) 8) 0))
- (prog `(load-program ,labels
- ,(+ len meta-pad)
- ,meta
- ,@code
- ,@(if meta
- (make-list meta-pad '(nop))
- '()))))
- (cond
- (toplevel?
- ;; toplevel bytecode isn't loaded by the vm, no way to do
- ;; object table or closure capture (not in the bytecode,
- ;; anyway)
- (emit-code (align-program prog addr)))
- (else
- (let ((table (make-object-table objects)))
- (cond
- (object-alist
- ;; if we are being compiled from something with an object
- ;; table, cache the program there
- (receive (i object-alist)
- (object-index-and-alist (make-subprogram table prog)
- object-alist)
- (emit-code/object `(,(if (< i 256)
- `(object-ref ,i)
- `(long-object-ref ,(quotient i 256)
- ,(modulo i 256))))
- object-alist)))
- (else
- ;; otherwise emit a load directly
- (let ((table-code (dump-object table addr)))
- (emit-code
- `(,@table-code
- ,@(align-program prog (addr+ addr table-code)))))))))))))
+ (cond
+ ((vhash-assoc glil constants)
+ ;; We are cached in someone's objtable; just emit a load.
+ => (lambda (pair)
+ (emit-object-ref (cdr pair))))
+ (else
+ ;; Otherwise, build an objtable for the program, compile it, and
+ ;; emit a load-program.
+ (let* ((table (build-object-table glil))
+ (prog (compile-program glil table)))
+ (receive (tablecode addr) (compile-objtable constants table addr)
+ (emit-code `(,@tablecode ,@(align-program prog addr))))))))
((<glil-std-prelude> nreq nlocs else-label)
(emit-code/arity
@@ -277,61 +466,60 @@
nreq nopt rest #f)))
((<glil-kw-prelude> nreq nopt rest kw allow-other-keys? nlocs else-label)
- (receive (kw-idx object-alist)
- (object-index-and-alist kw object-alist)
- (let* ((bind-required
- (if else-label
- `((br-if-nargs-lt ,(quotient nreq 256)
- ,(modulo nreq 256)
- ,else-label))
- `((assert-nargs-ge ,(quotient nreq 256)
- ,(modulo nreq 256)))))
- (ntotal (apply max (+ nreq nopt) (map 1+ (map cdr kw))))
- (bind-optionals-and-shuffle
- `((bind-optionals/shuffle
- ,(quotient nreq 256)
- ,(modulo nreq 256)
- ,(quotient (+ nreq nopt) 256)
- ,(modulo (+ nreq nopt) 256)
- ,(quotient ntotal 256)
- ,(modulo ntotal 256))))
- (bind-kw
- ;; when this code gets called, all optionals are filled
- ;; in, space has been made for kwargs, and the kwargs
- ;; themselves have been shuffled above the slots for all
- ;; req/opt/kwargs locals.
- `((bind-kwargs
- ,(quotient kw-idx 256)
- ,(modulo kw-idx 256)
- ,(quotient ntotal 256)
- ,(modulo ntotal 256)
- ,(logior (if rest 2 0)
- (if allow-other-keys? 1 0)))))
- (bind-rest
- (if rest
- `((bind-rest ,(quotient ntotal 256)
- ,(modulo ntotal 256)
- ,(quotient rest 256)
- ,(modulo rest 256)))
- '())))
+ (let* ((kw-idx (or (and=> (vhash-assoc kw constants) cdr)
+ (error "kw not in objtable")))
+ (bind-required
+ (if else-label
+ `((br-if-nargs-lt ,(quotient nreq 256)
+ ,(modulo nreq 256)
+ ,else-label))
+ `((assert-nargs-ge ,(quotient nreq 256)
+ ,(modulo nreq 256)))))
+ (ntotal (apply max (+ nreq nopt) (map 1+ (map cdr kw))))
+ (bind-optionals-and-shuffle
+ `((bind-optionals/shuffle
+ ,(quotient nreq 256)
+ ,(modulo nreq 256)
+ ,(quotient (+ nreq nopt) 256)
+ ,(modulo (+ nreq nopt) 256)
+ ,(quotient ntotal 256)
+ ,(modulo ntotal 256))))
+ (bind-kw
+ ;; when this code gets called, all optionals are filled
+ ;; in, space has been made for kwargs, and the kwargs
+ ;; themselves have been shuffled above the slots for all
+ ;; req/opt/kwargs locals.
+ `((bind-kwargs
+ ,(quotient kw-idx 256)
+ ,(modulo kw-idx 256)
+ ,(quotient ntotal 256)
+ ,(modulo ntotal 256)
+ ,(logior (if rest 2 0)
+ (if allow-other-keys? 1 0)))))
+ (bind-rest
+ (if rest
+ `((bind-rest ,(quotient ntotal 256)
+ ,(modulo ntotal 256)
+ ,(quotient rest 256)
+ ,(modulo rest 256)))
+ '())))
- (let ((code `(,@bind-required
- ,@bind-optionals-and-shuffle
- ,@bind-kw
- ,@bind-rest
- (reserve-locals ,(quotient nlocs 256)
- ,(modulo nlocs 256)))))
- (values code bindings source-alist label-alist object-alist
- (begin-arity addr (addr+ addr code) nreq nopt rest
- (and kw (cons allow-other-keys? kw))
- arities))))))
+ (let ((code `(,@bind-required
+ ,@bind-optionals-and-shuffle
+ ,@bind-kw
+ ,@bind-rest
+ (reserve-locals ,(quotient nlocs 256)
+ ,(modulo nlocs 256)))))
+ (values code bindings source-alist label-alist
+ (begin-arity addr (addr+ addr code) nreq nopt rest
+ (and kw (cons allow-other-keys? kw))
+ arities)))))
((<glil-bind> vars)
(values '()
(open-binding bindings vars addr)
source-alist
label-alist
- object-alist
arities))
((<glil-mv-bind> vars rest)
@@ -340,13 +528,11 @@
bindings
source-alist
label-alist
- object-alist
arities)
(values `((truncate-values ,(length vars) ,(if rest 1 0)))
(open-binding bindings vars addr)
source-alist
label-alist
- object-alist
arities)))
((<glil-unbind>)
@@ -354,7 +540,6 @@
(close-binding bindings addr)
source-alist
label-alist
- object-alist
arities))
((<glil-source> props)
@@ -362,7 +547,6 @@
bindings
(acons addr props source-alist)
label-alist
- object-alist
arities))
((<glil-void>)
@@ -373,16 +557,10 @@
((object->assembly obj)
=> (lambda (code)
(emit-code (list code))))
- ((not object-alist)
- (emit-code (dump-object obj addr)))
- (else
- (receive (i object-alist)
- (object-index-and-alist obj object-alist)
- (emit-code/object (if (< i 256)
- `((object-ref ,i))
- `((long-object-ref ,(quotient i 256)
- ,(modulo i 256))))
- object-alist)))))
+ ((vhash-assoc obj constants)
+ => (lambda (pair)
+ (emit-object-ref (cdr pair))))
+ (else (error "const not in table" obj))))
((<glil-lexical> local? boxed? op index)
(emit-code
@@ -442,30 +620,38 @@
(case op
((ref set)
(cond
- ((not object-alist)
- (emit-code `(,@(dump-object name addr)
- (link-now)
- ,(case op
- ((ref) '(variable-ref))
- ((set) '(variable-set))))))
+ ((and=> (vhash-assoc (make-variable-cache-cell name) constants)
+ cdr)
+ => (lambda (i)
+ (emit-code (if (< i 256)
+ `((,(case op
+ ((ref) 'toplevel-ref)
+ ((set) 'toplevel-set))
+ ,i))
+ `((,(case op
+ ((ref) 'long-toplevel-ref)
+ ((set) 'long-toplevel-set))
+ ,(quotient i 256)
+ ,(modulo i 256)))))))
(else
- (receive (i object-alist)
- (object-index-and-alist (make-variable-cache-cell name)
- object-alist)
- (emit-code/object (if (< i 256)
- `((,(case op
- ((ref) 'toplevel-ref)
- ((set) 'toplevel-set))
- ,i))
- `((,(case op
- ((ref) 'long-toplevel-ref)
- ((set) 'long-toplevel-set))
- ,(quotient i 256)
- ,(modulo i 256))))
- object-alist)))))
+ (let ((i (or (and=> (vhash-assoc name constants) cdr)
+ (error "toplevel name not in objtable" name))))
+ (emit-code `(,(if (< i 256)
+ `(object-ref ,i)
+ `(long-object-ref ,(quotient i 256)
+ ,(modulo i 256)))
+ (link-now)
+ ,(case op
+ ((ref) '(variable-ref))
+ ((set) '(variable-set)))))))))
((define)
- (emit-code `(,@(dump-object name addr)
- (define))))
+ (let ((i (or (and=> (vhash-assoc name constants) cdr)
+ (error "toplevel name not in objtable" name))))
+ (emit-code `(,(if (< i 256)
+ `(object-ref ,i)
+ `(long-object-ref ,(quotient i 256)
+ ,(modulo i 256)))
+ (define)))))
(else
(error "unknown toplevel var kind" op name))))
@@ -473,21 +659,19 @@
(let ((key (list mod name public?)))
(case op
((ref set)
- (cond
- ((not object-alist)
- (emit-code `(,@(dump-object key addr)
- (link-now)
- ,(case op
- ((ref) '(variable-ref))
- ((set) '(variable-set))))))
- (else
- (receive (i object-alist)
- (object-index-and-alist (make-variable-cache-cell key)
- object-alist)
- (emit-code/object (case op
- ((ref) `((toplevel-ref ,i)))
- ((set) `((toplevel-set ,i))))
- object-alist)))))
+ (let ((i (or (and=> (vhash-assoc (make-variable-cache-cell key)
+ constants) cdr)
+ (error "module vcache not in objtable" key))))
+ (emit-code (if (< i 256)
+ `((,(case op
+ ((ref) 'toplevel-ref)
+ ((set) 'toplevel-set))
+ ,i))
+ `((,(case op
+ ((ref) 'long-toplevel-ref)
+ ((set) 'long-toplevel-set))
+ ,(quotient i 256)
+ ,(modulo i 256)))))))
(else
(error "unknown module var kind" op key)))))
@@ -497,7 +681,6 @@
bindings
source-alist
(acons label (addr+ addr code) label-alist)
- object-alist
arities)))
((<glil-branch> inst label)
@@ -533,11 +716,6 @@
(cond
((object->assembly x) => list)
((variable-cache-cell? x) (dump-object (variable-cache-cell-key x) addr))
- ((subprogram? x)
- (let ((table-code (dump-object (subprogram-table x) addr)))
- `(,@table-code
- ,@(align-program (subprogram-prog x)
- (addr+ addr table-code)))))
((number? x)
`((load-number ,(number->string x))))
((string? x)
@@ -608,5 +786,153 @@
,(logand #xff len)))
codes)))))
(else
- (error "assemble: unrecognized object" x))))
+ (error "dump-object: unrecognized object" x))))
+
+(define (dump-constants constants)
+ (define (ref-or-dump x i addr)
+ (let ((pair (vhash-assoc x constants)))
+ (if (and pair (< (cdr pair) i))
+ (let ((idx (cdr pair)))
+ (if (< idx 256)
+ (values `((object-ref ,idx))
+ (+ addr 2))
+ (values `((long-object-ref ,(quotient idx 256)
+ ,(modulo idx 256)))
+ (+ addr 3))))
+ (dump1 x i addr))))
+ (define (dump1 x i addr)
+ (cond
+ ((object->assembly x)
+ => (lambda (code)
+ (values (list code)
+ (+ (byte-length code) addr))))
+ ((or (number? x)
+ (string? x)
+ (symbol? x)
+ (keyword? x))
+ ;; Atoms.
+ (let ((code (dump-object x addr)))
+ (values code (addr+ addr code))))
+ ((variable-cache-cell? x)
+ (dump1 (variable-cache-cell-key x) i addr))
+ ((list? x)
+ (receive (codes addr)
+ (fold2 (lambda (x codes addr)
+ (receive (subcode addr) (ref-or-dump x i addr)
+ (values (cons subcode codes) addr)))
+ x '() addr)
+ (values (fold append
+ (let ((len (length x)))
+ `((list ,(quotient len 256) ,(modulo len 256))))
+ codes)
+ (+ addr 3))))
+ ((pair? x)
+ (receive (car-code addr) (ref-or-dump (car x) i addr)
+ (receive (cdr-code addr) (ref-or-dump (cdr x) i addr)
+ (values `(,@car-code ,@cdr-code (cons))
+ (1+ addr)))))
+ ((and (vector? x)
+ (equal? (array-shape x) (list (list 0 (1- (vector-length x))))))
+ (receive (codes addr)
+ (vector-fold2 (lambda (x codes addr)
+ (receive (subcode addr) (ref-or-dump x i addr)
+ (values (cons subcode codes) addr)))
+ x '() addr)
+ (values (fold append
+ (let ((len (vector-length x)))
+ `((vector ,(quotient len 256) ,(modulo len 256))))
+ codes)
+ (+ addr 3))))
+ ((and (array? x) (symbol? (array-type x)))
+ (receive (type addr) (ref-or-dump (array-type x) i addr)
+ (receive (shape addr) (ref-or-dump (array-shape x) i addr)
+ (let ((bv (align-code `(load-array ,(uniform-array->bytevector x))
+ addr 8 4)))
+ (values `(,@type ,@shape ,@bv)
+ (addr+ addr bv))))))
+ ((array? x)
+ (let ((contents (array-contents x)))
+ (receive (codes addr)
+ (vector-fold2 (lambda (x codes addr)
+ (receive (subcode addr) (ref-or-dump x i addr)
+ (values (cons subcode codes) addr)))
+ x '() addr)
+ (receive (shape addr) (ref-or-dump (array-shape x) i addr)
+ (values (fold append
+ (let ((len (vector-length contents)))
+ `(,@shape
+ (make-array ,(quotient (ash len -16) 256)
+ ,(logand #xff (ash len -8))
+ ,(logand #xff len))))
+ codes)
+ (+ addr 4))))))
+ (else
+ (error "write-table: unrecognized object" x))))
+ (receive (codes addr)
+ (vhash-fold-right2 (lambda (obj idx code addr)
+ ;; The vector is on the stack. Dup it, push
+ ;; the index, push the val, then vector-set.
+ (let ((pre `((dup)
+ ,(object->assembly idx))))
+ (receive (valcode addr) (dump1 obj idx
+ (addr+ addr pre))
+ (values (cons* '((vector-set))
+ valcode
+ pre
+ code)
+ (1+ addr)))))
+ constants
+ '(((assert-nargs-ee/locals 1)
+ ;; Push the vector.
+ (local-ref 0)))
+ 4)
+ (let* ((len (1+ (vlist-length constants)))
+ (pre-prog-addr (+ 2 ; reserve-locals
+ len 3 ; empty vector
+ 2 ; local-set
+ 1 ; new-frame
+ 2 ; local-ref
+ ))
+ (prog (align-program
+ `(load-program ()
+ ,(+ addr 1)
+ #f
+ ;; The `return' will be at the tail of the
+ ;; program. The vector is already pushed
+ ;; on the stack.
+ . ,(fold append '((return)) codes))
+ pre-prog-addr)))
+ (values `(;; Reserve storage for the vector.
+ (assert-nargs-ee/locals ,(logior 0 (ash 1 3)))
+ ;; Push the vector, and store it in slot 0.
+ ,@(make-list len '(make-false))
+ (vector ,(quotient len 256) ,(modulo len 256))
+ (local-set 0)
+ ;; Now we open the call frame.
+ ;;
+ (new-frame)
+ ;; Now build a thunk to init the constants. It will
+ ;; have the unfinished constant table both as its
+ ;; argument and as its objtable. The former allows it
+ ;; to update the objtable, with vector-set!, and the
+ ;; latter allows init code to refer to previously set
+ ;; values.
+ ;;
+ ;; Grab the vector, to be the objtable.
+ (local-ref 0)
+ ;; Now the load-program, properly aligned. Pops the vector.
+ ,@prog
+ ;; Grab the vector, as an argument this time.
+ (local-ref 0)
+ ;; Call the init thunk with the vector as an arg.
+ (call 1)
+ ;; The thunk also returns the vector. Leave it on the
+ ;; stack for compile-assembly to use.
+ )
+ ;; The byte length of the init code, which we can
+ ;; determine without folding over the code again.
+ (+ (addr+ pre-prog-addr prog) ; aligned program
+ 2 ; local-ref
+ 2 ; call
+ )))))
diff --git a/module/rnrs/io/ports.scm b/module/rnrs/io/ports.scm
index 04d167a2c..3dbaa039b 100644
--- a/module/rnrs/io/ports.scm
+++ b/module/rnrs/io/ports.scm
@@ -110,7 +110,9 @@
(rnrs files) ;for the condition types
(srfi srfi-8)
(ice-9 rdelim)
- (except (guile) raise))
+ (except (guile) raise display)
+ (prefix (only (guile) display)
+ guile:))
@@ -377,6 +379,12 @@ return the characters accumulated in that port."
(else
(display s port)))))
+;; Defined here to be able to make use of `with-i/o-encoding-error', but
+;; not exported from here, but from `(rnrs io simple)'.
+(define* (display object #:optional (port (current-output-port)))
+ (with-i/o-encoding-error
+ (guile:display object port)))
+
;;;
;;; Textual input.
diff --git a/module/rnrs/io/simple.scm b/module/rnrs/io/simple.scm
index 59e614de0..031628b38 100644
--- a/module/rnrs/io/simple.scm
+++ b/module/rnrs/io/simple.scm
@@ -1,6 +1,6 @@
;;; simple.scm --- The R6RS simple I/O library
-;; Copyright (C) 2010 Free Software Foundation, Inc.
+;; Copyright (C) 2010, 2011 Free Software Foundation, Inc.
;;
;; This library is free software; you can redistribute it and/or
;; modify it under the terms of the GNU Lesser General Public
@@ -85,42 +85,76 @@
(import (only (rnrs io ports)
call-with-port
+ close-port
open-file-input-port
open-file-output-port
eof-object
- eof-object?
-
+ eof-object?
+ file-options
+ native-transcoder
+ get-char
+ lookahead-char
+ get-datum
+ put-char
+ put-datum
+
input-port?
output-port?)
- (only (guile) @@
- current-input-port
- current-output-port
- current-error-port
-
- with-input-from-file
- with-output-to-file
-
- open-input-file
- open-output-file
-
- close-input-port
- close-output-port
-
- read-char
- peek-char
- read
- write-char
- newline
- display
- write)
+ (only (guile)
+ @@
+ current-input-port
+ current-output-port
+ current-error-port
+
+ define*
+
+ with-input-from-port
+ with-output-to-port)
(rnrs base (6))
(rnrs files (6)) ;for the condition types
)
+ (define display (@@ (rnrs io ports) display))
+
(define (call-with-input-file filename proc)
(call-with-port (open-file-input-port filename) proc))
(define (call-with-output-file filename proc)
(call-with-port (open-file-output-port filename) proc))
-
-)
+
+ (define (with-input-from-file filename thunk)
+ (call-with-input-file filename
+ (lambda (port) (with-input-from-port port thunk))))
+
+ (define (with-output-to-file filename thunk)
+ (call-with-output-file filename
+ (lambda (port) (with-output-to-port port thunk))))
+
+ (define (open-input-file filename)
+ (open-file-input-port filename (file-options) (native-transcoder)))
+
+ (define (open-output-file filename)
+ (open-file-output-port filename (file-options) (native-transcoder)))
+
+ (define close-input-port close-port)
+ (define close-output-port close-port)
+
+ (define* (read-char #:optional (port (current-input-port)))
+ (get-char port))
+
+ (define* (peek-char #:optional (port (current-input-port)))
+ (lookahead-char port))
+
+ (define* (read #:optional (port (current-input-port)))
+ (get-datum port))
+
+ (define* (write-char char #:optional (port (current-output-port)))
+ (put-char port char))
+
+ (define* (newline #:optional (port (current-output-port)))
+ (put-char port #\newline))
+
+ (define* (write object #:optional (port (current-output-port)))
+ (put-datum port object))
+
+ )
diff --git a/module/srfi/srfi-1.scm b/module/srfi/srfi-1.scm
index bcaca65d7..c60f6257f 100644
--- a/module/srfi/srfi-1.scm
+++ b/module/srfi/srfi-1.scm
@@ -566,6 +566,8 @@ has just one element then that's the return value."
(mapn (cdr l1) (map cdr rest) (1- len)
(cons (apply f (car l1) (map car rest)) out))))))))
+(define map-in-order map)
+
(define for-each
(case-lambda
((f l)
diff --git a/test-suite/tests/foreign.test b/test-suite/tests/foreign.test
index 60b466e1c..5ddd31c1d 100644
--- a/test-suite/tests/foreign.test
+++ b/test-suite/tests/foreign.test
@@ -124,24 +124,32 @@
(pass-if "pointer from bits"
(let* ((bytes (iota (sizeof '*)))
- (bv (u8-list->bytevector bytes)))
+ (bv (u8-list->bytevector bytes))
+ (fold (case (native-endianness)
+ ((little) fold-right)
+ ((big) fold)
+ (else (error "unsupported endianness")))))
(= (pointer-address
(make-pointer (bytevector-uint-ref bv 0 (native-endianness)
(sizeof '*))))
- (fold-right (lambda (byte address)
- (+ byte (* 256 address)))
- 0
- bytes))))
+ (fold (lambda (byte address)
+ (+ byte (* 256 address)))
+ 0
+ bytes))))
(pass-if "dereference-pointer"
(let* ((bytes (iota (sizeof '*)))
- (bv (u8-list->bytevector bytes)))
+ (bv (u8-list->bytevector bytes))
+ (fold (case (native-endianness)
+ ((little) fold-right)
+ ((big) fold)
+ (else (error "unsupported endianness")))))
(= (pointer-address
(dereference-pointer (bytevector->pointer bv)))
- (fold-right (lambda (byte address)
- (+ byte (* 256 address)))
- 0
- bytes)))))
+ (fold (lambda (byte address)
+ (+ byte (* 256 address)))
+ 0
+ bytes)))))
(with-test-prefix "pointer<->string"
diff --git a/test-suite/tests/hash.test b/test-suite/tests/hash.test
index d2bde481c..f3d603db3 100644
--- a/test-suite/tests/hash.test
+++ b/test-suite/tests/hash.test
@@ -1,6 +1,6 @@
;;;; hash.test --- test guile hashing -*- scheme -*-
;;;;
-;;;; Copyright (C) 2004, 2005, 2006, 2008 Free Software Foundation, Inc.
+;;;; Copyright (C) 2004, 2005, 2006, 2008, 2011 Free Software Foundation, Inc.
;;;;
;;;; This library is free software; you can redistribute it and/or
;;;; modify it under the terms of the GNU Lesser General Public
@@ -32,7 +32,10 @@
(hash #t 0))
(pass-if (= 0 (hash #t 1)))
(pass-if (= 0 (hash #f 1)))
- (pass-if (= 0 (hash noop 1))))
+ (pass-if (= 0 (hash noop 1)))
+ (pass-if (= 0 (hash +inf.0 1)))
+ (pass-if (= 0 (hash -inf.0 1)))
+ (pass-if (= 0 (hash +nan.0 1))))
;;;
;;; hashv
diff --git a/test-suite/tests/ports.test b/test-suite/tests/ports.test
index 9fb6a9632..d4a333f56 100644
--- a/test-suite/tests/ports.test
+++ b/test-suite/tests/ports.test
@@ -572,21 +572,40 @@
eof))
(test-decoding-error (#xc2 #x41 #x42) "UTF-8"
- (error ;; 41: should be in the 80..BF range
+ ;; Section 3.9 of Unicode 6.0.0 reads:
+ ;; "If the converter encounters an ill-formed UTF-8 code unit
+ ;; sequence which starts with a valid first byte, but which does
+ ;; not continue with valid successor bytes (see Table 3-7), it
+ ;; must not consume the successor bytes".
+ ;; Glibc/libiconv do not conform to it and instead swallow the
+ ;; #x41. This example appears literally in Section 3.9.
+ (error ;; 41: invalid successor
+ #\A ;; 41: valid starting byte
#\B
eof))
- (test-decoding-error (#xe0 #x88 #x88) "UTF-8"
+ (test-decoding-error (#xf0 #x80 #x80 #x41) "UTF-8"
+ ;; According to Unicode 6.0.0, Section 3.9, "the only formal
+ ;; requirement mandated by Unicode conformance for a converter is
+ ;; that the <41> be processed and correctly interpreted as
+ ;; <U+0041>".
(error ;; 2nd byte should be in the A0..BF range
+ error ;; 80: not a valid starting byte
+ error ;; 80: not a valid starting byte
+ #\A
eof))
(test-decoding-error (#xe0 #xa0 #x41 #x42) "UTF-8"
(error ;; 3rd byte should be in the 80..BF range
+ #\A
#\B
eof))
(test-decoding-error (#xf0 #x88 #x88 #x88) "UTF-8"
(error ;; 2nd byte should be in the 90..BF range
+ error ;; 88: not a valid starting byte
+ error ;; 88: not a valid starting byte
+ error ;; 88: not a valid starting byte
eof))))
(with-test-prefix "call-with-output-string"
diff --git a/test-suite/tests/vlist.test b/test-suite/tests/vlist.test
index b590bbda1..d939284c1 100644
--- a/test-suite/tests/vlist.test
+++ b/test-suite/tests/vlist.test
@@ -301,6 +301,13 @@
(alist (fold alist-cons '() keys values)))
(equal? alist (reverse (vhash-fold alist-cons '() vh)))))
+ (pass-if "vhash-fold-right"
+ (let* ((keys '(a b c d e f g d h i))
+ (values '(1 2 3 4 5 6 7 0 8 9))
+ (vh (fold vhash-cons vlist-null keys values))
+ (alist (fold alist-cons '() keys values)))
+ (equal? alist (vhash-fold-right alist-cons '() vh))))
+
(pass-if "alist->vhash"
(let* ((keys '(a b c d e f g d h i))
(values '(1 2 3 4 5 6 7 0 8 9))