author: Andy Wingo <wingo@pobox.com> 2013-01-11 21:15:28 +0100
committer: Andy Wingo <wingo@pobox.com> 2013-01-11 21:15:28 +0100
commit: 5ed4ea90a9abe64c024bbc0c664476b0673556b3
tree: 66b036d41adf3f9cae651dff5870b9d1078797da
parent: 990b11c53f8da2a6c14e1190bc4e76939db32d07
download: guile-5ed4ea90a9abe64c024bbc0c664476b0673556b3.tar.gz
Change iconv procedures to take optional instead of keyword arg

* module/ice-9/iconv.scm (call-with-encoded-output-string):
  (string->bytevector, bytevector->string): Take an optional instead of
  a keyword argument.

* doc/ref/api-data.texi (Representing Strings as Bytes): Adapt docs to
  change, and fix a number of errors.  Thanks to Ludovic Courtès for the
  pointers.

* test-suite/tests/iconv.test ("wide non-ascii string"): Add a test for
  the 'substitute path.

-rw-r--r-- doc/ref/api-data.texi        26
-rw-r--r-- module/ice-9/iconv.scm       11
-rw-r--r-- test-suite/tests/iconv.test   7
3 files changed, 29 insertions, 15 deletions

diff --git a/doc/ref/api-data.texi b/doc/ref/api-data.texi
index 3bd38d28b..21398f48d 100644
--- a/doc/ref/api-data.texi
+++ b/doc/ref/api-data.texi
@@ -4190,6 +4190,11 @@ sequences of bytes. @xref{Bytevectors}, for more on how Guile
represents raw byte sequences. This module gets its name from the
common @sc{unix} command of the same name.
+Note that often it is sufficient to just read and write strings from
+ports instead of using these functions. To do this, specify the port
+encoding using @code{set-port-encoding!}. @xref{Ports}, for more on
+ports and character encodings.
+
Unlike the rest of the procedures in this section, you have to load the
@code{iconv} module before having access to these procedures:
@@ -4197,31 +4202,32 @@ Unlike the rest of the procedures in this section, you have to load the
(use-modules (ice-9 iconv))
@end example
-@deffn string->bytevector string encoding [#:conversion-strategy='error]
+@deffn string->bytevector string encoding [conversion-strategy]
Encode @var{string} as a sequence of bytes.
The string will be encoded in the character set specified by the
@var{encoding} string. If the string has characters that cannot be
represented in the encoding, by default this procedure raises an
-@code{encoding-error}, though the @code{#:conversion-strategy} keyword
-can specify other behaviors.
+@code{encoding-error}. Pass a @var{conversion-strategy} argument to
+specify other behaviors.
The return value is a bytevector. @xref{Bytevectors}, for more on
bytevectors. @xref{Ports}, for more on character encodings and
conversion strategies.
@end deffn
-@deffn bytevector->string bytevector encoding
+@deffn bytevector->string bytevector encoding [conversion-strategy]
Decode @var{bytevector} into a string.
The bytes will be decoded from the character set by the @var{encoding}
string. If the bytes do not form a valid encoding, by default this
-procedure raises an @code{decoding-error}, though that may be overridden
-with the @code{#:conversion-strategy} keyword. @xref{Ports}, for more
-on character encodings and conversion strategies.
+procedure raises an @code{decoding-error}. As with
+@code{string->bytevector}, pass the optional @var{conversion-strategy}
+argument to modify this behavior. @xref{Ports}, for more on character
+encodings and conversion strategies.
@end deffn
-@deffn call-with-output-encoded-string encoding proc [#:conversion-strategy='error]
+@deffn call-with-output-encoded-string encoding proc [conversion-strategy]
Like @code{call-with-output-string}, but instead of returning a string,
returns a encoding of the string according to @var{encoding}, as a
bytevector. This procedure can be more efficient than collecting a
@@ -4371,7 +4377,7 @@ If @var{lenp} is @code{NULL}, this function will return a null-terminated C
string. It will throw an error if the string contains a null
character.
-The Scheme interface to this function is @code{encode-string}, from the
+The Scheme interface to this function is @code{string->bytevector}, from the
@code{ice-9 iconv} module. @xref{Representing Strings as Bytes}.
@end deftypefn
@@ -4382,7 +4388,7 @@ string is passed as the ASCII, null-terminated C string @code{encoding}.
The @var{handler} parameters suggests a strategy for dealing with
unconvertable characters.
-The Scheme interface to this function is @code{decode-string}.
+The Scheme interface to this function is @code{bytevector->string}.
@xref{Representing Strings as Bytes}.
@end deftypefn
diff --git a/module/ice-9/iconv.scm b/module/ice-9/iconv.scm
index 40d595473..0f0c1a3cf 100644
--- a/module/ice-9/iconv.scm
+++ b/module/ice-9/iconv.scm
@@ -43,7 +43,8 @@
bv))))
(define* (call-with-encoded-output-string encoding proc
- #:key (conversion-strategy 'error))
+ #:optional
+ (conversion-strategy 'error))
(if (string-ci=? encoding "utf-8")
;; I don't know why, but this appears to be faster; at least for
;; serving examples/debug-sxml.scm (1464 reqs/s versus 850
@@ -59,16 +60,18 @@
;; TODO: Provide C implementations that call scm_from_stringn and
;; friends?
-(define* (string->bytevector str encoding #:key (conversion-strategy 'error))
+(define* (string->bytevector str encoding
+ #:optional (conversion-strategy 'error))
(if (string-ci=? encoding "utf-8")
(string->utf8 str)
(call-with-encoded-output-string
encoding
(lambda (port)
(display str port))
- #:conversion-strategy conversion-strategy)))
+ conversion-strategy)))
-(define* (bytevector->string bv encoding #:key (conversion-strategy 'error))
+(define* (bytevector->string bv encoding
+ #:optional (conversion-strategy 'error))
(if (string-ci=? encoding "utf-8")
(utf8->string bv)
(let ((p (open-bytevector-input-port bv)))
diff --git a/test-suite/tests/iconv.test b/test-suite/tests/iconv.test
index e6ee90d1d..9083cd256 100644
--- a/test-suite/tests/iconv.test
+++ b/test-suite/tests/iconv.test
@@ -112,4 +112,9 @@
(string->bytevector s "ascii"))
(pass-if-exception "encode as latin1" exception:encoding-error
- (string->bytevector s "latin1"))))
+ (string->bytevector s "latin1"))
+
+ (pass-if "encode as ascii with substitutions"
+ (equal? (make-string (string-length s) #\?)
+ (bytevector->string (string->bytevector s "ascii" 'substitute)
+ "ascii")))))
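
To make the change in calling convention concrete, here is a minimal sketch of how callers might use the procedures after this commit. It assumes a Guile that includes this change. The sample string `s` and the names `default-result`, `substituted`, `decoded`, and `via-port` are illustrative and do not come from the commit. The `catch #t` handler is used so that the sketch does not depend on the exact exception key.

```scheme
(use-modules (ice-9 iconv))

;; Illustrative input: a string whose characters ASCII cannot represent.
(define s "λμ")

;; With no third argument the strategy defaults to 'error, so encoding
;; this string as ASCII signals an error. The handler only records that
;; the call failed.
(define default-result
  (catch #t
    (lambda () (string->bytevector s "ascii"))
    (lambda (key . args) 'failed)))

;; After this commit the conversion strategy is an optional positional
;; argument. The previous spelling was:
;;   (string->bytevector s "ascii" #:conversion-strategy 'substitute)
(define substituted (string->bytevector s "ascii" 'substitute))

;; Decoding the substituted bytes mirrors the test added in this commit,
;; which expects one #\? per original character.
(define decoded (bytevector->string substituted "ascii"))

;; call-with-encoded-output-string takes the strategy the same way,
;; after the procedure argument.
(define via-port
  (call-with-encoded-output-string
   "ascii"
   (lambda (port) (display s port))
   'substitute))
```

Callers that still pass `#:conversion-strategy` need updating: with `#:optional`, the keyword object itself would be bound as the strategy.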