author | Andy Wingo <wingo@pobox.com> | 2013-01-11 21:15:28 +0100 |
---|---|---|
committer | Andy Wingo <wingo@pobox.com> | 2013-01-11 21:15:28 +0100 |
commit | 5ed4ea90a9abe64c024bbc0c664476b0673556b3 | |
tree | 66b036d41adf3f9cae651dff5870b9d1078797da | |
parent | 990b11c53f8da2a6c14e1190bc4e76939db32d07 | |
download | guile-5ed4ea90a9abe64c024bbc0c664476b0673556b3.tar.gz |
Change iconv procedures to take optional instead of keyword arg

* module/ice-9/iconv.scm (call-with-encoded-output-string):
  (string->bytevector, bytevector->string): Take an optional instead of
  a keyword argument.

* doc/ref/api-data.texi (Representing Strings as Bytes): Adapt docs to
  change, and fix a number of errors. Thanks to Ludovic Courtès for the
  pointers.

* test-suite/tests/iconv.test ("wide non-ascii string"): Add a test for
  the 'substitute path.
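
For readers skimming the change, the new calling convention can be sketched as follows. This is an illustrative sketch, not part of the commit: the names `sample`, `utf8-bytes`, `round-trip`, `ascii-bytes` and `ascii-text` are made up for the example, and it assumes a Guile build that includes this change to `(ice-9 iconv)`. The conversion strategy is passed as a third positional argument, where the pre-commit signature took a `#:conversion-strategy` keyword.

```scheme
(use-modules (ice-9 iconv))

;; Hypothetical sample text: GREEK SMALL LETTER LAMBDA (U+03BB) followed
;; by plain ASCII.  Built with integer->char so the example does not
;; depend on the encoding of the source file.
(define sample
  (string-append (string (integer->char #x3bb)) "-calculus"))

;; Without the optional argument the strategy defaults to 'error.
;; UTF-8 can represent every character, so this round-trips cleanly.
(define utf8-bytes (string->bytevector sample "utf-8"))
(define round-trip (bytevector->string utf8-bytes "utf-8"))

;; The strategy is now the third positional argument.  With 'substitute,
;; characters that ASCII cannot represent are replaced rather than
;; raising an encoding-error, as in the test added by this commit.
(define ascii-bytes (string->bytevector sample "ascii" 'substitute))
(define ascii-text (bytevector->string ascii-bytes "ascii"))
```

Omitting the third argument in the ASCII call would raise an `encoding-error` for this string, which is the default behavior described in the updated documentation.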
-rw-r--r-- | doc/ref/api-data.texi | 26 |
-rw-r--r-- | module/ice-9/iconv.scm | 11 |
-rw-r--r-- | test-suite/tests/iconv.test | 7 |
3 files changed, 29 insertions, 15 deletions
diff --git a/doc/ref/api-data.texi b/doc/ref/api-data.texi
index 3bd38d28b..21398f48d 100644
--- a/doc/ref/api-data.texi
+++ b/doc/ref/api-data.texi
@@ -4190,6 +4190,11 @@ sequences of bytes. @xref{Bytevectors}, for more on how Guile represents raw
 byte sequences. This module gets its name from the common @sc{unix}
 command of the same name.
 
+Note that often it is sufficient to just read and write strings from
+ports instead of using these functions. To do this, specify the port
+encoding using @code{set-port-encoding!}. @xref{Ports}, for more on
+ports and character encodings.
+
 Unlike the rest of the procedures in this section, you have to load the
 @code{iconv} module before having access to these procedures:
 
@@ -4197,31 +4202,32 @@ Unlike the rest of the procedures in this section, you have to load the
 (use-modules (ice-9 iconv))
 @end example
 
-@deffn string->bytevector string encoding [#:conversion-strategy='error]
+@deffn string->bytevector string encoding [conversion-strategy]
 Encode @var{string} as a sequence of bytes.
 
 The string will be encoded in the character set specified by the
 @var{encoding} string. If the string has characters that cannot be
 represented in the encoding, by default this procedure raises an
-@code{encoding-error}, though the @code{#:conversion-strategy} keyword
-can specify other behaviors.
+@code{encoding-error}. Pass a @var{conversion-strategy} argument to
+specify other behaviors.
 
 The return value is a bytevector. @xref{Bytevectors}, for more on
 bytevectors. @xref{Ports}, for more on character encodings and
 conversion strategies.
 @end deffn
 
-@deffn bytevector->string bytevector encoding
+@deffn bytevector->string bytevector encoding [conversion-strategy]
 Decode @var{bytevector} into a string.
 
 The bytes will be decoded from the character set by the @var{encoding}
 string. If the bytes do not form a valid encoding, by default this
-procedure raises an @code{decoding-error}, though that may be overridden
-with the @code{#:conversion-strategy} keyword. @xref{Ports}, for more
-on character encodings and conversion strategies.
+procedure raises an @code{decoding-error}. As with
+@code{string->bytevector}, pass the optional @var{conversion-strategy}
+argument to modify this behavior. @xref{Ports}, for more on character
+encodings and conversion strategies.
 @end deffn
 
-@deffn call-with-output-encoded-string encoding proc [#:conversion-strategy='error]
+@deffn call-with-output-encoded-string encoding proc [conversion-strategy]
 Like @code{call-with-output-string}, but instead of returning a string,
 returns a encoding of the string according to @var{encoding}, as a
 bytevector. This procedure can be more efficient than collecting a
@@ -4371,7 +4377,7 @@ If @var{lenp} is @code{NULL}, this function will return a null-terminated C
 string. It will throw an error if the string contains a null
 character.
 
-The Scheme interface to this function is @code{encode-string}, from the
+The Scheme interface to this function is @code{string->bytevector}, from the
 @code{ice-9 iconv} module. @xref{Representing Strings as Bytes}.
 @end deftypefn
 
@@ -4382,7 +4388,7 @@ string is passed as the ASCII, null-terminated C string @code{encoding}.
 The @var{handler} parameters suggests a strategy for dealing with
 unconvertable characters.
 
-The Scheme interface to this function is @code{decode-string}.
+The Scheme interface to this function is @code{bytevector->string}.
 @xref{Representing Strings as Bytes}.
 @end deftypefn
 
diff --git a/module/ice-9/iconv.scm b/module/ice-9/iconv.scm
index 40d595473..0f0c1a3cf 100644
--- a/module/ice-9/iconv.scm
+++ b/module/ice-9/iconv.scm
@@ -43,7 +43,8 @@
         bv))))
 
 (define* (call-with-encoded-output-string encoding proc
-                                          #:key (conversion-strategy 'error))
+                                          #:optional
+                                          (conversion-strategy 'error))
   (if (string-ci=? encoding "utf-8")
       ;; I don't know why, but this appears to be faster; at least for
       ;; serving examples/debug-sxml.scm (1464 reqs/s versus 850
@@ -59,16 +60,18 @@
 ;; TODO: Provide C implementations that call scm_from_stringn and
 ;; friends?
 
-(define* (string->bytevector str encoding #:key (conversion-strategy 'error))
+(define* (string->bytevector str encoding
+                             #:optional (conversion-strategy 'error))
   (if (string-ci=? encoding "utf-8")
       (string->utf8 str)
       (call-with-encoded-output-string
        encoding
        (lambda (port)
          (display str port))
-       #:conversion-strategy conversion-strategy)))
+       conversion-strategy)))
 
-(define* (bytevector->string bv encoding #:key (conversion-strategy 'error))
+(define* (bytevector->string bv encoding
+                             #:optional (conversion-strategy 'error))
   (if (string-ci=? encoding "utf-8")
       (utf8->string bv)
       (let ((p (open-bytevector-input-port bv)))
diff --git a/test-suite/tests/iconv.test b/test-suite/tests/iconv.test
index e6ee90d1d..9083cd256 100644
--- a/test-suite/tests/iconv.test
+++ b/test-suite/tests/iconv.test
@@ -112,4 +112,9 @@
       (string->bytevector s "ascii"))
 
     (pass-if-exception "encode as latin1" exception:encoding-error
-      (string->bytevector s "latin1"))))
+      (string->bytevector s "latin1"))
+
+    (pass-if "encode as ascii with substitutions"
+      (equal? (make-string (string-length s) #\?)
+              (bytevector->string (string->bytevector s "ascii" 'substitute)
+                                  "ascii")))))
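
The documentation note added in this diff says that setting a port's encoding is often enough on its own. As a rough sketch of that idea, again not taken from the commit: `encode-via-port` is a hypothetical helper name, and an in-memory bytevector output port from `(rnrs io ports)` stands in for whatever file or socket port a real program would write to.

```scheme
(use-modules ((rnrs io ports) #:select (open-bytevector-output-port))
             (rnrs bytevectors))

;; Hypothetical helper: write STR to an in-memory binary port whose
;; character encoding has been set to ENCODING, then return the bytes
;; that were written.
(define (encode-via-port str encoding)
  (call-with-values open-bytevector-output-port
    (lambda (port get-bytevector)
      ;; Subsequent textual output on PORT is converted to ENCODING.
      (set-port-encoding! port encoding)
      (display str port)
      ;; Flush any buffered output before collecting the bytes.
      (force-output port)
      (get-bytevector))))

;; U+00E9 (e with acute accent) occupies a single byte in ISO-8859-1.
(define latin1-bytes
  (encode-via-port (string (integer->char #xe9)) "ISO-8859-1"))
```

With a file or socket port the pattern is the same: call `set-port-encoding!` once, then use the ordinary textual I/O procedures, with no intermediate bytevector.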