@c -*-texinfo-*-
@c This is part of the GNU Guile Reference Manual.
@c Copyright (C)  1996, 1997, 2000, 2001, 2002, 2003, 2004, 2006, 2007,
@c   2009, 2010, 2017 Free Software Foundation, Inc.
@c See the file guile.texi for copying conditions.

@node Internationalization
@section Support for Internationalization

@cindex internationalization
@cindex i18n

Guile provides internationalization@footnote{For concision and style,
programmers often like to refer to internationalization as ``i18n''.}
support for Scheme programs in two ways.  First, procedures to
manipulate text and data in a way that conforms to particular cultural
conventions (i.e., in a ``locale-dependent'' way) are provided in the
@code{(ice-9 i18n)} module.  Second, Guile allows the use of GNU
@code{gettext} to translate program message strings.

@menu
* i18n Introduction::             Introduction to Guile's i18n support.
* Text Collation::                Sorting strings and characters.
* Character Case Mapping::        Case mapping.
* Number Input and Output::       Parsing and printing numbers.
* Accessing Locale Information::  Detailed locale information.
* Gettext Support::               Translating message strings.
@end menu


@node i18n Introduction, Text Collation, Internationalization, Internationalization
@subsection Internationalization with Guile

In order to make use of the functions described hereafter, the
@code{(ice-9 i18n)} module must be imported in the usual way:

@example
(use-modules (ice-9 i18n))
@end example

@cindex cultural conventions

The @code{(ice-9 i18n)} module provides procedures to manipulate text
and other data in a way that conforms to the cultural conventions
chosen by the user.  Each region of the world or language has its own
customs to, for instance, represent real numbers, classify characters,
collate text, etc.  All these aspects comprise the so-called
``cultural conventions'' of that region or language.

@cindex locale
@cindex locale category

Computer systems typically refer to a set of cultural conventions as a
@dfn{locale}.
For each particular aspect of those cultural conventions, a
@dfn{locale category} is defined.  For instance, the way characters
are classified is defined by the @code{LC_CTYPE} category, while the
language in which program messages are issued to the user is defined
by the @code{LC_MESSAGES} category (@pxref{Locales, General Locale
Information} for details).

@cindex locale object

The procedures provided by this module allow the development of
programs that adapt automatically to any locale setting.  As we will
see later, many of these procedures can optionally take a @dfn{locale
object} argument.  This additional argument defines the locale
settings that must be followed by the invoked procedure.  When it is
omitted, the current locale settings of the process are followed
(@pxref{Locales, @code{setlocale}}).

The following procedures allow the manipulation of such locale
objects.

@deffn {Scheme Procedure} make-locale category-list locale-name [base-locale]
@deffnx {C Function} scm_make_locale (category_list, locale_name, base_locale)
Return a reference to a data structure representing a set of locale
datasets.  @var{locale-name} should be a string denoting a particular
locale (e.g., @code{"aa_DJ"}) and @var{category-list} should be either
a list of locale categories or a single category as used with
@code{setlocale} (@pxref{Locales, @code{setlocale}}).  Optionally, if
@var{base-locale} is passed, it should be a locale object denoting
settings for categories not listed in @var{category-list}.

The following invocation creates a locale object that combines the use
of Swedish for messages and character classification with the
default settings for the other categories (i.e., the settings of the
default @code{C} locale, which usually represents conventions in use in
the USA):

@example
(make-locale (list LC_MESSAGES LC_CTYPE) "sv_SE")
@end example

The following example combines the use of Esperanto messages and
conventions with monetary conventions from Croatia:

@example
(make-locale LC_MONETARY "hr_HR"
             (make-locale LC_ALL "eo_EO"))
@end example

A @code{system-error} exception (@pxref{Handling Errors}) is raised by
@code{make-locale} when @var{locale-name} does not match any of the
locales compiled on the system.  Note that on non-GNU systems, this
error may be raised later, when the locale object is actually used.

@end deffn

@deffn {Scheme Procedure} locale? obj
@deffnx {C Function} scm_locale_p (obj)
Return true if @var{obj} is a locale object.
@end deffn

@defvr {Scheme Variable} %global-locale
@defvrx {C Variable} scm_global_locale
This variable is bound to a locale object denoting the current process
locale as installed using @code{setlocale ()} (@pxref{Locales}).  It
may be used like any other locale object, including as a third
argument to @code{make-locale}, for instance.
@end defvr


@node Text Collation, Character Case Mapping, i18n Introduction, Internationalization
@subsection Text Collation

The following procedures provide support for text collation, i.e.,
locale-dependent string and character sorting.

@deffn {Scheme Procedure} string-locale<? s1 s2 [locale]
@deffnx {C Function} scm_string_locale_lt (s1, s2, locale)
@deffnx {Scheme Procedure} string-locale>? s1 s2 [locale]
@deffnx {C Function} scm_string_locale_gt (s1, s2, locale)
@deffnx {Scheme Procedure} string-locale-ci<? s1 s2 [locale]
@deffnx {C Function} scm_string_locale_ci_lt (s1, s2, locale)
@deffnx {Scheme Procedure} string-locale-ci>? s1 s2 [locale]
@deffnx {C Function} scm_string_locale_ci_gt (s1, s2, locale)
Compare strings @var{s1} and @var{s2} in a locale-dependent way.
If @var{locale} is provided, it should be a locale object (as returned
by @code{make-locale}) and will be used to perform the comparison;
otherwise, the current system locale is used.  For the @code{-ci}
variants, the comparison is made in a case-insensitive way.
@end deffn

@deffn {Scheme Procedure} string-locale-ci=? s1 s2 [locale]
@deffnx {C Function} scm_string_locale_ci_eq (s1, s2, locale)
Compare strings @var{s1} and @var{s2} in a case-insensitive and
locale-dependent way.  If @var{locale} is provided, it should be
a locale object (as returned by @code{make-locale}) and will be used to
perform the comparison; otherwise, the current system locale is used.
@end deffn

@deffn {Scheme Procedure} char-locale<? c1 c2 [locale]
@deffnx {C Function} scm_char_locale_lt (c1, c2, locale)
@deffnx {Scheme Procedure} char-locale>? c1 c2 [locale]
@deffnx {C Function} scm_char_locale_gt (c1, c2, locale)
@deffnx {Scheme Procedure} char-locale-ci<? c1 c2 [locale]
@deffnx {C Function} scm_char_locale_ci_lt (c1, c2, locale)
@deffnx {Scheme Procedure} char-locale-ci>? c1 c2 [locale]
@deffnx {C Function} scm_char_locale_ci_gt (c1, c2, locale)
Compare characters @var{c1} and @var{c2} according to either
@var{locale} (a locale object as returned by @code{make-locale}) or
the current locale.  For the @code{-ci} variants, the comparison is
made in a case-insensitive way.
@end deffn

@deffn {Scheme Procedure} char-locale-ci=? c1 c2 [locale]
@deffnx {C Function} scm_char_locale_ci_eq (c1, c2, locale)
Return true if character @var{c1} is equal to @var{c2}, in a
case-insensitive way according to @var{locale} or to the current
locale.
@end deffn


@node Character Case Mapping, Number Input and Output, Text Collation, Internationalization
@subsection Character Case Mapping

The procedures below provide support for ``character case mapping'',
i.e., converting characters or strings to their upper-case or
lower-case equivalent.  Note that SRFI-13 provides procedures that
look similar (@pxref{Alphabetic Case Mapping}).  However, the SRFI-13
procedures are locale-independent.  Therefore, they do not take into
account the customs in use in a particular language or region of the
world.
For instance, while most languages using the Latin alphabet
map lower-case letter ``i'' to upper-case letter ``I'', Turkish maps
lower-case ``i'' to ``Latin capital letter I with dot above''.  The
following procedures allow programmers to provide idiomatic character
mapping.

@deffn {Scheme Procedure} char-locale-downcase chr [locale]
@deffnx {C Function} scm_char_locale_downcase (chr, locale)
Return the lowercase character that corresponds to @var{chr} according
to either @var{locale} or the current locale.
@end deffn

@deffn {Scheme Procedure} char-locale-upcase chr [locale]
@deffnx {C Function} scm_char_locale_upcase (chr, locale)
Return the uppercase character that corresponds to @var{chr} according
to either @var{locale} or the current locale.
@end deffn

@deffn {Scheme Procedure} char-locale-titlecase chr [locale]
@deffnx {C Function} scm_char_locale_titlecase (chr, locale)
Return the titlecase character that corresponds to @var{chr} according
to either @var{locale} or the current locale.
@end deffn

@deffn {Scheme Procedure} string-locale-upcase str [locale]
@deffnx {C Function} scm_string_locale_upcase (str, locale)
Return a new string that is the uppercase version of @var{str}
according to either @var{locale} or the current locale.
@end deffn

@deffn {Scheme Procedure} string-locale-downcase str [locale]
@deffnx {C Function} scm_string_locale_downcase (str, locale)
Return a new string that is the lowercase version of @var{str}
according to either @var{locale} or the current locale.
@end deffn

@deffn {Scheme Procedure} string-locale-titlecase str [locale]
@deffnx {C Function} scm_string_locale_titlecase (str, locale)
Return a new string that is the titlecase version of @var{str}
according to either @var{locale} or the current locale.
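
As a rough illustration of the difference between the locale-dependent
procedures and their SRFI-13 counterparts, the sketch below upcases the
same string both ways.  The locale name @code{"tr_TR.UTF-8"} is an
assumption: it may not be installed on a given system, in which case
the sketch falls back to @code{%global-locale}.  The name
@code{turkish} is purely illustrative.

@example
(use-modules (ice-9 i18n))

;; Assumption: a locale named "tr_TR.UTF-8" is installed.  If it is
;; not, make-locale raises a system-error (on GNU systems) and we
;; fall back to the current process locale.
(define turkish
  (catch 'system-error
    (lambda () (make-locale LC_ALL "tr_TR.UTF-8"))
    (lambda args %global-locale)))

;; Locale-dependent: the result depends on the locale object.
(string-locale-upcase "iki" turkish)

;; Locale-independent, for comparison.
(string-upcase "iki")
;; @result{} "IKI"
@end example

When the Turkish locale is available, the first call is expected to
use the dotted capital letter described above, whereas the SRFI-13
procedure gives the same answer in every locale.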
@end deffn


@node Number Input and Output, Accessing Locale Information, Character Case Mapping, Internationalization
@subsection Number Input and Output

The following procedures allow programs to read and write numbers
written according to a particular locale.  As an example, in English,
``ten thousand and a half'' is usually written @code{10,000.5} while
in French it is written @code{10 000,5}.  These procedures allow such
differences to be taken into account.

@findex strtol
@deffn {Scheme Procedure} locale-string->integer str [base [locale]]
@deffnx {C Function} scm_locale_string_to_integer (str, base, locale)
Convert string @var{str} into an integer according to either
@var{locale} (a locale object as returned by @code{make-locale}) or
the current process locale.  If @var{base} is specified, then it
determines the base of the integer being read (e.g., @code{16} for a
hexadecimal number, @code{10} for a decimal number); by default,
decimal numbers are read.  Return two values (@pxref{Multiple
Values}): an integer (on success) or @code{#f}, and the number of
characters read from @var{str} (@code{0} on failure).

This function is based on the C library's @code{strtol} function
(@pxref{Parsing of Integers, @code{strtol},, libc, The GNU C Library
Reference Manual}).
@end deffn

@findex strtod
@deffn {Scheme Procedure} locale-string->inexact str [locale]
@deffnx {C Function} scm_locale_string_to_inexact (str, locale)
Convert string @var{str} into an inexact number according to either
@var{locale} (a locale object as returned by @code{make-locale}) or
the current process locale.  Return two values (@pxref{Multiple
Values}): an inexact number (on success) or @code{#f}, and the number
of characters read from @var{str} (@code{0} on failure).

This function is based on the C library's @code{strtod} function
(@pxref{Parsing of Floats, @code{strtod},, libc, The GNU C Library
Reference Manual}).
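
Since both parsing procedures return two values, a caller usually has
to decide what to do with trailing characters that were not consumed.
The following sketch shows one possible policy, rejecting any input
that was not read in full.  The helper @code{parse-whole} is
hypothetical and not part of @code{(ice-9 i18n)}.

@example
(use-modules (ice-9 i18n))

;; Hypothetical helper: apply PARSER to STR and return the number
;; only if every character of STR was consumed; return #f otherwise.
(define (parse-whole parser str)
  (call-with-values
      (lambda () (parser str))
    (lambda (value count)
      (and value
           (= count (string-length str))
           value))))

(parse-whole locale-string->integer "42")
;; @result{} 42
(parse-whole locale-string->integer "42abc")
;; @result{} #f
(parse-whole locale-string->inexact "abc")
;; @result{} #f
@end example

Comparing the count against the string length assumes that one
character read corresponds to one character of @var{str}, which holds
for plain ASCII input such as the strings above.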
@end deffn

@deffn {Scheme Procedure} number->locale-string number [fraction-digits [locale]]
Convert @var{number} (an inexact) into a string according to the
cultural conventions of either @var{locale} (a locale object) or the
current locale.  By default, print as many fractional digits as
necessary, up to an upper bound.  Optionally, @var{fraction-digits} may
be bound to an integer specifying the number of fractional digits to be
displayed.
@end deffn

@deffn {Scheme Procedure} monetary-amount->locale-string amount intl? [locale]
Convert @var{amount} (an inexact denoting a monetary amount) into a
string according to the cultural conventions of either @var{locale} (a
locale object) or the current locale.  If @var{intl?} is true, then
the international monetary format for the given locale is used
(@pxref{Currency Symbol, international and locale monetary formats,,
libc, The GNU C Library Reference Manual}).
@end deffn


@node Accessing Locale Information, Gettext Support, Number Input and Output, Internationalization
@subsection Accessing Locale Information

@findex nl_langinfo
@cindex low-level locale information
It is sometimes useful to obtain very specific information about a
locale such as the word it uses for days or months, its format for
representing floating-point figures, etc.  The @code{(ice-9 i18n)}
module provides support for this in a way that is similar to the libc
functions @code{nl_langinfo ()} and @code{localeconv ()}
(@pxref{Locale Information, accessing locale information from C,,
libc, The GNU C Library Reference Manual}).  The available functions
are listed below.

@deffn {Scheme Procedure} locale-encoding [locale]
Return the name of the encoding (a string whose interpretation is
system-dependent) of either @var{locale} or the current locale.
@end deffn

The following functions deal with dates and times.

@deffn {Scheme Procedure} locale-day day [locale]
@deffnx {Scheme Procedure} locale-day-short day [locale]
@deffnx {Scheme Procedure} locale-month month [locale]
@deffnx {Scheme Procedure} locale-month-short month [locale]
Return the word (a string) used in either @var{locale} or the current
locale to name the day (or month) denoted by @var{day} (or
@var{month}), an integer between 1 and 7 (or 1 and 12).  The
@code{-short} variants provide an abbreviation instead of a full name.
@end deffn

@deffn {Scheme Procedure} locale-am-string [locale]
@deffnx {Scheme Procedure} locale-pm-string [locale]
Return a (potentially empty) string that is used to denote @i{ante
meridiem} (or @i{post meridiem}) hours in 12-hour format.
@end deffn

@deffn {Scheme Procedure} locale-date+time-format [locale]
@deffnx {Scheme Procedure} locale-date-format [locale]
@deffnx {Scheme Procedure} locale-time-format [locale]
@deffnx {Scheme Procedure} locale-time+am/pm-format [locale]
@deffnx {Scheme Procedure} locale-era-date-format [locale]
@deffnx {Scheme Procedure} locale-era-date+time-format [locale]
@deffnx {Scheme Procedure} locale-era-time-format [locale]
These procedures return format strings suitable for @code{strftime}
(@pxref{Time}) that may be used to display (part of) a date/time
according to certain constraints and to the conventions of either
@var{locale} or the current locale (@pxref{The Elegant and Fast Way,
the @code{nl_langinfo ()} items,, libc, The GNU C Library Reference
Manual}).
@end deffn

@deffn {Scheme Procedure} locale-era [locale]
@deffnx {Scheme Procedure} locale-era-year [locale]
These functions return, respectively, the era and the year of the
relevant era used in @var{locale} or the current locale.  Most locales
do not define these values, in which case the empty string is
returned.  An example of a locale that does define them is the
Japanese one.
@end deffn

The following procedures give information about number representation.

@deffn {Scheme Procedure} locale-decimal-point [locale]
@deffnx {Scheme Procedure} locale-thousands-separator [locale]
These functions return a string denoting the representation of the
decimal point or that of the thousands separator (respectively) for
either @var{locale} or the current locale.
@end deffn

@deffn {Scheme Procedure} locale-digit-grouping [locale]
Return a (potentially circular) list of integers denoting how digits
of the integer part of a number are to be grouped, starting at the
decimal point and going to the left.  The list contains integers
indicating the size of the successive groups, from right to left.  If
the list is non-circular, then no grouping occurs for digits beyond
the last group.

For instance, if the returned list is a circular list that contains
only @code{3} and the thousands separator is @code{","} (as is the case
with English locales), then the number @code{12345678} should be
printed @code{12,345,678}.
@end deffn

The following procedures deal with the representation of monetary
amounts.  Some of them take an additional @var{intl?} argument (a
boolean) that tells whether the international or local monetary
conventions for the given locale are to be used.

@deffn {Scheme Procedure} locale-monetary-decimal-point [locale]
@deffnx {Scheme Procedure} locale-monetary-thousands-separator [locale]
@deffnx {Scheme Procedure} locale-monetary-grouping [locale]
These are the counterparts of the above procedures that apply to
monetary amounts.
@end deffn

@deffn {Scheme Procedure} locale-currency-symbol intl? [locale]
Return the currency symbol (a string) of either @var{locale} or the
current locale.

The following example illustrates the difference between the local and
international monetary formats:

@example
(define us (make-locale LC_MONETARY "en_US"))
(locale-currency-symbol #f us)
@result{} "-$"
(locale-currency-symbol #t us)
@result{} "USD "
@end example
@end deffn

@deffn {Scheme Procedure} locale-monetary-fractional-digits intl? [locale]
Return the number of fractional digits to be used when printing
monetary amounts according to either @var{locale} or the current
locale.  If the locale does not specify it, then @code{#f} is
returned.
@end deffn

@deffn {Scheme Procedure} locale-currency-symbol-precedes-positive? intl? [locale]
@deffnx {Scheme Procedure} locale-currency-symbol-precedes-negative? intl? [locale]
@deffnx {Scheme Procedure} locale-positive-separated-by-space? intl? [locale]
@deffnx {Scheme Procedure} locale-negative-separated-by-space? intl? [locale]
These procedures return a boolean indicating whether the currency
symbol should precede a positive/negative number, and whether a space
should be inserted between the currency symbol and a positive/negative
amount.
@end deffn

@deffn {Scheme Procedure} locale-monetary-positive-sign [locale]
@deffnx {Scheme Procedure} locale-monetary-negative-sign [locale]
Return a string denoting the positive (respectively negative) sign
that should be used when printing a monetary amount.
@end deffn

@deffn {Scheme Procedure} locale-positive-sign-position
@deffnx {Scheme Procedure} locale-negative-sign-position
These functions return a symbol telling where the sign of a
positive/negative monetary amount is to appear when printing it.  The
possible values are:

@table @code
@item parenthesize
The currency symbol and quantity should be surrounded by parentheses.
@item sign-before
Print the sign string before the quantity and currency symbol.
@item sign-after
Print the sign string after the quantity and currency symbol.
@item sign-before-currency-symbol
Print the sign string right before the currency symbol.
@item sign-after-currency-symbol
Print the sign string right after the currency symbol.
@item unspecified
Unspecified.  We recommend you print the sign after the currency
symbol.
@end table

@end deffn

Finally, the two following procedures may be helpful when programming
user interfaces:

@deffn {Scheme Procedure} locale-yes-regexp [locale]
@deffnx {Scheme Procedure} locale-no-regexp [locale]
Return a string that can be used as a regular expression to recognize
a positive (respectively, negative) response to a yes/no question.
For the C locale, the default values are typically @code{"^[yY]"} and
@code{"^[nN]"}, respectively.

Here is an example:

@example
;; string-match comes from (ice-9 regex), read-line from (ice-9 rdelim).
(use-modules (ice-9 i18n) (ice-9 rdelim) (ice-9 regex))
(format #t "Does Guile rock?~%")
(let lp ((answer (read-line)))
  (cond ((string-match (locale-yes-regexp) answer)
         (format #t "High fives!~%"))
        ((string-match (locale-no-regexp) answer)
         (format #t "How about now? Does it rock yet?~%")
         (lp (read-line)))
        (else
         (format #t "What do you mean?~%")
         (lp (read-line)))))
@end example

For an internationalized yes/no string output, @code{gettext} should
be used (@pxref{Gettext Support}).
@end deffn

Example uses of some of these functions are the implementation of the
@code{number->locale-string} and @code{monetary-amount->locale-string}
procedures (@pxref{Number Input and Output}), as well as that of the
SRFI-19 date and time conversion to/from strings (@pxref{SRFI-19}).


@node Gettext Support, , Accessing Locale Information, Internationalization
@subsection Gettext Support

Guile provides an interface to GNU @code{gettext} for translating
message strings (@pxref{Introduction,,, gettext, GNU @code{gettext}
utilities}).

Messages are collected in domains, so different libraries and programs
maintain different message catalogues.  The @var{domain} parameter in
the functions below is a string (it becomes part of the message
catalog filename).

When @code{gettext} is not available, or if Guile was configured
@samp{--without-nls}, dummy functions doing no translation are
provided.  When @code{gettext} support is available in Guile, the
@code{i18n} feature is provided (@pxref{Feature Tracking}).

@deffn {Scheme Procedure} gettext msg [domain [category]]
@deffnx {C Function} scm_gettext (msg, domain, category)
Return the translation of @var{msg} in @var{domain}.  @var{domain} is
optional and defaults to the domain set through @code{textdomain}
below.  @var{category} is optional and defaults to
@code{LC_MESSAGES} (@pxref{Locales}).

Normal usage is for @var{msg} to be a literal string.
@command{xgettext} can extract those from the source to form a message
catalogue ready for translators (@pxref{xgettext Invocation,, Invoking
the @command{xgettext} Program, gettext, GNU @code{gettext}
utilities}).

@example
(display (gettext "You are in a maze of twisty passages."))
@end example

It is conventional to use @code{G_} as a shorthand for
@code{gettext}.@footnote{Users of @code{gettext} might be a bit
surprised that @code{G_} is the conventional abbreviation for
@code{gettext}.  In most other languages, the conventional shorthand is
@code{_}.  Guile uses @code{G_} because @code{_} is already taken, as
it is bound to a syntactic keyword used by @code{syntax-rules},
@code{match}, and other macros.}  Libraries can define @code{G_} in
such a way as to look up translations using their own @var{domain},
allowing different parts of a program to have different translation
sources.

@example
(define (G_ msg) (gettext msg "mylibrary"))
(display (G_ "File not found."))
@end example

@code{G_} is also a good place to strip disambiguating extra text from
the message string, as for instance in @ref{GUI program problems,, How
to use @code{gettext} in GUI programs, gettext, GNU @code{gettext}
utilities}.
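
The stripping of disambiguating text can be sketched as follows.  This
is only one possible convention, assumed here for illustration:
messages are written as @code{"context|text"}, and when no translation
is found the part up to the last @samp{|} is dropped.  The
@code{"mylibrary"} domain is the same placeholder as in the example
above.

@example
(define (G_ msg)
  ;; Look MSG up in the placeholder "mylibrary" domain.
  (let ((translation (gettext msg "mylibrary")))
    (if (string=? translation msg)
        ;; Untranslated: drop everything up to the last "|", if any.
        (let ((bar (string-rindex msg #\|)))
          (if bar
              (substring msg (+ bar 1))
              msg))
        translation)))

(G_ "Menu|File")
;; @result{} "File" when no catalogue provides a translation
@end example

Translators still see the full @code{"Menu|File"} string in the
catalogue and can therefore tell this ``File'' apart from the same
word used in another context.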
@end deffn

@deffn {Scheme Procedure} ngettext msg msgplural n [domain [category]]
@deffnx {C Function} scm_ngettext (msg, msgplural, n, domain, category)
Return the translation of @var{msg}/@var{msgplural} in @var{domain},
with a plural form chosen appropriately for the number @var{n}.
@var{domain} is optional and defaults to the domain set through
@code{textdomain} below.  @var{category} is optional and defaults to
@code{LC_MESSAGES} (@pxref{Locales}).

@var{msg} is the singular form, and @var{msgplural} the plural.  When
no translation is available, @var{msg} is used if @math{@var{n} = 1},
or @var{msgplural} otherwise.  When translated, the message catalogue
can have a different rule, and can have more than two possible forms.

As per @code{gettext} above, normal usage is for @var{msg} and
@var{msgplural} to be literal strings, since @command{xgettext} can
extract them from the source to build a message catalogue.  For
example,

@example
(define (done n)
  (format #t (ngettext "~a file processed\n"
                       "~a files processed\n" n)
          n))

(done 1) @print{} 1 file processed
(done 3) @print{} 3 files processed
@end example

It's important to use @code{ngettext} rather than plain @code{gettext}
for plurals, since the rules for singular and plural forms in English
are not the same in other languages.  Only @code{ngettext} will allow
translators to give correct forms (@pxref{Plural forms,, Additional
functions for plural forms, gettext, GNU @code{gettext} utilities}).
@end deffn

@deffn {Scheme Procedure} textdomain [domain]
@deffnx {C Function} scm_textdomain (domain)
Get or set the default gettext domain.  When called with no parameter
the current domain is returned.  When called with a parameter,
@var{domain} is set as the current domain, and that new value
returned.
For example,

@example
(textdomain "myprog")
@result{} "myprog"
@end example
@end deffn

@deffn {Scheme Procedure} bindtextdomain domain [directory]
@deffnx {C Function} scm_bindtextdomain (domain, directory)
Get or set the directory under which to find message files for
@var{domain}.  When called without a @var{directory} the current
setting is returned.  When called with a @var{directory},
@var{directory} is set for @var{domain} and that new setting returned.
For example,

@example
(bindtextdomain "myprog" "/my/tree/share/locale")
@result{} "/my/tree/share/locale"
@end example

When using Autoconf/Automake, an application should arrange for the
configured @code{localedir} to get into the program (by substituting,
or by generating a config file) and set that for its domain.  This
ensures the catalogue can be found even when installed in a
non-standard location.
@end deffn

@deffn {Scheme Procedure} bind-textdomain-codeset domain [encoding]
@deffnx {C Function} scm_bind_textdomain_codeset (domain, encoding)
Get or set the text encoding to be used by @code{gettext} for messages
from @var{domain}.  @var{encoding} is a string, the name of a coding
system, for instance @nicode{"8859_1"}.  (On a Unix/POSIX system the
@command{iconv} program can list all available encodings.)

When called without an @var{encoding} the current setting is returned,
or @code{#f} if none has been set yet.  When called with an
@var{encoding}, it is set for @var{domain} and that new setting
returned.  For example,

@example
(bind-textdomain-codeset "myprog")
@result{} #f
(bind-textdomain-codeset "myprog" "latin-9")
@result{} "latin-9"
@end example

The encoding requested can be different from that of the translated
data file; messages will be recoded as necessary.  But note that when
there is no translation, @code{gettext} returns its @var{msg}
unchanged, i.e., without any recoding.  For that reason source message
strings are best kept as plain ASCII.

Currently Guile has no understanding of multi-byte characters, and
string functions won't recognise character boundaries in multi-byte
strings.  An application will at least be able to pass such strings
through to some output though.  Perhaps this will change in the
future.
@end deffn

@c Local Variables:
@c TeX-master: "guile.texi"
@c ispell-local-dictionary: "american"
@c End: